RE: [PATCH 0/6] Kill setup_irq()
> -Original Message- > From: linux-hexagon-ow...@vger.kernel.org ow...@vger.kernel.org> On Behalf Of afzal mohammed > Sent: Friday, March 27, 2020 11:08 AM > To: Thomas Gleixner > Cc: linux-ker...@vger.kernel.org; linux-arm-ker...@lists.infradead.org; > linux-samsung-...@vger.kernel.org; x...@kernel.org; linux- > s...@vger.kernel.org; linux-s...@vger.kernel.org; linuxppc- > d...@lists.ozlabs.org; linux-par...@vger.kernel.org; linux- > m...@vger.kernel.org; linux-m...@lists.linux-m68k.org; linux- > i...@vger.kernel.org; linux-hexa...@vger.kernel.org; linux-c6x-dev@linux- > c6x.org; linux-o...@vger.kernel.org; linux-al...@vger.kernel.org > Subject: [PATCH 0/6] Kill setup_irq() ... > Note 1: sh toolchain is available, but that will not make the relevant changes > compile as it has dependency of 64bit arch toolchain, did try a Kconfig hack > to make it compile w/ 32bit sh toolchain, but it failed due to other reasons > (unknown operands), so gave up on that. > Note 2: hexagon final image creation fails even w/o my patch, but it has > been ensured that w/ my changes relevant object files are getting built w/o > warnings. Afzal, What's the nature of the failure in "Note 2"? -Brian
Re: [yyu168-linux_cet:cet 55/58] powerpc64le-linux-ld: warning: discarding dynamic section .rela___ksymtab+jiffies_to_timeval
On Thu, 2020-02-06 at 04:55 -0800, H.J. Lu wrote: > On Wed, Feb 5, 2020 at 7:26 PM Michael Ellerman wrote: > > "H.J. Lu" writes: > > > On Tue, Feb 4, 2020 at 3:37 PM kbuild test robot wrote: > > > > tree: https://github.com/yyu168/linux_cet.git cet > > > > head: bba707cc4715c1036b6561ab38b16747f9c49cfa > > > > commit: 71bb971dd76eeacd351690f28864ad5c5bec3691 [55/58] Discard > > > > .note.gnu.property sections in generic NOTES > > > > config: powerpc-rhel-kconfig (attached as .config) > > > > compiler: powerpc64le-linux-gcc (GCC) 7.5.0 > > > > reproduce: > > > > wget > > > > https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross > > > > -O ~/bin/make.cross > > > > chmod +x ~/bin/make.cross > > > > git checkout 71bb971dd76eeacd351690f28864ad5c5bec3691 > > > > # save the attached .config to linux build tree > > > > GCC_VERSION=7.5.0 make.cross ARCH=powerpc > > > > > > > > If you fix the issue, kindly add following tag > > > > Reported-by: kbuild test robot > > > > > > > > All warnings (new ones prefixed by >>): > > > > > > > >powerpc64le-linux-ld: warning: discarding dynamic section > > > > .rela___ksymtab_gpl+__wait_rcu_gp > > > > > > arch/powerpc/kernel/vmlinux.lds.S has > > > > > > .rela.dyn : AT(ADDR(.rela.dyn) - (0xc000 -0x)) > > > { > > > __rela_dyn_start = .; > > > *(.rela*) Keep .rela* sections > > > } > > > > The above is inside #ifdef CONFIG_RELOCATABLE > > > > > ... > > > /DISCARD/ : { > > > *(*.EMB.apuinfo) > > > *(.glink .iplt .plt .rela* .comment) > > > Discard .rela* sections. But it is > > > ignored. > > > *(.gnu.version*) > > > *(.gnu.attributes) > > > *(.eh_frame) > > > } > > > > But that is not #ifdef'ed at all. > > > > > With my > > > > > > ommit 71bb971dd76eeacd351690f28864ad5c5bec3691 > > > Author: H.J. Lu > > > Date: Thu Jan 30 12:39:09 2020 -0800 > > > > > > Discard .note.gnu.property sections in generic NOTES > > > > > > With the command-line option, -mx86-used-note=yes, the x86 assembler > > > in binutils 2.32 and above generates a program property note in a note > > > section, .note.gnu.property, to encode used x86 ISAs and features. > > > But > > > kernel linker script only contains a single NOTE segment: > > > > > > /DISCARD/ : { *(.note.gnu.property) } > > > > > > is placed before > > > > > > .rela.dyn : AT(ADDR(.rela.dyn) - (0xc000 -0x)) > > > { > > > __rela_dyn_start = .; > > > *(.rela*) Keep .rela* sections > > > } > > > > > > Then .rela* in > > > > > > /DISCARD/ : { > > > *(*.EMB.apuinfo) > > > *(.glink .iplt .plt .rela* .comment) > > > *(.gnu.version*) > > > *(.gnu.attributes) > > > *(.eh_frame) > > > } > > > > > > is honored. Can someone from POWERPC comment on it? > > > > Hmm OK. I'm not really a toolchain person. > > > > The comment on DISCARDS says: > > > >* Some archs want to discard exit text/data at runtime rather than > >* link time due to cross-section references such as alt instructions, > >* bug table, eh_frame, etc. DISCARDS must be the last of output > >* section definitions so that such archs put those in earlier section > >* definitions. > >*/ > > > > But I guess you're changing those semantics in your series. > > > > This seems to fix the warning for me? > > > > diff --git a/arch/powerpc/kernel/vmlinux.lds.S > > b/arch/powerpc/kernel/vmlinux.lds.S > > index b4c89a1acebb..076b3e8a849d 100644 > > --- a/arch/powerpc/kernel/vmlinux.lds.S > > +++ b/arch/powerpc/kernel/vmlinux.lds.S > > @@ -365,9 +365,12 @@ SECTIONS > > DISCARDS > > /DISCARD/ : { > > *(*.EMB.apuinfo) > > - *(.glink .iplt .plt .rela* .comment) > > + *(.glink .iplt .plt .comment) > > *(.gnu.version*) > > *(.gnu.attributes) > > *(.eh_frame) > > +#ifndef CONFIG_RELOCATABLE > > + *(.rela*) > > +#endif > > } > > } > > > > > > cheers > > This looks correct me. > > Reviewed-by: H.J. Lu > > Thanks. > Has this been merged into any branch yet? I just checked the tip tree and did not see it. Thanks, Yu-cheng
[PATCH 6/9] powerpc/ps3: Set CONFIG_UEVENT_HELPER=y in ps3_defconfig
Set CONFIG_UEVENT_HELPER=y in ps3_defconfig. commit 1be01d4a57142ded23bdb9e0c8d9369e693b26cc (driver: base: Disable CONFIG_UEVENT_HELPER by default) disabled the CONFIG_UEVENT_HELPER option that is needed for hotplug and module loading by most older 32bit powerpc distributions that users typically install on the PS3. Cc: Geert Uytterhoeven Signed-off-by: Geoff Levand --- arch/powerpc/configs/ps3_defconfig | 2 ++ 1 file changed, 2 insertions(+) diff --git a/arch/powerpc/configs/ps3_defconfig b/arch/powerpc/configs/ps3_defconfig index 4db51719342a..81b55c880fc3 100644 --- a/arch/powerpc/configs/ps3_defconfig +++ b/arch/powerpc/configs/ps3_defconfig @@ -60,6 +60,8 @@ CONFIG_CFG80211=m CONFIG_CFG80211_WEXT=y CONFIG_MAC80211=m # CONFIG_MAC80211_RC_MINSTREL is not set +CONFIG_UEVENT_HELPER=y +CONFIG_UEVENT_HELPER_PATH="/sbin/hotplug" CONFIG_BLK_DEV_LOOP=y CONFIG_BLK_DEV_RAM=y CONFIG_BLK_DEV_RAM_SIZE=65535 -- 2.20.1
[PATCH 2/9] drivers/ps3: Remove duplicate error messages
From: Markus Elfring Remove duplicate memory allocation failure error messages. Signed-off-by: Markus Elfring Signed-off-by: Geoff Levand --- drivers/ps3/ps3-lpm.c | 2 -- drivers/ps3/ps3-vuart.c | 1 - 2 files changed, 3 deletions(-) diff --git a/drivers/ps3/ps3-lpm.c b/drivers/ps3/ps3-lpm.c index 83c45659bc9d..fcc346296e3a 100644 --- a/drivers/ps3/ps3-lpm.c +++ b/drivers/ps3/ps3-lpm.c @@ -,8 +,6 @@ int ps3_lpm_open(enum ps3_lpm_tb_type tb_type, void *tb_cache, lpm_priv->tb_cache_internal = kzalloc( lpm_priv->tb_cache_size + 127, GFP_KERNEL); if (!lpm_priv->tb_cache_internal) { - dev_err(sbd_core(), "%s:%u: alloc internal tb_cache " - "failed\n", __func__, __LINE__); result = -ENOMEM; goto fail_malloc; } diff --git a/drivers/ps3/ps3-vuart.c b/drivers/ps3/ps3-vuart.c index ddaa5ea5801a..3b5878edc6c2 100644 --- a/drivers/ps3/ps3-vuart.c +++ b/drivers/ps3/ps3-vuart.c @@ -917,7 +917,6 @@ static int ps3_vuart_bus_interrupt_get(void) vuart_bus_priv.bmp = kzalloc(sizeof(struct ports_bmp), GFP_KERNEL); if (!vuart_bus_priv.bmp) { - pr_debug("%s:%d: kzalloc failed.\n", __func__, __LINE__); result = -ENOMEM; goto fail_bmp_malloc; } -- 2.20.1
[PATCH 5/9] ps3disk: use the default segment boundary
From: Emmanuel Nicolet Hi, since commit dcebd755926b ("block: use bio_for_each_bvec() to compute multi-page bvec count"), the kernel will bug_on on the PS3 because bio_split() is called with sectors == 0: kernel BUG at block/bio.c:1853! Oops: Exception in kernel mode, sig: 5 [#1] BE PAGE_SIZE=4K MMU=Hash PREEMPT SMP NR_CPUS=8 NUMA PS3 Modules linked in: firewire_sbp2 rtc_ps3(+) soundcore ps3_gelic(+) \ ps3rom(+) firewire_core ps3vram(+) usb_common crc_itu_t CPU: 0 PID: 97 Comm: blkid Not tainted 5.3.0-rc4 #1 NIP: c027d0d0 LR: c027d0b0 CTR: REGS: c135ae90 TRAP: 0700 Not tainted (5.3.0-rc4) MSR: 80028032 CR: 44008240 XER: 2000 IRQMASK: 0 GPR00: c0289368 c135b120 c084a500 c4ff8300 GPR04: 0c00 c4c905e0 c4c905e0 GPR08: 0001 GPR12: c08ef000 003e 00080001 GPR16: 0100 0004 GPR20: c062fd7e 0001 0080 GPR24: c0781788 c135b350 0080 c4c905e0 GPR28: c135b348 c4ff8300 c4c9 NIP [c027d0d0] .bio_split+0x28/0xac LR [c027d0b0] .bio_split+0x8/0xac Call Trace: [c135b120] [c027d130] .bio_split+0x88/0xac (unreliable) [c135b1b0] [c0289368] .__blk_queue_split+0x11c/0x53c [c135b2d0] [c028f614] .blk_mq_make_request+0x80/0x7d4 [c135b3d0] [c0283a8c] .generic_make_request+0x118/0x294 [c135b4b0] [c0283d34] .submit_bio+0x12c/0x174 [c135b580] [c0205a44] .mpage_bio_submit+0x3c/0x4c [c135b600] [c0206184] .mpage_readpages+0xa4/0x184 [c135b750] [c01ff8fc] .blkdev_readpages+0x24/0x38 [c135b7c0] [c01589f0] .read_pages+0x6c/0x1a8 [c135b8b0] [c0158c74] .__do_page_cache_readahead+0x118/0x184 [c135b9b0] [c01591a8] .force_page_cache_readahead+0xe4/0xe8 [c135ba50] [c014fc24] .generic_file_read_iter+0x1d8/0x830 [c135bb50] [c01ffadc] .blkdev_read_iter+0x40/0x5c [c135bbc0] [c01b9e00] .new_sync_read+0x144/0x1a0 [c135bcd0] [c01bc454] .vfs_read+0xa0/0x124 [c135bd70] [c01bc7a4] .ksys_read+0x70/0xd8 [c135be20] [c000a524] system_call+0x5c/0x70 Instruction dump: 7fe3fb78 482e30dc 7c0802a6 482e3085 7c9e2378 f821ff71 7ca42b78 7d3e00d0 7c7d1b78 79290fe0 7cc53378 69290001 <0b09> 81230028 7bca0020 7929ba62 [ end trace 313fec760f30aa1f ]--- The problem originates from setting the segment boundary of the request queue to -1UL. This makes get_max_segment_size() return zero when offset is zero, whatever the max segment size. The test with BLK_SEG_BOUNDARY_MASK fails and 'mask - (mask & offset) + 1' overflows to zero in the return statement. Not setting the segment boundary and using the default value (BLK_SEG_BOUNDARY_MASK) fixes the problem. Maybe BLK_SEG_BOUNDARY_MASK should be set to -1UL? It's currently set to only 0xUL. I don't know if that would break anything. Signed-off-by: Emmanuel Nicolet Signed-off-by: Geoff Levand --- drivers/block/ps3disk.c | 1 - 1 file changed, 1 deletion(-) diff --git a/drivers/block/ps3disk.c b/drivers/block/ps3disk.c index c5c6487a19d5..7b55811c2a81 100644 --- a/drivers/block/ps3disk.c +++ b/drivers/block/ps3disk.c @@ -454,7 +454,6 @@ static int ps3disk_probe(struct ps3_system_bus_device *_dev) queue->queuedata = dev; blk_queue_max_hw_sectors(queue, dev->bounce_size >> 9); - blk_queue_segment_boundary(queue, -1UL); blk_queue_dma_alignment(queue, dev->blk_size-1); blk_queue_logical_block_size(queue, dev->blk_size); -- 2.20.1
[PATCH 9/9] powerpc/ps3: Add udbg_panic
BUG_ON() won't work in the early init code, so replace it with a new routine udbg_panic() that uses udbg_printf() and lv1_panic(). Signed-off-by: Geoff Levand --- arch/powerpc/platforms/ps3/mm.c | 9 - 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/platforms/ps3/mm.c b/arch/powerpc/platforms/ps3/mm.c index 423be34f0f5f..7c7c2f53eacb 100644 --- a/arch/powerpc/platforms/ps3/mm.c +++ b/arch/powerpc/platforms/ps3/mm.c @@ -26,6 +26,13 @@ #define DBG pr_devel #endif +#define udbg_panic(_val, _msg) \ +if (_val) { \ + udbg_printf("%s:%d: " _msg ": %d\n", \ + __func__, __LINE__, (int)(_val)); \ + lv1_panic(0); \ +} + enum { #if defined(CONFIG_PS3_DYNAMIC_DMA) USE_DYNAMIC_DMA = 1, @@ -313,7 +320,7 @@ static void ps3_mm_region_destroy(struct mem_region *r) if (r->base) { result = lv1_release_memory(r->base); - BUG_ON(result); + udbg_panic(result, "lv1_release_memory failed"); r->size = r->base = r->offset = 0; map.total = map.rm.size; } -- 2.20.1
[PATCH 3/9] net/ps3_gelic_net: Remove duplicate error message
From: Markus Elfring Remove an extra message for a memory allocation failure in function gelic_descr_prepare_rx(). Signed-off-by: Markus Elfring Signed-off-by: Geoff Levand --- drivers/net/ethernet/toshiba/ps3_gelic_net.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/drivers/net/ethernet/toshiba/ps3_gelic_net.c b/drivers/net/ethernet/toshiba/ps3_gelic_net.c index 070dd6fa9401..8522f3898e0d 100644 --- a/drivers/net/ethernet/toshiba/ps3_gelic_net.c +++ b/drivers/net/ethernet/toshiba/ps3_gelic_net.c @@ -382,8 +382,6 @@ static int gelic_descr_prepare_rx(struct gelic_card *card, descr->skb = dev_alloc_skb(bufsize + GELIC_NET_RXBUF_ALIGN - 1); if (!descr->skb) { descr->buf_addr = 0; /* tell DMAC don't touch memory */ - dev_info(ctodev(card), -"%s:allocate skb failed !!\n", __func__); return -ENOMEM; } descr->buf_size = cpu_to_be32(bufsize); -- 2.20.1
[PATCH 8/9] powerpc/ps3: Add lv1_panic
lv1_panic takes a single parameter, 0=halt, 1=reboot, and it will never return. Signed-off-by: Geoff Levand --- arch/powerpc/boot/ppc_asm.h| 6 ++ arch/powerpc/include/asm/ppc_asm.h | 6 ++ 2 files changed, 12 insertions(+) diff --git a/arch/powerpc/boot/ppc_asm.h b/arch/powerpc/boot/ppc_asm.h index 192b97523b05..cf758bc63846 100644 --- a/arch/powerpc/boot/ppc_asm.h +++ b/arch/powerpc/boot/ppc_asm.h @@ -8,6 +8,12 @@ * Copyright (C) 1995-1999 Gary Thomas, Paul Mackerras, Cort Dougan. */ +.macro lv1_panic + li r3, 0 + li r11, 255 + .long 0x4422 +.endm + /* Condition Register Bit Fields */ #definecr0 0 diff --git a/arch/powerpc/include/asm/ppc_asm.h b/arch/powerpc/include/asm/ppc_asm.h index 6b03dff61a05..e76a6a4020ea 100644 --- a/arch/powerpc/include/asm/ppc_asm.h +++ b/arch/powerpc/include/asm/ppc_asm.h @@ -13,6 +13,12 @@ #ifdef __ASSEMBLY__ +.macro lv1_panic + li r3, 0 + li r11, 255 + .long 0x4422 +.endm + #define SZL(BITS_PER_LONG/8) /* -- 2.20.1
[PATCH 0/9] PS3 patches for v5.7
Hi Michael, Here are a few PS3 specific patches. A few remove some reduntant messages, a few add some minor debugging support, and a few fix some problems during system boot. Please consider for v5.7. -Geoff The following changes since commit 16fbf79b0f83bc752cee8589279f1ebfe57b3b6e: Linux 5.6-rc7 (2020-03-22 18:31:56 -0700) are available in the Git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/geoff/ps3-linux.git for-merge-ps3 for you to fetch changes up to 1333a8985c4190763c9c0312bcefad8b1ea863c7: powerpc/ps3: Add udbg_panic (2020-03-27 13:07:31 -0700) Dan Carpenter (1): powerpc/ps3: remove an unneeded NULL check Emmanuel Nicolet (1): ps3disk: use the default segment boundary Geoff Levand (4): powerpc/ps3: Set CONFIG_UEVENT_HELPER=y in ps3_defconfig powerpc/ps3: Add check for otheros image size powerpc/ps3: Add lv1_panic powerpc/ps3: Add udbg_panic Markus Elfring (3): powerpc/ps3: Remove duplicate error messages drivers/ps3: Remove duplicate error messages net/ps3_gelic_net: Remove duplicate error message arch/powerpc/boot/ppc_asm.h | 6 ++ arch/powerpc/boot/wrapper| 13 +++-- arch/powerpc/configs/ps3_defconfig | 2 ++ arch/powerpc/include/asm/ppc_asm.h | 6 ++ arch/powerpc/platforms/ps3/mm.c | 9 - arch/powerpc/platforms/ps3/os-area.c | 4 +--- drivers/block/ps3disk.c | 1 - drivers/net/ethernet/toshiba/ps3_gelic_net.c | 2 -- drivers/ps3/ps3-lpm.c| 2 -- drivers/ps3/ps3-vuart.c | 1 - drivers/ps3/sys-manager-core.c | 2 +- 11 files changed, 35 insertions(+), 13 deletions(-) -- 2.20.1
[PATCH 4/9] powerpc/ps3: remove an unneeded NULL check
From: Dan Carpenter Static checkers don't like the inconsistent NULL checking on "ops". This function is only called once and "ops" isn't NULL so the check can be removed. Signed-off-by: Dan Carpenter Signed-off-by: Geoff Levand --- drivers/ps3/sys-manager-core.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/ps3/sys-manager-core.c b/drivers/ps3/sys-manager-core.c index 24709c572c0c..e061b7d0632b 100644 --- a/drivers/ps3/sys-manager-core.c +++ b/drivers/ps3/sys-manager-core.c @@ -31,7 +31,7 @@ void ps3_sys_manager_register_ops(const struct ps3_sys_manager_ops *ops) { BUG_ON(!ops); BUG_ON(!ops->dev); - ps3_sys_manager_ops = ops ? *ops : ps3_sys_manager_ops; + ps3_sys_manager_ops = *ops; } EXPORT_SYMBOL_GPL(ps3_sys_manager_register_ops); -- 2.20.1
[PATCH 7/9] powerpc/ps3: Add check for otheros image size
The ps3's otheros flash loader has a size limit of 16 MiB for the uncompressed image. If that limit will be reached output the flash image file as 'otheros-too-big.bld'. Signed-off-by: Geoff Levand --- arch/powerpc/boot/wrapper | 13 +++-- 1 file changed, 11 insertions(+), 2 deletions(-) diff --git a/arch/powerpc/boot/wrapper b/arch/powerpc/boot/wrapper index ed6266367bc0..1dfd9fd929c8 100755 --- a/arch/powerpc/boot/wrapper +++ b/arch/powerpc/boot/wrapper @@ -570,7 +570,16 @@ ps3) count=$overlay_size bs=1 odir="$(dirname "$ofile.bin")" -rm -f "$odir/otheros.bld" -gzip -n --force -9 --stdout "$ofile.bin" > "$odir/otheros.bld" + +# The ps3's flash loader has a size limit of 16 MiB for the uncompressed +# image. If a compressed image that exceeded this limit is written to +# flash the loader will decompress that image until the 16 MiB limit is +# reached, then enter the system reset vector of the partially decompressed +# image. No warning is issued. +rm -f "$odir"/{otheros,otheros-too-big}.bld +size=$(${CROSS}nm --no-sort --radix=d "$ofile" | egrep ' _end$' | cut -d' ' -f1) +bld="otheros.bld" +[ $size -le 16777216 ] || bld="otheros-too-big.bld" +gzip -n --force -9 --stdout "$ofile.bin" > "$odir/$bld" ;; esac -- 2.20.1
[PATCH 1/9] powerpc/ps3: Remove duplicate error messages
From: Markus Elfring Remove duplicate memory allocation failure error messages. Signed-off-by: Markus Elfring Signed-off-by: Geoff Levand --- arch/powerpc/platforms/ps3/os-area.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/arch/powerpc/platforms/ps3/os-area.c b/arch/powerpc/platforms/ps3/os-area.c index cbddd63caf2d..e8530371aed6 100644 --- a/arch/powerpc/platforms/ps3/os-area.c +++ b/arch/powerpc/platforms/ps3/os-area.c @@ -613,10 +613,8 @@ static int update_flash_db(void) /* Read in header and db from flash. */ header = kmalloc(buf_len, GFP_KERNEL); - if (!header) { - pr_debug("%s: kmalloc failed\n", __func__); + if (!header) return -ENOMEM; - } count = os_area_flash_read(header, buf_len, 0); if (count < 0) { -- 2.20.1
Re: [PATCH v2] powerpc xmon: use `dcbf` inplace of `dcbi` instruction for 64bit Book3S
Le 27/03/2020 à 19:19, Segher Boessenkool a écrit : On Fri, Mar 27, 2020 at 04:12:13PM +0100, Christophe Leroy wrote: Maybe you could also change invalidate_dcache_range(): for (i = 0; i < size >> shift; i++, addr += bytes) { if (IS_ENABLED(CONFIG_PPC_BOOK3S_64)) dcbf(addr); else dcbi(addr); } But please note that flushing is pretty much the opposite from invalidating (a flush (dcbf) makes sure that what is in the cache now ends up in memory, while an invalidate (dcbi) makes sure it will *not* end up in memory). (Both will remove the addressed cache line from the data caches). So you cannot blindly replace them; in all cases you need to look and see if it does what you need here. (dcbi is much harder to use correctly -- it can race very easily -- so in practice you will be fine most of the time; but be careful around startup code and the like). At the time being, invalidate_dcache_range() is used in only one place, and that's a place for PPC32 only. So I was just suggesting that just in case. Maybe there is no point in bothering with that at the time being. Christophe
Re: [patch V3 12/20] powerpc/ps3: Convert half completion to rcuwait
Hi, On 3/21/20 4:25 AM, Thomas Gleixner wrote: > From: Thomas Gleixner > > The PS3 notification interrupt and kthread use a hacked up completion to > communicate. Since we're wanting to change the completion implementation and > this is abuse anyway, replace it with a simple rcuwait since there is only > ever > the one waiter. > > AFAICT the kthread uses TASK_INTERRUPTIBLE to not increase loadavg, kthreads > cannot receive signals by default and this one doesn't look different. Use > TASK_IDLE instead. I tested the patch set applied against v5.6-rc7 on the PS3 and it worked as expected. Tested by: Geoff Levand
Re: [PATCH v2] powerpc/boot: Delete unneeded .globl _zimage_start
On Fri, Mar 27, 2020 at 09:50:54AM -0700, Fangrui Song wrote: > We aim for compatibility with GNU in many aspects to make it easier for > people to switch over. However, just because there is a subtle behavior > in GNU toolchain does not mean we need to emulate that behavior. It isn't subtle. It is the explicit documented behaviour. > With > all due respect, there are a large quantity of legacy behaviors we don't > want to support. And it isn't legacy at all. It is simply the defined semantics. It is semantics that real code relies on (and has for ages) as well. Segher
Re: [PATCH v1] powerpc: Make setjmp/longjump signature standard
On Fri, Mar 27, 2020 at 06:45:21PM +0100, Christophe Leroy wrote: > Subject line, change longjump to longjmp > > Le 27/03/2020 à 11:07, Clement Courbet a écrit : > > Declaring setjmp()/longjmp() as taking longs makes the signature > > non-standard, and makes clang complain. In the past, this has been > > worked around by adding -ffreestanding to the compile flags. > > > > The implementation looks like it only ever propagates the value > > (in longjmp) or sets it to 1 (in setjmp), and we only call longjmp > > with integer parameters. > > > > This allows removing -ffreestanding from the compilation flags. > > > > Context: > > https://lore.kernel.org/patchwork/patch/1214060 > > https://lore.kernel.org/patchwork/patch/1216174 > > > > Signed-off-by: Clement Courbet > > --- > > arch/powerpc/include/asm/setjmp.h | 6 -- > > arch/powerpc/kexec/Makefile | 3 --- > > 2 files changed, 4 insertions(+), 5 deletions(-) > > > > diff --git a/arch/powerpc/include/asm/setjmp.h > > b/arch/powerpc/include/asm/setjmp.h > > index e9f81bb3f83b..84bb0d140d59 100644 > > --- a/arch/powerpc/include/asm/setjmp.h > > +++ b/arch/powerpc/include/asm/setjmp.h > > @@ -7,7 +7,9 @@ > > #define JMP_BUF_LEN23 > > -extern long setjmp(long *) __attribute__((returns_twice)); > > -extern void longjmp(long *, long) __attribute__((noreturn)); > > +typedef long *jmp_buf; > > Do we need that new opaque typedef ? Why not just keep long * ? Yes, otherwise the warning comes back: In file included from arch/powerpc/kexec/crash.c:25: arch/powerpc/include/asm/setjmp.h:10:12: error: declaration of built-in function 'setjmp' requires the declaration of the 'jmp_buf' type, commonly provided in the header . [-Werror,-Wincomplete-setjmp-declaration] extern int setjmp(long *env) __attribute__((returns_twice)); ^ arch/powerpc/include/asm/setjmp.h:11:13: error: declaration of built-in function 'longjmp' requires the declaration of the 'jmp_buf' type, commonly provided in the header . [-Werror,-Wincomplete-setjmp-declaration] extern void longjmp(long *env, int val) __attribute__((noreturn)); ^ 2 errors generated. > > + > > +extern int setjmp(jmp_buf env) __attribute__((returns_twice)); > > +extern void longjmp(jmp_buf env, int val) __attribute__((noreturn)); > > #endif /* _ASM_POWERPC_SETJMP_H */ > > diff --git a/arch/powerpc/kexec/Makefile b/arch/powerpc/kexec/Makefile > > index 378f6108a414..86380c69f5ce 100644 > > --- a/arch/powerpc/kexec/Makefile > > +++ b/arch/powerpc/kexec/Makefile > > @@ -3,9 +3,6 @@ > > # Makefile for the linux kernel. > > # > > -# Avoid clang warnings around longjmp/setjmp declarations > > -CFLAGS_crash.o += -ffreestanding > > - > > obj-y += core.o crash.o core_$(BITS).o > > obj-$(CONFIG_PPC32) += relocate_32.o > > > > Christophe
Re: [PATCH v2] powerpc xmon: use `dcbf` inplace of `dcbi` instruction for 64bit Book3S
On Fri, Mar 27, 2020 at 04:12:13PM +0100, Christophe Leroy wrote: > Maybe you could also change invalidate_dcache_range(): > > for (i = 0; i < size >> shift; i++, addr += bytes) { > if (IS_ENABLED(CONFIG_PPC_BOOK3S_64)) > dcbf(addr); > else > dcbi(addr); > } But please note that flushing is pretty much the opposite from invalidating (a flush (dcbf) makes sure that what is in the cache now ends up in memory, while an invalidate (dcbi) makes sure it will *not* end up in memory). (Both will remove the addressed cache line from the data caches). So you cannot blindly replace them; in all cases you need to look and see if it does what you need here. (dcbi is much harder to use correctly -- it can race very easily -- so in practice you will be fine most of the time; but be careful around startup code and the like). Segher
[PATCH V2 5/5] selftests/powerpc: Add README for GZIP engine tests
Include a README file with the instructions to use the testcases at selftests/powerpc/nx-gzip. Signed-off-by: Bulent Abali Signed-off-by: Raphael Moreira Zinsly --- .../powerpc/nx-gzip/99-nx-gzip.rules | 1 + .../testing/selftests/powerpc/nx-gzip/README | 44 +++ 2 files changed, 45 insertions(+) create mode 100644 tools/testing/selftests/powerpc/nx-gzip/99-nx-gzip.rules create mode 100644 tools/testing/selftests/powerpc/nx-gzip/README diff --git a/tools/testing/selftests/powerpc/nx-gzip/99-nx-gzip.rules b/tools/testing/selftests/powerpc/nx-gzip/99-nx-gzip.rules new file mode 100644 index ..5a7118495cb3 --- /dev/null +++ b/tools/testing/selftests/powerpc/nx-gzip/99-nx-gzip.rules @@ -0,0 +1 @@ +SUBSYSTEM=="nxgzip", KERNEL=="nx-gzip", MODE="0666" diff --git a/tools/testing/selftests/powerpc/nx-gzip/README b/tools/testing/selftests/powerpc/nx-gzip/README new file mode 100644 index ..a80c289f1d2c --- /dev/null +++ b/tools/testing/selftests/powerpc/nx-gzip/README @@ -0,0 +1,44 @@ +Test the nx-gzip function: += + +Verify that following device exists: + /dev/crypto/nx-gzip +If you get a permission error run as sudo or set the device permissions: + sudo chmod go+rw /dev/crypto/nx-gzip +However, chmod may not survive across boots. You may create a udev file such +as: + /etc/udev/rules.d/99-nx-gzip.rules + + +Then make and run: +$ make +gcc -O3 -I./inc -o gzfht_test gzfht_test.c gzip_vas.c +gcc -O3 -I./inc -o gunz_test gunz_test.c gzip_vas.c + + +Compress any file using Fixed Huffman mode. Output will have a .nx.gz suffix: +$ ./gzfht_test gzip_vas.c +file gzip_vas.c read, 5218 bytes +compressed 5218 to 2545 bytes total, crc32 checksum = 817543a3 + + +Uncompress the previous output. Output will have a .nx.gunzip suffix: +./gunz_test gzip_vas.c.nx.gz +gzHeader FLG 0 +00 00 00 00 04 03 +gzHeader MTIME, XFL, OS ignored +computed checksum 817543a3 isize 1462 +stored checksum 817543a3 isize 1462 +decomp is complete: fclose + + +Compare two files: +$ sha1sum gzip_vas.c.nx.gz.nx.gunzip gzip_vas.c +4e87536f3ee9e771ef30fb0fb27572032ca44ef8 gzip_vas.c.nx.gz.nx.gunzip +4e87536f3ee9e771ef30fb0fb27572032ca44ef8 gzip_vas.c + + +Note that the code here are intended for testing the nx-gzip hardware function. +They are not intended for demonstrating performance or compression ratio. +For more information and source code consider using: +https://github.com/libnxz/power-gzip -- 2.21.0
[PATCH V2 4/5] selftests/powerpc: Add NX-GZIP engine decompress testcase
Include a decompression testcase for the powerpc NX-GZIP engine. Signed-off-by: Bulent Abali Signed-off-by: Raphael Moreira Zinsly --- .../selftests/powerpc/nx-gzip/Makefile|7 +- .../selftests/powerpc/nx-gzip/gunz_test.c | 1078 + 2 files changed, 1082 insertions(+), 3 deletions(-) create mode 100644 tools/testing/selftests/powerpc/nx-gzip/gunz_test.c diff --git a/tools/testing/selftests/powerpc/nx-gzip/Makefile b/tools/testing/selftests/powerpc/nx-gzip/Makefile index ab903f63bbbd..82abc19a49a0 100644 --- a/tools/testing/selftests/powerpc/nx-gzip/Makefile +++ b/tools/testing/selftests/powerpc/nx-gzip/Makefile @@ -1,9 +1,9 @@ CC = gcc CFLAGS = -O3 INC = ./inc -SRC = gzfht_test.c +SRC = gzfht_test.c gunz_test.c OBJ = $(SRC:.c=.o) -TESTS = gzfht_test +TESTS = gzfht_test gunz_test EXTRA_SOURCES = gzip_vas.c all: $(TESTS) @@ -16,6 +16,7 @@ $(TESTS): $(OBJ) run_tests: $(TESTS) ./gzfht_test gzip_vas.c + ./gunz_test gzip_vas.c.nx.gz clean: - rm -f $(TESTS) *.o *~ *.gz + rm -f $(TESTS) *.o *~ *.gz *.gunzip diff --git a/tools/testing/selftests/powerpc/nx-gzip/gunz_test.c b/tools/testing/selftests/powerpc/nx-gzip/gunz_test.c new file mode 100644 index ..82eb268a8397 --- /dev/null +++ b/tools/testing/selftests/powerpc/nx-gzip/gunz_test.c @@ -0,0 +1,1078 @@ +// SPDX-License-Identifier: GPL-2.0-or-later + +/* P9 gunzip sample code for demonstrating the P9 NX hardware + * interface. Not intended for productive uses or for performance or + * compression ratio measurements. Note also that /dev/crypto/gzip, + * VAS and skiboot support are required + * + * Copyright 2020 IBM Corp. + * + * Author: Bulent Abali + * + * https://github.com/libnxz/power-gzip for zlib api and other utils + * Definitions of acronyms used here. See + * P9 NX Gzip Accelerator User's Manual for details: + * https://github.com/libnxz/power-gzip/blob/develop/doc/power_nx_gzip_um.pdf + * + * adler/crc: 32 bit checksums appended to stream tail + * ce: completion extension + * cpb: coprocessor parameter block (metadata) + * crb: coprocessor request block (command) + * csb: coprocessor status block (status) + * dht: dynamic huffman table + * dde: data descriptor element (address, length) + * ddl: list of ddes + * dh/fh:dynamic and fixed huffman types + * fc: coprocessor function code + * histlen: history/dictionary length + * history: sliding window of up to 32KB of data + * lzcount: Deflate LZ symbol counts + * rembytecnt: remaining byte count + * sfbt: source final block type; last block's type during decomp + * spbc: source processed byte count + * subc: source unprocessed bit count + * tebc: target ending bit count; valid bits in the last byte + * tpbc: target processed byte count + * vas: virtual accelerator switch; the user mode interface + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include "nxu.h" +#include "nx.h" +#include "crb.h" + +int nx_dbg; +FILE *nx_gzip_log; + +#define NX_MIN(X, Y) (((X) < (Y))?(X):(Y)) +#define NX_MAX(X, Y) (((X) > (Y))?(X):(Y)) + +#define GETINPC(X) fgetc(X) +#define FNAME_MAX 1024 + +/* fifo queue management */ +#define fifo_used_bytes(used) (used) +#define fifo_free_bytes(used, len) ((len)-(used)) +/* amount of free bytes in the first and last parts */ +#define fifo_free_first_bytes(cur, used, len) cur)+(used)) <= (len)) \ + ? (len)-((cur)+(used)) : 0) +#define fifo_free_last_bytes(cur, used, len) cur)+(used)) <= (len)) \ + ? (cur) : (len)-(used)) +/* amount of used bytes in the first and last parts */ +#define fifo_used_first_bytes(cur, used, len) cur)+(used)) <= (len)) \ + ? (used) : (len)-(cur)) +#define fifo_used_last_bytes(cur, used, len) cur)+(used)) <= (len)) \ + ? 0 : ((used)+(cur))-(len)) +/* first and last free parts start here */ +#define fifo_free_first_offset(cur, used) ((cur)+(used)) +#define fifo_free_last_offset(cur, used, len) \ + fifo_used_last_bytes(cur, used, len) +/* first and last used parts start here */ +#define fifo_used_first_offset(cur)(cur) +#define fifo_used_last_offset(cur) (0) + +const int fifo_in_len = 1<<24; +const int fifo_out_len = 1<<24; +const int page_sz = 1<<16; +const int line_sz = 1<<7; +const int window_max = 1<<15; +const int retry_max = 50; + +void *nx_fault_storage_address; + +/* + * Fault in pages prior to NX job submission. wr=1 may be required to + * touch writeable pages. System zero pages do not fault-in the page as + * intended. Typically set wr=1 for NX target pages and set wr=0 fo
[PATCH V2 2/5] selftests/powerpc: Add header files for NX compresion/decompression
Add files to be able to compress and decompress files using the powerpc NX-GZIP engine. Signed-off-by: Bulent Abali Signed-off-by: Raphael Moreira Zinsly --- .../powerpc/nx-gzip/inc/copy-paste.h | 54 ++ .../selftests/powerpc/nx-gzip/inc/nx_dbg.h| 95 +++ .../selftests/powerpc/nx-gzip/inc/nxu.h | 651 ++ 3 files changed, 800 insertions(+) create mode 100644 tools/testing/selftests/powerpc/nx-gzip/inc/copy-paste.h create mode 100644 tools/testing/selftests/powerpc/nx-gzip/inc/nx_dbg.h create mode 100644 tools/testing/selftests/powerpc/nx-gzip/inc/nxu.h diff --git a/tools/testing/selftests/powerpc/nx-gzip/inc/copy-paste.h b/tools/testing/selftests/powerpc/nx-gzip/inc/copy-paste.h new file mode 100644 index ..107139b6c7df --- /dev/null +++ b/tools/testing/selftests/powerpc/nx-gzip/inc/copy-paste.h @@ -0,0 +1,54 @@ +/* SPDX-License-Identifier: GPL-2.0-or-later */ + +#include "nx-helpers.h" + +/* + * Macros taken from arch/powerpc/include/asm/ppc-opcode.h and other + * header files. + */ +#define ___PPC_RA(a)(((a) & 0x1f) << 16) +#define ___PPC_RB(b)(((b) & 0x1f) << 11) + +#define PPC_INST_COPY 0x7c20060c +#define PPC_INST_PASTE 0x7c20070d + +#define PPC_COPY(a, b) stringify_in_c(.long PPC_INST_COPY | \ + ___PPC_RA(a) | ___PPC_RB(b)) +#define PPC_PASTE(a, b) stringify_in_c(.long PPC_INST_PASTE | \ + ___PPC_RA(a) | ___PPC_RB(b)) +#define CR0_SHIFT 28 +#define CR0_MASK 0xF +/* + * Copy/paste instructions: + * + * copy RA,RB + * Copy contents of address (RA) + effective_address(RB) + * to internal copy-buffer. + * + * paste RA,RB + * Paste contents of internal copy-buffer to the address + * (RA) + effective_address(RB) + */ +static inline int vas_copy(void *crb, int offset) +{ + asm volatile(PPC_COPY(%0, %1)";" + : + : "b" (offset), "b" (crb) + : "memory"); + + return 0; +} + +static inline int vas_paste(void *paste_address, int offset) +{ + u32 cr; + + cr = 0; + asm volatile(PPC_PASTE(%1, %2)";" + "mfocrf %0, 0x80;" + : "=r" (cr) + : "b" (offset), "b" (paste_address) + : "memory", "cr0"); + + return (cr >> CR0_SHIFT) & CR0_MASK; +} diff --git a/tools/testing/selftests/powerpc/nx-gzip/inc/nx_dbg.h b/tools/testing/selftests/powerpc/nx-gzip/inc/nx_dbg.h new file mode 100644 index ..f2c0eee2317e --- /dev/null +++ b/tools/testing/selftests/powerpc/nx-gzip/inc/nx_dbg.h @@ -0,0 +1,95 @@ +/* SPDX-License-Identifier: GPL-2.0-or-later + * + * Copyright 2020 IBM Corporation + * + */ + +#ifndef _NXU_DBG_H_ +#define _NXU_DBG_H_ + +#include +#include +#include +#include +#include + +extern FILE * nx_gzip_log; +extern int nx_gzip_trace; +extern unsigned int nx_gzip_inflate_impl; +extern unsigned int nx_gzip_deflate_impl; +extern unsigned int nx_gzip_inflate_flags; +extern unsigned int nx_gzip_deflate_flags; + +extern int nx_dbg; +pthread_mutex_t mutex_log; + +#define nx_gzip_trace_enabled() (nx_gzip_trace & 0x1) +#define nx_gzip_hw_trace_enabled()(nx_gzip_trace & 0x2) +#define nx_gzip_sw_trace_enabled()(nx_gzip_trace & 0x4) +#define nx_gzip_gather_statistics() (nx_gzip_trace & 0x8) +#define nx_gzip_per_stream_stat() (nx_gzip_trace & 0x10) + +#define prt(fmt, ...) do { \ + pthread_mutex_lock(&mutex_log); \ + flock(nx_gzip_log->_fileno, LOCK_EX); \ + time_t t; struct tm *m; time(&t); m = localtime(&t);\ + fprintf(nx_gzip_log, "[%04d/%02d/%02d %02d:%02d:%02d] " \ + "pid %d: " fmt, \ + (int)m->tm_year + 1900, (int)m->tm_mon+1, (int)m->tm_mday, \ + (int)m->tm_hour, (int)m->tm_min, (int)m->tm_sec,\ + (int)getpid(), ## __VA_ARGS__); \ + fflush(nx_gzip_log);\ + flock(nx_gzip_log->_fileno, LOCK_UN); \ + pthread_mutex_unlock(&mutex_log); \ +} while (0) + +/* Use in case of an error */ +#define prt_err(fmt, ...) do { if (nx_dbg >= 0) { \ + prt("%s:%u: Error: "fmt,\ + __FILE__, __LINE__, ## __VA_ARGS__);\ +}} while (0) + +/* Use in case of an warning */ +#define prt_warn(fmt, ...) do {if (nx_dbg >= 1) { \ + prt("%s:%u: Warning: "fmt, \ + __FILE__, __LINE__, ## __VA_ARGS__);\ +}} while (0) + +/* Informational printouts */ +#define prt_info(fmt, ...) do {if (nx_dbg >= 2) {
[PATCH V2 3/5] selftests/powerpc: Add NX-GZIP engine compress testcase
Add a compression testcase for the powerpc NX-GZIP engine. Signed-off-by: Bulent Abali Signed-off-by: Raphael Moreira Zinsly --- .../selftests/powerpc/nx-gzip/Makefile| 21 + .../selftests/powerpc/nx-gzip/gzfht_test.c| 489 ++ .../selftests/powerpc/nx-gzip/gzip_vas.c | 259 ++ 3 files changed, 769 insertions(+) create mode 100644 tools/testing/selftests/powerpc/nx-gzip/Makefile create mode 100644 tools/testing/selftests/powerpc/nx-gzip/gzfht_test.c create mode 100644 tools/testing/selftests/powerpc/nx-gzip/gzip_vas.c diff --git a/tools/testing/selftests/powerpc/nx-gzip/Makefile b/tools/testing/selftests/powerpc/nx-gzip/Makefile new file mode 100644 index ..ab903f63bbbd --- /dev/null +++ b/tools/testing/selftests/powerpc/nx-gzip/Makefile @@ -0,0 +1,21 @@ +CC = gcc +CFLAGS = -O3 +INC = ./inc +SRC = gzfht_test.c +OBJ = $(SRC:.c=.o) +TESTS = gzfht_test +EXTRA_SOURCES = gzip_vas.c + +all: $(TESTS) + +$(OBJ): %.o: %.c + $(CC) $(CFLAGS) -I$(INC) -c $< + +$(TESTS): $(OBJ) + $(CC) $(CFLAGS) -I$(INC) -o $@ $@.o $(EXTRA_SOURCES) + +run_tests: $(TESTS) + ./gzfht_test gzip_vas.c + +clean: + rm -f $(TESTS) *.o *~ *.gz diff --git a/tools/testing/selftests/powerpc/nx-gzip/gzfht_test.c b/tools/testing/selftests/powerpc/nx-gzip/gzfht_test.c new file mode 100644 index ..7a21c25f5611 --- /dev/null +++ b/tools/testing/selftests/powerpc/nx-gzip/gzfht_test.c @@ -0,0 +1,489 @@ +// SPDX-License-Identifier: GPL-2.0-or-later + +/* P9 gzip sample code for demonstrating the P9 NX hardware interface. + * Not intended for productive uses or for performance or compression + * ratio measurements. For simplicity of demonstration, this sample + * code compresses in to fixed Huffman blocks only (Deflate btype=1) + * and has very simple memory management. Dynamic Huffman blocks + * (Deflate btype=2) are more involved as detailed in the user guide. + * Note also that /dev/crypto/gzip, VAS and skiboot support are + * required. + * + * Copyright 2020 IBM Corp. + * + * https://github.com/libnxz/power-gzip for zlib api and other utils + * + * Author: Bulent Abali + * + * Definitions of acronyms used here. See + * P9 NX Gzip Accelerator User's Manual for details: + * https://github.com/libnxz/power-gzip/blob/develop/doc/power_nx_gzip_um.pdf + * + * adler/crc: 32 bit checksums appended to stream tail + * ce: completion extension + * cpb: coprocessor parameter block (metadata) + * crb: coprocessor request block (command) + * csb: coprocessor status block (status) + * dht: dynamic huffman table + * dde: data descriptor element (address, length) + * ddl: list of ddes + * dh/fh:dynamic and fixed huffman types + * fc: coprocessor function code + * histlen: history/dictionary length + * history: sliding window of up to 32KB of data + * lzcount: Deflate LZ symbol counts + * rembytecnt: remaining byte count + * sfbt: source final block type; last block's type during decomp + * spbc: source processed byte count + * subc: source unprocessed bit count + * tebc: target ending bit count; valid bits in the last byte + * tpbc: target processed byte count + * vas: virtual accelerator switch; the user mode interface + */ + + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include "nxu.h" +#include "nx.h" + +int nx_dbg; +FILE *nx_gzip_log; +void *nx_fault_storage_address; + +#define NX_MIN(X, Y) (((X) < (Y)) ? (X) : (Y)) +#define FNAME_MAX 1024 +#define FEXT ".nx.gz" + +/* + * LZ counts returned in the user supplied nx_gzip_crb_cpb_t structure. + */ +static int compress_fht_sample(char *src, uint32_t srclen, char *dst, + uint32_t dstlen, int with_count, + struct nx_gzip_crb_cpb_t *cmdp, void *handle) +{ + int cc; + uint32_t fc; + + assert(!!cmdp); + + put32(cmdp->crb, gzip_fc, 0); /* clear */ + fc = (with_count) ? GZIP_FC_COMPRESS_RESUME_FHT_COUNT : + GZIP_FC_COMPRESS_RESUME_FHT; + putnn(cmdp->crb, gzip_fc, fc); + putnn(cmdp->cpb, in_histlen, 0); /* resuming with no history */ + memset((void *) &cmdp->crb.csb, 0, sizeof(cmdp->crb.csb)); + + /* Section 6.6 programming notes; spbc may be in two different +* places depending on FC. +*/ + if (!with_count) + put32(cmdp->cpb, out_spbc_comp, 0); + else + put32(cmdp->cpb, out_spbc_comp_with_count, 0); + + /* Figure 6-3 6-4; CSB location */ + put64(cmdp->crb, csb_address, 0); + put64(cmdp->crb, csb_address, + (uint64_t) &cmdp->crb.csb & csb_address_mask); + + /* Source direct dde (scatter-gather list) */ + clear_dde(cmdp->crb.source_dde); + putnn(cmdp->crb.source_dde, dde_count, 0);
[PATCH V2 1/5] selftests/powerpc: Add header files for GZIP engine test
Add files to access the powerpc NX-GZIP engine in user space. Signed-off-by: Bulent Abali Signed-off-by: Raphael Moreira Zinsly --- .../selftests/powerpc/nx-gzip/inc/crb.h | 159 ++ .../selftests/powerpc/nx-gzip/inc/nx-gzip.h | 27 +++ .../powerpc/nx-gzip/inc/nx-helpers.h | 54 ++ .../selftests/powerpc/nx-gzip/inc/nx.h| 38 + 4 files changed, 278 insertions(+) create mode 100644 tools/testing/selftests/powerpc/nx-gzip/inc/crb.h create mode 100644 tools/testing/selftests/powerpc/nx-gzip/inc/nx-gzip.h create mode 100644 tools/testing/selftests/powerpc/nx-gzip/inc/nx-helpers.h create mode 100644 tools/testing/selftests/powerpc/nx-gzip/inc/nx.h diff --git a/tools/testing/selftests/powerpc/nx-gzip/inc/crb.h b/tools/testing/selftests/powerpc/nx-gzip/inc/crb.h new file mode 100644 index ..9056e3dc1831 --- /dev/null +++ b/tools/testing/selftests/powerpc/nx-gzip/inc/crb.h @@ -0,0 +1,159 @@ +/* SPDX-License-Identifier: GPL-2.0-or-later */ +#ifndef __CRB_H +#define __CRB_H +#include +#include "nx.h" + +typedef unsigned char u8; +typedef unsigned int u32; +typedef unsigned long long u64; + +/* CCW 842 CI/FC masks + * NX P8 workbook, section 4.3.1, figure 4-6 + * "CI/FC Boundary by NX CT type" + */ +#define CCW_CI_842 (0x3ff8) +#define CCW_FC_842 (0x0007) + +/* Chapter 6.5.8 Coprocessor-Completion Block (CCB) */ + +#define CCB_VALUE (0x3fff) +#define CCB_ADDRESS(0xfff8) +#define CCB_CM (0x0007) +#define CCB_CM0(0x0004) +#define CCB_CM12 (0x0003) + +#define CCB_CM0_ALL_COMPLETIONS(0x0) +#define CCB_CM0_LAST_IN_CHAIN (0x4) +#define CCB_CM12_STORE (0x0) +#define CCB_CM12_INTERRUPT (0x1) + +#define CCB_SIZE (0x10) +#define CCB_ALIGN CCB_SIZE + +struct coprocessor_completion_block { + __be64 value; + __be64 address; +} __aligned(CCB_ALIGN); + + +/* Chapter 6.5.7 Coprocessor-Status Block (CSB) */ + +#define CSB_V (0x80) +#define CSB_F (0x04) +#define CSB_CH (0x03) +#define CSB_CE_INCOMPLETE (0x80) +#define CSB_CE_TERMINATION (0x40) +#define CSB_CE_TPBC(0x20) + +#define CSB_CC_SUCCESS (0) +#define CSB_CC_INVALID_ALIGN (1) +#define CSB_CC_OPERAND_OVERLAP (2) +#define CSB_CC_DATA_LENGTH (3) +#define CSB_CC_TRANSLATION (5) +#define CSB_CC_PROTECTION (6) +#define CSB_CC_RD_EXTERNAL (7) +#define CSB_CC_INVALID_OPERAND (8) +#define CSB_CC_PRIVILEGE (9) +#define CSB_CC_INTERNAL(10) +#define CSB_CC_WR_EXTERNAL (12) +#define CSB_CC_NOSPC (13) +#define CSB_CC_EXCESSIVE_DDE (14) +#define CSB_CC_WR_TRANSLATION (15) +#define CSB_CC_WR_PROTECTION (16) +#define CSB_CC_UNKNOWN_CODE(17) +#define CSB_CC_ABORT (18) +#define CSB_CC_TRANSPORT (20) +#define CSB_CC_SEGMENTED_DDL (31) +#define CSB_CC_PROGRESS_POINT (32) +#define CSB_CC_DDE_OVERFLOW(33) +#define CSB_CC_SESSION (34) +#define CSB_CC_PROVISION (36) +#define CSB_CC_CHAIN (37) +#define CSB_CC_SEQUENCE(38) +#define CSB_CC_HW (39) + +#define CSB_SIZE (0x10) +#define CSB_ALIGN CSB_SIZE + +struct coprocessor_status_block { + u8 flags; + u8 cs; + u8 cc; + u8 ce; + __be32 count; + __be64 address; +} __aligned(CSB_ALIGN); + + +/* Chapter 6.5.10 Data-Descriptor List (DDL) + * each list contains one or more Data-Descriptor Entries (DDE) + */ + +#define DDE_P (0x8000) + +#define DDE_SIZE (0x10) +#define DDE_ALIGN DDE_SIZE + +struct data_descriptor_entry { + __be16 flags; + u8 count; + u8 index; + __be32 length; + __be64 address; +} __aligned(DDE_ALIGN); + + +/* Chapter 6.5.2 Coprocessor-Request Block (CRB) */ + +#define CRB_SIZE (0x80) +#define CRB_ALIGN (0x100) /* Errata: requires 256 alignment */ + + +/* Coprocessor Status Block field + * ADDRESS address of CSB + * C CCB is valid + * AT0 = addrs are virtual, 1 = addrs are phys + * M enable perf monitor + */ +#define CRB_CSB_ADDRESS(0xfff0) +#define CRB_CSB_C (0x0008) +#define CRB_CSB_AT (0x0002) +#define CRB_CSB_M (0x0001) + +struct coprocessor_request_block { + __be32 ccw; + __be32 flags; + __be64 csb_addr; + + struct data_descriptor_entry source; + struct data_descriptor_entry target; + + struct coprocessor_completion_block ccb; + + u8 reserved[48]; + + struct coprocessor_status_block csb; +} __aligned(CRB_ALIGN); + +#define crb_csb_addr(c) __be64_to_cpu(c->csb_addr) +#d
[PATCH V2 0/5] selftests/powerpc: Add NX-GZIP engine testcase
This patch series are intended to test the POWER9 Nest Accelerator (NX) GZIP engine that is being introduced by https://lists.ozlabs.org/pipermail/linuxppc-dev/2020-March/205659.html More information about how to access the NX can be found in that patch, also a complete userspace library and more documentation can be found at: https://github.com/libnxz/power-gzip Changes in V2: - Fixed errors and warnings caught by scripts/checkpatch.pl, including line breaks inside strings. - Fixed infinite loop and out-of-boundaries writing found by Daniel Axtens. Best regards, Raphael
Re: [PATCH v2 1/1] ppc/crash: Skip spinlocks during crash
Hello Michael, On Fri, 2020-03-27 at 14:50 +1100, Michael Ellerman wrote: > Hi Leonardo, > > Leonardo Bras writes: > > During a crash, there is chance that the cpus that handle the NMI IPI > > are holding a spin_lock. If this spin_lock is needed by crashing_cpu it > > will cause a deadlock. (rtas_lock and printk logbuf_log as of today) > > Please give us more detail on how those locks are causing you trouble, a > stack trace would be good if you have it. Sure, I have hit it in printf and rtas_call, as said before. After crash_send_ipi(), it's tested how many cpus_in_crash are there, and once they hit the total value, it's printed "IPI complete". This printk call itself already got stuck in spin_lock, for example. Here are the stack traces: #0 arch_spin_lock #1 do_raw_spin_lock #2 __raw_spin_lock #3 _raw_spin_lock #4 vprintk_emit #5 vprintk_func #7 crash_kexec_prepare_cpus #8 default_machine_crash_shutdown #9 machine_crash_shutdown #10 __crash_kexec #11 crash_kexec #12 oops_end #0 arch_spin_lock #1 lock_rtas () #2 rtas_call (token=8204, nargs=1, nret=1, outputs=0x0) #3 ics_rtas_mask_real_irq (hw_irq=4100) #4 machine_kexec_mask_interrupts #5 default_machine_crash_shutdown #6 machine_crash_shutdown #7 __crash_kexec #8 crash_kexec #9 oops_end > > This is a problem if the system has kdump set up, given if it crashes > > for any reason kdump may not be saved for crash analysis. > > > > Skip spinlocks after NMI IPI is sent to all other cpus. > > We don't want to add overhead to all spinlocks for the life of the > system, just to handle this one case. I understand. Other than this patch, I would propose doing something uglier, like forcing the said locks to unlocked state when cpus_in_crash hits it's maximum value, before printing "IPI complete". Creating similar functions that don't lock, just for this case, looks like overkill to me. Do you have any other suggestion? > There's already a flag that is set when the system is crashing, > "oops_in_progress", maybe we need to use that somewhere to skip a lock > or do an early return. I think that would not work, because oops_in_progress should be 0 here: oops_end() calls bust_spinlocks(0) before calling crash_kexec(), and bust_spinlocks(0) will decrement oops_in_progress. (just verified, it's 0 before printing "IPI complete"). Thank you the feedback, :) Best regards, Leonardo signature.asc Description: This is a digitally signed message part
[PATCH 1/3] powerpc/head_check: Automatic verbosity
To aid debugging build problems turn on shell tracing for the head_check script when the build is verbose. Signed-off-by: Geoff Levand --- arch/powerpc/tools/head_check.sh | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/arch/powerpc/tools/head_check.sh b/arch/powerpc/tools/head_check.sh index ad9e57209aa4..37061fb9b58e 100644 --- a/arch/powerpc/tools/head_check.sh +++ b/arch/powerpc/tools/head_check.sh @@ -31,8 +31,10 @@ # level entry code (boot, interrupt vectors, etc) until r2 is set up. This # could cause the kernel to die in early boot. -# Turn this on if you want more debug output: -# set -x +# Allow for verbose output +if [ "$V" = "1" ]; then + set -x +fi if [ $# -lt 2 ]; then echo "$0 [path to nm] [path to vmlinux]" 1>&2 -- 2.20.1
[PATCH 0/3] powerpc: Minor updates to improve build debugging
Hi Michael, Here are a few minor updates to the powerpc build files that make debugging build problems a little easier. Please consider. -Geoff The following changes since commit 16fbf79b0f83bc752cee8589279f1ebfe57b3b6e: Linux 5.6-rc7 (2020-03-22 18:31:56 -0700) are available in the Git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/geoff/ps3-linux.git for-merge-powerpc for you to fetch changes up to e6fb89a0946dd620f1d8120578744942ca934e11: powerpc/head_check: Avoid broken pipe (2020-03-27 09:41:41 -0700) Geoff Levand (3): powerpc/head_check: Automatic verbosity powerpc/wrapper: Output linker map file powerpc/head_check: Avoid broken pipe arch/powerpc/boot/wrapper| 3 ++- arch/powerpc/tools/head_check.sh | 8 +--- 2 files changed, 7 insertions(+), 4 deletions(-) -- 2.20.1
[PATCH 3/3] powerpc/head_check: Avoid broken pipe
Remove the '-m4' option to grep to allow grep to process all of nm's output. This avoids the nm warning: nm terminated with signal 13 [Broken pipe] Signed-off-by: Geoff Levand --- arch/powerpc/tools/head_check.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/powerpc/tools/head_check.sh b/arch/powerpc/tools/head_check.sh index 37061fb9b58e..e32d3162e5ed 100644 --- a/arch/powerpc/tools/head_check.sh +++ b/arch/powerpc/tools/head_check.sh @@ -46,7 +46,7 @@ nm="$1" vmlinux="$2" # gcc-4.6-era toolchain make _stext an A (absolute) symbol rather than T -$nm "$vmlinux" | grep -e " [TA] _stext$" -e " t start_first_256B$" -e " a text_start$" -e " t start_text$" -m4 > .tmp_symbols.txt +$nm "$vmlinux" | grep -e " [TA] _stext$" -e " t start_first_256B$" -e " a text_start$" -e " t start_text$" > .tmp_symbols.txt vma=$(cat .tmp_symbols.txt | grep -e " [TA] _stext$" | cut -d' ' -f1) -- 2.20.1
[PATCH 2/3] powerpc/wrapper: Output linker map file
To aid debugging wrapper troubles, output a linker map file 'wrapper.map' when the build is verbose. Signed-off-by: Geoff Levand --- arch/powerpc/boot/wrapper | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/boot/wrapper b/arch/powerpc/boot/wrapper index ed6266367bc0..35ace40d9fc2 100755 --- a/arch/powerpc/boot/wrapper +++ b/arch/powerpc/boot/wrapper @@ -29,6 +29,7 @@ set -e # Allow for verbose output if [ "$V" = 1 ]; then set -x +map="-Map wrapper.map" fi # defaults @@ -500,7 +501,7 @@ if [ "$platform" != "miboot" ]; then text_start="-Ttext $link_address" fi #link everything -${CROSS}ld -m $format -T $lds $text_start $pie $nodl -o "$ofile" \ +${CROSS}ld -m $format -T $lds $text_start $pie $nodl -o "$ofile" $map \ $platformo $tmp $object/wrapper.a rm $tmp fi -- 2.20.1
[RFC PATCH] powerpc: Use ppu_intrinsics.h instead of opencoding
ppu_intrinsics.h already includes helpers for things like sync(), isync(), dcbX(), etc ... Use it instead of opencoding. Signed-off-by: Christophe Leroy --- arch/powerpc/include/asm/barrier.h | 11 +++ arch/powerpc/include/asm/cache.h| 25 ++--- arch/powerpc/include/asm/checksum.h | 4 +++- arch/powerpc/include/asm/synch.h| 12 4 files changed, 20 insertions(+), 32 deletions(-) diff --git a/arch/powerpc/include/asm/barrier.h b/arch/powerpc/include/asm/barrier.h index 123adcefd40f..392c91519220 100644 --- a/arch/powerpc/include/asm/barrier.h +++ b/arch/powerpc/include/asm/barrier.h @@ -7,6 +7,10 @@ #include +#ifndef __ASSEMBLY__ +#include +#endif + /* * Memory barrier. * The sync instruction guarantees that all memory accesses initiated @@ -31,9 +35,9 @@ * However, on CPUs that don't support lwsync, lwsync actually maps to a * heavy-weight sync, so smp_wmb() can be a lighter-weight eieio. */ -#define mb() __asm__ __volatile__ ("sync" : : : "memory") -#define rmb() __asm__ __volatile__ ("sync" : : : "memory") -#define wmb() __asm__ __volatile__ ("sync" : : : "memory") +#define mb() __sync() +#define rmb() __sync() +#define wmb() __sync() /* The sub-arch has lwsync */ #if defined(__powerpc64__) || defined(CONFIG_PPC_E500MC) @@ -42,7 +46,6 @@ #define SMPWMB eieio #endif -#define __lwsync() __asm__ __volatile__ (stringify_in_c(LWSYNC) : : :"memory") #define dma_rmb() __lwsync() #define dma_wmb() __asm__ __volatile__ (stringify_in_c(SMPWMB) : : :"memory") diff --git a/arch/powerpc/include/asm/cache.h b/arch/powerpc/include/asm/cache.h index 72b81015cebe..5b5e9a63060a 100644 --- a/arch/powerpc/include/asm/cache.h +++ b/arch/powerpc/include/asm/cache.h @@ -34,6 +34,8 @@ #define IFETCH_ALIGN_BYTES (1 << IFETCH_ALIGN_SHIFT) #if !defined(__ASSEMBLY__) +#include + #ifdef CONFIG_PPC64 struct ppc_cache_info { @@ -111,31 +113,16 @@ extern void _set_L3CR(unsigned long); #define _set_L3CR(val) do { } while(0) #endif -static inline void dcbz(void *addr) -{ - __asm__ __volatile__ ("dcbz 0, %0" : : "r"(addr) : "memory"); -} +#define dcbz __dcbz +#define dcbf __dcbf +#define dcbst __dcbst +#define icbi __icbi static inline void dcbi(void *addr) { __asm__ __volatile__ ("dcbi 0, %0" : : "r"(addr) : "memory"); } -static inline void dcbf(void *addr) -{ - __asm__ __volatile__ ("dcbf 0, %0" : : "r"(addr) : "memory"); -} - -static inline void dcbst(void *addr) -{ - __asm__ __volatile__ ("dcbst 0, %0" : : "r"(addr) : "memory"); -} - -static inline void icbi(void *addr) -{ - asm volatile ("icbi 0, %0" : : "r"(addr) : "memory"); -} - static inline void iccci(void *addr) { asm volatile ("iccci 0, %0" : : "r"(addr) : "memory"); diff --git a/arch/powerpc/include/asm/checksum.h b/arch/powerpc/include/asm/checksum.h index 9cce06194dcc..16abea7c3c64 100644 --- a/arch/powerpc/include/asm/checksum.h +++ b/arch/powerpc/include/asm/checksum.h @@ -8,6 +8,8 @@ #include #include + +#include /* * Computes the checksum of a memory block at src, length len, * and adds in "sum" (32-bit), while copying the block to dst. @@ -42,7 +44,7 @@ static inline __sum16 csum_fold(__wsum sum) unsigned int tmp; /* swap the two 16-bit halves of sum */ - __asm__("rlwinm %0,%1,16,0,31" : "=r" (tmp) : "r" (sum)); + tmp = __rlwinm(sum, 16, 0, 31); /* if there is a carry from adding the two 16-bit halves, it will carry from the lower half into the upper half, giving us the correct sum in the upper half. */ diff --git a/arch/powerpc/include/asm/synch.h b/arch/powerpc/include/asm/synch.h index aca70fb43147..44020f89854e 100644 --- a/arch/powerpc/include/asm/synch.h +++ b/arch/powerpc/include/asm/synch.h @@ -7,19 +7,15 @@ #include #ifndef __ASSEMBLY__ +#include + extern unsigned int __start___lwsync_fixup, __stop___lwsync_fixup; extern void do_lwsync_fixups(unsigned long value, void *fixup_start, void *fixup_end); -static inline void eieio(void) -{ - __asm__ __volatile__ ("eieio" : : : "memory"); -} +#define eieio __eieio +#define isync __isync -static inline void isync(void) -{ - __asm__ __volatile__ ("isync" : : : "memory"); -} #endif /* __ASSEMBLY__ */ #if defined(__powerpc64__) -- 2.25.0
Re: [PATCH v1] powerpc: Make setjmp/longjump signature standard
Subject line, change longjump to longjmp Le 27/03/2020 à 11:07, Clement Courbet a écrit : Declaring setjmp()/longjmp() as taking longs makes the signature non-standard, and makes clang complain. In the past, this has been worked around by adding -ffreestanding to the compile flags. The implementation looks like it only ever propagates the value (in longjmp) or sets it to 1 (in setjmp), and we only call longjmp with integer parameters. This allows removing -ffreestanding from the compilation flags. Context: https://lore.kernel.org/patchwork/patch/1214060 https://lore.kernel.org/patchwork/patch/1216174 Signed-off-by: Clement Courbet --- arch/powerpc/include/asm/setjmp.h | 6 -- arch/powerpc/kexec/Makefile | 3 --- 2 files changed, 4 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/include/asm/setjmp.h b/arch/powerpc/include/asm/setjmp.h index e9f81bb3f83b..84bb0d140d59 100644 --- a/arch/powerpc/include/asm/setjmp.h +++ b/arch/powerpc/include/asm/setjmp.h @@ -7,7 +7,9 @@ #define JMP_BUF_LEN23 -extern long setjmp(long *) __attribute__((returns_twice)); -extern void longjmp(long *, long) __attribute__((noreturn)); +typedef long *jmp_buf; Do we need that new opaque typedef ? Why not just keep long * ? + +extern int setjmp(jmp_buf env) __attribute__((returns_twice)); +extern void longjmp(jmp_buf env, int val) __attribute__((noreturn)); #endif /* _ASM_POWERPC_SETJMP_H */ diff --git a/arch/powerpc/kexec/Makefile b/arch/powerpc/kexec/Makefile index 378f6108a414..86380c69f5ce 100644 --- a/arch/powerpc/kexec/Makefile +++ b/arch/powerpc/kexec/Makefile @@ -3,9 +3,6 @@ # Makefile for the linux kernel. # -# Avoid clang warnings around longjmp/setjmp declarations -CFLAGS_crash.o += -ffreestanding - obj-y += core.o crash.o core_$(BITS).o obj-$(CONFIG_PPC32) += relocate_32.o Christophe
RFA [PPC kernel] Avoid upcoming PPC kernel build failure
On Fri, Mar 27, 2020 at 7:54 AM Yu-cheng Yu wrote: > > On Thu, 2020-02-06 at 04:55 -0800, H.J. Lu wrote: > > On Wed, Feb 5, 2020 at 7:26 PM Michael Ellerman wrote: > > > "H.J. Lu" writes: > > > > On Tue, Feb 4, 2020 at 3:37 PM kbuild test robot wrote: > > > > > tree: https://github.com/yyu168/linux_cet.git cet > > > > > head: bba707cc4715c1036b6561ab38b16747f9c49cfa > > > > > commit: 71bb971dd76eeacd351690f28864ad5c5bec3691 [55/58] Discard > > > > > .note.gnu.property sections in generic NOTES > > > > > config: powerpc-rhel-kconfig (attached as .config) > > > > > compiler: powerpc64le-linux-gcc (GCC) 7.5.0 > > > > > reproduce: > > > > > wget > > > > > https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross > > > > > -O ~/bin/make.cross > > > > > chmod +x ~/bin/make.cross > > > > > git checkout 71bb971dd76eeacd351690f28864ad5c5bec3691 > > > > > # save the attached .config to linux build tree > > > > > GCC_VERSION=7.5.0 make.cross ARCH=powerpc > > > > > > > > > > If you fix the issue, kindly add following tag > > > > > Reported-by: kbuild test robot > > > > > > > > > > All warnings (new ones prefixed by >>): > > > > > > > > > >powerpc64le-linux-ld: warning: discarding dynamic section > > > > > .rela___ksymtab_gpl+__wait_rcu_gp > > > > > > > > arch/powerpc/kernel/vmlinux.lds.S has > > > > > > > > .rela.dyn : AT(ADDR(.rela.dyn) - (0xc000 -0x)) > > > > { > > > > __rela_dyn_start = .; > > > > *(.rela*) Keep .rela* sections > > > > } > > > > > > The above is inside #ifdef CONFIG_RELOCATABLE > > > > > > > ... > > > > /DISCARD/ : { > > > > *(*.EMB.apuinfo) > > > > *(.glink .iplt .plt .rela* .comment) > > > > Discard .rela* sections. But it is > > > > ignored. > > > > *(.gnu.version*) > > > > *(.gnu.attributes) > > > > *(.eh_frame) > > > > } > > > > > > But that is not #ifdef'ed at all. > > > > > > > With my > > > > > > > > ommit 71bb971dd76eeacd351690f28864ad5c5bec3691 > > > > Author: H.J. Lu > > > > Date: Thu Jan 30 12:39:09 2020 -0800 > > > > > > > > Discard .note.gnu.property sections in generic NOTES > > > > > > > > With the command-line option, -mx86-used-note=yes, the x86 assembler > > > > in binutils 2.32 and above generates a program property note in a > > > > note > > > > section, .note.gnu.property, to encode used x86 ISAs and features. > > > > But > > > > kernel linker script only contains a single NOTE segment: > > > > > > > > /DISCARD/ : { *(.note.gnu.property) } > > > > > > > > is placed before > > > > > > > > .rela.dyn : AT(ADDR(.rela.dyn) - (0xc000 -0x)) > > > > { > > > > __rela_dyn_start = .; > > > > *(.rela*) Keep .rela* sections > > > > } > > > > > > > > Then .rela* in > > > > > > > > /DISCARD/ : { > > > > *(*.EMB.apuinfo) > > > > *(.glink .iplt .plt .rela* .comment) > > > > *(.gnu.version*) > > > > *(.gnu.attributes) > > > > *(.eh_frame) > > > > } > > > > > > > > is honored. Can someone from POWERPC comment on it? > > > > > > Hmm OK. I'm not really a toolchain person. > > > > > > The comment on DISCARDS says: > > > > > >* Some archs want to discard exit text/data at runtime rather than > > >* link time due to cross-section references such as alt instructions, > > >* bug table, eh_frame, etc. DISCARDS must be the last of output > > >* section definitions so that such archs put those in earlier section > > >* definitions. > > >*/ > > > > > > But I guess you're changing those semantics in your series. > > > > > > This seems to fix the warning for me? > > > > > > diff --git a/arch/powerpc/kernel/vmlinux.lds.S > > > b/arch/powerpc/kernel/vmlinux.lds.S > > > index b4c89a1acebb..076b3e8a849d 100644 > > > --- a/arch/powerpc/kernel/vmlinux.lds.S > > > +++ b/arch/powerpc/kernel/vmlinux.lds.S > > > @@ -365,9 +365,12 @@ SECTIONS > > > DISCARDS > > > /DISCARD/ : { > > > *(*.EMB.apuinfo) > > > - *(.glink .iplt .plt .rela* .comment) > > > + *(.glink .iplt .plt .comment) > > > *(.gnu.version*) > > > *(.gnu.attributes) > > > *(.eh_frame) > > > +#ifndef CONFIG_RELOCATABLE > > > + *(.rela*) > > > +#endif > > > } > > > } > > > > > > > > > cheers > > > > This looks correct me. > > > > Reviewed-by: H.J. Lu > > > > Thanks. > > > > Has this been merged into any branch yet? I just checked the tip tree and did > not see it. > FYI, my patches have been queued on x86/build branch. Could someone from PPC community add this patch to PPC kernel to avoid upcoming PPC kernel build failure? Thanks. -- H.J.
Re: [PATCH v1] powerpc: Make setjmp/longjump signature standard
On Fri, Mar 27, 2020 at 10:10:44AM -0700, Nick Desaulniers wrote: > On Fri, Mar 27, 2020 at 3:08 AM Clement Courbet wrote: > > > > Declaring setjmp()/longjmp() as taking longs makes the signature > > non-standard, and makes clang complain. In the past, this has been > > worked around by adding -ffreestanding to the compile flags. > > > > The implementation looks like it only ever propagates the value > > (in longjmp) or sets it to 1 (in setjmp), and we only call longjmp > > with integer parameters. > > > > This allows removing -ffreestanding from the compilation flags. > > > > Context: > > https://lore.kernel.org/patchwork/patch/1214060 > > https://lore.kernel.org/patchwork/patch/1216174 > > > > Signed-off-by: Clement Courbet Thanks for fixing this properly, not really sure why I did not think of this in the first place. I guess my thought was the warning makes it seem like clang is going to ignore the kernel's implementation of setjmp/longjmp but I can't truly remember. > Hi Clement, thanks for the patch! Would you mind sending a V2 that > included a similar fix to arch/powerpc/xmon/Makefile? Agreed. > For context, this was the original patch: > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=aea447141c7e7824b81b49acd1bc78 > which was then modified to: > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c9029ef9c95765e7b63c4d9aa780674447db1ec0 > > So on your V2, if you include in the commit message, the line: > > Fixes c9029ef9c957 ("powerpc: Avoid clang warnings around setjmp and longjmp") > > then that will help our LTS branch maintainers back port it to the > appropriate branches. The tags should be: Cc: sta...@vger.kernel.org # v4.14+ Fixes: c9029ef9c957 ("powerpc: Avoid clang warnings around setjmp and longjmp") that way it explicitly gets picked up for stable, rather than Sasha's AUTOSEL process, which could miss it. With the xmon/Makefile -ffreestanding removed and the tags updated, consider this: Reviewed-by: Nathan Chancellor Tested-by: Nathan Chancellor Cheers, Nathan
Re: [PATCH v1] powerpc: Make setjmp/longjump signature standard
On Fri, Mar 27, 2020 at 3:08 AM Clement Courbet wrote: > > Declaring setjmp()/longjmp() as taking longs makes the signature > non-standard, and makes clang complain. In the past, this has been > worked around by adding -ffreestanding to the compile flags. > > The implementation looks like it only ever propagates the value > (in longjmp) or sets it to 1 (in setjmp), and we only call longjmp > with integer parameters. > > This allows removing -ffreestanding from the compilation flags. > > Context: > https://lore.kernel.org/patchwork/patch/1214060 > https://lore.kernel.org/patchwork/patch/1216174 > > Signed-off-by: Clement Courbet Hi Clement, thanks for the patch! Would you mind sending a V2 that included a similar fix to arch/powerpc/xmon/Makefile? For context, this was the original patch: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=aea447141c7e7824b81b49acd1bc78 which was then modified to: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c9029ef9c95765e7b63c4d9aa780674447db1ec0 So on your V2, if you include in the commit message, the line: Fixes c9029ef9c957 ("powerpc: Avoid clang warnings around setjmp and longjmp") then that will help our LTS branch maintainers back port it to the appropriate branches. > > --- > arch/powerpc/include/asm/setjmp.h | 6 -- > arch/powerpc/kexec/Makefile | 3 --- > 2 files changed, 4 insertions(+), 5 deletions(-) > > diff --git a/arch/powerpc/include/asm/setjmp.h > b/arch/powerpc/include/asm/setjmp.h > index e9f81bb3f83b..84bb0d140d59 100644 > --- a/arch/powerpc/include/asm/setjmp.h > +++ b/arch/powerpc/include/asm/setjmp.h > @@ -7,7 +7,9 @@ > > #define JMP_BUF_LEN23 > > -extern long setjmp(long *) __attribute__((returns_twice)); > -extern void longjmp(long *, long) __attribute__((noreturn)); > +typedef long *jmp_buf; > + > +extern int setjmp(jmp_buf env) __attribute__((returns_twice)); > +extern void longjmp(jmp_buf env, int val) __attribute__((noreturn)); > > #endif /* _ASM_POWERPC_SETJMP_H */ > diff --git a/arch/powerpc/kexec/Makefile b/arch/powerpc/kexec/Makefile > index 378f6108a414..86380c69f5ce 100644 > --- a/arch/powerpc/kexec/Makefile > +++ b/arch/powerpc/kexec/Makefile > @@ -3,9 +3,6 @@ > # Makefile for the linux kernel. > # > > -# Avoid clang warnings around longjmp/setjmp declarations > -CFLAGS_crash.o += -ffreestanding > - > obj-y += core.o crash.o core_$(BITS).o > > obj-$(CONFIG_PPC32)+= relocate_32.o > -- > 2.25.1.696.g5e7596f4ac-goog > -- Thanks, ~Nick Desaulniers
Re: [PATCH] x86: Alias memset to __builtin_memset.
Hi! On Thu, Mar 26, 2020 at 01:38:39PM +0100, Clement Courbet wrote: > --- a/arch/powerpc/include/asm/setjmp.h > +++ b/arch/powerpc/include/asm/setjmp.h > @@ -12,7 +12,9 @@ > > #define JMP_BUF_LEN23 > -extern long setjmp(long *); > -extern void longjmp(long *, long); > +typedef long * jmp_buf; > + > +extern int setjmp(jmp_buf); > +extern void longjmp(jmp_buf, int); > > I'm happy to send a patch for this, and get rid of more -ffreestanding. > Opinions ? Pedantically, jmp_buf should be an array type. But, this will probably work fine, and it certainly is better than what we had before. You could do typedef long jmp_buf[JMP_BUF_LEN]; perhaps? Thanks, Segher
[PATCH] soc: fsl: qe: clean up an indentation issue
From: Colin Ian King There is a statement that not indented correctly, remove the extraneous space. Signed-off-by: Colin Ian King --- drivers/soc/fsl/qe/ucc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/soc/fsl/qe/ucc.c b/drivers/soc/fsl/qe/ucc.c index d6c93970df4d..cac0fb7693a0 100644 --- a/drivers/soc/fsl/qe/ucc.c +++ b/drivers/soc/fsl/qe/ucc.c @@ -519,7 +519,7 @@ int ucc_set_tdm_rxtx_clk(u32 tdm_num, enum qe_clock clock, int clock_bits; u32 shift; struct qe_mux __iomem *qe_mux_reg; -__be32 __iomem *cmxs1cr; + __be32 __iomem *cmxs1cr; qe_mux_reg = &qe_immr->qmx; -- 2.25.1
[PATCH 0/6] Kill setup_irq()
Hi Thomas, As compared to the situation mentioned earlier[1], now powerpc patch is also in -next, and the pending ARM patches has been picked up by ARM SoC maintainers today and is expected to show up in next -next. All other subsytem patches has been picked by relevant maintainers & are already in -next except alpha, c6x, hexagon, unicore32 & sh. As it is the case, i am sending you patches for the above 5 architecture's plus the core removal patch. Status of 5 arch's: --- alpha: received ack from Matt Turner, build test success c6x:did receive ack from Mark Salter in v1, the final version (v3) was with minor changes, hence removed his ack & cc'ed him, build test success hexagon:build test success unicore32: couldn't get toolchain from kernel.org, 0day test robot or Segher's buildall sh: To compile the relevant changes sh64 compiler is required, couldn't get it from above mentioned 3 sources. Note 1: sh toolchain is available, but that will not make the relevant changes compile as it has dependency of 64bit arch toolchain, did try a Kconfig hack to make it compile w/ 32bit sh toolchain, but it failed due to other reasons (unknown operands), so gave up on that. Note 2: hexagon final image creation fails even w/o my patch, but it has been ensured that w/ my changes relevant object files are getting built w/o warnings. Regards afzal [1] https://lkml.kernel.org/r/20200321172626.GA6323@afzalpc afzal mohammed (6): alpha: Replace setup_irq() by request_irq() c6x: replace setup_irq() by request_irq() hexagon: replace setup_irq() by request_irq() sh: replace setup_irq() by request_irq() unicore32: replace setup_irq() by request_irq() genirq: Remove setup_irq() and remove_irq() arch/alpha/kernel/irq_alpha.c | 29 arch/alpha/kernel/irq_i8259.c | 8 ++ arch/alpha/kernel/irq_impl.h | 7 + arch/alpha/kernel/irq_pyxis.c | 3 ++- arch/alpha/kernel/sys_alcor.c | 3 ++- arch/alpha/kernel/sys_cabriolet.c | 3 ++- arch/alpha/kernel/sys_eb64p.c | 3 ++- arch/alpha/kernel/sys_marvel.c| 2 +- arch/alpha/kernel/sys_miata.c | 6 +++-- arch/alpha/kernel/sys_ruffian.c | 3 ++- arch/alpha/kernel/sys_rx164.c | 3 ++- arch/alpha/kernel/sys_sx164.c | 3 ++- arch/alpha/kernel/sys_wildfire.c | 7 ++--- arch/alpha/kernel/time.c | 6 ++--- arch/c6x/platforms/timer64.c | 11 +++- arch/hexagon/kernel/smp.c | 22 arch/hexagon/kernel/time.c| 11 +++- arch/sh/boards/mach-cayman/irq.c | 18 + arch/sh/drivers/dma/dma-pvr2.c| 9 +++ arch/unicore32/kernel/time.c | 11 +++- include/linux/irq.h | 2 -- kernel/irq/manage.c | 44 --- 22 files changed, 60 insertions(+), 154 deletions(-) -- 2.25.1
Re: [PATCH 1/1] ppc/crash: Skip spinlocks during crash
Hello Christophe, thanks for the feedback. I noticed an error in this patch and sent a v2, that can be seen here: http://patchwork.ozlabs.org/patch/1262468/ Comments inline:: On Fri, 2020-03-27 at 07:50 +0100, Christophe Leroy wrote: > > @@ -142,6 +144,8 @@ static inline void arch_spin_lock(arch_spinlock_t *lock) > > if (likely(__arch_spin_trylock(lock) == 0)) > > break; > > do { > > + if (unlikely(crash_skip_spinlock)) > > + return; Complete function for reference: static inline void arch_spin_lock(arch_spinlock_t *lock) { while (1) { if (likely(__arch_spin_trylock(lock) == 0)) break; do { if (unlikely(crash_skip_spinlock)) return; HMT_low(); if (is_shared_processor()) splpar_spin_yield(lock); } while (unlikely(lock->slock != 0)); HMT_medium(); } } > You are adding a test that reads a global var in the middle of a so hot > path ? That must kill performance. I thought it would, in worst case scenario, increase a maximum delay of an arch_spin_lock() call 1 spin cycle. Here is what I thought: - If the lock is already free, it would change nothing, - Otherwise, the lock will wait. - Waiting cycle just got bigger. - Worst case scenario: running one more cycle, given lock->slock can turn to 0 just after checking. Could you please point where I failed to see the performance penalty? (I need to get better at this :) ) > Can we do different ? Sure, a less intrusive way of doing it would be to free the currently needed locks before proceeding. I just thought it would be harder to maintain. > Christophe Best regards, Leonardo signature.asc Description: This is a digitally signed message part
Re: [PATCH v2] powerpc/boot: Delete unneeded .globl _zimage_start
On Thu, Mar 26, 2020 at 03:26:12PM -0700, Fangrui Song wrote: > On 2020-03-26, Segher Boessenkool wrote: > >On Wed, Mar 25, 2020 at 09:42:57AM -0700, Fangrui Song wrote: > >>.globl sets the symbol binding to STB_GLOBAL while .weak sets the > >>binding to STB_WEAK. GNU as let .weak override .globl since binutils-gdb > >>5ca547dc2399a0a5d9f20626d4bf5547c3ccfddd (1996). Clang integrated > >>assembler let the last win but it may error in the future. > > > >GNU AS works for more than just ELF. The way the assembler language > >is defined, it is not .weak overriding .globl -- instead, .weak sets a > >symbol attribute. On an existing symbol (but it creates on if there is > >none yet). > > > >Clang is buggy if it does not allow valid (and perfectly normal) > >assembler code like this. > > https://sourceware.org/pipermail/binutils/2020-March/110399.html > > Alan: "I think it is completely fine for you to make the llvm assembler > error on inconsistent binding, or the last directive win. Either of > those behaviours is logical and good, but you quite possibly will run > into a need to fix more user assembly. This would be fine and consistent behaviour, of course. But it is not appropriate if you want to pretend to be compatible to GNU toolchains. Which is exactly why you want this kernel patch at all. And the kernel can (in this case) accommodate your buggy assembler, sure, but are you going to "fix" all other programs with this "problem" as well? Segher
Re: [PATCH v2] powerpc xmon: use `dcbf` inplace of `dcbi` instruction for 64bit Book3S
Le 27/03/2020 à 10:03, Balamuruhan S a écrit : On Fri, 2020-03-27 at 07:48 +0100, Christophe Leroy wrote: Le 26/03/2020 à 07:15, Balamuruhan S a écrit : Data Cache Block Invalidate (dcbi) instruction was implemented back in PowerPC architecture version 2.03. It is obsolete and attempt to use of this illegal instruction results in a hypervisor emulation assistance interrupt. So, ifdef it out the option `i` in xmon for 64bit Book3S. I don't understand. You say two contradictory things: 1/ You say it _was_ added back. 2/ You say it _is_ obsolete. How can it be obsolete if it was added back ? I actually learnt it from P8 and P9 User Manual, The POWER8/POWER9 core does not provide support for the following optional or obsolete instructions (attempted use of these results in a hypervisor emulation assistance interrupt): • tlbia - TLB invalidate all • tlbiex - TLB invalidate entry by index (obsolete) • slbiex - SLB invalidate entry by index (obsolete) • dcba - Data cache block allocate (Book II; obsolete) • dcbi - Data cache block invalidate (obsolete) • rfi - Return from interrupt (32-bit; obsolete) Then that's exactly what you have to say in the coming log. Maybe you could also change invalidate_dcache_range(): for (i = 0; i < size >> shift; i++, addr += bytes) { if (IS_ENABLED(CONFIG_PPC_BOOK3S_64)) dcbf(addr); else dcbi(addr); } Christophe
Re: [PATCH 0/2] powerpc: Remove support for ppc405/440 Xilinx platforms
On Fri, Mar 27, 2020 at 02:22:55PM +0100, Arnd Bergmann wrote: > On Fri, Mar 27, 2020 at 2:15 PM Andy Shevchenko > wrote: > > On Fri, Mar 27, 2020 at 03:10:26PM +0200, Andy Shevchenko wrote: > > > On Fri, Mar 27, 2020 at 01:54:33PM +0100, Arnd Bergmann wrote: > > > > On Fri, Mar 27, 2020 at 1:12 PM Michal Simek > > > > wrote: ... > > > > It does raise a follow-up question about ppc40x though: is it time to > > > > retire all of it? > > > > > > Who knows? > > > > > > I have in possession nice WD My Book Live, based on this architecture, > > > and I > > > won't it gone from modern kernel support. OTOH I understand that amount > > > of real > > > users not too big. > > > > +Cc: Christian Lamparter, whom I owe for that WD box. > > According to https://openwrt.org/toh/wd/mybooklive, that one is based on > APM82181/ppc464, so it is about several generations newer than what I > asked about (ppc40x). > > > > Ah, and I have Amiga board, but that one is being used only for testing, > > > so, > > > I don't care much. > > I think there are a couple of ppc440 based Amiga boards, but again, not 405 > to my knowledge. Ah, you are right. No objections from ppc40x removal! -- With Best Regards, Andy Shevchenko
Re: [PATCH 0/2] powerpc: Remove support for ppc405/440 Xilinx platforms
On Fri, Mar 27, 2020 at 2:15 PM Andy Shevchenko wrote: > On Fri, Mar 27, 2020 at 03:10:26PM +0200, Andy Shevchenko wrote: > > On Fri, Mar 27, 2020 at 01:54:33PM +0100, Arnd Bergmann wrote: > > > On Fri, Mar 27, 2020 at 1:12 PM Michal Simek > > > wrote: > > > > > > > > recently we wanted to update xilinx intc driver and we found that > > > > function > > > > which we wanted to remove is still wired by ancient Xilinx PowerPC > > > > platforms. Here is the thread about it. > > > > https://lore.kernel.org/linux-next/48d3232d-0f1d-42ea-3109-f44bbabfa...@xilinx.com/ > > > > > > > > I have been talking about it internally and there is no interest in > > > > these > > > > platforms and it is also orphan for quite a long time. None is really > > > > running/testing these platforms regularly that's why I think it makes > > > > sense > > > > to remove them also with drivers which are specific to this platform. > > > > > > > > U-Boot support was removed in 2017 without anybody complain about it > > > > https://github.com/Xilinx/u-boot-xlnx/commit/98f705c9cefdfdba62c069821bbba10273a0a8ed > > > > > > > > Based on current ppc/next. > > > > > > > > If anyone has any objection about it, please let me know. > > > > > > Acked-by: Arnd Bergmann > > > > > > This looks reasonable to me as well, in particular as the code only > > > supports the two > > > ppc44x virtex developer boards and no commercial products. > > > > > > It does raise a follow-up question about ppc40x though: is it time to > > > retire all of it? > > > > Who knows? > > > > I have in possession nice WD My Book Live, based on this architecture, and I > > won't it gone from modern kernel support. OTOH I understand that amount of > > real > > users not too big. > > +Cc: Christian Lamparter, whom I owe for that WD box. According to https://openwrt.org/toh/wd/mybooklive, that one is based on APM82181/ppc464, so it is about several generations newer than what I asked about (ppc40x). > > Ah, and I have Amiga board, but that one is being used only for testing, so, > > I don't care much. I think there are a couple of ppc440 based Amiga boards, but again, not 405 to my knowledge. Arnd
Re: [PATCH 0/2] powerpc: Remove support for ppc405/440 Xilinx platforms
On Fri, Mar 27, 2020 at 03:10:26PM +0200, Andy Shevchenko wrote: > On Fri, Mar 27, 2020 at 01:54:33PM +0100, Arnd Bergmann wrote: > > On Fri, Mar 27, 2020 at 1:12 PM Michal Simek > > wrote: > > > > > > recently we wanted to update xilinx intc driver and we found that function > > > which we wanted to remove is still wired by ancient Xilinx PowerPC > > > platforms. Here is the thread about it. > > > https://lore.kernel.org/linux-next/48d3232d-0f1d-42ea-3109-f44bbabfa...@xilinx.com/ > > > > > > I have been talking about it internally and there is no interest in these > > > platforms and it is also orphan for quite a long time. None is really > > > running/testing these platforms regularly that's why I think it makes > > > sense > > > to remove them also with drivers which are specific to this platform. > > > > > > U-Boot support was removed in 2017 without anybody complain about it > > > https://github.com/Xilinx/u-boot-xlnx/commit/98f705c9cefdfdba62c069821bbba10273a0a8ed > > > > > > Based on current ppc/next. > > > > > > If anyone has any objection about it, please let me know. > > > > Acked-by: Arnd Bergmann > > > > This looks reasonable to me as well, in particular as the code only > > supports the two > > ppc44x virtex developer boards and no commercial products. > > > > It does raise a follow-up question about ppc40x though: is it time to > > retire all of it? > > Who knows? > > I have in possession nice WD My Book Live, based on this architecture, and I > won't it gone from modern kernel support. OTOH I understand that amount of > real > users not too big. +Cc: Christian Lamparter, whom I owe for that WD box. > > Ah, and I have Amiga board, but that one is being used only for testing, so, > I don't care much. > > > The other ppc405 machines appear to have seen even fewer updates after the > > OpenBlockS 600 got added in 2011, so it's possible nobody is using them any > > more > > with modern kernels. > > > > I see that OpenWRT removed both ppc40x and ppc44x exactly a year ago after > > they had not been maintained for years. > > > > However, 44x (in its ppc476 incarnation) is clearly still is used > > through the fsp2 platform, > > and can not be deprecated at least until that is known to have stopped > > getting kernel > > updates. > > > > Arnd > > -- > With Best Regards, > Andy Shevchenko > > -- With Best Regards, Andy Shevchenko
Re: [PATCH 0/2] powerpc: Remove support for ppc405/440 Xilinx platforms
On Fri, Mar 27, 2020 at 01:54:33PM +0100, Arnd Bergmann wrote: > On Fri, Mar 27, 2020 at 1:12 PM Michal Simek wrote: > > > > recently we wanted to update xilinx intc driver and we found that function > > which we wanted to remove is still wired by ancient Xilinx PowerPC > > platforms. Here is the thread about it. > > https://lore.kernel.org/linux-next/48d3232d-0f1d-42ea-3109-f44bbabfa...@xilinx.com/ > > > > I have been talking about it internally and there is no interest in these > > platforms and it is also orphan for quite a long time. None is really > > running/testing these platforms regularly that's why I think it makes sense > > to remove them also with drivers which are specific to this platform. > > > > U-Boot support was removed in 2017 without anybody complain about it > > https://github.com/Xilinx/u-boot-xlnx/commit/98f705c9cefdfdba62c069821bbba10273a0a8ed > > > > Based on current ppc/next. > > > > If anyone has any objection about it, please let me know. > > Acked-by: Arnd Bergmann > > This looks reasonable to me as well, in particular as the code only > supports the two > ppc44x virtex developer boards and no commercial products. > > It does raise a follow-up question about ppc40x though: is it time to > retire all of it? Who knows? I have in possession nice WD My Book Live, based on this architecture, and I won't it gone from modern kernel support. OTOH I understand that amount of real users not too big. Ah, and I have Amiga board, but that one is being used only for testing, so, I don't care much. > The other ppc405 machines appear to have seen even fewer updates after the > OpenBlockS 600 got added in 2011, so it's possible nobody is using them any > more > with modern kernels. > > I see that OpenWRT removed both ppc40x and ppc44x exactly a year ago after > they had not been maintained for years. > > However, 44x (in its ppc476 incarnation) is clearly still is used > through the fsp2 platform, > and can not be deprecated at least until that is known to have stopped > getting kernel > updates. > > Arnd -- With Best Regards, Andy Shevchenko
Re: [PATCH 0/2] powerpc: Remove support for ppc405/440 Xilinx platforms
On Fri, Mar 27, 2020 at 1:12 PM Michal Simek wrote: > > recently we wanted to update xilinx intc driver and we found that function > which we wanted to remove is still wired by ancient Xilinx PowerPC > platforms. Here is the thread about it. > https://lore.kernel.org/linux-next/48d3232d-0f1d-42ea-3109-f44bbabfa...@xilinx.com/ > > I have been talking about it internally and there is no interest in these > platforms and it is also orphan for quite a long time. None is really > running/testing these platforms regularly that's why I think it makes sense > to remove them also with drivers which are specific to this platform. > > U-Boot support was removed in 2017 without anybody complain about it > https://github.com/Xilinx/u-boot-xlnx/commit/98f705c9cefdfdba62c069821bbba10273a0a8ed > > Based on current ppc/next. > > If anyone has any objection about it, please let me know. Acked-by: Arnd Bergmann This looks reasonable to me as well, in particular as the code only supports the two ppc44x virtex developer boards and no commercial products. It does raise a follow-up question about ppc40x though: is it time to retire all of it? The other ppc405 machines appear to have seen even fewer updates after the OpenBlockS 600 got added in 2011, so it's possible nobody is using them any more with modern kernels. I see that OpenWRT removed both ppc40x and ppc44x exactly a year ago after they had not been maintained for years. However, 44x (in its ppc476 incarnation) is clearly still is used through the fsp2 platform, and can not be deprecated at least until that is known to have stopped getting kernel updates. Arnd
Re: [patch V3 03/20] usb: gadget: Use completion interface instead of open coding it
On 2020-03-25 10:37:57 [+0200], Felipe Balbi wrote: > Do you want to carry it via your tree? If so: We would like to do so. > Acked-by: Felipe Balbi Thank you. > Otherwise, let me know and I'll pick this patch. Sebastian
[PATCH 2/2] powerpc: Remove Xilinx PPC405/PPC440 support
The latest Xilinx design tools called ISE and EDK has been released in October 2013. New tool doesn't support any PPC405/PPC440 new designs. These platforms are no longer supported and tested. PowerPC 405/440 port is orphan from 2013 by commit cdeb89943bfc ("MAINTAINERS: Fix incorrect status tag") and commit 19624236cce1 ("MAINTAINERS: Update Grant's email address and maintainership") that's why it is time to remove the support fot these platforms. Signed-off-by: Michal Simek --- Documentation/devicetree/bindings/xilinx.txt | 143 -- Documentation/powerpc/bootwrapper.rst| 28 +- MAINTAINERS | 6 - arch/powerpc/Kconfig.debug | 2 +- arch/powerpc/boot/Makefile | 7 +- arch/powerpc/boot/dts/Makefile | 1 - arch/powerpc/boot/dts/virtex440-ml507.dts| 406 arch/powerpc/boot/dts/virtex440-ml510.dts| 466 --- arch/powerpc/boot/ops.h | 1 - arch/powerpc/boot/serial.c | 5 - arch/powerpc/boot/uartlite.c | 79 arch/powerpc/boot/virtex.c | 97 arch/powerpc/boot/virtex405-head.S | 31 -- arch/powerpc/boot/wrapper| 8 - arch/powerpc/configs/40x/virtex_defconfig| 75 --- arch/powerpc/configs/44x/virtex5_defconfig | 74 --- arch/powerpc/configs/ppc40x_defconfig| 8 - arch/powerpc/configs/ppc44x_defconfig| 8 - arch/powerpc/include/asm/xilinx_intc.h | 16 - arch/powerpc/include/asm/xilinx_pci.h| 21 - arch/powerpc/kernel/cputable.c | 39 -- arch/powerpc/platforms/40x/Kconfig | 31 -- arch/powerpc/platforms/40x/Makefile | 1 - arch/powerpc/platforms/40x/virtex.c | 54 --- arch/powerpc/platforms/44x/Kconfig | 37 -- arch/powerpc/platforms/44x/Makefile | 2 - arch/powerpc/platforms/44x/virtex.c | 60 --- arch/powerpc/platforms/44x/virtex_ml510.c| 30 -- arch/powerpc/platforms/Kconfig | 4 - arch/powerpc/sysdev/Makefile | 2 - arch/powerpc/sysdev/xilinx_intc.c| 88 arch/powerpc/sysdev/xilinx_pci.c | 132 -- arch/powerpc/xmon/ppc-dis.c | 6 - arch/powerpc/xmon/ppc-opc.c | 23 - arch/powerpc/xmon/ppc.h | 5 - drivers/char/Kconfig | 2 +- drivers/video/fbdev/Kconfig | 2 +- 37 files changed, 7 insertions(+), 1993 deletions(-) delete mode 100644 arch/powerpc/boot/dts/virtex440-ml507.dts delete mode 100644 arch/powerpc/boot/dts/virtex440-ml510.dts delete mode 100644 arch/powerpc/boot/uartlite.c delete mode 100644 arch/powerpc/boot/virtex.c delete mode 100644 arch/powerpc/boot/virtex405-head.S delete mode 100644 arch/powerpc/configs/40x/virtex_defconfig delete mode 100644 arch/powerpc/configs/44x/virtex5_defconfig delete mode 100644 arch/powerpc/include/asm/xilinx_intc.h delete mode 100644 arch/powerpc/include/asm/xilinx_pci.h delete mode 100644 arch/powerpc/platforms/40x/virtex.c delete mode 100644 arch/powerpc/platforms/44x/virtex.c delete mode 100644 arch/powerpc/platforms/44x/virtex_ml510.c delete mode 100644 arch/powerpc/sysdev/xilinx_intc.c delete mode 100644 arch/powerpc/sysdev/xilinx_pci.c diff --git a/Documentation/devicetree/bindings/xilinx.txt b/Documentation/devicetree/bindings/xilinx.txt index d058ace29345..28199b31fe5e 100644 --- a/Documentation/devicetree/bindings/xilinx.txt +++ b/Documentation/devicetree/bindings/xilinx.txt @@ -86,149 +86,6 @@ xlnx,use-parity = <0>; }; - Some IP cores actually implement 2 or more logical devices. In - this case, the device should still describe the whole IP core with - a single node and add a child node for each logical device. The - ranges property can be used to translate from parent IP-core to the - registers of each device. In addition, the parent node should be - compatible with the bus type 'xlnx,compound', and should contain - #address-cells and #size-cells, as with any other bus. (Note: this - makes the assumption that both logical devices have the same bus - binding. If this is not true, then separate nodes should be used - for each logical device). The 'cell-index' property can be used to - enumerate logical devices within an IP core. For example, the - following is the system.mhs entry for the dual ps2 controller found - on the ml403 reference design. - - BEGIN opb_ps2_dual_ref - PARAMETER INSTANCE = opb_ps2_dual_ref_0 - PARAMETER HW_VER = 1.00.a - PARAMETER C_BASEADDR = 0xA900 - PARAMETER C_HIGHADDR = 0xA9001FFF - BUS_INTERFACE SOPB = opb_v20_0 - PORT Sys_Intr1 = ps2_1_intr - PORT Sys_Intr2 = ps2_2_intr - PORT Clkin
[PATCH 0/2] powerpc: Remove support for ppc405/440 Xilinx platforms
Hi, recently we wanted to update xilinx intc driver and we found that function which we wanted to remove is still wired by ancient Xilinx PowerPC platforms. Here is the thread about it. https://lore.kernel.org/linux-next/48d3232d-0f1d-42ea-3109-f44bbabfa...@xilinx.com/ I have been talking about it internally and there is no interest in these platforms and it is also orphan for quite a long time. None is really running/testing these platforms regularly that's why I think it makes sense to remove them also with drivers which are specific to this platform. U-Boot support was removed in 2017 without anybody complain about it https://github.com/Xilinx/u-boot-xlnx/commit/98f705c9cefdfdba62c069821bbba10273a0a8ed Based on current ppc/next. If anyone has any objection about it, please let me know. Thanks, Michal Michal Simek (2): sound: ac97: Remove sound driver for ancient platform powerpc: Remove Xilinx PPC405/PPC440 support Documentation/devicetree/bindings/xilinx.txt | 143 -- Documentation/powerpc/bootwrapper.rst| 28 +- MAINTAINERS |6 - arch/powerpc/Kconfig.debug |2 +- arch/powerpc/boot/Makefile |7 +- arch/powerpc/boot/dts/Makefile |1 - arch/powerpc/boot/dts/virtex440-ml507.dts| 406 -- arch/powerpc/boot/dts/virtex440-ml510.dts| 466 --- arch/powerpc/boot/ops.h |1 - arch/powerpc/boot/serial.c |5 - arch/powerpc/boot/uartlite.c | 79 -- arch/powerpc/boot/virtex.c | 97 -- arch/powerpc/boot/virtex405-head.S | 31 - arch/powerpc/boot/wrapper|8 - arch/powerpc/configs/40x/virtex_defconfig| 75 - arch/powerpc/configs/44x/virtex5_defconfig | 74 - arch/powerpc/configs/ppc40x_defconfig|8 - arch/powerpc/configs/ppc44x_defconfig|8 - arch/powerpc/include/asm/xilinx_intc.h | 16 - arch/powerpc/include/asm/xilinx_pci.h| 21 - arch/powerpc/kernel/cputable.c | 39 - arch/powerpc/platforms/40x/Kconfig | 31 - arch/powerpc/platforms/40x/Makefile |1 - arch/powerpc/platforms/40x/virtex.c | 54 - arch/powerpc/platforms/44x/Kconfig | 37 - arch/powerpc/platforms/44x/Makefile |2 - arch/powerpc/platforms/44x/virtex.c | 60 - arch/powerpc/platforms/44x/virtex_ml510.c| 30 - arch/powerpc/platforms/Kconfig |4 - arch/powerpc/sysdev/Makefile |2 - arch/powerpc/sysdev/xilinx_intc.c| 88 -- arch/powerpc/sysdev/xilinx_pci.c | 132 -- arch/powerpc/xmon/ppc-dis.c |6 - arch/powerpc/xmon/ppc-opc.c | 23 - arch/powerpc/xmon/ppc.h |5 - drivers/char/Kconfig |2 +- drivers/video/fbdev/Kconfig |2 +- sound/drivers/Kconfig| 12 - sound/drivers/Makefile |2 - sound/drivers/ml403-ac97cr.c | 1298 -- 40 files changed, 7 insertions(+), 3305 deletions(-) delete mode 100644 arch/powerpc/boot/dts/virtex440-ml507.dts delete mode 100644 arch/powerpc/boot/dts/virtex440-ml510.dts delete mode 100644 arch/powerpc/boot/uartlite.c delete mode 100644 arch/powerpc/boot/virtex.c delete mode 100644 arch/powerpc/boot/virtex405-head.S delete mode 100644 arch/powerpc/configs/40x/virtex_defconfig delete mode 100644 arch/powerpc/configs/44x/virtex5_defconfig delete mode 100644 arch/powerpc/include/asm/xilinx_intc.h delete mode 100644 arch/powerpc/include/asm/xilinx_pci.h delete mode 100644 arch/powerpc/platforms/40x/virtex.c delete mode 100644 arch/powerpc/platforms/44x/virtex.c delete mode 100644 arch/powerpc/platforms/44x/virtex_ml510.c delete mode 100644 arch/powerpc/sysdev/xilinx_intc.c delete mode 100644 arch/powerpc/sysdev/xilinx_pci.c delete mode 100644 sound/drivers/ml403-ac97cr.c -- 2.26.0
Re: [PATCH v3] powerpc/perf: Use SIER_USER_MASK while updating SPRN_SIER for EBB events
> On 19-Mar-2020, at 4:22 PM, Michael Ellerman wrote: > > Hi Athira, > > Athira Rajeev writes: >> Sampled Instruction Event Register (SIER), is a PMU register, > ^ > that >> captures architecture state for a given sample. And sier_user_mask > ^ ^ > don't think we need "architecture" SIER_USER_MASK > >> defined in commit 330a1eb7775b ("powerpc/perf: Core EBB support for 64-bit >> book3s") defines the architected bits that needs to be saved from the SPR. > > Not quite, it defines the bits that are visible to userspace. > > And I think it's true that for EBB events the bits we need/want to save > are only the user visible bits. > >> Currently all of the bits from SIER are saved for EBB events. Patch fixes >> this by ANDing the "sier_user_mask" to data from SIER in ebb_switch_out(). >> This will force save only architected bits from the SIER. > > s/architected/user visible/ > > > But, why does it matter? The kernel saves the user visible bits, as well > as the kernel-only bits into the thread struct. And then later the > kernel restores that value into the hardware before returning to > userspace. > > But the hardware enforces the visibility of the bits, so userspace can't > observe any bits that it shouldn't. > > Or is there some other mechanism whereby userspace can see those bits? ;) > > If there was, what would the security implications of that be? Hi Michael, Thanks for your comments. In ebb_switch_in, we set PMCC bit [MMCR0 44:45 ] to 10 which means SIER ( Group B ) register is readable in problem state. Hence the intention of the patch was to make sure we are not exposing the bits which the userspace shouldn't be reading. But following your comment about "hardware enforcing the visibility of bits", I did try an "ebb" experiment which showed that reading SPRN_SIER didn't expose any bits other than the user visible bits. Sorry for the confusion here. In that case, Can we drop the existing definition of SIER_USER_MASK if it is no more needed ? Thanks Athira > > cheers > >> Fixes: 330a1eb7775b ("powerpc/perf: Core EBB support for 64-bit book3s") >> Signed-off-by: Athira Rajeev >> --- >> Changelog: >> v2 -> v3: >> - Corrected name of SIER register in commit message >> as pointed by Segher Boessenkool >> v1 -> v2: >> - Make the commit message more clearer. >> >> arch/powerpc/perf/core-book3s.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/arch/powerpc/perf/core-book3s.c >> b/arch/powerpc/perf/core-book3s.c >> index 3086055..48b61cc 100644 >> --- a/arch/powerpc/perf/core-book3s.c >> +++ b/arch/powerpc/perf/core-book3s.c >> @@ -579,7 +579,7 @@ static void ebb_switch_out(unsigned long mmcr0) >> return; >> >> current->thread.siar = mfspr(SPRN_SIAR); >> -current->thread.sier = mfspr(SPRN_SIER); >> +current->thread.sier = mfspr(SPRN_SIER) & SIER_USER_MASK; >> current->thread.sdar = mfspr(SPRN_SDAR); >> current->thread.mmcr0 = mmcr0 & MMCR0_USER_MASK; >> current->thread.mmcr2 = mfspr(SPRN_MMCR2) & MMCR2_USER_MASK; >> -- >> 1.8.3.1
[PATCH v4 4/6] powerpc/sysfs: Show idle_purr and idle_spurr for every CPU
From: "Gautham R. Shenoy" On Pseries LPARs, to calculate utilization, we need to know the [S]PURR ticks when the CPUs were busy or idle. The total PURR and SPURR ticks are already exposed via the per-cpu sysfs files "purr" and "spurr". This patch adds support for exposing the idle PURR and SPURR ticks via new per-cpu sysfs files named "idle_purr" and "idle_spurr". Signed-off-by: Gautham R. Shenoy --- arch/powerpc/kernel/sysfs.c | 82 +++-- 1 file changed, 79 insertions(+), 3 deletions(-) diff --git a/arch/powerpc/kernel/sysfs.c b/arch/powerpc/kernel/sysfs.c index 479c706..571b325 100644 --- a/arch/powerpc/kernel/sysfs.c +++ b/arch/powerpc/kernel/sysfs.c @@ -19,6 +19,7 @@ #include #include #include +#include #include #include "cacheinfo.h" @@ -760,6 +761,74 @@ static void create_svm_file(void) } #endif /* CONFIG_PPC_SVM */ +#ifdef CONFIG_PPC_PSERIES +static void read_idle_purr(void *val) +{ + u64 *ret = val; + + *ret = read_this_idle_purr(); +} + +static ssize_t idle_purr_show(struct device *dev, + struct device_attribute *attr, char *buf) +{ + struct cpu *cpu = container_of(dev, struct cpu, dev); + u64 val; + + smp_call_function_single(cpu->dev.id, read_idle_purr, &val, 1); + return sprintf(buf, "%llx\n", val); +} +static DEVICE_ATTR(idle_purr, 0400, idle_purr_show, NULL); + +static void create_idle_purr_file(struct device *s) +{ + if (firmware_has_feature(FW_FEATURE_LPAR)) + device_create_file(s, &dev_attr_idle_purr); +} + +static void remove_idle_purr_file(struct device *s) +{ + if (firmware_has_feature(FW_FEATURE_LPAR)) + device_remove_file(s, &dev_attr_idle_purr); +} + +static void read_idle_spurr(void *val) +{ + u64 *ret = val; + + *ret = read_this_idle_spurr(); +} + +static ssize_t idle_spurr_show(struct device *dev, + struct device_attribute *attr, char *buf) +{ + struct cpu *cpu = container_of(dev, struct cpu, dev); + u64 val; + + smp_call_function_single(cpu->dev.id, read_idle_spurr, &val, 1); + return sprintf(buf, "%llx\n", val); +} +static DEVICE_ATTR(idle_spurr, 0400, idle_spurr_show, NULL); + +static void create_idle_spurr_file(struct device *s) +{ + if (firmware_has_feature(FW_FEATURE_LPAR)) + device_create_file(s, &dev_attr_idle_spurr); +} + +static void remove_idle_spurr_file(struct device *s) +{ + if (firmware_has_feature(FW_FEATURE_LPAR)) + device_remove_file(s, &dev_attr_idle_spurr); +} + +#else /* CONFIG_PPC_PSERIES */ +#define create_idle_purr_file(s) +#define remove_idle_purr_file(s) +#define create_idle_spurr_file(s) +#define remove_idle_spurr_file(s) +#endif /* CONFIG_PPC_PSERIES */ + static int register_cpu_online(unsigned int cpu) { struct cpu *c = &per_cpu(cpu_devices, cpu); @@ -823,10 +892,13 @@ static int register_cpu_online(unsigned int cpu) if (!firmware_has_feature(FW_FEATURE_LPAR)) add_write_permission_dev_attr(&dev_attr_purr); device_create_file(s, &dev_attr_purr); + create_idle_purr_file(s); } - if (cpu_has_feature(CPU_FTR_SPURR)) + if (cpu_has_feature(CPU_FTR_SPURR)) { device_create_file(s, &dev_attr_spurr); + create_idle_spurr_file(s); + } if (cpu_has_feature(CPU_FTR_DSCR)) device_create_file(s, &dev_attr_dscr); @@ -910,11 +982,15 @@ static int unregister_cpu_online(unsigned int cpu) device_remove_file(s, &dev_attr_mmcra); #endif /* CONFIG_PMU_SYSFS */ - if (cpu_has_feature(CPU_FTR_PURR)) + if (cpu_has_feature(CPU_FTR_PURR)) { device_remove_file(s, &dev_attr_purr); + remove_idle_purr_file(s); + } - if (cpu_has_feature(CPU_FTR_SPURR)) + if (cpu_has_feature(CPU_FTR_SPURR)) { device_remove_file(s, &dev_attr_spurr); + remove_idle_spurr_file(s); + } if (cpu_has_feature(CPU_FTR_DSCR)) device_remove_file(s, &dev_attr_dscr); -- 1.9.4
[PATCH v4 6/6] pseries/sysfs: Minimise IPI noise while reading [idle_][s]purr
From: "Gautham R. Shenoy" Currently purr, spurr, idle_purr, idle_spurr are exposed for every CPU via the sysfs interface /sys/devices/system/cpu/cpuX/[idle_][s]purr. Each sysfs read currently generates an IPI to obtain the desired value from the target CPU X. Since these aforementioned sysfs are typically read one after another, we end up generating 4 IPIs per CPU in a short duration. In order to minimize the IPI noise, this patch caches the values of all the four entities whenever one of them is read. If subsequently any of these are read within the next 10ms, the cached value is returned. With this, we will generate at most one IPI every 10ms for every CPU. Test-results: While reading the four sysfs files back-to-back for a given CPU every second for 100 seconds. Without the patch: 16 [XICS 2 Edge IPI] = 422 times DBL [Doorbell interrupts] = 13 times Total : 435 IPIs. With the patch: 16 [XICS 2 Edge IPI] = 111 times DBL [Doorbell interrupts] = 17 times Total : 128 IPIs. Signed-off-by: Gautham R. Shenoy --- arch/powerpc/kernel/sysfs.c | 117 1 file changed, 97 insertions(+), 20 deletions(-) diff --git a/arch/powerpc/kernel/sysfs.c b/arch/powerpc/kernel/sysfs.c index 571b325..bd92023 100644 --- a/arch/powerpc/kernel/sysfs.c +++ b/arch/powerpc/kernel/sysfs.c @@ -586,8 +586,6 @@ void ppc_enable_pmcs(void) * SPRs which are not related to PMU. */ #ifdef CONFIG_PPC64 -SYSFS_SPRSETUP(purr, SPRN_PURR); -SYSFS_SPRSETUP(spurr, SPRN_SPURR); SYSFS_SPRSETUP(pir, SPRN_PIR); SYSFS_SPRSETUP(tscr, SPRN_TSCR); @@ -596,8 +594,6 @@ void ppc_enable_pmcs(void) enable write when needed with a separate function. Lets be conservative and default to pseries. */ -static DEVICE_ATTR(spurr, 0400, show_spurr, NULL); -static DEVICE_ATTR(purr, 0400, show_purr, store_purr); static DEVICE_ATTR(pir, 0400, show_pir, NULL); static DEVICE_ATTR(tscr, 0600, show_tscr, store_tscr); #endif /* CONFIG_PPC64 */ @@ -761,22 +757,110 @@ static void create_svm_file(void) } #endif /* CONFIG_PPC_SVM */ +#ifdef CONFIG_PPC64 +/* + * The duration (in ms) from the last IPI to the target CPU until + * which a cached value of purr, spurr, idle_purr, idle_spurr can be + * reported to the user on a corresponding sysfs file read. Beyond + * this duration, fresh values need to be obtained by sending IPIs to + * the target CPU when the sysfs files are read. + */ +static unsigned long util_stats_staleness_tolerance_ms = 10; +struct util_acct_stats { + u64 latest_purr; + u64 latest_spurr; +#ifdef CONFIG_PPC_PSERIES + u64 latest_idle_purr; + u64 latest_idle_spurr; +#endif + unsigned long last_update_jiffies; +}; + +DEFINE_PER_CPU(struct util_acct_stats, util_acct_stats); + +static void update_util_acct_stats(void *ptr) +{ + struct util_acct_stats *stats = ptr; + + stats->latest_purr = mfspr(SPRN_PURR); + stats->latest_spurr = mfspr(SPRN_SPURR); #ifdef CONFIG_PPC_PSERIES -static void read_idle_purr(void *val) + stats->latest_idle_purr = read_this_idle_purr(); + stats->latest_idle_spurr = read_this_idle_spurr(); +#endif + stats->last_update_jiffies = jiffies; +} + +struct util_acct_stats *get_util_stats_ptr(int cpu) +{ + struct util_acct_stats *stats = per_cpu_ptr(&util_acct_stats, cpu); + unsigned long delta_jiffies; + + delta_jiffies = jiffies - stats->last_update_jiffies; + + /* +* If we have a recent enough data, reuse that instead of +* sending an IPI. +*/ + if (jiffies_to_msecs(delta_jiffies) < util_stats_staleness_tolerance_ms) + return stats; + + smp_call_function_single(cpu, update_util_acct_stats, stats, 1); + return stats; +} + +static ssize_t show_purr(struct device *dev, +struct device_attribute *attr, char *buf) { - u64 *ret = val; + struct cpu *cpu = container_of(dev, struct cpu, dev); + struct util_acct_stats *stats; - *ret = read_this_idle_purr(); + stats = get_util_stats_ptr(cpu->dev.id); + return sprintf(buf, "%llx\n", stats->latest_purr); } +static void write_purr(void *val) +{ + mtspr(SPRN_PURR, *(unsigned long *)val); +} + +static ssize_t __used store_purr(struct device *dev, +struct device_attribute *attr, +const char *buf, size_t count) +{ + struct cpu *cpu = container_of(dev, struct cpu, dev); + unsigned long val; + int ret = kstrtoul(buf, 16, &val); + + if (ret != 0) + return -EINVAL; + + smp_call_function_single(cpu->dev.id, write_purr, &val, 1); + return count; +} +static DEVICE_ATTR(purr, 0400, show_purr, store_purr); + +static ssize_t show_spurr(struct device *dev, + struct device_attribute *attr, char *buf) +{
[PATCH v4 5/6] Documentation: Document sysfs interfaces purr, spurr, idle_purr, idle_spurr
From: "Gautham R. Shenoy" Add documentation for the following sysfs interfaces: /sys/devices/system/cpu/cpuX/purr /sys/devices/system/cpu/cpuX/spurr /sys/devices/system/cpu/cpuX/idle_purr /sys/devices/system/cpu/cpuX/idle_spurr Signed-off-by: Gautham R. Shenoy --- Documentation/ABI/testing/sysfs-devices-system-cpu | 39 ++ 1 file changed, 39 insertions(+) diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu b/Documentation/ABI/testing/sysfs-devices-system-cpu index 2e0e3b4..bc07677 100644 --- a/Documentation/ABI/testing/sysfs-devices-system-cpu +++ b/Documentation/ABI/testing/sysfs-devices-system-cpu @@ -580,3 +580,42 @@ Description: Secure Virtual Machine If 1, it means the system is using the Protected Execution Facility in POWER9 and newer processors. i.e., it is a Secure Virtual Machine. + +What: /sys/devices/system/cpu/cpuX/purr +Date: Apr 2005 +Contact: Linux for PowerPC mailing list +Description: PURR ticks for this CPU since the system boot. + + The Processor Utilization Resources Register (PURR) is + a 64-bit counter which provides an estimate of the + resources used by the CPU thread. The contents of this + register increases monotonically. This sysfs interface + exposes the number of PURR ticks for cpuX. + +What: /sys/devices/system/cpu/cpuX/spurr +Date: Dec 2006 +Contact: Linux for PowerPC mailing list +Description: SPURR ticks for this CPU since the system boot. + + The Scaled Processor Utilization Resources Register + (SPURR) is a 64-bit counter that provides a frequency + invariant estimate of the resources used by the CPU + thread. The contents of this register increases + monotonically. This sysfs interface exposes the number + of SPURR ticks for cpuX. + +What: /sys/devices/system/cpu/cpuX/idle_purr +Date: Mar 2020 +Contact: Linux for PowerPC mailing list +Description: PURR ticks for cpuX when it was idle. + + This sysfs interface exposes the number of PURR ticks + for cpuX when it was idle. + +What: /sys/devices/system/cpu/cpuX/idle_spurr +Date: Mar 2020 +Contact: Linux for PowerPC mailing list +Description: SPURR ticks for cpuX when it was idle. + + This sysfs interface exposes the number of SPURR ticks + for cpuX when it was idle. -- 1.9.4
[PATCH v4 1/6] powerpc: Move idle_loop_prolog()/epilog() functions to header file
From: "Gautham R. Shenoy" Currently prior to entering an idle state on a Linux Guest, the pseries cpuidle driver implement an idle_loop_prolog() and idle_loop_epilog() functions which ensure that idle_purr is correctly computed, and the hypervisor is informed that the CPU cycles have been donated. These prolog and epilog functions are also required in the default idle call, i.e pseries_lpar_idle(). Hence move these accessor functions to a common header file and call them from pseries_lpar_idle(). Since the existing header files such as asm/processor.h have enough clutter, create a new header file asm/idle.h. Finally rename idle_loop_prolog() and idle_loop_epilog() to pseries_idle_prolog() and pseries_idle_epilog() as they are only relavent for on pseries guests. Signed-off-by: Gautham R. Shenoy --- arch/powerpc/include/asm/idle.h| 31 + arch/powerpc/platforms/pseries/setup.c | 7 +-- drivers/cpuidle/cpuidle-pseries.c | 36 +++--- 3 files changed, 43 insertions(+), 31 deletions(-) create mode 100644 arch/powerpc/include/asm/idle.h diff --git a/arch/powerpc/include/asm/idle.h b/arch/powerpc/include/asm/idle.h new file mode 100644 index 000..32064a4c --- /dev/null +++ b/arch/powerpc/include/asm/idle.h @@ -0,0 +1,31 @@ +/* SPDX-License-Identifier: GPL-2.0-or-later */ +#ifndef _ASM_POWERPC_IDLE_H +#define _ASM_POWERPC_IDLE_H +#include +#include + +#ifdef CONFIG_PPC_PSERIES +static inline void pseries_idle_prolog(unsigned long *in_purr) +{ + ppc64_runlatch_off(); + *in_purr = mfspr(SPRN_PURR); + /* +* Indicate to the HV that we are idle. Now would be +* a good time to find other work to dispatch. +*/ + get_lppaca()->idle = 1; +} + +static inline void pseries_idle_epilog(unsigned long in_purr) +{ + u64 wait_cycles; + + wait_cycles = be64_to_cpu(get_lppaca()->wait_state_cycles); + wait_cycles += mfspr(SPRN_PURR) - in_purr; + get_lppaca()->wait_state_cycles = cpu_to_be64(wait_cycles); + get_lppaca()->idle = 0; + + ppc64_runlatch_on(); +} +#endif /* CONFIG_PPC_PSERIES */ +#endif diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c index 0c8421d..2f53e6b 100644 --- a/arch/powerpc/platforms/pseries/setup.c +++ b/arch/powerpc/platforms/pseries/setup.c @@ -68,6 +68,7 @@ #include #include #include +#include #include #include @@ -319,6 +320,8 @@ static int alloc_dispatch_log_kmem_cache(void) static void pseries_lpar_idle(void) { + unsigned long in_purr; + /* * Default handler to go into low thread priority and possibly * low power mode by ceding processor to hypervisor @@ -328,7 +331,7 @@ static void pseries_lpar_idle(void) return; /* Indicate to hypervisor that we are idle. */ - get_lppaca()->idle = 1; + pseries_idle_prolog(&in_purr); /* * Yield the processor to the hypervisor. We return if @@ -339,7 +342,7 @@ static void pseries_lpar_idle(void) */ cede_processor(); - get_lppaca()->idle = 0; + pseries_idle_epilog(in_purr); } /* diff --git a/drivers/cpuidle/cpuidle-pseries.c b/drivers/cpuidle/cpuidle-pseries.c index 74c2479..46d5e05 100644 --- a/drivers/cpuidle/cpuidle-pseries.c +++ b/drivers/cpuidle/cpuidle-pseries.c @@ -19,6 +19,7 @@ #include #include #include +#include #include struct cpuidle_driver pseries_idle_driver = { @@ -31,29 +32,6 @@ struct cpuidle_driver pseries_idle_driver = { static u64 snooze_timeout __read_mostly; static bool snooze_timeout_en __read_mostly; -static inline void idle_loop_prolog(unsigned long *in_purr) -{ - ppc64_runlatch_off(); - *in_purr = mfspr(SPRN_PURR); - /* -* Indicate to the HV that we are idle. Now would be -* a good time to find other work to dispatch. -*/ - get_lppaca()->idle = 1; -} - -static inline void idle_loop_epilog(unsigned long in_purr) -{ - u64 wait_cycles; - - wait_cycles = be64_to_cpu(get_lppaca()->wait_state_cycles); - wait_cycles += mfspr(SPRN_PURR) - in_purr; - get_lppaca()->wait_state_cycles = cpu_to_be64(wait_cycles); - get_lppaca()->idle = 0; - - ppc64_runlatch_on(); -} - static int snooze_loop(struct cpuidle_device *dev, struct cpuidle_driver *drv, int index) @@ -63,7 +41,7 @@ static int snooze_loop(struct cpuidle_device *dev, set_thread_flag(TIF_POLLING_NRFLAG); - idle_loop_prolog(&in_purr); + pseries_idle_prolog(&in_purr); local_irq_enable(); snooze_exit_time = get_tb() + snooze_timeout; @@ -87,7 +65,7 @@ static int snooze_loop(struct cpuidle_device *dev, local_irq_disable(); - idle_loop_epilog(in_purr); + pseries_idle_epilog(in_purr); return index; } @@ -115,7 +93,7 @@ static int dedicate
[PATCH v4 3/6] powerpc/pseries: Account for SPURR ticks on idle CPUs
From: "Gautham R. Shenoy" On Pseries LPARs, to calculate utilization, we need to know the [S]PURR ticks when the CPUs were busy or idle. Via pseries_idle_prolog(), pseries_idle_epilog(), we track the idle PURR ticks in the VPA variable "wait_state_cycles". This patch extends the support to account for the idle SPURR ticks. It also provides an accessor function to accurately reads idle SPURR ticks. Signed-off-by: Gautham R. Shenoy --- arch/powerpc/include/asm/idle.h| 33 + arch/powerpc/platforms/pseries/setup.c | 2 ++ 2 files changed, 35 insertions(+) diff --git a/arch/powerpc/include/asm/idle.h b/arch/powerpc/include/asm/idle.h index d4bfb6a..accd1f5 100644 --- a/arch/powerpc/include/asm/idle.h +++ b/arch/powerpc/include/asm/idle.h @@ -5,13 +5,20 @@ #include #ifdef CONFIG_PPC_PSERIES +DECLARE_PER_CPU(u64, idle_spurr_cycles); DECLARE_PER_CPU(u64, idle_entry_purr_snap); +DECLARE_PER_CPU(u64, idle_entry_spurr_snap); static inline void snapshot_purr_idle_entry(void) { *this_cpu_ptr(&idle_entry_purr_snap) = mfspr(SPRN_PURR); } +static inline void snapshot_spurr_idle_entry(void) +{ + *this_cpu_ptr(&idle_entry_spurr_snap) = mfspr(SPRN_SPURR); +} + static inline void update_idle_purr_accounting(void) { u64 wait_cycles; @@ -22,10 +29,19 @@ static inline void update_idle_purr_accounting(void) get_lppaca()->wait_state_cycles = cpu_to_be64(wait_cycles); } +static inline void update_idle_spurr_accounting(void) +{ + u64 *idle_spurr_cycles_ptr = this_cpu_ptr(&idle_spurr_cycles); + u64 in_spurr = *this_cpu_ptr(&idle_entry_spurr_snap); + + *idle_spurr_cycles_ptr += mfspr(SPRN_SPURR) - in_spurr; +} + static inline void pseries_idle_prolog(void) { ppc64_runlatch_off(); snapshot_purr_idle_entry(); + snapshot_spurr_idle_entry(); /* * Indicate to the HV that we are idle. Now would be * a good time to find other work to dispatch. @@ -36,6 +52,7 @@ static inline void pseries_idle_prolog(void) static inline void pseries_idle_epilog(void) { update_idle_purr_accounting(); + update_idle_spurr_accounting(); get_lppaca()->idle = 0; ppc64_runlatch_on(); } @@ -56,5 +73,21 @@ static inline u64 read_this_idle_purr(void) return be64_to_cpu(get_lppaca()->wait_state_cycles); } +static inline u64 read_this_idle_spurr(void) +{ + /* +* If we are reading from an idle context, update the +* idle-spurr cycles corresponding to the last idle period. +* Since the idle context is not yet over, take a fresh +* snapshot of the idle-spurr. +*/ + if (get_lppaca()->idle == 1) { + update_idle_spurr_accounting(); + snapshot_spurr_idle_entry(); + } + + return *this_cpu_ptr(&idle_spurr_cycles); +} + #endif /* CONFIG_PPC_PSERIES */ #endif diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c index 4905c96..1b55e80 100644 --- a/arch/powerpc/platforms/pseries/setup.c +++ b/arch/powerpc/platforms/pseries/setup.c @@ -318,7 +318,9 @@ static int alloc_dispatch_log_kmem_cache(void) } machine_early_initcall(pseries, alloc_dispatch_log_kmem_cache); +DEFINE_PER_CPU(u64, idle_spurr_cycles); DEFINE_PER_CPU(u64, idle_entry_purr_snap); +DEFINE_PER_CPU(u64, idle_entry_spurr_snap); static void pseries_lpar_idle(void) { /* -- 1.9.4
[PATCH v4 2/6] powerpc/idle: Add accessor function to always read latest idle PURR
From: "Gautham R. Shenoy" Currently when CPU goes idle, we take a snapshot of PURR via pseries_idle_prolog() which is used at the CPU idle exit to compute the idle PURR cycles via the function pseries_idle_epilog(). Thus, the value of idle PURR cycle thus read before pseries_idle_prolog() and after pseries_idle_epilog() is always correct. However, if we were to read the idle PURR cycles from an interrupt context between pseries_idle_prolog() and pseries_idle_epilog() (this will be done in a future patch), then, the value of the idle PURR thus read will not include the cycles spent in the most recent idle period. This patch addresses the issue by providing accessor function to read the idle PURR such such that it includes the cycles spent in the most recent idle period, if we read it between pseries_idle_prolog() and pseries_idle_epilog(). In order to achieve it, the patch saves the snapshot of PURR in pseries_idle_prolog() in a per-cpu variable, instead of on the stack, so that it can be accessed from an interrupt context. Signed-off-by: Gautham R. Shenoy --- arch/powerpc/include/asm/idle.h| 47 +++--- arch/powerpc/platforms/pseries/setup.c | 7 +++-- drivers/cpuidle/cpuidle-pseries.c | 15 +-- 3 files changed, 47 insertions(+), 22 deletions(-) diff --git a/arch/powerpc/include/asm/idle.h b/arch/powerpc/include/asm/idle.h index 32064a4c..d4bfb6a 100644 --- a/arch/powerpc/include/asm/idle.h +++ b/arch/powerpc/include/asm/idle.h @@ -5,10 +5,27 @@ #include #ifdef CONFIG_PPC_PSERIES -static inline void pseries_idle_prolog(unsigned long *in_purr) +DECLARE_PER_CPU(u64, idle_entry_purr_snap); + +static inline void snapshot_purr_idle_entry(void) +{ + *this_cpu_ptr(&idle_entry_purr_snap) = mfspr(SPRN_PURR); +} + +static inline void update_idle_purr_accounting(void) +{ + u64 wait_cycles; + u64 in_purr = *this_cpu_ptr(&idle_entry_purr_snap); + + wait_cycles = be64_to_cpu(get_lppaca()->wait_state_cycles); + wait_cycles += mfspr(SPRN_PURR) - in_purr; + get_lppaca()->wait_state_cycles = cpu_to_be64(wait_cycles); +} + +static inline void pseries_idle_prolog(void) { ppc64_runlatch_off(); - *in_purr = mfspr(SPRN_PURR); + snapshot_purr_idle_entry(); /* * Indicate to the HV that we are idle. Now would be * a good time to find other work to dispatch. @@ -16,16 +33,28 @@ static inline void pseries_idle_prolog(unsigned long *in_purr) get_lppaca()->idle = 1; } -static inline void pseries_idle_epilog(unsigned long in_purr) +static inline void pseries_idle_epilog(void) { - u64 wait_cycles; - - wait_cycles = be64_to_cpu(get_lppaca()->wait_state_cycles); - wait_cycles += mfspr(SPRN_PURR) - in_purr; - get_lppaca()->wait_state_cycles = cpu_to_be64(wait_cycles); + update_idle_purr_accounting(); get_lppaca()->idle = 0; - ppc64_runlatch_on(); } + +static inline u64 read_this_idle_purr(void) +{ + /* +* If we are reading from an idle context, update the +* idle-purr cycles corresponding to the last idle period. +* Since the idle context is not yet over, take a fresh +* snapshot of the idle-purr. +*/ + if (unlikely(get_lppaca()->idle == 1)) { + update_idle_purr_accounting(); + snapshot_purr_idle_entry(); + } + + return be64_to_cpu(get_lppaca()->wait_state_cycles); +} + #endif /* CONFIG_PPC_PSERIES */ #endif diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c index 2f53e6b..4905c96 100644 --- a/arch/powerpc/platforms/pseries/setup.c +++ b/arch/powerpc/platforms/pseries/setup.c @@ -318,10 +318,9 @@ static int alloc_dispatch_log_kmem_cache(void) } machine_early_initcall(pseries, alloc_dispatch_log_kmem_cache); +DEFINE_PER_CPU(u64, idle_entry_purr_snap); static void pseries_lpar_idle(void) { - unsigned long in_purr; - /* * Default handler to go into low thread priority and possibly * low power mode by ceding processor to hypervisor @@ -331,7 +330,7 @@ static void pseries_lpar_idle(void) return; /* Indicate to hypervisor that we are idle. */ - pseries_idle_prolog(&in_purr); + pseries_idle_prolog(); /* * Yield the processor to the hypervisor. We return if @@ -342,7 +341,7 @@ static void pseries_lpar_idle(void) */ cede_processor(); - pseries_idle_epilog(in_purr); + pseries_idle_epilog(); } /* diff --git a/drivers/cpuidle/cpuidle-pseries.c b/drivers/cpuidle/cpuidle-pseries.c index 46d5e05..6513ef2 100644 --- a/drivers/cpuidle/cpuidle-pseries.c +++ b/drivers/cpuidle/cpuidle-pseries.c @@ -36,12 +36,11 @@ static int snooze_loop(struct cpuidle_device *dev, struct cpuidle_driver *drv, int index) { - unsigned long in_purr; u64
[PATCH v4 0/6] [PATCH v4 0/6] Track and expose idle PURR and SPURR ticks
From: "Gautham R. Shenoy" Hi, This is the fourth version of the patches to track and expose idle PURR and SPURR ticks. These patches are required by tools such as lparstat to compute system utilization for capacity planning purposes. The previous versions can be found here: v3: https://lkml.org/lkml/2020/3/11/331 v2: https://lkml.org/lkml/2020/2/21/21 v1: https://lore.kernel.org/patchwork/cover/1159341/ They key changes from v3 are: - Fixed the build errors on !CONFIG_PPC64 and !CONFIG_PPC_PSERIES configurations notified by the kbuild bot. Motivation: === On PSeries LPARs, the data centers planners desire a more accurate view of system utilization per resource such as CPU to plan the system capacity requirements better. Such accuracy can be obtained by reading PURR/SPURR registers for CPU resource utilization. Tools such as lparstat which are used to compute the utilization need to know [S]PURR ticks when the cpu was busy or idle. The [S]PURR counters are already exposed through sysfs. We already account for PURR ticks when we go to idle so that we can update the VPA area. This patchset extends support to account for SPURR ticks when idle, and expose both via per-cpu sysfs files. This patch series also introduces a patch (Patch 6/6) to send an IPI in order to read and cache the values of purr, spurr, idle_purr and idle_spurr of the target CPU when any one of them is read via sysfs. These cached values will be presented if any of these sysfs are read within the next 10ms. If these sysfs files are read after 10ms from the earlier IPI, a fresh IPI is issued to read and cache the values again. This minimizes the number of IPIs required to be sent when these values are read back-to-back via the sysfs interface. Without patch 6/6 (Without caching): 16 [XICS 2 Edge IPI] = 422 times DBL [Doorbell interrupts] = 13 times Total : 435 IPIs. With patch 6/6 (With caching): 16 [XICS 2 Edge IPI] = 111 times DBL [Doorbell interrupts] = 17 times Total : 128 IPIs. These patches are required for enhancement to the lparstat utility that compute the CPU utilization based on PURR and SPURR which can be found here : https://groups.google.com/forum/#!topic/powerpc-utils-devel/fYRo69xO9r4 With the patches, when lparstat is run on a LPAR running CPU-Hogs, = sudo ./src/lparstat -E 1 3 System Configuration type=Dedicated mode=Capped smt=8 lcpu=2 mem=4834176 kB cpus=0 ent=2.00 ---Actual--- -Normalized- %busy %idle Frequency %busy %idle -- -- - -- -- 99.99 0.00 3.35GHz[111%] 110.99 0.00 100.00 0.00 3.35GHz[111%] 111.00 0.00 100.00 0.00 3.35GHz[111%] 111.00 0.00 With patches, when lparstat is run on and idle LPAR = ---Actual--- -Normalized- %busy %idle Frequency %busy %idle -- -- - -- -- 0.20 99.81 2.17GHz[ 72%] 0.19 71.82 0.42 99.58 2.11GHz[ 70%] 0.31 69.69 0.41 99.59 2.11GHz[ 70%] 0.31 69.69 Gautham R. Shenoy (6): powerpc: Move idle_loop_prolog()/epilog() functions to header file powerpc/idle: Add accessor function to always read latest idle PURR powerpc/pseries: Account for SPURR ticks on idle CPUs powerpc/sysfs: Show idle_purr and idle_spurr for every CPU Documentation: Document sysfs interfaces purr, spurr, idle_purr, idle_spurr pseries/sysfs: Minimise IPI noise while reading [idle_][s]purr Documentation/ABI/testing/sysfs-devices-system-cpu | 39 + arch/powerpc/include/asm/idle.h| 93 arch/powerpc/kernel/sysfs.c| 167 - arch/powerpc/platforms/pseries/setup.c | 8 +- drivers/cpuidle/cpuidle-pseries.c | 39 + 5 files changed, 305 insertions(+), 41 deletions(-) create mode 100644 arch/powerpc/include/asm/idle.h -- 1.9.4
[PATCH v7 6/6] perf/tools/pmu-events/powerpc: Add hv_24x7 socket/chip level metric events
The hv_24×7 feature in IBM® POWER9™ processor-based servers provide the facility to continuously collect large numbers of hardware performance metrics efficiently and accurately. This patch adds hv_24x7 metric file for different Socket/chip resources. Result: power9 platform: command:# ./perf stat --metric-only -M Memory_RD_BW_Chip -C 0 -I 1000 1.96188 0.9 0.3 2.000285720 0.5 0.1 3.000424990 0.4 0.1 command:# ./perf stat --metric-only -M PowerBUS_Frequency -C 0 -I 1000 1.979812.32.3 2.0002917132.32.3 3.0004217192.32.3 4.0005509122.32.3 Signed-off-by: Kajol Jain --- .../arch/powerpc/power9/nest_metrics.json | 19 +++ 1 file changed, 19 insertions(+) create mode 100644 tools/perf/pmu-events/arch/powerpc/power9/nest_metrics.json diff --git a/tools/perf/pmu-events/arch/powerpc/power9/nest_metrics.json b/tools/perf/pmu-events/arch/powerpc/power9/nest_metrics.json new file mode 100644 index ..c121e526442a --- /dev/null +++ b/tools/perf/pmu-events/arch/powerpc/power9/nest_metrics.json @@ -0,0 +1,19 @@ +[ +{ +"MetricExpr": "(hv_24x7@PM_MCS01_128B_RD_DISP_PORT01\\,chip\\=?@ + hv_24x7@PM_MCS01_128B_RD_DISP_PORT23\\,chip\\=?@ + hv_24x7@PM_MCS23_128B_RD_DISP_PORT01\\,chip\\=?@ + hv_24x7@PM_MCS23_128B_RD_DISP_PORT23\\,chip\\=?@)", +"MetricName": "Memory_RD_BW_Chip", +"MetricGroup": "Memory_BW", +"ScaleUnit": "1.6e-2MB" +}, +{ + "MetricExpr": "(hv_24x7@PM_MCS01_128B_WR_DISP_PORT01\\,chip\\=?@ + hv_24x7@PM_MCS01_128B_WR_DISP_PORT23\\,chip\\=?@ + hv_24x7@PM_MCS23_128B_WR_DISP_PORT01\\,chip\\=?@ + hv_24x7@PM_MCS23_128B_WR_DISP_PORT23\\,chip\\=?@ )", +"MetricName": "Memory_WR_BW_Chip", +"MetricGroup": "Memory_BW", +"ScaleUnit": "1.6e-2MB" +}, +{ + "MetricExpr": "(hv_24x7@PM_PB_CYC\\,chip\\=?@ )", +"MetricName": "PowerBUS_Frequency", +"ScaleUnit": "2.5e-7GHz" +} +] -- 2.18.1
[PATCH v7 5/6] tools/perf: Enable Hz/hz prinitg for --metric-only option
Commit 54b5091606c18 ("perf stat: Implement --metric-only mode") added function 'valid_only_metric()' which drops "Hz" or "hz", if it is part of "ScaleUnit". This patch enable it since hv_24x7 supports couple of frequency events. Signed-off-by: Kajol Jain --- tools/perf/util/stat-display.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c index 16efdba1973a..ecdebfcdd379 100644 --- a/tools/perf/util/stat-display.c +++ b/tools/perf/util/stat-display.c @@ -237,8 +237,6 @@ static bool valid_only_metric(const char *unit) if (!unit) return false; if (strstr(unit, "/sec") || - strstr(unit, "hz") || - strstr(unit, "Hz") || strstr(unit, "CPUs utilized")) return false; return true; -- 2.18.1
[PATCH v7 4/6] perf/tools: Enhance JSON/metric infrastructure to handle "?"
Patch enhances current metric infrastructure to handle "?" in the metric expression. The "?" can be use for parameters whose value not known while creating metric events and which can be replace later at runtime to the proper value. It also add flexibility to create multiple events out of single metric event added in json file. Patch adds function 'arch_get_runtimeparam' which is a arch specific function, returns the count of metric events need to be created. By default it return 1. This infrastructure needed for hv_24x7 socket/chip level events. "hv_24x7" chip level events needs specific chip-id to which the data is requested. Function 'arch_get_runtimeparam' implemented in header.c which extract number of sockets from sysfs file "sockets" under "/sys/devices/hv_24x7/interface/". With this patch basically we are trying to create as many metric events as define by runtime_param. For that one loop is added in function 'metricgroup__add_metric', which create multiple events at run time depend on return value of 'arch_get_runtimeparam' and merge that event in 'group_list'. To achieve that we are actually passing this parameter value as part of `expr__find_other` function and changing "?" present in metric expression with this value. As in our json file, there gonna be single metric event, and out of which we are creating multiple events. To understand which data count belongs to which parameter value, we also printing param value in generic_metric function. For example, command:# ./perf stat -M PowerBUS_Frequency -C 0 -I 1000 1.000101867 9,356,933 hv_24x7/pm_pb_cyc,chip=0/ # 2.3 GHz PowerBUS_Frequency_0 1.000101867 9,366,134 hv_24x7/pm_pb_cyc,chip=1/ # 2.3 GHz PowerBUS_Frequency_1 2.000314878 9,365,868 hv_24x7/pm_pb_cyc,chip=0/ # 2.3 GHz PowerBUS_Frequency_0 2.000314878 9,366,092 hv_24x7/pm_pb_cyc,chip=1/ # 2.3 GHz PowerBUS_Frequency_1 So, here _0 and _1 after PowerBUS_Frequency specify parameter value. Signed-off-by: Kajol Jain --- tools/perf/arch/powerpc/util/header.c | 8 tools/perf/tests/expr.c | 8 tools/perf/util/expr.c| 11 ++- tools/perf/util/expr.h| 5 +++-- tools/perf/util/expr.l| 27 +++--- tools/perf/util/metricgroup.c | 28 --- tools/perf/util/metricgroup.h | 2 ++ tools/perf/util/stat-shadow.c | 17 ++-- 8 files changed, 79 insertions(+), 27 deletions(-) diff --git a/tools/perf/arch/powerpc/util/header.c b/tools/perf/arch/powerpc/util/header.c index 3b4cdfc5efd6..d4870074f14c 100644 --- a/tools/perf/arch/powerpc/util/header.c +++ b/tools/perf/arch/powerpc/util/header.c @@ -7,6 +7,8 @@ #include #include #include "header.h" +#include "metricgroup.h" +#include #define mfspr(rn) ({unsigned long rval; \ asm volatile("mfspr %0," __stringify(rn) \ @@ -44,3 +46,9 @@ get_cpuid_str(struct perf_pmu *pmu __maybe_unused) return bufp; } + +int arch_get_runtimeparam(void) +{ + int count; + return sysfs__read_int("/devices/hv_24x7/interface/sockets", &count) < 0 ? 1 : count; +} diff --git a/tools/perf/tests/expr.c b/tools/perf/tests/expr.c index ea10fc4412c4..516504cf0ea5 100644 --- a/tools/perf/tests/expr.c +++ b/tools/perf/tests/expr.c @@ -10,7 +10,7 @@ static int test(struct expr_parse_ctx *ctx, const char *e, double val2) { double val; - if (expr__parse(&val, ctx, e)) + if (expr__parse(&val, ctx, e, 1)) TEST_ASSERT_VAL("parse test failed", 0); TEST_ASSERT_VAL("unexpected value", val == val2); return 0; @@ -44,15 +44,15 @@ int test__expr(struct test *t __maybe_unused, int subtest __maybe_unused) return ret; p = "FOO/0"; - ret = expr__parse(&val, &ctx, p); + ret = expr__parse(&val, &ctx, p, 1); TEST_ASSERT_VAL("division by zero", ret == -1); p = "BAR/"; - ret = expr__parse(&val, &ctx, p); + ret = expr__parse(&val, &ctx, p, 1); TEST_ASSERT_VAL("missing operand", ret == -1); TEST_ASSERT_VAL("find other", - expr__find_other("FOO + BAR + BAZ + BOZO", "FOO", &other, &num_other) == 0); + expr__find_other("FOO + BAR + BAZ + BOZO", "FOO", &other, &num_other, 1) == 0); TEST_ASSERT_VAL("find other", num_other == 3); TEST_ASSERT_VAL("find other", !strcmp(other[0], "BAR")); TEST_ASSERT_VAL("find other", !strcmp(other[1], "BAZ")); diff --git a/tools/perf/util/expr.c b/tools/perf/util/expr.c index c3382d58cf40..9d56e5fb3873 100644 --- a/tools/perf/util/expr.c +++ b/tools/perf/util/expr.c @@ -27,10 +27,11 @@ void expr__ctx_init(struct expr_parse_ctx *ctx) static int __expr__parse(double *val, struct expr_parse_ctx *ctx, const char *expr, - int start)
[PATCH v7 3/6] perf/tools: Refactoring metricgroup__add_metric function
This patch refactor metricgroup__add_metric function where some part of it move to function metricgroup__add_metric_param. No logic change. Signed-off-by: Kajol Jain --- tools/perf/util/metricgroup.c | 61 +-- 1 file changed, 36 insertions(+), 25 deletions(-) diff --git a/tools/perf/util/metricgroup.c b/tools/perf/util/metricgroup.c index 926449a7cdbf..b905eaa907a7 100644 --- a/tools/perf/util/metricgroup.c +++ b/tools/perf/util/metricgroup.c @@ -485,6 +485,40 @@ static bool metricgroup__has_constraint(struct pmu_event *pe) return false; } +static int __metricgroup__add_metric(struct strbuf *events, + struct list_head *group_list, struct pmu_event *pe) +{ + + const char **ids; + int idnum; + struct egroup *eg; + int ret = -EINVAL; + + if (expr__find_other(pe->metric_expr, NULL, &ids, &idnum) < 0) + return ret; + + if (events->len > 0) + strbuf_addf(events, ","); + + if (metricgroup__has_constraint(pe)) + metricgroup__add_metric_non_group(events, ids, idnum); + else + metricgroup__add_metric_weak_group(events, ids, idnum); + + eg = malloc(sizeof(*eg)); + if (!eg) + return -ENOMEM; + + eg->ids = ids; + eg->idnum = idnum; + eg->metric_name = pe->metric_name; + eg->metric_expr = pe->metric_expr; + eg->metric_unit = pe->unit; + list_add_tail(&eg->nd, group_list); + + return 0; +} + static int metricgroup__add_metric(const char *metric, struct strbuf *events, struct list_head *group_list) { @@ -504,35 +538,12 @@ static int metricgroup__add_metric(const char *metric, struct strbuf *events, continue; if (match_metric(pe->metric_group, metric) || match_metric(pe->metric_name, metric)) { - const char **ids; - int idnum; - struct egroup *eg; pr_debug("metric expr %s for %s\n", pe->metric_expr, pe->metric_name); - if (expr__find_other(pe->metric_expr, -NULL, &ids, &idnum) < 0) + ret = __metricgroup__add_metric(events, group_list, pe); + if (ret == -EINVAL || !ret) continue; - if (events->len > 0) - strbuf_addf(events, ","); - - if (metricgroup__has_constraint(pe)) - metricgroup__add_metric_non_group(events, ids, idnum); - else - metricgroup__add_metric_weak_group(events, ids, idnum); - - eg = malloc(sizeof(struct egroup)); - if (!eg) { - ret = -ENOMEM; - break; - } - eg->ids = ids; - eg->idnum = idnum; - eg->metric_name = pe->metric_name; - eg->metric_expr = pe->metric_expr; - eg->metric_unit = pe->unit; - list_add_tail(&eg->nd, group_list); - ret = 0; } } return ret; -- 2.18.1
[PATCH v7 2/6] perf expr: Add expr_scanner_ctx object
From: Jiri Olsa Adding expr_scanner_ctx object to hold user data for the expr scanner. Currently it holds only start_token, Kajol Jain will use it to hold 24x7 runtime param. Signed-off-by: Jiri Olsa --- tools/perf/util/expr.c | 6 -- tools/perf/util/expr.h | 4 tools/perf/util/expr.l | 10 +- 3 files changed, 13 insertions(+), 7 deletions(-) diff --git a/tools/perf/util/expr.c b/tools/perf/util/expr.c index c8ccc548a585..c3382d58cf40 100644 --- a/tools/perf/util/expr.c +++ b/tools/perf/util/expr.c @@ -3,7 +3,6 @@ #include #include "expr.h" #include "expr-bison.h" -#define YY_EXTRA_TYPE int #include "expr-flex.h" #ifdef PARSER_DEBUG @@ -30,11 +29,14 @@ static int __expr__parse(double *val, struct expr_parse_ctx *ctx, const char *expr, int start) { + struct expr_scanner_ctx scanner_ctx = { + .start_token = start, + }; YY_BUFFER_STATE buffer; void *scanner; int ret; - ret = expr_lex_init_extra(start, &scanner); + ret = expr_lex_init_extra(&scanner_ctx, &scanner); if (ret) return ret; diff --git a/tools/perf/util/expr.h b/tools/perf/util/expr.h index b9e53f2b5844..0938ad166ece 100644 --- a/tools/perf/util/expr.h +++ b/tools/perf/util/expr.h @@ -15,6 +15,10 @@ struct expr_parse_ctx { struct expr_parse_id ids[MAX_PARSE_ID]; }; +struct expr_scanner_ctx { + int start_token; +}; + void expr__ctx_init(struct expr_parse_ctx *ctx); void expr__add_id(struct expr_parse_ctx *ctx, const char *id, double val); int expr__parse(double *final_val, struct expr_parse_ctx *ctx, const char *expr); diff --git a/tools/perf/util/expr.l b/tools/perf/util/expr.l index eaad29243c23..2582c2464938 100644 --- a/tools/perf/util/expr.l +++ b/tools/perf/util/expr.l @@ -76,13 +76,13 @@ sym [0-9a-zA-Z_\.:@]+ symbol {spec}*{sym}*{spec}*{sym}* %% - { - int start_token; + struct expr_scanner_ctx *sctx = expr_get_extra(yyscanner); - start_token = expr_get_extra(yyscanner); + { + int start_token = sctx->start_token; - if (start_token) { - expr_set_extra(NULL, yyscanner); + if (sctx->start_token) { + sctx->start_token = 0; return start_token; } } -- 2.18.1
[PATCH v7 1/6] perf expr: Add expr_ prefix for parse_ctx and parse_id
From: Jiri Olsa Adding expr_ prefix for parse_ctx and parse_id, to straighten out the expr* namespace. There's no functional change. Signed-off-by: Jiri Olsa --- tools/perf/tests/expr.c | 4 ++-- tools/perf/util/expr.c| 10 +- tools/perf/util/expr.h| 12 ++-- tools/perf/util/expr.y| 6 +++--- tools/perf/util/stat-shadow.c | 2 +- 5 files changed, 17 insertions(+), 17 deletions(-) diff --git a/tools/perf/tests/expr.c b/tools/perf/tests/expr.c index 28313e59d6f6..ea10fc4412c4 100644 --- a/tools/perf/tests/expr.c +++ b/tools/perf/tests/expr.c @@ -6,7 +6,7 @@ #include #include -static int test(struct parse_ctx *ctx, const char *e, double val2) +static int test(struct expr_parse_ctx *ctx, const char *e, double val2) { double val; @@ -22,7 +22,7 @@ int test__expr(struct test *t __maybe_unused, int subtest __maybe_unused) const char **other; double val; int i, ret; - struct parse_ctx ctx; + struct expr_parse_ctx ctx; int num_other; expr__ctx_init(&ctx); diff --git a/tools/perf/util/expr.c b/tools/perf/util/expr.c index fd192ddf93c1..c8ccc548a585 100644 --- a/tools/perf/util/expr.c +++ b/tools/perf/util/expr.c @@ -11,7 +11,7 @@ extern int expr_debug; #endif /* Caller must make sure id is allocated */ -void expr__add_id(struct parse_ctx *ctx, const char *name, double val) +void expr__add_id(struct expr_parse_ctx *ctx, const char *name, double val) { int idx; @@ -21,13 +21,13 @@ void expr__add_id(struct parse_ctx *ctx, const char *name, double val) ctx->ids[idx].val = val; } -void expr__ctx_init(struct parse_ctx *ctx) +void expr__ctx_init(struct expr_parse_ctx *ctx) { ctx->num_ids = 0; } static int -__expr__parse(double *val, struct parse_ctx *ctx, const char *expr, +__expr__parse(double *val, struct expr_parse_ctx *ctx, const char *expr, int start) { YY_BUFFER_STATE buffer; @@ -52,7 +52,7 @@ __expr__parse(double *val, struct parse_ctx *ctx, const char *expr, return ret; } -int expr__parse(double *final_val, struct parse_ctx *ctx, const char *expr) +int expr__parse(double *final_val, struct expr_parse_ctx *ctx, const char *expr) { return __expr__parse(final_val, ctx, expr, EXPR_PARSE) ? -1 : 0; } @@ -75,7 +75,7 @@ int expr__find_other(const char *expr, const char *one, const char ***other, int *num_other) { int err, i = 0, j = 0; - struct parse_ctx ctx; + struct expr_parse_ctx ctx; expr__ctx_init(&ctx); err = __expr__parse(NULL, &ctx, expr, EXPR_OTHER); diff --git a/tools/perf/util/expr.h b/tools/perf/util/expr.h index 9377538f4097..b9e53f2b5844 100644 --- a/tools/perf/util/expr.h +++ b/tools/perf/util/expr.h @@ -5,19 +5,19 @@ #define EXPR_MAX_OTHER 20 #define MAX_PARSE_ID EXPR_MAX_OTHER -struct parse_id { +struct expr_parse_id { const char *name; double val; }; -struct parse_ctx { +struct expr_parse_ctx { int num_ids; - struct parse_id ids[MAX_PARSE_ID]; + struct expr_parse_id ids[MAX_PARSE_ID]; }; -void expr__ctx_init(struct parse_ctx *ctx); -void expr__add_id(struct parse_ctx *ctx, const char *id, double val); -int expr__parse(double *final_val, struct parse_ctx *ctx, const char *expr); +void expr__ctx_init(struct expr_parse_ctx *ctx); +void expr__add_id(struct expr_parse_ctx *ctx, const char *id, double val); +int expr__parse(double *final_val, struct expr_parse_ctx *ctx, const char *expr); int expr__find_other(const char *expr, const char *one, const char ***other, int *num_other); diff --git a/tools/perf/util/expr.y b/tools/perf/util/expr.y index 4720cbe79357..cd17486c1c5d 100644 --- a/tools/perf/util/expr.y +++ b/tools/perf/util/expr.y @@ -15,7 +15,7 @@ %define api.pure full %parse-param { double *final_val } -%parse-param { struct parse_ctx *ctx } +%parse-param { struct expr_parse_ctx *ctx } %parse-param {void *scanner} %lex-param {void* scanner} @@ -39,14 +39,14 @@ %{ static void expr_error(double *final_val __maybe_unused, - struct parse_ctx *ctx __maybe_unused, + struct expr_parse_ctx *ctx __maybe_unused, void *scanner, const char *s) { pr_debug("%s\n", s); } -static int lookup_id(struct parse_ctx *ctx, char *id, double *val) +static int lookup_id(struct expr_parse_ctx *ctx, char *id, double *val) { int i; diff --git a/tools/perf/util/stat-shadow.c b/tools/perf/util/stat-shadow.c index 0fd713d3674f..402af3e8d287 100644 --- a/tools/perf/util/stat-shadow.c +++ b/tools/perf/util/stat-shadow.c @@ -729,7 +729,7 @@ static void generic_metric(struct perf_stat_config *config, struct runtime_stat *st) { print_metric_t print_metric = out->print_metric; - struct parse_ctx pctx; + struct expr_parse_ctx pctx;
[PATCH v7 0/6] powerpc/perf: Add json file metric support for the hv_24x7 socket/chip level events
Patchset adds json file metric support for the hv_24x7 socket/chip level events. "hv_24x7" pmu interface events needs system dependent parameter like socket/chip/core. For example, hv_24x7 chip level events needs specific chip-id to which the data is requested should be added as part of pmu events. So to enable JSON file support to "hv_24x7" interface, patchset reads total number of sockets details in sysfs under "/sys/devices/hv_24x7/interface/". Second patch of the patchset adds expr_scanner_ctx object to hold user data for the expr scanner, which can be used to hold runtime parameter. Patch 4 & 6 of the patchset handles perf tool plumbing needed to replace the "?" character in the metric expression to proper value and hv_24x7 json metric file for different Socket/chip resources. Patch set also enable Hz/hz prinitg for --metric-only option to print metric data for bus frequency. Applied and tested all these patches cleanly on top of jiri's flex changes with the changes done by Kan Liang for "Support metric group constraint" patchset and made required changes. Also apply this patch on top of the fix patch send earlier for printing metric name incase overlapping events. https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/commit/?h=perf/core&id=37cd7f65bf71a48f25eeb6d9be5dacb20d008ea6 Changelog: v6 -> v7 - Spit patchset into two patch series one for kernel changes and other for tool side changes. - Made changes Suggested by Jiri, including rather then reading runtime parameter from metric name, actually add it in structure egroup and metric_expr. - As we don't need to read runtime parameter from metric name, now I am not appending it and rather just printing it in generic_metric function. Kernel Side changes patch series: https://lkml.org/lkml/2020/3/27/58 v5 -> v6 - resolve compilation issue due to rearranging patch series. - Rather then adding new function to take careof case for runtime param in metricgroup__add_metric, using metricgroup__add_metric_param itself for that work. - Address some optimization suggested like using directly file path rather then adding new macro in header.c - Change commit message on patch where we are adding "?" support by adding simple example. v4 -> v5 - Using sysfs__read_int instead of sysfs__read_ull while reading parameter value in powerpc/util/header.c file. - Using asprintf rather then malloc and sprintf Suggested by Arnaldo Carvalho de Melo - Break patch 6 from previous version to two patch, - One to add refactor current "metricgroup__add_metric" function and another where actually "?" handling infra added. - Add expr__runtimeparam as part of 'expr_scanner_ctx' struct rather then making it global variable. Thanks Jiri for adding this structure to hold user data for the expr scanner. - Add runtime param as agrugement to function 'expr__find_other' and 'expr__parse' and made changes on references accordingly. v3 -> v4 - Apply these patch on top of Kan liang changes. As suggested by Jiri. v2 -> v3 - Remove setting event_count to 0 part in function 'h_24x7_event_read' with comment rather then adding 0 to event_count value. Suggested by: Sukadev Bhattiprolu - Apply tool side changes require to replace "?" on Jiri's flex patch series and made all require changes to make it compatible with added flex change. v1 -> v2 - Rename hv-24x7 metric json file as nest_metrics.json Jiri Olsa (2): perf expr: Add expr_ prefix for parse_ctx and parse_id perf expr: Add expr_scanner_ctx object Kajol Jain (4): perf/tools: Refactoring metricgroup__add_metric function perf/tools: Enhance JSON/metric infrastructure to handle "?" tools/perf: Enable Hz/hz prinitg for --metric-only option perf/tools/pmu-events/powerpc: Add hv_24x7 socket/chip level metric events tools/perf/arch/powerpc/util/header.c | 8 ++ .../arch/powerpc/power9/nest_metrics.json | 19 + tools/perf/tests/expr.c | 12 +-- tools/perf/util/expr.c| 25 +++--- tools/perf/util/expr.h| 19 +++-- tools/perf/util/expr.l| 37 ++--- tools/perf/util/expr.y| 6 +- tools/perf/util/metricgroup.c | 79 +-- tools/perf/util/metricgroup.h | 2 + tools/perf/util/stat-display.c| 2 - tools/perf/util/stat-shadow.c | 19 +++-- 11 files changed, 157 insertions(+), 71 deletions(-) create mode 100644 tools/perf/pmu-events/arch/powerpc/power9/nest_metrics.json -- 2.18.1
Re: hardcoded SIGSEGV in __die() ?
On Thu, 2020-03-26 at 11:28 +1100, Michael Ellerman wrote: > CAUTION: This email originated from outside of the organization. Do not click > links or open attachments unless you recognize the sender and know the > content is safe. > > > Joakim Tjernlund writes: > > On Mon, 2020-03-23 at 15:45 +0100, Christophe Leroy wrote: > > > Le 23/03/2020 à 15:43, Christophe Leroy a écrit : > > > > Le 23/03/2020 à 15:17, Joakim Tjernlund a écrit : > > > > > In __die(), see below, there is this call to notify_send() with > > > > > SIGSEGV hardcoded, this seems odd > > > > > to me as the variable "err" holds the true signal(in my case SIGBUS) > > > > > Should not SIGSEGV be replaced with the true signal no.? > > > > > > > > As far as I can see, comes from > > > > https://nam03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgit.kernel.org%2Fpub%2Fscm%2Flinux%2Fkernel%2Fgit%2Ftorvalds%2Flinux.git%2Fcommit%2F%3Fid%3D66fcb1059&data=02%7C01%7CJoakim.Tjernlund%40infinera.com%7Caa316058f9e34dd758c808d7d11ca391%7C285643de5f5b4b03a1530ae2dc8aaf77%7C1%7C0%7C637207793252449714&sdata=LBzRMxHWJzNEztnnG0UzJb7PHvaDGVswQD%2B8WpY9YX8%3D&reserved=0 > > > > > > > > > > And > > > https://nam03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgit.kernel.org%2Fpub%2Fscm%2Flinux%2Fkernel%2Fgit%2Ftorvalds%2Flinux.git%2Fcommit%2F%3Fid%3Dae87221d3ce49d9de1e43756da834fd0bf05a2ad&data=02%7C01%7CJoakim.Tjernlund%40infinera.com%7Caa316058f9e34dd758c808d7d11ca391%7C285643de5f5b4b03a1530ae2dc8aaf77%7C1%7C0%7C637207793252449714&sdata=Dh%2BUTRgG85oVSgC3SCR1B7izQH4HofT4ppOMiy9xvDA%3D&reserved=0 > > > shows it is (was?) similar on x86. > > > > > > > I tried to follow that chain thinking it would end up sending a signal to > > user space but I cannot see > > that happens. Seems to be related to debugging. > > > > In short, I cannot see any signal being delivered to user space. If so that > > would explain why > > our user space process never dies. > > Is there a signal hidden in machine_check handler for SIGBUS I cannot see? > > It's platform specific. What platform are you on? I am on e500, e5500(e500mc) and 83xx :) > > See the ppc_md & cur_cpu_spec calls here: > > void machine_check_exception(struct pt_regs *regs) > { > int recover = 0; > bool nested = in_nmi(); > if (!nested) > nmi_enter(); > > __this_cpu_inc(irq_stat.mce_exceptions); > > add_taint(TAINT_MACHINE_CHECK, LOCKDEP_NOW_UNRELIABLE); > > /* See if any machine dependent calls. In theory, we would want > * to call the CPU first, and call the ppc_md. one if the CPU > * one returns a positive number. However there is existing code > * that assumes the board gets a first chance, so let's keep it > * that way for now and fix things later. --BenH. > */ > if (ppc_md.machine_check_exception) > recover = ppc_md.machine_check_exception(regs); > else if (cur_cpu_spec->machine_check) > recover = cur_cpu_spec->machine_check(regs); > > if (recover > 0) > goto bail; > > > Either the ppc_md or cpu_spec handlers can send a signal, but after a > bit of grepping I think only the pseries and powernv ones do. Seems so > > If you get into die() then it's an oops, which is not the same as a > normal signal. Exactly, and the die/OOPS does not seem work as intended either. The system tries to limp along and generates more similar OOPses and may even hang. > > cheers
Re: [PATCH v6 09/11] perf/tools: Enhance JSON/metric infrastructure to handle "?"
On 3/24/20 6:41 PM, Jiri Olsa wrote: > On Fri, Mar 20, 2020 at 06:24:04PM +0530, Kajol Jain wrote: >> Patch enhances current metric infrastructure to handle "?" in the metric >> expression. The "?" can be use for parameters whose value not known while >> creating metric events and which can be replace later at runtime to >> the proper value. It also add flexibility to create multiple events out >> of single metric event added in json file. >> >> Patch adds function 'arch_get_runtimeparam' which is a arch specific >> function, returns the count of metric events need to be created. >> By default it return 1. >> >> This infrastructure needed for hv_24x7 socket/chip level events. >> "hv_24x7" chip level events needs specific chip-id to which the >> data is requested. Function 'arch_get_runtimeparam' implemented >> in header.c which extract number of sockets from sysfs file >> "sockets" under "/sys/devices/hv_24x7/interface/". >> >> >> With this patch basically we are trying to create as many metric events >> as define by runtime_param. >> >> For that one loop is added in function 'metricgroup__add_metric', >> which create multiple events at run time depend on return value of >> 'arch_get_runtimeparam' and merge that event in 'group_list'. >> >> To achieve that we are actually passing this parameter value as part of >> `expr__find_other` function and changing "?" present in metric expression >> with this value. >> >> As in our json file, there gonna be single metric event, and out of >> which we are creating multiple events, I am also merging this value >> to the original metric name to specify parameter value. >> >> For example, >> command:# ./perf stat -M PowerBUS_Frequency -C 0 -I 1000 >> # time counts unit events >> 1.000101867 9,356,933 hv_24x7/pm_pb_cyc,chip=0/ # >> 2.3 GHz PowerBUS_Frequency_0 >> 1.000101867 9,366,134 hv_24x7/pm_pb_cyc,chip=1/ # >> 2.3 GHz PowerBUS_Frequency_1 >> 2.000314878 9,365,868 hv_24x7/pm_pb_cyc,chip=0/ # >> 2.3 GHz PowerBUS_Frequency_0 >> 2.000314878 9,366,092 hv_24x7/pm_pb_cyc,chip=1/ # >> 2.3 GHz PowerBUS_Frequency_1 >> >> So, here _0 and _1 after PowerBUS_Frequency specify parameter value. >> >> As after adding this to group_list, again we call expr__parse >> in 'generic_metric' function present in util/stat-display.c. >> By this time again we need to pass this parameter value. So, now to get this >> value >> actually I am trying to extract it from metric name itself. Because >> otherwise it gonna point to last updated value present in runtime_param. >> And gonna match for that value only. > > so why can't we pass that param as integer value through the metric objects? > > it get's created in metricgroup__add_metric_param: > - as struct egroup *eg > - we can add egroup::param and store the param value there > > then in metricgroup__setup_events it moves to: > - struct metric_expr *expr > - we can add metric_expr::param to keep the param > > then in perf_stat__print_shadow_stats there's: > - struct metric_expr *mexp loop > - calling generic_metric metric - we could call it with mexp::param > - and pass the param to expr__parse > Hi jiri, Thanks for the suggestion, Yes it make more sense to use like that. Will update. Thanks, Kajol > jirka >
[PATCH] selftests/powerpc: Fix try-run when source tree is not writable
We added a usage of try-run to pmu/ebb/Makefile to detect if the toolchain supported the -no-pie option. This fails if we build out-of-tree and the source tree is not writable, as try-run tries to write its temporary files to the current directory. That leads to the -no-pie option being silently dropped, which leads to broken executables with some toolchains. If we remove the redirect to /dev/null in try-run, we see the error: make[3]: Entering directory '/linux/tools/testing/selftests/powerpc/pmu/ebb' /usr/bin/ld: cannot open output file .54.tmp: Read-only file system collect2: error: ld returned 1 exit status make[3]: Nothing to be done for 'all'. And looking with strace we see it's trying to use a file that's in the source tree: lstat("/linux/tools/testing/selftests/powerpc/pmu/ebb/.54.tmp", 0x7c0f83c8) We can fix it by setting TMPOUT to point to the $(OUTPUT) directory, and we can verify with strace it's now trying to write to the output directory: lstat("/output/kselftest/powerpc/pmu/ebb/.54.tmp", 0x7fffd1bf6bf8) And also see that the -no-pie option is now correctly detected. Fixes: 0695f8bca93e ("selftests/powerpc: Handle Makefile for unrecognized option") Cc: sta...@vger.kernel.org # v5.5+ Signed-off-by: Michael Ellerman --- tools/testing/selftests/powerpc/pmu/ebb/Makefile | 1 + 1 file changed, 1 insertion(+) diff --git a/tools/testing/selftests/powerpc/pmu/ebb/Makefile b/tools/testing/selftests/powerpc/pmu/ebb/Makefile index 417306353e07..ca35dd8848b0 100644 --- a/tools/testing/selftests/powerpc/pmu/ebb/Makefile +++ b/tools/testing/selftests/powerpc/pmu/ebb/Makefile @@ -7,6 +7,7 @@ include ../../../../../../scripts/Kbuild.include # The EBB handler is 64-bit code and everything links against it CFLAGS += -m64 +TMPOUT = $(OUTPUT)/ # Toolchains may build PIE by default which breaks the assembly no-pie-option := $(call try-run, echo 'int main() { return 0; }' | \ $(CC) -Werror $(KBUILD_CPPFLAGS) $(CC_OPTION_CFLAGS) -no-pie -x c - -o "$$TMP", -no-pie) base-commit: 11a48a5a18c63fd7621bb050228cebf13566e4d8 prerequisite-patch-id: e460261e180666bfb63bcd0cc554874d73c3b71f prerequisite-patch-id: 67db368cfdf8d3aefd78f140420281f9b4b53e07 prerequisite-patch-id: cf5d957a366998b4a3ce70a79b6e969eb98fca7d prerequisite-patch-id: 3ace935c6ae425ad635eb38f906e790c3c9bf41f -- 2.25.1
Re: [PATCH v2] mm/sparse: Fix kernel crash with pfn_section_valid check
On 26.03.20 14:32, Aneesh Kumar K.V wrote: > Fixes the below crash > > BUG: Kernel NULL pointer dereference on read at 0x > Faulting instruction address: 0xc0c3447c > Oops: Kernel access of bad area, sig: 11 [#1] > LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries > CPU: 11 PID: 7519 Comm: lt-ndctl Not tainted 5.6.0-rc7-autotest #1 > ... > NIP [c0c3447c] vmemmap_populated+0x98/0xc0 > LR [c0088354] vmemmap_free+0x144/0x320 > Call Trace: > section_deactivate+0x220/0x240 > __remove_pages+0x118/0x170 > arch_remove_memory+0x3c/0x150 > memunmap_pages+0x1cc/0x2f0 > devm_action_release+0x30/0x50 > release_nodes+0x2f8/0x3e0 > device_release_driver_internal+0x168/0x270 > unbind_store+0x130/0x170 > drv_attr_store+0x44/0x60 > sysfs_kf_write+0x68/0x80 > kernfs_fop_write+0x100/0x290 > __vfs_write+0x3c/0x70 > vfs_write+0xcc/0x240 > ksys_write+0x7c/0x140 > system_call+0x5c/0x68 > > The crash is due to NULL dereference at > > test_bit(idx, ms->usage->subsection_map); due to ms->usage = NULL; in > pfn_section_valid() > > With commit: d41e2f3bd546 ("mm/hotplug: fix hot remove failure in > SPARSEMEM|!VMEMMAP case") > section_mem_map is set to NULL after depopulate_section_mem(). This > was done so that pfn_page() can work correctly with kernel config that > disables > SPARSEMEM_VMEMMAP. With that config pfn_to_page does > > __section_mem_map_addr(__sec) + __pfn; > where > > static inline struct page *__section_mem_map_addr(struct mem_section *section) > { > unsigned long map = section->section_mem_map; > map &= SECTION_MAP_MASK; > return (struct page *)map; > } > > Now with SPASEMEM_VMEMAP enabled, mem_section->usage->subsection_map is used > to > check the pfn validity (pfn_valid()). Since section_deactivate release > mem_section->usage if a section is fully deactivated, pfn_valid() check after > a subsection_deactivate cause a kernel crash. > > static inline int pfn_valid(unsigned long pfn) > { > ... > return early_section(ms) || pfn_section_valid(ms, pfn); > } > > where > > static inline int pfn_section_valid(struct mem_section *ms, unsigned long pfn) > { > int idx = subsection_map_index(pfn); > > return test_bit(idx, ms->usage->subsection_map); > } > > Avoid this by clearing SECTION_HAS_MEM_MAP when mem_section->usage is freed. > For architectures like ppc64 where large pages are used for vmmemap mapping > (16MB), > a specific vmemmap mapping can cover multiple sections. Hence before a vmemmap > mapping page can be freed, the kernel needs to make sure there are no valid > sections > within that mapping. Clearing the section valid bit before > depopulate_section_memap enables this. > > Fixes: d41e2f3bd546 ("mm/hotplug: fix hot remove failure in > SPARSEMEM|!VMEMMAP case") > Reported-by: Sachin Sant > Tested-by: Sachin Sant > Cc: Baoquan He > Cc: Michael Ellerman > Cc: Dan Williams > Cc: Pankaj Gupta > Cc: David Hildenbrand > Cc: Michal Hocko > Cc: Wei Yang > Cc: Oscar Salvador > Cc: Mike Rapoport > Cc: > Signed-off-by: Aneesh Kumar K.V > --- > mm/sparse.c | 6 ++ > 1 file changed, 6 insertions(+) > > diff --git a/mm/sparse.c b/mm/sparse.c > index aadb7298dcef..65599e8bd636 100644 > --- a/mm/sparse.c > +++ b/mm/sparse.c > @@ -781,6 +781,12 @@ static void section_deactivate(unsigned long pfn, > unsigned long nr_pages, > ms->usage = NULL; > } > memmap = sparse_decode_mem_map(ms->section_mem_map, section_nr); > + /* > + * Mark the section invalid so that valid_section() > + * return false. This prevents code from dereferencing s/return/returns/ > + * ms->usage array. maybe "(see pfn_section_valid())" > + */ > + ms->section_mem_map &= ~SECTION_HAS_MEM_MAP; -- Thanks, David / dhildenb
Re: [PATCH 1/2] powerpc/vmlinux.lds: Explicitly retain .gnu.hash
Alan Modra writes: > On Thu, Feb 27, 2020 at 03:59:32PM +1100, Michael Ellerman wrote: >> Relocatable kernel builds produce a warning about .gnu.hash being an >> orphan section: >> >> ld: warning: orphan section `.gnu.hash' from `linker stubs' being placed >> in section `.gnu.hash' >> >> If we try to discard it the build fails: >> >> ld -EL -m elf64lppc -pie --orphan-handling=warn --build-id -o >> .tmp_vmlinux1 -T ./arch/powerpc/kernel/vmlinux.lds --whole-archive >> arch/powerpc/kernel/head_64.o arch/powerpc/kernel/entry_64.o >> ... >> sound/built-in.a net/built-in.a virt/built-in.a --no-whole-archive >> --start-group lib/lib.a --end-group >> ld: could not find section .gnu.hash >> >> So add an entry to explicitly retain it, as we do for .hash. > > Looks fine to me. You can also pass --hash-style=sysv to ld (since > binutils-2.18) to disable generation of .gnu.hash. Our current minimum is 2.21, so that's probably 5-10 years away :) cheers
Re: [PATCH 2/2] powerpc/vmlinux.lds: Discard .interp section
Alan Modra writes: > On Thu, Feb 27, 2020 at 03:59:33PM +1100, Michael Ellerman wrote: >> The .interp section specifies which "interpreter", ie. dynamic loader, >> the kernel requests. But that doesn't make any sense, the kernel is >> not a regular binary that is run with an interpreter. >> >> The content seems to be some default value, this file doesn't even >> exist on my system: >> 2f 75 73 72 2f 6c 69 62 2f 6c 64 2e 73 6f 2e 31 >> |/usr/lib/ld.so.1| >> >> So the section serves no useful purpose and consumes a small amount of >> space. >> >> Also Alan Modra says we "likely could discard" it, so do so. > > Yes, but you ought to check with the mimimum required binutils. It is > quite possible that an older linker will blow up. OK, I guess I'll have to test. > If the minimum required binutils is at least binutils-2.26 then > passing --no-dynamic-linker to ld is a more elegant solution. The current minimum is 2.21, though there's talk of increasing it to 2.23. cheers
Re: [PATCH v2] powerpc/kprobes: Blacklist functions running with MMU disabled on PPC32
Christophe Leroy wrote: kprobe does not handle events happening in real mode, all functions running with MMU disabled have to be blacklisted. As already done for PPC64, do it for PPC32. Signed-off-by: Christophe Leroy --- v2: - Don't rename nonrecoverable as local, mark it noprobe instead. - Add missing linux/kprobes.h include in pq2.c --- arch/powerpc/include/asm/ppc_asm.h | 10 +++ arch/powerpc/kernel/cpu_setup_6xx.S | 4 +- arch/powerpc/kernel/entry_32.S | 65 arch/powerpc/kernel/fpu.S| 1 + arch/powerpc/kernel/idle_6xx.S | 2 +- arch/powerpc/kernel/idle_e500.S | 2 +- arch/powerpc/kernel/l2cr_6xx.S | 2 +- arch/powerpc/kernel/misc.S | 2 + arch/powerpc/kernel/misc_32.S| 4 +- arch/powerpc/kernel/swsusp_32.S | 6 +- arch/powerpc/kernel/vector.S | 1 + arch/powerpc/mm/book3s32/hash_low.S | 38 ++-- arch/powerpc/mm/mem.c| 2 + arch/powerpc/platforms/52xx/lite5200_sleep.S | 2 + arch/powerpc/platforms/82xx/pq2.c| 3 + arch/powerpc/platforms/83xx/suspend-asm.S| 1 + arch/powerpc/platforms/powermac/cache.S | 2 + arch/powerpc/platforms/powermac/sleep.S | 13 ++-- 18 files changed, 86 insertions(+), 74 deletions(-) diff --git a/arch/powerpc/include/asm/ppc_asm.h b/arch/powerpc/include/asm/ppc_asm.h index 6b03dff61a05..e8f34ba89497 100644 --- a/arch/powerpc/include/asm/ppc_asm.h +++ b/arch/powerpc/include/asm/ppc_asm.h @@ -267,8 +267,18 @@ GLUE(.,name): .pushsection "_kprobe_blacklist","aw"; \ PPC_LONG (entry) ; \ .popsection +#define _NOKPROBE_ENTRY(entry) \ + _ASM_NOKPROBE_SYMBOL(entry) \ + _ENTRY(entry) +#define _NOKPROBE_GLOBAL(entry)\ + _ASM_NOKPROBE_SYMBOL(entry) \ + _GLOBAL(entry) #else #define _ASM_NOKPROBE_SYMBOL(entry) +#define _NOKPROBE_ENTRY(entry) \ + _ENTRY(entry) +#define _NOKPROBE_GLOBAL(entry)\ + _GLOBAL(entry) #endif Michael hasn't preferred including NOKPROBE variants of those macros previously, since he would like to see some cleanups there: https://patchwork.ozlabs.org/patch/696138/ #define FUNC_START(name) _GLOBAL(name) diff --git a/arch/powerpc/kernel/cpu_setup_6xx.S b/arch/powerpc/kernel/cpu_setup_6xx.S index f6517f67265a..1cb947268546 100644 --- a/arch/powerpc/kernel/cpu_setup_6xx.S +++ b/arch/powerpc/kernel/cpu_setup_6xx.S @@ -276,7 +276,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_NO_DPM) * in some 750 cpus where using a not yet initialized FPU register after * power on reset may hang the CPU */ -_GLOBAL(__init_fpu_registers) +_NOKPROBE_GLOBAL(__init_fpu_registers) mfmsr r10 ori r11,r10,MSR_FP mtmsr r11 @@ -381,7 +381,7 @@ _GLOBAL(__save_cpu_setup) * restore CPU state as backed up by the previous * function. This does not include cache setting */ -_GLOBAL(__restore_cpu_setup) +_NOKPROBE_GLOBAL(__restore_cpu_setup) /* Some CR fields are volatile, we back it up all */ mfcrr7 diff --git a/arch/powerpc/kernel/entry_32.S b/arch/powerpc/kernel/entry_32.S index 16af0d8d90a8..f788d586254d 100644 --- a/arch/powerpc/kernel/entry_32.S +++ b/arch/powerpc/kernel/entry_32.S @@ -44,24 +44,21 @@ .align 12 #ifdef CONFIG_BOOKE - .globl mcheck_transfer_to_handler -mcheck_transfer_to_handler: +_NOKPROBE_ENTRY(mcheck_transfer_to_handler) mfspr r0,SPRN_DSRR0 stw r0,_DSRR0(r11) mfspr r0,SPRN_DSRR1 stw r0,_DSRR1(r11) /* fall through */ - .globl debug_transfer_to_handler -debug_transfer_to_handler: +_NOKPROBE_ENTRY(debug_transfer_to_handler) mfspr r0,SPRN_CSRR0 stw r0,_CSRR0(r11) mfspr r0,SPRN_CSRR1 stw r0,_CSRR1(r11) /* fall through */ - .globl crit_transfer_to_handler -crit_transfer_to_handler: +_NOKPROBE_ENTRY(crit_transfer_to_handler) #ifdef CONFIG_PPC_BOOK3E_MMU mfspr r0,SPRN_MAS0 stw r0,MAS0(r11) @@ -97,8 +94,7 @@ crit_transfer_to_handler: #endif #ifdef CONFIG_40x - .globl crit_transfer_to_handler -crit_transfer_to_handler: +_NOKPROBE_ENTRY(crit_transfer_to_handler) lwz r0,crit_r10@l(0) stw r0,GPR10(r11) lwz r0,crit_r11@l(0) @@ -124,13 +120,11 @@ crit_transfer_to_handler: * Note that we rely on the caller having set cr0.eq iff the exception * occurred in kernel mode (i.e. MSR:PR = 0). */ - .globl transfer_to_handler_full -transfer_to_handler_full: +_NOKPROBE_ENTRY(transfer_to_handler_full) SAVE_NVGPRS(r11) /* fall through */ - .globl transfer_to_handler -transfer_t
Re: [PATCH v2] powerpc xmon: use `dcbf` inplace of `dcbi` instruction for 64bit Book3S
On Fri, 2020-03-27 at 07:48 +0100, Christophe Leroy wrote: > > Le 26/03/2020 à 07:15, Balamuruhan S a écrit : > > Data Cache Block Invalidate (dcbi) instruction was implemented back in > > PowerPC > > architecture version 2.03. It is obsolete and attempt to use of this > > illegal > > instruction results in a hypervisor emulation assistance interrupt. So, > > ifdef > > it out the option `i` in xmon for 64bit Book3S. > > I don't understand. You say two contradictory things: > 1/ You say it _was_ added back. > 2/ You say it _is_ obsolete. > > How can it be obsolete if it was added back ? I actually learnt it from P8 and P9 User Manual, The POWER8/POWER9 core does not provide support for the following optional or obsolete instructions (attempted use of these results in a hypervisor emulation assistance interrupt): • tlbia - TLB invalidate all • tlbiex - TLB invalidate entry by index (obsolete) • slbiex - SLB invalidate entry by index (obsolete) • dcba - Data cache block allocate (Book II; obsolete) • dcbi - Data cache block invalidate (obsolete) • rfi - Return from interrupt (32-bit; obsolete) > > [...] > > > diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c > > index 0ec9640335bb..bfd5a97689cd 100644 > > --- a/arch/powerpc/xmon/xmon.c > > +++ b/arch/powerpc/xmon/xmon.c > > @@ -335,10 +335,12 @@ static inline void cflush(void *p) > > asm volatile ("dcbf 0,%0; icbi 0,%0" : : "r" (p)); > > } > > > > +#ifndef CONFIG_PPC_BOOK3S_64 > > You don't need that #ifndef. Keeping it should be harmless. okay. > > > static inline void cinval(void *p) > > { > > asm volatile ("dcbi 0,%0; icbi 0,%0" : : "r" (p)); > > } > > +#endif > > > > /** > >* write_ciabr() - write the CIABR SPR > > @@ -1791,8 +1793,9 @@ static void prregs(struct pt_regs *fp) > > > > static void cacheflush(void) > > { > > - int cmd; > > unsigned long nflush; > > +#ifndef CONFIG_PPC_BOOK3S_64 > > Don't make it so complex, see below > > > + int cmd; > > > > cmd = inchar(); > > if (cmd != 'i') > > @@ -1800,13 +1803,14 @@ static void cacheflush(void) > > scanhex((void *)&adrs); > > if (termch != '\n') > > termch = 0; > > +#endif > > nflush = 1; > > scanhex(&nflush); > > nflush = (nflush + L1_CACHE_BYTES - 1) / L1_CACHE_BYTES; > > if (setjmp(bus_error_jmp) == 0) { > > catch_memory_errors = 1; > > sync(); > > - > > +#ifndef CONFIG_PPC_BOOK3S_64 > > You don't need that ifndef, just ensure below that regardless of cmd, > book3s/64 calls cflush and not cinval. > > > if (cmd != 'i') { > > The only thing you have to do is to replace the above test by: > > if (cmd != 'i' || IS_ENABLED(CONFIG_PPC_BOOK3S_64)) { yes, this is the better way. > > > for (; nflush > 0; --nflush, adrs += L1_CACHE_BYTES) > > cflush((void *) adrs); > > @@ -1814,6 +1818,10 @@ static void cacheflush(void) > > for (; nflush > 0; --nflush, adrs += L1_CACHE_BYTES) > > cinval((void *) adrs); > > } > > +#else > > Don't need that at all, it's a duplication of the above. sure :+1: Thanks for reviewing. -- Bala > > > + for (; nflush > 0; --nflush, adrs += L1_CACHE_BYTES) > > + cflush((void *)adrs); > > +#endif > > sync(); > > /* wait a little while to see if we get a machine check */ > > __delay(200); > > > > base-commit: a87b93bdf800a4d7a42d95683624a4516e516b4f > > > > Christophe
Re: [PATCH 3/4] powerpc/uaccess: evaluate macro arguments once, before user access is allowed
Le 27/03/2020 à 08:02, Nicholas Piggin a écrit : get/put_user can be called with nontrivial arguments. fs/proc/page.c has a good example: if (put_user(stable_page_flags(ppage), out)) { stable_page_flags is quite a lot of code, including spin locks in the page allocator. Ensure these arguments are evaluated before user access is allowed. This improves security by reducing code with access to userspace, but it also fixes a PREEMPT bug with KUAP on powerpc/64s: stable_page_flags is currently called with AMR set to allow writes, it ends up calling spin_unlock(), which can call preempt_schedule. But the task switch code can not be called with AMR set (it relies on interrupts saving the register), so this blows up. It's fine if the code inside allow_user_access is preemptible, because a timer or IPI will save the AMR, but it's not okay to explicitly cause a reschedule. Signed-off-by: Nicholas Piggin --- arch/powerpc/include/asm/uaccess.h | 97 ++ 1 file changed, 59 insertions(+), 38 deletions(-) diff --git a/arch/powerpc/include/asm/uaccess.h b/arch/powerpc/include/asm/uaccess.h index 670910df3cc7..1cf8595aeef1 100644 --- a/arch/powerpc/include/asm/uaccess.h +++ b/arch/powerpc/include/asm/uaccess.h @@ -162,36 +162,48 @@ do { \ prevent_write_to_user(ptr, size); \ } while (0) -#define __put_user_nocheck(x, ptr, size, do_allow) \ +#define __put_user_nocheck(x, ptr, size, do_allow) \ No need to touch this line. Anyway at the end, you still have several \ which are not aligned. ({\ long __pu_err; \ __typeof__(*(ptr)) __user *__pu_addr = (ptr); \ + __typeof__(*(ptr)) __pu_val = (x); \ + __typeof__(size) __pu_size = (size);\ + \ if (!is_kernel_addr((unsigned long)__pu_addr)) \ might_fault(); \ - __chk_user_ptr(ptr);\ - if (do_allow) \ No need to touch that line - __put_user_size((x), __pu_addr, (size), __pu_err); \ - else \ No need to touch that line - __put_user_size_allowed((x), __pu_addr, (size), __pu_err); \ + __chk_user_ptr(__pu_addr); \ + if (do_allow) \ + __put_user_size(__pu_val, __pu_addr, __pu_size, __pu_err); \ + else\ + __put_user_size_allowed(__pu_val, __pu_addr, __pu_size, __pu_err); \ + \ __pu_err; \ }) -#define __put_user_check(x, ptr, size) \ -({ \ - long __pu_err = -EFAULT;\ - __typeof__(*(ptr)) __user *__pu_addr = (ptr); \ - might_fault(); \ - if (access_ok(__pu_addr, size)) \ - __put_user_size((x), __pu_addr, (size), __pu_err); \ - __pu_err; \ Same comment applies, you are touching some lines just to change the \, but at the end you still have some misaligned ones. It would help the review not to touch unchanged lines just for that. Same comment applies a few places below as well. +#define __put_user_check(x, ptr, size) \ +({ \ + long __pu_err = -EFAULT;\ + __typeof__(*(ptr)) __user *__pu_addr = (ptr); \ + __typeof__(*(ptr)) __pu_val = (x); \ + __typeof__(size) __pu_size = (size);\ + \ + might_fault(); \ + if (access_ok(__pu_addr, __pu_size))\ + __put_user_size(__pu_val, __pu_addr, __pu_size, __pu_err); \ + \ + __pu_err; \ }) #define __put_user_nosleep(x, ptr, size) \ ({\ long __pu_err; \ __typeof__(*(ptr)) __user *__pu_addr = (ptr);
Re: [PATCH 1/4] powerpc/64s: implement probe_kernel_read/write without touching AMR
Le 27/03/2020 à 08:02, Nicholas Piggin a écrit : There is no need to allow user accesses when probing kernel addresses. Signed-off-by: Nicholas Piggin --- arch/powerpc/include/asm/uaccess.h | 25 ++- arch/powerpc/lib/Makefile | 2 +- arch/powerpc/lib/uaccess.c | 50 ++ 3 files changed, 68 insertions(+), 9 deletions(-) create mode 100644 arch/powerpc/lib/uaccess.c [...] diff --git a/arch/powerpc/lib/Makefile b/arch/powerpc/lib/Makefile index b8de3be10eb4..a15060b5008e 100644 --- a/arch/powerpc/lib/Makefile +++ b/arch/powerpc/lib/Makefile @@ -36,7 +36,7 @@ extra-$(CONFIG_PPC64) += crtsavres.o endif obj-$(CONFIG_PPC_BOOK3S_64) += copyuser_power7.o copypage_power7.o \ - memcpy_power7.o + memcpy_power7.o uaccess.o Why only book3s/64 ? It applies to the 8xx and book3s/32 as well, I think it should just be for all powerpc. obj64-y += copypage_64.o copyuser_64.o mem_64.o hweight_64.o \ memcpy_64.o memcpy_mcsafe_64.o diff --git a/arch/powerpc/lib/uaccess.c b/arch/powerpc/lib/uaccess.c new file mode 100644 index ..0057ab52d6fe --- /dev/null +++ b/arch/powerpc/lib/uaccess.c @@ -0,0 +1,50 @@ +#include +#include + +static __always_inline long +probe_read_common(void *dst, const void __user *src, size_t size) +{ + long ret; + + pagefault_disable(); + ret = raw_copy_from_user_allowed(dst, src, size); + pagefault_enable(); + + return ret ? -EFAULT : 0; +} + +static __always_inline long +probe_write_common(void __user *dst, const void *src, size_t size) +{ + long ret; + + pagefault_disable(); + ret = raw_copy_to_user_allowed(dst, src, size); + pagefault_enable(); + + return ret ? -EFAULT : 0; +} + +long probe_kernel_read(void *dst, const void *src, size_t size) +{ + long ret; + mm_segment_t old_fs = get_fs(); + + set_fs(KERNEL_DS); + ret = probe_read_common(dst, (__force const void __user *)src, size); I think you should squash probe_read_common() here, having it separated is a lot of lines for no added value. It also may make people believe it overwrites the generic probe_read_common() + set_fs(old_fs); + + return ret; +} + +long probe_kernel_write(void *dst, const void *src, size_t size) +{ + long ret; + mm_segment_t old_fs = get_fs(); + + set_fs(KERNEL_DS); + ret = probe_write_common((__force void __user *)dst, src, size); Same comment as for probe_read_common() + set_fs(old_fs); + + return ret; +} Christophe
[PATCH 4/4] powerpc/uaccess: add more __builtin_expect annotations
Signed-off-by: Nicholas Piggin --- arch/powerpc/include/asm/uaccess.h | 18 +- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/arch/powerpc/include/asm/uaccess.h b/arch/powerpc/include/asm/uaccess.h index 1cf8595aeef1..896d43d8c891 100644 --- a/arch/powerpc/include/asm/uaccess.h +++ b/arch/powerpc/include/asm/uaccess.h @@ -48,16 +48,16 @@ static inline void set_fs(mm_segment_t fs) * gap between user addresses and the kernel addresses */ #define __access_ok(addr, size, segment) \ - (((addr) <= (segment).seg) && ((size) <= (segment).seg)) + likely(((addr) <= (segment).seg) && ((size) <= (segment).seg)) #else static inline int __access_ok(unsigned long addr, unsigned long size, mm_segment_t seg) { - if (addr > seg.seg) + if (unlikely(addr > seg.seg)) return 0; - return (size == 0 || size - 1 <= seg.seg - addr); + return likely(size == 0 || size - 1 <= seg.seg - addr); } #endif @@ -177,7 +177,7 @@ do { \ else\ __put_user_size_allowed(__pu_val, __pu_addr, __pu_size, __pu_err); \ \ - __pu_err; \ + __builtin_expect(__pu_err, 0); \ }) #define __put_user_check(x, ptr, size) \ @@ -191,7 +191,7 @@ do { \ if (access_ok(__pu_addr, __pu_size))\ __put_user_size(__pu_val, __pu_addr, __pu_size, __pu_err); \ \ - __pu_err; \ + __builtin_expect(__pu_err, 0); \ }) #define __put_user_nosleep(x, ptr, size) \ @@ -204,7 +204,7 @@ do { \ __chk_user_ptr(__pu_addr); \ __put_user_size(__pu_val, __pu_addr, __pu_size, __pu_err); \ \ - __pu_err; \ + __builtin_expect(__pu_err, 0); \ }) @@ -307,7 +307,7 @@ do { \ __get_user_size_allowed(__gu_val, __gu_addr, __gu_size, __gu_err); \ (x) = (__typeof__(*(ptr)))__gu_val; \ \ - __gu_err; \ + __builtin_expect(__gu_err, 0); \ }) #define __get_user_check(x, ptr, size) \ @@ -324,7 +324,7 @@ do { \ } \ (x) = (__force __typeof__(*(ptr)))__gu_val; \ \ - __gu_err; \ + __builtin_expect(__gu_err, 0); \ }) #define __get_user_nosleep(x, ptr, size) \ @@ -339,7 +339,7 @@ do { \ __get_user_size(__gu_val, __gu_addr, __gu_size, __gu_err); \ (x) = (__force __typeof__(*(ptr)))__gu_val; \ \ - __gu_err; \ + __builtin_expect(__gu_err, 0); \ }) -- 2.23.0
[PATCH 3/4] powerpc/uaccess: evaluate macro arguments once, before user access is allowed
get/put_user can be called with nontrivial arguments. fs/proc/page.c has a good example: if (put_user(stable_page_flags(ppage), out)) { stable_page_flags is quite a lot of code, including spin locks in the page allocator. Ensure these arguments are evaluated before user access is allowed. This improves security by reducing code with access to userspace, but it also fixes a PREEMPT bug with KUAP on powerpc/64s: stable_page_flags is currently called with AMR set to allow writes, it ends up calling spin_unlock(), which can call preempt_schedule. But the task switch code can not be called with AMR set (it relies on interrupts saving the register), so this blows up. It's fine if the code inside allow_user_access is preemptible, because a timer or IPI will save the AMR, but it's not okay to explicitly cause a reschedule. Signed-off-by: Nicholas Piggin --- arch/powerpc/include/asm/uaccess.h | 97 ++ 1 file changed, 59 insertions(+), 38 deletions(-) diff --git a/arch/powerpc/include/asm/uaccess.h b/arch/powerpc/include/asm/uaccess.h index 670910df3cc7..1cf8595aeef1 100644 --- a/arch/powerpc/include/asm/uaccess.h +++ b/arch/powerpc/include/asm/uaccess.h @@ -162,36 +162,48 @@ do { \ prevent_write_to_user(ptr, size); \ } while (0) -#define __put_user_nocheck(x, ptr, size, do_allow) \ +#define __put_user_nocheck(x, ptr, size, do_allow) \ ({ \ long __pu_err; \ __typeof__(*(ptr)) __user *__pu_addr = (ptr); \ + __typeof__(*(ptr)) __pu_val = (x); \ + __typeof__(size) __pu_size = (size);\ + \ if (!is_kernel_addr((unsigned long)__pu_addr)) \ might_fault(); \ - __chk_user_ptr(ptr);\ - if (do_allow) \ - __put_user_size((x), __pu_addr, (size), __pu_err); \ - else \ - __put_user_size_allowed((x), __pu_addr, (size), __pu_err); \ + __chk_user_ptr(__pu_addr); \ + if (do_allow) \ + __put_user_size(__pu_val, __pu_addr, __pu_size, __pu_err); \ + else\ + __put_user_size_allowed(__pu_val, __pu_addr, __pu_size, __pu_err); \ + \ __pu_err; \ }) -#define __put_user_check(x, ptr, size) \ -({ \ - long __pu_err = -EFAULT;\ - __typeof__(*(ptr)) __user *__pu_addr = (ptr); \ - might_fault(); \ - if (access_ok(__pu_addr, size)) \ - __put_user_size((x), __pu_addr, (size), __pu_err); \ - __pu_err; \ +#define __put_user_check(x, ptr, size) \ +({ \ + long __pu_err = -EFAULT;\ + __typeof__(*(ptr)) __user *__pu_addr = (ptr); \ + __typeof__(*(ptr)) __pu_val = (x); \ + __typeof__(size) __pu_size = (size);\ + \ + might_fault(); \ + if (access_ok(__pu_addr, __pu_size))\ + __put_user_size(__pu_val, __pu_addr, __pu_size, __pu_err); \ + \ + __pu_err; \ }) #define __put_user_nosleep(x, ptr, size) \ ({ \ long __pu_err; \ __typeof__(*(ptr)) __user *__pu_addr = (ptr); \ - __chk_user_ptr(ptr);\ - __put_user_size((x), __pu_addr, (size), __pu_err); \ + __typeof__(*(ptr)) __pu_val = (x); \ + __typeof__(size) __pu_size = (size);\ + \ + __chk_user_ptr(__pu_addr); \ + __put
[PATCH 2/4] powerpc/64s: use mmu_has_feature in set_kuap() and get_kuap()
Commit 8150a153c013 ("powerpc/64s: Use early_mmu_has_feature() in set_kuap()"), had to switch to using the _early feature test, because probe_kernel_read was being called very early. After the previous patch, probe_kernel_read no longer touches kuap, so it can go back to using the non-_early variant, for better performance. Signed-off-by: Nicholas Piggin --- arch/powerpc/include/asm/book3s/64/kup-radix.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/powerpc/include/asm/book3s/64/kup-radix.h b/arch/powerpc/include/asm/book3s/64/kup-radix.h index 3bcef989a35d..67a7fd0182e6 100644 --- a/arch/powerpc/include/asm/book3s/64/kup-radix.h +++ b/arch/powerpc/include/asm/book3s/64/kup-radix.h @@ -79,7 +79,7 @@ static inline void kuap_check_amr(void) static inline unsigned long get_kuap(void) { - if (!early_mmu_has_feature(MMU_FTR_RADIX_KUAP)) + if (!mmu_has_feature(MMU_FTR_RADIX_KUAP)) return 0; return mfspr(SPRN_AMR); @@ -87,7 +87,7 @@ static inline unsigned long get_kuap(void) static inline void set_kuap(unsigned long value) { - if (!early_mmu_has_feature(MMU_FTR_RADIX_KUAP)) + if (!mmu_has_feature(MMU_FTR_RADIX_KUAP)) return; /* -- 2.23.0
[PATCH 1/4] powerpc/64s: implement probe_kernel_read/write without touching AMR
There is no need to allow user accesses when probing kernel addresses. Signed-off-by: Nicholas Piggin --- arch/powerpc/include/asm/uaccess.h | 25 ++- arch/powerpc/lib/Makefile | 2 +- arch/powerpc/lib/uaccess.c | 50 ++ 3 files changed, 68 insertions(+), 9 deletions(-) create mode 100644 arch/powerpc/lib/uaccess.c diff --git a/arch/powerpc/include/asm/uaccess.h b/arch/powerpc/include/asm/uaccess.h index 2f500debae21..670910df3cc7 100644 --- a/arch/powerpc/include/asm/uaccess.h +++ b/arch/powerpc/include/asm/uaccess.h @@ -341,8 +341,8 @@ raw_copy_in_user(void __user *to, const void __user *from, unsigned long n) } #endif /* __powerpc64__ */ -static inline unsigned long raw_copy_from_user(void *to, - const void __user *from, unsigned long n) +static inline unsigned long +raw_copy_from_user_allowed(void *to, const void __user *from, unsigned long n) { unsigned long ret; if (__builtin_constant_p(n) && (n <= 8)) { @@ -351,19 +351,19 @@ static inline unsigned long raw_copy_from_user(void *to, switch (n) { case 1: barrier_nospec(); - __get_user_size(*(u8 *)to, from, 1, ret); + __get_user_size_allowed(*(u8 *)to, from, 1, ret); break; case 2: barrier_nospec(); - __get_user_size(*(u16 *)to, from, 2, ret); + __get_user_size_allowed(*(u16 *)to, from, 2, ret); break; case 4: barrier_nospec(); - __get_user_size(*(u32 *)to, from, 4, ret); + __get_user_size_allowed(*(u32 *)to, from, 4, ret); break; case 8: barrier_nospec(); - __get_user_size(*(u64 *)to, from, 8, ret); + __get_user_size_allowed(*(u64 *)to, from, 8, ret); break; } if (ret == 0) @@ -371,9 +371,18 @@ static inline unsigned long raw_copy_from_user(void *to, } barrier_nospec(); - allow_read_from_user(from, n); ret = __copy_tofrom_user((__force void __user *)to, from, n); - prevent_read_from_user(from, n); + return ret; +} + +static inline unsigned long +raw_copy_from_user(void *to, const void __user *from, unsigned long n) +{ + unsigned long ret; + + allow_read_from_user(to, n); + ret = raw_copy_from_user_allowed(to, from, n); + prevent_read_from_user(to, n); return ret; } diff --git a/arch/powerpc/lib/Makefile b/arch/powerpc/lib/Makefile index b8de3be10eb4..a15060b5008e 100644 --- a/arch/powerpc/lib/Makefile +++ b/arch/powerpc/lib/Makefile @@ -36,7 +36,7 @@ extra-$(CONFIG_PPC64) += crtsavres.o endif obj-$(CONFIG_PPC_BOOK3S_64) += copyuser_power7.o copypage_power7.o \ - memcpy_power7.o + memcpy_power7.o uaccess.o obj64-y+= copypage_64.o copyuser_64.o mem_64.o hweight_64.o \ memcpy_64.o memcpy_mcsafe_64.o diff --git a/arch/powerpc/lib/uaccess.c b/arch/powerpc/lib/uaccess.c new file mode 100644 index ..0057ab52d6fe --- /dev/null +++ b/arch/powerpc/lib/uaccess.c @@ -0,0 +1,50 @@ +#include +#include + +static __always_inline long +probe_read_common(void *dst, const void __user *src, size_t size) +{ + long ret; + + pagefault_disable(); + ret = raw_copy_from_user_allowed(dst, src, size); + pagefault_enable(); + + return ret ? -EFAULT : 0; +} + +static __always_inline long +probe_write_common(void __user *dst, const void *src, size_t size) +{ + long ret; + + pagefault_disable(); + ret = raw_copy_to_user_allowed(dst, src, size); + pagefault_enable(); + + return ret ? -EFAULT : 0; +} + +long probe_kernel_read(void *dst, const void *src, size_t size) +{ + long ret; + mm_segment_t old_fs = get_fs(); + + set_fs(KERNEL_DS); + ret = probe_read_common(dst, (__force const void __user *)src, size); + set_fs(old_fs); + + return ret; +} + +long probe_kernel_write(void *dst, const void *src, size_t size) +{ + long ret; + mm_segment_t old_fs = get_fs(); + + set_fs(KERNEL_DS); + ret = probe_write_common((__force void __user *)dst, src, size); + set_fs(old_fs); + + return ret; +} -- 2.23.0
Re: [PATCH V2 0/3] mm/debug: Add more arch page table helper tests
On 03/27/2020 06:46 AM, Anshuman Khandual wrote: On 03/26/2020 08:53 PM, Christophe Leroy wrote: Le 26/03/2020 à 03:23, Anshuman Khandual a écrit : On 03/24/2020 10:52 AM, Anshuman Khandual wrote: This series adds more arch page table helper tests. The new tests here are either related to core memory functions and advanced arch pgtable helpers. This also creates a documentation file enlisting all expected semantics as suggested by Mike Rapoport (https://lkml.org/lkml/2020/1/30/40). This series has been tested on arm64 and x86 platforms. If folks can test these patches out on remaining ARCH_HAS_DEBUG_VM_PGTABLE enabled platforms i.e s390, arc, powerpc (32 and 64), that will be really appreciated. Thank you. On powerpc 8xx (PPC32), I get: [ 53.338368] debug_vm_pgtable: debug_vm_pgtable: Validating architecture page table helpers [ 53.347403] [ cut here ] [ 53.351832] WARNING: CPU: 0 PID: 1 at mm/debug_vm_pgtable.c:647 debug_vm_pgtable+0x280/0x3f4 mm/debug_vm_pgtable.c:647 ? With the following commits in place 53a8338ce (HEAD) Documentation/mm: Add descriptions for arch page table helper 5d4913fc1 mm/debug: Add tests validating arch advanced page table helpers bcaf120a7 mm/debug: Add tests validating arch page table helpers for core features d6ed5a4a5 x86/memory: Drop pud_mknotpresent() 0739d1f8d mm/debug: Add tests validating architecture page table helpers 16fbf79b0 (tag: v5.6-rc7) Linux 5.6-rc7 I have: facaa5eb5909 (HEAD -> helpers0) mm/debug: Add tests validating arch advanced page table helpers 6389fed515fc mm/debug: Add tests validating arch page table helpers for core features dc14ecc8b94e mm/debug: add tests validating architecture page table helpers c6624071c338 (origin/merge, merge) Automatic merge of branches 'master', 'next' and 'fixes' into merge 58e05c5508e6 Automatic merge of branches 'master', 'next' and 'fixes' into merge 1b649e0bcae7 (origin/master, origin/HEAD) Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net origin is https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git I can't see your last patch in powerpc mailing list (https://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=166237) mm/debug_vm_pgtable.c:647 is here. Line 647 is: WARN_ON(!pte_same(pte, __swp_entry_to_pte(swp))); #ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION static void __init pmd_swap_tests(unsigned long pfn, pgprot_t prot) { swp_entry_t swp; pmd_t pmd; -> Line #647 pmd = pfn_pmd(pfn, prot); swp = __pmd_to_swp_entry(pmd); WARN_ON(!pmd_same(pmd, __swp_entry_to_pmd(swp))); } #else static void __init pmd_swap_tests(unsigned long pfn, pgprot_t prot) { } #end Did I miss something ? [...] Could you please point me to the exact test which is failing ? [ 53.519778] Freeing unused kernel memory: 608K So I assume that the system should have come till runtime just fine apart from the above warning message because. Yes it boots fine otherwise. Christophe