Re: [PATCH v3 1/3] clk: analogbits: add Wide-Range PLL library
On Mon, 29 Apr 2019, Stephen Boyd wrote: > Quoting Paul Walmsley (2019-04-29 12:42:07) > > On Fri, 26 Apr 2019, Paul Walmsley wrote: > > > On Fri, 26 Apr 2019, Stephen Boyd wrote: > > > > > > > Quoting Paul Walmsley (2019-04-11 01:27:32) > > > > > Add common library code for the Analog Bits Wide-Range PLL (WRPLL) IP > > > > > block, as implemented in TSMC CLN28HPC. > > > > > > > > I haven't deeply reviewed at all, but I already get two problems when > > > > compile testing these patches. I can fix them up if nothing else needs > > > > fixing. > > > > > > > > drivers/clk/analogbits/wrpll-cln28hpc.c:165 __wrpll_calc_divq() warn: > > > > should 'target_rate << divq' be a 64 bit type? > > > > drivers/clk/sifive/fu540-prci.c:214:16: error: return expression in > > > > void function > > > > > > Hmm, that's odd. I will definitely take a look and repost. > > > > I'm not able to reproduce these problems. The configs tried here were: > > > > - 64-bit RISC-V defconfig w/ PRCI driver enabled (gcc 8.2.0 built with > > crosstool-NG 1.24.0) > > > > - 32-bit ARM defconfig w/ PRCI driver enabled (gcc 8.3.0 built with > > crosstool-NG 1.24.0) > > > > - 32-bit i386 defconfig w/ PRCI driver enabled (gcc > > 5.4.0-6ubuntu1~16.04.11) > > > > Could you post the toolchain and kernel config you're using? > > > > I'm running sparse and smatch too. OK. I was able to reproduce the __wrpll_calc_divq() warning. It's been resolved in the upcoming revision. But I don't see the second error with either sparse or smatch. (This is with sparse at commit 2b96cd804dc7 and smatch at commit f0092daff69d.) - Paul
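Note on the smatch warning above: the message about 'target_rate << divq' concerns a shift that is evaluated in a narrow type before the result is stored in a wider one. The following is a minimal userspace sketch of that effect, assuming the left operand is 32 bits wide on the affected target; the helper names are hypothetical and this is not the code from wrpll-cln28hpc.c.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers for illustration; not taken from the WRPLL driver. */

/* The shift is done in 32 bits, so high bits are lost before widening. */
static uint64_t scale_rate_narrow(uint32_t target_rate, unsigned int divq)
{
	assert(divq < 32);
	return (uint32_t)(target_rate << divq);
}

/* Widen first, then shift, so the full result is kept. */
static uint64_t scale_rate_wide(uint32_t target_rate, unsigned int divq)
{
	assert(divq < 32);
	return (uint64_t)target_rate << divq;
}
```

With a 1 GHz rate and divq = 3, the wide variant yields 8000000000 while the narrow variant wraps modulo 2^32, which is the kind of silent truncation the checker is pointing at.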
Re: [tip:sched/urgent] sched/cpufreq: Fix kobject memleak
On 29-04-19, 22:52, tip-bot for Tobin C. Harding wrote:
> Commit-ID:  8bf7ab9c79f3d1a5f02ebac369f656de9ec0aca8
> Gitweb:     https://git.kernel.org/tip/8bf7ab9c79f3d1a5f02ebac369f656de9ec0aca8
> Author:     Tobin C. Harding
> AuthorDate: Tue, 30 Apr 2019 10:11:44 +1000
> Committer:  Ingo Molnar
> CommitDate: Tue, 30 Apr 2019 06:24:09 +0200
>
> sched/cpufreq: Fix kobject memleak
>
> Currently the error return path from kobject_init_and_add() is not
> followed by a call to kobject_put() - which means we are leaking
> the kobject.
>
> Fix it by adding a call to kobject_put() in the error path of
> kobject_init_and_add().
>
> Signed-off-by: Tobin C. Harding
>
> Add call to kobject_put() in error path of kobject_init_and_add().

This should have been present before the signed-off ?

> Cc: Greg Kroah-Hartman
> Cc: Linus Torvalds
> Cc: Peter Zijlstra
> Cc: Rafael J. Wysocki
> Cc: Thomas Gleixner
> Cc: Tobin C. Harding
> Cc: Vincent Guittot
> Cc: Viresh Kumar
> Link: http://lkml.kernel.org/r/20190430001144.24890-1-to...@kernel.org
> Signed-off-by: Ingo Molnar
> ---
>  kernel/sched/cpufreq_schedutil.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
> index 5c41ea367422..3638d2377e3c 100644
> --- a/kernel/sched/cpufreq_schedutil.c
> +++ b/kernel/sched/cpufreq_schedutil.c
> @@ -771,6 +771,7 @@ out:
>  	return 0;
>
>  fail:
> +	kobject_put(&tunables->attr_set.kobj);
>  	policy->governor_data = NULL;
>  	sugov_tunables_free(tunables);

--
viresh
Re: linux-next: build warning after merge of the clk tree
Hi Anson,

On Tue, 30 Apr 2019 01:44:58 + Anson Huang wrote:
>
> Thanks for notice.
> As it is intentional, I will send out a patch to add "/* fall through
> */" to avoid this build warning,

Excellent, thanks.

--
Cheers,
Stephen Rothwell
[tip:sched/urgent] sched/cpufreq: Fix kobject memleak
Commit-ID:  8bf7ab9c79f3d1a5f02ebac369f656de9ec0aca8
Gitweb:     https://git.kernel.org/tip/8bf7ab9c79f3d1a5f02ebac369f656de9ec0aca8
Author:     Tobin C. Harding
AuthorDate: Tue, 30 Apr 2019 10:11:44 +1000
Committer:  Ingo Molnar
CommitDate: Tue, 30 Apr 2019 06:24:09 +0200

sched/cpufreq: Fix kobject memleak

Currently the error return path from kobject_init_and_add() is not
followed by a call to kobject_put() - which means we are leaking
the kobject.

Fix it by adding a call to kobject_put() in the error path of
kobject_init_and_add().

Signed-off-by: Tobin C. Harding

Add call to kobject_put() in error path of kobject_init_and_add().

Cc: Greg Kroah-Hartman
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Rafael J. Wysocki
Cc: Thomas Gleixner
Cc: Tobin C. Harding
Cc: Vincent Guittot
Cc: Viresh Kumar
Link: http://lkml.kernel.org/r/20190430001144.24890-1-to...@kernel.org
Signed-off-by: Ingo Molnar
---
 kernel/sched/cpufreq_schedutil.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
index 5c41ea367422..3638d2377e3c 100644
--- a/kernel/sched/cpufreq_schedutil.c
+++ b/kernel/sched/cpufreq_schedutil.c
@@ -771,6 +771,7 @@ out:
 	return 0;

 fail:
+	kobject_put(&tunables->attr_set.kobj);
 	policy->governor_data = NULL;
 	sugov_tunables_free(tunables);
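Note on the pattern this commit fixes: an "init and add" step can fail after it has already taken the initial reference and allocated resources, so the error path still has to drop that reference for the release routine to run. Below is a small userspace sketch of the idea. All names (struct obj, obj_init_and_add(), obj_put(), setup()) are hypothetical stand-ins, not the kobject API, and the failure is injected with a flag purely for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical refcounted object; a stand-in, not struct kobject. */
struct obj {
	int refcount;
	char *name;	/* allocated during init, freed by the release step */
	int released;	/* set by obj_release(), for demonstration only */
};

static void obj_release(struct obj *o)
{
	free(o->name);
	o->name = NULL;
	o->released = 1;
}

/* Drop one reference; the last put triggers the release. */
static void obj_put(struct obj *o)
{
	assert(o->refcount > 0);
	if (--o->refcount == 0)
		obj_release(o);
}

/*
 * Takes the initial reference and allocates the name before the "add"
 * step. fail_add simulates the add step failing afterwards.
 */
static int obj_init_and_add(struct obj *o, const char *name, int fail_add)
{
	size_t len = strlen(name) + 1;

	o->refcount = 1;
	o->released = 0;
	o->name = malloc(len);
	if (!o->name)
		return -1;
	memcpy(o->name, name, len);
	return fail_add ? -1 : 0;
}

/* Caller-side error path: put the object instead of just returning. */
static int setup(struct obj *o, int fail_add)
{
	int ret = obj_init_and_add(o, "tunables", fail_add);

	if (ret) {
		obj_put(o);	/* without this, o->name would be leaked */
		return ret;
	}
	return 0;
}
```

Without the obj_put() call in the error branch of setup(), the name buffer stays allocated with nobody left to free it, which mirrors the leak described in the commit message.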
Re: [PATCH] RISC-V: Add an Image header that boot loader can parse.
On 4/29/19 4:40 PM, Palmer Dabbelt wrote: On Tue, 23 Apr 2019 16:25:06 PDT (-0700), atish.pa...@wdc.com wrote: Currently, last stage boot loaders such as U-Boot can accept only uImage which is an unnecessary additional step in automating boot flows. Add a simple image header that boot loaders can parse and directly load kernel flat Image. The existing booting methods will continue to work as it is. Tested on both QEMU and HiFive Unleashed using OpenSBI + U-Boot + Linux. Signed-off-by: Atish Patra --- arch/riscv/include/asm/image.h | 32 arch/riscv/kernel/head.S | 28 2 files changed, 60 insertions(+) create mode 100644 arch/riscv/include/asm/image.h diff --git a/arch/riscv/include/asm/image.h b/arch/riscv/include/asm/image.h new file mode 100644 index ..76a7e0d4068a --- /dev/null +++ b/arch/riscv/include/asm/image.h @@ -0,0 +1,32 @@ +/* SPDX-License-Identifier: GPL-2.0 */ + +#ifndef __ASM_IMAGE_H +#define __ASM_IMAGE_H + +#define RISCV_IMAGE_MAGIC "RISCV" + +#ifndef __ASSEMBLY__ +/* + * struct riscv_image_header - riscv kernel image header + * + * @code0: Executable code + * @code1: Executable code + * @text_offset: Image load offset + * @image_size:Effective Image size + * @reserved: reserved + * @magic: Magic number + * @reserved: reserved + */ + +struct riscv_image_header { + u32 code0; + u32 code1; + u64 text_offset; + u64 image_size; + u64 res1; + u64 magic; + u32 res2; + u32 res3; +}; I don't want to invent our own file format. Is there a reason we can't just use something standard? Off the top of my head I can think of ELF files and multiboot. Additional header is required to accommodate PE header format. Currently, this is only used for booti command but it will be reused for EFI headers as well. Linux kernel Image can pretend as an EFI application if PE/COFF header is present. This removes the need of an explicit EFI boot loader and EFI firmware can directly load Linux (obviously after EFI stub implementation for RISC-V). 
ARM64 follows the similar header format as well. https://www.kernel.org/doc/Documentation/arm64/booting.txt Regards, Atish +#endif /* __ASSEMBLY__ */ +#endif /* __ASM_IMAGE_H */ diff --git a/arch/riscv/kernel/head.S b/arch/riscv/kernel/head.S index fe884cd69abd..154647395601 100644 --- a/arch/riscv/kernel/head.S +++ b/arch/riscv/kernel/head.S @@ -19,9 +19,37 @@ #include #include #include +#include __INIT ENTRY(_start) + /* +* Image header expected by Linux boot-loaders. The image header data +* structure is described in asm/image.h. +* Do not modify it without modifying the structure and all bootloaders +* that expects this header format!! +*/ + /* jump to start kernel */ + j _start_kernel + /* reserved */ + .word 0 + .balign 8 +#if __riscv_xlen == 64 + /* Image load offset(2MB) from start of RAM */ + .dword 0x20 +#else + /* Image load offset(4MB) from start of RAM */ + .dword 0x40 +#endif + /* Effective size of kernel image */ + .dword _end - _start + .dword 0 + .asciz RISCV_IMAGE_MAGIC + .word 0 + .word 0 + +.global _start_kernel +_start_kernel: /* Mask all interrupts */ csrw sie, zero ___ linux-riscv mailing list linux-ri...@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-riscv
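Note on how a boot loader might consume the proposed header: the sketch below mirrors the struct from the patch with fixed-width userspace types and checks the magic before trusting the other fields. The function riscv_image_parse() is hypothetical and not part of U-Boot or the kernel, and the sketch assumes the loader runs with the same (little-endian) byte order as the image.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RISCV_IMAGE_MAGIC "RISCV"

/* Same field order as the struct proposed in asm/image.h. */
struct riscv_image_header {
	uint32_t code0;
	uint32_t code1;
	uint64_t text_offset;
	uint64_t image_size;
	uint64_t res1;
	uint64_t magic;
	uint32_t res2;
	uint32_t res3;
};

static_assert(sizeof(struct riscv_image_header) == 48,
	      "unexpected header size");
static_assert(offsetof(struct riscv_image_header, magic) == 32,
	      "unexpected magic offset");

/*
 * Hypothetical loader-side check. Returns 0 and fills the outputs when
 * buf starts with a plausible header, -1 otherwise. Assumes host and
 * image share the same byte order.
 */
static int riscv_image_parse(const void *buf, size_t len,
			     uint64_t *text_offset, uint64_t *image_size)
{
	struct riscv_image_header hdr;

	if (len < sizeof(hdr))
		return -1;

	/* Copy out first so an unaligned buffer is not dereferenced. */
	memcpy(&hdr, buf, sizeof(hdr));

	/* Compare the magic bytes without the terminating NUL. */
	if (memcmp(&hdr.magic, RISCV_IMAGE_MAGIC,
		   sizeof(RISCV_IMAGE_MAGIC) - 1) != 0)
		return -1;

	*text_offset = hdr.text_offset;
	*image_size = hdr.image_size;
	return 0;
}
```

A loader would then place the image at the start of RAM plus text_offset and reserve image_size bytes, which is the information the flat Image otherwise lacks.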
Re: sh4-linux-gnu-ld: arch/sh/kernel/cpu/sh2/clock-sh7619.o:undefined reference to `followparent_recalc'
On 4/29/19 9:48 PM, kbuild test robot wrote: > Hi Randy, > > It's probably a bug fix that unveils the link errors. Yoshinori Sato (cc-ed) has a patch for this. I guess that it's not in the arch/sh git tree yet ??? or wherever arch/sh changes come from. > tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git > master > head: 83a50840e72a5a964b4704fcdc2fbb2d771015ab > commit: acaf892ecbf5be7710ae05a61fd43c668f68ad95 sh: fix multiple function > definition build errors > date: 3 weeks ago > config: sh-allmodconfig (attached as .config) > compiler: sh4-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0 > reproduce: > wget > https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O > ~/bin/make.cross > chmod +x ~/bin/make.cross > git checkout acaf892ecbf5be7710ae05a61fd43c668f68ad95 > # save the attached .config to linux build tree > GCC_VERSION=7.2.0 make.cross ARCH=sh > > If you fix the issue, kindly add following tag > Reported-by: kbuild test robot > > All errors (new ones prefixed by >>): > >>> sh4-linux-gnu-ld: arch/sh/kernel/cpu/sh2/clock-sh7619.o:(.data+0x1c): >>> undefined reference to `followparent_recalc' > > --- > 0-DAY kernel test infrastructureOpen Source Technology Center > https://lists.01.org/pipermail/kbuild-all Intel Corporation > -- ~Randy
Re: [PATCH 7/7] dmaengine: sprd: Add interrupt support for 2-stage transfer
On Mon, 29 Apr 2019 at 22:10, Vinod Koul wrote: > > On 29-04-19, 20:11, Baolin Wang wrote: > > On Mon, 29 Apr 2019 at 20:01, Vinod Koul wrote: > > > On 15-04-19, 20:15, Baolin Wang wrote: > > > > > @@ -429,6 +433,9 @@ static int sprd_dma_set_2stage_config(struct > > > > sprd_dma_chn *schan) > > > > val = chn & SPRD_DMA_GLB_SRC_CHN_MASK; > > > > val |= BIT(schan->trg_mode - 1) << > > > > SPRD_DMA_GLB_TRG_OFFSET; > > > > val |= SPRD_DMA_GLB_2STAGE_EN; > > > > + if (schan->int_type != SPRD_DMA_NO_INT) > > > > > > Who configure int_type? > > > > The int_type is configured through the flags of > > sprd_dma_prep_slave_sg() by users, see: > > https://elixir.bootlin.com/linux/v5.1-rc6/source/include/linux/dma/sprd-dma.h#L9 > > Please use DMA_PREP_INTERRUPT flag instead! We can not use DMA_PREP_INTERRUPT flag, since we have some Spreadtrum specific DMA interrupt flags configured by users, which I think we have made a consensus before. See: https://elixir.bootlin.com/linux/v5.1-rc6/source/include/linux/dma/sprd-dma.h#L105 -- Baolin Wang Best Regards
[PATCH] pid: Remove unneeded hash header file
Hash functions are not needed since idr is used now. Let's remove hash header file for cleanup. Signed-off-by: Timmy Li --- kernel/pid.c | 1 - 1 file changed, 1 deletion(-) diff --git a/kernel/pid.c b/kernel/pid.c index 20881598bdfa..89548d35eefb 100644 --- a/kernel/pid.c +++ b/kernel/pid.c @@ -32,7 +32,6 @@ #include #include #include -#include #include #include #include -- 2.17.1
Re: [PATCH 4/7] dmaengine: sprd: Add device validation to support multiple controllers
On Mon, 29 Apr 2019 at 22:05, Vinod Koul wrote: > > On 29-04-19, 20:20, Baolin Wang wrote: > > On Mon, 29 Apr 2019 at 19:57, Vinod Koul wrote: > > > > > > On 15-04-19, 20:14, Baolin Wang wrote: > > > > From: Eric Long > > > > > > > > Since we can support multiple DMA engine controllers, we should add > > > > device validation in filter function to check if the correct controller > > > > to be requested. > > > > > > > > Signed-off-by: Eric Long > > > > Signed-off-by: Baolin Wang > > > > --- > > > > drivers/dma/sprd-dma.c |5 + > > > > 1 file changed, 5 insertions(+) > > > > > > > > diff --git a/drivers/dma/sprd-dma.c b/drivers/dma/sprd-dma.c > > > > index 0f92e60..9f99d4b 100644 > > > > --- a/drivers/dma/sprd-dma.c > > > > +++ b/drivers/dma/sprd-dma.c > > > > @@ -1020,8 +1020,13 @@ static void sprd_dma_free_desc(struct > > > > virt_dma_desc *vd) > > > > static bool sprd_dma_filter_fn(struct dma_chan *chan, void *param) > > > > { > > > > struct sprd_dma_chn *schan = to_sprd_dma_chan(chan); > > > > + struct of_phandle_args *dma_spec = > > > > + container_of(param, struct of_phandle_args, args[0]); > > > > u32 slave_id = *(u32 *)param; > > > > > > > > + if (chan->device->dev->of_node != dma_spec->np) > > > > > > Are you not using of_dma_find_controller() that does this, so this would > > > be useless! > > > > Yes, we can use of_dma_find_controller(), but that will be a little > > complicated than current solution. Since we need introduce one > > structure to save the node to validate in the filter function like > > below, which seems make things complicated. But if you still like to > > use of_dma_find_controller(), I can change to use it in next version. > > Sorry I should have clarified more.. 
> > of_dma_find_controller() is called by xlate, so you already run this > check, so why use this :) The of_dma_find_controller() can save the requested device node into dma_spec, and in the of_dma_simple_xlate() function, it will call dma_request_channel() to request one channel, but it did not validate the device node to find the corresponding dma device in dma_request_channel(). So we should in our filter function to validate the device node with the device node specified by the dma_spec. Hope I make things clear. -- Baolin Wang Best Regards
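Note on the filter-function approach described above: the patch recovers the enclosing specifier from the pointer to its first argument and then compares device nodes. The following userspace sketch shows that mechanism with simplified stand-in types. struct node, struct phandle_args, struct dma_device and both functions are hypothetical and only model the relevant fields, under the assumption that the filter parameter points at args[0] of the specifier.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Simplified stand-ins; these are not the kernel definitions. */
struct node {
	int id;
};

struct phandle_args {
	struct node *np;	/* controller node named by the client */
	int args_count;
	uint32_t args[4];	/* args[0] carries the slave id here */
};

struct dma_device {
	struct node *of_node;	/* node of the controller owning a channel */
};

/* The slave id is read directly through the filter parameter. */
static uint32_t filter_slave_id(void *param)
{
	assert(param != NULL);
	return *(uint32_t *)param;
}

/*
 * param is assumed to point at spec->args[0], so the enclosing
 * specifier can be recovered and its node compared with the device.
 */
static bool filter_fn(const struct dma_device *dev, void *param)
{
	struct phandle_args *spec;

	assert(param != NULL);
	spec = container_of(param, struct phandle_args, args[0]);
	return dev->of_node == spec->np;
}
```

With two controllers present, only the channel whose device node matches the specifier passes the filter, which is the validation the patch adds.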
Re: [PATCH v4] panic: add an option to replay all the printk message in buffer
On (04/29/19 13:44), Petr Mladek wrote: > On Sat 2019-04-27 02:16:40, Sergey Senozhatsky wrote: > > On (04/27/19 01:43), Sergey Senozhatsky wrote: > > [..] > > > > The console waiter logic is effective but it does not always > > > > work. The current console owner must be calling the console > > > > drivers. > > > > > > > > > Hmm, we might have a bit of a problem here, maybe. > > > > > > > > Hmm, the printk() might wait forever when NMI stopped > > > > the current console owner in the console driver code > > > > or with the logbuf_lock taken. > > > > > > I guess this is why we re-init logbuf lock from panic, > > > however, we don't do anything with the console_owner. > > > > > The console waiter logic might get solved by clearing > > > > the console_owner in console_flush_on_panic(). It can't > > > > be much worse, we already ignore console_lock() there, ... > > > > Hmm, or maybe we are fine... console_waiter logic should work > > before we send out stop IPI/NMI from panic CPU. When we call > > flush_on_panic() console_unlock() clears console_owner, so > > panic_print_sys_info() should not deadlock on console_owner. > > Good point! > > > It's probably only problematic if we kill a console_owner > > CPU and then try to printk() (from smp_send_stop()) before > > we do flush_on_panic()->console_unlock(). > > Yup. There are called several functions between smp_send_stop() > and console_flush_on_panic(). > > The question is if it is worth a code complication. We could > never 100% guarantee that printk() would work in panic(). > I more and more understand what Peter Zijlstra means > by the duct taping. Agreed. -ss
Re: [RFC][PATCHSET] sorting out RCU-delayed stuff in ->destroy_inode()
> On Apr 29, 2019, at 10:26 PM, Al Viro wrote:
>
> On Mon, Apr 29, 2019 at 10:18:04PM -0600, Andreas Dilger wrote:
>>>
>>> 	void			*i_private; /* fs or device private pointer */
>>> +	void (*free_inode)(struct inode *);
>>
>> It seems like a waste to increase the size of every struct inode just to access
>> a static pointer. Is this the only place that ->free_inode() is called? Why
>> not move the ->free_inode() pointer into inode->i_fop->free_inode() so that it
>> is still directly accessible at this point.
>
> i_op, surely?

Yes, i_op is what I was thinking.

> In any case, increasing sizeof(struct inode) is not a problem -
> if anything, I'd turn ->i_fop into an anon union with that. As in,
>
> diff --git a/fs/inode.c b/fs/inode.c
> index fb45590d284e..627e1766503a 100644
> --- a/fs/inode.c
> +++ b/fs/inode.c
> @@ -211,8 +211,8 @@ EXPORT_SYMBOL(free_inode_nonrcu);
>  static void i_callback(struct rcu_head *head)
>  {
>  	struct inode *inode = container_of(head, struct inode, i_rcu);
> -	if (inode->i_sb->s_op->free_inode)
> -		inode->i_sb->s_op->free_inode(inode);
> +	if (inode->free_inode)
> +		inode->free_inode(inode);
>  	else
>  		free_inode_nonrcu(inode);
>  }
> @@ -236,6 +236,7 @@ static struct inode *alloc_inode(struct super_block *sb)
>  		if (!ops->free_inode)
>  			return NULL;
>  	}
> +	inode->free_inode = ops->free_inode;
>  	i_callback(&inode->i_rcu);
>  	return NULL;
>  }
> @@ -276,6 +277,7 @@ static void destroy_inode(struct inode *inode)
>  		if (!ops->free_inode)
>  			return;
>  	}
> +	inode->free_inode = ops->free_inode;
>  	call_rcu(&inode->i_rcu, i_callback);
>  }

This seems like kind of a hack. I guess your goal is to have ->free_inode
accessible regardless of whether the filesystem has installed its own ->i_op
methods or not, and i_fop is no longer used by this point.

That said, this seems better than increasing the size of struct inode.

> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 2e9b9f87caca..92732286b748 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -694,7 +694,10 @@ struct inode {
>  #ifdef CONFIG_IMA
>  	atomic_t		i_readcount; /* struct files open RO */
>  #endif
> -	const struct file_operations	*i_fop;	/* former ->i_op->default_file_ops */
> +	union {
> +		const struct file_operations	*i_fop;	/* former ->i_op->default_file_ops */
> +		void (*free_inode)(struct inode *);
> +	};

Cheers, Andreas
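Note on the anonymous-union idea discussed in this message: a field that is dead by the time the object is torn down can share storage with the destructor pointer, so the struct does not grow. The userspace sketch below illustrates this with simplified stand-in types. struct inode_plain, struct inode_union and the helper functions are hypothetical, and the deferred RCU callback is modelled as a direct call.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins; not the kernel's struct inode. */
struct file_ops {
	int dummy;
};

struct inode_plain {
	long i_ino;
	const struct file_ops *i_fop;
};

struct inode_union {
	long i_ino;
	union {
		const struct file_ops *i_fop;	/* used while the inode is live */
		void (*free_inode)(struct inode_union *); /* used at teardown */
	};
};

/* Holds on platforms where object and function pointers have equal size. */
static_assert(sizeof(struct inode_union) == sizeof(struct inode_plain),
	      "the union must not grow the structure");

static int freed_count;

static void demo_free_inode(struct inode_union *inode)
{
	(void)inode;
	freed_count++;
}

/* Stand-in for the callback that would run after the grace period. */
static void i_callback(struct inode_union *inode)
{
	if (inode->free_inode)
		inode->free_inode(inode);
}

static void destroy_inode(struct inode_union *inode,
			  void (*free_fn)(struct inode_union *))
{
	/* Overwrites i_fop, which is assumed to be unused from here on. */
	inode->free_inode = free_fn;
	i_callback(inode);	/* the kernel would defer this via call_rcu() */
}
```

The design trade-off is the one raised in the thread: the callback no longer has to chase inode->i_sb->s_op, at the cost of relying on i_fop being dead once destruction starts.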
RE: [PATCH v3 1/1] Add support for IPMB driver
> -Original Message- > From: Asmaa Mnebhi > Sent: Tuesday, April 30, 2019 12:57 AM > To: miny...@acm.org; w...@the-dreams.de; Vadim Pasternak > ; Michael Shych > Cc: Asmaa Mnebhi ; linux-kernel@vger.kernel.org; > linux-...@vger.kernel.org > Subject: [PATCH v3 1/1] Add support for IPMB driver > > Support receiving IPMB requests on a Satellite MC from the BMC. > Once a response is ready, this driver will send back a response to the BMC via > the IPMB channel. Hi Asmaa, Few common questions. You define this driver as "Mellanox BlueField IPMB driver". What makes it Mellanox BlueField specific? Which HW configuration you used for testing? Could you please explain connectivity schema between main BMC and satellite BMCs? How this module is supposed to be activated? Don't you need to add DTS/ACPI records? Also few comments below. > > Signed-off-by: Asmaa Mnebhi > --- > drivers/char/ipmi/Kconfig| 8 + > drivers/char/ipmi/Makefile | 1 + > drivers/char/ipmi/ipmb_dev_int.c | 386 > +++ > 3 files changed, 395 insertions(+) > create mode 100644 drivers/char/ipmi/ipmb_dev_int.c > > diff --git a/drivers/char/ipmi/Kconfig b/drivers/char/ipmi/Kconfig index > 94719fc..12fe8f2 100644 > --- a/drivers/char/ipmi/Kconfig > +++ b/drivers/char/ipmi/Kconfig > @@ -74,6 +74,14 @@ config IPMI_SSIF >have a driver that must be accessed over an I2C bus instead of a >standard interface. This module requires I2C support. > > +config IPMB_DEVICE_INTERFACE > + tristate 'IPMB Interface handler' > + depends on I2C && I2C_SLAVE > + help > + Provides a driver for a device (Satellite MC) to > + receive requests and send responses back to the BMC via > + the IPMB interface. This module requires I2C support. 
> + > config IPMI_POWERNV > depends on PPC_POWERNV > tristate 'POWERNV (OPAL firmware) IPMI interface' > diff --git a/drivers/char/ipmi/Makefile b/drivers/char/ipmi/Makefile index > 3f06b20..0822adc 100644 > --- a/drivers/char/ipmi/Makefile > +++ b/drivers/char/ipmi/Makefile > @@ -26,3 +26,4 @@ obj-$(CONFIG_IPMI_KCS_BMC) += kcs_bmc.o > obj-$(CONFIG_ASPEED_BT_IPMI_BMC) += bt-bmc.o > obj-$(CONFIG_ASPEED_KCS_IPMI_BMC) += kcs_bmc_aspeed.o > obj-$(CONFIG_NPCM7XX_KCS_IPMI_BMC) += kcs_bmc_npcm7xx.o > +obj-$(CONFIG_IPMB_DEVICE_INTERFACE) += ipmb_dev_int.o > diff --git a/drivers/char/ipmi/ipmb_dev_int.c > b/drivers/char/ipmi/ipmb_dev_int.c > new file mode 100644 > index 000..63122c3 > --- /dev/null > +++ b/drivers/char/ipmi/ipmb_dev_int.c > @@ -0,0 +1,386 @@ > +// SPDX-License-Identifier: GPL-2.0 > + > +/* > + * Mellanox IPMB driver to receive a request and send a response > + * > + * Copyright (C) 2018 Mellanox Techologies, Ltd. > + * > + * This was inspired by Brendan Higgins' ipmi-bmc-bt-i2c driver. > + */ > + > +#define pr_fmt(fmt) "ipmb_dev_int: " fmt > + > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > + > +#define MAX_MSG_LEN 128 > +#define IPMB_REQUEST_LEN_MIN7 > +#define NETFN_RSP_BIT_MASK 0x4 > +#define REQUEST_QUEUE_MAX_LEN 256 > + > +#define IPMB_MSG_LEN_IDX0 > +#define RQ_SA_8BIT_IDX 1 > +#define NETFN_LUN_IDX 2 > + > +#define IPMB_MSG_PAYLOAD_LEN_MAX (MAX_MSG_LEN - > IPMB_REQUEST_LEN_MIN - 1) > + > +struct ipmb_msg { > + u8 len; > + u8 rs_sa; > + u8 netfn_rs_lun; > + u8 checksum1; > + u8 rq_sa; > + u8 rq_seq_rq_lun; > + u8 cmd; > + u8 payload[IPMB_MSG_PAYLOAD_LEN_MAX]; > + /* checksum2 is included in payload */ } __packed; > + > +static u32 ipmb_msg_len(struct ipmb_msg *ipmb_msg) { > + return ipmb_msg->len + 1; > +} Do you really need it as function? 
> + > +struct ipmb_request_elem { > + struct list_head list; > + struct ipmb_msg request; > +}; > + > +struct ipmb_dev { > + struct i2c_client *client; > + struct miscdevice miscdev; > + struct ipmb_msg request; > + struct list_head request_queue; > + atomic_t request_queue_len; > + struct ipmb_msg response; Where you are using 'response' field? > + size_t msg_idx; > + spinlock_t lock; > + wait_queue_head_t wait_queue; > + struct mutex file_mutex; > +}; > + > +static int receive_ipmb_request(struct ipmb_dev *ipmb_dev_p, > + bool non_blocking, > + struct ipmb_msg *ipmb_request) > +{ > + struct ipmb_request_elem *queue_elem; > + unsigned long flags; > + int res; > + > + spin_lock_irqsave(_dev_p->lock, flags); > + > + while (!atomic_read(_dev_p->request_queue_len)) { > + spin_unlock_irqrestore(_dev_p->lock, flags); > + if (non_blocking) > + return -EAGAIN; > + > + res = wait_event_interruptible(ipmb_dev_p->wait_queue, > +
Re: [PATCH RESEND] sched/cpufreq: Fix kobject memleak
On Tue, Apr 30, 2019 at 06:24:43AM +0200, Ingo Molnar wrote: > > * Tobin C. Harding wrote: > > > Currently error return from kobject_init_and_add() is not followed by a > > call to kobject_put(). This means there is a memory leak. > > > > Add call to kobject_put() in error path of kobject_init_and_add(). > > > > Signed-off-by: Tobin C. Harding > > --- > > > > Resend with SOB tag. > > Please ignore my previous mail :-) Cheers Ingo, caught myself not checkpatching :( thanks, Tobin.
[PATCH v1] mmc: dt: add DT bindings for ls1028a eSDHC host controller
From: Yinbo Zhu Add "fsl,ls1028a-esdhc" bindings for ls1028a eSDHC host controller Signed-off-by: Yinbo Zhu --- .../devicetree/bindings/mmc/fsl-esdhc.txt |1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/Documentation/devicetree/bindings/mmc/fsl-esdhc.txt b/Documentation/devicetree/bindings/mmc/fsl-esdhc.txt index 99c5cf8..a7250b9 100644 --- a/Documentation/devicetree/bindings/mmc/fsl-esdhc.txt +++ b/Documentation/devicetree/bindings/mmc/fsl-esdhc.txt @@ -21,6 +21,7 @@ Required properties: "fsl,ls1043a-esdhc" "fsl,ls1046a-esdhc" "fsl,ls2080a-esdhc" + "fsl,ls1028a-esdhc" - clock-frequency : specifies eSDHC base clock frequency. Optional properties: -- 1.7.1
Re: [PATCH v2 17/19] iommu: Add max num of cache and granu types
Hi Jacob, On 4/29/19 6:17 PM, Jacob Pan wrote: > On Fri, 26 Apr 2019 18:22:46 +0200 > Auger Eric wrote: > >> Hi Jacob, >> >> On 4/24/19 1:31 AM, Jacob Pan wrote: >>> To convert to/from cache types and granularities between generic and >>> VT-d specific counterparts, a 2D arrary is used. Introduce the >>> limits >> array >>> to help define the converstion array size. >> conversion >>> > will fix, thanks >>> Signed-off-by: Jacob Pan >>> --- >>> include/uapi/linux/iommu.h | 2 ++ >>> 1 file changed, 2 insertions(+) >>> >>> diff --git a/include/uapi/linux/iommu.h b/include/uapi/linux/iommu.h >>> index 5c95905..2d8fac8 100644 >>> --- a/include/uapi/linux/iommu.h >>> +++ b/include/uapi/linux/iommu.h >>> @@ -197,6 +197,7 @@ struct iommu_inv_addr_info { >>> __u64 granule_size; >>> __u64 nb_granules; >>> }; >>> +#define NR_IOMMU_CACHE_INVAL_GRANU (3) >>> >>> /** >>> * First level/stage invalidation information >>> @@ -235,6 +236,7 @@ struct iommu_cache_invalidate_info { >>> struct iommu_inv_addr_info addr_info; >>> }; >>> }; >>> +#define NR_IOMMU_CACHE_TYPE(3) >>> /** >>> * struct gpasid_bind_data - Information about device and guest >>> PASID binding >>> * @gcr3: Guest CR3 value from guest mm >>> >> Is it really something that needs to be exposed in the uapi? >> > I put it in uapi since the related definitions for granularity and > cache type are in the same file. > Maybe putting them close together like this? I was thinking you can just > fold it into your next series as one patch for introducing cache > invalidation. 
> diff --git a/include/uapi/linux/iommu.h b/include/uapi/linux/iommu.h > index 2d8fac8..4ff6929 100644 > --- a/include/uapi/linux/iommu.h > +++ b/include/uapi/linux/iommu.h > @@ -164,6 +164,7 @@ enum iommu_inv_granularity { > IOMMU_INV_GRANU_DOMAIN, /* domain-selective invalidation */ > IOMMU_INV_GRANU_PASID, /* pasid-selective invalidation */ > IOMMU_INV_GRANU_ADDR, /* page-selective invalidation */ > + NR_IOMMU_INVAL_GRANU, /* number of invalidation granularities > */ }; > > /** > @@ -228,6 +229,7 @@ struct iommu_cache_invalidate_info { > #define IOMMU_CACHE_INV_TYPE_IOTLB (1 << 0) /* IOMMU IOTLB */ > #define IOMMU_CACHE_INV_TYPE_DEV_IOTLB (1 << 1) /* Device IOTLB */ > #define IOMMU_CACHE_INV_TYPE_PASID (1 << 2) /* PASID cache */ > +#define NR_IOMMU_CACHE_TYPE(3) OK I will add this. Thanks Eric > __u8cache; > __u8granularity; > >> Thanks >> >> Eric > > [Jacob Pan] >
Re: [RFC PATCH 2/7] x86/sci: add core implementation for system call isolation
* Andy Lutomirski wrote:

> On Sat, Apr 27, 2019 at 3:46 AM Ingo Molnar wrote:
> > So I'm wondering whether there's a 4th choice as well, which avoids
> > control flow corruption *before* it happens:
> >
> >  - A C language runtime that is a subset of current C syntax and
> >    semantics used in the kernel, and which doesn't allow access outside
> >    of existing objects and thus creates a strictly enforced separation
> >    between memory used for data, and memory used for code and control
> >    flow.
> >
> >  - This would involve, at minimum:
> >
> >      - tracking every type and object and its inherent length and valid
> >        access patterns, and never losing track of its type.
> >
> >      - being a lot more organized about initialization, i.e. no
> >        uninitialized variables/fields.
> >
> >      - being a lot more strict about type conversions and pointers in
> >        general.
>
> You're not the only one to suggest this. There are at least a few
> things that make this extremely difficult if not impossible. For
> example, consider this code:
>
> void maybe_buggy(void)
> {
>   int a, b;
>   int *p = &a;
>   int *q = (int *)some_function((unsigned long)p);
>   *q = 1;
> }
>
> If some_function() returns &a, then all is well. But if
> some_function() returns &b or even a valid address of some unrelated
> kernel object, then the code might be entirely valid and correct C,
> but I don't see how the runtime checks are supposed to tell whether
> the resulting address is valid or is a bug. This type of code is, I
> think, quite common in the kernel -- it happens in every data
> structure where we have unions of pointers and integers or where we
> steal some known-zero bits of a pointer to store something else.

So the thing is, for the infinitely large state space of "valid C code" we
already disallow infinitely many versions in the Linux kernel.

We have complicated rules that disallow certain C syntactical and
semantical constructs, both on the tooling (build failure/warning) and on
the review (style/taste) level.

So the question IMHO isn't whether it's "valid C", because we already have the Linux kernel's own C syntax variant and are enforcing it with varying degrees of success. The question is whether the example you gave can be written in a strongly typed fashion, whether it makes sense to do so, and what the costs are. I think it's evident that it can be written with strongly typed constructs, by separating pointers from embedded error codes - with negative side effects to code generation: for example it increases structure sizes and error return paths. I think there's four main costs of converting such a pattern to strongly typed constructs: - memory/cache footprint: there's a nonzero cost there. - performance: this will hurt too. - code readability:this will probably improve. - code robustness: this will improve too. So I think the proper question to ask is not whether there's common C syntax within the kernel that would have to be rewritten, but whether the total sum of memory and runtime overhead of strongly typed C programming (if it's possible/desirable) is larger than the total sum of a typical Linux distro enabling the various current and proposed kernel hardening features that have a runtime overhead: - the SMAP/SMEP overhead of STAC/CLAC for every single user copy - other usercopy hardening features - stackprotector - KASLR - compiler plugins against information leaks - proposed KASLR extension to implement module randomization and -PIE overhead - proposed function call integrity checks - proposed per system call kernel stack offset randomization - ( and I'm sure I forgot about a few more, and it's all still only reactive security, not proactive security. 
)

That's death by a thousand cuts and CR3 switching during system calls is
also throwing a hand grenade into the fight ;-)

So if people are also proposing to do CR3 switches in every system call,
I'm pretty sure the answer is "yes, even a managed C runtime is probably
faster than *THAT* sum of a performance mess" - at least with the current
CR3 switching x86-uarch cost structure...

Thanks,

	Ingo
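Note on "separating pointers from embedded error codes": the trade-off described in this message can be sketched in userspace C. The compact helpers below are modelled on the kernel's ERR_PTR()/IS_ERR()/PTR_ERR() convention but are reimplemented here for illustration. struct lookup_result and both lookup functions are hypothetical examples, not kernel interfaces.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Compact style: a small negative errno is folded into the pointer value. */
static inline void *err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

static inline int is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

static inline long ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

/* Strongly typed style: the error code and the pointer never share bits. */
struct lookup_result {
	int err;	/* 0 on success, negative errno otherwise */
	int *value;	/* valid only when err == 0 */
};

/* The typed variant costs extra space, as noted in the discussion. */
static_assert(sizeof(struct lookup_result) > sizeof(void *),
	      "typed result is larger than a bare pointer");

static int table[4] = { 10, 20, 30, 40 };

static void *lookup_compact(size_t idx)
{
	if (idx >= 4)
		return err_ptr(-ENOENT);
	return &table[idx];
}

static struct lookup_result lookup_typed(size_t idx)
{
	struct lookup_result r = { 0, NULL };

	if (idx >= 4)
		r.err = -ENOENT;
	else
		r.value = &table[idx];
	return r;
}
```

In the compact form a checker cannot tell from the type alone whether the returned value is an address or an error, while the typed form keeps the two apart at the price of a larger return value and an extra field to test.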
Re: [PATCH v3 1/4] include: dt-bindings: add Performance Monitoring Unit for Exynos
Hi,

I agree with this patch, but I have added some minor comments.
If you edit them according to my comments, feel free to add my following tag:

Acked-by: Chanwoo Choi

On 19. 4. 19. 10:48 PM, Lukasz Luba wrote:
> This patch add support of a new feature which can be used in DT:
> Performance Monitoring Unit with defined event data type.
> In this patch the event data types are defined for Exynos PPMU.
> The patch also updates the MAINTAINERS file accordingly and
> adds the header file to devfreq event subsystem.
>
> Signed-off-by: Lukasz Luba
> ---
>  MAINTAINERS                           |  1 +
>  include/dt-bindings/pmu/exynos_ppmu.h | 26 ++
>  2 files changed, 27 insertions(+)
>  create mode 100644 include/dt-bindings/pmu/exynos_ppmu.h
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 3671fde..1ba4b9b 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -4560,6 +4560,7 @@ T:	git git://git.kernel.org/pub/scm/linux/kernel/git/mzx/devfreq.git
>  S:	Supported
>  F:	drivers/devfreq/event/
>  F:	drivers/devfreq/devfreq-event.c
> +F:	include/dt-bindings/pmu/exynos_ppmu.h
>  F:	include/linux/devfreq-event.h
>  F:	Documentation/devicetree/bindings/devfreq/event/
>
> diff --git a/include/dt-bindings/pmu/exynos_ppmu.h b/include/dt-bindings/pmu/exynos_ppmu.h
> new file mode 100644
> index 000..08fdce9
> --- /dev/null
> +++ b/include/dt-bindings/pmu/exynos_ppmu.h
> @@ -0,0 +1,26 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Samsung Exynos PPMU event types for counting in regs
> + *
> + * Copyright (c) 2019, Samsung

Maybe, "Samsung Electronics" instead of 'Samsung'.

> + * Author: Lukasz Luba
> + */
> +
> +#ifndef __DT_BINDINGS_PMU_EXYNOS_PPMU_H
> +#define __DT_BINDINGS_PMU_EXYNOS_PPMU_H
> +
> +

Remove unneeded blank line.

> +#define PPMU_RO_BUSY_CYCLE_CNT 0x0 > +#define PPMU_WO_BUSY_CYCLE_CNT 0x1 > +#define PPMU_RW_BUSY_CYCLE_CNT 0x2 > +#define PPMU_RO_REQUEST_CNT 0x3 > +#define PPMU_WO_REQUEST_CNT 0x4 > +#define PPMU_RO_DATA_CNT 0x5 > +#define PPMU_WO_DATA_CNT 0x6 > +#define PPMU_RO_LATENCY 0x12 > +#define PPMU_WO_LATENCY 0x16 > +#define PPMU_V2_RO_DATA_CNT 0x4 > +#define PPMU_V2_WO_DATA_CNT 0x5 > +#define PPMU_V2_EVT3_RW_DATA_CNT 0x22 > + > +#endif > -- Best Regards, Chanwoo Choi Samsung Electronics
sh4-linux-gnu-ld: arch/sh/kernel/cpu/sh2/clock-sh7619.o:undefined reference to `followparent_recalc'
Hi Randy, It's probably a bug fix that unveils the link errors. tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master head: 83a50840e72a5a964b4704fcdc2fbb2d771015ab commit: acaf892ecbf5be7710ae05a61fd43c668f68ad95 sh: fix multiple function definition build errors date: 3 weeks ago config: sh-allmodconfig (attached as .config) compiler: sh4-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0 reproduce: wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross git checkout acaf892ecbf5be7710ae05a61fd43c668f68ad95 # save the attached .config to linux build tree GCC_VERSION=7.2.0 make.cross ARCH=sh If you fix the issue, kindly add following tag Reported-by: kbuild test robot All errors (new ones prefixed by >>): >> sh4-linux-gnu-ld: arch/sh/kernel/cpu/sh2/clock-sh7619.o:(.data+0x1c): >> undefined reference to `followparent_recalc' --- 0-DAY kernel test infrastructure Open Source Technology Center https://lists.01.org/pipermail/kbuild-all Intel Corporation
Re: [PATCH v6 01/10] clk: samsung: add needed IDs for DMC clocks in Exynos5420
Hi, On 19. 4. 19. 11:19 PM, Lukasz Luba wrote: > Define new IDs for clocks used by Dynamic Memory Controller in > Exynos5422 SoC. > > Acked-by: Rob Herring > Signed-off-by: Lukasz Luba > --- > include/dt-bindings/clock/exynos5420.h | 18 +- > 1 file changed, 17 insertions(+), 1 deletion(-) > > diff --git a/include/dt-bindings/clock/exynos5420.h > b/include/dt-bindings/clock/exynos5420.h > index 355f469..abb1842 100644 > --- a/include/dt-bindings/clock/exynos5420.h > +++ b/include/dt-bindings/clock/exynos5420.h > @@ -60,6 +60,7 @@ > #define CLK_MAU_EPLL 159 > #define CLK_SCLK_HSIC_12M 160 > #define CLK_SCLK_MPHY_IXTAL24 161 > +#define CLK_SCLK_BPLL 162 > > /* gate clocks */ > #define CLK_UART0 257 > @@ -195,6 +196,18 @@ > #define CLK_ACLK432_CAM 518 > #define CLK_ACLK_FL1550_CAM 519 > #define CLK_ACLK550_CAM 520 > +#define CLK_CLKM_PHY0 521 > +#define CLK_CLKM_PHY1 522 > +#define CLK_ACLK_PPMU_DREX0_0 523 > +#define CLK_ACLK_PPMU_DREX0_1 524 > +#define CLK_ACLK_PPMU_DREX1_0 525 > +#define CLK_ACLK_PPMU_DREX1_1 526 > +#define CLK_PCLK_PPMU_DREX0_0 527 > +#define CLK_PCLK_PPMU_DREX0_1 528 > +#define CLK_PCLK_PPMU_DREX1_0 529 > +#define CLK_PCLK_PPMU_DREX1_1 530 > +#define CLK_CDREX_PAUSE 531 > +#define CLK_CDREX_TIMING_SET 532 I cannot find any use of either CLK_CDREX_PAUSE or CLK_CDREX_TIMING_SET in this patchset. Please remove them. (snip) -- Best Regards, Chanwoo Choi Samsung Electronics
[PATCH 1/2] i2c: imx: I2C Driver doesn't consider I2C_IPGCLK_SEL RCW bit when using ls1046a SoC
The current kernel driver does not consider I2C_IPGCLK_SEL (424 bit of RCW) in deciding i2c_clk_rate in function i2c_imx_set_clk() { 0 Platform clock/4, 1 Platform clock/2}. When using ls1046a SoC, this populates incorrect value in IBFD register if I2C_IPGCLK_SEL = 0, which generates half of the desired Clock. Therefore, if ls1046a SoC is used, we need to set the i2c clock according to the corresponding RCW. Signed-off-by: Sumit Batra Signed-off-by: Chuanhua Han --- drivers/i2c/busses/i2c-imx.c | 64 1 file changed, 64 insertions(+) diff --git a/drivers/i2c/busses/i2c-imx.c b/drivers/i2c/busses/i2c-imx.c index 422f1a445b55..7186cf3c7d24 100644 --- a/drivers/i2c/busses/i2c-imx.c +++ b/drivers/i2c/busses/i2c-imx.c @@ -45,6 +45,8 @@ #include #include #include +#include +#include /* This will be the driver name the kernel reports */ #define DRIVER_NAME "imx-i2c" @@ -109,6 +111,21 @@ #define I2C_PM_TIMEOUT 10 /* ms */ +/* 14-1 Since array index starts from 0 */ +#define RCW_I2C_IPGCLK_WORD (14 - 1) +/* + * Set mask for RCW 424th bit, reading from DCFG_CCSR RCW Status Registers + * Since this register in RM depicted as big endian, + * so consider 31st bit as LSB for creating the mask. 
+ */ +#define RCW_I2C_IPGCLK_MASK0x80 +int i2c_ipgclk_sel = 1; + +static const struct soc_device_attribute ls1046a_soc[] = { + {.family = "QorIQ LS1046A"}, + { /* sentinel */ } +}; + /* * sorted list of clock divider, register value pairs * taken from table 26-5, p.26-9, Freescale i.MX @@ -304,6 +321,11 @@ static const struct platform_device_id imx_i2c_devtype[] = { }; MODULE_DEVICE_TABLE(platform, imx_i2c_devtype); +static const struct of_device_id guts_device_ids[] = { + { .compatible = "fsl,qoriq-device-config", }, + {} +}; + static const struct of_device_id i2c_imx_dt_ids[] = { { .compatible = "fsl,imx1-i2c", .data = _i2c_hwdata, }, { .compatible = "fsl,imx21-i2c", .data = _i2c_hwdata, }, @@ -533,6 +555,9 @@ static void i2c_imx_set_clk(struct imx_i2c_struct *i2c_imx, unsigned int div; int i; + if (!i2c_ipgclk_sel) + i2c_clk_rate = i2c_clk_rate / 2; + /* Divider value calculation */ if (i2c_imx->cur_clk == i2c_clk_rate) return; @@ -551,6 +576,10 @@ static void i2c_imx_set_clk(struct imx_i2c_struct *i2c_imx, /* Store divider value */ i2c_imx->ifdr = i2c_clk_div[i].val; + pr_alert("[%s] CLK Rate=%u Bitrate =%u Div =%u Value =%d\n", +__func__, i2c_clk_rate, i2c_imx->bitrate, +div, i2c_clk_div[i].val); + /* * There dummy delay is calculated. * It should be about one I2C clock period long. @@ -1116,6 +1145,9 @@ static int i2c_imx_probe(struct platform_device *pdev) int irq, ret; dma_addr_t phy_addr; u32 mul_value; + struct device_node *guts_node; + static struct ccsr_guts __iomem *guts_regs; + u32 rcw_reg; dev_dbg(>dev, "<%s>\n", __func__); @@ -1135,6 +1167,38 @@ static int i2c_imx_probe(struct platform_device *pdev) if (!i2c_imx) return -ENOMEM; + if (soc_device_match(ls1046a_soc)) { + /* +* Make device node for GUTS/DCFG (global utilities block) +* to read RCW. 
+*/ + guts_node = of_find_matching_node(NULL, guts_device_ids); + if (!guts_node) { + dev_err(>dev, "Could not find GUTS node\n"); + return -ENODEV; + } + /* +* Memory (IO) MAP the DCFG registers(for RCW) to +* be used in kernel virtual address space. +*/ + guts_regs = of_iomap(guts_node, 0); + of_node_put(guts_node); + if (!guts_regs) { + dev_err(>dev, "IOREMAP of GUTS node failed\n"); + return -ENOMEM; + } + /* Read rcw bit 424 (starting from 0) */ + rcw_reg = ioread32be(_regs->rcwsr[RCW_I2C_IPGCLK_WORD]); + pr_alert("RCW REG[%d]=0x%x\n", RCW_I2C_IPGCLK_WORD, rcw_reg); + if (rcw_reg & RCW_I2C_IPGCLK_MASK) { + pr_alert("Div by 2 Case Detected in RCW\n"); + i2c_ipgclk_sel = 1; + } else { + pr_alert("Div by 4 Case Detected in RCW\n"); + i2c_ipgclk_sel = 0; + } + } + if (of_id) { i2c_imx->hwdata = of_id->data; ret = of_property_read_u32(pdev->dev.of_node, -- 2.17.1
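As a worked example of the bit arithmetic the patch above relies on (a sketch only, not part of the patch): assume the RCW is exposed as consecutive 32-bit status words and that the reference manual numbers bits big-endian, so bit 0 is the MSB of word 0. Under that assumption bit 424 lands in word 424 / 32 = 13, which matches RCW_I2C_IPGCLK_WORD, at big-endian position 424 % 32 = 8, which is CPU bit 31 - 8 = 23. The helper names below are hypothetical.

```c
#include <stdint.h>

/* Hypothetical helpers, not from the patch: locate an RCW bit that the
 * reference manual numbers big-endian (bit 0 = MSB of word 0). */
unsigned int rcw_word_index(unsigned int rcw_bit)
{
	return rcw_bit / 32;
}

uint32_t rcw_bit_mask(unsigned int rcw_bit)
{
	/* Big-endian position inside the word, converted to a CPU bit. */
	return UINT32_C(1) << (31 - (rcw_bit % 32));
}

/* Mirrors the adjustment made in i2c_imx_set_clk(): like the patch, this
 * assumes the incoming rate is platform clock / 2 and halves it again
 * when the select bit is 0 (the platform clock / 4 case). */
unsigned int i2c_effective_clk_rate(unsigned int clk_rate, uint32_t rcw_word,
				    unsigned int rcw_bit)
{
	return (rcw_word & rcw_bit_mask(rcw_bit)) ? clk_rate : clk_rate / 2;
}
```

With these helpers, a word read from rcwsr[rcw_word_index(424)] would be tested against rcw_bit_mask(424); the clock rates used to exercise the sketch are arbitrary.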
Re: [PATCH v6 06/10] dt-bindings: memory-controllers: add Exynos5422 DMC device description
On 19. 4. 19. 11:19 PM, Lukasz Luba wrote: > The patch adds description for DT binding for a new Exynos5422 Dynamic > Memory Controller device. > > Signed-off-by: Lukasz Luba > --- > .../bindings/memory-controllers/exynos5422-dmc.txt | 73 > ++ > 1 file changed, 73 insertions(+) > create mode 100644 > Documentation/devicetree/bindings/memory-controllers/exynos5422-dmc.txt > > diff --git > a/Documentation/devicetree/bindings/memory-controllers/exynos5422-dmc.txt > b/Documentation/devicetree/bindings/memory-controllers/exynos5422-dmc.txt > new file mode 100644 > index 000..133b3cc > --- /dev/null > +++ b/Documentation/devicetree/bindings/memory-controllers/exynos5422-dmc.txt > @@ -0,0 +1,73 @@ > +* Exynos5422 frequency and voltage scaling for Dynamic Memory Controller > device > + > +The Samsung Exynos5422 SoC has DMC (Dynamic Memory Controller) to which the > DRAM > +memory chips are connected. The driver is to monitor the controller in > runtime > +and switch frequency and voltage. To monitor the usage of the controller in > +runtime, the driver uses the PPMU (Platform Performance Monitoring Unit), > which > +is able to measure the current load of the memory. > +When 'userspace' governor is used for the driver, an application is able to > +switch the DMC and memory frequency. > + > +Required properties for DMC device for Exynos5422: > +- compatible: Should be "samsung,exynos5422-bus". As I have already mentioned many times, this is still not fixed. Please fix it as follows: - exynos5422-bus -> exynos5422-dmc > +- clock-names : the name of clock used by the bus, "bus". The example below doesn't contain the 'bus' clock name. > +- clocks : phandles for clock specified in "clock-names" property. > +- devfreq-events : phandles for PPMU devices connected to this DMC. > +- vdd-supply : phandle for voltage regulator which is connected. > +- reg : registers of two CDREX controllers, chip information, clocks > subsystem. 
> +- operating-points-v2 : phandle for OPPs described in v2 definition. > +- device-handle : phandle of the connected DRAM memory device. For more > + information please refer to Documentation > +- devfreq-events : phandles of the PPMU events used by the controller. > + > +Example: > + > + ppmu_dmc0_0: ppmu@10d0 { > + compatible = "samsung,exynos-ppmu"; > + reg = <0x10d0 0x2000>; > + clocks = < CLK_PCLK_PPMU_DREX0_0>; > + clock-names = "ppmu"; > + status = "okay"; > + events { > + ppmu_event_dmc0_0: ppmu-event3-dmc0_0 { > + event-name = "ppmu-event3-dmc0_0"; > + }; > + }; > + }; > + > + dmc: memory-controller@10c2 { > + compatible = "samsung,exynos5422-dmc"; > + reg = <0x10c2 0x1>, <0x10c3 0x1>, > + <0x1000 0x1000>, <0x1003 0x1000>; > + clocks =< CLK_FOUT_SPLL>, > + < CLK_MOUT_SCLK_SPLL>, > + < CLK_FF_DOUT_SPLL2>, > + < CLK_FOUT_BPLL>, > + < CLK_MOUT_BPLL>, > + < CLK_SCLK_BPLL>, > + < CLK_MOUT_MX_MSPLL_CCORE>, > + < CLK_MOUT_MX_MSPLL_CCORE_PHY>, > + < CLK_MOUT_MCLK_CDREX>, > + < CLK_DOUT_CLK2X_PHY0>, > + < CLK_CLKM_PHY0>, > + < CLK_CLKM_PHY1>; > + clock-names = "fout_spll", > + "mout_sclk_spll", > + "ff_dout_spll2", > + "fout_bpll", > + "mout_bpll", > + "sclk_bpll", > + "mout_mx_mspll_ccore", > + "mout_mx_mspll_ccore_phy", > + "mout_mclk_cdrex", > + "dout_clk2x_phy0", > + "clkm_phy0", > + "clkm_phy1"; > + status = "okay"; > + operating-points-v2 = <_opp_table>; > + devfreq-events = <_event3_dmc0_0>, <_event3_dmc0_1>, > + <_event3_dmc1_0>, <_event3_dmc1_1>; > + operating-points-v2 = <_opp_table>; > + device-handle = <_K3QF2F20DB>; > + vdd-supply = <_reg>; > + }; > -- Best Regards, Chanwoo Choi Samsung Electronics
[PATCH 2/2] arm64: dts: fsl: ls1046a: Add the guts node in dts
For NXP ls1046a SoC, the i2c clock needs to be configured with the appropriate bit of RCW, so we add the guts node (GUTS/DCFG global utilities block) for the driver to read. Signed-off-by: Sumit Batra Signed-off-by: Chuanhua Han --- arch/arm64/boot/dts/freescale/fsl-ls1046a.dtsi | 5 + 1 file changed, 5 insertions(+) diff --git a/arch/arm64/boot/dts/freescale/fsl-ls1046a.dtsi b/arch/arm64/boot/dts/freescale/fsl-ls1046a.dtsi index 373310e4c0ea..f88599df18bb 100644 --- a/arch/arm64/boot/dts/freescale/fsl-ls1046a.dtsi +++ b/arch/arm64/boot/dts/freescale/fsl-ls1046a.dtsi @@ -205,6 +205,11 @@ status = "disabled"; }; + guts: global-utilities@1ee { + compatible = "fsl,qoriq-device-config"; + reg = <0x0 0x1ee 0x0 0x1000>; + }; + qspi: spi@155 { compatible = "fsl,ls1021a-qspi"; #address-cells = <1>; -- 2.17.1
Re: [RFC PATCH v2 00/17] Core scheduling v2
* Aubrey Li wrote: > On Tue, Apr 30, 2019 at 12:01 AM Ingo Molnar wrote: > > * Li, Aubrey wrote: > > > > > > I.e. showing the approximate CPU thread-load figure column would be > > > > very useful too, where '50%' shows half-loaded, '100%' fully-loaded, > > > > '200%' over-saturated, etc. - for each row? > > > > > > See below, hope this helps. > > > .--. > > > |NA/AVX vanilla-SMT [std% / sem%] cpu% |coresched-SMT [std% / > > > sem%] +/- cpu% | no-SMT [std% / sem%] +/- cpu% | > > > |--| > > > | 1/1508.5 [ 0.2%/ 0.0%] 2.1% |504.7 [ 1.1%/ > > > 0.1%]-0.8%2.1% | 509.0 [ 0.2%/ 0.0%] 0.1% 4.3% | > > > | 2/2 1000.2 [ 1.4%/ 0.1%] 4.1% | 1004.1 [ 1.6%/ > > > 0.2%] 0.4%4.1% | 997.6 [ 1.2%/ 0.1%] -0.3% 8.1% | > > > | 4/4 1912.1 [ 1.0%/ 0.1%] 7.9% | 1904.2 [ 1.1%/ > > > 0.1%]-0.4%7.9% | 1914.9 [ 1.3%/ 0.1%] 0.1%15.1% | > > > | 8/8 3753.5 [ 0.3%/ 0.0%]14.9% | 3748.2 [ 0.3%/ > > > 0.0%]-0.1% 14.9% | 3751.3 [ 0.4%/ 0.0%] -0.1%30.5% | > > > | 16/16 7139.3 [ 2.4%/ 0.2%]30.3% | 7137.9 [ 1.8%/ > > > 0.2%]-0.0% 30.3% | 7049.2 [ 2.4%/ 0.2%] -1.3%60.4% | > > > | 32/32 10899.0 [ 4.2%/ 0.4%]60.3% | 10780.3 [ 4.4%/ > > > 0.4%]-1.1% 55.9% | 10339.2 [ 9.6%/ 0.9%] -5.1%97.7% | > > > | 64/64 15086.1 [11.5%/ 1.2%]97.7% | 14262.0 [ 8.2%/ > > > 0.8%]-5.5% 82.0% | 11168.7 [22.2%/ 1.7%] -26.0% 100.0% | > > > |128/12815371.9 [22.0%/ 2.2%] 100.0% | 14675.8 [14.4%/ > > > 1.4%]-4.5% 82.8% | 10963.9 [18.5%/ 1.4%] -28.7% 100.0% | > > > |256/25615990.8 [22.0%/ 2.2%] 100.0% | 12227.9 [10.3%/ > > > 1.0%] -23.5% 73.2% | 10469.9 [19.6%/ 1.7%] -34.5% 100.0% | > > > '--' > > > > Very nice, thank you! > > > > What's interesting is how in the over-saturated case (the last three > > rows: 128, 256 and 512 total threads) coresched-SMT leaves 20-30% CPU > > performance on the floor according to the load figures. > > Yeah, I found the next focus. > > > Is this true idle time (which shows up as 'id' during 'top'), or some > > load average artifact? 
> > vmstat periodically reported intermediate CPU utilization in one > second, it was running simultaneously when the benchmarks run. The cpu% > is computed by the average of (100-idle) series. Ok - so 'vmstat' uses /proc/stat, which uses cpustat[CPUTIME_IDLE] (or its NOHZ work-alike), so this should be true idle time - to the extent the HZ process clock's sampling is accurate. So I guess the answer to my question is "yes". ;-) BTW., for robustness sake you might want to add iowait to idle time (it's the 'wa' field of vmstat) - it shouldn't matter for this particular benchmark which doesn't do much IO, but it might for others. Both CPUTIME_IDLE and CPUTIME_IOWAIT are idle states when a CPU is not utilized. [ Side note: we should really implement precise idle time accounting when CONFIG_IRQ_TIME_ACCOUNTING=y is enabled. We pay all the costs of the timestamps, but AFAICS we don't propagate that into the idle cputime metrics. ] Thanks, Ingo
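The cpu% calculation discussed above can be sketched roughly as follows: average (100 - idle) over the per-second samples, and optionally count iowait as idle as well, as suggested. This is only an illustration; the struct and function names are made up here, and it works on already-parsed percentages rather than reading vmstat or /proc/stat.

```c
#include <stddef.h>

/* One vmstat-style sample: percentage of the interval spent idle ('id')
 * and waiting for I/O ('wa'). Hypothetical type, for illustration only. */
struct cpu_sample {
	double idle;
	double iowait;
};

/* Average of (100 - idle) over the series. When count_iowait_as_idle is
 * non-zero, iowait is treated as idle time too. Returns 0 for an empty
 * series. */
double cpu_util_percent(const struct cpu_sample *s, size_t n,
			int count_iowait_as_idle)
{
	double sum = 0.0;
	size_t i;

	if (n == 0)
		return 0.0;

	for (i = 0; i < n; i++) {
		double not_busy = s[i].idle;

		if (count_iowait_as_idle)
			not_busy += s[i].iowait;
		sum += 100.0 - not_busy;
	}
	return sum / (double)n;
}
```

For a benchmark that does little I/O the two variants should give nearly the same figure; they diverge only when the 'wa' column is non-trivial.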
Re: [PATCH v3 2/2] dt-bindings: cpufreq: Document allwinner,cpu-operating-points-v2
On 29-04-19, 11:18, Rob Herring wrote: > On Sun, Apr 28, 2019 at 4:53 AM Frank Lee wrote: > > > > On Sat, Apr 27, 2019 at 5:15 AM Rob Herring wrote: > > > > > > On Wed, Apr 10, 2019 at 01:41:39PM -0400, Yangtao Li wrote: > > > > Allwinner Process Voltage Scaling Tables defines the voltage and > > > > frequency value based on the speedbin blown in the efuse combination. > > > > The sunxi-cpufreq-nvmem driver reads the efuse value from the SoC to > > > > provide the OPP framework with required information. > > > > This is used to determine the voltage and frequency value for each > > > > OPP of operating-points-v2 table when it is parsed by the OPP framework. > > > > > > > > The "allwinner,cpu-operating-points-v2" DT extends the > > > > "operating-points-v2" > > > > with following parameters: > > > > - nvmem-cells (NVMEM area containig the speedbin information) > > > > - opp-microvolt-: voltage in micro Volts. > > > > At runtime, the platform can pick a and matching > > > > opp-microvolt- property. > > > > HW: : > > > > sun50iw-h6 speed0 speed1 speed2 > > > > > > We already have at least one way to support speed bins with QC kryo > > > binding. Why do we need a different way? > > > > For some SOCs, for some reason (making the CPU have approximate > > performance), > > they use the same frequency but different voltage. In the case where > > this speed bin > > is not a lot and opp uses the same frequency, too many repeated opp > > nodes are a bit > > redundant and not intuitive enough. > > > > So, I think it's worth the new method. > > Well, I don't. > > We can't have every SoC vendor doing their own thing just because they > want to. If there are technical reasons why existing bindings don't > work, then maybe we need to do something different. But I haven't > heard any reasons. Well there is a good reason for attempting the new bindings and I wasn't sure if updating the earlier bindings or adding another one for platform is correct. 
After all, we aren't really adding new bindings, just documentation around them. There are two ways the OPP core supports this: - opp-supported-hw: This is a better fit if we have a smaller group of frequencies to select from a bigger group, so we disable the non-required OPPs completely. This is what Qcom did, as they wanted to select different frequencies altogether. - opp-microvolt-<name>: This is a better fit if the frequencies remain the same and only a few of the properties, like voltage/current, have different values. So we don't disable any OPPs but just select the right voltage/current for those frequencies. This avoids unnecessary duplication of the OPPs in DT, and that's what the Allwinner guys want. The kryo nvmem bindings currently support opp-supported-hw; maybe we can mention support for the second one in the same file and rename it appropriately. -- viresh
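The difference between the two approaches can be modelled in a few lines of plain C. This is a simplified sketch, not the OPP core's implementation: the types and function names are hypothetical, the real opp-supported-hw property is an array of masks rather than a single word, and any frequencies or voltages used to exercise it are invented.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical model of one OPP node; not the kernel's struct dev_pm_opp. */
struct named_volt {
	const char *name;		/* the name suffix of a per-bin voltage */
	unsigned long microvolt;
};

struct opp_model {
	unsigned long freq_hz;
	unsigned int supported_hw;	/* simplified single-word bitmask */
	const struct named_volt *volts;	/* per-name voltages */
	size_t nr_volts;
};

/* opp-supported-hw style: an OPP is usable only if its mask overlaps the
 * version mask of the running hardware; otherwise it is disabled. */
int opp_is_enabled(const struct opp_model *opp, unsigned int hw_version_mask)
{
	return (opp->supported_hw & hw_version_mask) != 0;
}

/* Named-voltage style: every OPP stays enabled and only the voltage
 * depends on the selected name. Returns 0 if the name is not listed. */
unsigned long opp_microvolt_for(const struct opp_model *opp, const char *name)
{
	size_t i;

	for (i = 0; i < opp->nr_volts; i++)
		if (strcmp(opp->volts[i].name, name) == 0)
			return opp->volts[i].microvolt;
	return 0;
}
```

In the first style a speed bin changes which OPPs exist; in the second the OPP list is identical for every bin and only the voltage lookup changes, which is why it avoids duplicating nodes that share a frequency.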
[PATCH 1/3] dt-bindings: i2c: add optional mul-value property to binding
NXP Layerscape SoCs have up to three MUL options available for all divider values. The choice of MUL determines the internal monitor rate of the I2C bus (SCL and SDA signals): a lower MUL value results in a higher sampling rate of the I2C signals, and a higher MUL value results in a lower sampling rate. So we add a custom mul-value property to the optional properties in the binding, to select which MUL option the device tree i2c controller node uses. Signed-off-by: Chuanhua Han --- Documentation/devicetree/bindings/i2c/i2c-imx.txt | 3 +++ 1 file changed, 3 insertions(+) diff --git a/Documentation/devicetree/bindings/i2c/i2c-imx.txt b/Documentation/devicetree/bindings/i2c/i2c-imx.txt index b967544590e8..ba8e7b7b3fa8 100644 --- a/Documentation/devicetree/bindings/i2c/i2c-imx.txt +++ b/Documentation/devicetree/bindings/i2c/i2c-imx.txt @@ -18,6 +18,9 @@ Optional properties: - sda-gpios: specify the gpio related to SDA pin - pinctrl: add extra pinctrl to configure i2c pins to gpio function for i2c bus recovery, call it "gpio" state +- mul-value: NXP Layerscape SoC have up to three MUL options available for +all I2C divider values, it describes which MUL we choose to use for the driver, +the values should be 1,2,4. Examples: -- 2.17.1
[PATCH 2/3] i2c: imx: I2C Driver IBC and SCL Divider for MUL=2 and MUL=4
NXP Layerscape SoC have up to three MUL options available for all divider values,we choice of MUL determines the internal monitor rate of the I2C bus (SCL and SDA signals). The current kernel driver supports MUL=1 by default ,but doesn't have the IBC and SCL Divider entries in vf610_i2c_clk_div for MUL=2 and MUL=4,so we need to add the corresponding support. Signed-off-by: Sumit Batra Signed-off-by: Chuanhua Han --- drivers/i2c/busses/i2c-imx.c | 71 +++- 1 file changed, 69 insertions(+), 2 deletions(-) diff --git a/drivers/i2c/busses/i2c-imx.c b/drivers/i2c/busses/i2c-imx.c index 42fed40198a0..ac5a334b7339 100644 --- a/drivers/i2c/busses/i2c-imx.c +++ b/drivers/i2c/busses/i2c-imx.c @@ -38,6 +38,7 @@ #include #include #include +#include #include #include #include @@ -156,6 +157,44 @@ static struct imx_i2c_clk_pair vf610_i2c_clk_div[] = { { 3840, 0x3F }, { 4096, 0x7B }, { 5120, 0x7D }, { 6144, 0x7E }, }; +static struct imx_i2c_clk_pair mul2_i2c_clk_div[] = { + { 40, 0x40 }, { 44, 0x41 }, { 48, 0x42 }, { 52, 0x43 }, + { 56, 0x44 }, { 60, 0x45 }, { 68, 0x46 }, { 80, 0x47 }, + { 56, 0x48 }, { 64, 0x49 }, { 72, 0x4A }, { 80, 0x4B }, + { 88, 0x4C }, { 96, 0x4D }, { 112, 0x4E }, { 136, 0x4F }, + { 96, 0x50 }, { 112, 0x51 }, { 128, 0x52 }, { 144, 0x53 }, + { 160, 0x54 }, { 176, 0x55 }, { 208, 0x56 }, { 256, 0x57 }, + { 160, 0x58 }, { 192, 0x59 }, { 224, 0x5A }, { 256, 0x5B }, + { 288, 0x5C }, { 320, 0x5D }, { 384, 0x5E }, { 480, 0x5F }, + { 320, 0x60 }, { 384, 0x61 }, { 448, 0x62 }, { 512, 0x63 }, + { 576, 0x64 }, { 640, 0x65 }, { 768, 0x66 }, { 960, 0x67 }, + { 640, 0x68 }, { 768, 0x69 }, { 896, 0x6A }, { 1024, 0x6B }, + { 1152, 0x6C }, { 1280, 0x6D }, { 1536, 0x6E }, { 1920, 0x6F }, + { 1280, 0x70 }, { 1536, 0x71 }, { 1792, 0x72 }, { 2048, 0x73 }, + { 2304, 0x74 }, { 2560, 0x75 }, { 3072, 0x76 }, { 3840, 0x77 }, + { 2560, 0x78 }, { 3072, 0x79 }, { 3584, 0x7A }, { 4096, 0x7B }, + { 4608, 0x7C }, { 5120, 0x7D }, { 6144, 0x7E }, { 7680, 0x7F }, +}; + +static struct 
imx_i2c_clk_pair mul4_i2c_clk_div[] = { + { 80,0x80 }, { 88,0x81 }, { 96,0x82 }, { 104, 0x83 }, + { 112, 0x84 }, { 120, 0x85 }, { 136, 0x86 }, { 160, 0x87 }, + { 112, 0x88 }, { 128, 0x89 }, { 144, 0x8A }, { 160, 0x8B }, + { 176, 0x8C }, { 192, 0x8D }, { 224, 0x8E }, { 272, 0x8F }, + { 192, 0x90 }, { 224, 0x91 }, { 256, 0x92 }, { 288, 0x93 }, + { 320, 0x94 }, { 352, 0x95 }, { 416, 0x96 }, { 512, 0x97 }, + { 320, 0x98 }, { 384, 0x99 }, { 448, 0x9A }, { 512, 0x9B }, + { 576, 0x9C }, { 640, 0x9D }, { 768, 0x9E }, { 960, 0x9F }, + { 640, 0xA0 }, { 768, 0xA1 }, { 896, 0xA2 }, { 1024, 0xA3 }, + { 1152, 0xA4 }, { 1280, 0xA5 }, { 1536, 0xA6 }, { 1792, 0xAA }, + { 1280, 0xA8 }, { 1536, 0xA9 }, { 1920, 0xA7 }, { 2048, 0xAB }, + { 2304, 0xAC }, { 2560, 0xAD }, { 3072, 0xAE }, { 3584, 0xB2 }, + { 2560, 0xB0 }, { 3072, 0xB1 }, { 3820, 0xAF }, { 4096, 0xB3 }, + { 4608, 0xB4 }, { 5120, 0xB5 }, { 6144, 0xB6 }, { 7680, 0xB7 }, + { 5120, 0xB8 }, { 6144, 0xB9 }, { 7168, 0xBA }, { 8192, 0xBB }, + { 9216, 0xBC }, { 10240, 0xBD }, { 12288, 0xBE }, { 15360, 0xBF }, +}; + enum imx_i2c_type { IMX1_I2C, IMX21_I2C, @@ -234,6 +273,24 @@ static struct imx_i2c_hwdata vf610_i2c_hwdata = { }; +static struct imx_i2c_hwdata mul2_i2c_hwdata = { + .devtype= VF610_I2C, + .regshift = VF610_I2C_REGSHIFT, + .clk_div= mul2_i2c_clk_div, + .ndivs = ARRAY_SIZE(mul2_i2c_clk_div), + .i2sr_clr_opcode= I2SR_CLR_OPCODE_W1C, + .i2cr_ien_opcode= I2CR_IEN_OPCODE_0, +}; + +static struct imx_i2c_hwdata mul4_i2c_hwdata = { + .devtype= VF610_I2C, + .regshift = VF610_I2C_REGSHIFT, + .clk_div= mul4_i2c_clk_div, + .ndivs = ARRAY_SIZE(mul4_i2c_clk_div), + .i2sr_clr_opcode= I2SR_CLR_OPCODE_W1C, + .i2cr_ien_opcode= I2CR_IEN_OPCODE_0, +}; + static const struct platform_device_id imx_i2c_devtype[] = { { .name = "imx1-i2c", @@ -1058,6 +1115,7 @@ static int i2c_imx_probe(struct platform_device *pdev) void __iomem *base; int irq, ret; dma_addr_t phy_addr; + u32 mul_value; dev_dbg(>dev, "<%s>\n", __func__); @@ -1077,11 +1135,20 @@ 
static int i2c_imx_probe(struct platform_device *pdev) if (!i2c_imx) return -ENOMEM; - if (of_id) + if (of_id) { i2c_imx->hwdata = of_id->data; - else + ret = of_property_read_u32(pdev->dev.of_node, +
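Reading the two new tables side by side, each MUL=4 entry looks like the matching MUL=2 entry with the divider doubled and the register value moved from the 0x40 range to the 0x80 range. That suggests, as an assumption not confirmed anywhere in this thread, that bits 7:6 of the register value select MUL and bits 5:0 select the base divider. A small sketch of that relationship, with hypothetical names:

```c
#include <stdint.h>

/* Same shape as the driver's divider/value pairs; hypothetical name. */
struct clk_pair {
	uint16_t div;
	uint16_t val;
};

/* Assumed encoding: bits 7:6 of the register value select MUL
 * (0 -> 1, 1 -> 2, 2 -> 4). Returns 0 for the remaining encoding. */
unsigned int ibfd_mul(uint8_t ibfd)
{
	unsigned int sel = ibfd >> 6;

	return sel < 3 ? 1u << sel : 0;
}

/* Derive the MUL=4 entry from a MUL=2 entry under the assumed encoding:
 * double the divider and swap the MUL select bits from 0x40 to 0x80. */
struct clk_pair mul2_to_mul4(struct clk_pair p)
{
	struct clk_pair q;

	q.div = (uint16_t)(p.div * 2);
	q.val = (uint16_t)((p.val & 0x3F) | 0x80);
	return q;
}
```

If the pattern holds, the MUL=2 entry { 1920, 0x6F } maps to { 3840, 0xAF }, whereas the MUL=4 table above lists 3820 for 0xAF, so that entry may be worth checking against the reference manual.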
[PATCH 3/3] arm64: dts: fsl: ls1046a: Add mul-value property of the i2c controller nodes
According to LS1046A Reference Manual, for the i2c controller, you have up to three MUL options available for all divider values. Therefore, we need to determine which MUL to use in the device tree for driver use. The "mul-value" property provides which mul is used in our driver. Signed-off-by: Chuanhua Han --- arch/arm64/boot/dts/freescale/fsl-ls1046a.dtsi | 4 1 file changed, 4 insertions(+) diff --git a/arch/arm64/boot/dts/freescale/fsl-ls1046a.dtsi b/arch/arm64/boot/dts/freescale/fsl-ls1046a.dtsi index b0ef08b090dd..373310e4c0ea 100644 --- a/arch/arm64/boot/dts/freescale/fsl-ls1046a.dtsi +++ b/arch/arm64/boot/dts/freescale/fsl-ls1046a.dtsi @@ -385,6 +385,7 @@ dmas = < 1 39>, < 1 38>; dma-names = "tx", "rx"; + mul-value = <4>; status = "disabled"; }; @@ -395,6 +396,7 @@ reg = <0x0 0x219 0x0 0x1>; interrupts = ; clocks = < 4 1>; + mul-value = <4>; status = "disabled"; }; @@ -405,6 +407,7 @@ reg = <0x0 0x21a 0x0 0x1>; interrupts = ; clocks = < 4 1>; + mul-value = <4>; status = "disabled"; }; @@ -415,6 +418,7 @@ reg = <0x0 0x21b 0x0 0x1>; interrupts = ; clocks = < 4 1>; + mul-value = <4>; status = "disabled"; }; -- 2.17.1
PROBLEM: Elan touchpad regression on Kernel 5.0.10
Hello, [1.] One line summary of the problem: Elan touchpad regression on Kernel 5.0.10 [2.] Full description of the problem/report: Elan touchpad does not work on 5.0.10 while working on 5.0.9 [3.] Keywords: elan_i2c_core elan i2c touchpad 5.0.10 [4.] Kernel information [4.1.] Kernel version: Linux version 5.0.10-arch1-1-ARCH (builduser@heftig-2592) (gcc version 8.3.0 (GCC)) #1 SMP PREEMPT Sat Apr 27 20:06:45 UTC 2019 [4.2.] Kernel .config file: I'm not sure, but I think it may be referring to https://git.archlinux.org/svntogit/packages.git/tree/trunk/config?h=packages/linux [5.] Most recent kernel version which did not have the bug: 5.0.9 [6.] Output of Oops.. message (if applicable) with symbolic information resolved (Not appliable) [7.] A small shell script or example program which triggers the problem: (Not appliable) [8.] Environment [8.1.] Software (add the output of the ver_linux script here) Linux sheltty 5.0.10-arch1-1-ARCH #1 SMP PREEMPT Sat Apr 27 20:06:45 UTC 2019 x86_64 GNU/Linux GNU C 8.3.0 GNU Make4.2.1 Binutils2.32 Util-linux 2.33.2 Mount 2.33.2 Module-init-tools 26 E2fsprogs 1.45.0 Jfsutils1.1.15 Reiserfsprogs 3.6.27 Xfsprogs4.20.0 PPP 2.4.7 Linux C Library 2.29 Dynamic linker (ldd)2.29 Linux C++ Library 6.0.25 Procps 3.3.15 Kbd 2.0.4 Console-tools 2.0.4 Sh-utils8.31 Udev242 Modules Loaded 8021q 8250_dw ac ac97_bus acpi_thermal_rel aesni_intel aes_x86_64 agpgart ahci arc4 atkbd battery bbswitch bluetooth btbcm btintel btrtl btusb cfg80211 coretemp crc16 crc32c_generic crc32c_intel crc32_pclmul crct10dif_pclmul cryptd crypto_simd crypto_user drm drm_kms_helper ecdh_generic elan_i2c evdev ext4 fat fb_sys_fops fscrypto garp ghash_clmulni_intel glue_helper hid hid_generic i2c_algo_bit i2c_hid i2c_i801 i8042 i915 idma64 input_leds int3400_thermal int3403_thermal int340x_thermal_zone intel_cstate intel_gtt intel_lpss intel_lpss_pci intel_pch_thermal intel_powerclamp intel_rapl intel_rapl_perf intel_soc_dts_iosf intel_uncore intel_wmi_thunderbolt 
ip_tables irqbypass iTCO_vendor_support iTCO_wdt jbd2 joydev kvm kvmgt kvm_intel ledtrig_audio libahci libata libphy libps2 llc mac80211 mac_hid mbcache mdev media mei mei_me mousedev mrp nls_cp437 nls_iso8859_1 pcc_cpufreq processor_thermal_device r8169 r8822be realtek rfkill rng_core scsi_mod serio serio_raw snd snd_compress snd_hda_codec snd_hda_codec_generic snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_core snd_hda_ext_core snd_hda_intel snd_hwdep snd_pcm snd_pcm_dmaengine snd_soc_acpi snd_soc_acpi_intel_match snd_soc_core snd_soc_hdac_hda snd_soc_skl snd_soc_skl_ipc snd_soc_sst_dsp snd_soc_sst_ipc snd_timer soundcore stp syscopyarea sysfillrect sysimgblt tpm tpm_crb tpm_tis tpm_tis_core typec typec_ucsi ucsi_acpi usbhid uvcvideo vfat vfio vfio_iommu_type1 vfio_mdev videobuf2_common videobuf2_memops videobuf2_v4l2 videobuf2_vmalloc videodev wmi wmi_bmof x86_pkg_temp_thermal xhci_hcd xhci_pci x_tables [8.2.] Processor information (from /proc/cpuinfo): (Maybe not appliable) [8.3.] Module information (from /proc/modules): (Parts related to i2c and elan:) i2c_algo_bit 16384 1 i915, Live 0x i2c_hid 32768 0 - Live 0x hid 147456 3 hid_generic,usbhid,i2c_hid, Live 0x elan_i2c 49152 0 - Live 0x i2c_i801 36864 0 - Live 0x [8.4.] Loaded driver and hardware information (/proc/ioports, /proc/iomem) /proc/ioports: - : PCI Bus :00 - : dma1 - : pic1 - : iTCO_wdt - : timer0 - : timer1 - : keyboard - : PNP0C09:00 - : EC data - : keyboard - : PNP0C09:00 - : EC cmd - : rtc0 - : dma page reg - : pic2 - : dma2 - : fpu - : PNP0C04:00 - : iTCO_wdt - : pnp 00:02 - : PCI conf1 - : PCI Bus :00 - : pnp 00:02 - : pnp 00:00 - : ACPI PM1a_EVT_BLK - : ACPI PM1a_CNT_BLK - : ACPI PM_TMR - : ACPI CPU throttle - : ACPI PM2_CNT_BLK - : pnp 00:04 - : ACPI GPE0_BLK - : pnp 00:01 - : PCI Bus :08 - : :08:00.0 - : PCI Bus :07 - : :07:00.0 - : r8822be - : PCI Bus :01 - : :01:00.0 - : :00:02.0 - : :00:1f.4 - : i801_smbus - : :00:17.0 - : ahci - : :00:17.0 - : ahci - : :00:17.0 - : ahci [8.5.] 
PCI information It seems to be long
Re: [RFC][PATCHSET] sorting out RCU-delayed stuff in ->destroy_inode()
On Mon, Apr 29, 2019 at 10:18:04PM -0600, Andreas Dilger wrote: > > > > void*i_private; /* fs or device private pointer */ > > + void (*free_inode)(struct inode *); > > It seems like a waste to increase the size of every struct inode just to > access > a static pointer. Is this the only place that ->free_inode() is called? Why > not move the ->free_inode() pointer into inode->i_fop->free_inode() so that it > is still directly accessible at this point. i_op, surely? In any case, increasing sizeof(struct inode) is not a problem - if anything, I'd turn ->i_fop into an anon union with that. As in, diff --git a/Documentation/filesystems/porting b/Documentation/filesystems/porting index 9d80f9e0855e..b8d3ddd8b8db 100644 --- a/Documentation/filesystems/porting +++ b/Documentation/filesystems/porting @@ -655,3 +655,11 @@ in your dentry operations instead. * if ->free_inode() is non-NULL, it gets scheduled by call_rcu() * combination of NULL ->destroy_inode and NULL ->free_inode is treated as NULL/free_inode_nonrcu, to preserve the compatibility. + + Note that the callback (be it via ->free_inode() or explicit call_rcu() + in ->destroy_inode()) is *NOT* ordered wrt superblock destruction; + as the matter of fact, the superblock and all associated structures + might be already gone. The filesystem driver is guaranteed to be still + there, but that's it. Freeing memory in the callback is fine; doing + more than that is possible, but requires a lot of care and is best + avoided. 
diff --git a/fs/inode.c b/fs/inode.c index fb45590d284e..627e1766503a 100644 --- a/fs/inode.c +++ b/fs/inode.c @@ -211,8 +211,8 @@ EXPORT_SYMBOL(free_inode_nonrcu); static void i_callback(struct rcu_head *head) { struct inode *inode = container_of(head, struct inode, i_rcu); - if (inode->i_sb->s_op->free_inode) - inode->i_sb->s_op->free_inode(inode); + if (inode->free_inode) + inode->free_inode(inode); else free_inode_nonrcu(inode); } @@ -236,6 +236,7 @@ static struct inode *alloc_inode(struct super_block *sb) if (!ops->free_inode) return NULL; } + inode->free_inode = ops->free_inode; i_callback(>i_rcu); return NULL; } @@ -276,6 +277,7 @@ static void destroy_inode(struct inode *inode) if (!ops->free_inode) return; } + inode->free_inode = ops->free_inode; call_rcu(>i_rcu, i_callback); } diff --git a/include/linux/fs.h b/include/linux/fs.h index 2e9b9f87caca..92732286b748 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -694,7 +694,10 @@ struct inode { #ifdef CONFIG_IMA atomic_ti_readcount; /* struct files open RO */ #endif - const struct file_operations*i_fop; /* former ->i_op->default_file_ops */ + union { + const struct file_operations*i_fop; /* former ->i_op->default_file_ops */ + void (*free_inode)(struct inode *); + }; struct file_lock_context*i_flctx; struct address_spacei_data; struct list_headi_devices;
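For readers following along, here is a userspace sketch of the dispatch in the diff above. It is an illustration under stated assumptions, not kernel code: struct rcu_head is a stand-in, call_rcu() is replaced by a direct call (the real one defers the callback until after a grace period), and the counters and example_fs_free_inode() are invented so the two paths can be observed.

```c
#include <stddef.h>
#include <stdlib.h>

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct rcu_head {
	void (*func)(struct rcu_head *);	/* stand-in, unused here */
};

struct file_operations;				/* opaque in this sketch */

struct inode {
	/* i_fop is no longer needed once the inode is being destroyed, so
	 * the same storage can carry the free callback. */
	union {
		const struct file_operations *i_fop;
		void (*free_inode)(struct inode *);
	};
	struct rcu_head i_rcu;
};

int nr_default_frees;
int nr_custom_frees;

/* Default path, used when the filesystem supplies no ->free_inode(). */
void free_inode_nonrcu(struct inode *inode)
{
	nr_default_frees++;
	free(inode);
}

/* Hypothetical filesystem callback: only frees memory. */
void example_fs_free_inode(struct inode *inode)
{
	nr_custom_frees++;
	free(inode);
}

/* The callback touches only the inode itself, never inode->i_sb. */
void i_callback(struct rcu_head *head)
{
	struct inode *inode = container_of(head, struct inode, i_rcu);

	if (inode->free_inode)
		inode->free_inode(inode);
	else
		free_inode_nonrcu(inode);
}

void destroy_inode_sketch(struct inode *inode,
			  void (*fs_free_inode)(struct inode *))
{
	inode->free_inode = fs_free_inode;	/* overwrites i_fop */
	i_callback(&inode->i_rcu);		/* real code: call_rcu() */
}
```

The point of copying the pointer into the inode before scheduling the callback is that the callback then never has to dereference the superblock, which may already be gone by the time it runs.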
Re: [PATCH RESEND] sched/cpufreq: Fix kobject memleak
* Tobin C. Harding wrote: > Currently error return from kobject_init_and_add() is not followed by a > call to kobject_put(). This means there is a memory leak. > > Add call to kobject_put() in error path of kobject_init_and_add(). > > Signed-off-by: Tobin C. Harding > --- > > Resend with SOB tag. Please ignore my previous mail :-) Thanks, Ingo
Re: [PATCH] sched/cpufreq: Fix kobject memleak
* Tobin C. Harding wrote: > Currently error return from kobject_init_and_add() is not followed by a > call to kobject_put(). This means there is a memory leak. > > Add call to kobject_put() in error path of kobject_init_and_add(). > --- > kernel/sched/cpufreq_schedutil.c | 1 + > 1 file changed, 1 insertion(+) I've added your: Signed-off-by: Tobin C. Harding Which I suppose you intended to include? Thanks, Ingo
Re: [PATCH 1/2] RISC-V: Add DT documentation for SiFive L2 Cache Controller
On Fri, Apr 26, 2019 at 3:04 PM Sudeep Holla wrote: > > On Fri, Apr 26, 2019 at 11:20:17AM +0530, Yash Shah wrote: > > On Thu, Apr 25, 2019 at 3:43 PM Sudeep Holla wrote: > > > > > > On Thu, Apr 25, 2019 at 11:24:55AM +0530, Yash Shah wrote: > > > > Add device tree bindings for SiFive FU540 L2 cache controller driver > > > > > > > > Signed-off-by: Yash Shah > > > > --- > > > > .../devicetree/bindings/riscv/sifive-l2-cache.txt | 53 > > > > ++ > > > > 1 file changed, 53 insertions(+) > > > > create mode 100644 > > > > Documentation/devicetree/bindings/riscv/sifive-l2-cache.txt > > > > > > > > diff --git > > > > a/Documentation/devicetree/bindings/riscv/sifive-l2-cache.txt > > > > b/Documentation/devicetree/bindings/riscv/sifive-l2-cache.txt > > > > new file mode 100644 > > > > index 000..15132e2 > > > > --- /dev/null > > > > +++ b/Documentation/devicetree/bindings/riscv/sifive-l2-cache.txt > > > > @@ -0,0 +1,53 @@ > > > > +SiFive L2 Cache Controller > > > > +-- > > > > +The SiFive Level 2 Cache Controller is used to provide access to fast > > > > copies > > > > +of memory for masters in a Core Complex. The Level 2 Cache Controller > > > > also > > > > +acts as directory-based coherency manager. 
> > > > + > > > > +Required Properties: > > > > + > > > > +- compatible: Should be "sifive,fu540-c000-ccache" > > > > + > > > > +- cache-block-size: Specifies the block size in bytes of the cache > > > > + > > > > +- cache-level: Should be set to 2 for a level 2 cache > > > > + > > > > +- cache-sets: Specifies the number of associativity sets of the cache > > > > + > > > > +- cache-size: Specifies the size in bytes of the cache > > > > + > > > > +- cache-unified: Specifies the cache is a unified cache > > > > + > > > > +- interrupt-parent: Must be core interrupt controller > > > > + > > > > +- interrupts: Must contain 3 entries (DirError, DataError and DataFail > > > > signals) > > > > + > > > > +- reg: Physical base address and size of L2 cache controller registers > > > > map > > > > + > > > > +- reg-names: Should be "control" > > > > + > > > > > > It would be good if you mark the properties that are present in DT > > > specification and those that are added for sifive,fu540-c000-ccache > > > > I believe there isn't any property which is added explicitly for > > sifive,fu540-c000-ccache. > > > > reg and interrupts are generally optional for normal cache and may be > required for cache controller like this. DT specification[1] covers > only caches and not cache controllers. 
Are you suggesting something like this: Required Properties: Standard Properties: - compatible: Should be "sifive,-ccache" Supported compatible strings are: "sifive,fu540-c000-ccache" and "sifive,fu740-c000-ccache" - cache-block-size: Specifies the block size in bytes of the cache - cache-level: Should be set to 2 for a level 2 cache - cache-sets: Specifies the number of associativity sets of the cache - cache-size: Specifies the size in bytes of the cache - cache-unified: Specifies the cache is a unified cache Non-Standard Properties: - interrupt-parent: Must be core interrupt controller - interrupts: Must contain 3 entries for FU540 (DirError, DataError and DataFail signals) or 4 entries for other chips (DirError, DirFail, DataError, DataFail signals) - reg: Physical base address and size of L2 cache controller registers map - reg-names: Should be "control" - Yash > > -- > Regards, > Sudeep > > [1] > https://github.com/devicetree-org/devicetree-specification/releases/download/v0.2/devicetree-specification-v0.2.pdf
Re: [PATCH v4 1/7] ocxl: Split pci.c
On 27/3/19 4:31 pm, Alastair D'Silva wrote: From: Alastair D'Silva In preparation for making core code available for external drivers, move the core code out of pci.c and into core.c Signed-off-by: Alastair D'Silva There doesn't seem to be much left in pci.c, is there? Acked-by: Andrew Donnellan --- drivers/misc/ocxl/Makefile| 1 + drivers/misc/ocxl/core.c | 517 + drivers/misc/ocxl/ocxl_internal.h | 5 + drivers/misc/ocxl/pci.c | 519 +- 4 files changed, 524 insertions(+), 518 deletions(-) create mode 100644 drivers/misc/ocxl/core.c diff --git a/drivers/misc/ocxl/Makefile b/drivers/misc/ocxl/Makefile index 5229dcda8297..bc4e39bfda7b 100644 --- a/drivers/misc/ocxl/Makefile +++ b/drivers/misc/ocxl/Makefile @@ -3,6 +3,7 @@ ccflags-$(CONFIG_PPC_WERROR)+= -Werror ocxl-y+= main.o pci.o config.o file.o pasid.o ocxl-y+= link.o context.o afu_irq.o sysfs.o trace.o +ocxl-y += core.o obj-$(CONFIG_OCXL)+= ocxl.o # For tracepoints to include our trace.h from tracepoint infrastructure: diff --git a/drivers/misc/ocxl/core.c b/drivers/misc/ocxl/core.c new file mode 100644 index ..1a4411b72d35 --- /dev/null +++ b/drivers/misc/ocxl/core.c @@ -0,0 +1,517 @@ +// SPDX-License-Identifier: GPL-2.0+ +// Copyright 2019 IBM Corp. +#include +#include "ocxl_internal.h" + +static struct ocxl_fn *ocxl_fn_get(struct ocxl_fn *fn) +{ + return (get_device(>dev) == NULL) ? NULL : fn; +} + +static void ocxl_fn_put(struct ocxl_fn *fn) +{ + put_device(>dev); +} + +struct ocxl_afu *ocxl_afu_get(struct ocxl_afu *afu) +{ + return (get_device(>dev) == NULL) ? 
NULL : afu; +} + +void ocxl_afu_put(struct ocxl_afu *afu) +{ + put_device(>dev); +} + +static struct ocxl_afu *alloc_afu(struct ocxl_fn *fn) +{ + struct ocxl_afu *afu; + + afu = kzalloc(sizeof(struct ocxl_afu), GFP_KERNEL); + if (!afu) + return NULL; + + mutex_init(>contexts_lock); + mutex_init(>afu_control_lock); + idr_init(>contexts_idr); + afu->fn = fn; + ocxl_fn_get(fn); + return afu; +} + +static void free_afu(struct ocxl_afu *afu) +{ + idr_destroy(>contexts_idr); + ocxl_fn_put(afu->fn); + kfree(afu); +} + +static void free_afu_dev(struct device *dev) +{ + struct ocxl_afu *afu = to_ocxl_afu(dev); + + ocxl_unregister_afu(afu); + free_afu(afu); +} + +static int set_afu_device(struct ocxl_afu *afu, const char *location) +{ + struct ocxl_fn *fn = afu->fn; + int rc; + + afu->dev.parent = >dev; + afu->dev.release = free_afu_dev; + rc = dev_set_name(>dev, "%s.%s.%hhu", afu->config.name, location, + afu->config.idx); + return rc; +} + +static int assign_afu_actag(struct ocxl_afu *afu, struct pci_dev *dev) +{ + struct ocxl_fn *fn = afu->fn; + int actag_count, actag_offset; + + /* +* if there were not enough actags for the function, each afu +* reduces its count as well +*/ + actag_count = afu->config.actag_supported * + fn->actag_enabled / fn->actag_supported; + actag_offset = ocxl_actag_afu_alloc(fn, actag_count); + if (actag_offset < 0) { + dev_err(>dev, "Can't allocate %d actags for AFU: %d\n", + actag_count, actag_offset); + return actag_offset; + } + afu->actag_base = fn->actag_base + actag_offset; + afu->actag_enabled = actag_count; + + ocxl_config_set_afu_actag(dev, afu->config.dvsec_afu_control_pos, + afu->actag_base, afu->actag_enabled); + dev_dbg(>dev, "actag base=%d enabled=%d\n", + afu->actag_base, afu->actag_enabled); + return 0; +} + +static void reclaim_afu_actag(struct ocxl_afu *afu) +{ + struct ocxl_fn *fn = afu->fn; + int start_offset, size; + + start_offset = afu->actag_base - fn->actag_base; + size = afu->actag_enabled; + 
ocxl_actag_afu_free(afu->fn, start_offset, size); +} + +static int assign_afu_pasid(struct ocxl_afu *afu, struct pci_dev *dev) +{ + struct ocxl_fn *fn = afu->fn; + int pasid_count, pasid_offset; + + /* +* We only support the case where the function configuration +* requested enough PASIDs to cover all AFUs. +*/ + pasid_count = 1 << afu->config.pasid_supported_log; + pasid_offset = ocxl_pasid_afu_alloc(fn, pasid_count); + if (pasid_offset < 0) { + dev_err(>dev, "Can't allocate %d PASIDs for AFU: %d\n", + pasid_count, pasid_offset); + return pasid_offset; + } + afu->pasid_base = fn->pasid_base + pasid_offset; + afu->pasid_count = 0; + afu->pasid_max = pasid_count; + + ocxl_config_set_afu_pasid(dev,
Re: [PATCH V2] staging: fieldbus: anybus-s: force endiannes annotation
On Tue, Apr 30, 2019 at 05:33:10AM +0200, Nicholas Mc Guire wrote: > ok - my bad thn - I had assumed that using __force is reasonable > if the handling is correct and its a localized conversoin only > like var = be16_to_cpu(var) which evaded introducing additinal > variables just to have different types but no different function. If compiler can't recognize that in T1 v1; T2 v2; code using v1, but not v2 v2 = f(v1); code using v2, but not v1 it can use the same memory for v1 and v2, file a bug against the compiler. Or stop using that toy altogether - that kind of optimizations is early 60s stuff and any real compiler will handle that. Both gcc and clang certainly do handle that. Another thing they handle is figuring out that be16_to_cpu() et.al. are pure functions, so f(be16_to_cpu(n)); no modifications of n g(be16_to_cpu(n)); doesn't need to have le16_to_cpu recalculated. IOW, that particular code could as well have been dev_info(dev, "Fieldbus type: %04X", be16_to_cpu(fieldbus_type)); ... cd->client->fieldbus_type = be16_to_cpu(fieldbus_type); ... not that there's much sense keeping ->fieldbus_type in host-endian, while we are at it.
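The point about keeping differently typed values in separate variables, and converting at each use, can be sketched in plain userspace C. This is only an illustration: be16_t, be16_to_host(), struct client and store_fieldbus_type() are hypothetical stand-ins, not the kernel's __be16 / be16_to_cpu() or the anybus-s driver code.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for a big-endian 16-bit wire value. Using a
 * distinct type means a wire value cannot be mixed up with a
 * host-endian integer without an explicit conversion.
 */
typedef struct { uint8_t b[2]; } be16_t;

/* Pure function: the result depends only on the argument. */
static inline uint16_t be16_to_host(be16_t v)
{
	return (uint16_t)((v.b[0] << 8) | v.b[1]);
}

/* Hypothetical client structure holding the host-endian copy. */
struct client {
	uint16_t fieldbus_type;
};

/*
 * Convert at each point of use instead of overwriting one variable
 * through a forced cast. The wire value keeps its own type for its
 * whole lifetime; a compiler is free to evaluate the conversion once.
 */
static uint16_t store_fieldbus_type(struct client *c, be16_t wire)
{
	c->fieldbus_type = be16_to_host(wire);
	return be16_to_host(wire);
}
```

No extra storage needs to be hand-managed here: the wire-typed and host-typed values never share a name, which is the property a checker such as sparse relies on.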
Re: [RFC][PATCHSET] sorting out RCU-delayed stuff in ->destroy_inode()
On Apr 29, 2019, at 9:09 PM, Al Viro wrote: > > On Tue, Apr 16, 2019 at 11:01:16AM -0700, Linus Torvalds wrote: >> >> I only skimmed through the actual filesystem (and one networking) >> patches, but they looked like trivial conversions to a better >> interface. > > ... except that this callback can (and always could) get executed after > freeing struct super_block. So we can't just dereference ->i_sb->s_op > and expect to survive; the table ->s_op pointed to will still be there, > but ->i_sb might very well have been freed, with all its contents overwritten. > We need to copy the callback into struct inode itself, unfortunately. > The following incremental fixes it; I'm going to fold it into the first > commit in there. > > diff --git a/fs/inode.c b/fs/inode.c > index fb45590d284e..855dad43b11d 100644 > --- a/fs/inode.c > +++ b/fs/inode.c > @@ -164,6 +164,7 @@ int inode_init_always(struct super_block *sb, struct > inode *inode) > inode->i_wb_frn_avg_time = 0; > inode->i_wb_frn_history = 0; > #endif > + inode->free_inode = sb->s_op->free_inode; > > if (security_inode_alloc(inode)) > goto out; > @@ -211,8 +212,8 @@ EXPORT_SYMBOL(free_inode_nonrcu); > static void i_callback(struct rcu_head *head) > { > struct inode *inode = container_of(head, struct inode, i_rcu); > - if (inode->i_sb->s_op->free_inode) > - inode->i_sb->s_op->free_inode(inode); > + if (inode->free_inode) > + inode->free_inode(inode); > else > free_inode_nonrcu(inode); > } > diff --git a/include/linux/fs.h b/include/linux/fs.h > index 2e9b9f87caca..5ed6b39e588e 100644 > --- a/include/linux/fs.h > +++ b/include/linux/fs.h > @@ -718,6 +718,7 @@ struct inode { > #endif > > void*i_private; /* fs or device private pointer */ > + void (*free_inode)(struct inode *); It seems like a waste to increase the size of every struct inode just to access a static pointer. Is this the only place that ->free_inode() is called? 
Why not move the ->free_inode() pointer into inode->i_fop->free_inode() so that it is still directly accessible at this point. Cheers, Andreas
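The idea in the quoted patch (copy the callback into the object when it is set up, so the deferred free never has to go through the owner) can be sketched in userspace C. This is a simplified model under stated assumptions: struct owner, struct obj, obj_alloc() and deferred_free() are hypothetical names, and RCU is not modelled at all; the deferred callback is just a function called later.

```c
#include <stdlib.h>

struct obj;

/* Static operations table; it outlives any individual owner. */
struct ops {
	void (*free_obj)(struct obj *);
};

/* Stand-in for the owning structure, which may be freed early. */
struct owner {
	const struct ops *ops;
};

struct obj {
	struct owner *owner;
	void (*free_obj)(struct obj *);	/* copied from owner->ops at init */
};

static int custom_frees;	/* counts calls, for demonstration only */

static void custom_free(struct obj *o)
{
	custom_frees++;
	free(o);
}

static const struct ops custom_ops = { .free_obj = custom_free };

static struct obj *obj_alloc(struct owner *ow)
{
	struct obj *o = calloc(1, sizeof(*o));

	if (!o)
		return NULL;
	o->owner = ow;
	/* Cache the callback while the owner is known to be alive. */
	o->free_obj = ow->ops->free_obj;
	return o;
}

/*
 * Runs "later", possibly after the owner is gone: it must not touch
 * o->owner, only the pointer cached in the object itself.
 */
static void deferred_free(struct obj *o)
{
	if (o->free_obj)
		o->free_obj(o);
	else
		free(o);
}
```

The cost is one extra pointer per object, which is the trade-off being questioned in the reply above.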
Re: [RFC][PATCHSET] sorting out RCU-delayed stuff in ->destroy_inode()
On Mon, Apr 29, 2019 at 08:37:29PM -0700, Linus Torvalds wrote: > On Mon, Apr 29, 2019, 20:09 Al Viro wrote: > > > > > ... except that this callback can (and always could) get executed after > > freeing struct super_block. > > > > Ugh. > > That food looks nasty. Shouldn't the super block freeing wait for the > filesystem to be all done instead? Do a rcu synchronization or something? > > Adding that pointer looks really wrong to me. I'd much rather delay the sb > freeing. Is there some reason that can't be done that I'm missing? Where would you put that synchronize_rcu()? Doing that before ->put_super() is too early - inode references might be dropped in there. OTOH, doing that after that point means that while struct super_block itself will be there, any number of data structures hanging from it might be not. So we are still very limited in what we can do inside ->free_inode() instance *and* we get bunch of synchronize_rcu() for no good reason. Note that for normal lockless accesses (lockless ->d_revalidate(), ->d_hash(), etc.) we are just fine with having struct super_block freeing RCU-delayed (along with any data structures we might need) - the superblock had been seen at some point after we'd taken rcu_read_lock(), so its freeing won't happen until we drop it. So we don't need synchronize_rcu() for that. Here the problem is that we are dealing with another RCU callback; synchronize_rcu() would be needed for it, but it will only protect that intermediate dereference of ->i_sb; any rcu-delayed stuff scheduled from inside ->put_super() would not be ordered wrt ->free_inode(). And if we are doing that just for the sake of that one dereference, we might as well do it before scheduling i_callback(). PS: we *are* guaranteed that module will still be there (unregister_filesystem() does synchronize_rcu() and rcu_barrier() is done before kmem_cache_destroy() in assorted exit_foo_fs()).
linux-next: manual merge of the mlx5-next tree with the rdma tree
Hi Leon, Today's linux-next merge of the mlx5-next tree got a conflict in: drivers/infiniband/hw/mlx5/main.c between commit: 35b0aa67b298 ("RDMA/mlx5: Refactor netdev affinity code") from the rdma tree and commit: c42260f19545 ("net/mlx5: Separate and generalize dma device from pci device") from the mlx5-next tree. I fixed it up (see below) and can carry the fix as necessary. This is now fixed as far as linux-next is concerned, but any non trivial conflicts should be mentioned to your upstream maintainer when your tree is submitted for merging. You may also want to consider cooperating with the maintainer of the conflicting tree to minimise any particularly complex conflicts. -- Cheers, Stephen Rothwell diff --cc drivers/infiniband/hw/mlx5/main.c index 6135a0b285de,fae6a6a1fbea.. --- a/drivers/infiniband/hw/mlx5/main.c +++ b/drivers/infiniband/hw/mlx5/main.c @@@ -200,12 -172,18 +200,12 @@@ static int mlx5_netdev_event(struct not switch (event) { case NETDEV_REGISTER: + /* Should already be registered during the load */ + if (ibdev->is_rep) + break; write_lock(>netdev_lock); - if (ndev->dev.parent == >pdev->dev) - if (ibdev->rep) { - struct mlx5_eswitch *esw = ibdev->mdev->priv.eswitch; - struct net_device *rep_ndev; - - rep_ndev = mlx5_ib_get_rep_netdev(esw, -ibdev->rep->vport); - if (rep_ndev == ndev) - roce->netdev = ndev; - } else if (ndev->dev.parent == mdev->device) { ++ if (ndev->dev.parent == mdev->device) roce->netdev = ndev; - } write_unlock(>netdev_lock); break;
linux-next: build warning after merge of the thermal tree
Hi Zhang, After merging the thermal tree, today's linux-next build (arm multi_v7_defconfig) produced this warning: boolean symbol THERMAL tested for 'm'? test forced to 'n' Introduced by commit be33e4fbbea5 ("thermal/drivers/core: Remove the module Kconfig's option") There is a test for =m in drivers/net/ethernet/mellanox/mlxsw/Kconfig. -- Cheers, Stephen Rothwell
[PATCH v6 0/4] x86: Add the support of ACRN guest under x86
ACRN is a flexible, lightweight reference hypervisor, built with real-time and safety-criticality in mind, optimized to streamline embedded development through an open source platform. It is built for embedded IOT with small footprint and real-time features. More details can be found in https://projectacrn.org/ This is the patch set that allows the Linux to work on ACRN hypervisor and it can work with the following patch set to manage the Linux guest on ACRN hypervisor. It includes the detection of ACRN hypervisor, upcall notification vector from hypervisor, hypercall. The hypervisor detection is similar to Xen/VMWARE/Hyperv. ACRN also uses the upcall notification mechanism similar to that in Xen/Microsoft HyperV when it needs to send the notification to Linux guest. The hypercall provides the mechanism that can be used to query/configure the ACRN hypervisor by Linux guest. Following this patch set, we will send acrn driver part, which provides the interface that can be used to manage the virtualized CPU/memory/device/interrupt for other guest OS after the ACRN hypervisor is detected. v1->v2: Change the CONFIG_ACRN to CONFIG_ACRN_GUEST, which makes it easy to understand. Remove the export of x86_hyper_acrn. Remove the unused API definition of acrn_setup_intr_handler and acrn_remove_intr_handler. Adjust the order of header file Add the declaration of acrn_hv_vector_handler and tracing definition of acrn_hv_callback_vector. 
Refine the comments for the function of acrn_hypercall0/1/2 v2-v3: Add one new config symbol to unify the conditional definition of hv_irq_callback_count Use the "vmcall" mnemonic to replace the hard-code byte definition Remove the unnecessary dependency of CONFIG_PARAVIRT for ACRN_GUEST v3-v4: Rename the file name of acrnhyper.h to acrn.h Refine the commit log and some other minor changes(more comments and redundant ifdef in acrn.h, sorting the header file in acrn.c) v4->v5: Minor changes of comments/commit log in patch 04 Use _ASM_X86_ACRN_HYPERCALL_H instead of _ASM_X86_ACRNHYPERCALL_H. Use the "VMCALL" mnemonic in comment/commit log. Uppercase r8/rdi/rsi/rax for hypercall parameter register in comment. v5->v6: Remove the explicit register variable for inline assembly Add the "extern" for the function declaration in acrn.h Add comments about acking ACPI EOI in acrn_hv_callback_handler Minor changes for comments/commit log in patch 03/04 Zhao Yakui (4): x86/Kconfig: Add new config symbol to unify conditional definition of hv_irq_callback_count x86: Add the support of Linux guest on ACRN hypervisor x86/acrn: Use HYPERVISOR_CALLBACK_VECTOR for ACRN guest upcall vector x86/acrn: Add hypercall for ACRN guest arch/x86/Kconfig | 16 +++ arch/x86/entry/entry_64.S | 5 +++ arch/x86/include/asm/acrn.h | 11 + arch/x86/include/asm/acrn_hypercall.h | 84 +++ arch/x86/include/asm/hardirq.h| 2 +- arch/x86/include/asm/hypervisor.h | 1 + arch/x86/kernel/cpu/Makefile | 1 + arch/x86/kernel/cpu/acrn.c| 68 arch/x86/kernel/cpu/hypervisor.c | 4 ++ arch/x86/kernel/irq.c | 2 +- arch/x86/xen/Kconfig | 1 + drivers/hv/Kconfig| 1 + 12 files changed, 194 insertions(+), 2 deletions(-) create mode 100644 arch/x86/include/asm/acrn.h create mode 100644 arch/x86/include/asm/acrn_hypercall.h create mode 100644 arch/x86/kernel/cpu/acrn.c -- 2.7.4
[PATCH v6 1/4] x86/Kconfig: Add new config symbol to unify conditional definition of hv_irq_callback_count
Add a special Kconfig symbol X86_HV_CALLBACK_VECTOR so that the guests using the hypervisor interrupt callback counter can select and thus enable that counter. Select it when xen or hyperv support is enabled. No functional changes. Signed-off-by: Zhao Yakui Reviewed-by: Borislav Petkov Reviewed-by: Thomas Gleixner --- v3->v4: Follow the comments to refine the commit log. --- arch/x86/Kconfig | 3 +++ arch/x86/include/asm/hardirq.h | 2 +- arch/x86/kernel/irq.c | 2 +- arch/x86/xen/Kconfig | 1 + drivers/hv/Kconfig | 1 + 5 files changed, 7 insertions(+), 2 deletions(-) diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 62fc3fd..2fc9297 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -791,6 +791,9 @@ config QUEUED_LOCK_STAT behavior of paravirtualized queued spinlocks and report them on debugfs. +config X86_HV_CALLBACK_VECTOR + def_bool n + source "arch/x86/xen/Kconfig" config KVM_GUEST diff --git a/arch/x86/include/asm/hardirq.h b/arch/x86/include/asm/hardirq.h index d9069bb..0753379 100644 --- a/arch/x86/include/asm/hardirq.h +++ b/arch/x86/include/asm/hardirq.h @@ -37,7 +37,7 @@ typedef struct { #ifdef CONFIG_X86_MCE_AMD unsigned int irq_deferred_error_count; #endif -#if IS_ENABLED(CONFIG_HYPERV) || defined(CONFIG_XEN) +#ifdef CONFIG_X86_HV_CALLBACK_VECTOR unsigned int irq_hv_callback_count; #endif #if IS_ENABLED(CONFIG_HYPERV) diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c index 59b5f2e..a147826 100644 --- a/arch/x86/kernel/irq.c +++ b/arch/x86/kernel/irq.c @@ -134,7 +134,7 @@ int arch_show_interrupts(struct seq_file *p, int prec) seq_printf(p, "%10u ", per_cpu(mce_poll_count, j)); seq_puts(p, " Machine check polls\n"); #endif -#if IS_ENABLED(CONFIG_HYPERV) || defined(CONFIG_XEN) +#ifdef CONFIG_X86_HV_CALLBACK_VECTOR if (test_bit(HYPERVISOR_CALLBACK_VECTOR, system_vectors)) { seq_printf(p, "%*s: ", prec, "HYP"); for_each_online_cpu(j) diff --git a/arch/x86/xen/Kconfig b/arch/x86/xen/Kconfig index e07abef..ba5a418 100644 --- 
a/arch/x86/xen/Kconfig +++ b/arch/x86/xen/Kconfig @@ -7,6 +7,7 @@ config XEN bool "Xen guest support" depends on PARAVIRT select PARAVIRT_CLOCK + select X86_HV_CALLBACK_VECTOR depends on X86_64 || (X86_32 && X86_PAE) depends on X86_LOCAL_APIC && X86_TSC help diff --git a/drivers/hv/Kconfig b/drivers/hv/Kconfig index 1c1a251..cafcb97 100644 --- a/drivers/hv/Kconfig +++ b/drivers/hv/Kconfig @@ -6,6 +6,7 @@ config HYPERV tristate "Microsoft Hyper-V client drivers" depends on X86 && ACPI && X86_LOCAL_APIC && HYPERVISOR_GUEST select PARAVIRT + select X86_HV_CALLBACK_VECTOR help Select this option to run Linux as a Hyper-V client operating system. -- 2.7.4
[PATCH v6 4/4] x86/acrn: Add hypercall for ACRN guest
When the ACRN hypervisor is detected, the hypercall is needed so that the ACRN guest can query/config some settings. For example: it can be used to query the resources in hypervisor and manage the CPU/memory/device/ interrupt for guest operating system. Add the hypercall so that the ACRN guest can communicate with the low-level ACRN hypervisor. On x86 it is implemented with the VMCALL instruction. Co-developed-by: Jason Chen CJ Signed-off-by: Jason Chen CJ Signed-off-by: Zhao Yakui Reviewed-by: Thomas Gleixner --- V1->V2: Refine the comments for the function of acrn_hypercall0/1/2 v2->v3: Use the "vmcall" mnemonic to replace hard-code byte definition v4->v5: Use _ASM_X86_ACRN_HYPERCALL_H instead of _ASM_X86_ACRNHYPERCALL_H. Use the "VMCALL" mnemonic in comment/commit log. Uppercase r8/rdi/rsi/rax for hypercall parameter register in comment. v5->v6: Remove explicit local register variable for inline assembly --- arch/x86/include/asm/acrn_hypercall.h | 84 +++ 1 file changed, 84 insertions(+) create mode 100644 arch/x86/include/asm/acrn_hypercall.h diff --git a/arch/x86/include/asm/acrn_hypercall.h b/arch/x86/include/asm/acrn_hypercall.h new file mode 100644 index 000..5cb438e --- /dev/null +++ b/arch/x86/include/asm/acrn_hypercall.h @@ -0,0 +1,84 @@ +/* SPDX-License-Identifier: GPL-2.0 */ + +#ifndef _ASM_X86_ACRN_HYPERCALL_H +#define _ASM_X86_ACRN_HYPERCALL_H + +#include + +#ifdef CONFIG_ACRN_GUEST + +/* + * Hypercalls for ACRN guest + * + * Hypercall number is passed in R8 register. + * Up to 2 arguments are passed in RDI, RSI. + * Return value will be placed in RAX. + */ + +static inline long acrn_hypercall0(unsigned long hcall_id) +{ + long result; + + /* the hypercall is implemented with the VMCALL instruction. +* volatile qualifier is added to avoid that it is dropped +* because of compiler optimization. 
+*/ + asm volatile("movq %[hcall_id], %%r8\n\t" +"vmcall\n\t" +: "=a" (result) +: [hcall_id] "g" (hcall_id) +: "r8"); + + return result; +} + +static inline long acrn_hypercall1(unsigned long hcall_id, + unsigned long param1) +{ + long result; + + asm volatile("movq %[hcall_id], %%r8\n\t" +"vmcall\n\t" +: "=a" (result) +: [hcall_id] "g" (hcall_id), "D" (param1) +: "r8"); + + return result; +} + +static inline long acrn_hypercall2(unsigned long hcall_id, + unsigned long param1, + unsigned long param2) +{ + long result; + + asm volatile("movq %[hcall_id], %%r8\n\t" +"vmcall\n\t" +: "=a" (result) +: [hcall_id] "g" (hcall_id), "D" (param1), "S" (param2) +: "r8"); + + return result; +} + +#else + +static inline long acrn_hypercall0(unsigned long hcall_id) +{ + return -ENOTSUPP; +} + +static inline long acrn_hypercall1(unsigned long hcall_id, + unsigned long param1) +{ + return -ENOTSUPP; +} + +static inline long acrn_hypercall2(unsigned long hcall_id, + unsigned long param1, + unsigned long param2) +{ + return -ENOTSUPP; +} +#endif /* CONFIG_ACRN_GUEST */ +#endif /* _ASM_X86_ACRN_HYPERCALL_H */ -- 2.7.4
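The stub half of the header above (real hypercalls when the guest support is built in, error-returning inlines otherwise) lets callers stay free of #ifdefs. A minimal userspace sketch of that pattern follows; DEMO_HAVE_HYPERVISOR, demo_hypercall0() and demo_query() are hypothetical names, and userspace ENOTSUP is used in place of the kernel-internal ENOTSUPP.

```c
#include <errno.h>

/* Hypothetical build switch standing in for CONFIG_ACRN_GUEST. */
#ifdef DEMO_HAVE_HYPERVISOR
/* A real implementation would be provided elsewhere. */
long demo_hypercall0(unsigned long hcall_id);
#else
/* Stub: same signature, always reports "not supported". */
static inline long demo_hypercall0(unsigned long hcall_id)
{
	(void)hcall_id;
	return -ENOTSUP;
}
#endif

/*
 * Caller written once for both configurations: a negative return is
 * passed through as an error code, and *out is only written on success.
 */
static int demo_query(unsigned long hcall_id, long *out)
{
	long ret = demo_hypercall0(hcall_id);

	if (ret < 0)
		return (int)ret;
	*out = ret;
	return 0;
}
```

Built without DEMO_HAVE_HYPERVISOR, demo_query() compiles down to the error path only, which mirrors how the #else branch of the header behaves.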
[PATCH v6 2/4] x86: Add the support of Linux guest on ACRN hypervisor
ACRN is an open-source hypervisor maintained by Linux Foundation. It is built for embedded IOT with small footprint and real-time features. Add the ACRN guest support so that it allows linux to be booted under the ACRN hypervisor. Following this patch it will setup the upcall notification vector, enable hypercall and provide the interface that is used to manage the virtualized CPU/memory/device/interrupt for other guest OS. Co-developed-by: Jason Chen CJ Signed-off-by: Jason Chen CJ Signed-off-by: Zhao Yakui Reviewed-by: Thomas Gleixner --- v1->v2: Change the CONFIG_ACRN to CONFIG_ACRN_GUEST, which makes it easy to understand. Remove the export of x86_hyper_acrn. v2->v3: Remove the unnecessary dependency of PARAVIRT v3->v4: Refine the commit log and add more meaningful description in Kconfig v4->v5: No change v5->v6: No change --- arch/x86/Kconfig | 12 arch/x86/include/asm/hypervisor.h | 1 + arch/x86/kernel/cpu/Makefile | 1 + arch/x86/kernel/cpu/acrn.c| 39 +++ arch/x86/kernel/cpu/hypervisor.c | 4 5 files changed, 57 insertions(+) create mode 100644 arch/x86/kernel/cpu/acrn.c diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 2fc9297..8dc4200 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -845,6 +845,18 @@ config JAILHOUSE_GUEST cell. You can leave this option disabled if you only want to start Jailhouse and run Linux afterwards in the root cell. +config ACRN_GUEST + bool "ACRN Guest support" + depends on X86_64 + help + This option allows to run Linux as guest in ACRN hypervisor. Enabling + this will allow the kernel to boot in virtualized environment under + the ACRN hypervisor. + ACRN is a flexible, lightweight reference open-source hypervisor, built + with real-time and safety-criticality in mind. It is built for embedded + IOT with small footprint and real-time features. 
More details can be + found in https://projectacrn.org/ + endif #HYPERVISOR_GUEST source "arch/x86/Kconfig.cpu" diff --git a/arch/x86/include/asm/hypervisor.h b/arch/x86/include/asm/hypervisor.h index 8c5aaba..50a30f6 100644 --- a/arch/x86/include/asm/hypervisor.h +++ b/arch/x86/include/asm/hypervisor.h @@ -29,6 +29,7 @@ enum x86_hypervisor_type { X86_HYPER_XEN_HVM, X86_HYPER_KVM, X86_HYPER_JAILHOUSE, + X86_HYPER_ACRN, }; #ifdef CONFIG_HYPERVISOR_GUEST diff --git a/arch/x86/kernel/cpu/Makefile b/arch/x86/kernel/cpu/Makefile index cfd24f9..17a7cdf 100644 --- a/arch/x86/kernel/cpu/Makefile +++ b/arch/x86/kernel/cpu/Makefile @@ -44,6 +44,7 @@ obj-$(CONFIG_X86_CPU_RESCTRL) += resctrl/ obj-$(CONFIG_X86_LOCAL_APIC) += perfctr-watchdog.o obj-$(CONFIG_HYPERVISOR_GUEST) += vmware.o hypervisor.o mshyperv.o +obj-$(CONFIG_ACRN_GUEST) += acrn.o ifdef CONFIG_X86_FEATURE_NAMES quiet_cmd_mkcapflags = MKCAP $@ diff --git a/arch/x86/kernel/cpu/acrn.c b/arch/x86/kernel/cpu/acrn.c new file mode 100644 index 000..f556640 --- /dev/null +++ b/arch/x86/kernel/cpu/acrn.c @@ -0,0 +1,39 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * ACRN detection support + * + * Copyright (C) 2019 Intel Corporation. All rights reserved. + * + * Jason Chen CJ + * Zhao Yakui + * + */ + +#include + +static uint32_t __init acrn_detect(void) +{ + return hypervisor_cpuid_base("ACRNACRNACRN\0\0", 0); +} + +static void __init acrn_init_platform(void) +{ +} + +static bool acrn_x2apic_available(void) +{ + /* x2apic is not supported now. +* Later it needs to check the X86_FEATURE_X2APIC bit of cpu info +* returned by CPUID to determine whether the x2apic is +* supported in Linux guest. 
+*/ + return false; +} + +const __initconst struct hypervisor_x86 x86_hyper_acrn = { + .name = "ACRN", + .detect = acrn_detect, + .type = X86_HYPER_ACRN, + .init.init_platform = acrn_init_platform, + .init.x2apic_available = acrn_x2apic_available, +}; diff --git a/arch/x86/kernel/cpu/hypervisor.c b/arch/x86/kernel/cpu/hypervisor.c index 479ca47..87e39ad 100644 --- a/arch/x86/kernel/cpu/hypervisor.c +++ b/arch/x86/kernel/cpu/hypervisor.c @@ -32,6 +32,7 @@ extern const struct hypervisor_x86 x86_hyper_xen_pv; extern const struct hypervisor_x86 x86_hyper_xen_hvm; extern const struct hypervisor_x86 x86_hyper_kvm; extern const struct hypervisor_x86 x86_hyper_jailhouse; +extern const struct hypervisor_x86 x86_hyper_acrn; static const __initconst struct hypervisor_x86 * const hypervisors[] = { @@ -49,6 +50,9 @@ static const __initconst struct hypervisor_x86 * const hypervisors[] = #ifdef CONFIG_JAILHOUSE_GUEST _hyper_jailhouse, #endif +#ifdef CONFIG_ACRN_GUEST + _hyper_acrn, +#endif }; enum x86_hypervisor_type x86_hyper_type; -- 2.7.4
[PATCH v6 3/4] x86/acrn: Use HYPERVISOR_CALLBACK_VECTOR for ACRN guest upcall vector
Linux kernel uses the HYPERVISOR_CALLBACK_VECTOR for hypervisor upcall vector. It is already used for Xen and HyperV. After the ACRN hypervisor is detected, it will also use this defined vector to notify the ACRN guest. Co-developed-by: Jason Chen CJ Signed-off-by: Jason Chen CJ Signed-off-by: Zhao Yakui Reviewed-by: Thomas Gleixner --- V1->V2: Remove the unused API definition of acrn_setup_intr_handler and acrn_remove_intr_handler. Adjust the order of header file Add the declaration of acrn_hv_vector_handler and tracing definition of acrn_hv_callback_vector. v2->v3: No change v3->v4: Refine the file name of acrnhyper.h to acrn.h v5->v6: Add the "extern" for the function declarations in header file Add some comments for calling entering_ack_irq Some other minor changes(unnecessary spliting two lines. and minor change in commit log) --- arch/x86/Kconfig| 1 + arch/x86/entry/entry_64.S | 5 + arch/x86/include/asm/acrn.h | 11 +++ arch/x86/kernel/cpu/acrn.c | 29 + 4 files changed, 46 insertions(+) create mode 100644 arch/x86/include/asm/acrn.h diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 8dc4200..d7a10f6 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -848,6 +848,7 @@ config JAILHOUSE_GUEST config ACRN_GUEST bool "ACRN Guest support" depends on X86_64 + select X86_HV_CALLBACK_VECTOR help This option allows to run Linux as guest in ACRN hypervisor. 
Enabling this will allow the kernel to boot in virtualized environment under diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S index 1f0efdb..d1b8ad3 100644 --- a/arch/x86/entry/entry_64.S +++ b/arch/x86/entry/entry_64.S @@ -1129,6 +1129,11 @@ apicinterrupt3 HYPERV_STIMER0_VECTOR \ hv_stimer0_callback_vector hv_stimer0_vector_handler #endif /* CONFIG_HYPERV */ +#if IS_ENABLED(CONFIG_ACRN_GUEST) +apicinterrupt3 HYPERVISOR_CALLBACK_VECTOR \ + acrn_hv_callback_vector acrn_hv_vector_handler +#endif + idtentry debug do_debughas_error_code=0 paranoid=1 shift_ist=DEBUG_STACK idtentry int3 do_int3 has_error_code=0 idtentry stack_segment do_stack_segmenthas_error_code=1 diff --git a/arch/x86/include/asm/acrn.h b/arch/x86/include/asm/acrn.h new file mode 100644 index 000..4adb13f --- /dev/null +++ b/arch/x86/include/asm/acrn.h @@ -0,0 +1,11 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_X86_ACRN_H +#define _ASM_X86_ACRN_H + +extern void acrn_hv_callback_vector(void); +#ifdef CONFIG_TRACING +#define trace_acrn_hv_callback_vector acrn_hv_callback_vector +#endif + +extern void acrn_hv_vector_handler(struct pt_regs *regs); +#endif /* _ASM_X86_ACRN_H */ diff --git a/arch/x86/kernel/cpu/acrn.c b/arch/x86/kernel/cpu/acrn.c index f556640..ce88d2d 100644 --- a/arch/x86/kernel/cpu/acrn.c +++ b/arch/x86/kernel/cpu/acrn.c @@ -9,7 +9,11 @@ * */ +#include +#include +#include #include +#include static uint32_t __init acrn_detect(void) { @@ -18,6 +22,8 @@ static uint32_t __init acrn_detect(void) static void __init acrn_init_platform(void) { + /* Setup the IDT for ACRN hypervisor callback */ + alloc_intr_gate(HYPERVISOR_CALLBACK_VECTOR, acrn_hv_callback_vector); } static bool acrn_x2apic_available(void) @@ -30,6 +36,29 @@ static bool acrn_x2apic_available(void) return false; } +static void (*acrn_intr_handler)(void); + +__visible void __irq_entry acrn_hv_vector_handler(struct pt_regs *regs) +{ + struct pt_regs *old_regs = set_irq_regs(regs); + + /* +* The 
hypervisor requires that the APIC EOI should be acked. +* If the APIC EOI is not acked, the APIC ISR bit for the +* HYPERVISOR_CALLBACK_VECTOR will not be cleared and then it +* will block the interrupt whose vector is lower than +* HYPERVISOR_CALLBACK_VECTOR. +*/ + entering_ack_irq(); + inc_irq_stat(irq_hv_callback_count); + + if (acrn_intr_handler) + acrn_intr_handler(); + + exiting_irq(); + set_irq_regs(old_regs); +} + const __initconst struct hypervisor_x86 x86_hyper_acrn = { .name = "ACRN", .detect = acrn_detect, -- 2.7.4
[PATCH] drivers: thermal: processor_thermal: Read PPCC on resume
Read PPCC power limits on system resume in case those limits changed
while system was suspended.

Signed-off-by: Srinivas Pandruvada
---
 .../int340x_thermal/processor_thermal_device.c | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/drivers/thermal/intel/int340x_thermal/processor_thermal_device.c b/drivers/thermal/intel/int340x_thermal/processor_thermal_device.c
index 436c256f111d..acb22157b9ac 100644
--- a/drivers/thermal/intel/int340x_thermal/processor_thermal_device.c
+++ b/drivers/thermal/intel/int340x_thermal/processor_thermal_device.c
@@ -465,6 +465,18 @@ static void proc_thermal_pci_remove(struct pci_dev *pdev)
 	pci_disable_device(pdev);
 }
 
+static int proc_thermal_resume(struct device *dev)
+{
+	struct proc_thermal_device *proc_dev;
+
+	proc_dev = dev_get_drvdata(dev);
+	proc_thermal_read_ppcc(proc_dev);
+
+	return 0;
+}
+
+static SIMPLE_DEV_PM_OPS(proc_thermal_pm, NULL, proc_thermal_resume);
+
 static const struct pci_device_id proc_thermal_pci_ids[] = {
 	{ PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_PROC_BDW_THERMAL)},
 	{ PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_PROC_HSB_THERMAL)},
@@ -489,6 +501,7 @@ static struct pci_driver proc_thermal_pci_driver = {
 	.probe = proc_thermal_pci_probe,
 	.remove = proc_thermal_pci_remove,
 	.id_table = proc_thermal_pci_ids,
+	.driver.pm = &proc_thermal_pm,
 };
 
 static const struct acpi_device_id int3401_device_ids[] = {
@@ -503,6 +516,7 @@ static struct platform_driver int3401_driver = {
 	.driver = {
 		.name = "int3401 thermal",
 		.acpi_match_table = int3401_device_ids,
+		.pm = &proc_thermal_pm,
 	},
 };
 
-- 
2.17.2
[PATCH] drivers: thermal: processor_thermal: Downgrade error message
Downgrade "Unsupported event" message from dev_err to dev_dbg.
Otherwise the log is flooded with this message on some platforms.

Signed-off-by: Srinivas Pandruvada
---
 .../thermal/intel/int340x_thermal/processor_thermal_device.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/thermal/intel/int340x_thermal/processor_thermal_device.c b/drivers/thermal/intel/int340x_thermal/processor_thermal_device.c
index 4b206b594825..436c256f111d 100644
--- a/drivers/thermal/intel/int340x_thermal/processor_thermal_device.c
+++ b/drivers/thermal/intel/int340x_thermal/processor_thermal_device.c
@@ -275,7 +275,7 @@ static void proc_thermal_notify(acpi_handle handle, u32 event, void *data)
 				THERMAL_DEVICE_POWER_CAPABILITY_CHANGED);
 		break;
 	default:
-		dev_err(proc_priv->dev, "Unsupported event [0x%x]\n", event);
+		dev_dbg(proc_priv->dev, "Unsupported event [0x%x]\n", event);
 		break;
 	}
 }
-- 
2.17.2
Re: [PATCH V2] staging: fieldbus: anybus-s: force endiannes annotation
On Tue, Apr 30, 2019 at 04:02:23AM +0100, Al Viro wrote:
> On Tue, Apr 30, 2019 at 04:22:38AM +0200, Nicholas Mc Guire wrote:
> > On Mon, Apr 29, 2019 at 10:03:36AM -0400, Sven Van Asbroeck wrote:
> > > On Mon, Apr 29, 2019 at 2:11 AM Nicholas Mc Guire wrote:
> > > >
> > > > V2: As requested by Sven Van Asbroeck make the
> > > > impact of the patch clear in the commit message.
> > >
> > > Thank you, but did you miss my comment about creating a local variable
> > > instead? See:
> > > https://lkml.org/lkml/2019/4/28/97
> >
> > Did not miss it - I just don't think that makes it any more
> > understandable - the __force __be16 makes it clear I believe
> > that this is correct, sparse does not like this though - so tell
> > sparse.
>
> ... to STFU, 'cause you know better. The trouble is, how do we
> (or yourself a year or two later) know *why* it is correct?
> Worse, how do we (or yourself, etc.) know if a change about to be
> done to the code won't invalidate the proof of yours?
>
> > The local variable would need to be explained as it is
> > functionally not necessary - therefore I find it more confusing
> > than using __force here.
>
> What's confusing is mixing host- and fixed-endian values in the
> same variable at different times. Treat those as unrelated
> types that happen to have the same sizeof.
>
> Quite a few of __force instances in the tree should be taken out
> and shot. Don't add to their number.

OK - my bad then - I had assumed that using __force is reasonable
if the handling is correct and it's a localized conversion only, like
   var = be16_to_cpu(var)
which avoided introducing additional variables just to have different
types but no different function.

But the long-term issue of hiding bugs by __force makes sense to me -
will give it another shot at scripting this in coccinelle.

thx!
hofrat
Re: [PATCH 2/2] memcg, fsnotify: no oom-kill for remote memcg charging
On Mon, Apr 29, 2019 at 5:41 PM Michal Hocko wrote:
>
> On Mon 29-04-19 10:13:32, Shakeel Butt wrote:
> [...]
> >  	/*
> >  	 * For queues with unlimited length lost events are not expected and
> >  	 * can possibly have security implications. Avoid losing events when
> >  	 * memory is short.
> > +	 *
> > +	 * Note: __GFP_NOFAIL takes precedence over __GFP_RETRY_MAYFAIL.
> >  	 */
>
> No, there is no rule like that. Combining the two is undefined
> currently and I do not think we want to legitimize it. What does it even
> mean?
>

Actually the code is doing that but I agree this is not documented and
weird. I will fix this.

Shakeel
Re: [PATCH] riscv: Support non-coherency memory model
On Mon, Apr 29, 2019 at 01:11:43PM -0700, Palmer Dabbelt wrote:
> On Mon, 22 Apr 2019 08:44:30 PDT (-0700), guo...@kernel.org wrote:
> >From: Guo Ren
> >
> >The current riscv linux implementation requires SOC system to support
> >memory coherence between all I/O devices and CPUs. But some SOC systems
> >cannot maintain the coherence and they need support cache clean/invalid
> >operations to synchronize data.
> >
> >Current implementation is no problem with SiFive FU540, because FU540
> >keeps all IO devices and DMA master devices coherence with CPU. But to a
> >traditional SOC vendor, it may already have a stable non-coherency SOC
> >system, the need is simply to replace the CPU with RV CPU and rebuild
> >the whole system with IO-coherency is very expensive.
> >
> >So we should make riscv linux also support non-coherency memory model.
> >Here are the two points that riscv linux needs to be modified:
> >
> > - Add _PAGE_COHERENCY bit in current page table entry attributes. The bit
> >   designates a coherence for this page mapping. Software set the bit to
> >   tell the hardware that the region of the page's memory area must be
> >   coherent with IOs devices in SOC system by PMA settings.
> >   If IOs and CPU are already coherent in SOC system, CPU just ignore
> >   this bit.
> >
> >   PTE format:
> >   | XLEN-1  10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0
> >        PFN       C  RSW  D   A   G   U   X   W   R   V
> >                  ^
> >   BIT(9): Coherence attribute bit
> >          0: hardware needn't keep the page coherent and software will
> >             maintain the coherence with cache clear/invalid operations.
> >          1: hardware must keep the page coherent and software needn't
> >             maintain the coherence.
> >   BIT(8): Reserved for software and now it's _PAGE_SPECIAL in linux
> >
> >   Add a new hardware bit in PTE also need to modify Privileged
> >   Architecture Supervisor-Level ISA:
> >   https://github.com/riscv/riscv-isa-manual/pull/374
>
> This is a RISC-V ISA modification, which isn't really appropriate to suggest on
> the kernel mailing lists. The right place to talk about this is at the RISC-V
> foundation, which owns the ISA -- we can't change the hardware with a patch to
> Linux :).

I just want a discussion and a wide discussion is good for all of us :)

> >
> > - Add SBI_FENCE_DMA 9 in riscv-sbi.
> >   sbi_fence_dma(start, size, dir) could synchronize CPU cache data with
> >   DMA device in non-coherency memory model. The third param's definition
> >   is the same with linux's in include/linux/dma-direction.h:
> >
> >   enum dma_data_direction {
> >   	DMA_BIDIRECTIONAL = 0,
> >   	DMA_TO_DEVICE = 1,
> >   	DMA_FROM_DEVICE = 2,
> >   	DMA_NONE = 3,
> >   };
> >
> >   The first param:start must be physical address which could be handled
> >   in M-state.
> >
> >   Here is a pull request to the riscv-sbi-doc:
> >   https://github.com/riscv/riscv-sbi-doc/pull/15
> >
> >We have tested the patch on our fpga SOC system which network controller
> >connected to a non-cache-coherency interconnect in and it couldn't work
> >without the patch.
> >
> >There is no side effect for FU540 whose CPU don't care _PAGE_COHERENCY
> >in PTE, but FU540's bbl also need to implement a simple sbi_fence_dma
> >by directly return. In fact, if you give a correct configuration for
> >dev_is_dma_coherent(), linux dma framework wouldn't call sbi_fence_dma
> >any more.
>
> Non-coherent fences also need to be discussed as part of a RISC-V ISA
               ^^ fences instructions? not page attributes?
> extension.

> I know people have expressed interest, but I don't know of a
> working group that's already been set up.

Does that mean the current RISC-V ISA forces the SOC to use a coherent
memory model?

Best Regards
 Guo Ren
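Note for readers of the archive: the PTE layout proposed in the quoted
mail can be illustrated with a small userspace sketch. This is only a
model of the bit positions given in that mail (bit 9 as the proposed
coherence bit "C", bit 8 as the software bit, PFN from bit 10 upward);
the names decode_pte() and needs_software_cache_maintenance() are
hypothetical and are not part of the patch or of any RISC-V
specification.

```python
# Bit positions as listed in the quoted PTE format. "C" (bit 9) is the
# attribute proposed in the mail, not an existing architectural bit.
FLAG_BITS = {
    "V": 0, "R": 1, "W": 2, "X": 3, "U": 4,
    "G": 5, "A": 6, "D": 7, "RSW": 8, "C": 9,
}
PFN_SHIFT = 10


def decode_pte(pte):
    """Split a PTE value into the single-bit flags and the PFN."""
    fields = {name: (pte >> bit) & 1 for name, bit in FLAG_BITS.items()}
    fields["PFN"] = pte >> PFN_SHIFT
    return fields


def needs_software_cache_maintenance(pte):
    """Per the proposal, C == 0 leaves coherence to software."""
    return decode_pte(pte)["C"] == 0


if __name__ == "__main__":
    example = (0x1234 << PFN_SHIFT) | (1 << 9) | 0b0111  # C, W, R, V set
    print(decode_pte(example))
    print(needs_software_cache_maintenance(example))
```

With C clear, the model reports that software would have to perform
the cache clean/invalidate operations itself, which is the case the
proposed sbi_fence_dma() call is meant to cover.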
Re: INFO: task hung in __get_super
On Tue, Apr 30, 2019 at 04:55:01AM +0200, Jan Kara wrote: > Yeah, you're right. And if we push the patch a bit further to not take > loop_ctl_mutex for invalid ioctl number, that would fix the problem. I > can send a fix. Huh? We don't take it until in lo_simple_ioctl(), and that patch doesn't get to its call on invalid ioctl numbers. What am I missing here?
[RFC PATCH v4 15/15] dcache: Add CONFIG_DCACHE_SMO
In an attempt to make the SMO patchset as non-invasive as possible add a
config option CONFIG_DCACHE_SMO (under "Memory Management options") for
enabling SMO for the DCACHE. Without this option the dcache constructor is
used but no other code is built in; with this option enabled slab
mobility is enabled and the isolate/migrate functions are built in.

Add CONFIG_DCACHE_SMO to guard the partial shrinking of the dcache via
Slab Movable Objects infrastructure.

Signed-off-by: Tobin C. Harding
---
 fs/dcache.c | 4
 mm/Kconfig  | 7 +++
 2 files changed, 11 insertions(+)

diff --git a/fs/dcache.c b/fs/dcache.c
index 3f9daba1cc78..9edce104613b 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -3068,6 +3068,7 @@ void d_tmpfile(struct dentry *dentry, struct inode *inode)
 }
 EXPORT_SYMBOL(d_tmpfile);
 
+#ifdef CONFIG_DCACHE_SMO
 /*
  * d_isolate() - Dentry isolation callback function.
  * @s: The dentry cache.
@@ -3140,6 +3141,7 @@ static void d_partial_shrink(struct kmem_cache *s, void **_unused, int __unused,
 
 	kfree(private);
 }
+#endif /* CONFIG_DCACHE_SMO */
 
 static __initdata unsigned long dhash_entries;
 static int __init set_dhash_entries(char *str)
@@ -3186,7 +3188,9 @@ static void __init dcache_init(void)
 					   sizeof_field(struct dentry, d_iname),
 					   dcache_ctor);
 
+#ifdef CONFIG_DCACHE_SMO
 	kmem_cache_setup_mobility(dentry_cache, d_isolate, d_partial_shrink);
+#endif
 
 	/* Hash may have been set up in dcache_init_early */
 	if (!hashdist)
diff --git a/mm/Kconfig b/mm/Kconfig
index 47040d939f3b..92fc27ad3472 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -265,6 +265,13 @@ config SMO_NODE
 	help
 	  On NUMA systems enable moving objects to and from a specified node.
 
+config DCACHE_SMO
+	bool "Enable Slab Movable Objects for the dcache"
+	depends on SLUB
+	help
+	  Under memory pressure we can try to free dentry slab cache objects from
+	  the partial slab list if this is enabled.
+
 config PHYS_ADDR_T_64BIT
 	def_bool 64BIT
 
-- 
2.21.0
[RFC PATCH v4 13/15] dcache: Provide a dentry constructor
In order to support object migration on the dentry cache we need to have
a determined object state at all times. Without a constructor the object
would have a random state after allocation.

Provide a dentry constructor.

Signed-off-by: Tobin C. Harding
---
 fs/dcache.c | 30 +-
 1 file changed, 21 insertions(+), 9 deletions(-)

diff --git a/fs/dcache.c b/fs/dcache.c
index aac41adf4743..3d6cc06eca56 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -1603,6 +1603,16 @@ void d_invalidate(struct dentry *dentry)
 }
 EXPORT_SYMBOL(d_invalidate);
 
+static void dcache_ctor(void *p)
+{
+	struct dentry *dentry = p;
+
+	/* Mimic lockref_mark_dead() */
+	dentry->d_lockref.count = -128;
+
+	spin_lock_init(&dentry->d_lock);
+}
+
 /**
  * __d_alloc - allocate a dcache entry
  * @sb: filesystem it will belong to
@@ -1658,7 +1668,6 @@ struct dentry *__d_alloc(struct super_block *sb, const struct qstr *name)
 
 	dentry->d_lockref.count = 1;
 	dentry->d_flags = 0;
-	spin_lock_init(&dentry->d_lock);
 	seqcount_init(&dentry->d_seq);
 	dentry->d_inode = NULL;
 	dentry->d_parent = dentry;
@@ -3091,14 +3100,17 @@ static void __init dcache_init_early(void)
 
 static void __init dcache_init(void)
 {
-	/*
-	 * A constructor could be added for stable state like the lists,
-	 * but it is probably not worth it because of the cache nature
-	 * of the dcache.
-	 */
-	dentry_cache = KMEM_CACHE_USERCOPY(dentry,
-		SLAB_RECLAIM_ACCOUNT|SLAB_PANIC|SLAB_MEM_SPREAD|SLAB_ACCOUNT,
-		d_iname);
+	slab_flags_t flags =
+		SLAB_RECLAIM_ACCOUNT | SLAB_PANIC | SLAB_MEM_SPREAD | SLAB_ACCOUNT;
+
+	dentry_cache =
+		kmem_cache_create_usercopy("dentry",
+					   sizeof(struct dentry),
+					   __alignof__(struct dentry),
+					   flags,
+					   offsetof(struct dentry, d_iname),
+					   sizeof_field(struct dentry, d_iname),
+					   dcache_ctor);
 
 	/* Hash may have been set up in dcache_init_early */
 	if (!hashdist)
-- 
2.21.0
[RFC PATCH v4 11/15] slub: Enable moving objects to/from specific nodes
We have just implemented Slab Movable Objects (object migration).
Currently object migration is used to defrag a cache. On NUMA systems
it would be nice to be able to control the source and destination nodes
when moving objects.

Add CONFIG_SMO_NODE to guard this feature. CONFIG_SMO_NODE depends on
CONFIG_SLUB_DEBUG because we use the full list. Leave it like this for
the RFC because the patch will be less cluttered to review, separate
full list out of CONFIG_DEBUG before doing a PATCH version.

Implement moving all objects (including those in full slabs) to a
specific node. Expose this functionality to userspace via a sysfs entry.

Add sysfs entry:

	/sysfs/kernel/slab//move

With this users get access to the following functionality:

 - Move all objects to specified node.

	echo "N1" > move

 - Move all objects from specified node to other specified
   node (from N1 -> to N2):

	echo "N1 N2" > move

   This also enables shrinking slabs on a specific node:

	echo "N1 N1" > move

Signed-off-by: Tobin C. Harding
---
 mm/Kconfig |   7 ++
 mm/slub.c  | 249 +
 2 files changed, 256 insertions(+)

diff --git a/mm/Kconfig b/mm/Kconfig
index 25c71eb8a7db..47040d939f3b 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -258,6 +258,13 @@ config ARCH_ENABLE_HUGEPAGE_MIGRATION
 config ARCH_ENABLE_THP_MIGRATION
 	bool
 
+config SMO_NODE
+	bool "Enable per node control of Slab Movable Objects"
+	depends on SLUB && SYSFS
+	select SLUB_DEBUG
+	help
+	  On NUMA systems enable moving objects to and from a specified node.
+
 config PHYS_ADDR_T_64BIT
 	def_bool 64BIT
 
diff --git a/mm/slub.c b/mm/slub.c
index e601c804ed79..e4f3dde443f5 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -4345,6 +4345,106 @@ static void move_slab_page(struct page *page, void *scratch, int node)
 	s->migrate(s, vector, count, node, private);
 }
 
+#ifdef CONFIG_SMO_NODE
+/*
+ * kmem_cache_move() - Attempt to move all slab objects.
+ * @s: The cache we are working on.
+ * @node: The node to move objects away from.
+ * @target_node: The node to move objects on to.
+ *
+ * Attempts to move all objects (partial slabs and full slabs) to target
+ * node.
+ *
+ * Context: Takes the list_lock.
+ * Return: The number of slabs remaining on node.
+ */
+static unsigned long kmem_cache_move(struct kmem_cache *s,
+				     int node, int target_node)
+{
+	struct kmem_cache_node *n = get_node(s, node);
+	LIST_HEAD(move_list);
+	struct page *page, *page2;
+	unsigned long flags;
+	void **scratch;
+
+	if (!s->migrate) {
+		pr_warn("%s SMO not enabled, cannot move objects\n", s->name);
+		goto out;
+	}
+
+	scratch = alloc_scratch(s);
+	if (!scratch)
+		goto out;
+
+	spin_lock_irqsave(&n->list_lock, flags);
+
+	list_for_each_entry_safe(page, page2, &n->partial, lru) {
+		if (!slab_trylock(page))
+			/* Busy slab. Get out of the way */
+			continue;
+
+		if (page->inuse) {
+			list_move(&page->lru, &move_list);
+			/* Stop page being considered for allocations */
+			n->nr_partial--;
+			page->frozen = 1;
+
+			slab_unlock(page);
+		} else {	/* Empty slab page */
+			list_del(&page->lru);
+			n->nr_partial--;
+			slab_unlock(page);
+			discard_slab(s, page);
+		}
+	}
+	list_for_each_entry_safe(page, page2, &n->full, lru) {
+		if (!slab_trylock(page))
+			continue;
+
+		list_move(&page->lru, &move_list);
+		page->frozen = 1;
+		slab_unlock(page);
+	}
+
+	spin_unlock_irqrestore(&n->list_lock, flags);
+
+	list_for_each_entry(page, &move_list, lru) {
+		if (page->inuse)
+			move_slab_page(page, scratch, target_node);
+	}
+	kfree(scratch);
+
+	/* Bail here to save taking the list_lock */
+	if (list_empty(&move_list))
+		goto out;
+
+	/* Inspect results and dispose of pages */
+	spin_lock_irqsave(&n->list_lock, flags);
+	list_for_each_entry_safe(page, page2, &move_list, lru) {
+		list_del(&page->lru);
+		slab_lock(page);
+		page->frozen = 0;
+
+		if (page->inuse) {
+			if (page->inuse == page->objects) {
+				list_add(&page->lru, &n->full);
+				slab_unlock(page);
+			} else {
+				n->nr_partial++;
+				list_add_tail(&page->lru, &n->partial);
+				slab_unlock(page);
+			}
+		} else {
+
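Note for readers of the archive: the patch text is cut off above, so
the actual move_store() parsing is not visible here. The semantics of
the `move` file as described in the commit message (one node: move
everything there; two nodes: move from the first to the second; the
same node twice: shrink on that node) can be sketched as a small
userspace model. It assumes that "N1" and "N2" stand for decimal node
ids; parse_move_command() and the tuple labels are hypothetical names
used only for illustration.

```python
def parse_move_command(text):
    """Classify a string written to the 'move' file.

    Assumes whitespace-separated decimal node ids; the real kernel
    parser is not shown in this archive and may differ.
    """
    nodes = [int(token) for token in text.split()]

    if len(nodes) == 1:
        # echo "N1" > move
        return ("move_all_to", nodes[0])

    if len(nodes) == 2:
        src, dst = nodes
        if src == dst:
            # echo "N1 N1" > move
            return ("shrink_node", src)
        # echo "N1 N2" > move
        return ("move_between", src, dst)

    raise ValueError("expected one or two node ids")


if __name__ == "__main__":
    for cmd in ("1", "0 2", "1 1"):
        print(cmd, "->", parse_move_command(cmd))
```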
[RFC PATCH v4 12/15] slub: Enable balancing slabs across nodes
We have just implemented Slab Movable Objects (SMO). On NUMA systems
slabs can become unbalanced i.e. many slabs on one node while other
nodes have few slabs. Using SMO we can balance the slabs across all
the nodes.

The algorithm used is as follows:

 1. Move all objects to node 0 (this has the effect of defragmenting the
    cache).

 2. Calculate the desired number of slabs for each node (this is done
    using the approximation nr_slabs / nr_nodes).

 3. Loop over the nodes moving the desired number of slabs from node 0
    to the node.

The feature is conditionally built in with CONFIG_SMO_NODE; this is
because we need the full list (we enable SLUB_DEBUG to get this). A
future version may separate the full list out of SLUB_DEBUG.

Expose this functionality to userspace via a sysfs entry. Add sysfs
entry:

	/sysfs/kernel/slab//balance

Write of '1' to this file triggers balance, no other value accepted.

This feature relies on SMO being enabled for the cache. This is done
with the following call, after the isolate/migrate functions have been
defined:

	kmem_cache_setup_mobility(s, isolate, migrate)

Signed-off-by: Tobin C. Harding
---
 mm/slub.c | 120 ++
 1 file changed, 120 insertions(+)

diff --git a/mm/slub.c b/mm/slub.c
index e4f3dde443f5..a5c48c41d72b 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -4583,6 +4583,109 @@ static unsigned long kmem_cache_move_to_node(struct kmem_cache *s, int node)
 
 	return left;
 }
+
+/*
+ * kmem_cache_move_slabs() - Attempt to move @num slabs to target_node,
+ * @s: The cache we are working on.
+ * @node: The node to move objects from.
+ * @target_node: The node to move objects to.
+ * @num: The number of slabs to move.
+ *
+ * Attempts to move @num slabs from @node to @target_node. This is done
+ * by migrating objects from slabs on the full_list.
+ *
+ * Return: The number of slabs moved or error code.
+ */
+static long kmem_cache_move_slabs(struct kmem_cache *s,
+				  int node, int target_node, long num)
+{
+	struct kmem_cache_node *n = get_node(s, node);
+	LIST_HEAD(move_list);
+	struct page *page, *page2;
+	unsigned long flags;
+	void **scratch;
+	long done = 0;
+
+	if (node == target_node)
+		return -EINVAL;
+
+	scratch = alloc_scratch(s);
+	if (!scratch)
+		return -ENOMEM;
+
+	spin_lock_irqsave(&n->list_lock, flags);
+	list_for_each_entry_safe(page, page2, &n->full, lru) {
+		if (!slab_trylock(page))
+			/* Busy slab. Get out of the way */
+			continue;
+
+		list_move(&page->lru, &move_list);
+		page->frozen = 1;
+		slab_unlock(page);
+
+		if (++done >= num)
+			break;
+	}
+	spin_unlock_irqrestore(&n->list_lock, flags);
+
+	list_for_each_entry(page, &move_list, lru) {
+		if (page->inuse)
+			move_slab_page(page, scratch, target_node);
+	}
+	kfree(scratch);
+
+	/* Inspect results and dispose of pages */
+	spin_lock_irqsave(&n->list_lock, flags);
+	list_for_each_entry_safe(page, page2, &move_list, lru) {
+		list_del(&page->lru);
+		slab_lock(page);
+		page->frozen = 0;
+
+		if (page->inuse) {
+			/*
+			 * This is best effort only, if slab still has
+			 * objects just put it back on the partial list.
+			 */
+			n->nr_partial++;
+			list_add_tail(&page->lru, &n->partial);
+			slab_unlock(page);
+		} else {
+			slab_unlock(page);
+			discard_slab(s, page);
+		}
+	}
+	spin_unlock_irqrestore(&n->list_lock, flags);
+
+	return done;
+}
+
+/*
+ * kmem_cache_balance_nodes() - Balance slabs across nodes.
+ * @s: The cache we are working on.
+ */
+static void kmem_cache_balance_nodes(struct kmem_cache *s)
+{
+	struct kmem_cache_node *n = get_node(s, 0);
+	unsigned long desired_nr_slabs_per_node;
+	unsigned long nr_slabs;
+	int nr_nodes = 0;
+	int nid;
+
+	(void)kmem_cache_move_to_node(s, 0);
+
+	for_each_node_state(nid, N_NORMAL_MEMORY)
+		nr_nodes++;
+
+	nr_slabs = atomic_long_read(&n->nr_slabs);
+	desired_nr_slabs_per_node = nr_slabs / nr_nodes;
+
+	for_each_node_state(nid, N_NORMAL_MEMORY) {
+		if (nid == 0)
+			continue;
+
+		kmem_cache_move_slabs(s, 0, nid, desired_nr_slabs_per_node);
+	}
+}
 #endif
 
 /**
@@ -5847,6 +5950,22 @@ static ssize_t move_store(struct kmem_cache *s, const char *buf, size_t length)
 	return length;
 }
 SLAB_ATTR(move);
+
+static ssize_t balance_show(struct kmem_cache *s, char *buf)
+{
+	return 0;
+}
+
+static
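Note for readers of the archive: the three-step balancing algorithm
from the commit message can be modelled in userspace to see what slab
distribution it aims for. This is a simplified sketch, not the kernel
code: it assumes every move succeeds, that consolidating on node 0
does not change the total slab count (in the kernel step 1 also
defragments, so the count may shrink), and that nodes are numbered
from 0. balance_slabs() is a hypothetical name.

```python
def balance_slabs(slabs_per_node):
    """Return the per-node slab counts the algorithm aims for.

    slabs_per_node[i] is the number of slabs currently on node i.
    """
    nr_nodes = len(slabs_per_node)
    if nr_nodes == 0:
        raise ValueError("need at least one node")

    # Step 1: move everything to node 0.
    counts = [sum(slabs_per_node)] + [0] * (nr_nodes - 1)

    # Step 2: desired slabs per node, using integer division as in
    # the nr_slabs / nr_nodes approximation.
    desired = counts[0] // nr_nodes

    # Step 3: hand 'desired' slabs from node 0 to every other node.
    for nid in range(1, nr_nodes):
        counts[0] -= desired
        counts[nid] += desired

    return counts


if __name__ == "__main__":
    print(balance_slabs([10, 0, 2, 0]))
    print(balance_slabs([7, 0, 0]))
```

Because of the integer division, node 0 keeps the remainder, so it can
end up with slightly more slabs than the other nodes.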
[RFC PATCH v4 14/15] dcache: Implement partial shrink via Slab Movable Objects
The dentry slab cache is susceptible to internal fragmentation. Now
that we have Slab Movable Objects we can attempt to defragment the
dcache. Dentry objects are inherently _not_ relocatable however under
some conditions they can be free'd. This is the same as shrinking the
dcache but instead of shrinking the whole cache we only attempt to free
those objects that are located in partially full slab pages. There is
no guarantee that this will reduce the memory usage of the system, it is
a compromise between fragmented memory and total cache shrinkage with
the hope that some memory pressure can be alleviated.

This is implemented using the newly added Slab Movable Objects
infrastructure. The dcache 'migration' function is intentionally _not_
called 'd_migrate' because we only free, we do not migrate. Call it
'd_partial_shrink' to make explicit that no reallocation is done.

Implement isolate and 'migrate' functions for the dentry slab cache.

Signed-off-by: Tobin C. Harding
---
 fs/dcache.c | 76 +
 1 file changed, 76 insertions(+)

diff --git a/fs/dcache.c b/fs/dcache.c
index 3d6cc06eca56..3f9daba1cc78 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -30,6 +30,7 @@
 #include
 #include
 #include
+#include
 
 #include "internal.h"
 #include "mount.h"
@@ -3067,6 +3068,79 @@ void d_tmpfile(struct dentry *dentry, struct inode *inode)
 }
 EXPORT_SYMBOL(d_tmpfile);
 
+/*
+ * d_isolate() - Dentry isolation callback function.
+ * @s: The dentry cache.
+ * @v: Vector of pointers to the objects to isolate.
+ * @nr: Number of objects in @v.
+ *
+ * The slab allocator is holding off frees. We can safely examine
+ * the object without the danger of it vanishing from under us.
+ */
+static void *d_isolate(struct kmem_cache *s, void **v, int nr)
+{
+	struct list_head *dispose;
+	struct dentry *dentry;
+	int i;
+
+	dispose = kmalloc(sizeof(*dispose), GFP_KERNEL);
+	if (!dispose)
+		return NULL;
+
+	INIT_LIST_HEAD(dispose);
+
+	for (i = 0; i < nr; i++) {
+		dentry = v[i];
+		spin_lock(&dentry->d_lock);
+
+		if (dentry->d_lockref.count > 0 ||
+		    dentry->d_flags & DCACHE_SHRINK_LIST) {
+			spin_unlock(&dentry->d_lock);
+			continue;
+		}
+
+		if (dentry->d_flags & DCACHE_LRU_LIST)
+			d_lru_del(dentry);
+
+		d_shrink_add(dentry, dispose);
+		spin_unlock(&dentry->d_lock);
+	}
+
+	return dispose;
+}
+
+/*
+ * d_partial_shrink() - Dentry migration callback function.
+ * @s: The dentry cache.
+ * @_unused: We do not access the vector.
+ * @__unused: No need for length of vector.
+ * @___unused: We do not do any allocation.
+ * @private: list_head pointer representing the shrink list.
+ *
+ * Dispose of the shrink list created during isolation function.
+ *
+ * Dentry objects can _not_ be relocated and shrinking the whole dcache
+ * can be expensive. This is an effort to free dentry objects that are
+ * stopping slab pages from being free'd without clearing the whole dcache.
+ *
+ * This callback is called from the SLUB allocator object migration
+ * infrastructure in attempt to free up slab pages by freeing dentry
+ * objects from partially full slabs.
+ */
+static void d_partial_shrink(struct kmem_cache *s, void **_unused, int __unused,
+			     int ___unused, void *private)
+{
+	struct list_head *dispose = private;
+
+	if (!private)	/* kmalloc error during isolate. */
+		return;
+
+	if (!list_empty(dispose))
+		shrink_dentry_list(dispose);
+
+	kfree(private);
+}
+
 static __initdata unsigned long dhash_entries;
 static int __init set_dhash_entries(char *str)
 {
@@ -3112,6 +3186,8 @@ static void __init dcache_init(void)
 					   sizeof_field(struct dentry, d_iname),
 					   dcache_ctor);
 
+	kmem_cache_setup_mobility(dentry_cache, d_isolate, d_partial_shrink);
+
 	/* Hash may have been set up in dcache_init_early */
 	if (!hashdist)
 		return;
-- 
2.21.0
[RFC PATCH v4 10/15] tools/testing/slab: Add XArray movable objects tests
We just implemented movable objects for the XArray. Let's test it
in-tree.

Add test module for the XArray's movable objects implementation.

Functionality of the XArray Slab Movable Object implementation can
usually be seen simply by using `slabinfo` on a running machine, since
the radix tree is typically in use on a running machine and will have
partial slabs. For repeated testing we can use the test module to
simulate a workload on the XArray, then use `slabinfo` to test that
object migration is functioning.

If testing on a freshly spun up VM (low radix tree workload) it may be
necessary to load/unload the module a number of times to create partial
slabs.

Example test session

Relevant /proc/slabinfo column headers:

  name

Prior to testing slabinfo report for radix_tree_node:

  # slabinfo radix_tree_node --report

  Slabcache: radix_tree_node  Aliases: 0 Order : 2 Objects: 8352
  ** Reclaim accounting active
  ** Defragmentation at 30%

  Sizes (bytes)     Slabs              Debug                Memory
  Object :     576  Total  :     497   Sanity Checks : On   Total: 8142848
  SlabObj:     912  Full   :     473   Redzoning     : On   Used : 4810752
  SlabSiz:   16384  Partial:      24   Poisoning     : On   Loss : 3332096
  Loss   :     336  CpuSlab:       0   Tracking      : On   Lalig: 2806272
  Align  :       8  Objects:      17   Tracing       : Off  Lpadd:  437360

Here you can see the kernel was built with Slab Movable Objects enabled
for the XArray (XArray uses the radix tree below the surface).

After inserting the test module (note we have triggered allocation of a
number of radix tree nodes increasing the object count but decreasing the
number of partial slabs):

  # slabinfo radix_tree_node --report

  Slabcache: radix_tree_node  Aliases: 0 Order : 2 Objects: 8442
  ** Reclaim accounting active
  ** Defragmentation at 30%

  Sizes (bytes)     Slabs              Debug                Memory
  Object :     576  Total  :     499   Sanity Checks : On   Total: 8175616
  SlabObj:     912  Full   :     484   Redzoning     : On   Used : 4862592
  SlabSiz:   16384  Partial:      15   Poisoning     : On   Loss : 3313024
  Loss   :     336  CpuSlab:       0   Tracking      : On   Lalig: 2836512
  Align  :       8  Objects:      17   Tracing       : Off  Lpadd:  439120

Now we can shrink the radix_tree_node cache:

  # slabinfo radix_tree_node --shrink
  # slabinfo radix_tree_node --report

  Slabcache: radix_tree_node  Aliases: 0 Order : 2 Objects: 8515
  ** Reclaim accounting active
  ** Defragmentation at 30%

  Sizes (bytes)     Slabs              Debug                Memory
  Object :     576  Total  :     501   Sanity Checks : On   Total: 8208384
  SlabObj:     912  Full   :     500   Redzoning     : On   Used : 4904640
  SlabSiz:   16384  Partial:       1   Poisoning     : On   Loss : 3303744
  Loss   :     336  CpuSlab:       0   Tracking      : On   Lalig: 2861040
  Align  :       8  Objects:      17   Tracing       : Off  Lpadd:  440880

Note the single remaining partial slab.

Signed-off-by: Tobin C. Harding
---
 tools/testing/slab/Makefile             |   2 +-
 tools/testing/slab/slub_defrag_xarray.c | 211
 2 files changed, 212 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/slab/slub_defrag_xarray.c

diff --git a/tools/testing/slab/Makefile b/tools/testing/slab/Makefile
index 440c2e3e356f..44c18d9a4d52 100644
--- a/tools/testing/slab/Makefile
+++ b/tools/testing/slab/Makefile
@@ -1,4 +1,4 @@
-obj-m += slub_defrag.o
+obj-m += slub_defrag.o slub_defrag_xarray.o
 
 KTREE=../../..
 
diff --git a/tools/testing/slab/slub_defrag_xarray.c b/tools/testing/slab/slub_defrag_xarray.c
new file mode 100644
index ..41143f73256c
--- /dev/null
+++ b/tools/testing/slab/slub_defrag_xarray.c
@@ -0,0 +1,211 @@
+// SPDX-License-Identifier: GPL-2.0+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+#define SMOX_CACHE_NAME "smox_test"
+static struct kmem_cache *cachep;
+
+/*
+ * Declare XArrays globally so we can clean them up on module unload.
+ */
+
+/* Used by test_smo_xarray()*/
+DEFINE_XARRAY(things);
+
+/* Thing to store pointers to in the XArray */
+struct smox_thing {
+	long id;
+};
+
+/* It's up to the caller to ensure id is unique */
+static struct smox_thing *alloc_thing(int id)
+{
+	struct smox_thing *thing;
+
+	thing = kmem_cache_alloc(cachep, GFP_KERNEL);
+	if (!thing)
+		return ERR_PTR(-ENOMEM);
+
+	thing->id = id;
+	return thing;
+}
+
+/**
+ * smox_object_ctor() - SMO object constructor function.
+ * @ptr: Pointer to memory where the object should be constructed.
+ */
+void
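Note for readers of the archive: the "Memory" column in the slabinfo
reports quoted in the commit message can be cross-checked against the
"Sizes" and "Slabs" columns. The relations below were inferred from
the three reports in this mail (they reproduce every figure shown);
they are not taken from the slabinfo source, and slabinfo_memory() is
a hypothetical helper name.

```python
def slabinfo_memory(objects, object_size, slab_obj_size,
                    slabs, slab_size, objs_per_slab):
    """Recompute the Memory column from the other report fields.

    objects       -- 'Objects' on the Slabcache line
    object_size   -- 'Object' (bytes)
    slab_obj_size -- 'SlabObj' (bytes, object plus debug metadata)
    slabs         -- 'Total' in the Slabs column
    slab_size     -- 'SlabSiz' (bytes)
    objs_per_slab -- 'Objects' in the Slabs column
    """
    total = slabs * slab_size
    used = objects * object_size
    return {
        "Total": total,
        "Used": used,
        "Loss": total - used,
        # Per-object overhead summed over all allocated objects.
        "Lalig": objects * (slab_obj_size - object_size),
        # Unused tail of each slab summed over all slabs.
        "Lpadd": slabs * (slab_size - objs_per_slab * slab_obj_size),
    }


if __name__ == "__main__":
    # Figures from the first report (before loading the module).
    print(slabinfo_memory(8352, 576, 912, 497, 16384, 17))
```

For the first report this gives Total 8142848, Used 4810752 and
Loss 3332096, matching the values quoted in the mail.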
[RFC PATCH v4 09/15] xarray: Implement migration function for objects
Implement functions to migrate objects. This is based on initial code by
Matthew Wilcox and was modified to work with slab object migration.

This patch can not be merged until all radix tree & IDR users are
converted to the XArray because xa_nodes and radix tree nodes share the
same slab cache (thanks Matthew).

Co-developed-by: Christoph Lameter
Signed-off-by: Tobin C. Harding
---
 lib/radix-tree.c | 13 +
 lib/xarray.c     | 49
 2 files changed, 62 insertions(+)

diff --git a/lib/radix-tree.c b/lib/radix-tree.c
index 14d51548bea6..9412c2853726 100644
--- a/lib/radix-tree.c
+++ b/lib/radix-tree.c
@@ -1613,6 +1613,17 @@ static int radix_tree_cpu_dead(unsigned int cpu)
 	return 0;
 }
 
+extern void xa_object_migrate(void *tree_node, int numa_node);
+
+static void radix_tree_migrate(struct kmem_cache *s, void **objects, int nr,
+			       int node, void *private)
+{
+	int i;
+
+	for (i = 0; i < nr; i++)
+		xa_object_migrate(objects[i], node);
+}
+
 void __init radix_tree_init(void)
 {
 	int ret;
@@ -1627,4 +1638,6 @@ void __init radix_tree_init(void)
 	ret = cpuhp_setup_state_nocalls(CPUHP_RADIX_DEAD, "lib/radix:dead",
 					NULL, radix_tree_cpu_dead);
 	WARN_ON(ret < 0);
+	kmem_cache_setup_mobility(radix_tree_node_cachep, NULL,
+				  radix_tree_migrate);
 }
diff --git a/lib/xarray.c b/lib/xarray.c
index 6be3acbb861f..731dd3d8ddb8 100644
--- a/lib/xarray.c
+++ b/lib/xarray.c
@@ -1971,6 +1971,55 @@ void xa_destroy(struct xarray *xa)
 }
 EXPORT_SYMBOL(xa_destroy);
 
+void xa_object_migrate(struct xa_node *node, int numa_node)
+{
+	struct xarray *xa = READ_ONCE(node->array);
+	void __rcu **slot;
+	struct xa_node *new_node;
+	int i;
+
+	/* Freed or not yet in tree then skip */
+	if (!xa || xa == XA_RCU_FREE)
+		return;
+
+	new_node = kmem_cache_alloc_node(radix_tree_node_cachep,
+					 GFP_KERNEL, numa_node);
+	if (!new_node)
+		return;
+
+	xa_lock_irq(xa);
+
+	/* Check again. */
+	if (xa != node->array) {
+		node = new_node;
+		goto unlock;
+	}
+
+	memcpy(new_node, node, sizeof(struct xa_node));
+
+	if (list_empty(&node->private_list))
+		INIT_LIST_HEAD(&new_node->private_list);
+	else
+		list_replace(&node->private_list, &new_node->private_list);
+
+	for (i = 0; i < XA_CHUNK_SIZE; i++) {
+		void *x = xa_entry_locked(xa, new_node, i);
+
+		if (xa_is_node(x))
+			rcu_assign_pointer(xa_to_node(x)->parent, new_node);
+	}
+	if (!new_node->parent)
+		slot = &xa->xa_head;
+	else
+		slot = &xa_parent_locked(xa, new_node)->slots[new_node->offset];
+	rcu_assign_pointer(*slot, xa_mk_node(new_node));
+
+unlock:
+	xa_unlock_irq(xa);
+	xa_node_free(node);
+	rcu_barrier();
+}
+
 #ifdef XA_DEBUG
 void xa_dump_node(const struct xa_node *node)
 {
-- 
2.21.0
[RFC PATCH v4 08/15] tools/testing/slab: Add object migration test suite
We just added a module that enables testing the SLUB allocators ability to defrag/shrink caches via movable objects. Tests are better when they are automated. Add automated testing via a python script for SLUB movable objects. Example output: $ cd path/to/linux/tools/testing/slab $ /slub_defrag.py Please run script as root $ sudo ./slub_defrag.py $ sudo ./slub_defrag.py --debug Loading module ... Slab cache smo_test created Objects per slab: 20 Running sanity checks ... Running module stress test (see dmesg for additional test output) ... Removing module slub_defrag ... Loading module ... Slab cache smo_test created Running test non-movable ... testing slab 'smo_test' prior to enabling movable objects ... verified non-movable slabs are NOT shrinkable Running test movable ... testing slab 'smo_test' after enabling movable objects ... verified movable slabs are shrinkable Removing module slub_defrag ... Signed-off-by: Tobin C. Harding --- tools/testing/slab/slub_defrag.c | 1 + tools/testing/slab/slub_defrag.py | 451 ++ 2 files changed, 452 insertions(+) create mode 100755 tools/testing/slab/slub_defrag.py diff --git a/tools/testing/slab/slub_defrag.c b/tools/testing/slab/slub_defrag.c index 4a5c24394b96..8332e69ee868 100644 --- a/tools/testing/slab/slub_defrag.c +++ b/tools/testing/slab/slub_defrag.c @@ -337,6 +337,7 @@ static int smo_run_module_tests(int nr_objs, int keep) /* * struct functions() - Map command to a function pointer. + * If you update this please update the documentation in slub_defrag.py */ struct functions { char *fn_name; diff --git a/tools/testing/slab/slub_defrag.py b/tools/testing/slab/slub_defrag.py new file mode 100755 index ..41747c0db39b --- /dev/null +++ b/tools/testing/slab/slub_defrag.py @@ -0,0 +1,451 @@ +#!/usr/bin/env python3 +# SPDX-License-Identifier: GPL-2.0 + +import subprocess +import sys +from os import path + +# SLUB Movable Objects test suite. 
+# +# Requirements: +# - CONFIG_SLUB=y +# - CONFIG_SLUB_DEBUG=y +# - The slub_defrag module in this directory. + +# Test SMO using a kernel module that enables triggering arbitrary +# kernel code from userspace via a debugfs file. +# +# Module code is in ./slub_defrag.c, basically the functionality is as +# follows: +# +# - Creates debugfs file /sys/kernel/debugfs/smo/callfn +# - Writes to 'callfn' are parsed as a command string and the function +# associated with the command is called. +# - Defines 4 commands (all commands operate on smo_test cache): +# - 'test': Runs module stress tests. +# - 'alloc N': Allocates N slub objects +# - 'free N POS': Frees N objects starting at POS (see below) +# - 'enable': Enables SLUB Movable Objects +# +# The module maintains a list of allocated objects. Allocation adds +# objects to the tail of the list. Free'ing frees from the head of the +# list. This has the effect of creating free slots in the slab. For +# finer grained control over where in the cache slots are free'd, the POS +# (position) argument may be used. + +# The main() function is reasonably readable; the test suite does the +# following: +# +# 1. Runs the module stress tests. +# 2. Tests the cache without movable objects enabled. +# - Creates multiple partial slabs as explained above. +# - Verifies that partial slabs are _not_ removed by shrink (see below). +# 3. Tests the cache with movable objects enabled. +# - Creates multiple partial slabs as explained above. +# - Verifies that partial slabs _are_ removed by shrink (see below). + +# The sysfs file /sys/kernel/slab//shrink enables calling the +# function kmem_cache_shrink() (see mm/slab_common.c and mm/slub.c). +# Shrinking a cache attempts to consolidate all partial slabs by moving +# objects if object migration is enabled for the cache, otherwise +# shrinking a cache simply re-orders the partial list so that the most densely +# populated slabs are at the head of the list. 
+ +# Enable/disable debugging output (also enabled via -d | --debug). +debug = False + +# Used in debug messages and when running `insmod`. +MODULE_NAME = "slub_defrag" + +# Slab cache created by the test module. +CACHE_NAME = "smo_test" + +# Set by get_slab_config() +objects_per_slab = 0 +pages_per_slab = 0 +debugfs_mounted = False # Set to true if we mount debugfs. + + +def eprint(*args, **kwargs): +print(*args, file=sys.stderr, **kwargs) + + +def dprint(*args, **kwargs): +if debug: +print(*args, file=sys.stderr, **kwargs) + + +def run_shell(cmd): +return subprocess.call([cmd], shell=True) + + +def run_shell_get_stdout(cmd): +return subprocess.check_output([cmd], shell=True) + + +def assert_root(): +user = run_shell_get_stdout('whoami') +if user != b'root\n': +eprint("Please run script as root") +sys.exit(1) + + +def mount_debugfs(): +mounted = False + +# Check if debugfs is mounted at a known
Re: [RFC][PATCHSET] sorting out RCU-delayed stuff in ->destroy_inode()
On Tue, Apr 16, 2019 at 11:01:16AM -0700, Linus Torvalds wrote: > On Tue, Apr 16, 2019 at 10:49 AM Al Viro wrote: > > > > 83 files changed, 241 insertions(+), 516 deletions(-) > > I think this single line is pretty convincing on its own. Ignoring > docs and fs/inode.c, we have > > 80 files changed, 190 insertions(+), 494 deletions(-) > > IOW, just over 300 lines of boiler plate code removed. > > The additions are > > - Ten more lines of actual code in fs/inode.c (and that's not > actually added complexity, it looks simpler if anything - most of it > is the new "i_callback()" helper function) > > - 19 lines of doc updates. > > So it absolutely looks fine to me. > > I only skimmed through the actual filesystem (and one networking) > patches, but they looked like trivial conversions to a better > interface. ... except that this callback can (and always could) get executed after freeing struct super_block. So we can't just dereference ->i_sb->s_op and expect to survive; the table ->s_op pointed to will still be there, but ->i_sb might very well have been freed, with all its contents overwritten. We need to copy the callback into struct inode itself, unfortunately. The following incremental fixes it; I'm going to fold it into the first commit in there. diff --git a/Documentation/filesystems/porting b/Documentation/filesystems/porting index 9d80f9e0855e..b8d3ddd8b8db 100644 --- a/Documentation/filesystems/porting +++ b/Documentation/filesystems/porting @@ -655,3 +655,11 @@ in your dentry operations instead. * if ->free_inode() is non-NULL, it gets scheduled by call_rcu() * combination of NULL ->destroy_inode and NULL ->free_inode is treated as NULL/free_inode_nonrcu, to preserve the compatibility. + + Note that the callback (be it via ->free_inode() or explicit call_rcu() + in ->destroy_inode()) is *NOT* ordered wrt superblock destruction; + as the matter of fact, the superblock and all associated structures + might be already gone. 
The filesystem driver is guaranteed to be still + there, but that's it. Freeing memory in the callback is fine; doing + more than that is possible, but requires a lot of care and is best + avoided. diff --git a/fs/inode.c b/fs/inode.c index fb45590d284e..855dad43b11d 100644 --- a/fs/inode.c +++ b/fs/inode.c @@ -164,6 +164,7 @@ int inode_init_always(struct super_block *sb, struct inode *inode) inode->i_wb_frn_avg_time = 0; inode->i_wb_frn_history = 0; #endif + inode->free_inode = sb->s_op->free_inode; if (security_inode_alloc(inode)) goto out; @@ -211,8 +212,8 @@ EXPORT_SYMBOL(free_inode_nonrcu); static void i_callback(struct rcu_head *head) { struct inode *inode = container_of(head, struct inode, i_rcu); - if (inode->i_sb->s_op->free_inode) - inode->i_sb->s_op->free_inode(inode); + if (inode->free_inode) + inode->free_inode(inode); else free_inode_nonrcu(inode); } diff --git a/include/linux/fs.h b/include/linux/fs.h index 2e9b9f87caca..5ed6b39e588e 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -718,6 +718,7 @@ struct inode { #endif void *i_private; /* fs or device private pointer */ + void (*free_inode)(struct inode *); } __randomize_layout; static inline unsigned int i_blocksize(const struct inode *node)
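To make the ordering problem concrete, here is a small userspace model of the fix, written in Python rather than kernel C. It is only a sketch: the classes SuperOps, SuperBlock and Inode and the helper fs_free_inode() are stand-ins invented for illustration, and setting sb.s_op to None merely imitates a superblock whose memory has been freed and overwritten. The point it shows is that the callback is captured when the inode is initialised, so the delayed callback never has to go through ->i_sb.

```python
# Toy model only; none of these names are kernel APIs.

class SuperOps:
    def __init__(self, free_inode=None):
        self.free_inode = free_inode


class SuperBlock:
    def __init__(self, s_op):
        self.s_op = s_op


class Inode:
    def __init__(self, sb):
        self.i_sb = sb
        # Mirrors "inode->free_inode = sb->s_op->free_inode" from the patch:
        # the callback is copied while the superblock is known to be alive.
        self.free_inode = sb.s_op.free_inode


freed = []


def free_inode_nonrcu(inode):
    freed.append(("nonrcu", inode))


def fs_free_inode(inode):
    # Hypothetical filesystem-specific callback.
    freed.append(("fs", inode))


def i_callback(inode):
    # Runs "later" (after a grace period in the kernel); it must not
    # dereference inode.i_sb, which may already be gone.
    if inode.free_inode:
        inode.free_inode(inode)
    else:
        free_inode_nonrcu(inode)


sb = SuperBlock(SuperOps(free_inode=fs_free_inode))
inode = Inode(sb)
plain = Inode(SuperBlock(SuperOps()))  # filesystem without ->free_inode

sb.s_op = None        # stand-in for the superblock being freed/overwritten
i_callback(inode)     # still reaches fs_free_inode via the inode itself
i_callback(plain)     # falls back to free_inode_nonrcu
```

Looking the callback up through inode.i_sb.s_op at this point would fail in the model, which is the userspace analogue of the use-after-free described above.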
[RFC PATCH v4 07/15] tools/testing/slab: Add object migration test module
We just implemented slab movable objects for the SLUB allocator. We should test that code. In order to do so we need to be able to do a number of things - Create a cache - Enable Slab Movable Objects for the cache - Allocate objects to the cache - Free objects from within specific slabs of the cache We can do all this via a loadable module. Add a module that defines functions that can be triggered from userspace via a debugfs entry. From the source: /* * SLUB defragmentation a.k.a. Slab Movable Objects (SMO). * * This module is used for testing the SLUB allocator. Enables * userspace to run kernel functions via a debugfs file. * * debugfs: /sys/kernel/debugfs/smo/callfn (write only) * * String written to `callfn` is parsed by the module and associated * function is called. See fn_tab for mapping of strings to functions. */ References to allocated objects are kept by the module in a linked list so that userspace can control which object to free. We introduce the following four functions via the function table "enable": Enables object migration for the test cache. "alloc X": Allocates X objects "free X [Y]": Frees X objects starting at list position Y (default Y==0) "test": Runs [stress] tests from within the module (see below). {"enable", smo_enable_cache_mobility}, {"alloc", smo_alloc_objects}, {"free", smo_free_object}, {"test", smo_run_module_tests}, Freeing from the start of the list creates a hole in the slab being freed from (i.e. creates a partial slab). The results of running these commands can be seen using `slabinfo` (available in tools/vm/): gcc -o slabinfo tools/vm/slabinfo.c Stress tests can be run from within the module. These tests are internal to the module because we verify that object references are still good after object migration. These are called 'stress' tests because it is intended that they create/free a lot of objects. Userspace can control the number of objects to create, default is 1000. 
Example test session Relevant /proc/slabinfo column headers: name <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> # mount -t debugfs none /sys/kernel/debug/ $ cd path/to/linux/tools/testing/slab; make ... # insmod slub_defrag.ko # cat /proc/slabinfo | grep smo_test | sed 's/:.*//' smo_test 0 0 392 20 2 From this we can see that the module created cache 'smo_test' with 20 objects per slab and 2 pages per slab (and cache is currently empty). We can play with the slab allocator manually: # insmod slub_defrag.ko # echo 'alloc 21' > callfn # cat /proc/slabinfo | grep smo_test | sed 's/:.*//' smo_test 21 40 392 20 2 We see here that 21 active objects have been allocated creating 2 slabs (40 total objects). # slabinfo smo_test --report Slabcache: smo_test Aliases: 0 Order : 1 Objects: 21 Sizes (bytes) Slabs Debug Memory Object : 56 Total : 2 Sanity Checks : On Total: 16384 SlabObj: 392 Full : 1 Redzoning : On Used : 1176 SlabSiz: 8192 Partial: 1 Poisoning : On Loss : 15208 Loss : 336 CpuSlab: 0 Tracking : On Lalig: 7056 Align : 8 Objects: 20 Tracing : Off Lpadd: 704 Now free an object from the first slot of the first slab # echo 'free 1' > callfn # cat /proc/slabinfo | grep smo_test | sed 's/:.*//' smo_test 20 40 392 20 2 # slabinfo smo_test --report Slabcache: smo_test Aliases: 0 Order : 1 Objects: 20 Sizes (bytes) Slabs Debug Memory Object : 56 Total : 2 Sanity Checks : On Total: 16384 SlabObj: 392 Full : 0 Redzoning : On Used : 1120 SlabSiz: 8192 Partial: 2 Poisoning : On Loss : 15264 Loss : 336 CpuSlab: 0 Tracking : On Lalig: 6720 Align : 8 Objects: 20 Tracing : Off Lpadd: 704 Calling shrink now on the cache does nothing because object migration is not enabled (output omitted). If we enable object migration then shrink the cache we expect the object from the second slab to be moved to the first slot in the first slab and the second slab to be removed from the partial list. 
# echo 'enable' > callfn # slabinfo smo_test --shrink # slabinfo smo_test --report Slabcache: smo_test Aliases: 0 Order : 1 Objects: 20 ** Defragmentation at 30% Sizes (bytes) Slabs DebugMemory Object : 56 Total : 1 Sanity Checks : On Total:8192 SlabObj: 392 Full : 1
[RFC PATCH v4 06/15] tools/vm/slabinfo: Add defrag_used_ratio output
Add output for the newly added defrag_used_ratio sysfs knob. Signed-off-by: Tobin C. Harding --- tools/vm/slabinfo.c | 4 1 file changed, 4 insertions(+) diff --git a/tools/vm/slabinfo.c b/tools/vm/slabinfo.c index d2c22f9ee2d8..ef4ff93df4cc 100644 --- a/tools/vm/slabinfo.c +++ b/tools/vm/slabinfo.c @@ -34,6 +34,7 @@ struct slabinfo { unsigned int sanity_checks, slab_size, store_user, trace; int order, poison, reclaim_account, red_zone; int movable, ctor; + int defrag_used_ratio; int remote_node_defrag_ratio; unsigned long partial, objects, slabs, objects_partial, objects_total; unsigned long alloc_fastpath, alloc_slowpath; @@ -549,6 +550,8 @@ static void report(struct slabinfo *s) printf("** Slabs are destroyed via RCU\n"); if (s->reclaim_account) printf("** Reclaim accounting active\n"); + if (s->movable) + printf("** Defragmentation at %d%%\n", s->defrag_used_ratio); printf("\nSizes (bytes) Slabs Debug Memory\n"); printf("\n"); @@ -1279,6 +1282,7 @@ static void read_slab_dir(void) slab->deactivate_bypass = get_obj("deactivate_bypass"); slab->remote_node_defrag_ratio = get_obj("remote_node_defrag_ratio"); + slab->defrag_used_ratio = get_obj("defrag_used_ratio"); chdir(".."); if (read_slab_obj(slab, "ops")) { if (strstr(buffer, "ctor :")) -- 2.21.0
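For readers who have not looked at the defrag core patch in this series: defrag_used_ratio is described there as a percentage, where a slab with less than that percentage of its objects allocated becomes a candidate for reclaim, 0 disables defragmentation, and caches are only considered once their partial slab count exceeds MAX_PARTIAL. The Python sketch below restates that arithmetic. It is not the kernel implementation; the function names are made up, the strict "less than" comparison follows the wording of the ABI text, and max_partial is left as a plain parameter because the actual threshold is not quoted in this thread.

```python
def slab_usage_percent(objects_in_use, objects_per_slab):
    """Percentage of a slab's object slots that are currently allocated."""
    return objects_in_use * 100 // objects_per_slab


def slab_is_defrag_candidate(objects_in_use, objects_per_slab,
                             defrag_used_ratio=30):
    """True if a slab is sparse enough to try moving its objects away.

    defrag_used_ratio == 0 means defragmentation is disabled.
    """
    if defrag_used_ratio == 0:
        return False
    usage = slab_usage_percent(objects_in_use, objects_per_slab)
    return usage < defrag_used_ratio


def cache_is_considered(nr_partial, max_partial):
    """Caches are only looked at when partial slabs exceed the threshold."""
    return nr_partial > max_partial


# With 20 objects per slab (as in the smo_test cache) and the default 30%:
sparse = slab_is_defrag_candidate(5, 20)    # 25% used
dense = slab_is_defrag_candidate(6, 20)     # 30% used, not below the ratio
disabled = slab_is_defrag_candidate(1, 20, defrag_used_ratio=0)
```

So the "** Defragmentation at 30%" line printed by slabinfo can be read as: slabs of this cache that are less than 30% full are the ones defragmentation will try to empty.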
[RFC PATCH v4 05/15] tools/vm/slabinfo: Add remote node defrag ratio output
Add output line for NUMA remote node defrag ratio. Signed-off-by: Tobin C. Harding --- tools/vm/slabinfo.c | 7 +++ 1 file changed, 7 insertions(+) diff --git a/tools/vm/slabinfo.c b/tools/vm/slabinfo.c index cbfc56c44c2f..d2c22f9ee2d8 100644 --- a/tools/vm/slabinfo.c +++ b/tools/vm/slabinfo.c @@ -34,6 +34,7 @@ struct slabinfo { unsigned int sanity_checks, slab_size, store_user, trace; int order, poison, reclaim_account, red_zone; int movable, ctor; + int remote_node_defrag_ratio; unsigned long partial, objects, slabs, objects_partial, objects_total; unsigned long alloc_fastpath, alloc_slowpath; unsigned long free_fastpath, free_slowpath; @@ -377,6 +378,10 @@ static void slab_numa(struct slabinfo *s, int mode) if (skip_zero && !s->slabs) return; + if (mode) { + printf("\nNUMA remote node defrag ratio: %3d\n", + s->remote_node_defrag_ratio); + } if (!line) { printf("\n%-21s:", mode ? "NUMA nodes" : "Slab"); for(node = 0; node <= highest_node; node++) @@ -1272,6 +1277,8 @@ static void read_slab_dir(void) slab->cpu_partial_free = get_obj("cpu_partial_free"); slab->alloc_node_mismatch = get_obj("alloc_node_mismatch"); slab->deactivate_bypass = get_obj("deactivate_bypass"); + slab->remote_node_defrag_ratio = + get_obj("remote_node_defrag_ratio"); chdir(".."); if (read_slab_obj(slab, "ops")) { if (strstr(buffer, "ctor :")) -- 2.21.0
[RFC PATCH v4 04/15] slub: Slab defrag core
Internal fragmentation can occur within pages used by the slub allocator. Under some workloads large numbers of pages can be used by partial slab pages. This under-utilisation is bad simply because it wastes memory but also because if the system is under memory pressure higher order allocations may become difficult to satisfy. If we can defrag slab caches we can alleviate these problems. Implement Slab Movable Objects in order to defragment slab caches. Slab defragmentation may occur: 1. Unconditionally when __kmem_cache_shrink() is called on a slab cache by the kernel calling kmem_cache_shrink(). 2. Unconditionally through the use of the slabinfo command. slabinfo -s 3. Conditionally via the use of kmem_cache_defrag() - Use Slab Movable Objects when shrinking cache. Currently when the kernel calls kmem_cache_shrink() we curate the partial slabs list. If object migration is not enabled for the cache we still do this, if however, SMO is enabled we attempt to move objects in partially full slabs in order to defragment the cache. Shrink attempts to move all objects in order to reduce the cache to a single partial slab for each node. - Add conditional per node defrag via new function: kmem_defrag_slabs(int node). kmem_defrag_slabs() attempts to defragment all slab caches for node. Defragmentation is done conditionally dependent on MAX_PARTIAL _AND_ defrag_used_ratio. Caches are only considered for defragmentation if the number of partial slabs exceeds MAX_PARTIAL (per node). Also, defragmentation only occurs if the usage ratio of the slab is lower than the configured percentage (sysfs field added in this patch). Fragmentation ratios are measured by calculating the percentage of objects in use compared to the total number of objects that the slab page can accommodate. The scanning of slab caches is optimized because the defragmentable slabs come first on the list. Thus we can terminate scans on the first slab encountered that does not support defragmentation. 
kmem_defrag_slabs() takes a node parameter. This can either be -1 if defragmentation should be performed on all nodes, or a node number. Defragmentation may be disabled by setting defrag ratio to 0 echo 0 > /sys/kernel/slab//defrag_used_ratio - Add a defrag ratio sysfs field and set it to 30% by default. A limit of 30% specifies that more than 3 out of 10 available slots for objects need to be in use otherwise slab defragmentation will be attempted on the remaining objects. In order for a cache to be defragmentable the cache must support object migration (SMO). Enabling SMO for a cache is done via a call to the recently added function: void kmem_cache_setup_mobility(struct kmem_cache *, kmem_cache_isolate_func, kmem_cache_migrate_func); Co-developed-by: Christoph Lameter Signed-off-by: Tobin C. Harding --- Documentation/ABI/testing/sysfs-kernel-slab | 14 + include/linux/slab.h| 1 + include/linux/slub_def.h| 7 + mm/slub.c | 385 4 files changed, 334 insertions(+), 73 deletions(-) diff --git a/Documentation/ABI/testing/sysfs-kernel-slab b/Documentation/ABI/testing/sysfs-kernel-slab index 29601d93a1c2..7770c03be6b4 100644 --- a/Documentation/ABI/testing/sysfs-kernel-slab +++ b/Documentation/ABI/testing/sysfs-kernel-slab @@ -180,6 +180,20 @@ Description: list. It can be written to clear the current count. Available when CONFIG_SLUB_STATS is enabled. +What: /sys/kernel/slab/cache/defrag_used_ratio +Date: February 2019 +KernelVersion: 5.0 +Contact: Christoph Lameter + Pekka Enberg , +Description: + The defrag_used_ratio file allows the control of how aggressive + slab fragmentation reduction works at reclaiming objects from + sparsely populated slabs. This is a percentage. If a slab has + less than this percentage of objects allocated then reclaim will + attempt to reclaim objects so that the whole slab page can be + freed. 0% specifies no reclaim attempt (defrag disabled), 100% + specifies attempt to reclaim all pages. The default is 30%. 
+ What: /sys/kernel/slab/cache/deactivate_to_tail Date: February 2008 KernelVersion: 2.6.25 diff --git a/include/linux/slab.h b/include/linux/slab.h index 886fc130334d..4bf381b34829 100644 --- a/include/linux/slab.h +++ b/include/linux/slab.h @@ -149,6 +149,7 @@ struct kmem_cache *kmem_cache_create_usercopy(const char *name, void (*ctor)(void *)); void kmem_cache_destroy(struct kmem_cache *); int kmem_cache_shrink(struct kmem_cache *); +unsigned long kmem_defrag_slabs(int node); void memcg_create_kmem_cache(struct
[RFC PATCH v4 02/15] tools/vm/slabinfo: Add support for -C and -M options
-C lists caches that use a ctor. -M lists caches that support object migration. Add command line options to show caches with a constructor and caches that are movable (i.e. have migrate function). Co-developed-by: Christoph Lameter Signed-off-by: Tobin C. Harding --- tools/vm/slabinfo.c | 40 1 file changed, 36 insertions(+), 4 deletions(-) diff --git a/tools/vm/slabinfo.c b/tools/vm/slabinfo.c index 73818f1b2ef8..cbfc56c44c2f 100644 --- a/tools/vm/slabinfo.c +++ b/tools/vm/slabinfo.c @@ -33,6 +33,7 @@ struct slabinfo { unsigned int hwcache_align, object_size, objs_per_slab; unsigned int sanity_checks, slab_size, store_user, trace; int order, poison, reclaim_account, red_zone; + int movable, ctor; unsigned long partial, objects, slabs, objects_partial, objects_total; unsigned long alloc_fastpath, alloc_slowpath; unsigned long free_fastpath, free_slowpath; @@ -67,6 +68,8 @@ int show_report; int show_alias; int show_slab; int skip_zero = 1; +int show_movable; +int show_ctor; int show_numa; int show_track; int show_first_alias; @@ -109,11 +112,13 @@ static void fatal(const char *x, ...) static void usage(void) { - printf("slabinfo 4/15/2011. (c) 2007 sgi/(c) 2011 Linux Foundation.\n\n" - "slabinfo [-aADefhilnosrStTvz1LXBU] [N=K] [-dafzput] [slab-regexp]\n" + printf("slabinfo 4/15/2017. 
(c) 2007 sgi/(c) 2011 Linux Foundation/(c) 2017 Jump Trading LLC.\n\n" + "slabinfo [-aACDefhilMnosrStTvz1LXBU] [N=K] [-dafzput] [slab-regexp]\n" + "-a|--aliases Show aliases\n" "-A|--activity Most active slabs first\n" "-B|--Bytes Show size in bytes\n" + "-C|--ctor Show slabs with ctors\n" "-D|--display-activeSwitch line format to activity\n" "-e|--empty Show empty slabs\n" "-f|--first-alias Show first alias\n" @@ -121,6 +126,7 @@ static void usage(void) "-i|--inverted Inverted list\n" "-l|--slabs Show slabs\n" "-L|--Loss Sort by loss\n" + "-M|--movable Show caches that support movable objects\n" "-n|--numa Show NUMA information\n" "-N|--lines=K Show the first K slabs\n" "-o|--ops Show kmem_cache_ops\n" @@ -588,6 +594,12 @@ static void slabcache(struct slabinfo *s) if (show_empty && s->slabs) return; + if (show_ctor && !s->ctor) + return; + + if (show_movable && !s->movable) + return; + if (sort_loss == 0) store_size(size_str, slab_size(s)); else @@ -602,6 +614,10 @@ static void slabcache(struct slabinfo *s) *p++ = '*'; if (s->cache_dma) *p++ = 'd'; + if (s->ctor) + *p++ = 'C'; + if (s->movable) + *p++ = 'M'; if (s->hwcache_align) *p++ = 'A'; if (s->poison) @@ -636,7 +652,8 @@ static void slabcache(struct slabinfo *s) printf("%-21s %8ld %7d %15s %14s %4d %1d %3ld %3ld %s\n", s->name, s->objects, s->object_size, size_str, dist_str, s->objs_per_slab, s->order, - s->slabs ? (s->partial * 100) / s->slabs : 100, + s->slabs ? (s->partial * 100) / + (s->slabs * s->objs_per_slab) : 100, s->slabs ? 
(s->objects * s->object_size * 100) / (s->slabs * (page_size << s->order)) : 100, flags); @@ -1256,6 +1273,13 @@ static void read_slab_dir(void) slab->alloc_node_mismatch = get_obj("alloc_node_mismatch"); slab->deactivate_bypass = get_obj("deactivate_bypass"); chdir(".."); + if (read_slab_obj(slab, "ops")) { + if (strstr(buffer, "ctor :")) + slab->ctor = 1; + if (strstr(buffer, "migrate :")) + slab->movable = 1; + } + if (slab->name[0] == ':') alias_targets++; slab++; @@ -1332,6 +1356,8 @@ static void xtotals(void) } struct option opts[] = { + { "ctor", no_argument, NULL, 'C' }, + { "movable", no_argument, NULL, 'M' }, { "aliases", no_argument, NULL, 'a' }, { "activity", no_argument, NULL, 'A' }, { "debug", optional_argument, NULL, 'd' }, @@ -1367,7 +1393,7 @@ int main(int argc, char *argv[]) page_size = getpagesize(); -
[RFC PATCH v4 03/15] slub: Sort slab cache list
It is advantageous to have all defragmentable slabs together at the beginning of the list of slabs so that there is no need to scan the complete list. Put defragmentable caches first when adding a slab cache and others last. Co-developed-by: Christoph Lameter Signed-off-by: Tobin C. Harding --- mm/slab_common.c | 2 +- mm/slub.c| 6 ++ 2 files changed, 7 insertions(+), 1 deletion(-) diff --git a/mm/slab_common.c b/mm/slab_common.c index 58251ba63e4a..db5e9a0b1535 100644 --- a/mm/slab_common.c +++ b/mm/slab_common.c @@ -393,7 +393,7 @@ static struct kmem_cache *create_cache(const char *name, goto out_free_cache; s->refcount = 1; - list_add(&s->list, &slab_caches); + list_add_tail(&s->list, &slab_caches); memcg_link_cache(s); out: if (err) diff --git a/mm/slub.c b/mm/slub.c index ae44d640b8c1..f6b0e4a395ef 100644 --- a/mm/slub.c +++ b/mm/slub.c @@ -4342,6 +4342,8 @@ void kmem_cache_setup_mobility(struct kmem_cache *s, return; } + mutex_lock(&slab_mutex); + s->isolate = isolate; s->migrate = migrate; @@ -4350,6 +4352,10 @@ void kmem_cache_setup_mobility(struct kmem_cache *s, * to disable fast cmpxchg based processing. */ s->flags &= ~__CMPXCHG_DOUBLE; + + list_move(&s->list, &slab_caches); /* Move to top */ + + mutex_unlock(&slab_mutex); } EXPORT_SYMBOL(kmem_cache_setup_mobility); -- 2.21.0
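The effect of this ordering can be illustrated with a short Python model. This is a sketch under assumptions, not the kernel code: a deque stands in for the slab_caches list, and Cache, add_cache(), setup_mobility() and defragmentable_caches() are names invented here. New caches go to the tail, caches that gain isolate/migrate callbacks are moved to the head, and a scan for defragmentable caches can therefore stop at the first cache that is not movable.

```python
from collections import deque


class Cache:
    def __init__(self, name):
        self.name = name
        self.movable = False


slab_caches = deque()


def add_cache(cache):
    # Counterpart of list_add_tail(): ordinary caches go last.
    slab_caches.append(cache)


def setup_mobility(cache):
    # Counterpart of list_move() to the head once callbacks are set.
    cache.movable = True
    slab_caches.remove(cache)
    slab_caches.appendleft(cache)


def defragmentable_caches():
    found = []
    for cache in slab_caches:
        if not cache.movable:
            break  # everything after this point is non-movable
        found.append(cache.name)
    return found


a, b, c = Cache("a"), Cache("b"), Cache("c")
for cache in (a, b, c):
    add_cache(cache)

setup_mobility(c)
setup_mobility(a)

order = [cache.name for cache in slab_caches]
movable_names = defragmentable_caches()
```

In this run the list ends up as a, c, b, and the scan visits only a and c before stopping at b.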
[RFC PATCH v4 01/15] slub: Add isolate() and migrate() methods
Add the two methods needed for moving objects and enable the display of the callbacks via the /sys/kernel/slab interface. Add documentation explaining the use of these methods and the prototypes for slab.h. Add functions to setup the callbacks method for a slab cache. Add empty functions for SLAB/SLOB. The API is generic so it could be theoretically implemented for these allocators as well. Change sysfs 'ctor' field to be 'ops' to contain all the callback operations defined for a slab cache. Display the existing 'ctor' callback in the ops fields contents along with 'isolate' and 'migrate' callbacks. Co-developed-by: Christoph Lameter Signed-off-by: Tobin C. Harding --- include/linux/slab.h | 70 include/linux/slub_def.h | 3 ++ mm/slub.c| 59 + 3 files changed, 126 insertions(+), 6 deletions(-) diff --git a/include/linux/slab.h b/include/linux/slab.h index 9449b19c5f10..886fc130334d 100644 --- a/include/linux/slab.h +++ b/include/linux/slab.h @@ -154,6 +154,76 @@ void memcg_create_kmem_cache(struct mem_cgroup *, struct kmem_cache *); void memcg_deactivate_kmem_caches(struct mem_cgroup *); void memcg_destroy_kmem_caches(struct mem_cgroup *); +/* + * Function prototypes passed to kmem_cache_setup_mobility() to enable + * mobile objects and targeted reclaim in slab caches. + */ + +/** + * typedef kmem_cache_isolate_func - Object migration callback function. + * @s: The cache we are working on. + * @ptr: Pointer to an array of pointers to the objects to isolate. + * @nr: Number of objects in @ptr array. + * + * The purpose of kmem_cache_isolate_func() is to pin each object so that + * they cannot be freed until kmem_cache_migrate_func() has processed + * them. This may be accomplished by increasing the refcount or setting + * a flag. + * + * The object pointer array passed is also passed to + * kmem_cache_migrate_func(). The function may remove objects from the + * array by setting pointers to %NULL. 
This is useful if we can + * determine that an object is being freed because + * kmem_cache_isolate_func() was called when the subsystem was calling + * kmem_cache_free(). In that case it is not necessary to increase the + * refcount or specially mark the object because the release of the slab + * lock will lead to the immediate freeing of the object. + * + * Context: Called with locks held so that the slab objects cannot be + * freed. We are in an atomic context and no slab operations + * may be performed. + * Return: A pointer that is passed to the migrate function. If any + * objects cannot be touched at this point then the pointer may + * indicate a failure and then the migration function can simply + * remove the references that were already obtained. The private + * data could be used to track the objects that were already pinned. + */ +typedef void *kmem_cache_isolate_func(struct kmem_cache *s, void **ptr, int nr); + +/** + * typedef kmem_cache_migrate_func - Object migration callback function. + * @s: The cache we are working on. + * @ptr: Pointer to an array of pointers to the objects to migrate. + * @nr: Number of objects in @ptr array. + * @node: The NUMA node where the object should be allocated. + * @private: The pointer returned by kmem_cache_isolate_func(). + * + * This function is responsible for migrating objects. Typically, for + * each object in the input array you will want to allocate an new + * object, copy the original object, update any pointers, and free the + * old object. + * + * After this function returns all pointers to the old object should now + * point to the new object. + * + * Context: Called with no locks held and interrupts enabled. Sleeping + * is possible. Any operation may be performed. + */ +typedef void kmem_cache_migrate_func(struct kmem_cache *s, void **ptr, +int nr, int node, void *private); + +/* + * kmem_cache_setup_mobility() is used to setup callbacks for a slab cache. 
+ */ +#ifdef CONFIG_SLUB +void kmem_cache_setup_mobility(struct kmem_cache *, kmem_cache_isolate_func, + kmem_cache_migrate_func); +#else +static inline void +kmem_cache_setup_mobility(struct kmem_cache *s, kmem_cache_isolate_func isolate, + kmem_cache_migrate_func migrate) {} +#endif + /* * Please use this macro to create slab caches. Simply specify the * name of the structure and maybe some flags that are listed above. diff --git a/include/linux/slub_def.h b/include/linux/slub_def.h index d2153789bd9f..2879a2f5f8eb 100644 --- a/include/linux/slub_def.h +++ b/include/linux/slub_def.h @@ -99,6 +99,9 @@ struct kmem_cache { gfp_t allocflags; /* gfp flags to use on each alloc */ int refcount; /* Refcount for slab cache destroy */ void (*ctor)(void *); + kmem_cache_isolate_func *isolate; +
[RFC PATCH v4 00/15] Slab Movable Objects (SMO)
Hi, Another iteration of the SMO patch set; updates to this version are restricted to the dcache patch #14. Applies on top of Linus' tree (tag: v5.1-rc6). This is a patch set implementing movable objects within the SLUB allocator. This is work based on Christoph Lameter's patch set: https://lore.kernel.org/patchwork/project/lkml/list/?series=377335 The original code logic is from that set and implemented by Christoph. Clean up, refactoring, documentation, and additional features by myself. Responsibility for any bugs remaining falls solely with myself. Changes to this version: Re-write the dcache Slab Movable Objects isolate/migrate functions. Based on review/suggestions by Alexander on the last version. In this version the isolate function loops over the object vector and builds a shrink list for all objects that have refcount==0 AND are NOT on anyone else's shrink list. A pointer to this list is returned from the isolate function and passed to the migrate function (by the SMO infrastructure). The dentry migration function d_partial_shrink() simply calls shrink_dentry_list() on the received shrink list pointer and frees the memory associated with the list_head. Hopefully if this is all ok I can move on to violating the inode slab cache :) FWIW testing on a VM in Qemu brings this mild benefit to the dentry slab cache with no _apparent_ negatives. 
CONFIG_SLUB_DEBUG=y CONFIG_SLUB=y CONFIG_SLUB_CPU_PARTIAL=y CONFIG_SLUB_DEBUG_ON=y CONFIG_SLUB_STATS=y CONFIG_SMO_NODE=y CONFIG_DCACHE_SMO=y [root@vm ~]# slabinfo dentry -r | head -n 13 Slabcache: dentry Aliases: 0 Order : 1 Objects: 38585 ** Reclaim accounting active ** Defragmentation at 30% Sizes (bytes) Slabs DebugMemory Object : 192 Total :2582 Sanity Checks : On Total: 21151744 SlabObj: 528 Full :2547 Redzoning : On Used : 7408320 SlabSiz:8192 Partial: 35 Poisoning : On Loss : 13743424 Loss : 336 CpuSlab: 0 Tracking : On Lalig: 12964560 Align : 8 Objects: 15 Tracing : Off Lpadd: 702304 [root@vm ~]# slabinfo dentry --shrink [root@vm ~]# slabinfo dentry -r | head -n 13 Slabcache: dentry Aliases: 0 Order : 1 Objects: 38426 ** Reclaim accounting active ** Defragmentation at 30% Sizes (bytes) Slabs DebugMemory Object : 192 Total :2578 Sanity Checks : On Total: 21118976 SlabObj: 528 Full :2547 Redzoning : On Used : 7377792 SlabSiz:8192 Partial: 31 Poisoning : On Loss : 13741184 Loss : 336 CpuSlab: 0 Tracking : On Lalig: 12911136 Align : 8 Objects: 15 Tracing : Off Lpadd: 701216 Please note, this dentry shrink implementation is 'best effort', results vary. This is as is expected. We are trying to unobtrusively shrink the dentry cache. thanks, Tobin. Tobin C. 
Harding (15): slub: Add isolate() and migrate() methods tools/vm/slabinfo: Add support for -C and -M options slub: Sort slab cache list slub: Slab defrag core tools/vm/slabinfo: Add remote node defrag ratio output tools/vm/slabinfo: Add defrag_used_ratio output tools/testing/slab: Add object migration test module tools/testing/slab: Add object migration test suite xarray: Implement migration function for objects tools/testing/slab: Add XArray movable objects tests slub: Enable moving objects to/from specific nodes slub: Enable balancing slabs across nodes dcache: Provide a dentry constructor dcache: Implement partial shrink via Slab Movable Objects dcache: Add CONFIG_DCACHE_SMO Documentation/ABI/testing/sysfs-kernel-slab | 14 + fs/dcache.c | 110 ++- include/linux/slab.h| 71 ++ include/linux/slub_def.h| 10 + lib/radix-tree.c| 13 + lib/xarray.c| 49 ++ mm/Kconfig | 14 + mm/slab_common.c| 2 +- mm/slub.c | 819 ++-- tools/testing/slab/Makefile | 10 + tools/testing/slab/slub_defrag.c| 567 ++ tools/testing/slab/slub_defrag.py | 451 +++ tools/testing/slab/slub_defrag_xarray.c | 211 + tools/vm/slabinfo.c | 51 +- 14 files changed, 2299 insertions(+), 93 deletions(-) create mode 100644 tools/testing/slab/Makefile create mode 100644 tools/testing/slab/slub_defrag.c create mode 100755 tools/testing/slab/slub_defrag.py create mode 100644 tools/testing/slab/slub_defrag_xarray.c -- 2.21.0
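The dcache isolate/migrate flow described in the cover letter can be restated as a small Python model. This is only a sketch of the description above, not the code in fs/dcache.c: the Dentry class, its fields and both function bodies are invented for illustration, and "migration" here simply means handing the isolated objects to a shrinker, as the cover letter says d_partial_shrink() does.

```python
class Dentry:
    def __init__(self, name, refcount=0, on_shrink_list=False):
        self.name = name
        self.refcount = refcount
        self.on_shrink_list = on_shrink_list


def isolate(objects):
    """Build a private shrink list from the object vector.

    Only objects with refcount == 0 that are not already on someone
    else's shrink list are taken.
    """
    shrink_list = []
    for dentry in objects:
        if dentry.refcount == 0 and not dentry.on_shrink_list:
            dentry.on_shrink_list = True
            shrink_list.append(dentry)
    return shrink_list


def migrate(objects, private):
    """Receive what isolate() returned and dispose of those objects."""
    shrunk = [dentry.name for dentry in private]
    private.clear()  # stand-in for freeing the list after shrinking
    return shrunk


vector = [
    Dentry("a"),
    Dentry("b", refcount=2),          # in use, left alone
    Dentry("c", on_shrink_list=True), # someone else is shrinking it
    Dentry("d"),
]

private = isolate(vector)
isolated_names = [dentry.name for dentry in private]
shrunk = migrate(vector, private)
```

Objects that are busy or already claimed are skipped, which matches the "best effort" behaviour noted above.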
Re: [PATCH -next] ASoC: sprd: Fix to use list_for_each_entry_safe() when delete items
Hi, On Mon, 29 Apr 2019 at 20:27, Wei Yongjun wrote: > > Since we will remove items off the list using list_del() we need > to use a safe version of the list_for_each_entry() macro aptly named > list_for_each_entry_safe(). > > Fixes: d7bff893e04f ("ASoC: sprd: Add Spreadtrum multi-channel data transfer > support") > Signed-off-by: Wei Yongjun Yes, thanks for your fixes. Reviewed-by: Baolin Wang > --- > sound/soc/sprd/sprd-mcdt.c | 6 +++--- > 1 file changed, 3 insertions(+), 3 deletions(-) > > diff --git a/sound/soc/sprd/sprd-mcdt.c b/sound/soc/sprd/sprd-mcdt.c > index 28f5e649733d..df250f7f2b6f 100644 > --- a/sound/soc/sprd/sprd-mcdt.c > +++ b/sound/soc/sprd/sprd-mcdt.c > @@ -978,12 +978,12 @@ static int sprd_mcdt_probe(struct platform_device *pdev) > > static int sprd_mcdt_remove(struct platform_device *pdev) > { > - struct sprd_mcdt_chan *temp; > + struct sprd_mcdt_chan *chan, *temp; > > mutex_lock(&sprd_mcdt_list_mutex); > > - list_for_each_entry(temp, &sprd_mcdt_chan_list, list) > - list_del(&temp->list); > + list_for_each_entry_safe(chan, temp, &sprd_mcdt_chan_list, list) > + list_del(&chan->list); > > mutex_unlock(&sprd_mcdt_list_mutex); > > > -- Baolin Wang Best Regards
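Why the _safe variant matters can be shown with a userspace model of a circular doubly linked list. The Python below is a sketch, not the kernel's list.h: ListHead, list_add_tail(), list_del(), for_each() and for_each_safe() are re-implemented here for illustration, and POISON stands in for the poison values the kernel writes into a deleted entry's pointers. The plain iterator reads pos.next after the loop body has already deleted pos, while the safe iterator saves the next pointer first.

```python
POISON = object()  # stand-in for the kernel's list poison values


class ListHead:
    def __init__(self, name=None):
        self.name = name
        self.next = self
        self.prev = self


def list_add_tail(new, head):
    prev = head.prev
    new.prev, new.next = prev, head
    prev.next = new
    head.prev = new


def list_del(entry):
    entry.prev.next = entry.next
    entry.next.prev = entry.prev
    entry.next = entry.prev = POISON


def for_each(head):
    pos = head.next
    while pos is not head:
        yield pos
        pos = pos.next       # already POISON if the body deleted pos


def for_each_safe(head):
    pos = head.next
    while pos is not head:
        nxt = pos.next       # saved before the body can delete pos
        yield pos
        pos = nxt


def make_list():
    head = ListHead("head")
    for name in ("chan0", "chan1", "chan2"):
        list_add_tail(ListHead(name), head)
    return head


# Safe traversal: every entry is removed and the list ends up empty.
head = make_list()
removed = []
for chan in for_each_safe(head):
    list_del(chan)
    removed.append(chan.name)
list_is_empty = head.next is head

# Unsafe traversal: the second step walks into the poisoned pointer.
unsafe_failed = False
try:
    for chan in for_each(make_list()):
        list_del(chan)
except AttributeError:
    unsafe_failed = True
```

In the model the unsafe loop fails with an AttributeError on the poison object; in the kernel the equivalent is a dereference of a poisoned pointer.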
Re: [PATCH V2] staging: fieldbus: anybus-s: force endiannes annotation
On Tue, Apr 30, 2019 at 04:22:38AM +0200, Nicholas Mc Guire wrote: > On Mon, Apr 29, 2019 at 10:03:36AM -0400, Sven Van Asbroeck wrote: > > On Mon, Apr 29, 2019 at 2:11 AM Nicholas Mc Guire wrote: > > > > > > V2: As requested by Sven Van Asbroeck make the > > > impact of the patch clear in the commit message. > > > > Thank you, but did you miss my comment about creating a local variable > > instead? See: > > https://lkml.org/lkml/2019/4/28/97 > > Did not miss it - I just don't think that makes it any more > understandable - the __force __be16 makes it clear I believe > that this is correct, sparse does not like this though - so tell > sparse. ... to STFU, 'cause you know better. The trouble is, how do we (or yourself a year or two later) know *why* it is correct? Worse, how do we (or yourself, etc.) know if a change about to be done to the code won't invalidate the proof of yours? > The local variable would need to be explained as it is > functionally not necessary - therefor I find it more confusing > that using __force here. What's confusing is mixing host- and fixed-endian values in the same variable at different times. Treat those as unrelated types that happen to have the same sizeof. Quite a few of __force instances in the tree should be taken out and shot. Don't add to their number.
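The advice to treat host-endian and fixed-endian values as unrelated types can be illustrated with a small userspace sketch. Everything here is an assumption made for illustration: be16_t, cpu_to_be16_sketch() and be16_to_cpu_sketch() are hypothetical names, not the kernel's __be16 or its conversion helpers, and sparse is not involved. Wrapping the wire value in a struct makes the compiler itself reject any accidental mixing, so no cast is needed to silence a checker.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical wrapper: a 16-bit value stored in big-endian byte order.
 * It has the same size as uint16_t but is a distinct, incompatible type. */
typedef struct { uint8_t b[2]; } be16_t;

/* Convert a host-endian value to its big-endian wire representation. */
static be16_t cpu_to_be16_sketch(uint16_t host)
{
    be16_t wire = { { (uint8_t)(host >> 8), (uint8_t)(host & 0xff) } };

    return wire;
}

/* Convert a big-endian wire value back to a host-endian value. */
static uint16_t be16_to_cpu_sketch(be16_t wire)
{
    return (uint16_t)((wire.b[0] << 8) | wire.b[1]);
}

/* Usage: each representation lives in its own variable for its whole life. */
static void be16_selfcheck(void)
{
    uint16_t host = 0x1234;                  /* host-endian only */
    be16_t wire = cpu_to_be16_sketch(host);  /* wire format only */

    assert(be16_to_cpu_sketch(wire) == host);
}
```

An assignment such as host = wire does not compile with these definitions, which is the property the separate local variable is meant to preserve.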
RE: [PATCH] clk: imx: pllv3: Fix fall through build warning
> From: Anson Huang > Sent: Tuesday, April 30, 2019 9:55 AM > Subject: [PATCH] clk: imx: pllv3: Fix fall through build warning > > Fix below fall through build warning: > > drivers/clk/imx/clk-pllv3.c:453:21: warning: > this statement may fall through [-Wimplicit-fallthrough=] > >pll->denom_offset = PLL_IMX7_DENOM_OFFSET; > ^ > drivers/clk/imx/clk-pllv3.c:454:2: note: here > case IMX_PLLV3_AV: > ^~~~ > > Signed-off-by: Anson Huang Reviewed-by: Dong Aisheng Regards Dong Aisheng
Re: [PATCH -next] ASoC: sprd: Fix return value check in sprd_mcdt_probe()
On Mon, 29 Apr 2019 at 20:15, Wei Yongjun wrote:
>
> In case of error, the function devm_ioremap_resource() returns ERR_PTR()
> and never returns NULL. The NULL test in the return value check should
> be replaced with IS_ERR().
>
> Fixes: d7bff893e04f ("ASoC: sprd: Add Spreadtrum multi-channel data transfer support")
> Signed-off-by: Wei Yongjun

Thanks for fixing my mistake.
Reviewed-by: Baolin Wang

> ---
>  sound/soc/sprd/sprd-mcdt.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/sound/soc/sprd/sprd-mcdt.c b/sound/soc/sprd/sprd-mcdt.c
> index 28f5e649733d..e9318d7a4810 100644
> --- a/sound/soc/sprd/sprd-mcdt.c
> +++ b/sound/soc/sprd/sprd-mcdt.c
> @@ -951,8 +951,8 @@ static int sprd_mcdt_probe(struct platform_device *pdev)
>
>         res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
>         mcdt->base = devm_ioremap_resource(&pdev->dev, res);
> -       if (!mcdt->base)
> -               return -ENOMEM;
> +       if (IS_ERR(mcdt->base))
> +               return PTR_ERR(mcdt->base);
>
>         mcdt->dev = &pdev->dev;
>         spin_lock_init(&mcdt->lock);
>
>

-- 
Baolin Wang
Best Regards
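Why a NULL test misses the failure can be shown with a userspace imitation of the error-pointer idiom. This is a sketch under assumptions, not the kernel's err.h: the *_sketch helpers, MAX_ERRNO_SKETCH, map_resource_sketch() and probe_sketch() are all hypothetical names. The idea is that a small negative errno is encoded in the top of the address range, so a failing call returns a non-NULL pointer.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO_SKETCH 4095    /* size of the reserved error range (assumed) */

/* Encode a negative errno value as a pointer. */
static void *err_ptr_sketch(long err)
{
    return (void *)(intptr_t)err;
}

/* True if the pointer lies in the reserved error range. */
static int is_err_sketch(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)(-MAX_ERRNO_SKETCH);
}

/* Decode the errno value from an error pointer. */
static long ptr_err_sketch(const void *p)
{
    assert(is_err_sketch(p));
    return (long)(intptr_t)p;
}

static char fake_regs[16];

/* Hypothetical stand-in for a mapping call that reports failure with an
 * encoded error pointer and never with NULL. */
static void *map_resource_sketch(int have_resource)
{
    if (!have_resource)
        return err_ptr_sketch(-EINVAL);
    return fake_regs;
}

/* Probe-style caller using the corrected check. */
static int probe_sketch(int have_resource)
{
    void *base = map_resource_sketch(have_resource);

    if (is_err_sketch(base))    /* a "!base" test would not catch this */
        return (int)ptr_err_sketch(base);
    return 0;
}
```

With the old "!base" check, the failing case would fall through and the encoded error would later be dereferenced as if it were a mapped address.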
Re: INFO: task hung in __get_super
On Sun 28-04-19 19:51:09, Al Viro wrote: > On Sun, Apr 28, 2019 at 11:14:06AM -0700, syzbot wrote: > > down_read+0x49/0x90 kernel/locking/rwsem.c:26 > > __get_super.part.0+0x203/0x2e0 fs/super.c:788 > > __get_super include/linux/spinlock.h:329 [inline] > > get_super+0x2e/0x50 fs/super.c:817 > > fsync_bdev+0x19/0xd0 fs/block_dev.c:525 > > invalidate_partition+0x36/0x60 block/genhd.c:1581 > > drop_partitions block/partition-generic.c:443 [inline] > > rescan_partitions+0xef/0xa20 block/partition-generic.c:516 > > __blkdev_reread_part+0x1a2/0x230 block/ioctl.c:173 > > blkdev_reread_part+0x27/0x40 block/ioctl.c:193 > > loop_reread_partitions+0x1c/0x40 drivers/block/loop.c:633 > > loop_set_status+0xe57/0x1380 drivers/block/loop.c:1296 > > loop_set_status64+0xc2/0x120 drivers/block/loop.c:1416 > > lo_ioctl+0x8fc/0x2150 drivers/block/loop.c:1559 > > __blkdev_driver_ioctl block/ioctl.c:303 [inline] > > blkdev_ioctl+0x6f2/0x1d10 block/ioctl.c:605 > > block_ioctl+0xee/0x130 fs/block_dev.c:1933 > > vfs_ioctl fs/ioctl.c:46 [inline] > > file_ioctl fs/ioctl.c:509 [inline] > > do_vfs_ioctl+0xd6e/0x1390 fs/ioctl.c:696 > > ksys_ioctl+0xab/0xd0 fs/ioctl.c:713 > > __do_sys_ioctl fs/ioctl.c:720 [inline] > > __se_sys_ioctl fs/ioctl.c:718 [inline] > > __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718 > > do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290 > > entry_SYSCALL_64_after_hwframe+0x49/0xbe > > ioctl(..., BLKRRPART) blocked on ->s_umount in __get_super(). 
> The trouble is, the only things holding ->s_umount appears to be > these: > > > 2 locks held by syz-executor274/11716: > > #0: a19e2025 (>s_umount_key#38/1){+.+.}, at: > > alloc_super+0x158/0x890 fs/super.c:228 > > #1: bde6230e (loop_ctl_mutex){+.+.}, at: lo_simple_ioctl > > drivers/block/loop.c:1514 [inline] > > #1: bde6230e (loop_ctl_mutex){+.+.}, at: lo_ioctl+0x266/0x2150 > > drivers/block/loop.c:1572 > > > 2 locks held by syz-executor274/11717: > > #0: e185c083 (>s_umount_key#38/1){+.+.}, at: > > alloc_super+0x158/0x890 fs/super.c:228 > > #1: bde6230e (loop_ctl_mutex){+.+.}, at: lo_simple_ioctl > > drivers/block/loop.c:1514 [inline] > > #1: bde6230e (loop_ctl_mutex){+.+.}, at: lo_ioctl+0x266/0x2150 > > drivers/block/loop.c:1572 > > ... and that's bollocks. ->s_umount held there is that on freshly allocated > superblock. It *MUST* be in mount(2); no other syscall should be able to > call alloc_super() in the first place. So what the hell is that doing > trying to call lo_ioctl() inside mount(2)? Something like isofs attempting > cdrom ioctls on the underlying device? Actually UDF also calls CDROMMULTISESSION ioctl during mount. So I could see how we get to lo_simple_ioctl() and indeed that would acquire loop_ctl_mutex under s_umount which is the other way around than in BLKRRPART ioctl. > Why do we have loop_func_table->ioctl(), BTW? All in-tree instances are > either NULL or return -EINVAL unconditionally. Considering that the > caller is > err = lo->ioctl ? lo->ioctl(lo, cmd, arg) : -EINVAL; > we could bloody well just get rid of cryptoloop_ioctl() (the only > non-NULL instance) and get rid of calling lo_simple_ioctl() in > lo_ioctl() switch's default. Yeah, you're right. And if we push the patch a bit further to not take loop_ctl_mutex for invalid ioctl number, that would fix the problem. I can send a fix. 
Honza > > Something like this: > > diff --git a/drivers/block/cryptoloop.c b/drivers/block/cryptoloop.c > index 254ee7d54e91..f16468a562f5 100644 > --- a/drivers/block/cryptoloop.c > +++ b/drivers/block/cryptoloop.c > @@ -167,12 +167,6 @@ cryptoloop_transfer(struct loop_device *lo, int cmd, > } > > static int > -cryptoloop_ioctl(struct loop_device *lo, int cmd, unsigned long arg) > -{ > - return -EINVAL; > -} > - > -static int > cryptoloop_release(struct loop_device *lo) > { > struct crypto_sync_skcipher *tfm = lo->key_data; > @@ -188,7 +182,6 @@ cryptoloop_release(struct loop_device *lo) > static struct loop_func_table cryptoloop_funcs = { > .number = LO_CRYPT_CRYPTOAPI, > .init = cryptoloop_init, > - .ioctl = cryptoloop_ioctl, > .transfer = cryptoloop_transfer, > .release = cryptoloop_release, > .owner = THIS_MODULE > diff --git a/drivers/block/loop.c b/drivers/block/loop.c > index bf1c61cab8eb..2ec162b80562 100644 > --- a/drivers/block/loop.c > +++ b/drivers/block/loop.c > @@ -955,7 +955,6 @@ static int loop_set_fd(struct loop_device *lo, fmode_t > mode, > lo->lo_flags = lo_flags; > lo->lo_backing_file = file; > lo->transfer = NULL; > - lo->ioctl = NULL; > lo->lo_sizelimit = 0; > lo->old_gfp_mask = mapping_gfp_mask(mapping); > mapping_set_gfp_mask(mapping, lo->old_gfp_mask & ~(__GFP_IO|__GFP_FS)); > @@ -1064,7 +1063,6 @@ static int __loop_clr_fd(struct loop_device *lo, bool > release) > >
Re: [PATCH 3/4] x86/ftrace: make ftrace_int3_handler() not to skip fops invocation
On Mon, Apr 29, 2019 at 5:45 PM Sean Christopherson wrote: > > On Mon, Apr 29, 2019 at 05:08:46PM -0700, Sean Christopherson wrote: > > > > It's 486 based, but either way I suspect the answer is "yes". IIRC, > > Knights Corner, a.k.a. Larrabee, also had funkiness around SMM and that > > was based on P54C, though I'm struggling to recall exactly what the > > Larrabee weirdness was. > > Aha! Found an ancient comment that explicitly states P5 does not block > NMI/SMI in the STI shadow, while P6 does block NMI/SMI. Ok, so the STI shadow really wouldn't be reliable on those machines. Scary. Of course, the good news is that hopefully nobody has them any more, and if they do, they presumably don't use fancy NMI profiling etc, so any actual NMI's are probably relegated purely to largely rare and effectively fatal errors anyway (ie memory parity errors). Linus
Re: [PATCH] quota: set init_needed flag only when successfully getting dquot
On 4/30/19 5:49 AM, Jan Kara wrote:
> On Sun 28-04-19 13:39:21, Chengguang Xu wrote:
>> Set init_needed flag only when successfully getting dquot,
>> so that we can skip unnecessary subsequent operation.
>>
>> Signed-off-by: Chengguang Xu
>
> Thanks for the patch but I don't think it's really useful. It will be very
> rare that we race with quotaoff or dqget() fails due to error. So the
> additional overhead of iterating over dquots doesn't really matter in that
> case.

Hi Jan,

Thanks for the comment, I got it.

Chengguang.
Re: [PATCH V2] staging: fieldbus: anybus-s: force endiannes annotation
On Mon, Apr 29, 2019 at 10:03:36AM -0400, Sven Van Asbroeck wrote:
> On Mon, Apr 29, 2019 at 2:11 AM Nicholas Mc Guire wrote:
> >
> > V2: As requested by Sven Van Asbroeck make the
> > impact of the patch clear in the commit message.
>
> Thank you, but did you miss my comment about creating a local variable
> instead? See:
> https://lkml.org/lkml/2019/4/28/97

Did not miss it - I just don't think that makes it any more understandable - the __force __be16 makes it clear, I believe, that this is correct; sparse does not like this though - so tell sparse. The local variable would need to be explained as it is functionally not necessary - therefore I find it more confusing than using __force here.

If that rationale is wrong let me know.

thx!
hofrat
[PATCH] treewide: fix awk regexp over-escaping
Fix "warning: regexp escape sequence is not a known regexp operator" on gawk 5.0.0. Results found by: - grepping '\\[^\[\\^$.|?*+()a-z]' on *.awk - grepping 'awk.*\\[^\[\\^$.|?*+()a-z]' - running awk --lint -f /dev/null on *.awk Signed-off-by: Alex Xu (Hello71) --- Documentation/arm/Samsung/clksrc-change-registers.awk | 2 +- arch/x86/tools/gen-insn-attr-x86.awk | 4 ++-- lib/raid6/unroll.awk | 2 +- tools/objtool/arch/x86/tools/gen-insn-attr-x86.awk | 4 ++-- tools/perf/arch/x86/tests/gen-insn-x86-dat.awk | 2 +- tools/perf/util/intel-pt-decoder/gen-insn-attr-x86.awk | 4 ++-- 6 files changed, 9 insertions(+), 9 deletions(-) diff --git a/Documentation/arm/Samsung/clksrc-change-registers.awk b/Documentation/arm/Samsung/clksrc-change-registers.awk index 7be1b8aa7cd9..d853f750c861 100755 --- a/Documentation/arm/Samsung/clksrc-change-registers.awk +++ b/Documentation/arm/Samsung/clksrc-change-registers.awk @@ -67,7 +67,7 @@ BEGIN { # to replace and create an associative array of values while (getline line < ARGV[1] > 0) { - if (line ~ /\#define.*_MASK/ && + if (line ~ /#define.*_MASK/ && !(line ~ /USB_SIG_MASK/)) { splitdefine(line, fields) name = fields[0] diff --git a/arch/x86/tools/gen-insn-attr-x86.awk b/arch/x86/tools/gen-insn-attr-x86.awk index b02a36b2c14f..a42015b305f4 100644 --- a/arch/x86/tools/gen-insn-attr-x86.awk +++ b/arch/x86/tools/gen-insn-attr-x86.awk @@ -69,7 +69,7 @@ BEGIN { lprefix1_expr = "\\((66|!F3)\\)" lprefix2_expr = "\\(F3\\)" - lprefix3_expr = "\\((F2|!F3|66\\)\\)" + lprefix3_expr = "\\((F2|!F3|66)\\)" lprefix_expr = "\\((66|F2|F3)\\)" max_lprefix = 4 @@ -257,7 +257,7 @@ function convert_operands(count,opnd, i,j,imm,mod) return add_flags(imm, mod) } -/^[0-9a-f]+\:/ { +/^[0-9a-f]+:/ { if (NR == 1) next # get index diff --git a/lib/raid6/unroll.awk b/lib/raid6/unroll.awk index c6aa03631df8..0809805a7e23 100644 --- a/lib/raid6/unroll.awk +++ b/lib/raid6/unroll.awk @@ -13,7 +13,7 @@ BEGIN { for (i = 0; i < rep; ++i) { tmp = $0 gsub(/\$\$/, i, tmp) - 
gsub(/\$\#/, n, tmp) + gsub(/\$#/, n, tmp) gsub(/\$\*/, "$", tmp) print tmp } diff --git a/tools/objtool/arch/x86/tools/gen-insn-attr-x86.awk b/tools/objtool/arch/x86/tools/gen-insn-attr-x86.awk index b02a36b2c14f..a42015b305f4 100644 --- a/tools/objtool/arch/x86/tools/gen-insn-attr-x86.awk +++ b/tools/objtool/arch/x86/tools/gen-insn-attr-x86.awk @@ -69,7 +69,7 @@ BEGIN { lprefix1_expr = "\\((66|!F3)\\)" lprefix2_expr = "\\(F3\\)" - lprefix3_expr = "\\((F2|!F3|66\\)\\)" + lprefix3_expr = "\\((F2|!F3|66)\\)" lprefix_expr = "\\((66|F2|F3)\\)" max_lprefix = 4 @@ -257,7 +257,7 @@ function convert_operands(count,opnd, i,j,imm,mod) return add_flags(imm, mod) } -/^[0-9a-f]+\:/ { +/^[0-9a-f]+:/ { if (NR == 1) next # get index diff --git a/tools/perf/arch/x86/tests/gen-insn-x86-dat.awk b/tools/perf/arch/x86/tests/gen-insn-x86-dat.awk index a21454835cd4..27585d032ee6 100644 --- a/tools/perf/arch/x86/tests/gen-insn-x86-dat.awk +++ b/tools/perf/arch/x86/tests/gen-insn-x86-dat.awk @@ -31,7 +31,7 @@ BEGIN { going = 0 } -/^\s*[0-9a-fA-F]+\:/ { +/^\s*[0-9a-fA-F]+:/ { if (going) { colon_pos = index($0, ":") useful_line = substr($0, colon_pos + 1) diff --git a/tools/perf/util/intel-pt-decoder/gen-insn-attr-x86.awk b/tools/perf/util/intel-pt-decoder/gen-insn-attr-x86.awk index ddd5c4c21129..606ccd154392 100644 --- a/tools/perf/util/intel-pt-decoder/gen-insn-attr-x86.awk +++ b/tools/perf/util/intel-pt-decoder/gen-insn-attr-x86.awk @@ -69,7 +69,7 @@ BEGIN { lprefix1_expr = "\\((66|!F3)\\)" lprefix2_expr = "\\(F3\\)" - lprefix3_expr = "\\((F2|!F3|66\\)\\)" + lprefix3_expr = "\\((F2|!F3|66)\\)" lprefix_expr = "\\((66|F2|F3)\\)" max_lprefix = 4 @@ -257,7 +257,7 @@ function convert_operands(count,opnd, i,j,imm,mod) return add_flags(imm, mod) } -/^[0-9a-f]+\:/ { +/^[0-9a-f]+:/ { if (NR == 1) next # get index -- 2.21.0
Re: [PATCH v3 5/8] iommu/vt-d: Implement def_domain_type iommu ops entry
Hi Christoph,

On 4/30/19 4:03 AM, Christoph Hellwig wrote:
>> @@ -3631,35 +3607,30 @@ static int iommu_no_mapping(struct device *dev)
>>          if (iommu_dummy(dev))
>>                  return 1;
>>
>> -        if (!iommu_identity_mapping)
>> -                return 0;
>> -
>
> FYI, iommu_no_mapping has been refactored in for-next:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git/commit/?h=x86/vt-d=48b2c937ea37a3bece0094b46450ed5267525289

Oh, yes! Thanks for letting me know this. Will rebase the code.

>>          found = identity_mapping(dev);
>>          if (found) {
>> +                /*
>> +                 * If the device's dma_mask is less than the system's memory
>> +                 * size then this is not a candidate for identity mapping.
>> +                 */
>> +                u64 dma_mask = *dev->dma_mask;
>> +
>> +                if (dev->coherent_dma_mask &&
>> +                    dev->coherent_dma_mask < dma_mask)
>> +                        dma_mask = dev->coherent_dma_mask;
>> +
>> +                if (dma_mask < dma_get_required_mask(dev)) {
>
> I know this is mostly existing code moved around, but it really needs
> some fixing. For one dma_get_required_mask is supposed to return the
> required to not bounce mask for the given device. E.g. for a device
> behind an iommu it should always just return 32-bit. If you really
> want to check vs system memory please call dma_direct_get_required_mask
> without the dma_ops indirection.
>
> Second I don't even think we need to check the coherent_dma_mask,
> dma_direct is pretty good at always finding memory even without
> an iommu.
>
> Third this doesn't take the bus_dma_mask into account.
>
> This probably should just be:
>
>         if (min(*dev->dma_mask, dev->bus_dma_mask) <
>             dma_direct_get_required_mask(dev)) {

Agreed and will add this in the next version.

Best regards,
Lu Baolu
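The comparison being proposed can be tried out in a small userspace sketch. The names here (struct dev_sketch, effective_dma_mask(), can_identity_map()) are hypothetical and not kernel API, and the required mask is passed in as a plain number instead of being derived from system memory. One extra assumption is labeled in the code: a bus_dma_mask of 0 is treated as "no bus limit", which the plain min() in the mail would not do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified device description; not the kernel's struct device. */
struct dev_sketch {
    uint64_t dma_mask;      /* addresses the device itself can drive */
    uint64_t bus_dma_mask;  /* upstream bus limit; 0 means "no limit" (assumption) */
};

/* The tighter of the device mask and the bus mask. */
static uint64_t effective_dma_mask(const struct dev_sketch *dev)
{
    assert(dev != NULL);
    if (dev->bus_dma_mask && dev->bus_dma_mask < dev->dma_mask)
        return dev->bus_dma_mask;
    return dev->dma_mask;
}

/* A device is an identity-mapping candidate only if it can address
 * everything covered by required_mask without bouncing. */
static int can_identity_map(const struct dev_sketch *dev, uint64_t required_mask)
{
    return effective_dma_mask(dev) >= required_mask;
}
```

For example, a 64-bit capable device behind a bus limited to 32 bits is rejected as soon as memory extends past 4 GiB, which is the case the third review point is about.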
Re: RFC: on adding new CLONE_* flags [WAS Re: [PATCH 0/4] clone: add CLONE_PIDFD]
On Mon, Apr 29, 2019 at 5:39 PM Jann Horn wrote:
>
> ... uuuh, whoops. Turns out I don't know what I'm talking about.

Well, apparently there's some odd libc issue according to Florian, so there *might* be something to it.

> Nevermind. For some reason I thought vfork() was just
> CLONE_VFORK|SIGCHLD, but now I see I got that completely wrong.

Well, inside the kernel, that's actually *very* close to what vfork() is:

    SYSCALL_DEFINE0(vfork)
    {
            return _do_fork(CLONE_VFORK | CLONE_VM | SIGCHLD, 0,
                            0, NULL, NULL, 0);
    }

but that's just an internal implementation detail. It's a real vfork() and should act as the traditional BSD "share everything" without any address space copying. The CLONE_VFORK flag is what does the "wait for child to exit or execve" magic.

Note that vfork() is "exciting" for the compiler in much the same way "setjmp/longjmp()" is, because of the shared stack use in the child and the parent. It is *very* easy to get this wrong and cause massive and subtle memory corruption issues because the parent returns to something that has been messed up by the child.

That may be why some libc might end up just using "fork()", because it ends up avoiding bugs in user space.

(In fact, if I recall correctly, the _reason_ we have an explicit 'vfork()' entry point rather than using clone() with magic parameters was that the lack of arguments meant that you didn't have to save/restore any registers in user space, which made the whole stack issue simpler. But it's been two decades, so my memory is bitrotting).

Also, particularly if you have a big address space, vfork()+execve() can be quite a bit faster than fork()+execve(). Linux fork() is pretty efficient, but if you have gigabytes of VM space to copy, it's going to take time even if you do it fairly well.

Linus
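For readers who have not used it, the vfork()+exec pattern described here looks roughly like this from user space. This is a minimal sketch, not taken from the thread: run_shell_vfork() is a hypothetical helper, and it assumes a POSIX system with /bin/sh. The child does nothing except call execv() or _exit(), because it is running on the parent's stack and in the parent's address space until one of those happens.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run "/bin/sh -c <cmd>" and return its exit status, or -1 on failure. */
static int run_shell_vfork(const char *cmd)
{
    char *const argv[] = { "sh", "-c", (char *)cmd, NULL };
    int status;
    pid_t pid;

    assert(cmd != NULL);

    pid = vfork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: shares the parent's memory, so only exec or _exit here. */
        execv("/bin/sh", argv);
        _exit(127);    /* reached only if execv() failed */
    }

    /* Parent: resumed only after the child called execv() or _exit(). */
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Anything more elaborate in the child (touching locals, returning from the function, calling non-trivial library code) is where the stack corruption mentioned above tends to come from.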
Re: [PATCH v3] pinctrl:intel: Retain HOSTSW_OWN for requested gpio pin
On Fri, Apr 26, 2019 at 8:50 PM Andriy Shevchenko wrote: > > On Tue, Apr 23, 2019 at 12:38:17PM +0200, Linus Walleij wrote: > > On Mon, Apr 15, 2019 at 7:54 AM Chris Chiu wrote: > > > > > The touchpad of the ASUS laptops E403NA, X540NA, X541NA are not > > > responsive after suspend/resume. The following error message > > > shows after resume. > > > i2c_hid i2c-ELAN1200:00: failed to reset device. > > > > > > On these laptops, the touchpad interrupt is connected via a GPIO > > > pin which is controlled by Intel pinctrl. After system resumes, > > > the GPIO is in ACPI mode and no longer works as an IRQ. > > > > > > This commit saves the HOSTSW_OWN value during suspend, make sure > > > the HOSTSW_OWN mode remains the same after resume. > > > > > > Signed-off-by: Chris Chiu > > > > This v3 patch applied with Mika's ACK. > > Hmm... It's supposed to go along with our PR. Anything I can help with? Chris > > -- > With Best Regards, > Andy Shevchenko > >
[PATCH] NTB: correct ntb_dev_ops and ntb_dev comment typos
The comment for ntb_dev_ops and ntb_dev incorrectly referred to
ntb_ctx_ops and ntb_device.

Signed-off-by: Wesley Sheng
Reviewed-by: Logan Gunthorpe
---
 include/linux/ntb.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/linux/ntb.h b/include/linux/ntb.h
index 56a92e3..604abc8 100644
--- a/include/linux/ntb.h
+++ b/include/linux/ntb.h
@@ -205,7 +205,7 @@ static inline int ntb_ctx_ops_is_valid(const struct ntb_ctx_ops *ops)
 }

 /**
- * struct ntb_ctx_ops - ntb device operations
+ * struct ntb_dev_ops - ntb device operations
  * @port_number: See ntb_port_number().
  * @peer_port_count: See ntb_peer_port_count().
  * @peer_port_number: See ntb_peer_port_number().
@@ -404,7 +404,7 @@ struct ntb_client {
 #define drv_ntb_client(__drv) container_of((__drv), struct ntb_client, drv)

 /**
- * struct ntb_device - ntb device
+ * struct ntb_dev - ntb device
  * @dev: Linux device object.
  * @pdev: PCI device entry of the ntb.
  * @topo: Detected topology of the ntb.
-- 
2.7.4
Re: [RFC] Bluetooth: Retry configure request if result is L2CAP_CONF_UNKNOWN
On Tue, Apr 23, 2019 at 1:08 PM Marcel Holtmann wrote:
>
> Hi Andrey,
>
> > Due to:
> >
> >  - current implementation of l2cap_config_rsp() dropping BT
> >    connection if sender of configuration response replied with unknown
> >    option failure (Result=0x0003/L2CAP_CONF_UNKNOWN)
> >
> >  - current implementation of l2cap_build_conf_req() adding
> >    L2CAP_CONF_RFC(0x04) option to initial configure request sent by
> >    the Linux host.
> >
> > devices that do not recognize L2CAP_CONF_RFC, such as Xbox One S
> > controllers, will get stuck in endless connect -> configure ->
> > disconnect loop, never connect and be generally unusable.
> >
> > To avoid this problem add code to do the following:
> >
> >  1. Store a mask of supported conf option types per connection
> >
> >  2. Parse the body of response L2CAP_CONF_UNKNOWN and adjust
> >     connection's supported conf option types mask
> >
> >  3. Retry configuration step the same way it's done for
> >     L2CAP_CONF_UNACCEPT
> >
> > Signed-off-by: Andrey Smirnov
> > Cc: Pierre-Loup A. Griffais
> > Cc: Florian Dollinger
> > Cc: Marcel Holtmann
> > Cc: Johan Hedberg
> > Cc: linux-blueto...@vger.kernel.org
> > Cc: linux-kernel@vger.kernel.org
> > ---
> >
> > Everyone:
> >
> > I marked this as an RFC, since I don't have a lot of experience with
> > Bluetooth subsystem and don't have a high degree of confidence about
> > choices made in this patch. I do, however, think it is good enough to
> > start a discussion about the problem.
> >
> > Thanks,
> > Andrey Smirnov
>
> so it seems that the remote side claims to support Streaming Mode and that is
> why we are trying to set it up.
>
> > ACL Data RX: Handle 12 flags 0x02 dlen 16
>       L2CAP: Information Response (0x0b) ident 1 len 8
>         Type: Extended features supported (0x0002)
>         Result: Success (0x)
>         Features: 0x0010
>           Streaming Mode
>
> And that is why we do this.
>
> < ACL Data TX: Handle 12 flags 0x00 dlen 23
>       L2CAP: Configure Request (0x04) ident 2 len 15
>         Destination CID: 64
>         Flags: 0x
>         Option: Retransmission and Flow Control (0x04) [mandatory]
>           Mode: Basic (0x00)
>           TX window size: 0
>           Max transmit: 0
>           Retransmission timeout: 0
>           Monitor timeout: 0
>           Maximum PDU size: 0
> > ACL Data RX: Handle 12 flags 0x02 dlen 15
>       L2CAP: Configure Response (0x05) ident 2 len 7
>         Source CID: 64
>         Flags: 0x
>         Result: Failure - unknown options (0x0003)
>         04
>
> So btmon needs a patch to decode the failed option octet here. We really want
> to provide a human description of the failed option.
>

I'll see if that's an easy thing to add. Can't promise anything though.

> >
> >  include/net/bluetooth/l2cap.h |  1 +
> >  net/bluetooth/l2cap_core.c    | 58 ++-
> >  2 files changed, 51 insertions(+), 8 deletions(-)
> >
> > diff --git a/include/net/bluetooth/l2cap.h b/include/net/bluetooth/l2cap.h
> > index 093aedebdf0c..6898bba5d9a8 100644
> > --- a/include/net/bluetooth/l2cap.h
> > +++ b/include/net/bluetooth/l2cap.h
> > @@ -632,6 +632,7 @@ struct l2cap_conn {
> >         unsigned int    mtu;
> >
> >         __u32           feat_mask;
> > +       __u32           known_options;
> >         __u8            remote_fixed_chan;
> >         __u8            local_fixed_chan;
> >
> > diff --git a/net/bluetooth/l2cap_core.c b/net/bluetooth/l2cap_core.c
> > index f17e393b43b4..49be98b6de72 100644
> > --- a/net/bluetooth/l2cap_core.c
> > +++ b/net/bluetooth/l2cap_core.c
> > @@ -3243,8 +3243,10 @@ static int l2cap_build_conf_req(struct l2cap_chan *chan, void *data, size_t data
> >                 rfc.monitor_timeout = 0;
> >                 rfc.max_pdu_size    = 0;
> >
> > -               l2cap_add_conf_opt(&ptr, L2CAP_CONF_RFC, sizeof(rfc),
> > -                                  (unsigned long) &rfc, endptr - ptr);
> > +               if (chan->conn->known_options & BIT(L2CAP_CONF_RFC)) {
> > +                       l2cap_add_conf_opt(&ptr, L2CAP_CONF_RFC, sizeof(rfc),
> > +                                          (unsigned long)&rfc, endptr - ptr);
> > +               }
> >                 break;
> >
> >         case L2CAP_MODE_ERTM:
> > @@ -3263,8 +3265,10 @@ static int l2cap_build_conf_req(struct l2cap_chan *chan, void *data, size_t data
> >                 rfc.txwin_size = min_t(u16, chan->tx_win,
> >                                        L2CAP_DEFAULT_TX_WINDOW);
> >
> > -               l2cap_add_conf_opt(&ptr, L2CAP_CONF_RFC, sizeof(rfc),
> > -                                  (unsigned long) &rfc, endptr - ptr);
> > +               if (chan->conn->known_options & BIT(L2CAP_CONF_RFC)) {
> > +                       l2cap_add_conf_opt(&ptr, L2CAP_CONF_RFC, sizeof(rfc),
> > +                                          (unsigned long)&rfc, endptr - ptr);
> > +               }
> >
> >                 if (test_bit(FLAG_EFS_ENABLE, &chan->flags))
> >
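The per-connection option mask from steps 1 and 2 of the RFC can be sketched on its own in userspace C. This is an illustration under assumptions, not the patch's code: known_options_init(), known_options_update() and option_is_known() are hypothetical names, and the response payload is assumed to be a plain list of option type octets, as the single "04" octet in the trace suggests. The hint bit (0x80) is masked off before the type is used.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CONF_RFC_SKETCH 0x04    /* Retransmission and Flow Control option type */

/* Start by assuming the peer understands every option type 0..31. */
static uint32_t known_options_init(void)
{
    return UINT32_MAX;
}

/* Clear the bit for every option type listed in an "unknown options"
 * response payload and return the updated mask. */
static uint32_t known_options_update(uint32_t known,
                                     const uint8_t *rejected, size_t len)
{
    size_t i;

    assert(rejected != NULL || len == 0);
    for (i = 0; i < len; i++) {
        uint8_t type = rejected[i] & 0x7f;    /* drop the hint bit */

        if (type < 32)
            known &= ~(UINT32_C(1) << type);
    }
    return known;
}

/* Decide whether an option may be added to the next configure request. */
static int option_is_known(uint32_t known, uint8_t type)
{
    return type < 32 && (known & (UINT32_C(1) << type)) != 0;
}
```

On retry (step 3), the request builder would consult option_is_known() before adding each option, so the second configure request omits the option the peer rejected.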
[PATCH] clk: imx: pllv3: Fix fall through build warning
Fix below fall through build warning:

drivers/clk/imx/clk-pllv3.c:453:21: warning:
this statement may fall through [-Wimplicit-fallthrough=]

   pll->denom_offset = PLL_IMX7_DENOM_OFFSET;
                     ^
drivers/clk/imx/clk-pllv3.c:454:2: note: here
  case IMX_PLLV3_AV:
  ^~~~

Signed-off-by: Anson Huang
---
 drivers/clk/imx/clk-pllv3.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/clk/imx/clk-pllv3.c b/drivers/clk/imx/clk-pllv3.c
index e892b9a..fbe4fe0 100644
--- a/drivers/clk/imx/clk-pllv3.c
+++ b/drivers/clk/imx/clk-pllv3.c
@@ -451,6 +451,7 @@ struct clk *imx_clk_pllv3(enum imx_pllv3_type type, const char *name,
        case IMX_PLLV3_AV_IMX7:
                pll->num_offset = PLL_IMX7_NUM_OFFSET;
                pll->denom_offset = PLL_IMX7_DENOM_OFFSET;
+               /* fall through */
        case IMX_PLLV3_AV:
                ops = &clk_pllv3_av_ops;
                break;
-- 
2.7.4
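The intent of the fall through can be seen in a standalone sketch. The enum, struct and offset values below are placeholders invented for illustration, not the driver's definitions. The i.MX7-style case overrides two offsets and then deliberately shares the remaining setup with the generic AV case; the comment is what tells the compiler, under -Wimplicit-fallthrough, that this is intentional.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical PLL variants; not the driver's enum imx_pllv3_type. */
enum pll_type_sketch { PLL_AV_SKETCH, PLL_AV_IMX7_SKETCH };

struct pll_sketch {
    unsigned int num_offset;
    unsigned int denom_offset;
    int uses_av_ops;
};

static void pll_setup_sketch(struct pll_sketch *pll, enum pll_type_sketch type)
{
    assert(pll != NULL);

    /* Defaults shared by all variants (placeholder values). */
    pll->num_offset = 0x10;
    pll->denom_offset = 0x20;
    pll->uses_av_ops = 0;

    switch (type) {
    case PLL_AV_IMX7_SKETCH:
        /* Variant-specific register layout (placeholder values). */
        pll->num_offset = 0x20;
        pll->denom_offset = 0x30;
        /* fall through */
    case PLL_AV_SKETCH:
        /* Both variants use the same operations. */
        pll->uses_av_ops = 1;
        break;
    }
}
```

Without the comment the behaviour is identical; only the warning differs.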
RE: linux-next: build warning after merge of the clk tree
Hi, Stephen Thanks for notice. As it is intentional, I will send out a patch to add "/* fall through */" to avoid this build warning, Anson. > -Original Message- > From: Stephen Rothwell [mailto:s...@canb.auug.org.au] > Sent: Tuesday, April 30, 2019 8:20 AM > To: Mike Turquette ; Stephen Boyd > > Cc: Linux Next Mailing List ; Linux Kernel Mailing > List ; Anson Huang ; > Gustavo A. R. Silva ; Kees Cook > > Subject: linux-next: build warning after merge of the clk tree > > Hi all, > > After merging the clk tree, today's linux-next build (arm > multi_v7_defconfig) produced this warning: > > drivers/clk/imx/clk-pllv3.c:453:21: warning: this statement may fall through > [- > Wimplicit-fallthrough=] >pll->denom_offset = PLL_IMX7_DENOM_OFFSET; > ^ > drivers/clk/imx/clk-pllv3.c:454:2: note: here > case IMX_PLLV3_AV: > ^~~~ > > Introduced by commit > > 01d0a541ff4b ("clk: imx: correct i.MX7D AV PLL num/denom offset") > > I get this warning because I am building with -Wimplicit-fallthrough in > attempt to catch new additions early. The gcc warning can be turned off by > adding a /* fall through */ comment at the point the fall through happens > (assuming that the fall through is intentional). > > -- > Cheers, > Stephen Rothwell
Re: [RFC PATCH v2 00/17] Core scheduling v2
On Tue, Apr 30, 2019 at 12:01 AM Ingo Molnar wrote:
> * Li, Aubrey wrote:
>
> > > I.e. showing the approximate CPU thread-load figure column would be
> > > very useful too, where '50%' shows half-loaded, '100%' fully-loaded,
> > > '200%' over-saturated, etc. - for each row?
> >
> > See below, hope this helps.
> > .------------------------------------------------------------------------------------------------------------------------------.
> > |  NA/AVX | vanilla-SMT [std% / sem%]   cpu% | coresched-SMT [std% / sem%]    +/-   cpu% |  no-SMT [std% / sem%]    +/-   cpu% |
> > |------------------------------------------------------------------------------------------------------------------------------|
> > |     1/1 |       508.5 [ 0.2%/ 0.0%]   2.1% |         504.7 [ 1.1%/ 0.1%]  -0.8%   2.1% |   509.0 [ 0.2%/ 0.0%]   0.1%   4.3% |
> > |     2/2 |      1000.2 [ 1.4%/ 0.1%]   4.1% |        1004.1 [ 1.6%/ 0.2%]   0.4%   4.1% |   997.6 [ 1.2%/ 0.1%]  -0.3%   8.1% |
> > |     4/4 |      1912.1 [ 1.0%/ 0.1%]   7.9% |        1904.2 [ 1.1%/ 0.1%]  -0.4%   7.9% |  1914.9 [ 1.3%/ 0.1%]   0.1%  15.1% |
> > |     8/8 |      3753.5 [ 0.3%/ 0.0%]  14.9% |        3748.2 [ 0.3%/ 0.0%]  -0.1%  14.9% |  3751.3 [ 0.4%/ 0.0%]  -0.1%  30.5% |
> > |   16/16 |      7139.3 [ 2.4%/ 0.2%]  30.3% |        7137.9 [ 1.8%/ 0.2%]  -0.0%  30.3% |  7049.2 [ 2.4%/ 0.2%]  -1.3%  60.4% |
> > |   32/32 |     10899.0 [ 4.2%/ 0.4%]  60.3% |       10780.3 [ 4.4%/ 0.4%]  -1.1%  55.9% | 10339.2 [ 9.6%/ 0.9%]  -5.1%  97.7% |
> > |   64/64 |     15086.1 [11.5%/ 1.2%]  97.7% |       14262.0 [ 8.2%/ 0.8%]  -5.5%  82.0% | 11168.7 [22.2%/ 1.7%] -26.0% 100.0% |
> > | 128/128 |     15371.9 [22.0%/ 2.2%] 100.0% |       14675.8 [14.4%/ 1.4%]  -4.5%  82.8% | 10963.9 [18.5%/ 1.4%] -28.7% 100.0% |
> > | 256/256 |     15990.8 [22.0%/ 2.2%] 100.0% |       12227.9 [10.3%/ 1.0%] -23.5%  73.2% | 10469.9 [19.6%/ 1.7%] -34.5% 100.0% |
> > '------------------------------------------------------------------------------------------------------------------------------'
>
> Very nice, thank you!
>
> What's interesting is how in the over-saturated case (the last three
> rows: 128, 256 and 512 total threads) coresched-SMT leaves 20-30% CPU
> performance on the floor according to the load figures.

Yeah, I found the next focus.

>
> Is this true idle time (which shows up as 'id' during 'top'), or some
> load average artifact?
>

vmstat periodically reported intermediate CPU utilization in one second, it was running simultaneously when the benchmarks run.
The cpu% is computed by the average of (100-idle) series. Thanks, -Aubrey
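The cpu% calculation described above is simple enough to write down. This is only a sketch of the stated formula: cpu_percent_from_idle() is a hypothetical helper, and it assumes the idle samples are the integer percentages that vmstat prints in its "id" column, one per interval.

```c
#include <assert.h>
#include <stddef.h>

/* Average of (100 - idle) over a series of per-interval idle percentages. */
static double cpu_percent_from_idle(const int *idle, size_t n)
{
    double sum = 0.0;
    size_t i;

    assert(idle != NULL && n > 0);
    for (i = 0; i < n; i++)
        sum += 100.0 - idle[i];
    return sum / (double)n;
}
```

For idle samples of 10, 20 and 30 this gives (90 + 80 + 70) / 3 = 80, i.e. a cpu% of 80.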
Re: [PATCH v6 02/10] clk: samsung: add new clocks for DMC for Exynos5422 SoC
Hi Lukasz,

I have no objection to this patch. However, as I commented on v4 [1], to reduce the confusion caused by multiple clock definitions that share the same bit range of DIV_CDREX0, you need to add an additional comment, and you had better define the three clocks near each other in this driver (CLKDIV_PCLK_CDREX, CLKDIV_PCLK_DREX0, CLKDIV_PCLK_DREX1). If they are scattered, it is difficult to understand why they were written like this.

[1] [v4,2/8] clk: samsung: add new clocks for DMC for Exynos5422 SoC
- https://lkml.org/lkml/2019/2/12/12

Regards,
Chanwoo Choi

On 19. 4. 19. 11:19 PM, Lukasz Luba wrote:
> This patch provides support for clocks needed for Dynamic Memory Controller
> in Exynos5422 SoC. It adds CDREX base register addresses, new DIV, MUX and
> GATE entries.
>
> Signed-off-by: Lukasz Luba
> ---
>  drivers/clk/samsung/clk-exynos5420.c | 46
>  1 file changed, 42 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/clk/samsung/clk-exynos5420.c b/drivers/clk/samsung/clk-exynos5420.c
> index 34cce3c..d9e6653 100644
> --- a/drivers/clk/samsung/clk-exynos5420.c
> +++ b/drivers/clk/samsung/clk-exynos5420.c
> @@ -134,6 +134,8 @@
>  #define SRC_CDREX              0x20200
>  #define DIV_CDREX0             0x20500
>  #define DIV_CDREX1             0x20504
> +#define GATE_BUS_CDREX0        0x20700
> +#define GATE_BUS_CDREX1        0x20704
>  #define KPLL_LOCK              0x28000
>  #define KPLL_CON0              0x28100
>  #define SRC_KFC                0x28200
> @@ -248,6 +250,8 @@ static const unsigned long exynos5x_clk_regs[] __initconst = {
>         DIV_CDREX1,
>         SRC_KFC,
>         DIV_KFC0,
> +       GATE_BUS_CDREX0,
> +       GATE_BUS_CDREX1,
>  };
>
>  static const unsigned long exynos5800_clk_regs[] __initconst = {
> @@ -425,6 +429,9 @@ PNAME(mout_group13_5800_p) = { "dout_osc_div", "mout_sw_aclkfl1_550_cam" };
>  PNAME(mout_group14_5800_p) = { "dout_aclk550_cam", "dout_sclk_sw" };
>  PNAME(mout_group15_5800_p) = { "dout_osc_div", "mout_sw_aclk550_cam" };
>  PNAME(mout_group16_5800_p) = { "dout_osc_div", "mout_mau_epll_clk" };
> +PNAME(mout_mx_mspll_ccore_phy_p) = { "sclk_bpll",
"mout_sclk_dpll", > + "mout_sclk_mpll", "ff_dout_spll2", > + "mout_sclk_spll", "mout_sclk_epll"}; > > /* fixed rate clocks generated outside the soc */ > static struct samsung_fixed_rate_clock > @@ -450,7 +457,7 @@ static const struct samsung_fixed_factor_clock > static const struct samsung_fixed_factor_clock > exynos5800_fixed_factor_clks[] __initconst = { > FFACTOR(0, "ff_dout_epll2", "mout_sclk_epll", 1, 2, 0), > - FFACTOR(0, "ff_dout_spll2", "mout_sclk_spll", 1, 2, 0), > + FFACTOR(CLK_FF_DOUT_SPLL2, "ff_dout_spll2", "mout_sclk_spll", 1, 2, 0), > }; > > static const struct samsung_mux_clock exynos5800_mux_clks[] __initconst = { > @@ -472,11 +479,14 @@ static const struct samsung_mux_clock > exynos5800_mux_clks[] __initconst = { > MUX(0, "mout_aclk300_disp1", mout_group5_5800_p, SRC_TOP2, 24, 2), > MUX(0, "mout_aclk300_gscl", mout_group5_5800_p, SRC_TOP2, 28, 2), > > + MUX(CLK_MOUT_MX_MSPLL_CCORE_PHY, "mout_mx_mspll_ccore_phy", > + mout_mx_mspll_ccore_phy_p, SRC_TOP7, 0, 3), > + > MUX(CLK_MOUT_MX_MSPLL_CCORE, "mout_mx_mspll_ccore", > - mout_mx_mspll_ccore_p, SRC_TOP7, 16, 2), > + mout_mx_mspll_ccore_p, SRC_TOP7, 16, 3), > MUX_F(CLK_MOUT_MAU_EPLL, "mout_mau_epll_clk", mout_mau_epll_clk_5800_p, > SRC_TOP7, 20, 2, CLK_SET_RATE_PARENT, 0), > - MUX(0, "sclk_bpll", mout_bpll_p, SRC_TOP7, 24, 1), > + MUX(CLK_SCLK_BPLL, "sclk_bpll", mout_bpll_p, SRC_TOP7, 24, 1), > MUX(0, "mout_epll2", mout_epll2_5800_p, SRC_TOP7, 28, 1), > > MUX(0, "mout_aclk550_cam", mout_group3_5800_p, SRC_TOP8, 16, 3), > @@ -648,7 +658,7 @@ static const struct samsung_mux_clock exynos5x_mux_clks[] > __initconst = { > > MUX(0, "mout_sclk_mpll", mout_mpll_p, SRC_TOP6, 0, 1), > MUX(CLK_MOUT_VPLL, "mout_sclk_vpll", mout_vpll_p, SRC_TOP6, 4, 1), > - MUX(0, "mout_sclk_spll", mout_spll_p, SRC_TOP6, 8, 1), > + MUX(CLK_MOUT_SCLK_SPLL, "mout_sclk_spll", mout_spll_p, SRC_TOP6, 8, 1), > MUX(0, "mout_sclk_ipll", mout_ipll_p, SRC_TOP6, 12, 1), > MUX(0, "mout_sclk_rpll", mout_rpll_p, SRC_TOP6, 16, 1), > 
MUX_F(CLK_MOUT_EPLL, "mout_sclk_epll", mout_epll_p, SRC_TOP6, 20, 1, > @@ -817,6 +827,8 @@ static const struct samsung_div_clock exynos5x_div_clks[] > __initconst = { > DIV(CLK_DOUT_CLK2X_PHY0, "dout_clk2x_phy0", "dout_sclk_cdrex", > DIV_CDREX0, 3, 5), > > + DIV(0, "dout_pclk_drex0", "dout_cclk_drex0", DIV_CDREX0, 28, 3), > + > DIV(CLK_DOUT_PCLK_CORE_MEM, "dout_pclk_core_mem", "mout_mclk_cdrex", > DIV_CDREX1, 8, 3), > > @@ -1170,6 +1182,32 @@ static const
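For readers following the trailing shift/width arguments of the DIV() entries above (for example 28, 3 on DIV_CDREX0), here is a minimal userspace sketch of how such a register field could be decoded. The helper names are hypothetical, and the "field value plus one" divisor is an assumption based on the usual default of the common clk divider, not something stated in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: extract a 'width'-bit field that starts at bit 'shift'. */
static uint32_t clk_field_get(uint32_t reg, unsigned int shift, unsigned int width)
{
	uint32_t mask;

	assert(width >= 1 && width <= 32 && shift < 32);
	mask = (width == 32) ? UINT32_MAX : ((UINT32_C(1) << width) - 1);

	return (reg >> shift) & mask;
}

/* Assumed divider semantics: output rate = parent rate / (field + 1). */
static unsigned long div_rate(unsigned long parent, uint32_t reg,
			      unsigned int shift, unsigned int width)
{
	return parent / (clk_field_get(reg, shift, width) + 1);
}
```

Two clocks that use the same shift and width on the same register read the same field, which is the situation the comment request above is about.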
Re: [RFC PATCH v2 00/17] Core scheduling v2
On Mon, Apr 29, 2019 at 11:39 PM Phil Auld wrote:
>
> On Mon, Apr 29, 2019 at 09:25:35PM +0800 Li, Aubrey wrote:
> > .-------------------------------------------------------------------------------------------------------------------------------.
> > | NA/AVX  | vanilla-SMT [std% / sem%]   cpu% | coresched-SMT [std% / sem%]    +/-   cpu% |   no-SMT [std% / sem%]    +/-   cpu% |
> > |-------------------------------------------------------------------------------------------------------------------------------|
> > | 1/1     |       508.5 [ 0.2%/ 0.0%]   2.1% |         504.7 [ 1.1%/ 0.1%]  -0.8%   2.1% |    509.0 [ 0.2%/ 0.0%]   0.1%   4.3% |
> > | 2/2     |      1000.2 [ 1.4%/ 0.1%]   4.1% |        1004.1 [ 1.6%/ 0.2%]   0.4%   4.1% |    997.6 [ 1.2%/ 0.1%]  -0.3%   8.1% |
> > | 4/4     |      1912.1 [ 1.0%/ 0.1%]   7.9% |        1904.2 [ 1.1%/ 0.1%]  -0.4%   7.9% |   1914.9 [ 1.3%/ 0.1%]   0.1%  15.1% |
> > | 8/8     |      3753.5 [ 0.3%/ 0.0%]  14.9% |        3748.2 [ 0.3%/ 0.0%]  -0.1%  14.9% |   3751.3 [ 0.4%/ 0.0%]  -0.1%  30.5% |
> > | 16/16   |      7139.3 [ 2.4%/ 0.2%]  30.3% |        7137.9 [ 1.8%/ 0.2%]  -0.0%  30.3% |   7049.2 [ 2.4%/ 0.2%]  -1.3%  60.4% |
> > | 32/32   |     10899.0 [ 4.2%/ 0.4%]  60.3% |       10780.3 [ 4.4%/ 0.4%]  -1.1%  55.9% |  10339.2 [ 9.6%/ 0.9%]  -5.1%  97.7% |
> > | 64/64   |     15086.1 [11.5%/ 1.2%]  97.7% |       14262.0 [ 8.2%/ 0.8%]  -5.5%  82.0% |  11168.7 [22.2%/ 1.7%] -26.0% 100.0% |
> > | 128/128 |     15371.9 [22.0%/ 2.2%] 100.0% |       14675.8 [14.4%/ 1.4%]  -4.5%  82.8% |  10963.9 [18.5%/ 1.4%] -28.7% 100.0% |
> > | 256/256 |     15990.8 [22.0%/ 2.2%] 100.0% |       12227.9 [10.3%/ 1.0%] -23.5%  73.2% |  10469.9 [19.6%/ 1.7%] -34.5% 100.0% |
> > '-------------------------------------------------------------------------------------------------------------------------------'
>
> That's really nice and clear.
>
> We start to see the penalty for the coresched at 32/32, leaving some cpus
> more idle than otherwise. But it's pretty good overall, for this benchmark
> at least.
>
> Is this with stock v2 or with any of the fixes posted after? I wonder how
> much the fixes for the race that violates the rule affect this, for example.
>

Yeah, this data is based on v2 without any of the later fixes. I also tried
some of the fixes that might affect performance, but no luck so far. Please
let me know if I missed anything.

Thanks,
-Aubrey
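As a reading aid for the "+/-" columns in the table above, here is a small sketch of the relative-change arithmetic. It assumes the deltas are taken against the vanilla-SMT throughput of the same row; the thread does not spell that out, although the quoted numbers are consistent with it. The function name is made up for illustration.

```c
#include <assert.h>

/*
 * Relative change of 'value' against 'baseline', in percent.
 * Assumption: the baseline is the vanilla-SMT figure of the same row.
 */
static double pct_delta(double baseline, double value)
{
	assert(baseline > 0.0);

	return (value - baseline) / baseline * 100.0;
}
```

For the 64/64 row, pct_delta(15086.1, 14262.0) comes out at roughly -5.5, which matches the coresched-SMT column.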
Re: [PATCH v3 12/16] PM / devfreq: tegra: Reconfigure hardware on governor's restart
Hi,

On 19. 4. 18. 7:29 AM, Dmitry Osipenko wrote:
> Move hardware configuration to governor's start/resume methods.
> This allows to re-initialize hardware counters and reconfigure
> cleanly if governor was stopped/paused. That is needed because we
> are not aware of all hardware changes that happened while governor
> was stopped and the paused state may get out of sync with reality,
> hence it's better to start with a clean slate after the pause. In
> a result there is no memory bandwidth starvation after resume from
> suspend-to-ram that results in display controller underflowing that
> happens on resume because of improper decision made by devfreq about
> the required memory frequency. This change also cleans up code a tad
> by moving hardware-configuration code into a single location.
>
> Signed-off-by: Dmitry Osipenko
> ---
>  drivers/devfreq/tegra-devfreq.c | 98 ++---
>  1 file changed, 40 insertions(+), 58 deletions(-)
>
> diff --git a/drivers/devfreq/tegra-devfreq.c b/drivers/devfreq/tegra-devfreq.c
> index 62f35e818122..e9ab49394d35 100644
> --- a/drivers/devfreq/tegra-devfreq.c
> +++ b/drivers/devfreq/tegra-devfreq.c
> @@ -392,55 +392,6 @@ static int tegra_actmon_rate_notify_cb(struct notifier_block *nb,
>  	return NOTIFY_OK;
>  }
>
> -static void tegra_actmon_enable_interrupts(struct tegra_devfreq *tegra)
> -{
> -	struct tegra_devfreq_device *dev;
> -	u32 val;
> -	unsigned int i;
> -
> -	for (i = 0; i < ARRAY_SIZE(tegra->devices); i++) {
> -		dev = &tegra->devices[i];
> -
> -		val = device_readl(dev, ACTMON_DEV_CTRL);
> -		val |= ACTMON_DEV_CTRL_AVG_ABOVE_WMARK_EN;
> -		val |= ACTMON_DEV_CTRL_AVG_BELOW_WMARK_EN;
> -		val |= ACTMON_DEV_CTRL_CONSECUTIVE_BELOW_WMARK_EN;
> -		val |= ACTMON_DEV_CTRL_CONSECUTIVE_ABOVE_WMARK_EN;
> -
> -		device_writel(dev, val, ACTMON_DEV_CTRL);
> -	}
> -
> -	actmon_write_barrier(tegra);
> -}
> -
> -static void tegra_actmon_disable_interrupts(struct tegra_devfreq *tegra)
> -{
> -	struct tegra_devfreq_device *dev;
> -	u32 val;
> -	unsigned int i;
> -
> -	disable_irq(tegra->irq);
> -
> -	for (i = 0; i < ARRAY_SIZE(tegra->devices); i++) {
> -		dev = &tegra->devices[i];
> -
> -		val = device_readl(dev, ACTMON_DEV_CTRL);
> -		val &= ~ACTMON_DEV_CTRL_AVG_ABOVE_WMARK_EN;
> -		val &= ~ACTMON_DEV_CTRL_AVG_BELOW_WMARK_EN;
> -		val &= ~ACTMON_DEV_CTRL_CONSECUTIVE_BELOW_WMARK_EN;
> -		val &= ~ACTMON_DEV_CTRL_CONSECUTIVE_ABOVE_WMARK_EN;
> -
> -		device_writel(dev, val, ACTMON_DEV_CTRL);
> -
> -		device_writel(dev, ACTMON_INTR_STATUS_CLEAR,
> -			      ACTMON_DEV_INTR_STATUS);
> -	}
> -
> -	actmon_write_barrier(tegra);
> -
> -	enable_irq(tegra->irq);
> -}
> -
>  static void tegra_actmon_configure_device(struct tegra_devfreq *tegra,
>  					   struct tegra_devfreq_device *dev)
>  {
> @@ -464,11 +415,47 @@ static void tegra_actmon_configure_device(struct tegra_devfreq *tegra,
>  		<< ACTMON_DEV_CTRL_CONSECUTIVE_BELOW_WMARK_NUM_SHIFT;
>  	val |= (ACTMON_ABOVE_WMARK_WINDOW - 1)
>  		<< ACTMON_DEV_CTRL_CONSECUTIVE_ABOVE_WMARK_NUM_SHIFT;
> +	val |= ACTMON_DEV_CTRL_AVG_ABOVE_WMARK_EN;
> +	val |= ACTMON_DEV_CTRL_AVG_BELOW_WMARK_EN;
> +	val |= ACTMON_DEV_CTRL_CONSECUTIVE_BELOW_WMARK_EN;
> +	val |= ACTMON_DEV_CTRL_CONSECUTIVE_ABOVE_WMARK_EN;
>  	val |= ACTMON_DEV_CTRL_ENB;
>
>  	device_writel(dev, val, ACTMON_DEV_CTRL);
> +}
> +
> +static void tegra_actmon_start(struct tegra_devfreq *tegra)
> +{
> +	unsigned int i;
> +
> +	disable_irq(tegra->irq);
> +
> +	actmon_writel(tegra, ACTMON_SAMPLING_PERIOD - 1,
> +		      ACTMON_GLB_PERIOD_CTRL);
> +
> +	for (i = 0; i < ARRAY_SIZE(tegra->devices); i++)
> +		tegra_actmon_configure_device(tegra, &tegra->devices[i]);

nitpick: I agree with this patch. To make it simpler, I think you could
remove the tegra_actmon_configure_device() function and just do those
operations inside the for loop in tegra_actmon_start(), keeping a similar
style to tegra_actmon_stop(). But there is perfect solution. If you agree,
change it in the next version. If you think it is not necessary, just keep
this code.

> +
> +	actmon_write_barrier(tegra);
> +
> +	enable_irq(tegra->irq);
> +}
> +
> +static void tegra_actmon_stop(struct tegra_devfreq *tegra)
> +{
> +	unsigned int i;
> +
> +	disable_irq(tegra->irq);
> +
> +	for (i = 0; i < ARRAY_SIZE(tegra->devices); i++) {
> +		device_writel(&tegra->devices[i], 0x, ACTMON_DEV_CTRL);
> +		device_writel(&tegra->devices[i], ACTMON_INTR_STATUS_CLEAR,
> +			      ACTMON_DEV_INTR_STATUS);
> +	}
>
>  	actmon_write_barrier(tegra);
> +
> +	enable_irq(tegra->irq);
>  }
>
>  static int
Re: [PATCH v3 2/3] power: supply: Add driver for Microchip UCS1002
On Mon, Apr 29, 2019 at 1:36 PM Guenter Roeck wrote: > > On Mon, Apr 29, 2019 at 12:53:48PM -0700, Andrey Smirnov wrote: > > Add driver for Microchip UCS1002 Programmable USB Port Power > > Controller with Charger Emulation. The driver exposed a power supply > > device to control/monitor various parameter of the device as well as a > > regulator to allow controlling VBUS line. > > > > Signed-off-by: Enric Balletbo Serra > > Signed-off-by: Andrey Smirnov > > Cc: Chris Healy > > Cc: Lucas Stach > > Cc: Fabio Estevam > > Cc: Guenter Roeck > > Cc: Sebastian Reichel > > Cc: linux-kernel@vger.kernel.org > > Cc: linux...@vger.kernel.org > > --- > > drivers/power/supply/Kconfig | 9 + > > drivers/power/supply/Makefile| 1 + > > drivers/power/supply/ucs1002_power.c | 646 +++ > > 3 files changed, 656 insertions(+) > > create mode 100644 drivers/power/supply/ucs1002_power.c > > > > diff --git a/drivers/power/supply/Kconfig b/drivers/power/supply/Kconfig > > index e901b9879e7e..c614c8a196f3 100644 > > --- a/drivers/power/supply/Kconfig > > +++ b/drivers/power/supply/Kconfig > > @@ -660,4 +660,13 @@ config FUEL_GAUGE_SC27XX > >Say Y here to enable support for fuel gauge with SC27XX > >PMIC chips. > > > > +config CHARGER_UCS1002 > > +tristate "Microchip UCS1002 USB Port Power Controller" > > + depends on I2C > > + depends on OF > > + select REGMAP_I2C > > + help > > + Say Y to enable support for Microchip UCS1002 Programmable > > + USB Port Power Controller with Charger Emulation. 
> > + > > endif # POWER_SUPPLY > > diff --git a/drivers/power/supply/Makefile b/drivers/power/supply/Makefile > > index b731c2a9b695..c56803a9e4fe 100644 > > --- a/drivers/power/supply/Makefile > > +++ b/drivers/power/supply/Makefile > > @@ -87,3 +87,4 @@ obj-$(CONFIG_AXP288_CHARGER)+= axp288_charger.o > > obj-$(CONFIG_CHARGER_CROS_USBPD) += cros_usbpd-charger.o > > obj-$(CONFIG_CHARGER_SC2731) += sc2731_charger.o > > obj-$(CONFIG_FUEL_GAUGE_SC27XX) += sc27xx_fuel_gauge.o > > +obj-$(CONFIG_CHARGER_UCS1002)+= ucs1002_power.o > > diff --git a/drivers/power/supply/ucs1002_power.c > > b/drivers/power/supply/ucs1002_power.c > > new file mode 100644 > > index ..677f20a4d76f > > --- /dev/null > > +++ b/drivers/power/supply/ucs1002_power.c > > @@ -0,0 +1,646 @@ > ... > > + > > +static enum power_supply_usb_type ucs1002_usb_types[] = { > > + POWER_SUPPLY_USB_TYPE_PD, > > + POWER_SUPPLY_USB_TYPE_SDP, > > + POWER_SUPPLY_USB_TYPE_DCP, > > + POWER_SUPPLY_USB_TYPE_CDP, > > + POWER_SUPPLY_USB_TYPE_UNKNOWN, > > +}; > > + > > +static int ucs1002_set_usb_type(struct ucs1002_info *info, int val) > > +{ > > + unsigned int mode; > > + > > + if (val >= ARRAY_SIZE(ucs1002_usb_types)) > > + return -EINVAL; > > + > I hate to bring it up that late, but I don't see a check > against val being negative anywhere in the calling code. > Sure, I'll add it in v4 Thanks, Andrey Smirnov
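For reference, a minimal userspace sketch of the kind of range check discussed above, rejecting a negative index explicitly before it is compared against the table size. The table contents and the function name are placeholders for illustration only; this is not the code that went into v4.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Illustrative table; the values are placeholders, not the driver's. */
static const int usb_types[] = { 10, 11, 12, 13, 14 };

/* Hypothetical lookup that rejects both negative and too-large indexes. */
static int lookup_usb_type(int val, int *out)
{
	assert(out != NULL);

	if (val < 0 || (size_t)val >= ARRAY_SIZE(usb_types))
		return -EINVAL;

	*out = usb_types[val];
	return 0;
}
```

Spelling out the val < 0 test keeps the intent visible instead of relying on the signed-to-unsigned conversion in the size comparison.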
Re: [PATCH 1/2] dt-bindings: Add CDTech S050WV43-CT5 panel bindings
On Thu, 18 Apr 2019 00:38:44 +0100, Florent TOMASIN wrote:
> Add documentation for S050WV43-CT5 panel
>
> Signed-off-by: Florent TOMASIN
> ---
>  .../bindings/display/panel/cdtech,s050wv43-ct5.txt | 12
>  1 file changed, 12 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/display/panel/cdtech,s050wv43-ct5.txt
>

Reviewed-by: Rob Herring
Re: [PATCH v3 1/3] clk: analogbits: add Wide-Range PLL library
On Mon, 29 Apr 2019, Stephen Boyd wrote: > Quoting Paul Walmsley (2019-04-11 01:27:32) > > diff --git a/drivers/clk/analogbits/Kconfig b/drivers/clk/analogbits/Kconfig > > new file mode 100644 > > index ..b5fd60c7f136 > > --- /dev/null > > +++ b/drivers/clk/analogbits/Kconfig > > @@ -0,0 +1,2 @@ > > Add SPDX for this file? Done. > > +config CLK_ANALOGBITS_WRPLL_CLN28HPC > > + bool > > diff --git a/drivers/clk/analogbits/Makefile > > b/drivers/clk/analogbits/Makefile > > new file mode 100644 > > index ..bb51a3ae77a7 > > --- /dev/null > > +++ b/drivers/clk/analogbits/Makefile > > @@ -0,0 +1 @@ > > Add SPDX for this file? Done. > > +obj-$(CONFIG_CLK_ANALOGBITS_WRPLL_CLN28HPC)+= wrpll-cln28hpc.o > > diff --git a/drivers/clk/analogbits/wrpll-cln28hpc.c > > b/drivers/clk/analogbits/wrpll-cln28hpc.c > > new file mode 100644 > > index ..2027872719e1 > > --- /dev/null > > +++ b/drivers/clk/analogbits/wrpll-cln28hpc.c > > @@ -0,0 +1,360 @@ > > +// SPDX-License-Identifier: GPL-2.0 > > +/* > > + * Copyright (C) 2018-2019 SiFive, Inc. > > + * Wesley Terpstra > > + * Paul Walmsley > > + * > > + * This library supports configuration parsing and reprogramming of > > + * the CLN28HPC variant of the Analog Bits Wide Range PLL. The > > + * intention is for this library to be reusable for any device that > > + * integrates this PLL; thus the register structure and programming > > + * details are expected to be provided by a separate IP block driver. > > + * > > + * The bulk of this code is primarily useful for clock configurations > > + * that must operate at arbitrary rates, as opposed to clock configurations > > + * that are restricted by software or manufacturer guidance to a small, > > + * pre-determined set of performance points. 
> > + * > > + * References: > > + * - Analog Bits "Wide Range PLL Datasheet", version 2015.10.01 > > + * - SiFive FU540-C000 Manual v1p0, Chapter 7 "Clocking and Reset" > > + * https://static.dev.sifive.com/FU540-C000-v1.0.pdf > > + */ > > + > > +#include > > +#include > > +#include > > +#include > > +#include > > + > > +/* MIN_INPUT_FREQ: minimum input clock frequency, in Hz (Fref_min) */ > > +#define MIN_INPUT_FREQ 700 > > + > > +/* MAX_INPUT_FREQ: maximum input clock frequency, in Hz (Fref_max) */ > > +#define MAX_INPUT_FREQ 6 > > + > > +/* MIN_POST_DIVIDE_REF_FREQ: minimum post-divider reference frequency, in > > Hz */ > > +#define MIN_POST_DIVR_FREQ 700 > > + > > +/* MAX_POST_DIVIDE_REF_FREQ: maximum post-divider reference frequency, in > > Hz */ > > +#define MAX_POST_DIVR_FREQ 2 > > + > > +/* MIN_VCO_FREQ: minimum VCO frequency, in Hz (Fvco_min) */ > > +#define MIN_VCO_FREQ 24UL > > + > > +/* MAX_VCO_FREQ: maximum VCO frequency, in Hz (Fvco_max) */ > > +#define MAX_VCO_FREQ 48ULL > > + > > +/* MAX_DIVQ_DIVISOR: maximum output divisor. Selected by DIVQ = 6 */ > > +#define MAX_DIVQ_DIVISOR 64 > > + > > +/* MAX_DIVR_DIVISOR: maximum reference divisor. Selected by DIVR = 63 */ > > +#define MAX_DIVR_DIVISOR 64 > > + > > +/* MAX_LOCK_US: maximum PLL lock time, in microseconds (tLOCK_max) */ > > +#define MAX_LOCK_US70 > > + > > +/* > > + * ROUND_SHIFT: number of bits to shift to avoid precision loss in the > > rounding > > + * algorithm > > + */ > > +#define ROUND_SHIFT20 > > + > > +/* > > + * Private functions > > + */ > > + > > +/** > > + * __wrpll_calc_filter_range() - determine PLL loop filter bandwidth > > + * @post_divr_freq: input clock rate after the R divider > > + * > > + * Select the value to be presented to the PLL RANGE input signals, based > > + * on the input clock frequency after the post-R-divider @post_divr_freq. > > + * This code follows the recommendations in the PLL datasheet for filter > > + * range selection. 
> > + * > > + * Return: The RANGE value to be presented to the PLL configuration inputs, > > + * or -1 upon error. > > + */ > > +static int __wrpll_calc_filter_range(unsigned long post_divr_freq) > > +{ > > + u8 range; > > + > > + if (post_divr_freq < MIN_POST_DIVR_FREQ || > > + post_divr_freq > MAX_POST_DIVR_FREQ) { > > + WARN(1, "%s: post-divider reference freq out of range: %lu", > > +__func__, post_divr_freq); > > + return -1; > > + } > > + > > + if (post_divr_freq < 1100) > > + range = 1; > > + else if (post_divr_freq < 1800) > > + range = 2; > > + else if (post_divr_freq < 3000) > > + range = 3; > > + else if (post_divr_freq < 5000) > > + range = 4; > > + else if (post_divr_freq < 8000) > > + range = 5; > > + else if (post_divr_freq < 13000) > > + range = 6; > >
RE: [EXT] Re: [PATCHv5 1/6] PCI: mobiveil: Refactor Mobiveil PCIe Host Bridge IP driver
Hi Subbu,

> -Original Message-
> From: Subrahmanya Lingappa [mailto:l.subrahma...@mobiveil.co.in]
> Sent: April 24, 2019 13:36
> To: Z.q. Hou
> Cc: linux-...@vger.kernel.org; linux-arm-ker...@lists.infradead.org;
> devicet...@vger.kernel.org; linux-kernel@vger.kernel.org;
> bhelg...@google.com; robh...@kernel.org; mark.rutl...@arm.com;
> shawn...@kernel.org; Leo Li ;
> lorenzo.pieral...@arm.com; catalin.mari...@arm.com;
> will.dea...@arm.com; Mingkai Hu ; M.h. Lian
> ; Xiaowei Bao
> Subject: [EXT] Re: [PATCHv5 1/6] PCI: mobiveil: Refactor Mobiveil PCIe Host
> Bridge IP driver
>
> WARNING: This email was created outside of NXP. DO NOT CLICK links or
> attachments unless you recognize the sender and know the content is safe.
>
>
>
> ZQ,
>
> On Fri, Apr 12, 2019 at 3:22 PM Z.q. Hou wrote:
> >
> > From: Hou Zhiqiang
> >
> > Refactor the Mobiveil PCIe Host Bridge IP driver to make
> > it easier to add support for both RC and EP mode driver.
> > This patch moved the Mobiveil driver to an new directory
> > 'drivers/pci/controller/mobiveil' and refactor it according
> > to the RC and EP abstraction.
> >
> > Signed-off-by: Hou Zhiqiang
> > Reviewed-by: Minghuan Lian
> > Reviewed-by: Subrahmanya Lingappa
> > ---
> > V5:
> > - Regenerated this patch on the new base.
> > - Retouched the changelog.
> > - Updated the Copyright.
> > > > MAINTAINERS | 2 +- > > drivers/pci/controller/Kconfig| 11 +- > > drivers/pci/controller/Makefile | 2 +- > > drivers/pci/controller/mobiveil/Kconfig | 24 + > > drivers/pci/controller/mobiveil/Makefile | 4 + > > .../pcie-mobiveil-host.c} | 570 +++--- > > .../controller/mobiveil/pcie-mobiveil-plat.c | 56 ++ > > .../pci/controller/mobiveil/pcie-mobiveil.c | 248 > > .../pci/controller/mobiveil/pcie-mobiveil.h | 211 +++ > > 9 files changed, 636 insertions(+), 492 deletions(-) > > create mode 100644 drivers/pci/controller/mobiveil/Kconfig > > create mode 100644 drivers/pci/controller/mobiveil/Makefile > > rename drivers/pci/controller/{pcie-mobiveil.c => > mobiveil/pcie-mobiveil-host.c} (53%) > > create mode 100644 drivers/pci/controller/mobiveil/pcie-mobiveil-plat.c > > create mode 100644 drivers/pci/controller/mobiveil/pcie-mobiveil.c > > create mode 100644 drivers/pci/controller/mobiveil/pcie-mobiveil.h > > > > diff --git a/MAINTAINERS b/MAINTAINERS > > index 1e64279f338a..1013e74b14f2 100644 > > --- a/MAINTAINERS > > +++ b/MAINTAINERS > > @@ -11877,7 +11877,7 @@ M: Subrahmanya Lingappa > > > L: linux-...@vger.kernel.org > > S: Supported > > F: Documentation/devicetree/bindings/pci/mobiveil-pcie.txt > > -F: drivers/pci/controller/pcie-mobiveil.c > > +F: drivers/pci/controller/mobiveil/pcie-mobiveil* > > > > Please add yourself as co-maintainer of the mobiveil driver. Thanks for your invite, will add in v6. Regards, Zhiqiang
RE: [PATCH] rtc: snvs: Use __maybe_unused instead of #if CONFIG_PM_SLEEP
Hi, Trent

> -Original Message-
> From: Trent Piepho [mailto:tpie...@impinj.com]
> Sent: Tuesday, April 30, 2019 1:13 AM
> To: linux-...@vger.kernel.org; Anson Huang ;
> a.zu...@towertech.it; linux-kernel@vger.kernel.org;
> alexandre.bell...@bootlin.com
> Cc: dl-linux-imx
> Subject: Re: [PATCH] rtc: snvs: Use __maybe_unused instead of #if
> CONFIG_PM_SLEEP
>
> On Mon, 2019-04-29 at 07:02 +, Anson Huang wrote:
> > Use __maybe_unused for power management related functions instead of
> > #if CONFIG_PM_SLEEP to simply the code.
> >
> > Signed-off-by: Anson Huang
>
> This will result in the functions always being included, even if PM_SLEEP is
> off...
>
> >
> > @@ -387,14 +385,6 @@ static const struct dev_pm_ops snvs_rtc_pm_ops = {
> >  	.resume_noirq = snvs_rtc_resume_noirq,
> >  };
>
> ...because they will always be used by the definition of snvs_rtc_pm_ops
> here.

You are right, I missed this part. I have sent out a V2 patch that uses
SET_NOIRQ_SYSTEM_SLEEP_PM_OPS() to define the ops; please help review.

Thanks,
Anson.

>
> In order for this to work, SIMPLE_DEV_PM_OPS() needs to be used, so that
> the dev_pm_ops struct is empty when PM is off and the functions don't get
> referenced. See:
> https://lkml.org/lkml/2019/1/17/376
Re: [PATCH] Revert "PCI/LINK: Report degraded links via link bandwidth notification"
On 4/29/19 1:56 PM, Bjorn Helgaas wrote: From: Bjorn Helgaas This reverts commit e8303bb7a75c113388badcc49b2a84b4121c1b3e. e8303bb7a75c added logging whenever a link changed speed or width to a state that is considered degraded. Unfortunately, it cannot differentiate signal integrity-related link changes from those intentionally initiated by an endpoint driver, including drivers that may live in userspace or VMs when making use of vfio-pci. Some GPU drivers actively manage the link state to save power, which generates a stream of messages like this: vfio-pci :07:00.0: 32.000 Gb/s available PCIe bandwidth, limited by 2.5 GT/s x16 link at :00:02.0 (capable of 64.000 Gb/s with 5 GT/s x16 link) We really *do* want to be alerted when the link bandwidth is reduced because of hardware failures, but degradation by intentional link state management is probably far more common, so the signal-to-noise ratio is currently low. Until we figure out a way to identify the real problems or silence the intentional situations, revert the following commits, which include the initial implementation (e8303bb7a75c) and subsequent fixes: I think we're overreacting to a bit of perceived verbosity in the system log. Intentional degradation does not seem to me to be as common as advertised. I have not observed this with either radeon, nouveau, or amdgpu, and the proper mechanism to save power at the link level is ASPM. I stand to be corrected and we have on CC some very knowledgeable fellows that I am certain will jump at the opportunity to do so. What it seems like to me is that a proprietary driver running in a VM is initiating these changes. And if that is the case then it seems this is a virtualization problem. A quick glance over GPU drivers in linux did not reveal any obvious places where we intentionally downgrade a link. I'm not convinced a revert is the best call. 
Alex e8303bb7a75c ("PCI/LINK: Report degraded links via link bandwidth notification") 3e82a7f9031f ("PCI/LINK: Supply IRQ handler so level-triggered IRQs are acked") 55397ce8df48 ("PCI/LINK: Clear bandwidth notification interrupt before enabling it") 0fa635aec9ab ("PCI/LINK: Deduplicate bandwidth reports for multi-function devices") Link: https://lore.kernel.org/lkml/155597243666.19387.1205950870601742062.st...@gimli.home Link: https://lore.kernel.org/lkml/155605909349.3575.13433421148215616375.st...@gimli.home Signed-off-by: Bjorn Helgaas CC: Alexandru Gagniuc CC: Lukas Wunner CC: Alex Williamson --- drivers/pci/pci.h | 1 - drivers/pci/pcie/Makefile | 1 - drivers/pci/pcie/bw_notification.c | 121 - drivers/pci/pcie/portdrv.h | 6 +- drivers/pci/pcie/portdrv_core.c| 17 ++-- drivers/pci/pcie/portdrv_pci.c | 1 - drivers/pci/probe.c| 2 +- 7 files changed, 7 insertions(+), 142 deletions(-) delete mode 100644 drivers/pci/pcie/bw_notification.c diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h index d994839a3e24..224d88634115 100644 --- a/drivers/pci/pci.h +++ b/drivers/pci/pci.h @@ -273,7 +273,6 @@ enum pcie_link_width pcie_get_width_cap(struct pci_dev *dev); u32 pcie_bandwidth_capable(struct pci_dev *dev, enum pci_bus_speed *speed, enum pcie_link_width *width); void __pcie_print_link_status(struct pci_dev *dev, bool verbose); -void pcie_report_downtraining(struct pci_dev *dev); /* Single Root I/O Virtualization */ struct pci_sriov { diff --git a/drivers/pci/pcie/Makefile b/drivers/pci/pcie/Makefile index f1d7bc1e5efa..ab514083d5d4 100644 --- a/drivers/pci/pcie/Makefile +++ b/drivers/pci/pcie/Makefile @@ -3,7 +3,6 @@ # Makefile for PCI Express features and port driver pcieportdrv-y := portdrv_core.o portdrv_pci.o err.o -pcieportdrv-y += bw_notification.o obj-$(CONFIG_PCIEPORTBUS) += pcieportdrv.o diff --git a/drivers/pci/pcie/bw_notification.c b/drivers/pci/pcie/bw_notification.c deleted file mode 100644 index 4fa9e3523ee1.. 
--- a/drivers/pci/pcie/bw_notification.c +++ /dev/null @@ -1,121 +0,0 @@ -// SPDX-License-Identifier: GPL-2.0+ -/* - * PCI Express Link Bandwidth Notification services driver - * Author: Alexandru Gagniuc - * - * Copyright (C) 2019, Dell Inc - * - * The PCIe Link Bandwidth Notification provides a way to notify the - * operating system when the link width or data rate changes. This - * capability is required for all root ports and downstream ports - * supporting links wider than x1 and/or multiple link speeds. - * - * This service port driver hooks into the bandwidth notification interrupt - * and warns when links become degraded in operation. - */ - -#include "../pci.h" -#include "portdrv.h" - -static bool pcie_link_bandwidth_notification_supported(struct pci_dev *dev) -{ - int ret; - u32 lnk_cap; - - ret = pcie_capability_read_dword(dev, PCI_EXP_LNKCAP, _cap);
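As a side note on the numbers in the log message quoted above (32.000 Gb/s for a 2.5 GT/s x16 link, 64.000 Gb/s for 5 GT/s x16), here is a short sketch of the arithmetic. It is an illustration written for this note, not the kernel's implementation, and it assumes 8b/10b line encoding at 2.5 and 5 GT/s and 128b/130b at higher rates.

```c
#include <assert.h>

/*
 * Usable link bandwidth in Mb/s for a raw per-lane rate given in MT/s.
 * Assumption: 8b/10b encoding up to 5 GT/s, 128b/130b above that.
 */
static unsigned long link_mbps(unsigned long mt_per_s, unsigned int lanes)
{
	assert(lanes > 0);

	if (mt_per_s <= 5000)
		return mt_per_s * 8 / 10 * lanes;

	return mt_per_s * 128 / 130 * lanes;
}
```

With these assumptions, link_mbps(2500, 16) gives 32000 and link_mbps(5000, 16) gives 64000, the two figures in the message.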
Re: [PATCH v1 1/2] perf cs-etm: Always allocate memory for cs_etm_queue::prev_packet
On Sun, Apr 28, 2019 at 04:32:27PM +0800, Leo Yan wrote:
> Robert Walker reported a segmentation fault is observed when process
> CoreSight trace data; this issue can be easily reproduced by the
> command 'perf report --itrace=i1000i' for decoding tracing data.
>
> If neither the 'b' flag (synthesize branches events) nor 'l' flag
> (synthesize last branch entries) are specified to option '--itrace',
> cs_etm_queue::prev_packet will not been initialised. After merging
> the code to support exception packets and sample flags, there
> introduced a number of uses of cs_etm_queue::prev_packet without
> checking whether it is valid, for these cases any accessing to
> uninitialised prev_packet will cause crash.
>
> As cs_etm_queue::prev_packet is used more widely now and it's already
> hard to follow which functions have been called in a context where the
> validity of cs_etm_queue::prev_packet has been checked, this patch
> always allocates memory for cs_etm_queue::prev_packet.
>
> Reported-by: Robert Walker
> Suggested-by: Robert Walker
> Fixes: 7100b12cf474 ("perf cs-etm: Generate branch sample for exception packet")

Thanks, applied both to perf/urgent, testing them now in the containers.

- Arnaldo

> Fixes: 24fff5eb2b93 ("perf cs-etm: Avoid stale branch samples when flush packet")
> Signed-off-by: Leo Yan
> ---
>  tools/perf/util/cs-etm.c | 8 +++-
>  1 file changed, 3 insertions(+), 5 deletions(-)
>
> diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
> index 110804936fc3..054b480aab04 100644
> --- a/tools/perf/util/cs-etm.c
> +++ b/tools/perf/util/cs-etm.c
> @@ -422,11 +422,9 @@ static struct cs_etm_queue *cs_etm__alloc_queue(struct cs_etm_auxtrace *etm)
>  	if (!etmq->packet)
>  		goto out_free;
>
> -	if (etm->synth_opts.last_branch || etm->sample_branches) {
> -		etmq->prev_packet = zalloc(szp);
> -		if (!etmq->prev_packet)
> -			goto out_free;
> -	}
> +	etmq->prev_packet = zalloc(szp);
> +	if (!etmq->prev_packet)
> +		goto out_free;
>
>  	if (etm->synth_opts.last_branch) {
>  		size_t sz = sizeof(struct branch_stack);
> --
> 2.17.1

--

- Arnaldo
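The idea behind the fix can be sketched in plain userspace C as follows. The struct and function names are simplified stand-ins invented for this note, not the real perf types; the point is only that allocating the previous-packet buffer unconditionally lets every later user dereference it without knowing which options were selected.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the real perf structures. */
struct packet {
	int sample_type;
};

struct queue {
	struct packet *packet;
	struct packet *prev_packet;
};

static void queue_free(struct queue *q)
{
	if (!q)
		return;
	free(q->prev_packet);
	free(q->packet);
	free(q);
}

/* Allocate both packets regardless of which synthesis options are set. */
static struct queue *queue_alloc(bool want_branches)
{
	struct queue *q = calloc(1, sizeof(*q));

	(void)want_branches;	/* no longer gates the prev_packet allocation */
	if (!q)
		return NULL;

	q->packet = calloc(1, sizeof(*q->packet));
	if (!q->packet)
		goto out_free;

	q->prev_packet = calloc(1, sizeof(*q->prev_packet));
	if (!q->prev_packet)
		goto out_free;

	return q;

out_free:
	queue_free(q);
	return NULL;
}

/* Safe by construction: prev_packet is never NULL for an allocated queue. */
static int queue_prev_sample_type(const struct queue *q)
{
	assert(q && q->prev_packet);

	return q->prev_packet->sample_type;
}
```

The cost is one small extra allocation per queue, in exchange for not having to audit every caller for a NULL check.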
[PATCH V2] rtc: snvs: Use __maybe_unused instead of #if CONFIG_PM_SLEEP
Use __maybe_unused for power management related functions instead of
#if CONFIG_PM_SLEEP to simplify the code.

Signed-off-by: Anson Huang
Reviewed-by: Dong Aisheng
---
Changes since V1:
	- use SET_NOIRQ_SYSTEM_SLEEP_PM_OPS() to make sure snvs_rtc_pm_ops
	  is empty when PM is off.
---
 drivers/rtc/rtc-snvs.c | 19 ---
 1 file changed, 4 insertions(+), 15 deletions(-)

diff --git a/drivers/rtc/rtc-snvs.c b/drivers/rtc/rtc-snvs.c
index e0edd594..7ee673a2 100644
--- a/drivers/rtc/rtc-snvs.c
+++ b/drivers/rtc/rtc-snvs.c
@@ -360,9 +360,7 @@ static int snvs_rtc_probe(struct platform_device *pdev)
 	return ret;
 }

-#ifdef CONFIG_PM_SLEEP
-
-static int snvs_rtc_suspend_noirq(struct device *dev)
+static int __maybe_unused snvs_rtc_suspend_noirq(struct device *dev)
 {
 	struct snvs_rtc_data *data = dev_get_drvdata(dev);

@@ -372,7 +370,7 @@ static int snvs_rtc_suspend_noirq(struct device *dev)
 	return 0;
 }

-static int snvs_rtc_resume_noirq(struct device *dev)
+static int __maybe_unused snvs_rtc_resume_noirq(struct device *dev)
 {
 	struct snvs_rtc_data *data = dev_get_drvdata(dev);

@@ -383,18 +381,9 @@ static int snvs_rtc_resume_noirq(struct device *dev)
 }

 static const struct dev_pm_ops snvs_rtc_pm_ops = {
-	.suspend_noirq = snvs_rtc_suspend_noirq,
-	.resume_noirq = snvs_rtc_resume_noirq,
+	SET_NOIRQ_SYSTEM_SLEEP_PM_OPS(snvs_rtc_suspend_noirq, snvs_rtc_resume_noirq)
 };

-#define SNVS_RTC_PM_OPS	(&snvs_rtc_pm_ops)
-
-#else
-
-#define SNVS_RTC_PM_OPS	NULL
-
-#endif
-
 static const struct of_device_id snvs_dt_ids[] = {
 	{ .compatible = "fsl,sec-v4.0-mon-rtc-lp", },
 	{ /* sentinel */ }
@@ -404,7 +393,7 @@ MODULE_DEVICE_TABLE(of, snvs_dt_ids);
 static struct platform_driver snvs_rtc_driver = {
 	.driver = {
 		.name	= "snvs_rtc",
-		.pm	= SNVS_RTC_PM_OPS,
+		.pm	= &snvs_rtc_pm_ops,
 		.of_match_table = snvs_dt_ids,
 	},
 	.probe		= snvs_rtc_probe,
--
2.7.4
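To illustrate why the ops structure has to become empty when PM is off, here is a rough userspace model of the pattern. Everything prefixed DEMO_ or demo_ is invented for this sketch, and the struct definitions are simplified stand-ins for the kernel ones; the unused attribute plays the role of __maybe_unused.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins; the real definitions live in the kernel headers. */
struct device {
	int suspended;
};

struct dev_pm_ops {
	int (*suspend_noirq)(struct device *dev);
	int (*resume_noirq)(struct device *dev);
};

#define DEMO_PM_SLEEP 1	/* set to 0 to model a build without PM sleep */

#if DEMO_PM_SLEEP
#define DEMO_SET_NOIRQ_OPS(s, r) .suspend_noirq = (s), .resume_noirq = (r),
#else
/* Expands to nothing; the resulting empty {} needs GNU C or C23. */
#define DEMO_SET_NOIRQ_OPS(s, r)
#endif

/* The attribute silences "defined but not used" when the macro is empty. */
static int __attribute__((unused)) demo_suspend_noirq(struct device *dev)
{
	dev->suspended = 1;
	return 0;
}

static int __attribute__((unused)) demo_resume_noirq(struct device *dev)
{
	dev->suspended = 0;
	return 0;
}

static const struct dev_pm_ops demo_pm_ops = {
	DEMO_SET_NOIRQ_OPS(demo_suspend_noirq, demo_resume_noirq)
};

/* Call the suspend hook only if the build actually provided one. */
static int demo_call_suspend(struct device *dev)
{
	assert(dev != NULL);

	return demo_pm_ops.suspend_noirq ? demo_pm_ops.suspend_noirq(dev) : 0;
}
```

If the ops were assigned directly in the initializer, the two functions would always be referenced and therefore always compiled in, which is the objection raised against V1.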
[PATCH] kbuild: Enable -Wsometimes-uninitialized
This is Clang's version of GCC's -Wmaybe-uninitialized. Up to this point,
it has not been used because -Wuninitialized has been disabled, which also
turns off -Wsometimes-uninitialized, meaning that we miss out on finding
some bugs [1]. In my experience, it appears to be more accurate than GCC
and catch some things that GCC can't.

All of these warnings have now been fixed in -next across arm, arm64, and
x86_64 defconfig/allyesconfig so this should be enabled for everyone to
prevent more from easily creeping in. As of next-20190429:

$ git log --oneline --grep="sometimes-uninitialized" | wc -l
45

[1]: https://lore.kernel.org/lkml/86649ee4-9794-77a3-502c-f4cd10019...@lca.pw/
Link: https://github.com/ClangBuiltLinux/linux/issues/381
Signed-off-by: Nathan Chancellor
---

Masahiro, I am not sure how you want to handle merging this with regards
to all of the patches floating around in -next but I wanted to send this
out to let everyone know this is ready to be turned on.

Arnd, are there many remaining -Wsometimes-uninitialized warnings in
randconfigs?

 scripts/Makefile.extrawarn | 1 +
 1 file changed, 1 insertion(+)

diff --git a/scripts/Makefile.extrawarn b/scripts/Makefile.extrawarn
index 768306add591..f4332981ea85 100644
--- a/scripts/Makefile.extrawarn
+++ b/scripts/Makefile.extrawarn
@@ -72,5 +72,6 @@ KBUILD_CFLAGS += $(call cc-disable-warning, format)
 KBUILD_CFLAGS += $(call cc-disable-warning, sign-compare)
 KBUILD_CFLAGS += $(call cc-disable-warning, format-zero-length)
 KBUILD_CFLAGS += $(call cc-disable-warning, uninitialized)
+KBUILD_CFLAGS += $(call cc-option, -Wsometimes-uninitialized)
 endif
 endif
--
2.21.0
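For anyone unfamiliar with the warning, the pattern it is aimed at looks roughly like the function below, shown here in its fixed form. This is an illustrative example written for this note, not code taken from any of the -next fixes.

```c
#include <assert.h>

/*
 * If 'ret' were declared without an initializer, the final return would
 * read an indeterminate value whenever neither branch is taken, which is
 * the kind of path-dependent use -Wsometimes-uninitialized is meant to flag.
 */
static int parse_flag(int c)
{
	int ret = -1;	/* initialized up front: every path returns a defined value */

	if (c == 'y')
		ret = 1;
	else if (c == 'n')
		ret = 0;

	return ret;
}

/* Small self-check exercising all three paths. */
static void parse_flag_selfcheck(void)
{
	assert(parse_flag('y') == 1);
	assert(parse_flag('n') == 0);
	assert(parse_flag('?') == -1);
}
```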
[PATCH V2] clk: imx: pllv4: add fractional-N pll support
The pllv4 supports fractional-N function, the formula is: PLL output freq = input * (mult + num/denom), This patch adds fractional-N function support, including clock round rate, calculate rate and set rate, with this patch, the clock rate of APLL in clock tree is more accurate than before: Without fraction: apll_pre_sel 1112400 0 0 5 apll_pre_div 1122400 0 0 5 apll112 52800 0 0 5 apll_pfd3000 79200 0 0 5 apll_pfd2000 339428571 0 0 5 apll_pfd1000 35200 0 0 5 usdhc0000 35200 0 0 5 apll_pfd0111 35200 0 0 5 With fraction: apll_pre_sel 1112400 0 0 5 apll_pre_div 1122400 0 0 5 apll112 52920 0 0 5 apll_pfd3000 79380 0 0 5 apll_pfd2000 34020 0 0 5 apll_pfd1000 35280 0 0 5 usdhc0000 35280 0 0 5 apll_pfd0111 35280 0 0 5 Signed-off-by: Anson Huang Reviewed-by: Dong Aisheng --- drivers/clk/imx/clk-pllv4.c | 72 +++-- 1 file changed, 63 insertions(+), 9 deletions(-) diff --git a/drivers/clk/imx/clk-pllv4.c b/drivers/clk/imx/clk-pllv4.c index d38bc9f..d7e62c3 100644 --- a/drivers/clk/imx/clk-pllv4.c +++ b/drivers/clk/imx/clk-pllv4.c @@ -30,6 +30,9 @@ /* PLL Denominator Register (xPLLDENOM) */ #define PLL_DENOM_OFFSET 0x14 +#define MAX_MFD0x3fff +#define DEFAULT_MFD100 + struct clk_pllv4 { struct clk_hw hw; void __iomem*base; @@ -64,13 +67,20 @@ static unsigned long clk_pllv4_recalc_rate(struct clk_hw *hw, unsigned long parent_rate) { struct clk_pllv4 *pll = to_clk_pllv4(hw); - u32 div; + u32 mult, mfn, mfd; + u64 temp64; + + mult = readl_relaxed(pll->base + PLL_CFG_OFFSET); + mult &= BM_PLL_MULT; + mult >>= BP_PLL_MULT; - div = readl_relaxed(pll->base + PLL_CFG_OFFSET); - div &= BM_PLL_MULT; - div >>= BP_PLL_MULT; + mfn = readl_relaxed(pll->base + PLL_NUM_OFFSET); + mfd = readl_relaxed(pll->base + PLL_DENOM_OFFSET); + temp64 = parent_rate; + temp64 *= mfn; + do_div(temp64, mfd); - return parent_rate * div; + return (parent_rate * mult) + (u32)temp64; } static long clk_pllv4_round_rate(struct clk_hw *hw, unsigned long rate, @@ -78,14 +88,46 @@ static long 
clk_pllv4_round_rate(struct clk_hw *hw, unsigned long rate, { unsigned long parent_rate = *prate; unsigned long round_rate, i; + u32 mfn, mfd = DEFAULT_MFD; + bool found = false; + u64 temp64; for (i = 0; i < ARRAY_SIZE(pllv4_mult_table); i++) { round_rate = parent_rate * pllv4_mult_table[i]; - if (rate >= round_rate) - return round_rate; + if (rate >= round_rate) { + found = true; + break; + } + } + + if (!found) { + pr_warn("%s: unable to round rate %lu, parent rate %lu\n", + clk_hw_get_name(hw), rate, parent_rate); + return 0; } - return round_rate; + if (parent_rate <= MAX_MFD) + mfd = parent_rate; + + temp64 = (u64)(rate - round_rate); + temp64 *= mfd; + do_div(temp64, parent_rate); + mfn = temp64; + + /* +* NOTE: The value of numerator must always be configured to be +* less than the value of the denominator. If we can't get a proper +* pair of mfn/mfd, we simply return the round_rate without using +* the frac part. +*/ + if (mfn >= mfd) + return round_rate; + + temp64 = (u64)parent_rate; + temp64 *= mfn; + do_div(temp64, mfd); + + return round_rate + (u32)temp64; } static bool clk_pllv4_is_valid_mult(unsigned int mult) @@ -105,18 +147,30 @@ static int clk_pllv4_set_rate(struct clk_hw *hw, unsigned long rate, unsigned long parent_rate)
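A worked sketch of the formula from the commit message, output = input * (mult + num/denom), may help when checking rates by hand. The function names and the sample values (a 24 MHz parent, mult 22, num 1, denom 20) are chosen here for illustration and are not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * rate = parent * (mult + mfn / mfd), computed with a 64-bit intermediate
 * so that parent * mfn cannot overflow 32 bits.
 */
static uint64_t pll_frac_rate(uint32_t parent, uint32_t mult,
			      uint32_t mfn, uint32_t mfd)
{
	uint64_t frac;

	assert(mfd != 0 && mfn < mfd);
	frac = (uint64_t)parent * mfn / mfd;

	return (uint64_t)parent * mult + frac;
}

/* Inverse step: numerator that approximates the leftover rate (rounds down). */
static uint32_t pll_frac_mfn(uint32_t parent, uint64_t leftover, uint32_t mfd)
{
	assert(parent != 0);

	return (uint32_t)(leftover * mfd / parent);
}
```

With the sample values, the integer part gives 528000000 Hz and the fraction adds 1200000 Hz, for 529200000 Hz in total.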
Re: [PATCH 3/4] x86/ftrace: make ftrace_int3_handler() not to skip fops invocation
On Mon, Apr 29, 2019 at 05:08:46PM -0700, Sean Christopherson wrote:
> On Mon, Apr 29, 2019 at 03:22:09PM -0700, Linus Torvalds wrote:
> > On Mon, Apr 29, 2019 at 3:08 PM Sean Christopherson
> > wrote:
> > >
> > > FWIW, Lakemont (Quark) doesn't block NMI/SMI in the STI shadow, but I'm
> > > not sure that counters the "horrible errata" statement ;-). SMI+RSM saves
> > > and restores STI blocking in that case, but AFAICT NMI has no such
> > > protection and will effectively break the shadow on its IRET.
> >
> > Ugh. I can't say I care deeply about Quark (ie never seemed to go
> > anywhere), but it's odd. I thought it was based on a Pentium core (or
> > i486+?). Are you saying those didn't do it either?
>
> It's 486 based, but either way I suspect the answer is "yes". IIRC,
> Knights Corner, a.k.a. Larrabee, also had funkiness around SMM and that
> was based on P54C, though I'm struggling to recall exactly what the
> Larrabee weirdness was.

Aha! Found an ancient comment that explicitly states P5 does not block
NMI/SMI in the STI shadow, while P6 does block NMI/SMI.
Re: [PATCH 1/2] dt-bindings: Add ir38064 as a trivial device
On Tue, Apr 16, 2019 at 08:41:38AM -0700, Patrick Venture wrote:
> The ir38064 is a voltage regulator from Infineon.
>
> Signed-off-by: Patrick Venture
> ---
>  Documentation/devicetree/bindings/trivial-devices.yaml | 2 ++
>  1 file changed, 2 insertions(+)

Patch 1 and 2 applied.

Rob
Re: [PATCH v2 2/8] dt-bindings: remoteproc: add bindings for stm32 remote processor driver
On Tue, Apr 16, 2019 at 04:58:13PM +0200, Fabien Dessenne wrote:
> Add the device tree bindings document for the stm32 remoteproc devices.
>
> Signed-off-by: Fabien Dessenne
> ---
>  .../devicetree/bindings/remoteproc/stm32-rproc.txt | 64 ++
>  1 file changed, 64 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/remoteproc/stm32-rproc.txt
>
> diff --git a/Documentation/devicetree/bindings/remoteproc/stm32-rproc.txt b/Documentation/devicetree/bindings/remoteproc/stm32-rproc.txt
> new file mode 100644
> index 000..430132c
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/remoteproc/stm32-rproc.txt
> @@ -0,0 +1,64 @@
> +STMicroelectronics STM32 Remoteproc
> +---
> +This document defines the binding for the remoteproc component that loads and
> +boots firmwares on the ST32MP family chipset.
> +
> +Required properties:
> +- compatible:	Must be "st,stm32mp1-m4"
> +- reg:		Address ranges of the remote processor dedicated memories.
> +		The parent node should provide an appropriate ranges property
> +		for properly translating these into bus addresses.

dma-ranges, but that's independent of 'reg'. It needs to list how many
reg regions and what they are.

> +- resets:	Reference to a reset controller asserting the remote processor.
> +- st,syscfg-holdboot: Reference to the system configuration which holds the
> +		remote processor reset hold boot
> +	1st cell: phandle of syscon block
> +	2nd cell: register offset containing the hold boot setting
> +	3rd cell: register bitmask for the hold boot field
> +- st,syscfg-tz: Reference to the system configuration which holds the RCC trust
> +		zone mode
> +	1st cell: phandle to syscon block
> +	2nd cell: register offset containing the RCC trust zone mode setting
> +	3rd cell: register bitmask for the RCC trust zone mode bit
> +
> +Optional properties:
> +- interrupts:	Should contain the watchdog interrupt
> +- mboxes:	This property is required only if the rpmsg/virtio functionality
> +		is used. List of phandle and mailbox channel specifiers:
> +		- a channel (a) used to communicate through virtqueues with the
> +		  remote proc.
> +		  Bi-directional channel:
> +		      - from local to remote = send message
> +		      - from remote to local = send message ack
> +		- a channel (b) working the opposite direction of channel (a)
> +		- a channel (c) used by the local proc to notify the remote proc
> +		  that it is about to be shut down.
> +		  Unidirectional channel:
> +		      - from local to remote, where ACK from the remote means
> +		        that it is ready for shutdown
> +- mbox-names:	This property is required if the mboxes property is used.
> +		- must be "vq0" for channel (a)
> +		- must be "vq1" for channel (b)
> +		- must be "shutdown" for channel (c)
> +- memory-region: List of phandles to the reserved memory regions associated with
> +		the remoteproc device. This is variable and describes the
> +		memories shared with the remote processor (eg: remoteproc
> +		firmware and carveouts, rpmsg vrings, ...).
> +		(see ../reserved-memory/reserved-memory.txt)
> +- st,syscfg-pdds: Reference to the system configuration which holds the remote
> +		processor deep sleep setting
> +	1st cell: phandle to syscon block
> +	2nd cell: register offset containing the deep sleep setting
> +	3rd cell: register bitmask for the deep sleep bit
> +- auto_boot:	If defined, when remoteproc is probed, it loads the default
> +		firmware and starts the remote processor.

st,auto-boot

> +
> +Example:
> +	m4_rproc: m4@0 {
> +		compatible = "st,stm32mp1-m4";
> +		reg = <0x 0x1>,
> +		      <0x1000 0x4>,
> +		      <0x3000 0x4>;
> +		resets = < MCU_R>;
> +		st,syscfg-holdboot = < 0x10C 0x1>;
> +		st,syscfg-tz = < 0x000 0x1>;
> +	};
> --
> 2.7.4
>