Re: [PATCH] serial/sifive: select SERIAL_EARLYCON
On Sep 10 2019, Christoph Hellwig wrote:

> The sifive serial driver implements earlycon support,

It should probably be documented in admin-guide/kernel-parameters.txt.

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE 1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."
Re: [PATCH] media: vimc: fla: Add virtual flash subdevice
On 9/10/19 1:00 AM, Lucas Magalhães wrote:
> Hi Hans,
> Thanks for the review. I fixed most of the issues you found. Just have
> the question below.
>
> On Mon, Sep 2, 2019 at 9:04 AM Hans Verkuil wrote:
>>
>>> +
>>> +int vimc_fla_add(struct vimc_device *vimc, struct vimc_ent_config *vcfg)
>>> +{
>>> +	struct v4l2_device *v4l2_dev = &vimc->v4l2_dev;
>>> +	struct vimc_fla_device *vfla;
>>> +	int ret;
>>> +
>>> +	/* Allocate the vfla struct */
>>> +	vfla = kzalloc(sizeof(*vfla), GFP_KERNEL);
>>> +	if (!vfla)
>>> +		return -ENOMEM;
>>> +
>>> +	v4l2_ctrl_handler_init(&vfla->hdl, 4);
>>> +
>>> +	v4l2_ctrl_new_std_menu(&vfla->hdl, &vimc_fla_ctrl_ops,
>>> +			       V4L2_CID_FLASH_LED_MODE,
>>> +			       V4L2_FLASH_LED_MODE_TORCH, ~0x7,
>>> +			       V4L2_FLASH_LED_MODE_NONE);
>>> +	v4l2_ctrl_new_std_menu(&vfla->hdl, &vimc_fla_ctrl_ops,
>>> +			       V4L2_CID_FLASH_STROBE_SOURCE, 0x1, ~0x3,
>>> +			       V4L2_FLASH_STROBE_SOURCE_SOFTWARE);
>>> +	v4l2_ctrl_new_std(&vfla->hdl, &vimc_fla_ctrl_ops,
>>> +			  V4L2_CID_FLASH_STROBE, 0, 0, 0, 0);
>>> +	v4l2_ctrl_new_std(&vfla->hdl, &vimc_fla_ctrl_ops,
>>> +			  V4L2_CID_FLASH_STROBE_STOP, 0, 0, 0, 0);
>>> +	v4l2_ctrl_new_std(&vfla->hdl, &vimc_fla_ctrl_ops,
>>> +			  V4L2_CID_FLASH_TIMEOUT, 1, 10, 1, 10);
>>> +	v4l2_ctrl_new_std(&vfla->hdl, &vimc_fla_ctrl_ops,
>>> +			  V4L2_CID_FLASH_TORCH_INTENSITY, 0, 255, 1, 255);
>>> +	v4l2_ctrl_new_std(&vfla->hdl, &vimc_fla_ctrl_ops,
>>> +			  V4L2_CID_FLASH_INTENSITY, 0, 255, 1, 255);
>>> +	v4l2_ctrl_new_std(&vfla->hdl, &vimc_fla_ctrl_ops,
>>> +			  V4L2_CID_FLASH_INDICATOR_INTENSITY, 0, 255, 1, 255);
>>> +	v4l2_ctrl_new_std(&vfla->hdl, &vimc_fla_ctrl_ops,
>>> +			  V4L2_CID_FLASH_STROBE_STATUS, 0, 0, 0, 0);
>>
>> It would be nice if this would actually reflect the actual strobe status.
>>
> Regarding the strobe status I was reading the code and found out that
> V4L2_CID_FLASH_STROBE_STATUS is a V4L2_CTRL_FLAG_READ_ONLY
> but it's not a V4L2_CTRL_FLAG_VOLATILE. I found this intriguing. How am
> I supposed to get it if it's not volatile? As I understood, it changes over
> time if the strobe starts and the timeout expires, doesn't it? Shouldn't it
> be volatile if so?

A non-volatile read-only control is set deterministically by the driver.
So the driver calls v4l2_ctrl_s_ctrl() to change the control's value.

A volatile read-only control is one where the value is read from a hardware
register that is continuously changing. E.g. if autogain is on, then the
gain register in a device contains the currently calculated gain, but that
might have changed the next time the register is read.

Regards,

	Hans

>
> I've already made a simple implementation where V4L2_CID_FLASH_STROBE_STATUS
> returns true after calling V4L2_CID_FLASH_STROBE and becomes false after the
> timeout time passes.
>
> Thanks!
>
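The distinction Hans draws above can be sketched in plain userspace C. This is only a model under stated assumptions: none of the `model_*` names exist in V4L2, `model_driver_set()` merely stands in for the role v4l2_ctrl_s_ctrl() plays in a driver, and the "hardware register" is a counter invented for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a read-only control; not a V4L2 structure. */
struct model_ctrl {
	bool is_volatile;		/* models V4L2_CTRL_FLAG_VOLATILE */
	int32_t cached;			/* value kept by the framework */
	int32_t (*read_hw)(void);	/* only consulted for volatile controls */
};

/* Driver-side update of a non-volatile control (assumed analogue of
 * calling v4l2_ctrl_s_ctrl() when the driver knows the state changed). */
static void model_driver_set(struct model_ctrl *c, int32_t val)
{
	c->cached = val;
}

/* What a read from userspace would observe in this model. */
static int32_t model_get(const struct model_ctrl *c)
{
	if (c->is_volatile && c->read_hw)
		return c->read_hw();	/* re-read on every access */
	return c->cached;		/* last value the driver set */
}

/* Invented "gain register" that changes on its own between reads. */
static int32_t model_gain_reg = 10;

static int32_t model_read_gain(void)
{
	return model_gain_reg++;
}
```

In this model a strobe status control would be non-volatile: the driver sets it to 1 when the strobe fires and back to 0 when its timeout handler runs, so no hardware read is needed at query time.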
[tip: core/objtool] objtool: Clobber user CFLAGS variable
The following commit has been merged into the core/objtool branch of tip:

Commit-ID:     f73b3cc39c84220e6dccd463b5c8279b03514646
Gitweb:        https://git.kernel.org/tip/f73b3cc39c84220e6dccd463b5c8279b03514646
Author:        Josh Poimboeuf
AuthorDate:    Thu, 29 Aug 2019 18:28:49 -05:00
Committer:     Ingo Molnar
CommitterDate: Tue, 10 Sep 2019 08:49:52 +02:00

objtool: Clobber user CFLAGS variable

If the build user has the CFLAGS variable set in their environment,
objtool blindly appends to it, which can cause unexpected behavior.

Clobber CFLAGS to ensure consistent objtool compilation behavior.

Reported-by: Valdis Kletnieks
Tested-by: Valdis Kletnieks
Signed-off-by: Josh Poimboeuf
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Link: https://lkml.kernel.org/r/83a276df209962e6058fcb6c615eef9d401c21bc.1567121311.git.jpoim...@redhat.com
Signed-off-by: Ingo Molnar
---
 tools/objtool/Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/objtool/Makefile b/tools/objtool/Makefile
index 8815823..20f67fc 100644
--- a/tools/objtool/Makefile
+++ b/tools/objtool/Makefile
@@ -35,7 +35,7 @@ INCLUDES := -I$(srctree)/tools/include \
 	    -I$(srctree)/tools/arch/$(HOSTARCH)/include/uapi \
 	    -I$(srctree)/tools/objtool/arch/$(ARCH)/include
 WARNINGS := $(EXTRA_WARNINGS) -Wno-switch-default -Wno-switch-enum -Wno-packed
-CFLAGS   += -Werror $(WARNINGS) $(KBUILD_HOSTCFLAGS) -g $(INCLUDES) $(LIBELF_FLAGS)
+CFLAGS   := -Werror $(WARNINGS) $(KBUILD_HOSTCFLAGS) -g $(INCLUDES) $(LIBELF_FLAGS)
 LDFLAGS  += $(LIBELF_LIBS) $(LIBSUBCMD) $(KBUILD_HOSTLDFLAGS)
 
 # Allow old libelf to be used:
[PATCH net 2/2] sctp: destroy bucket if failed to bind addr
There is one memory leak bug report:

BUG: memory leak
unreferenced object 0x8881dc4c5ec0 (size 40):
  comm "syz-executor.0", pid 5673, jiffies 4298198457 (age 27.578s)
  hex dump (first 32 bytes):
    02 00 00 00 81 88 ff ff 00 00 00 00 00 00 00 00
    f8 63 3d c1 81 88 ff ff 00 00 00 00 00 00 00 00  .c=.
  backtrace:
    [<72006339>] sctp_get_port_local+0x2a1/0xa00 [sctp]
    [] sctp_do_bind+0x176/0x2c0 [sctp]
    [<5be274a2>] sctp_bind+0x5a/0x80 [sctp]
    [ ] inet6_bind+0x59/0xd0 [ipv6]
    [ ] __sys_bind+0x120/0x1f0 net/socket.c:1647
    [<4513635b>] __do_sys_bind net/socket.c:1658 [inline]
    [<4513635b>] __se_sys_bind net/socket.c:1656 [inline]
    [<4513635b>] __x64_sys_bind+0x3e/0x50 net/socket.c:1656
    [<61f2501e>] do_syscall_64+0x72/0x2e0 arch/x86/entry/common.c:296
    [<03d1e05e>] entry_SYSCALL_64_after_hwframe+0x49/0xbe

This is because in sctp_do_bind, if sctp_get_port_local creates the hash
bucket successfully and sctp_add_bind_addr then fails to bind the address
(e.g. returns -ENOMEM), the allocated bucket is leaked. It needs to be
destroyed on that error path.

Reported-by: Hulk Robot
Signed-off-by: Mao Wenan
---
 net/sctp/socket.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index 766b68b55ebe..ab37fc1f7bb6 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -412,11 +412,13 @@ static int sctp_do_bind(struct sock *sk, union sctp_addr *addr, int len)
 	ret = sctp_add_bind_addr(bp, addr, af->sockaddr_len,
 				 SCTP_ADDR_SRC, GFP_ATOMIC);
 
-	/* Copy back into socket for getsockname() use. */
-	if (!ret) {
-		inet_sk(sk)->inet_sport = htons(inet_sk(sk)->inet_num);
-		sp->pf->to_sk_saddr(addr, sk);
+	if (ret) {
+		sctp_put_port(sk);
+		return ret;
 	}
+	/* Copy back into socket for getsockname() use. */
+	inet_sk(sk)->inet_sport = htons(inet_sk(sk)->inet_num);
+	sp->pf->to_sk_saddr(addr, sk);
 
 	return ret;
 }
-- 
2.20.1
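The shape of the fix above (release what the first step acquired when the second step fails) can be illustrated with a small userspace model. Everything named `model_*` is hypothetical; the counter stands in for the hash bucket that sctp_get_port_local allocates and sctp_put_port releases, and no real SCTP code is reproduced here.

```c
#include <errno.h>

/* Number of live "port buckets" in this model; a leak shows up as a
 * non-zero count after a failed bind. */
static int model_buckets_in_use;

/* Stand-in for the step that allocates the bucket. */
static int model_get_port(void)
{
	model_buckets_in_use++;
	return 0;
}

/* Stand-in for the step that releases the bucket. */
static void model_put_port(void)
{
	model_buckets_in_use--;
}

/* Stand-in for the step that can fail after the bucket exists. */
static int model_add_bind_addr(int simulate_enomem)
{
	return simulate_enomem ? -ENOMEM : 0;
}

/* Mirrors the control flow of the patched error path. */
static int model_do_bind(int simulate_enomem)
{
	int ret;

	if (model_get_port())
		return -EADDRINUSE;

	ret = model_add_bind_addr(simulate_enomem);
	if (ret) {
		model_put_port();	/* undo the earlier acquisition */
		return ret;
	}

	return 0;
}
```

Without the `model_put_port()` call on the failure branch, the counter would stay at 1 after a failed bind, which is the situation the leak report describes.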
[PATCH net 0/2] fix memory leak for sctp_do_bind
The first patch is a cleanup that removes a redundant assignment; the
second patch fixes a memory leak in sctp_do_bind when binding the address
fails.

Mao Wenan (2):
  sctp: remove redundant assignment when call sctp_get_port_local
  sctp: destroy bucket if failed to bind addr

 net/sctp/socket.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

-- 
2.20.1
[PATCH net 1/2] sctp: remove redundant assignment when call sctp_get_port_local
The if clause that calls sctp_get_port_local in sctp_do_bind has extra
parentheses and a redundant assignment to 'ret'. This patch cleans that up.

Signed-off-by: Mao Wenan
---
 net/sctp/socket.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index 9d1f83b10c0a..766b68b55ebe 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -399,9 +399,8 @@ static int sctp_do_bind(struct sock *sk, union sctp_addr *addr, int len)
 	 * detection.
 	 */
 	addr->v4.sin_port = htons(snum);
-	if ((ret = sctp_get_port_local(sk, addr))) {
+	if (sctp_get_port_local(sk, addr))
 		return -EADDRINUSE;
-	}
 
 	/* Refresh ephemeral port. */
 	if (!bp->port)
-- 
2.20.1
[PATCH v3][RESEND] scripts: use pkg-config to locate libcrypto
Otherwise build fails if the headers are not in the default location. While at
it also ask pkg-config for the libs, with fallback to the existing value.

Signed-off-by: Rolf Eike Beer
Cc: sta...@vger.kernel.org
---
 scripts/Makefile | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/scripts/Makefile b/scripts/Makefile
index 16bcb8087899..1715adcd8f81 100644
--- a/scripts/Makefile
+++ b/scripts/Makefile
@@ -8,7 +8,11 @@
 # conmakehash: Create chartable
 # conmakehash: Create arrays for initializing the kernel console tables
 
+PKG_CONFIG?= pkg-config
+
 HOST_EXTRACFLAGS += -I$(srctree)/tools/include
+CRYPTO_LIBS = $(shell $(PKG_CONFIG) --libs libcrypto 2> /dev/null || echo -lcrypto)
+CRYPTO_CFLAGS = $(shell $(PKG_CONFIG) --cflags libcrypto 2> /dev/null)
 
 hostprogs-$(CONFIG_BUILD_BIN2C) += bin2c
 hostprogs-$(CONFIG_KALLSYMS) += kallsyms
@@ -23,8 +27,9 @@ hostprogs-$(CONFIG_SYSTEM_EXTRA_CERTIFICATE) += insert-sys-cert
 
 HOSTCFLAGS_sortextable.o = -I$(srctree)/tools/include
 HOSTCFLAGS_asn1_compiler.o = -I$(srctree)/include
-HOSTLDLIBS_sign-file = -lcrypto
-HOSTLDLIBS_extract-cert = -lcrypto
+HOSTLDLIBS_sign-file = $(CRYPTO_LIBS)
+HOSTCFLAGS_extract-cert.o = $(CRYPTO_CFLAGS)
+HOSTLDLIBS_extract-cert = $(CRYPTO_LIBS)
 
 always := $(hostprogs-y) $(hostprogs-m)
 
-- 
2.23.0
Re: [RFC PATCH untested] vhost: block speculation of translated descriptors
On Tue, Sep 10, 2019 at 09:52:10AM +0800, Jason Wang wrote:
>
> On 2019/9/9 10:45 PM, Michael S. Tsirkin wrote:
> > On Mon, Sep 09, 2019 at 03:19:55PM +0800, Jason Wang wrote:
> > > On 2019/9/8 7:05 PM, Michael S. Tsirkin wrote:
> > > > iovec addresses coming from vhost are assumed to be
> > > > pre-validated, but in fact can be speculated to a value
> > > > out of range.
> > > >
> > > > Userspace addresses are later validated with array_index_nospec so we can
> > > > be sure kernel info does not leak through these addresses, but vhost
> > > > must also not leak userspace info outside the allowed memory table to
> > > > guests.
> > > >
> > > > Following the defence in depth principle, make sure
> > > > the address is not validated out of node range.
> > > >
> > > > Signed-off-by: Michael S. Tsirkin
> > > > ---
> > > >  drivers/vhost/vhost.c | 4 +++-
> > > >  1 file changed, 3 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> > > > index 5dc174ac8cac..0ee375fb7145 100644
> > > > --- a/drivers/vhost/vhost.c
> > > > +++ b/drivers/vhost/vhost.c
> > > > @@ -2072,7 +2072,9 @@ static int translate_desc(struct vhost_virtqueue *vq, u64 addr, u32 len,
> > > >  		size = node->size - addr + node->start;
> > > >  		_iov->iov_len = min((u64)len - s, size);
> > > >  		_iov->iov_base = (void __user *)(unsigned long)
> > > > -			(node->userspace_addr + addr - node->start);
> > > > +			(node->userspace_addr +
> > > > +			 array_index_nospec(addr - node->start,
> > > > +					    node->size));
> > > >  		s += size;
> > > >  		addr += size;
> > > >  		++ret;
> > >
> > > I've tried this on Kaby Lake smap off metadata acceleration off using
> > > testpmd (virtio-user) + vhost_net. I don't see obvious performance
> > > difference with TX PPS.
> > >
> > > Thanks
> > Should I push this to Linus right now then? It's a security thing so
> > maybe we better do it ASAP ... what's your opinion?
>
>
> Yes, you can.
>
> Acked-by: Jason Wang

And should I include

Tested-by: Jason Wang

?
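For readers unfamiliar with array_index_nospec(), the arithmetic behind the clamp used in the quoted hunk can be sketched in userspace C. This is modelled on the generic C fallback for the index mask as I understand it; the `model_*` names are invented here, and the sketch shows only the branch-free clamping, not any protection against speculation (the kernel additionally hides the index from the optimizer and lets architectures supply their own implementation).

```c
#include <limits.h>

/* All-ones when index < size, zero otherwise, computed without a branch.
 * Assumes index and size are both below LONG_MAX. Right-shifting a
 * negative signed long is implementation-defined in ISO C; mainstream
 * compilers perform an arithmetic shift, which this sketch relies on. */
static unsigned long model_index_mask(unsigned long index, unsigned long size)
{
	return (unsigned long)(~(long)(index | (size - 1UL - index)) >>
			       (sizeof(long) * CHAR_BIT - 1));
}

/* In-range indexes pass through unchanged; out-of-range ones become 0. */
static unsigned long model_index_nospec(unsigned long index, unsigned long size)
{
	return index & model_index_mask(index, size);
}

/* Shape of the patched expression: the offset into the node is clamped
 * to the node size before being added to the userspace base address. */
static unsigned long model_translate(unsigned long userspace_addr,
				     unsigned long addr,
				     unsigned long node_start,
				     unsigned long node_size)
{
	return userspace_addr + model_index_nospec(addr - node_start, node_size);
}
```

The point of the design is that an offset which is architecturally in range is untouched, so the fast path costs only a few ALU operations, which is consistent with the "no obvious performance difference" observation quoted above.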
Re: [PATCH] lib/Kconfig: fix OBJAGG in lib/ menu structure
On Mon, Sep 09, 2019 at 02:54:21PM -0700, Randy Dunlap wrote:
> From: Randy Dunlap
>
> Keep the "Library routines" menu intact by moving OBJAGG into it.
> Otherwise OBJAGG is displayed/presented as an orphan in the
> various config menus.
>
> Fixes: 0a020d416d0a ("lib: introduce initial implementation of object aggregation manager")
> Signed-off-by: Randy Dunlap
> Cc: Jiri Pirko
> Cc: Ido Schimmel
> Cc: David S. Miller

Tested-by: Ido Schimmel

Thanks!
Re: [PATCH] ocfs2: Fix passing zero to 'PTR_ERR' warning
On 19/9/9 18:04, Ding Xiang wrote:
> Fix a static code checker warning:
> fs/ocfs2/acl.c:331
> ocfs2_acl_chmod() warn: passing zero to 'PTR_ERR'
>
> Fixes: 5ee0fbd50fd ("ocfs2: revert using ocfs2_acl_chmod to avoid inode cluster lock hang")
> Signed-off-by: Ding Xiang

Reviewed-by: Joseph Qi

> ---
>  fs/ocfs2/acl.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/fs/ocfs2/acl.c b/fs/ocfs2/acl.c
> index 3e7da39..bb981ec 100644
> --- a/fs/ocfs2/acl.c
> +++ b/fs/ocfs2/acl.c
> @@ -327,8 +327,8 @@ int ocfs2_acl_chmod(struct inode *inode, struct buffer_head *bh)
>  	down_read(&OCFS2_I(inode)->ip_xattr_sem);
>  	acl = ocfs2_get_acl_nolock(inode, ACL_TYPE_ACCESS, bh);
>  	up_read(&OCFS2_I(inode)->ip_xattr_sem);
> -	if (IS_ERR(acl) || !acl)
> -		return PTR_ERR(acl);
> +	if (IS_ERR_OR_NULL(acl))
> +		return PTR_ERR_OR_ZERO(acl);
>  	ret = __posix_acl_chmod(&acl, GFP_KERNEL, inode->i_mode);
>  	if (ret)
>  		return ret;
>
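The error-pointer convention behind the quoted change can be modelled in userspace C. The helpers below are hypothetical re-implementations written for illustration (the `model_*` names and `MODEL_MAX_ERRNO` are not kernel symbols); they follow the usual convention that the top 4095 addresses encode negative errno values. As far as I can tell the patch does not change behaviour: a NULL acl yielded 0 before and still does, the new spelling just makes that explicit for the checker.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_MAX_ERRNO 4095

/* Encode a negative errno as a pointer value. */
static void *model_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

/* Decode a pointer value back into a long. */
static long model_ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

/* True for pointers in the top MODEL_MAX_ERRNO bytes of the address space. */
static bool model_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MODEL_MAX_ERRNO;
}

static bool model_is_err_or_null(const void *p)
{
	return !p || model_is_err(p);
}

/* The errno for an error pointer, 0 for anything else (including NULL). */
static long model_ptr_err_or_zero(const void *p)
{
	return model_is_err(p) ? model_ptr_err(p) : 0;
}

/* Shape of the patched check: NULL means "nothing to do" (0), an error
 * pointer is propagated, and a valid object continues (1 stands in for
 * the rest of the function). */
static long model_acl_chmod(void *acl)
{
	if (model_is_err_or_null(acl))
		return model_ptr_err_or_zero(acl);
	return 1;
}
```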
Re: [PATCH] driver core: ensure a device has valid node id in device_add()
On 2019/9/9 17:53, Greg KH wrote:
> On Mon, Sep 09, 2019 at 02:04:23PM +0800, Yunsheng Lin wrote:
>> Currently a device does not belong to any of the numa nodes
>> (dev->numa_node is NUMA_NO_NODE) when the node id is neither
>> specified by fw nor by virtual device layer and the device has
>> no parent device.
>
> Is this really a problem?

Not really.
Someone needs to guess the node id when it is not specified, right?
This patch chooses to guess the node id in the driver core.

>
>> According to discussion in [1]:
>> Even if a device's numa node is not specified, the device really
>> does belong to a node.
>
> But as we do not know the node, can we cause more harm by randomly
> picking one (i.e. putting it all in node 0)?

If we do not pick node 0 for a device with an invalid node, then the caller
needs to check the node id and pick one, and currently different callers do
different checks:
1) some do a "< 0" check;
2) some do a "== NUMA_NO_NODE" check;
3) some do a ">= MAX_NUMNODES" check;
4) some do a "< 0 || >= MAX_NUMNODES || !node_online(node)" check.

And a caller of dev_to_node() may pick a node as follows if dev_to_node()
returns an invalid node according to the checks above:
1) based on numa_mem_id();
2) pick a random one, like in workqueue_select_cpu_near().

If we pick node 0 for a device with an invalid node in device_add(), we may
spare the callers the different checking and picking above, but we may lose
some caller context info. For example, a user may use the node of the cpu on
which the process using the device is running to allocate the resource close
to the process, or a user may pick a random one if they know what they are
doing.

It seems there is a trade-off here. As I see it, we can guess and pick the
node at different stages when it is not specified:

1. Guess and pick node 0 at device_add(). This has the advantage of ensuring
   that all devices have a valid node from the very beginning of device
   creation, so the user does not have to check and guess one, but the user
   might lose the opportunity to do their own guessing and picking.

2. Maybe provide a dev_to_valid_node() that always returns a valid node id,
   for example returning numa_mem_id() if dev->numa_node is not valid.
   Users who know what they are doing can still use dev_to_node().

3. Callers of dev_to_node() do their own checking and picking, which might
   lead to adding more different and duplicated checks like the ones above.
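Option 2 above can be sketched as a small userspace model. Note that dev_to_valid_node() is only a proposal in this thread, not an existing kernel function; the `model_*` names, the four-node limit, and the online map below are all invented for illustration, and the caller-supplied local node stands in for what numa_mem_id() would return.

```c
#define MODEL_NUMA_NO_NODE	(-1)
#define MODEL_MAX_NUMNODES	4

/* Invented online map: nodes 0 and 1 are online, 2 and 3 are not. */
static const int model_node_online[MODEL_MAX_NUMNODES] = { 1, 1, 0, 0 };

/* The strictest of the checks listed above, done in one place. */
static int model_node_valid(int node)
{
	return node >= 0 && node < MODEL_MAX_NUMNODES &&
	       model_node_online[node];
}

/* Return the device's node if it is usable, otherwise fall back to the
 * caller's local node. */
static int model_dev_to_valid_node(int dev_node, int local_node)
{
	return model_node_valid(dev_node) ? dev_node : local_node;
}
```

The design choice this illustrates is that the fallback happens at lookup time rather than at device creation, so the stored node id keeps saying "unknown" for callers that want to apply their own policy.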
[tip: x86/asm] x86/umip: Add emulation (spoofing) for UMIP covered instructions in 64-bit processes as well
The following commit has been merged into the x86/asm branch of tip: Commit-ID: e86c2c8b9380440bbe761b8e2f63ab6b04a45ac2 Gitweb: https://git.kernel.org/tip/e86c2c8b9380440bbe761b8e2f63ab6b04a45ac2 Author:Brendan Shanks AuthorDate:Thu, 05 Sep 2019 16:22:21 -07:00 Committer: Ingo Molnar CommitterDate: Tue, 10 Sep 2019 08:36:16 +02:00 x86/umip: Add emulation (spoofing) for UMIP covered instructions in 64-bit processes as well Add emulation (spoofing) of the SGDT, SIDT, and SMSW instructions for 64-bit processes. Wine users have encountered a number of 64-bit Windows games that use these instructions (particularly SGDT), and were crashing when run on UMIP-enabled systems. Originally-by: Ricardo Neri Signed-off-by: Brendan Shanks Reviewed-by: Ricardo Neri Reviewed-by: H. Peter Anvin (Intel) Cc: Andy Lutomirski Cc: Borislav Petkov Cc: Brian Gerst Cc: Denys Vlasenko Cc: Eric W. Biederman Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Link: https://lkml.kernel.org/r/2019090523.14900-1-bsha...@codeweavers.com [ Minor edits: capitalization, added 'spoofing' wording. ] Signed-off-by: Ingo Molnar --- arch/x86/kernel/umip.c | 65 +++-- 1 file changed, 38 insertions(+), 27 deletions(-) diff --git a/arch/x86/kernel/umip.c b/arch/x86/kernel/umip.c index 5b345ad..548fefe 100644 --- a/arch/x86/kernel/umip.c +++ b/arch/x86/kernel/umip.c @@ -19,7 +19,7 @@ /** DOC: Emulation for User-Mode Instruction Prevention (UMIP) * * The feature User-Mode Instruction Prevention present in recent Intel - * processor prevents a group of instructions (sgdt, sidt, sldt, smsw, and str) + * processor prevents a group of instructions (SGDT, SIDT, SLDT, SMSW and STR) * from being executed with CPL > 0. Otherwise, a general protection fault is * issued. * @@ -36,8 +36,8 @@ * DOSEMU2) rely on this subset of instructions to function. * * The instructions protected by UMIP can be split in two groups. 
Those which - * return a kernel memory address (sgdt and sidt) and those which return a - * value (sldt, str and smsw). + * return a kernel memory address (SGDT and SIDT) and those which return a + * value (SLDT, STR and SMSW). * * For the instructions that return a kernel memory address, applications * such as WineHQ rely on the result being located in the kernel memory space, @@ -45,15 +45,13 @@ * value that, lies close to the top of the kernel memory. The limit for the GDT * and the IDT are set to zero. * - * Given that sldt and str are not commonly used in programs that run on WineHQ + * Given that SLDT and STR are not commonly used in programs that run on WineHQ * or DOSEMU2, they are not emulated. * * The instruction smsw is emulated to return the value that the register CR0 * has at boot time as set in the head_32. * - * Also, emulation is provided only for 32-bit processes; 64-bit processes - * that attempt to use the instructions that UMIP protects will receive the - * SIGSEGV signal issued as a consequence of the general protection fault. + * Emulation is provided for both 32-bit and 64-bit processes. * * Care is taken to appropriately emulate the results when segmentation is * used. That is, rather than relying on USER_DS and USER_CS, the function @@ -63,17 +61,18 @@ * application uses a local descriptor table. */ -#define UMIP_DUMMY_GDT_BASE 0xfffe -#define UMIP_DUMMY_IDT_BASE 0x +#define UMIP_DUMMY_GDT_BASE 0xfffeULL +#define UMIP_DUMMY_IDT_BASE 0xULL /* * The SGDT and SIDT instructions store the contents of the global descriptor * table and interrupt table registers, respectively. The destination is a * memory operand of X+2 bytes. X bytes are used to store the base address of - * the table and 2 bytes are used to store the limit. In 32-bit processes, the - * only processes for which emulation is provided, X has a value of 4. + * the table and 2 bytes are used to store the limit. 
In 32-bit processes X + * has a value of 4, in 64-bit processes X has a value of 8. */ -#define UMIP_GDT_IDT_BASE_SIZE 4 +#define UMIP_GDT_IDT_BASE_SIZE_64BIT 8 +#define UMIP_GDT_IDT_BASE_SIZE_32BIT 4 #define UMIP_GDT_IDT_LIMIT_SIZE 2 #defineUMIP_INST_SGDT 0 /* 0F 01 /0 */ @@ -189,6 +188,7 @@ static int identify_insn(struct insn *insn) * @umip_inst: A constant indicating the instruction to emulate * @data: Buffer into which the dummy result is stored * @data_size: Size of the emulated result + * @x86_64:true if process is 64-bit, false otherwise * * Emulate an instruction protected by UMIP and provide a dummy result. The * result of the emulation is saved in @data. The size of the results depends @@ -202,11 +202,8 @@ static int identify_insn(struct insn *insn) * 0 on success, -EINVAL on error while emulating. */ static int emulate_umip_insn(struct insn *insn, int umip_inst, -unsigned char *da
[PATCH v2] x86/umip: Add emulation for 64-bit processes
* h...@zytor.com wrote: > On September 10, 2019 7:28:28 AM GMT+01:00, Ingo Molnar > wrote: > > > >* h...@zytor.com wrote: > > > >> I would strongly suggest that we change the term "emulation" to > >> "spoofing" for these instructions. We need to explain that we do > >*not* > >> execute these instructions the was the CPU would have, and unlike the > > > >> native instructions do not leak kernel information. > > > >Ok, I've edited the patch to add the 'spoofing' wording where > >appropriate, and I also made minor fixes such as consistently > >capitalizing instruction names. > > > >Can I also add your Reviewed-by tag? > > > >So the patch should show up in tip:x86/asm today-ish, and barring any > >complications is v5.4 material. > > > >Thanks, > > > > Ingo > > Yes, please do. > > Reviewed-by: H. Peter Anvin (Intel) Thanks! I've attached the updated version of the patch I'm testing. Ingo ==> >From e86c2c8b9380440bbe761b8e2f63ab6b04a45ac2 Mon Sep 17 00:00:00 2001 From: Brendan Shanks Date: Thu, 5 Sep 2019 16:22:21 -0700 Subject: [PATCH] x86/umip: Add emulation (spoofing) for UMIP covered instructions in 64-bit processes as well Add emulation (spoofing) of the SGDT, SIDT, and SMSW instructions for 64-bit processes. Wine users have encountered a number of 64-bit Windows games that use these instructions (particularly SGDT), and were crashing when run on UMIP-enabled systems. Originally-by: Ricardo Neri Signed-off-by: Brendan Shanks Reviewed-by: Ricardo Neri Reviewed-by: H. Peter Anvin (Intel) Cc: Andy Lutomirski Cc: Borislav Petkov Cc: Brian Gerst Cc: Denys Vlasenko Cc: Eric W. Biederman Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Link: https://lkml.kernel.org/r/2019090523.14900-1-bsha...@codeweavers.com [ Minor edits: capitalization, added 'spoofing' wording. 
] Signed-off-by: Ingo Molnar --- arch/x86/kernel/umip.c | 65 +- 1 file changed, 38 insertions(+), 27 deletions(-) diff --git a/arch/x86/kernel/umip.c b/arch/x86/kernel/umip.c index 5b345add550f..548fefed71ee 100644 --- a/arch/x86/kernel/umip.c +++ b/arch/x86/kernel/umip.c @@ -19,7 +19,7 @@ /** DOC: Emulation for User-Mode Instruction Prevention (UMIP) * * The feature User-Mode Instruction Prevention present in recent Intel - * processor prevents a group of instructions (sgdt, sidt, sldt, smsw, and str) + * processor prevents a group of instructions (SGDT, SIDT, SLDT, SMSW and STR) * from being executed with CPL > 0. Otherwise, a general protection fault is * issued. * @@ -36,8 +36,8 @@ * DOSEMU2) rely on this subset of instructions to function. * * The instructions protected by UMIP can be split in two groups. Those which - * return a kernel memory address (sgdt and sidt) and those which return a - * value (sldt, str and smsw). + * return a kernel memory address (SGDT and SIDT) and those which return a + * value (SLDT, STR and SMSW). * * For the instructions that return a kernel memory address, applications * such as WineHQ rely on the result being located in the kernel memory space, @@ -45,15 +45,13 @@ * value that, lies close to the top of the kernel memory. The limit for the GDT * and the IDT are set to zero. * - * Given that sldt and str are not commonly used in programs that run on WineHQ + * Given that SLDT and STR are not commonly used in programs that run on WineHQ * or DOSEMU2, they are not emulated. * * The instruction smsw is emulated to return the value that the register CR0 * has at boot time as set in the head_32. * - * Also, emulation is provided only for 32-bit processes; 64-bit processes - * that attempt to use the instructions that UMIP protects will receive the - * SIGSEGV signal issued as a consequence of the general protection fault. + * Emulation is provided for both 32-bit and 64-bit processes. 
* * Care is taken to appropriately emulate the results when segmentation is * used. That is, rather than relying on USER_DS and USER_CS, the function @@ -63,17 +61,18 @@ * application uses a local descriptor table. */ -#define UMIP_DUMMY_GDT_BASE 0xfffe -#define UMIP_DUMMY_IDT_BASE 0x +#define UMIP_DUMMY_GDT_BASE 0xfffeULL +#define UMIP_DUMMY_IDT_BASE 0xULL /* * The SGDT and SIDT instructions store the contents of the global descriptor * table and interrupt table registers, respectively. The destination is a * memory operand of X+2 bytes. X bytes are used to store the base address of - * the table and 2 bytes are used to store the limit. In 32-bit processes, the - * only processes for which emulation is provided, X has a value of 4. + * the table and 2 bytes are used to store the limit. In 32-bit processes X + * has a value of 4, in 64-bit processes X has a value of 8. */ -#define UMIP_GDT_IDT_BASE_SIZE 4 +#define UMIP_GDT_IDT_BASE_SIZE_64BIT 8 +#define UMIP_GDT_IDT_BASE_SIZE_32BIT 4 #define UMIP_GDT_IDT_LIMIT_S
Re: [PATCH] KVM: x86: Manually calculate reserved bits when loading PDPTRS
On Tue, Sep 03, 2019 at 04:36:45PM -0700, Sean Christopherson wrote:
> Manually generate the PDPTR reserved bit mask when explicitly loading
> PDPTRs. The reserved bits that are being tracked by the MMU reflect the
> current paging mode, which is unlikely to be PAE paging in the vast
> majority of flows that use load_pdptrs(), e.g. CR0 and CR4 emulation,
> __set_sregs(), etc... This can cause KVM to incorrectly signal a bad
> PDPTR, or more likely, miss a reserved bit check and subsequently fail
> a VM-Enter due to a bad VMCS.GUEST_PDPTR.
>
> Add a one off helper to generate the reserved bits instead of sharing
> code across the MMU's calculations and the PDPTR emulation. The PDPTR
> reserved bits are basically set in stone, and pushing a helper into
> the MMU's calculation adds unnecessary complexity without improving
> readability.
>
> Opportunistically fix/update the comment for load_pdptrs().
>
> Note, the buggy commit also introduced a deliberate functional change,
> "Also remove bit 5-6 from rsvd_bits_mask per latest SDM.", which was
> effectively (and correctly) reverted by commit cd9ae5fe47df ("KVM: x86:
> Fix page-tables reserved bits"). A bit of SDM archaeology shows that
> the SDM from late 2008 had a bug (likely a copy+paste error) where it
> listed bits 6:5 as AVL and A for PDPTEs used for 4k entries but reserved
> for 2mb entries. I.e. the SDM contradicted itself, and bits 6:5 are and
> always have been reserved.
>
> Fixes: 20c466b56168d ("KVM: Use rsvd_bits_mask in load_pdptrs()")
> Cc: sta...@vger.kernel.org
> Cc: Nadav Amit
> Reported-by: Doug Reiland
> Signed-off-by: Sean Christopherson

Maybe it would be even better with a test case? FWIW:

Reviewed-by: Peter Xu

-- 
Peter Xu
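To make "manually generate the PDPTR reserved bit mask" concrete, here is a userspace sketch of how such a mask could be built. The quoted message only confirms that bits 6:5 are reserved; the full layout used below (bits 2:1, bits 8:5, and bits 63:MAXPHYADDR reserved in a PAE PDPTE) is my reading of the SDM and should be treated as an assumption, as should every `model_*` name. The patch itself is not reproduced here.

```c
#include <stdint.h>

/* Mask with bits s..e (inclusive) set. Assumes 0 <= s <= e <= 63 and
 * that the range is narrower than 64 bits. */
static uint64_t model_rsvd_bits(int s, int e)
{
	return ((1ULL << (e - s + 1)) - 1) << s;
}

/* Assumed PAE PDPTE reserved bits for a given physical address width. */
static uint64_t model_pdptr_rsvd_bits(int maxphyaddr)
{
	return model_rsvd_bits(maxphyaddr, 63) |
	       model_rsvd_bits(5, 8) |
	       model_rsvd_bits(1, 2);
}

/* A non-present entry (bit 0 clear) is not checked; a present entry is
 * acceptable only if none of its reserved bits are set. */
static int model_pdpte_ok(uint64_t pdpte, int maxphyaddr)
{
	if (!(pdpte & 1))
		return 1;
	return (pdpte & model_pdptr_rsvd_bits(maxphyaddr)) == 0;
}
```

Because the mask depends only on the physical address width and not on the current paging mode, it can be computed on the spot, which is the property the commit message relies on.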
Re: [PATCH] x86/umip: Add emulation for 64-bit processes
On September 10, 2019 7:28:28 AM GMT+01:00, Ingo Molnar wrote:
>
>* h...@zytor.com wrote:
>
>> I would strongly suggest that we change the term "emulation" to
>> "spoofing" for these instructions. We need to explain that we do *not*
>> execute these instructions the way the CPU would have, and unlike the
>> native instructions do not leak kernel information.
>
>Ok, I've edited the patch to add the 'spoofing' wording where
>appropriate, and I also made minor fixes such as consistently
>capitalizing instruction names.
>
>Can I also add your Reviewed-by tag?
>
>So the patch should show up in tip:x86/asm today-ish, and barring any
>complications is v5.4 material.
>
>Thanks,
>
>	Ingo

Yes, please do.

Reviewed-by: H. Peter Anvin (Intel)
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.
Re: [PATCH v6 0/3] genirq/vfio: Introduce irq_update_devid() and optimize VFIO irq ops
A friendly reminder.

Thanks,
Ben

On 2019/9/2 12:01 PM, Ben Luo wrote:
> Currently, VFIO takes a free-then-request-irq way to do interrupt
> affinity setting and masking/unmasking for a VM with device passthru
> via VFIO. Sometimes it only changes the cookie data of irqaction or even
> changes nothing. The free-then-request-irq not only adds more latency,
> but also increases the risk of losing an interrupt, which may lead to a
> VM hanging forever while waiting for IO completion.
>
> This patchset solves the issue as follows:
> Patch 2 introduces irq_update_devid() to only update dev_id of irqaction
> Patch 3 makes use of this function and optimizes irq operations in VFIO
>
> changes from v5:
>  - Patch 3: remove an error log to avoid a potential DDoS attack
>  - Patch 3: fix typo in comment
>
> changes from v4:
>  - Patch 3: follow the previous behavior to disable interrupt on error path
>  - Patch 3: do irqbypass registration before update or free the interrupt
>  - Patch 3: add more comments
>
> changes from v3:
>  - Patch 2: rename the new function to irq_update_devid()
>  - Patch 2: use disable_irq() to avoid a twist for threaded interrupt
>  - ALL: amend commit messages and code comments
>
> changes from v2:
>  - reformat to avoid quoted string split across lines and etc.
>
> changes from v1:
>  - add Patch 1 to enhance error recovery etc. in free irq per tglx's comments
>  - enhance error recovery code and debugging info in irq_update_devid
>  - use __must_check in external referencing of this function
>  - use EXPORT_SYMBOL_GPL for irq_update_devid
>  - reformat code of patch 3 for better readability
>
> Ben Luo (3):
>   genirq: enhance error recovery code in free irq
>   genirq: introduce irq_update_devid()
>   vfio/pci: make use of irq_update_devid() and optimize irq ops
>
>  drivers/vfio/pci/vfio_pci_intrs.c | 118 ++
>  include/linux/interrupt.h         |   3 +
>  kernel/irq/manage.c               | 105 +
>  3 files changed, 177 insertions(+), 49 deletions(-)
Re: [vfs] 8bb3c61baf: vm-scalability.median -23.7% regression
On Mon, 9 Sep 2019, Al Viro wrote:
>
> Anyway, see vfs.git#uncertain.shmem for what I've got with those folded in.
> Do you see any problems with that one? That's the last 5 commits in there...

It's mostly fine, I've no problem with going your way instead of what
we had in mmotm; but I have seen some problems with it, and had been
intending to send you a fixup patch tonight (shmem_reconfigure() missing
unlock on error is the main problem, but there are other fixes needed).

But I'm growing tired. I've a feeling my "swap" of the mpols, instead
of immediate mpol_put(), was necessary to protect against a race with
shmem_get_sbmpol(), but I'm not clear-headed enough to trust myself on
that now. And I've a mystery to solve, that shmem_reconfigure() gets
stuck into showing the wrong error message.

Tomorrow.

Oh, and my first attempt to build and boot that series over 5.3-rc5
wouldn't boot. Luckily there was a tell-tale "i915" in the stacktrace,
which reminded me of the drivers/gpu/drm/i915/gem/i915_gemfs.c fix
we discussed earlier in the cycle. That is of course in linux-next
by now, but I wonder if your branch ought to contain a duplicate of
that fix, so that people with i915 doing bisections on 5.4-rc do not
fall into an unbootable hole between vfs and gpu merges.

Hugh
Re: [PATCH] x86/umip: Add emulation for 64-bit processes
* h...@zytor.com wrote:

> I would strongly suggest that we change the term "emulation" to
> "spoofing" for these instructions. We need to explain that we do *not*
> execute these instructions the way the CPU would have, and unlike the
> native instructions do not leak kernel information.

Ok, I've edited the patch to add the 'spoofing' wording where
appropriate, and I also made minor fixes such as consistently
capitalizing instruction names.

Can I also add your Reviewed-by tag?

So the patch should show up in tip:x86/asm today-ish, and barring any
complications is v5.4 material.

Thanks,

	Ingo
Re: [PATCH 4.19 19/57] Bluetooth: hidp: Let hidp_send_message return number of queued bytes
On 10.09.19 00:59, Greg Kroah-Hartman wrote:
> On Mon, Sep 09, 2019 at 03:00:46PM +0200, Fabian Henneke wrote:
>> Hi,
>>
>> On Mon, Sep 9, 2019 at 2:15 PM Pavel Machek wrote:
>>
>>> Hi!
>>>
>>>> [ Upstream commit 48d9cc9d85dde37c87abb7ac9bbec6598ba44b56 ]
>>>>
>>>> Let hidp_send_message return the number of successfully queued bytes
>>>> instead of an unconditional 0.
>>>>
>>>> With the return value fixed to 0, other drivers relying on hidp, such as
>>>> hidraw, can not return meaningful values from their respective
>>>> implementations of write(). In particular, with the current behavior, a
>>>> hidraw device's write() will have different return values depending on
>>>> whether the device is connected via USB or Bluetooth, which makes it
>>>> harder to abstract away the transport layer.
>>>
>>> So, does this change any actual behaviour?
>>>
>>> Is it fixing a bug, or is it just preparation for a patch that is not
>>> going to make it to stable?
>>>
>>
>> I created this patch specifically in order to ensure that user space
>> applications can use HID devices with hidraw without needing to care about
>> whether the transport is USB or Bluetooth. Without the patch, every
>> hidraw-backed Bluetooth device needs to be treated specially as its write()
>> violates the usual return value contract, which could be viewed as a bug.
>>
>> Please note that a later patch (
>> https://www.spinics.net/lists/linux-input/msg63291.html) fixes some
>> important error checks that were relying on the old behavior (and were
>> unfortunately missed by me).
>
> As that patch doesn't seem to be in Linus's tree yet, we should postpone
> taking this one in the stable tree right now, correct?
>
> thanks,
>
> greg k-h
>

Yes, please wait for the other patch if it's not in his tree yet and
apply the two together.

Thank you,
Fabian
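The return value contract discussed above can be illustrated with a tiny userspace model. The functions below are hypothetical stand-ins, not the hidp or hidraw code: one always reports 0 as described for the old behaviour, the other reports the number of bytes queued or a negative errno, and the helper shows the check a transport-agnostic caller would want to apply to either.

```c
#include <errno.h>
#include <stddef.h>

/* Old behaviour as described in the commit message: success is always
 * reported as 0, regardless of how many bytes were queued. */
static int model_send_message_old(const unsigned char *data, size_t len)
{
	(void)data;
	(void)len;
	return 0;
}

/* New behaviour: the number of queued bytes, or a negative errno.
 * The queue_full flag is an invented way to force the failure path. */
static int model_send_message_new(const unsigned char *data, size_t len,
				  int queue_full)
{
	(void)data;
	if (queue_full)
		return -ENOMEM;
	return (int)len;
}

/* The usual write() expectation: a complete write returns the length. */
static int model_write_complete(int ret, size_t len)
{
	return ret >= 0 && (size_t)ret == len;
}
```

With the old return value, a caller applying the usual write() check would see every Bluetooth write as short, which is the special-casing the patch author wanted to remove.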
Re: [RFC PATCH 0/2] Fix SEV user-space mapping of unencrypted coherent memory
On 9/10/19 8:11 AM, Christoph Hellwig wrote: On Thu, Sep 05, 2019 at 04:23:11AM -0700, Christoph Hellwig wrote: This looks fine from the DMA POV. I'll let the x86 guys comment on the rest. Do we want to pick this series up for 5.4? Should I queue it up in the dma-mapping tree? Hi, Christoph I think the DMA change is pretty uncontroversial. There are still some questions about the x86 change: After digging a bit deeper into the mm code I think Dave is correct that we should include the sme_me_mask in _PAGE_CHG_MASK. I'll respin that patch and then I guess we need an ack from the x86 people. Thanks, Thomas
[PATCH v2 1/3] regulator: fixed: add possibility to enable by clock
This commit adds the possibility to choose the compatible "regulator-fixed-clock" in devicetree. This is a special regulator-fixed that has to have a clock, from which the regulator gets switched on and off. Signed-off-by: Philippe Schenker --- Changes in v2: - return priv->clk_enable_counter > 0 directly. drivers/regulator/fixed.c | 83 +-- 1 file changed, 80 insertions(+), 3 deletions(-) diff --git a/drivers/regulator/fixed.c b/drivers/regulator/fixed.c index 999547dde99d..d90a6fd8cbc7 100644 --- a/drivers/regulator/fixed.c +++ b/drivers/regulator/fixed.c @@ -23,14 +23,63 @@ #include #include #include +#include #include #include +#include + struct fixed_voltage_data { struct regulator_desc desc; struct regulator_dev *dev; + + struct clk *enable_clock; + unsigned int clk_enable_counter; }; +struct fixed_dev_type { + bool has_enable_clock; +}; + +static const struct fixed_dev_type fixed_voltage_data = { + .has_enable_clock = false, +}; + +static const struct fixed_dev_type fixed_clkenable_data = { + .has_enable_clock = true, +}; + +static int reg_clock_enable(struct regulator_dev *rdev) +{ + struct fixed_voltage_data *priv = rdev_get_drvdata(rdev); + int ret = 0; + + ret = clk_prepare_enable(priv->enable_clock); + if (ret) + return ret; + + priv->clk_enable_counter++; + + return ret; +} + +static int reg_clock_disable(struct regulator_dev *rdev) +{ + struct fixed_voltage_data *priv = rdev_get_drvdata(rdev); + + clk_disable_unprepare(priv->enable_clock); + priv->clk_enable_counter--; + + return 0; +} + +static int reg_clock_is_enabled(struct regulator_dev *rdev) +{ + struct fixed_voltage_data *priv = rdev_get_drvdata(rdev); + + return priv->clk_enable_counter > 0; +} + /** * of_get_fixed_voltage_config - extract fixed_voltage_config structure info @@ -84,10 +133,19 @@ of_get_fixed_voltage_config(struct device *dev, static struct regulator_ops fixed_voltage_ops = { }; +static struct regulator_ops fixed_voltage_clkenabled_ops = { + .enable = reg_clock_enable, + 
.disable = reg_clock_disable, + .is_enabled = reg_clock_is_enabled, +}; + static int reg_fixed_voltage_probe(struct platform_device *pdev) { + struct device *dev = &pdev->dev; struct fixed_voltage_config *config; struct fixed_voltage_data *drvdata; + const struct fixed_dev_type *drvtype = + of_match_device(dev->driver->of_match_table, dev)->data; struct regulator_config cfg = { }; enum gpiod_flags gflags; int ret; @@ -118,7 +176,18 @@ static int reg_fixed_voltage_probe(struct platform_device *pdev) } drvdata->desc.type = REGULATOR_VOLTAGE; drvdata->desc.owner = THIS_MODULE; - drvdata->desc.ops = &fixed_voltage_ops; + + if (drvtype->has_enable_clock) { + drvdata->desc.ops = &fixed_voltage_clkenabled_ops; + + drvdata->enable_clock = devm_clk_get(dev, NULL); + if (IS_ERR(drvdata->enable_clock)) { + dev_err(dev, "Cant get enable-clock from devicetree\n"); + return -ENOENT; + } + } else { + drvdata->desc.ops = &fixed_voltage_ops; + } drvdata->desc.enable_time = config->startup_delay; @@ -191,8 +260,16 @@ static int reg_fixed_voltage_probe(struct platform_device *pdev) #if defined(CONFIG_OF) static const struct of_device_id fixed_of_match[] = { - { .compatible = "regulator-fixed", }, - {}, + { + .compatible = "regulator-fixed", + .data = &fixed_voltage_data, + }, + { + .compatible = "regulator-fixed-clock", + .data = &fixed_clkenable_data, + }, + { + }, }; MODULE_DEVICE_TABLE(of, fixed_of_match); #endif -- 2.23.0
[PATCH v2 0/3] Add new binding regulator-fixed-clock to regulator-fixed
Our hardware has a FET that is switching power rail of the ethernet PHY on and off. This switching enable signal is a clock from the SoC. There is no possibility in regulator subsystem to have this hardware reflected in software. I already discussed with Mark Brown about possible solutions and he suggested to create at least a new compatible. [1] This discussion includes also a better explanation of our circuit as well as schematics. So please refer to that link if you have questions about that. In this first attempt I created a new binding "regulator-fixed-clock" that can take a clock from devicetree. This is a simple addition to regulator-fixed. If the binding regulator-fixed-clock is given, the clock is simply enabled on regulator enable and disabled on regulator disable. To be able to have multiple consumers a counter variable is also given that tells how many consumers need power from this regulator. Best regards, Philippe [1] https://lkml.org/lkml/2019/8/7/78 Changes in v2: - return priv->clk_enable_counter > 0 directly. - Change select: to if: - Change items: to enum: - Defined how many clocks should be given Philippe Schenker (3): regulator: fixed: add possibility to enable by clock ARM: dts: imx6ull-colibri: add phy-supply and respective regulator dt-bindings: regulator: add regulator-fixed-clock binding .../bindings/regulator/fixed-regulator.yaml | 19 - arch/arm/boot/dts/imx6ull-colibri.dtsi| 12 +++ drivers/regulator/fixed.c | 83 ++- 3 files changed, 110 insertions(+), 4 deletions(-) -- 2.23.0
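The enable/disable counting described in this cover letter can be modelled in plain userspace C. This is only a sketch of the semantics, assuming a fake clock in place of struct clk; struct fake_clk, struct clk_regulator and the clk_regulator_*() helpers are hypothetical names and not part of the driver.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a clock; the kernel driver uses struct clk. */
struct fake_clk {
	unsigned int prepare_enable_count;
};

/* Model of the per-regulator state: the clock plus a consumer counter. */
struct clk_regulator {
	struct fake_clk *enable_clock;
	unsigned int clk_enable_counter;
};

static int fake_clk_prepare_enable(struct fake_clk *clk)
{
	clk->prepare_enable_count++;
	return 0;
}

static void fake_clk_disable_unprepare(struct fake_clk *clk)
{
	clk->prepare_enable_count--;
}

/* Enabling the regulator enables the clock and counts one more consumer. */
static int clk_regulator_enable(struct clk_regulator *reg)
{
	int ret = fake_clk_prepare_enable(reg->enable_clock);

	if (ret)
		return ret;

	reg->clk_enable_counter++;
	return 0;
}

/* Disabling the regulator drops one clock reference and one consumer. */
static int clk_regulator_disable(struct clk_regulator *reg)
{
	fake_clk_disable_unprepare(reg->enable_clock);
	reg->clk_enable_counter--;
	return 0;
}

/* The regulator counts as enabled while at least one consumer remains. */
static bool clk_regulator_is_enabled(const struct clk_regulator *reg)
{
	return reg->clk_enable_counter > 0;
}
```

With two consumers, one call to clk_regulator_disable() leaves the regulator reported as enabled; only the second call brings the counter back to zero.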
[PATCH v2 2/3] ARM: dts: imx6ull-colibri: add phy-supply and respective regulator
This adds regulator-fixed-clock, a fixed-regulator that turns on and off with a clock and adds it to the phy. Signed-off-by: Philippe Schenker --- Changes in v2: None arch/arm/boot/dts/imx6ull-colibri.dtsi | 12 1 file changed, 12 insertions(+) diff --git a/arch/arm/boot/dts/imx6ull-colibri.dtsi b/arch/arm/boot/dts/imx6ull-colibri.dtsi index d56728f03c35..76021b842a97 100644 --- a/arch/arm/boot/dts/imx6ull-colibri.dtsi +++ b/arch/arm/boot/dts/imx6ull-colibri.dtsi @@ -47,6 +47,17 @@ states = <180 0x1 330 0x0>; vin-supply = <&reg_module_3v3>; }; + + reg_eth_phy: regulator-eth-phy { + compatible = "regulator-fixed-clock"; + regulator-boot-on; + regulator-name = "eth_phy"; + regulator-min-microvolt = <330>; + regulator-max-microvolt = <330>; + clocks = <&clks IMX6UL_CLK_ENET2_REF_125M>; + startup-delay-us = <15>; + vin-supply = <&reg_module_3v3>; + }; }; &adc1 { @@ -66,6 +77,7 @@ pinctrl-0 = <&pinctrl_enet2>; phy-mode = "rmii"; phy-handle = <&ethphy1>; + phy-supply = <&reg_eth_phy>; status = "okay"; mdio { -- 2.23.0
Re: [PATCH 2/3] soc: amazon: al-pos: Introduce Amazon's Annapurna Labs POS driver
On 9/9/2019 6:16 PM, Arnd Bergmann wrote: On Mon, Sep 9, 2019 at 4:11 PM Shenhar, Talel wrote: On 9/9/2019 4:41 PM, Arnd Bergmann wrote: In current implementation of v1, I am not doing any read barrier, Hence, using the non-relaxed will add unneeded memory barrier. I have no strong objection moving to the non-relaxed version and have an unneeded memory barrier, as this path is not "hot" one. Ok, then please add it. ok, shall be part of v2 Beside of avoiding the unneeded memory barrier, I would be happy to keep common behavior for our drivers: e.g. https://github.com/torvalds/linux/blob/master/drivers/irqchip/irq-al-fic.c#L49 So what do you think we should go with? relaxed or non-relaxed? The al_fic_set_trigger() function is clearly a slow-path and should use the non-relaxed functions. In case of al_fic_irq_handler(), the extra barrier might introduce a measurable overhead, but at the same time I'm not sure if that one is correct without the barrier: If you have an MSI-type interrupt for notifying a device driver of a DMA completion, there might not be any other barrier between the arrival of the MSI message and the CPU accessing the data. Depending on how strict the hardware implements MSI and how the IRQ is chained, this could lead to data corruption. If the interrupt is only used for level or edge triggered interrupts, this is ok since you already need another register read in the driver before it can safely access a DMA buffer. In either case, if you can prove that it's safe to use the relaxed version here and you think that it may help, it would be good to add a comment explaining the reasoning. Decided to go with the non-relaxed version as this is not hot path and likely be more clear to the common reader to have non relaxed version. Arnd
[PATCH v2 3/3] dt-bindings: regulator: add regulator-fixed-clock binding
This adds the documentation to the compatible regulator-fixed-clock. This binding is a special binding of regulator-fixed and adds the ability to add a clock to regulator-fixed, so the regulator can be enabled and disabled with that clock. If the special compatible regulator-fixed-clock is used it is mandatory to supply a clock. Signed-off-by: Philippe Schenker --- Changes in v2: - Change select: to if: - Change items: to enum: - Defined how many clocks should be given .../bindings/regulator/fixed-regulator.yaml | 19 ++- 1 file changed, 18 insertions(+), 1 deletion(-) diff --git a/Documentation/devicetree/bindings/regulator/fixed-regulator.yaml b/Documentation/devicetree/bindings/regulator/fixed-regulator.yaml index a650b457085d..a78150c47aa2 100644 --- a/Documentation/devicetree/bindings/regulator/fixed-regulator.yaml +++ b/Documentation/devicetree/bindings/regulator/fixed-regulator.yaml @@ -19,9 +19,19 @@ description: allOf: - $ref: "regulator.yaml#" +if: + properties: +compatible: + contains: +const: regulator-fixed-clock + required: +- clocks + properties: compatible: -const: regulator-fixed +enum: + - const: regulator-fixed + - const: regulator-fixed-clock regulator-name: true @@ -29,6 +39,13 @@ properties: description: gpio to use for enable control maxItems: 1 + clocks: +description: + clock to use for enable control. This binding is only available if + the compatible is chosen to regulator-fixed-clock. The clock binding + is mandatory if compatible is chosen to regulator-fixed-clock. +maxItems: 1 + startup-delay-us: description: startup time in microseconds $ref: /schemas/types.yaml#/definitions/uint32 -- 2.23.0
[PATCH v3] Staging: gasket: Use temporaries to reduce line length.
Using temporaries for gasket_page_table entries to remove scnprintf() statements and reduce line length, as suggested by Joe Perches. Thanks! Signed-off-by: Sandro Volery --- v3: Fixed faulty copy/paste of function v2: Attempt to fix v1: Original patch drivers/staging/gasket/apex_driver.c | 20 +--- 1 file changed, 9 insertions(+), 11 deletions(-) diff --git a/drivers/staging/gasket/apex_driver.c b/drivers/staging/gasket/apex_driver.c index 2973bb920a26..46199c8ca441 100644 --- a/drivers/staging/gasket/apex_driver.c +++ b/drivers/staging/gasket/apex_driver.c @@ -509,6 +509,8 @@ static ssize_t sysfs_show(struct device *device, struct device_attribute *attr, struct gasket_dev *gasket_dev; struct gasket_sysfs_attribute *gasket_attr; enum sysfs_attribute_type type; + struct gasket_page_table *gpt; + uint val; gasket_dev = gasket_sysfs_get_device_data(device); if (!gasket_dev) { @@ -524,29 +526,25 @@ static ssize_t sysfs_show(struct device *device, struct device_attribute *attr, } type = (enum sysfs_attribute_type)gasket_attr->data.attr_type; + gpt = gasket_dev->page_table[0]; switch (type) { case ATTR_KERNEL_HIB_PAGE_TABLE_SIZE: - ret = scnprintf(buf, PAGE_SIZE, "%u\n", - gasket_page_table_num_entries( - gasket_dev->page_table[0])); + val = gasket_page_table_num_entries(gpt); break; case ATTR_KERNEL_HIB_SIMPLE_PAGE_TABLE_SIZE: - ret = scnprintf(buf, PAGE_SIZE, "%u\n", - gasket_page_table_num_simple_entries( - gasket_dev->page_table[0])); + val = gasket_page_table_num_simple_entries(gpt); break; case ATTR_KERNEL_HIB_NUM_ACTIVE_PAGES: - ret = scnprintf(buf, PAGE_SIZE, "%u\n", - gasket_page_table_num_active_pages( - gasket_dev->page_table[0])); + val = gasket_page_table_num_active_pages(gpt); break; default: dev_dbg(gasket_dev->dev, "Unknown attribute: %s\n", attr->attr.name); ret = 0; - break; + goto exit; } - + ret = scnprintf(buf, PAGE_SIZE, "%u\n", val); +exit: gasket_sysfs_put_attr(device, gasket_attr); gasket_sysfs_put_device_data(device, gasket_dev); return ret; -- 
2.23.0
Re: [PATCH 3/3] arm64: alpine: select AL_POS
On 9/9/2019 6:08 PM, Arnd Bergmann wrote: On Mon, Sep 9, 2019 at 3:59 PM Shenhar, Talel wrote: On 9/9/2019 4:45 PM, Arnd Bergmann wrote: Its not that something will get broken. its error event detector for POS events which allows seeing bad accesses to registers. What is the general rule of which configs to put under select and which under defconfig? I was thinking that "general" SoC support is good under select - those things that we always want. I generally want as little as possible to be selected, basically only things that are required for linking the kernel and booting it without potentially destroying the hardware. In particular, I want most drivers to be enabled as loadable modules if possible. When you have general-purpose distributions support your platform, there is no need to have this module built-in while running on a different chip, even if you always want to load the module when it's running on yours. And specific features, e.g. RAID support or features that supported only on specific HW shall go under defconfig. Similar, I see ARCH_LAYERSCAPE selecting EDAC_SUPPORT. I think this was done to avoid a link failure. It's also possible that this is a mistake and just did not get caught in review. Arnd I see. Will remove this from v2.
Re: [PATCH] x86/boot/64: Make level2_kernel_pgt pages invalid outside kernel area.
* Kirill A. Shutemov wrote: > On Fri, Sep 06, 2019 at 04:29:50PM -0500, Steve Wahl wrote: > > Our hardware (UV aka Superdome Flex) has address ranges marked > > reserved by the BIOS. These ranges can cause the system to halt if > > accessed. > > > > During kernel initialization, the processor was speculating into > > reserved memory causing system halts. The processor speculation is > > enabled because the reserved memory is being mapped by the kernel. > > > > The page table level2_kernel_pgt is 1 GiB in size, and had all pages > > initially marked as valid, and the kernel is placed anywhere in this > > range depending on the virtual address selected by KASLR. Later on in > > the boot process, the valid area gets trimmed back to the space > > occupied by the kernel. > > > > But during the interval of time when the full 1 GiB space was marked > > as valid, if the kernel physical address chosen by KASLR was close > > enough to our reserved memory regions, the valid pages outside the > > actual kernel space were allowing the processor to issue speculative > > accesses to the reserved space, causing the system to halt. > > > > This was encountered somewhat rarely on a normal system boot, and > > somewhat more often when starting the crash kernel if > > "crashkernel=512M,high" was specified on the command line (because > > this heavily restricts the physical address of the crash kernel, > > usually to within 1 GiB of our reserved space). > > > > The answer is to invalidate the pages of this table outside the > > address range occupied by the kernel before the page table is > > activated. This patch has been validated to fix this problem on our > > hardware. > > If the goal is to avoid *any* mapping of the reserved region to stop > speculation, I don't think this patch will do the job. We still (likely) > have the same memory mapped as part of the identity mapping. And it > happens at least in two places: here and before on decompression stage. 
Yeah, this really needs a fix at the KASLR level: it should only ever map into regions that are fully RAM backed. Is the problem that the 1 GiB mapping is a direct mapping, which can be speculated into? I presume KASLR won't accidentally map the kernel into the reserved region, right? Thanks, Ingo
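For readers following the discussion, the invalidation described in the patch text can be sketched as a userspace model. This is not the actual patch and it does not address the identity-mapping concern raised in the reply. It assumes a 512-entry table in which each entry maps 2 MiB and bit 0 marks an entry valid; pmd_index_of() and invalidate_outside_image() are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

#define PTRS_PER_PMD 512u          /* 512 entries x 2 MiB = 1 GiB window */
#define PMD_SHIFT    21            /* each entry maps 2 MiB */
#define PAGE_PRESENT UINT64_C(1)   /* bit 0 marks an entry valid */

/* Index of the 2 MiB entry covering a byte offset into the 1 GiB window. */
static size_t pmd_index_of(uint64_t offset)
{
	return (size_t)((offset >> PMD_SHIFT) & (PTRS_PER_PMD - 1));
}

/*
 * Clear the present bit of every entry that does not overlap
 * [image_start, image_end). Offsets are relative to the start of the
 * window; image_end must be greater than image_start and at most 1 GiB.
 */
static void invalidate_outside_image(uint64_t pmd[PTRS_PER_PMD],
				     uint64_t image_start, uint64_t image_end)
{
	size_t first = pmd_index_of(image_start);
	size_t last = pmd_index_of(image_end - 1);
	size_t i;

	for (i = 0; i < first; i++)
		pmd[i] &= ~PAGE_PRESENT;

	for (i = last + 1; i < PTRS_PER_PMD; i++)
		pmd[i] &= ~PAGE_PRESENT;
}
```

For an image occupying offsets 16 MiB to 48 MiB, entries 8 through 23 stay valid and all others lose their present bit, so nothing outside the image remains mapped through this table.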
Re: [PATCH v3] KVM: x86: Disable posted interrupts for odd IRQs
And what about even ones? :) Sorry, just joking, but the "odd" qualifier here looks a little weird, maybe something like "non-standard delivery modes" might make sense here.
Re: [RFC 02/19] ktf: Introduce the main part of the kernel side of ktf
On Sun, 2019-09-08 at 18:23 -0700, Brendan Higgins wrote: > On Tue, Aug 13, 2019 at 08:09:17AM +0200, Knut Omang wrote: > > Sorry, it's taken me way too long to get down to a proper code review on > this. I was hoping to send you something a couple weeks ago in > preparation for Tuesday, but I have been crazy busy. > > > The ktf module itself and basic data structures for management > > of test cases and tests and contexts for tests. > > Also contains the top level include file for kernel clients > > in ktf.h. > > > > More elaborate documentation follows towards the end of the > > patch set. > > > > This patch set contains both user level and kernel code, > > we'll provide the full implementation of ktf on the kernel side in > > this and forthcoming patches, then the user space code to execute > > tests within the kernel and report results, then documentation > > before introducing a small self test suite of tests to test ktf > > itself, and some very simple additional example tests. > > > > ktf.h: Defines the KTF user API for kernel clients > > ktf_test.c: Kernel side code for tracking and reporting ktf test > > results > > > > Signed-off-by: Knut Omang > > --- > > tools/testing/selftests/ktf/kernel/Makefile | 15 +- > > tools/testing/selftests/ktf/kernel/ktf.h | 604 - > > tools/testing/selftests/ktf/kernel/ktf_context.c | 409 +++- > > tools/testing/selftests/ktf/kernel/ktf_test.c| 397 +++- > > tools/testing/selftests/ktf/kernel/ktf_test.h| 381 ++- > > 5 files changed, 1806 insertions(+) > > create mode 100644 tools/testing/selftests/ktf/kernel/Makefile > > create mode 100644 tools/testing/selftests/ktf/kernel/ktf.h > > create mode 100644 tools/testing/selftests/ktf/kernel/ktf_context.c > > create mode 100644 tools/testing/selftests/ktf/kernel/ktf_test.c > > create mode 100644 tools/testing/selftests/ktf/kernel/ktf_test.h > [...] 
> > diff --git a/tools/testing/selftests/ktf/kernel/ktf.h > > b/tools/testing/selftests/ktf/kernel/ktf.h > > new file mode 100644 > > index 000..ea270e7 > > --- /dev/null > > +++ b/tools/testing/selftests/ktf/kernel/ktf.h > > @@ -0,0 +1,604 @@ > > +/* > > + * Copyright (c) 2018, Oracle and/or its affiliates. All rights reserved. > > + *Author: Knut Omang > > + * > > + * SPDX-License-Identifier: GPL-2.0 > > + * > > + * ktf.h: Defines the KTF user API for kernel clients > > + */ > > +#ifndef _KTF_H > > +#define _KTF_H > > + > > +#include > > +#include > > +#include > > +#include > > +#include "ktf_test.h" > > +#include "ktf_override.h" > > +#include "ktf_map.h" > > Where do you add this file? I don't see any definitions of > `struct ktf_map` in either this or the preceding patches, so I don't > think that this will compile. Compiling is not enabled until patch 17, so that should not be a problem. I wanted to convey the core KTF API early in the set to not let it "drown" in the utilities, but I get your point. One way to do it would be to move the header files up front and keep the implementations in the later patches, even though that would separate the API definition from the implementation. > > +#include "ktf_unlproto.h" > > Same here. This looks important for understanding what you presented > here. yes Thanks, Knut > > +#defineKTF_MAX_LOG 2048 > > + > > +/* Type for an optional configuration callback for contexts. > > + * Implementations should copy and store data into their private > > + * extensions of the context structure. 
The data pointer is > > + * only valid inside the callback: > > + */ > > +typedef int (*ktf_config_cb)(struct ktf_context *ctx, const void* data, > > size_t data_sz); > > +typedef void (*ktf_context_cb)(struct ktf_context *ctx); > > + > > +struct ktf_context_type; > > + > > +struct ktf_context { > > + struct ktf_map_elem elem; /* Linkage for ctx_map in handle */ > > + char name[KTF_MAX_KEY];/* Context name used in map */ > > + struct ktf_handle *handle; /* Owner of this context */ > > + ktf_config_cb config_cb; /* Optional configuration callback */ > > + ktf_context_cb cleanup;/* Optional callback upon context release > > */ > > + int config_errno; /* If config_cb set: state of configuration > > */ > > + struct ktf_context_type *type; /* Associated type, must be set */ > > +}; > > + > > +typedef struct ktf_context* (*ktf_context_alloc)(struct ktf_context_type > > *ct); > > + > > +struct ktf_context_type { > > + struct ktf_map_elem elem; /* Linkage for map in handle */ > > + char name[KTF_MAX_KEY];/* Context type name */ > > + struct ktf_handle *handle; /* Owner of this context type */ > > + ktf_context_alloc alloc; /* Allocate a new context of this type */ > > + ktf_config_cb config_cb; /* Configuration callback */ > > + ktf_context_cb cleanup;/* Optional callback upon context release > > */ > > +}; > > + > > +#include "ktf_netctx.h" > > + > > +/* type for a
Re: [PATCH 1/3] regulator: fixed: add possibility to enable by clock
On Tue, 2019-09-10 at 06:08 +, Philippe Schenker wrote: > On Thu, 2019-09-05 at 19:06 +0100, Mark Brown wrote: > > On Tue, Sep 03, 2019 at 08:03:46AM +, Philippe Schenker wrote: > > > This commit adds the possibility to choose the compatible > > > "regulator-fixed-clock" in devicetree. > > > > > > This is a special regulator-fixed that has to have a clock, from > > > which > > > the regulator gets switched on and off. > > > > This seems conceptually fine. Minor issues though: > > Thanks for your comments and I'm glad you like it! I will send a v2 > shortly, also with Rob's fixes in. Can I expect it to be pulled for > 5.4? I meant 5.5 of course. > > Best regards, > Philippe > > > > +static int reg_clock_is_enabled(struct regulator_dev *rdev) > > > +{ > > > + struct fixed_voltage_data *priv = rdev_get_drvdata(rdev); > > > + > > > + if (priv->clk_enable_counter > 0) > > > + return 1; > > > + > > > + return 0; > > > +} > > > > This could just be return priv->clk_enable_counter > 0 - ideally the > > clock API would let us query if the clock is enabled but that might > > be > > a > > bit confused anyway given that it's possibly shared.
Re: [PATCH] riscv: dts: sifive: Add ethernet0 to the aliases node
On Thu, Sep 05, 2019 at 05:46:14AM -0700, Bin Meng wrote: > U-Boot expects this alias to be in place in order to fix up the mac > address of the ethernet node. > > Signed-off-by: Bin Meng Looks good: Reviewed-by: Christoph Hellwig
Re: [PATCH v2] riscv: dts: sifive: Drop "clock-frequency" property of cpu nodes
On Thu, Sep 05, 2019 at 05:45:53AM -0700, Bin Meng wrote: > The "clock-frequency" property of cpu nodes isn't required. Drop it. > > Signed-off-by: Bin Meng Looks good: Reviewed-by: Christoph Hellwig
Re: [RFC PATCH 0/2] Fix SEV user-space mapping of unencrypted coherent memory
On Thu, Sep 05, 2019 at 04:23:11AM -0700, Christoph Hellwig wrote: > This looks fine from the DMA POV. I'll let the x86 guys comment on the > rest. Do we want to pick this series up for 5.4? Should I queue it up in the dma-mapping tree?
Re: [PATCH 1/3] regulator: fixed: add possibility to enable by clock
On Thu, 2019-09-05 at 19:06 +0100, Mark Brown wrote: > On Tue, Sep 03, 2019 at 08:03:46AM +, Philippe Schenker wrote: > > This commit adds the possibility to choose the compatible > > "regulator-fixed-clock" in devicetree. > > > > This is a special regulator-fixed that has to have a clock, from > > which > > the regulator gets switched on and off. > > This seems conceptually fine. Minor issues though: Thanks for your comments and I'm glad you like it! I will send a v2 shortly, also with Rob's fixes in. Can I expect it to be pulled for 5.4? Best regards, Philippe > > > +static int reg_clock_is_enabled(struct regulator_dev *rdev) > > +{ > > + struct fixed_voltage_data *priv = rdev_get_drvdata(rdev); > > + > > + if (priv->clk_enable_counter > 0) > > + return 1; > > + > > + return 0; > > +} > > This could just be return priv->clk_enable_counter > 0 - ideally the > clock API would let us query if the clock is enabled but that might be > a > bit confused anyway given that it's possibly shared.
Re: [PATCH AUTOSEL 5.2 06/12] configfs_register_group() shouldn't be (and isn't) called in rmdirable parts
Please stop selectively backporting parts of random series. We'll need the full series from Al in -stable instead.
[PATCH] serial/sifive: select SERIAL_EARLYCON
The sifive serial driver implements earlycon support, but unless another driver that supports earlycon is built in, it won't be usable. Explicitly select SERIAL_EARLYCON instead. Signed-off-by: Christoph Hellwig --- drivers/tty/serial/Kconfig | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/tty/serial/Kconfig b/drivers/tty/serial/Kconfig index 530cb966092f..6b77a72278e3 100644 --- a/drivers/tty/serial/Kconfig +++ b/drivers/tty/serial/Kconfig @@ -1075,6 +1075,7 @@ config SERIAL_SIFIVE_CONSOLE bool "Console on SiFive UART" depends on SERIAL_SIFIVE=y select SERIAL_CORE_CONSOLE + select SERIAL_EARLYCON help Select this option if you would like to use a SiFive UART as the system console. -- 2.20.1
Re: [v5 PATCH] RISC-V: Fix unsupported isa string info.
On Fri, Sep 06, 2019 at 11:27:57PM +, Atish Patra wrote: > > Agreed. May be something like this ? > > > > Let's say f/d is enabled in kernel but cpu doesn't support it. > > "unsupported isa" will only appear if there are any unsupported isa. > > > > processor : 3 > > hart: 4 > > isa : rv64imac > > unsupported isa : fd > > mmu : sv39 > > uarch : sifive,u54-mc > > > > May be I am just trying over optimize one corner case :) :). > > /proc/cpuinfo should just print all the isa string. That's it. > > > > Ping ? Yes, I agree with the "dumb" reporting of all capabilities.
[PATCH] of/fdt: don't ignore errors from of_setup_earlycon
If of_setup_earlycon fails we should keep on iterating earlycon options instead of breaking out of the loop. Signed-off-by: Christoph Hellwig --- drivers/of/fdt.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c index 9cdf14b9aaab..2f6bd03d8e27 100644 --- a/drivers/of/fdt.c +++ b/drivers/of/fdt.c @@ -946,8 +946,8 @@ int __init early_init_dt_scan_chosen_stdout(void) if (fdt_node_check_compatible(fdt, offset, match->compatible)) continue; - of_setup_earlycon(match, offset, options); - return 0; + if (of_setup_earlycon(match, offset, options) == 0) + return 0; } return -ENODEV; } -- 2.20.1
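The control-flow change in this patch can be shown with a small userspace model. It is a sketch only: struct earlycon_candidate, setup_fails(), setup_works(), consoles_registered and the two scan functions are hypothetical stand-ins for the kernel's match table and of_setup_earlycon().

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical candidate: whether it matches the node, plus a setup hook. */
struct earlycon_candidate {
	int matches;
	int (*setup)(const char *options);
};

/* Counts successful setups so the effect of each scan can be observed. */
static int consoles_registered;

/* Hypothetical setup hook that always fails. */
static int setup_fails(const char *options)
{
	(void)options;
	return -ENODEV;
}

/* Hypothetical setup hook that always succeeds. */
static int setup_works(const char *options)
{
	(void)options;
	consoles_registered++;
	return 0;
}

/* Old flow: the first matching candidate ends the scan, even if setup failed. */
static int scan_ignore_errors(const struct earlycon_candidate *c, size_t n,
			      const char *options)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (!c[i].matches)
			continue;
		c[i].setup(options);
		return 0;
	}
	return -ENODEV;
}

/* New flow: keep iterating until one candidate is set up successfully. */
static int scan_check_errors(const struct earlycon_candidate *c, size_t n,
			     const char *options)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (!c[i].matches)
			continue;
		if (c[i].setup(options) == 0)
			return 0;
	}
	return -ENODEV;
}
```

Given a table whose first matching entry fails and whose second succeeds, the old flow reports success without registering anything, while the new flow reaches the working entry. If every candidate fails, the new flow returns -ENODEV.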
Re: [PATCH] Revert "locking/pvqspinlock: Don't wait if vCPU is preempted"
On Mon, 9 Sep 2019 at 18:56, Waiman Long wrote: > > On 9/9/19 2:40 AM, Wanpeng Li wrote: > > From: Wanpeng Li > > > > This patch reverts commit 75437bb304b20 (locking/pvqspinlock: Don't wait if > > vCPU is preempted), we found great regression caused by this commit. > > > > Xeon Skylake box, 2 sockets, 40 cores, 80 threads, three VMs, each is 80 > > vCPUs. > > The score of ebizzy -M can reduce from 13000-14000 records/s to 1700-1800 > > records/s with this commit. > > > > Host Guestscore > > > > vanilla + w/o kvm optimizes vanilla 1700-1800 records/s > > vanilla + w/o kvm optimizes vanilla + revert 13000-14000 records/s > > vanilla + w/ kvm optimizes vanilla 4500-5000 records/s > > vanilla + w/ kvm optimizes vanilla + revert 14000-15500 records/s > > > > Exit from aggressive wait-early mechanism can result in yield premature and > > incur extra scheduling latency in over-subscribe scenario. > > > > kvm optimizes: > > [1] commit d73eb57b80b (KVM: Boost vCPUs that are delivering interrupts) > > [2] commit 266e85a5ec9 (KVM: X86: Boost queue head vCPU to mitigate lock > > waiter preemption) > > > > Tested-by: loobin...@tencent.com > > Cc: Peter Zijlstra > > Cc: Thomas Gleixner > > Cc: Ingo Molnar > > Cc: Waiman Long > > Cc: Paolo Bonzini > > Cc: Radim Krčmář > > Cc: loobin...@tencent.com > > Cc: sta...@vger.kernel.org > > Fixes: 75437bb304b20 (locking/pvqspinlock: Don't wait if vCPU is preempted) > > Signed-off-by: Wanpeng Li > > --- > > kernel/locking/qspinlock_paravirt.h | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/kernel/locking/qspinlock_paravirt.h > > b/kernel/locking/qspinlock_paravirt.h > > index 89bab07..e84d21a 100644 > > --- a/kernel/locking/qspinlock_paravirt.h > > +++ b/kernel/locking/qspinlock_paravirt.h > > @@ -269,7 +269,7 @@ pv_wait_early(struct pv_node *prev, int loop) > > if ((loop & PV_PREV_CHECK_MASK) != 0) > > return false; > > > > - return READ_ONCE(prev->state) != vcpu_running || > > vcpu_is_preempted(prev->cpu); 
> > + return READ_ONCE(prev->state) != vcpu_running; > > } > > > > /* > > There are several possibilities for this performance regression: > > 1) Multiple vcpus calling vcpu_is_preempted() repeatedly may cause some > cacheline contention issue depending on how that callback is implemented. > > 2) KVM may set the preempt flag for a short period whenever a vmexit > happens even if a vmenter is executed shortly after. In this case, we > may want to use a more durable vcpu suspend flag that indicates the vcpu > won't get a real vcpu back for a longer period of time. > > Perhaps you can add a lock event counter to count the number of > wait_early events caused by vcpu_is_preempted() being true to see if it > really causes a lot more wait_early than without the vcpu_is_preempted() > call. pv_wait_again:1:179 pv_wait_early:1:189429 pv_wait_head:1:263 pv_wait_node:1:189429 pv_vcpu_is_preempted:1:45588 =sleep 5 pv_wait_again:1:181 pv_wait_early:1:202574 pv_wait_head:1:267 pv_wait_node:1:202590 pv_vcpu_is_preempted:1:46336 The sampling period is 5s; 6% of the wait_early events are caused by vcpu_is_preempted() being true. Wanpeng
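One way to read the two counter samples above is as cumulative values taken 5 seconds apart, so the quoted share comes from the differences between them. The sketch below works that out under this assumption; struct pv_sample and preempted_share_percent() are hypothetical names used only for illustration.

```c
#include <stdint.h>

/* One snapshot of the two counters of interest (assumed cumulative). */
struct pv_sample {
	uint64_t wait_early;
	uint64_t vcpu_is_preempted;
};

/*
 * Percentage of the wait_early events in the interval that coincided with
 * vcpu_is_preempted() being true. Returns 0 when no wait_early events
 * occurred between the two samples.
 */
static double preempted_share_percent(struct pv_sample before,
				      struct pv_sample after)
{
	uint64_t d_early = after.wait_early - before.wait_early;
	uint64_t d_preempt = after.vcpu_is_preempted - before.vcpu_is_preempted;

	if (d_early == 0)
		return 0.0;

	return 100.0 * (double)d_preempt / (double)d_early;
}
```

With the samples in this mail the deltas are 202574 - 189429 = 13145 wait_early events and 46336 - 45588 = 748 vcpu_is_preempted events, which gives 748 / 13145, roughly 5.7%, in line with the quoted 6%.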
[PATCH v2 2/2] nvmem: sprd: Add Spreadtrum SoCs eFuse support
From: Freeman Liu The Spreadtrum eFuse controller is widely used to dump chip ID, configuration setting, function select and so on, as well as supporting one-time programming. Signed-off-by: Freeman Liu Signed-off-by: Baolin Wang --- Changes from v1: - None --- drivers/nvmem/Kconfig | 11 ++ drivers/nvmem/Makefile |2 + drivers/nvmem/sprd-efuse.c | 424 3 files changed, 437 insertions(+) create mode 100644 drivers/nvmem/sprd-efuse.c diff --git a/drivers/nvmem/Kconfig b/drivers/nvmem/Kconfig index c2ec750..8fd425d 100644 --- a/drivers/nvmem/Kconfig +++ b/drivers/nvmem/Kconfig @@ -230,4 +230,15 @@ config NVMEM_ZYNQMP If sure, say yes. If unsure, say no. +config SPRD_EFUSE + tristate "Spreadtrum SoC eFuse Support" + depends on ARCH_SPRD || COMPILE_TEST + depends on HAS_IOMEM + help + This is a simple driver to dump specified values of Spreadtrum + SoCs from eFuse. + + This driver can also be built as a module. If so, the module + will be called nvmem-sprd-efuse. + endif diff --git a/drivers/nvmem/Makefile b/drivers/nvmem/Makefile index e5c153d..7c19870 100644 --- a/drivers/nvmem/Makefile +++ b/drivers/nvmem/Makefile @@ -50,3 +50,5 @@ obj-$(CONFIG_SC27XX_EFUSE)+= nvmem-sc27xx-efuse.o nvmem-sc27xx-efuse-y := sc27xx-efuse.o obj-$(CONFIG_NVMEM_ZYNQMP) += nvmem_zynqmp_nvmem.o nvmem_zynqmp_nvmem-y := zynqmp_nvmem.o +obj-$(CONFIG_SPRD_EFUSE) += nvmem_sprd_efuse.o +nvmem_sprd_efuse-y := sprd-efuse.o diff --git a/drivers/nvmem/sprd-efuse.c b/drivers/nvmem/sprd-efuse.c new file mode 100644 index 000..2f1e0fb --- /dev/null +++ b/drivers/nvmem/sprd-efuse.c @@ -0,0 +1,424 @@ +// SPDX-License-Identifier: GPL-2.0 +// Copyright (C) 2019 Spreadtrum Communications Inc. 
+ +#include +#include +#include +#include +#include +#include +#include +#include + +#define SPRD_EFUSE_ENABLE 0x20 +#define SPRD_EFUSE_ERR_FLAG0x24 +#define SPRD_EFUSE_ERR_CLR 0x28 +#define SPRD_EFUSE_MAGIC_NUM 0x2c +#define SPRD_EFUSE_FW_CFG 0x50 +#define SPRD_EFUSE_PW_SWT 0x54 +#define SPRD_EFUSE_MEM(val)(0x1000 + ((val) << 2)) + +#define SPRD_EFUSE_VDD_EN BIT(0) +#define SPRD_EFUSE_AUTO_CHECK_EN BIT(1) +#define SPRD_EFUSE_DOUBLE_EN BIT(2) +#define SPRD_EFUSE_MARGIN_RD_ENBIT(3) +#define SPRD_EFUSE_LOCK_WR_EN BIT(4) + +#define SPRD_EFUSE_ERR_CLR_MASKGENMASK(13, 0) + +#define SPRD_EFUSE_ENK1_ON BIT(0) +#define SPRD_EFUSE_ENK2_ON BIT(1) +#define SPRD_EFUSE_PROG_EN BIT(2) + +#define SPRD_EFUSE_MAGIC_NUMBER0x8810 + +/* Block width (bytes) definitions */ +#define SPRD_EFUSE_BLOCK_WIDTH 4 + +/* + * The Spreadtrum AP efuse contains 2 parts: normal efuse and secure efuse, + * and we can only access the normal efuse in kernel. So define the normal + * block offset index and normal block numbers. + */ +#define SPRD_EFUSE_NORMAL_BLOCK_NUMS 24 +#define SPRD_EFUSE_NORMAL_BLOCK_OFFSET 72 + +/* Timeout (ms) for the trylock of hardware spinlocks */ +#define SPRD_EFUSE_HWLOCK_TIMEOUT 5000 + +/* + * Since different Spreadtrum SoC chip can have different normal block numbers + * and offset. And some SoC can support block double feature, which means + * when reading or writing data to efuse memory, the controller can save double + * data in case one data become incorrect after a long period. + * + * Thus we should save them in the device data structure. 
+ */ +struct sprd_efuse_variant_data { + u32 blk_nums; + u32 blk_offset; + bool blk_double; +}; + +struct sprd_efuse { + struct device *dev; + struct clk *clk; + struct hwspinlock *hwlock; + struct mutex mutex; + void __iomem *base; + const struct sprd_efuse_variant_data *data; +}; + +static const struct sprd_efuse_variant_data ums312_data = { + .blk_nums = SPRD_EFUSE_NORMAL_BLOCK_NUMS, + .blk_offset = SPRD_EFUSE_NORMAL_BLOCK_OFFSET, + .blk_double = false, +}; + +/* + * On Spreadtrum platform, we have multi-subsystems will access the unique + * efuse controller, so we need one hardware spinlock to synchronize between + * the multiple subsystems. + */ +static int sprd_efuse_lock(struct sprd_efuse *efuse) +{ + int ret; + + mutex_lock(&efuse->mutex); + + ret = hwspin_lock_timeout_raw(efuse->hwlock, + SPRD_EFUSE_HWLOCK_TIMEOUT); + if (ret) { + dev_err(efuse->dev, "timeout get the hwspinlock\n"); + mutex_unlock(&efuse->mutex); + return ret; + } + + return 0; +} + +static void sprd_efuse_unlock(struct sprd_efuse *efuse) +{ + hwspin_unlock_raw(efuse->hwlock); + mutex_unlock(&efuse->mutex); +} + +static void sprd_efuse_set_prog_
[PATCH v2 1/2] dt-bindings: nvmem: Add Spreadtrum eFuse controller documentation
From: Freeman Liu This patch adds the binding documentation for Spreadtrum eFuse controller. Signed-off-by: Freeman Liu Signed-off-by: Baolin Wang Reviewed-by: Rob Herring --- Changes from v1: - Add reviewed tag from Rob. --- .../devicetree/bindings/nvmem/sprd-efuse.txt | 39 1 file changed, 39 insertions(+) create mode 100644 Documentation/devicetree/bindings/nvmem/sprd-efuse.txt diff --git a/Documentation/devicetree/bindings/nvmem/sprd-efuse.txt b/Documentation/devicetree/bindings/nvmem/sprd-efuse.txt new file mode 100644 index 000..96b6fee --- /dev/null +++ b/Documentation/devicetree/bindings/nvmem/sprd-efuse.txt @@ -0,0 +1,39 @@ += Spreadtrum eFuse device tree bindings = + +Required properties: +- compatible: Should be "sprd,ums312-efuse". +- reg: Specify the address offset of efuse controller. +- clock-names: Should be "enable". +- clocks: The phandle and specifier referencing the controller's clock. +- hwlocks: Reference to a phandle of a hwlock provider node. + += Data cells = +Are child nodes of eFuse, bindings of which as described in +bindings/nvmem/nvmem.txt + +Example: + + ap_efuse: efuse@3224 { + compatible = "sprd,ums312-efuse"; + reg = <0 0x3224 0 0x1>; + clock-names = "enable"; + hwlocks = <&hwlock 8>; + clocks = <&aonapb_gate CLK_EFUSE_EB>; + + /* Data cells */ + thermal_calib: calib@10 { + reg = <0x10 0x2>; + }; + }; + += Data consumers = +Are device nodes which consume nvmem data cells. + +Example: + + thermal { + ... + + nvmem-cells = <&thermal_calib>; + nvmem-cell-names = "calibration"; + }; -- 1.7.9.5
Re: [PATCH 1/1] mm/pgtable/debug: Add test validating architecture page table helpers
On 09/10/2019 10:15 AM, Christophe Leroy wrote: > > > On 09/10/2019 03:56 AM, Anshuman Khandual wrote: >> >> >> On 09/09/2019 08:43 PM, Kirill A. Shutemov wrote: >>> On Mon, Sep 09, 2019 at 11:56:50AM +0530, Anshuman Khandual wrote: On 09/07/2019 12:33 AM, Gerald Schaefer wrote: > On Fri, 6 Sep 2019 11:58:59 +0530 > Anshuman Khandual wrote: > >> On 09/05/2019 10:36 PM, Gerald Schaefer wrote: >>> On Thu, 5 Sep 2019 14:48:14 +0530 >>> Anshuman Khandual wrote: >>> > [...] >> + >> +#if !defined(__PAGETABLE_PMD_FOLDED) && >> !defined(__ARCH_HAS_4LEVEL_HACK) >> +static void pud_clear_tests(pud_t *pudp) >> +{ >> + memset(pudp, RANDOM_NZVALUE, sizeof(pud_t)); >> + pud_clear(pudp); >> + WARN_ON(!pud_none(READ_ONCE(*pudp))); >> +} > > For pgd/p4d/pud_clear(), we only clear if the page table level is > present > and not folded. The memset() here overwrites the table type bits, so > pud_clear() will not clear anything on s390 and the pud_none() check > will > fail. > Would it be possible to OR a (larger) random value into the table, so > that > the lower 12 bits would be preserved? So the suggestion is instead of doing memset() on entry with RANDOM_NZVALUE, it should OR a large random value preserving lower 12 bits. Hmm, this should still do the trick for other platforms, they just need non zero value. So on s390, the lower 12 bits on the page table entry already has valid value while entering this function which would make sure that pud_clear() really does clear the entry ? >>> >>> Yes, in theory the table entry on s390 would have the type set in the >>> last >>> 4 bits, so preserving those would be enough. If it does not conflict >>> with >>> others, I would still suggest preserving all 12 bits since those would >>> contain >>> arch-specific flags in general, just to be sure. For s390, the pte/pmd >>> tests >>> would also work with the memset, but for consistency I think the same >>> logic >>> should be used in all pxd_clear_tests. >> >> Makes sense but.. 
>> >> There is a small challenge with this. Modifying individual bits on a >> given >> page table entry from generic code like this test case is bit tricky. >> That >> is because there are not enough helpers to create entries with an >> absolute >> value. This would have been easier if all the platforms provided >> functions >> like __pxx() which is not the case now. Otherwise something like this >> should >> have worked. >> >> >> pud_t pud = READ_ONCE(*pudp); >> pud = __pud(pud_val(pud) | RANDOM_VALUE (keeping lower 12 bits 0)) >> WRITE_ONCE(*pudp, pud); >> >> But __pud() will fail to build in many platforms. > > Hmm, I simply used this on my system to make pud_clear_tests() work, not > sure if it works on all archs: > > pud_val(*pudp) |= RANDOM_NZVALUE; Which compiles on arm64 but then fails on x86 because of the way pmd_val() has been defined there. >>> >>> Use instead >>> >>> *pudp = __pud(pud_val(*pudp) | RANDOM_NZVALUE); >> >> Agreed. >> >> As I had mentioned before this would have been really the cleanest approach. >> >>> >>> It *should* be more portable. >> >> Not really, because not all the platforms have __pxx() definitions right now. >> Going with these will clearly cause build failures on affected platforms. >> Lets >> examine __pud() for instance. It is defined only on these platforms. 
>> >> arch/arm64/include/asm/pgtable-types.h: #define __pud(x) ((pud_t) { >> (x) } ) >> arch/mips/include/asm/pgtable-64.h: #define __pud(x) ((pud_t) { (x) }) >> arch/powerpc/include/asm/pgtable-be-types.h: #define __pud(x) ((pud_t) { >> cpu_to_be64(x) }) >> arch/powerpc/include/asm/pgtable-types.h: #define __pud(x) ((pud_t) { (x) >> }) >> arch/s390/include/asm/page.h: #define __pud(x) ((pud_t) { (x) } ) >> arch/sparc/include/asm/page_64.h: #define __pud(x) ((pud_t) { (x) } ) >> arch/sparc/include/asm/page_64.h: #define __pud(x) (x) >> arch/x86/include/asm/pgtable.h: #define __pud(x) >> native_make_pud(x) > > You missed: > arch/x86/include/asm/paravirt.h:static inline pud_t __pud(pudval_t val) > include/asm-generic/pgtable-nop4d-hack.h:#define __pud(x) > ((pud_t) { __pgd(x) }) > include/asm-generic/pgtable-nopud.h:#define __pud(x) ((pud_t) { > __p4d(x) }) > >> >> Similarly for __pmd() >> >> arch/alpha/include/asm/page.h: #define __pmd(x) ((pmd_t) { (x) } >> ) >> arch/arm/include/asm/page-nommu.h: #define __pmd(x) (x) >> arch/arm/include/asm/pgtable-2level
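The suggestion discussed in this thread (OR a random pattern into the entry while keeping the low 12 bits intact) can be sketched in user space on a raw 64-bit value. This is only an illustration under stated assumptions: `LOW_FLAG_MASK` and `fill_entry_keep_low_bits()` are hypothetical names, the entry is treated as a plain `uint64_t`, and none of the per-architecture `__pud()`/`pud_val()` portability problems raised above are addressed.

```c
#include <stdint.h>

/* Assumption for this sketch: the low 12 bits carry arch-specific type/flag bits. */
#define LOW_FLAG_MASK 0xfffULL

/*
 * OR a random pattern into a raw entry value while leaving the low
 * 12 bits exactly as they were on entry. Hypothetical helper, not a
 * kernel API.
 */
static uint64_t fill_entry_keep_low_bits(uint64_t entry, uint64_t random)
{
	return entry | (random & ~LOW_FLAG_MASK);
}
```

With an entry whose low bits are 0x007, the result is non-zero in the upper bits but still ends in 0x007, which is the property the s390 `pud_clear()` path was said to depend on.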
Re: [PATCH v6 00/12] implement KASLR for powerpc/fsl_booke/32
Hi Scott, On 2019/8/28 12:05, Scott Wood wrote: On Fri, 2019-08-09 at 18:07 +0800, Jason Yan wrote: This series implements KASLR for powerpc/fsl_booke/32, as a security feature that deters exploit attempts relying on knowledge of the location of kernel internals. Since CONFIG_RELOCATABLE has already supported, what we need to do is map or copy kernel to a proper place and relocate. Have you tested this with a kernel that was loaded at a non-zero address? I tried loading a kernel at 0x0400 (by changing the address in the uImage, and setting bootm_low to 0400 in U-Boot), and it works without CONFIG_RANDOMIZE and fails with. How did you change the load address of the uImage, by changing the kernel config CONFIG_PHYSICAL_START or the "-a/-e" parameter of mkimage? I tried both, but it did not work with or without CONFIG_RANDOMIZE. Thanks, Jason Freescale Book-E parts expect lowmem to be mapped by fixed TLB entries(TLB1). The TLB1 entries are not suitable to map the kernel directly in a randomized region, so we chose to copy the kernel to a proper place and restart to relocate. Entropy is derived from the banner and timer base, which will change every build and boot. This not so much safe so additionally the bootloader may pass entropy via the /chosen/kaslr-seed node in device tree. How complicated would it be to directly access the HW RNG (if present) that early in the boot? It'd be nice if a U-Boot update weren't required (and particularly concerning that KASLR would appear to work without a U-Boot update, but without decent entropy). -Scott .
Re: [PATCH v2] Staging: gasket: Use temporaries to reduce line length.
Wow... I checked, compiled and still sent the wrong thing again.
I'm gonna have to give this up soon if I can't get it right.

Sandro V

> On 10 Sep 2019, at 07:06, Sandro Volery wrote:
>
> Using temporaries for gasket_page_table entries to remove scnprintf()
> statements and reduce line length, as suggested by Joe Perches. Thanks!
>
> Signed-off-by: Sandro Volery
> ---
> drivers/staging/gasket/apex_driver.c | 20 +---
> 1 file changed, 9 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/staging/gasket/apex_driver.c
> b/drivers/staging/gasket/apex_driver.c
> index 2973bb920a26..16ac4329d65f 100644
> --- a/drivers/staging/gasket/apex_driver.c
> +++ b/drivers/staging/gasket/apex_driver.c
> @@ -509,6 +509,8 @@ static ssize_t sysfs_show(struct device *device, struct
> device_attribute *attr,
>    struct gasket_dev *gasket_dev;
>    struct gasket_sysfs_attribute *gasket_attr;
>    enum sysfs_attribute_type type;
> +struct gasket_page_table *gpt;
> +uint val;
>
>    gasket_dev = gasket_sysfs_get_device_data(device);
>    if (!gasket_dev) {
> @@ -524,29 +526,25 @@ static ssize_t sysfs_show(struct device *device, struct
> device_attribute *attr,
>    }
>
>    type = (enum sysfs_attribute_type)gasket_attr->data.attr_type;
> +gpt = gasket_dev->page_table[0];
>    switch (type) {
>    case ATTR_KERNEL_HIB_PAGE_TABLE_SIZE:
> -ret = scnprintf(buf, PAGE_SIZE, "%u\n",
> -gasket_page_table_num_entries(
> -gasket_dev->page_table[0]));
> +val = gasket_page_table_num_simple_entries(gpt);
>    break;
>    case ATTR_KERNEL_HIB_SIMPLE_PAGE_TABLE_SIZE:
> -ret = scnprintf(buf, PAGE_SIZE, "%u\n",
> -gasket_page_table_num_simple_entries(
> -gasket_dev->page_table[0]));
> +val = gasket_page_table_num_simple_entries(gpt);
>    break;
>    case ATTR_KERNEL_HIB_NUM_ACTIVE_PAGES:
> -ret = scnprintf(buf, PAGE_SIZE, "%u\n",
> -gasket_page_table_num_active_pages(
> -gasket_dev->page_table[0]));
> +val = gasket_page_table_num_active_pages(gpt);
>    break;
>    default:
>    dev_dbg(gasket_dev->dev, "Unknown attribute: %s\n",
>    attr->attr.name);
>    ret = 0;
> -break;
> +goto exit;
>    }
> -
> +ret = scnprintf(buf, PAGE_SIZE, "%u\n", val);
> +exit:
>    gasket_sysfs_put_attr(device, gasket_attr);
>    gasket_sysfs_put_device_data(device, gasket_dev);
>    return ret;
> --
> 2.23.0
>
[PATCH v2] Staging: gasket: Use temporaries to reduce line length.
Using temporaries for gasket_page_table entries to remove scnprintf() statements and reduce line length, as suggested by Joe Perches. Thanks! Signed-off-by: Sandro Volery --- drivers/staging/gasket/apex_driver.c | 20 +--- 1 file changed, 9 insertions(+), 11 deletions(-) diff --git a/drivers/staging/gasket/apex_driver.c b/drivers/staging/gasket/apex_driver.c index 2973bb920a26..16ac4329d65f 100644 --- a/drivers/staging/gasket/apex_driver.c +++ b/drivers/staging/gasket/apex_driver.c @@ -509,6 +509,8 @@ static ssize_t sysfs_show(struct device *device, struct device_attribute *attr, struct gasket_dev *gasket_dev; struct gasket_sysfs_attribute *gasket_attr; enum sysfs_attribute_type type; + struct gasket_page_table *gpt; + uint val; gasket_dev = gasket_sysfs_get_device_data(device); if (!gasket_dev) { @@ -524,29 +526,25 @@ static ssize_t sysfs_show(struct device *device, struct device_attribute *attr, } type = (enum sysfs_attribute_type)gasket_attr->data.attr_type; + gpt = gasket_dev->page_table[0]; switch (type) { case ATTR_KERNEL_HIB_PAGE_TABLE_SIZE: - ret = scnprintf(buf, PAGE_SIZE, "%u\n", - gasket_page_table_num_entries( - gasket_dev->page_table[0])); + val = gasket_page_table_num_simple_entries(gpt); break; case ATTR_KERNEL_HIB_SIMPLE_PAGE_TABLE_SIZE: - ret = scnprintf(buf, PAGE_SIZE, "%u\n", - gasket_page_table_num_simple_entries( - gasket_dev->page_table[0])); + val = gasket_page_table_num_simple_entries(gpt); break; case ATTR_KERNEL_HIB_NUM_ACTIVE_PAGES: - ret = scnprintf(buf, PAGE_SIZE, "%u\n", - gasket_page_table_num_active_pages( - gasket_dev->page_table[0])); + val = gasket_page_table_num_active_pages(gpt); break; default: dev_dbg(gasket_dev->dev, "Unknown attribute: %s\n", attr->attr.name); ret = 0; - break; + goto exit; } - + ret = scnprintf(buf, PAGE_SIZE, "%u\n", val); +exit: gasket_sysfs_put_attr(device, gasket_attr); gasket_sysfs_put_device_data(device, gasket_dev); return ret; -- 2.23.0
Re: [PATCH] Staging: gasket: Use temporaries to reduce line length.
> On 10 Sep 2019, at 00:30, Joe Perches wrote:
>
> On Mon, 2019-09-09 at 22:28 +0200, Sandro Volery wrote:
>> Using temporaries for gasket_page_table entries to remove scnprintf()
>> statements and reduce line length, as suggested by Joe Perches. Thanks!
>
> nak. Slow down. You broke the code.
>
> Please be _way_ more careful and verify for yourself
> the code you submit _before_ you submit it.
>
> compile/test/verify, twice if necessary.
>

Shoot. I'm sorry I'm just really trying to get into all this...

> You also should have cc'd me on this patch.
>

Will do! I'll submit v2 this afternoon.

>> diff --git a/drivers/staging/gasket/apex_driver.c
>> b/drivers/staging/gasket/apex_driver.c
> []
>> @@ -524,29 +526,25 @@ static ssize_t sysfs_show(struct device *device,
>> struct device_attribute *attr,
>>    }
>>
>>    type = (enum sysfs_attribute_type)gasket_attr->data.attr_type;
>> +gpt = gasket_dev->page_table[0];
>>    switch (type) {
>>    case ATTR_KERNEL_HIB_PAGE_TABLE_SIZE:
>> -ret = scnprintf(buf, PAGE_SIZE, "%u\n",
>> -gasket_page_table_num_entries(
>> -gasket_dev->page_table[0]));
>> +val = gasket_page_table_num_simple_entries(gpt);
>
> You likely duplicated this line via copy/paste.
> This should be:
>    val = gasket_page_table_num_entries(gpt);
>

Thanks for being patient with me so far...
I'd imagine others would've freaked out at me by now :)
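For readers following the review, the "temporaries" pattern under discussion can be sketched outside the kernel roughly as below. Everything here is a stand-in: `struct page_table`, `enum attr_type` and `show_attr()` are hypothetical names, and standard `snprintf()` replaces the kernel's `scnprintf()` (their return values differ when output is truncated). The point is that each case selects its own getter and the formatting happens once at the end, so a copy/paste slip in one case is the only place the output can go wrong.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for the driver's attribute types and page table. */
enum attr_type {
	ATTR_TABLE_SIZE,
	ATTR_SIMPLE_TABLE_SIZE,
	ATTR_ACTIVE_PAGES,
	ATTR_UNKNOWN,
};

struct page_table {
	unsigned int num_entries;
	unsigned int num_simple_entries;
	unsigned int num_active_pages;
};

/*
 * Pick the value in the switch, format it once afterwards.
 * Returns the number of characters written, or 0 for an unknown type.
 */
static int show_attr(const struct page_table *pt, enum attr_type type,
		     char *buf, size_t size)
{
	unsigned int val;

	switch (type) {
	case ATTR_TABLE_SIZE:
		val = pt->num_entries;		/* not the "simple" count */
		break;
	case ATTR_SIMPLE_TABLE_SIZE:
		val = pt->num_simple_entries;
		break;
	case ATTR_ACTIVE_PAGES:
		val = pt->num_active_pages;
		break;
	default:
		return 0;			/* skip formatting entirely */
	}

	return snprintf(buf, size, "%u\n", val);
}
```

Giving the three fields distinct values in a quick test makes a duplicated case line show up immediately, which is the kind of check the review asks for before resubmitting.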
Re: [PATCH 1/1] mm/pgtable/debug: Add test validating architecture page table helpers
On 09/10/2019 03:56 AM, Anshuman Khandual wrote: On 09/09/2019 08:43 PM, Kirill A. Shutemov wrote: On Mon, Sep 09, 2019 at 11:56:50AM +0530, Anshuman Khandual wrote: On 09/07/2019 12:33 AM, Gerald Schaefer wrote: On Fri, 6 Sep 2019 11:58:59 +0530 Anshuman Khandual wrote: On 09/05/2019 10:36 PM, Gerald Schaefer wrote: On Thu, 5 Sep 2019 14:48:14 +0530 Anshuman Khandual wrote: [...] + +#if !defined(__PAGETABLE_PMD_FOLDED) && !defined(__ARCH_HAS_4LEVEL_HACK) +static void pud_clear_tests(pud_t *pudp) +{ + memset(pudp, RANDOM_NZVALUE, sizeof(pud_t)); + pud_clear(pudp); + WARN_ON(!pud_none(READ_ONCE(*pudp))); +} For pgd/p4d/pud_clear(), we only clear if the page table level is present and not folded. The memset() here overwrites the table type bits, so pud_clear() will not clear anything on s390 and the pud_none() check will fail. Would it be possible to OR a (larger) random value into the table, so that the lower 12 bits would be preserved? So the suggestion is instead of doing memset() on entry with RANDOM_NZVALUE, it should OR a large random value preserving lower 12 bits. Hmm, this should still do the trick for other platforms, they just need non zero value. So on s390, the lower 12 bits on the page table entry already has valid value while entering this function which would make sure that pud_clear() really does clear the entry ? Yes, in theory the table entry on s390 would have the type set in the last 4 bits, so preserving those would be enough. If it does not conflict with others, I would still suggest preserving all 12 bits since those would contain arch-specific flags in general, just to be sure. For s390, the pte/pmd tests would also work with the memset, but for consistency I think the same logic should be used in all pxd_clear_tests. Makes sense but.. There is a small challenge with this. Modifying individual bits on a given page table entry from generic code like this test case is bit tricky. 
That is because there are not enough helpers to create entries with an absolute value. This would have been easier if all the platforms provided functions like __pxx() which is not the case now. Otherwise something like this should have worked. pud_t pud = READ_ONCE(*pudp); pud = __pud(pud_val(pud) | RANDOM_VALUE (keeping lower 12 bits 0)) WRITE_ONCE(*pudp, pud); But __pud() will fail to build in many platforms. Hmm, I simply used this on my system to make pud_clear_tests() work, not sure if it works on all archs: pud_val(*pudp) |= RANDOM_NZVALUE; Which compiles on arm64 but then fails on x86 because of the way pmd_val() has been defined there. Use instead *pudp = __pud(pud_val(*pudp) | RANDOM_NZVALUE); Agreed. As I had mentioned before this would have been really the cleanest approach. It *should* be more portable. Not really, because not all the platforms have __pxx() definitions right now. Going with these will clearly cause build failures on affected platforms. Lets examine __pud() for instance. It is defined only on these platforms. 
arch/arm64/include/asm/pgtable-types.h: #define __pud(x) ((pud_t) { (x) } ) arch/mips/include/asm/pgtable-64.h: #define __pud(x) ((pud_t) { (x) }) arch/powerpc/include/asm/pgtable-be-types.h:#define __pud(x) ((pud_t) { cpu_to_be64(x) }) arch/powerpc/include/asm/pgtable-types.h: #define __pud(x) ((pud_t) { (x) }) arch/s390/include/asm/page.h: #define __pud(x) ((pud_t) { (x) } ) arch/sparc/include/asm/page_64.h: #define __pud(x) ((pud_t) { (x) } ) arch/sparc/include/asm/page_64.h: #define __pud(x) (x) arch/x86/include/asm/pgtable.h: #define __pud(x) native_make_pud(x) You missed: arch/x86/include/asm/paravirt.h:static inline pud_t __pud(pudval_t val) include/asm-generic/pgtable-nop4d-hack.h:#define __pud(x) ((pud_t) { __pgd(x) }) include/asm-generic/pgtable-nopud.h:#define __pud(x) ((pud_t) { __p4d(x) }) Similarly for __pmd() arch/alpha/include/asm/page.h: #define __pmd(x) ((pmd_t) { (x) } ) arch/arm/include/asm/page-nommu.h: #define __pmd(x) (x) arch/arm/include/asm/pgtable-2level-types.h:#define __pmd(x) ((pmd_t) { (x) } ) arch/arm/include/asm/pgtable-2level-types.h:#define __pmd(x) (x) arch/arm/include/asm/pgtable-3level-types.h:#define __pmd(x) ((pmd_t) { (x) } ) arch/arm/include/asm/pgtable-3level-types.h:#define __pmd(x) (x) arch/arm64/include/asm/pgtable-types.h: #define __pmd(x) ((pmd_t) { (x) } ) arch/m68k/include/asm/page.h: #define __pmd(x) ((pmd_t) { { (x) }, }) arch/mips/include/asm/pgtable-64.h: #define __pmd(x) ((pmd_t) { (x) } ) arch/nds32/include/asm/page.h: #define __pmd(x) (x) arch/parisc/include/asm/page.h: #define __pmd(x) ((pmd_t) { (x) } ) arch/parisc/include/asm/page.h: #define __pmd(x) (x) arch/powerp
[PATCH 1/8] x86/platform/uv: Save OEM_ID from ACPI MADT probe
Save the OEM_ID and OEM_TABLE_ID passed to the apic driver probe function for later use. Also, convert the char list arg passed from the kernel to a true null-terminated string. Signed-off-by: Mike Travis Reviewed-by: Steve Wahl Reviewed-by: Dimitri Sivanich To: Thomas Gleixner To: Ingo Molnar To: H. Peter Anvin To: Andrew Morton To: Borislav Petkov To: Christoph Hellwig To: Sasha Levin Cc: Dimitri Sivanich Cc: Russ Anderson Cc: Hedi Berriche Cc: Steve Wahl Cc: Justin Ernst Cc: x...@kernel.org Cc: linux-kernel@vger.kernel.org --- arch/x86/kernel/apic/x2apic_uv_x.c | 16 +++- 1 file changed, 15 insertions(+), 1 deletion(-) --- linux.orig/arch/x86/kernel/apic/x2apic_uv_x.c +++ linux/arch/x86/kernel/apic/x2apic_uv_x.c @@ -14,6 +14,7 @@ #include #include #include +#include #include #include @@ -31,6 +32,10 @@ static u64 gru_dist_base, gru_first_no static u64 gru_dist_lmask, gru_dist_umask; static union uvh_apicid uvh_apicid; +/* Unpack OEM/TABLE ID's to be NULL terminated strings */ +static u8 oem_id[ACPI_OEM_ID_SIZE + 1]; +static u8 oem_table_id[ACPI_OEM_TABLE_ID_SIZE + 1]; + /* Information derived from CPUID: */ static struct { unsigned int apicid_shift; @@ -248,11 +253,20 @@ static void __init uv_set_apicid_hibit(v } } -static int __init uv_acpi_madt_oem_check(char *oem_id, char *oem_table_id) +static void __init uv_stringify(int len, char *to, char *from) +{ + /* Relies on 'to' being NULL chars so result will be NULL terminated */ + strncpy(to, from, len-1); +} + +static int __init uv_acpi_madt_oem_check(char *_oem_id, char *_oem_table_id) { int pnodeid; int uv_apic; + uv_stringify(sizeof(oem_id), oem_id, _oem_id); + uv_stringify(sizeof(oem_table_id), oem_table_id, _oem_table_id); + if (strncmp(oem_id, "SGI", 3) != 0) { if (strncmp(oem_id, "NSGI", 4) == 0) { uv_hubless_system = true; --
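As a side note on the string conversion in this patch, a user-space sketch of the same idea follows: copy a fixed-width ID field that may lack a terminator into a buffer one byte larger. It deliberately differs from the patch in one respect, which is an assumption of this sketch rather than the author's method: instead of relying on the destination being zero-initialized static storage, it writes the terminator explicitly. `id_to_string()` is a hypothetical name.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy a fixed-width, possibly unterminated ID field into 'to'.
 * 'len' is the full size of 'to' (field width + 1). Unlike the patch,
 * the terminator is written explicitly, so 'to' need not be pre-zeroed.
 */
static void id_to_string(size_t len, char *to, const char *from)
{
	strncpy(to, from, len - 1);	/* at most the field width */
	to[len - 1] = '\0';		/* always terminated */
}
```

With a six-byte field and a seven-byte destination, a fully populated field yields a six-character string, and a shorter NUL-padded field such as "SGI" comes out unchanged.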
[PATCH V2 6/8] x86/platform/uv: Decode UVsystab Info
Decode the hubless UVsystab passed from BIOS to the kernel saving pertinent info in a similar manner that hubbed UVsystabs are decoded. Signed-off-by: Mike Travis Reviewed-by: Steve Wahl Reviewed-by: Dimitri Sivanich To: Thomas Gleixner To: Ingo Molnar To: H. Peter Anvin To: Andrew Morton To: Borislav Petkov To: Christoph Hellwig To: Sasha Levin Cc: Dimitri Sivanich Cc: Russ Anderson Cc: Hedi Berriche Cc: Steve Wahl Cc: Justin Ernst Cc: x...@kernel.org Cc: linux-kernel@vger.kernel.org --- V2: Removed redundant error message after call to uv_bios_init. Removed redundant error message after call to decode_uv_systab. Clarify selection of UV4 and higher when checking for extended UVsystab in decode_uv_systab(). --- arch/x86/kernel/apic/x2apic_uv_x.c | 12 ++-- 1 file changed, 10 insertions(+), 2 deletions(-) --- linux.orig/arch/x86/kernel/apic/x2apic_uv_x.c +++ linux/arch/x86/kernel/apic/x2apic_uv_x.c @@ -1303,7 +1303,8 @@ static int __init decode_uv_systab(void) struct uv_systab *st; int i; - if (uv_hub_info->hub_revision < UV4_HUB_REVISION_BASE) + /* If system is uv3 or lower, there is no extended UVsystab */ + if (is_uv_hubbed(0xfe) < uv(4) && is_uv_hubless(0xfe) < uv(4)) return 0; /* No extended UVsystab required */ st = uv_systab; @@ -1554,8 +1555,15 @@ static __init int uv_system_init_hubless /* Init kernel/BIOS interface */ rc = uv_bios_init(); + if (rc < 0) + return rc; - /* Create user access node if UVsystab available */ + /* Process UVsystab */ + rc = decode_uv_systab(); + if (rc < 0) + return rc; + + /* Create user access node */ if (rc >= 0) uv_setup_proc_files(1); --
[PATCH 3/8] x86/platform/uv: Add return code to UV BIOS Init function
Add a return code to the UV BIOS init function that indicates the successful initialization of the kernel/BIOS callback interface. Signed-off-by: Mike Travis Reviewed-by: Steve Wahl Reviewed-by: Dimitri Sivanich To: Thomas Gleixner To: Ingo Molnar To: H. Peter Anvin To: Andrew Morton To: Borislav Petkov To: Christoph Hellwig To: Sasha Levin Cc: Dimitri Sivanich Cc: Russ Anderson Cc: Hedi Berriche Cc: Steve Wahl Cc: Justin Ernst Cc: x...@kernel.org Cc: linux-kernel@vger.kernel.org --- arch/x86/include/asm/uv/bios.h |2 +- arch/x86/platform/uv/bios_uv.c |9 + 2 files changed, 6 insertions(+), 5 deletions(-) --- linux.orig/arch/x86/include/asm/uv/bios.h +++ linux/arch/x86/include/asm/uv/bios.h @@ -138,7 +138,7 @@ extern s64 uv_bios_change_memprotect(u64 extern s64 uv_bios_reserved_page_pa(u64, u64 *, u64 *, u64 *); extern int uv_bios_set_legacy_vga_target(bool decode, int domain, int bus); -extern void uv_bios_init(void); +extern int uv_bios_init(void); extern unsigned long sn_rtc_cycles_per_second; extern int uv_type; --- linux.orig/arch/x86/platform/uv/bios_uv.c +++ linux/arch/x86/platform/uv/bios_uv.c @@ -184,20 +184,20 @@ int uv_bios_set_legacy_vga_target(bool d } EXPORT_SYMBOL_GPL(uv_bios_set_legacy_vga_target); -void uv_bios_init(void) +int uv_bios_init(void) { uv_systab = NULL; if ((uv_systab_phys == EFI_INVALID_TABLE_ADDR) || !uv_systab_phys || efi_runtime_disabled()) { pr_crit("UV: UVsystab: missing\n"); - return; + return -EEXIST; } uv_systab = ioremap(uv_systab_phys, sizeof(struct uv_systab)); if (!uv_systab || strncmp(uv_systab->signature, UV_SYSTAB_SIG, 4)) { pr_err("UV: UVsystab: bad signature!\n"); iounmap(uv_systab); - return; + return -EINVAL; } /* Starting with UV4 the UV systab size is variable */ @@ -208,8 +208,9 @@ void uv_bios_init(void) uv_systab = ioremap(uv_systab_phys, size); if (!uv_systab) { pr_err("UV: UVsystab: ioremap(%d) failed!\n", size); - return; + return -EFAULT; } } pr_info("UV: UVsystab: Revision:%x\n", uv_systab->revision); + 
return 0; } --
[PATCH V2 5/8] x86/platform/uv: Add UV Hubbed/Hubless Proc FS Files
Indicate to UV user utilities that UV hubless support is available on this system via the existing /proc interface. The current interface is maintained with the addition of new /proc leaves ("hubbed", "hubless", and "oemid") that contain the specific type of UV arch this one is. Signed-off-by: Mike Travis Reviewed-by: Steve Wahl Reviewed-by: Dimitri Sivanich To: Thomas Gleixner To: Ingo Molnar To: H. Peter Anvin To: Andrew Morton To: Borislav Petkov To: Christoph Hellwig To: Sasha Levin Cc: Dimitri Sivanich Cc: Russ Anderson Cc: Hedi Berriche Cc: Steve Wahl Cc: Justin Ernst Cc: x...@kernel.org Cc: linux-kernel@vger.kernel.org --- V2: Remove is_uv_hubbed define Remove leading '_' from _is_uv_hubbed --- arch/x86/include/asm/uv/uv.h |4 + arch/x86/kernel/apic/x2apic_uv_x.c | 93 - 2 files changed, 96 insertions(+), 1 deletion(-) --- linux.orig/arch/x86/include/asm/uv/uv.h +++ linux/arch/x86/include/asm/uv/uv.h @@ -12,6 +12,8 @@ struct mm_struct; #ifdef CONFIG_X86_UV #include +#define UV_PROC_NODE "sgi_uv" + static inline int uv(int uvtype) { /* uv(0) is "any" */ @@ -28,6 +30,7 @@ static inline bool is_early_uv_system(vo return uv_systab_phys && uv_systab_phys != EFI_INVALID_TABLE_ADDR; } extern int is_uv_system(void); +extern int is_uv_hubbed(int uvtype); extern int is_uv_hubless(int uvtype); extern void uv_cpu_init(void); extern void uv_nmi_init(void); @@ -40,6 +43,7 @@ extern const struct cpumask *uv_flush_tl static inline enum uv_system_type get_uv_system_type(void) { return UV_NONE; } static inline bool is_early_uv_system(void){ return 0; } static inline int is_uv_system(void) { return 0; } +static inline int is_uv_hubbed(int uv) { return 0; } static inline int is_uv_hubless(int uv) { return 0; } static inline void uv_cpu_init(void) { } static inline void uv_system_init(void){ } --- linux.orig/arch/x86/kernel/apic/x2apic_uv_x.c +++ linux/arch/x86/kernel/apic/x2apic_uv_x.c @@ -26,6 +26,7 @@ static DEFINE_PER_CPU(int, x2apic_extra_bits); static enum uv_system_type 
uv_system_type; +static int uv_hubbed_system; static int uv_hubless_system; static u64 gru_start_paddr, gru_end_paddr; static u64 gru_dist_base, gru_first_node_paddr = -1LL, gru_last_node_paddr; @@ -309,6 +310,24 @@ static int __init uv_acpi_madt_oem_check if (uv_hub_info->hub_revision == 0) goto badbios; + switch (uv_hub_info->hub_revision) { + case UV4_HUB_REVISION_BASE: + uv_hubbed_system = 0x11; + break; + + case UV3_HUB_REVISION_BASE: + uv_hubbed_system = 0x9; + break; + + case UV2_HUB_REVISION_BASE: + uv_hubbed_system = 0x5; + break; + + case UV1_HUB_REVISION_BASE: + uv_hubbed_system = 0x3; + break; + } + pnodeid = early_get_pnodeid(); early_get_apic_socketid_shift(); @@ -359,6 +378,12 @@ int is_uv_system(void) } EXPORT_SYMBOL_GPL(is_uv_system); +int is_uv_hubbed(int uvtype) +{ + return (uv_hubbed_system & uvtype); +} +EXPORT_SYMBOL_GPL(is_uv_hubbed); + int is_uv_hubless(int uvtype) { return (uv_hubless_system & uvtype); @@ -1457,6 +1482,68 @@ static void __init build_socket_tables(v } } +/* Setup user proc fs files */ +static int proc_hubbed_show(struct seq_file *file, void *data) +{ + seq_printf(file, "0x%x\n", uv_hubbed_system); + return 0; +} + +static int proc_hubless_show(struct seq_file *file, void *data) +{ + seq_printf(file, "0x%x\n", uv_hubless_system); + return 0; +} + +static int proc_oemid_show(struct seq_file *file, void *data) +{ + seq_printf(file, "%s/%s\n", oem_id, oem_table_id); + return 0; +} + +static int proc_hubbed_open(struct inode *inode, struct file *file) +{ + return single_open(file, proc_hubbed_show, (void *)NULL); +} + +static int proc_hubless_open(struct inode *inode, struct file *file) +{ + return single_open(file, proc_hubless_show, (void *)NULL); +} + +static int proc_oemid_open(struct inode *inode, struct file *file) +{ + return single_open(file, proc_oemid_show, (void *)NULL); +} + +/* (struct is "non-const" as open function is set at runtime) */ +static struct file_operations proc_version_fops = { + .read = seq_read, + 
.llseek = seq_lseek, + .release= single_release, +}; + +static const struct file_operations proc_oemid_fops = { + .open = proc_oemid_open, + .read = seq_read, + .llseek = seq_lseek, + .release= single_release, +}; + +static __init void uv_setup_proc_files(int hubless) +{ + struct proc_dir_entry *pde; + char *name = hubless ? "hubless" : "hubbed"; + + pde = proc_mkdir(UV_PROC_NODE,
[PATCH V2 2/8] x86/platform/uv: Return UV Hubless System Type
Return the type of UV hubless system for UV specific code that depends on that. Use a define to indicate the change in arg type for this function in uv.h. Add a function to convert UV system type to bit pattern needed for is_uv_hubless(). Signed-off-by: Mike Travis Reviewed-by: Steve Wahl Reviewed-by: Dimitri Sivanich To: Thomas Gleixner To: Ingo Molnar To: H. Peter Anvin To: Andrew Morton To: Borislav Petkov To: Christoph Hellwig To: Sasha Levin Cc: Dimitri Sivanich Cc: Russ Anderson Cc: Hedi Berriche Cc: Steve Wahl Cc: Justin Ernst Cc: x...@kernel.org Cc: linux-kernel@vger.kernel.org --- V2: Remove is_uv_hubless define Remove leading '_' from _is_uv_hubless --- arch/x86/include/asm/uv/uv.h | 12 ++-- arch/x86/kernel/apic/x2apic_uv_x.c | 27 ++- 2 files changed, 28 insertions(+), 11 deletions(-) --- linux.orig/arch/x86/include/asm/uv/uv.h +++ linux/arch/x86/include/asm/uv/uv.h @@ -12,6 +12,14 @@ struct mm_struct; #ifdef CONFIG_X86_UV #include +static inline int uv(int uvtype) +{ + /* uv(0) is "any" */ + if (uvtype >= 0 && uvtype <= 30) + return 1 << uvtype; + return 1; +} + extern unsigned long uv_systab_phys; extern enum uv_system_type get_uv_system_type(void); @@ -20,7 +28,7 @@ static inline bool is_early_uv_system(vo return uv_systab_phys && uv_systab_phys != EFI_INVALID_TABLE_ADDR; } extern int is_uv_system(void); -extern int is_uv_hubless(void); +extern int is_uv_hubless(int uvtype); extern void uv_cpu_init(void); extern void uv_nmi_init(void); extern void uv_system_init(void); @@ -32,7 +40,7 @@ extern const struct cpumask *uv_flush_tl static inline enum uv_system_type get_uv_system_type(void) { return UV_NONE; } static inline bool is_early_uv_system(void){ return 0; } static inline int is_uv_system(void) { return 0; } -static inline int is_uv_hubless(void) { return 0; } +static inline int is_uv_hubless(int uv) { return 0; } static inline void uv_cpu_init(void) { } static inline void uv_system_init(void){ } static inline const struct cpumask * --- 
linux.orig/arch/x86/kernel/apic/x2apic_uv_x.c +++ linux/arch/x86/kernel/apic/x2apic_uv_x.c @@ -26,7 +26,7 @@ static DEFINE_PER_CPU(int, x2apic_extra_bits); static enum uv_system_type uv_system_type; -static bool uv_hubless_system; +static int uv_hubless_system; static u64 gru_start_paddr, gru_end_paddr; static u64 gru_dist_base, gru_first_node_paddr = -1LL, gru_last_node_paddr; static u64 gru_dist_lmask, gru_dist_umask; @@ -268,11 +268,20 @@ static int __init uv_acpi_madt_oem_check uv_stringify(sizeof(oem_table_id), oem_table_id, _oem_table_id); if (strncmp(oem_id, "SGI", 3) != 0) { - if (strncmp(oem_id, "NSGI", 4) == 0) { - uv_hubless_system = true; - pr_info("UV: OEM IDs %s/%s, HUBLESS\n", - oem_id, oem_table_id); - } + if (strncmp(oem_id, "NSGI", 4) != 0) + return 0; + + /* UV4 Hubless, CH, (0x11:UV4+Any) */ + if (strncmp(oem_id, "NSGI4", 5) == 0) + uv_hubless_system = 0x11; + + /* UV3 Hubless, UV300/MC990X w/o hub (0x9:UV3+Any) */ + else + uv_hubless_system = 0x9; + + pr_info("UV: OEM IDs %s/%s, HUBLESS(0x%x)\n", + oem_id, oem_table_id, uv_hubless_system); + return 0; } @@ -350,9 +359,9 @@ int is_uv_system(void) } EXPORT_SYMBOL_GPL(is_uv_system); -int is_uv_hubless(void) +int is_uv_hubless(int uvtype) { - return uv_hubless_system; + return (uv_hubless_system & uvtype); } EXPORT_SYMBOL_GPL(is_uv_hubless); @@ -1592,7 +1601,7 @@ static void __init uv_system_init_hub(vo */ void __init uv_system_init(void) { - if (likely(!is_uv_system() && !is_uv_hubless())) + if (likely(!is_uv_system() && !is_uv_hubless(1))) return; if (is_uv_system()) --
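The bit-pattern scheme this patch introduces can be tried out in user space with the sketch below. `uv()` mirrors the inline helper added to uv.h, and the 0x11 and 0x9 values are the ones the patch assigns. `hubless_matches()` and the two macros are hypothetical: the kernel's `is_uv_hubless()` reads a static variable, whereas this sketch takes the system value as a parameter so it can be exercised standalone.

```c
/* Mirrors uv() from the patch: bit 0 (uv(0)) means "any" UV type. */
static int uv(int uvtype)
{
	if (uvtype >= 0 && uvtype <= 30)
		return 1 << uvtype;
	return 1;
}

/* Values the patch stores in uv_hubless_system. */
#define HUBLESS_UV4 0x11	/* UV4 bit + "any" bit */
#define HUBLESS_UV3 0x9		/* UV3 bit + "any" bit */

/*
 * Hypothetical standalone form of is_uv_hubless(): non-zero when the
 * system's bit pattern overlaps the requested type mask.
 */
static int hubless_matches(int hubless_system, int uvtype)
{
	return hubless_system & uvtype;
}
```

Because every hubless value carries the "any" bit, a caller that only needs to know whether the system is hubless at all can pass 1 (as `uv_system_init()` does in the patch), while a caller that needs a specific generation passes `uv(3)` or `uv(4)`.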
[PATCH 4/8] x86/platform/uv: Setup UV functions for Hubless UV Systems
Add more support for UV systems that do not contain a UV Hub (AKA
"hubless"). This update adds support for additional functions required:

    Use PCH NMI handler instead of a UV Hub NMI handler.

    Initialize the UV BIOS callback interface used to support specific
    UV functions.

Signed-off-by: Mike Travis
Reviewed-by: Steve Wahl
Reviewed-by: Dimitri Sivanich
To: Thomas Gleixner
To: Ingo Molnar
To: H. Peter Anvin
To: Andrew Morton
To: Borislav Petkov
To: Christoph Hellwig
To: Sasha Levin
Cc: Dimitri Sivanich
Cc: Russ Anderson
Cc: Hedi Berriche
Cc: Steve Wahl
Cc: Justin Ernst
Cc: x...@kernel.org
Cc: linux-kernel@vger.kernel.org
---
 arch/x86/kernel/apic/x2apic_uv_x.c | 20 +---
 1 file changed, 17 insertions(+), 3 deletions(-)

--- linux.orig/arch/x86/kernel/apic/x2apic_uv_x.c
+++ linux/arch/x86/kernel/apic/x2apic_uv_x.c
@@ -1457,6 +1457,20 @@ static void __init build_socket_tables(v
 	}
 }
 
+/* Initialize UV hubless systems */
+static __init int uv_system_init_hubless(void)
+{
+	int rc;
+
+	/* Setup PCH NMI handler */
+	uv_nmi_setup_hubless();
+
+	/* Init kernel/BIOS interface */
+	rc = uv_bios_init();
+
+	return rc;
+}
+
 static void __init uv_system_init_hub(void)
 {
 	struct uv_hub_info_s hub_info = {0};
@@ -1596,8 +1610,8 @@ static void __init uv_system_init_hub(vo
 }
 
 /*
- * There is a small amount of UV specific code needed to initialize a
- * UV system that does not have a "UV HUB" (referred to as "hubless").
+ * There is a different code path needed to initialize a UV system that does
+ * not have a "UV HUB" (referred to as "hubless").
  */
 void __init uv_system_init(void)
 {
@@ -1607,7 +1621,7 @@ void __init uv_system_init(void)
 	if (is_uv_system())
 		uv_system_init_hub();
 	else
-		uv_nmi_setup_hubless();
+		uv_system_init_hubless();
 }
 
 apic_driver(apic_x2apic_uv_x);
--
[PATCH 7/8] x86/platform/uv: Check EFI Boot to set reboot type
Change to checking for EFI Boot type from previous check on if this is a
KDUMP kernel. This allows for KDUMP kernels that can handle EFI reboots.

Signed-off-by: Mike Travis
Reviewed-by: Steve Wahl
Reviewed-by: Dimitri Sivanich
To: Thomas Gleixner
To: Ingo Molnar
To: H. Peter Anvin
To: Andrew Morton
To: Borislav Petkov
To: Christoph Hellwig
To: Sasha Levin
Cc: Dimitri Sivanich
Cc: Russ Anderson
Cc: Hedi Berriche
Cc: Steve Wahl
Cc: Justin Ernst
Cc: x...@kernel.org
Cc: linux-kernel@vger.kernel.org
---
 arch/x86/kernel/apic/x2apic_uv_x.c | 18 --
 1 file changed, 12 insertions(+), 6 deletions(-)

--- linux.orig/arch/x86/kernel/apic/x2apic_uv_x.c
+++ linux/arch/x86/kernel/apic/x2apic_uv_x.c
@@ -15,6 +15,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 
@@ -1483,6 +1484,14 @@ static void __init build_socket_tables(v
 	}
 }
 
+/* Check which reboot to use */
+static void check_efi_reboot(void)
+{
+	/* If EFI reboot not available, use ACPI reboot */
+	if (!efi_enabled(EFI_BOOT))
+		reboot_type = BOOT_ACPI;
+}
+
 /* Setup user proc fs files */
 static int proc_hubbed_show(struct seq_file *file, void *data)
 {
@@ -1571,6 +1580,8 @@ static __init int uv_system_init_hubless
 	if (rc >= 0)
 		uv_setup_proc_files(1);
 
+	check_efi_reboot();
+
 	return rc;
 }
 
@@ -1704,12 +1715,7 @@ static void __init uv_system_init_hub(vo
 	/* Register Legacy VGA I/O redirection handler: */
 	pci_register_set_vga_state(uv_set_vga_state);
 
-	/*
-	 * For a kdump kernel the reset must be BOOT_ACPI, not BOOT_EFI, as
-	 * EFI is not enabled in the kdump kernel:
-	 */
-	if (is_kdump_kernel())
-		reboot_type = BOOT_ACPI;
+	check_efi_reboot();
 }
 
 /*
--
[PATCH V2 8/8] x86/platform/uv: Account for UV Hubless in is_uvX_hub Ops
The references in the is_uvX_hub() function uses the hub_info pointer
which will be NULL when the system is hubless. This change avoids
that NULL dereference. It is also an optimization in performance.

Signed-off-by: Mike Travis
Reviewed-by: Steve Wahl
Reviewed-by: Dimitri Sivanich
To: Thomas Gleixner
To: Ingo Molnar
To: H. Peter Anvin
To: Andrew Morton
To: Borislav Petkov
To: Christoph Hellwig
To: Sasha Levin
Cc: Dimitri Sivanich
Cc: Russ Anderson
Cc: Hedi Berriche
Cc: Steve Wahl
Cc: Justin Ernst
Cc: x...@kernel.org
Cc: linux-kernel@vger.kernel.org
---
V2: Add WARNING that the is UVx supported defines will be removed.
---
 arch/x86/include/asm/uv/.uv_hub.h.swp |binary
 arch/x86/include/asm/uv/uv_hub.h      | 61 ---
 1 file changed, 20 insertions(+), 41 deletions(-)

Binary files linux.orig/arch/x86/include/asm/uv/.uv_hub.h.swp and linux/arch/x86/include/asm/uv/.uv_hub.h.swp differ
--- linux.orig/arch/x86/include/asm/uv/uv_hub.h
+++ linux/arch/x86/include/asm/uv/uv_hub.h
@@ -19,6 +19,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -243,83 +244,61 @@ static inline int uv_hub_info_check(int
 #define UV4_HUB_REVISION_BASE		7
 #define UV4A_HUB_REVISION_BASE		8	/* UV4 (fixed) rev 2 */
 
-#ifdef UV1_HUB_IS_SUPPORTED
+/* WARNING: UVx_HUB_IS_SUPPORTED defines are deprecated and will be removed */
 static inline int is_uv1_hub(void)
 {
-	return uv_hub_info->hub_revision < UV2_HUB_REVISION_BASE;
-}
+#ifdef UV1_HUB_IS_SUPPORTED
+	return is_uv_hubbed(uv(1));
 #else
-static inline int is_uv1_hub(void)
-{
 	return 0;
-}
 #endif
+}
 
-#ifdef UV2_HUB_IS_SUPPORTED
 static inline int is_uv2_hub(void)
 {
-	return ((uv_hub_info->hub_revision >= UV2_HUB_REVISION_BASE) &&
-		(uv_hub_info->hub_revision < UV3_HUB_REVISION_BASE));
-}
+#ifdef UV2_HUB_IS_SUPPORTED
+	return is_uv_hubbed(uv(2));
 #else
-static inline int is_uv2_hub(void)
-{
 	return 0;
-}
 #endif
+}
 
-#ifdef UV3_HUB_IS_SUPPORTED
 static inline int is_uv3_hub(void)
 {
-	return ((uv_hub_info->hub_revision >= UV3_HUB_REVISION_BASE) &&
-		(uv_hub_info->hub_revision < UV4_HUB_REVISION_BASE));
-}
+#ifdef UV3_HUB_IS_SUPPORTED
+	return is_uv_hubbed(uv(3));
 #else
-static inline int is_uv3_hub(void)
-{
 	return 0;
-}
 #endif
+}
 
 /* First test "is UV4A", then "is UV4" */
-#ifdef UV4A_HUB_IS_SUPPORTED
-static inline int is_uv4a_hub(void)
-{
-	return (uv_hub_info->hub_revision >= UV4A_HUB_REVISION_BASE);
-}
-#else
 static inline int is_uv4a_hub(void)
 {
+#ifdef UV4A_HUB_IS_SUPPORTED
+	if (is_uv_hubbed(uv(4)))
+		return (uv_hub_info->hub_revision == UV4A_HUB_REVISION_BASE);
+#endif
 	return 0;
 }
-#endif
 
-#ifdef UV4_HUB_IS_SUPPORTED
 static inline int is_uv4_hub(void)
 {
-	return uv_hub_info->hub_revision >= UV4_HUB_REVISION_BASE;
-}
+#ifdef UV4_HUB_IS_SUPPORTED
+	return is_uv_hubbed(uv(4));
 #else
-static inline int is_uv4_hub(void)
-{
 	return 0;
-}
 #endif
+}
 
 static inline int is_uvx_hub(void)
 {
-	if (uv_hub_info->hub_revision >= UV2_HUB_REVISION_BASE)
-		return uv_hub_info->hub_revision;
-
-	return 0;
+	return (is_uv_hubbed(-2) >= uv(2));
 }
 
 static inline int is_uv_hub(void)
 {
-#ifdef UV1_HUB_IS_SUPPORTED
-	return uv_hub_info->hub_revision;
-#endif
-	return is_uvx_hub();
+	return is_uv1_hub() || is_uvx_hub();
 }
 
 union uvh_apicid {
--
[PATCH V2 0/8] x86/platform/UV: Update UV Hubless System Support
On 9/5/2019 11:47 AM, Mike Travis wrote:
>
> These patches support upcoming UV systems that do not have a UV HUB.
>
> [1/8] Save OEM_ID from ACPI MADT probe
>
> [2/8] Return UV Hubless System Type

	V2: Remove is_uv_hubless define
	    Remove leading '_' from _is_uv_hubless

> [3/8] Add return code to UV BIOS Init function
>
> [4/8] Setup UV functions for Hubless UV Systems
>
> [5/8] Add UV Hubbed/Hubless Proc FS Files

	V2: Remove is_uv_hubbed define
	    Remove leading '_' from _is_uv_hubbed

> [6/8] Decode UVsystab Info

	V2: Removed redundant error message after call to uv_bios_init.
	    Removed redundant error message after call to decode_uv_systab.
	    Clarify selection of UV4 and higher when checking for extended
	    UVsystab in decode_uv_systab().

> [7/8] Check EFI Boot to set reboot type
>
> [8/8] Account for UV Hubless in is_uvX_hub Ops

	V2: Add WARNING that the is UVx supported defines will be removed.

--
[PATCH 3/8] x86/platform/uv: Add return code to UV BIOS Init function
Add a return code to the UV BIOS init function that indicates the
successful initialization of the kernel/BIOS callback interface.

Signed-off-by: Mike Travis
Reviewed-by: Steve Wahl
Reviewed-by: Dimitri Sivanich
To: Thomas Gleixner
To: Ingo Molnar
To: H. Peter Anvin
To: Andrew Morton
To: Borislav Petkov
To: Christoph Hellwig
To: Sasha Levin
Cc: Dimitri Sivanich
Cc: Russ Anderson
Cc: Hedi Berriche
Cc: Steve Wahl
Cc: Justin Ernst
Cc: x...@kernel.org
Cc: linux-kernel@vger.kernel.org
---
 arch/x86/include/asm/uv/bios.h |    2 +-
 arch/x86/platform/uv/bios_uv.c |    9 +
 2 files changed, 6 insertions(+), 5 deletions(-)

--- linux.orig/arch/x86/include/asm/uv/bios.h
+++ linux/arch/x86/include/asm/uv/bios.h
@@ -138,7 +138,7 @@ extern s64 uv_bios_change_memprotect(u64
 extern s64 uv_bios_reserved_page_pa(u64, u64 *, u64 *, u64 *);
 extern int uv_bios_set_legacy_vga_target(bool decode, int domain, int bus);
 
-extern void uv_bios_init(void);
+extern int uv_bios_init(void);
 
 extern unsigned long sn_rtc_cycles_per_second;
 extern int uv_type;
--- linux.orig/arch/x86/platform/uv/bios_uv.c
+++ linux/arch/x86/platform/uv/bios_uv.c
@@ -184,20 +184,20 @@ int uv_bios_set_legacy_vga_target(bool d
 }
 EXPORT_SYMBOL_GPL(uv_bios_set_legacy_vga_target);
 
-void uv_bios_init(void)
+int uv_bios_init(void)
 {
 	uv_systab = NULL;
 	if ((uv_systab_phys == EFI_INVALID_TABLE_ADDR) ||
 	    !uv_systab_phys || efi_runtime_disabled()) {
 		pr_crit("UV: UVsystab: missing\n");
-		return;
+		return -EEXIST;
 	}
 
 	uv_systab = ioremap(uv_systab_phys, sizeof(struct uv_systab));
 	if (!uv_systab || strncmp(uv_systab->signature, UV_SYSTAB_SIG, 4)) {
 		pr_err("UV: UVsystab: bad signature!\n");
 		iounmap(uv_systab);
-		return;
+		return -EINVAL;
 	}
 
 	/* Starting with UV4 the UV systab size is variable */
@@ -208,8 +208,9 @@ void uv_bios_init(void)
 		uv_systab = ioremap(uv_systab_phys, size);
 		if (!uv_systab) {
 			pr_err("UV: UVsystab: ioremap(%d) failed!\n", size);
-			return;
+			return -EFAULT;
 		}
 	}
 	pr_info("UV: UVsystab: Revision:%x\n", uv_systab->revision);
+	return 0;
 }
--
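[Editor's illustration, not part of the posted patch.] The change follows the usual kernel convention of returning 0 on success and a negative errno on failure, so a caller can stop early. A minimal user-space sketch of that pattern follows; the struct, the signature string "SIG4" and both function names are hypothetical and only mimic the shape of uv_bios_init() and its caller.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table layout and signature, for illustration only. */
struct sketch_systab {
	char signature[4];
};
#define SKETCH_SYSTAB_SIG "SIG4"

/* Returns 0 on success, a negative errno otherwise (as the patch does). */
static int sketch_bios_init(const struct sketch_systab *tab)
{
	if (!tab)
		return -EEXIST;		/* table missing */
	if (strncmp(tab->signature, SKETCH_SYSTAB_SIG, 4) != 0)
		return -EINVAL;		/* bad signature */
	return 0;
}

/* Caller propagates the failure instead of continuing blindly. */
static int sketch_init_hubless(const struct sketch_systab *tab)
{
	int rc = sketch_bios_init(tab);

	if (rc < 0)
		return rc;
	/* ...further setup would only run once the table is known good... */
	return 0;
}
```

The point of the design is that the return value carries the reason for the failure, which a void function could only report through the log.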
[PATCH V2 6/8] x86/platform/uv: Decode UVsystab Info
Decode the hubless UVsystab passed from BIOS to the kernel saving
pertinent info in a similar manner that hubbed UVsystabs are decoded.

Signed-off-by: Mike Travis
Reviewed-by: Steve Wahl
Reviewed-by: Dimitri Sivanich
To: Thomas Gleixner
To: Ingo Molnar
To: H. Peter Anvin
To: Andrew Morton
To: Borislav Petkov
To: Christoph Hellwig
To: Sasha Levin
Cc: Dimitri Sivanich
Cc: Russ Anderson
Cc: Hedi Berriche
Cc: Steve Wahl
Cc: Justin Ernst
Cc: x...@kernel.org
Cc: linux-kernel@vger.kernel.org
---
V2: Removed redundant error message after call to uv_bios_init.
    Removed redundant error message after call to decode_uv_systab.
    Clarify selection of UV4 and higher when checking for extended
    UVsystab in decode_uv_systab().
---
 arch/x86/kernel/apic/x2apic_uv_x.c | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

--- linux.orig/arch/x86/kernel/apic/x2apic_uv_x.c
+++ linux/arch/x86/kernel/apic/x2apic_uv_x.c
@@ -1303,7 +1303,8 @@ static int __init decode_uv_systab(void)
 	struct uv_systab *st;
 	int i;
 
-	if (uv_hub_info->hub_revision < UV4_HUB_REVISION_BASE)
+	/* If system is uv3 or lower, there is no extended UVsystab */
+	if (is_uv_hubbed(0xfe) < uv(4) && is_uv_hubless(0xfe) < uv(4))
 		return 0;	/* No extended UVsystab required */
 
 	st = uv_systab;
@@ -1554,8 +1555,15 @@ static __init int uv_system_init_hubless
 
 	/* Init kernel/BIOS interface */
 	rc = uv_bios_init();
+	if (rc < 0)
+		return rc;
 
-	/* Create user access node if UVsystab available */
+	/* Process UVsystab */
+	rc = decode_uv_systab();
+	if (rc < 0)
+		return rc;
+
+	/* Create user access node */
 	if (rc >= 0)
 		uv_setup_proc_files(1);
 
--
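[Editor's illustration, not part of the posted patch.] The new test in decode_uv_systab() masks the stored type with 0xfe, which clears bit 0 (the "any" bit) and leaves only the generation bit, then compares the result with uv(4). The user-space sketch below works through that arithmetic. sketch_has_extended_systab() is a hypothetical name, and the stored values (0x11, 0x9, 0x5) are the ones this series assigns to UV4, UV3 and UV2 systems.

```c
#include <assert.h>

/* Same logic as the uv() helper from patch 2/8. */
static int uv(int uvtype)
{
	if (uvtype >= 0 && uvtype <= 30)
		return 1 << uvtype;
	return 1;
}

/*
 * Hypothetical restatement of the check: returns 1 when either the hubbed
 * or the hubless type is UV4 or newer, i.e. when the patch would go on to
 * decode the extended UVsystab instead of returning early.
 */
static int sketch_has_extended_systab(int hubbed, int hubless)
{
	/* 0xfe drops bit 0 ("any"), e.g. 0x11 -> 0x10 and 0x9 -> 0x8 */
	if ((hubbed & 0xfe) < uv(4) && (hubless & 0xfe) < uv(4))
		return 0;
	return 1;
}
```

For a UV4 hubless system (hubbed 0, hubless 0x11) the masked value is 0x10, which is not below uv(4), so the table is decoded; for UV3 (0x9) the masked value is 0x8 and the function returns early.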
[PATCH 1/8] x86/platform/uv: Save OEM_ID from ACPI MADT probe
Save the OEM_ID and OEM_TABLE_ID passed to the apic driver probe function
for later use. Also, convert the char list arg passed from the kernel
to a true null-terminated string.

Signed-off-by: Mike Travis
Reviewed-by: Steve Wahl
Reviewed-by: Dimitri Sivanich
To: Thomas Gleixner
To: Ingo Molnar
To: H. Peter Anvin
To: Andrew Morton
To: Borislav Petkov
To: Christoph Hellwig
To: Sasha Levin
Cc: Dimitri Sivanich
Cc: Russ Anderson
Cc: Hedi Berriche
Cc: Steve Wahl
Cc: Justin Ernst
Cc: x...@kernel.org
Cc: linux-kernel@vger.kernel.org
---
 arch/x86/kernel/apic/x2apic_uv_x.c | 16 +++-
 1 file changed, 15 insertions(+), 1 deletion(-)

--- linux.orig/arch/x86/kernel/apic/x2apic_uv_x.c
+++ linux/arch/x86/kernel/apic/x2apic_uv_x.c
@@ -14,6 +14,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 
@@ -31,6 +32,10 @@ static u64 gru_dist_base, gru_first_no
 static u64			gru_dist_lmask, gru_dist_umask;
 static union uvh_apicid		uvh_apicid;
 
+/* Unpack OEM/TABLE ID's to be NULL terminated strings */
+static u8 oem_id[ACPI_OEM_ID_SIZE + 1];
+static u8 oem_table_id[ACPI_OEM_TABLE_ID_SIZE + 1];
+
 /* Information derived from CPUID: */
 static struct {
 	unsigned int apicid_shift;
@@ -248,11 +253,20 @@ static void __init uv_set_apicid_hibit(v
 	}
 }
 
-static int __init uv_acpi_madt_oem_check(char *oem_id, char *oem_table_id)
+static void __init uv_stringify(int len, char *to, char *from)
+{
+	/* Relies on 'to' being NULL chars so result will be NULL terminated */
+	strncpy(to, from, len-1);
+}
+
+static int __init uv_acpi_madt_oem_check(char *_oem_id, char *_oem_table_id)
 {
 	int pnodeid;
 	int uv_apic;
 
+	uv_stringify(sizeof(oem_id), oem_id, _oem_id);
+	uv_stringify(sizeof(oem_table_id), oem_table_id, _oem_table_id);
+
 	if (strncmp(oem_id, "SGI", 3) != 0) {
 		if (strncmp(oem_id, "NSGI", 4) == 0) {
 			uv_hubless_system = true;
--
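[Editor's illustration, not part of the posted patch.] The ACPI OEM ID fields are fixed-width character arrays that need not be NUL terminated, which is why the patch copies them into buffers one byte larger. The user-space sketch below shows the same idea. sketch_stringify() is a hypothetical name, and unlike uv_stringify() it clears the destination itself rather than relying on it being zero-filled static storage. The 6-byte field width is assumed to match ACPI_OEM_ID_SIZE.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed width of the ACPI OEM ID field (ACPI_OEM_ID_SIZE in the kernel). */
#define SKETCH_OEM_ID_SIZE 6

/*
 * Copy a fixed-width, possibly unterminated field of (len - 1) bytes into
 * 'to', which must hold 'len' bytes. The last byte is never overwritten by
 * the copy, so the result is always NUL terminated.
 */
static void sketch_stringify(size_t len, char *to, const char *from)
{
	memset(to, 0, len);		/* do not depend on the caller zeroing */
	strncpy(to, from, len - 1);
}
```

A full-width field such as {'S','G','I','U','V','4'} becomes the string "SGIUV4", while a shorter, NUL-padded field such as "SGI" is copied unchanged.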
[PATCH V2 5/8] x86/platform/uv: Add UV Hubbed/Hubless Proc FS Files
Indicate to UV user utilities that UV hubless support is available on this
system via the existing /proc interface. The current interface is maintained
with the addition of new /proc leaves ("hubbed", "hubless", and "oemid")
that contain the specific type of UV arch this one is.

Signed-off-by: Mike Travis
Reviewed-by: Steve Wahl
Reviewed-by: Dimitri Sivanich
To: Thomas Gleixner
To: Ingo Molnar
To: H. Peter Anvin
To: Andrew Morton
To: Borislav Petkov
To: Christoph Hellwig
To: Sasha Levin
Cc: Dimitri Sivanich
Cc: Russ Anderson
Cc: Hedi Berriche
Cc: Steve Wahl
Cc: Justin Ernst
Cc: x...@kernel.org
Cc: linux-kernel@vger.kernel.org
---
V2: Remove is_uv_hubbed define
    Remove leading '_' from _is_uv_hubbed
---
 arch/x86/include/asm/uv/uv.h       |    4 +
 arch/x86/kernel/apic/x2apic_uv_x.c | 93 -
 2 files changed, 96 insertions(+), 1 deletion(-)

--- linux.orig/arch/x86/include/asm/uv/uv.h
+++ linux/arch/x86/include/asm/uv/uv.h
@@ -12,6 +12,8 @@ struct mm_struct;
 #ifdef CONFIG_X86_UV
 #include
 
+#define	UV_PROC_NODE	"sgi_uv"
+
 static inline int uv(int uvtype)
 {
 	/* uv(0) is "any" */
@@ -28,6 +30,7 @@ static inline bool is_early_uv_system(vo
 	return uv_systab_phys && uv_systab_phys != EFI_INVALID_TABLE_ADDR;
 }
 extern int is_uv_system(void);
+extern int is_uv_hubbed(int uvtype);
 extern int is_uv_hubless(int uvtype);
 extern void uv_cpu_init(void);
 extern void uv_nmi_init(void);
@@ -40,6 +43,7 @@ extern const struct cpumask *uv_flush_tl
 static inline enum uv_system_type get_uv_system_type(void) { return UV_NONE; }
 static inline bool is_early_uv_system(void)	{ return 0; }
 static inline int is_uv_system(void)	{ return 0; }
+static inline int is_uv_hubbed(int uv)	{ return 0; }
 static inline int is_uv_hubless(int uv)	{ return 0; }
 static inline void uv_cpu_init(void)	{ }
 static inline void uv_system_init(void)	{ }
--- linux.orig/arch/x86/kernel/apic/x2apic_uv_x.c
+++ linux/arch/x86/kernel/apic/x2apic_uv_x.c
@@ -26,6 +26,7 @@
 static DEFINE_PER_CPU(int, x2apic_extra_bits);
 
 static enum uv_system_type	uv_system_type;
+static int			uv_hubbed_system;
 static int			uv_hubless_system;
 static u64			gru_start_paddr, gru_end_paddr;
 static u64			gru_dist_base, gru_first_node_paddr = -1LL, gru_last_node_paddr;
@@ -309,6 +310,24 @@ static int __init uv_acpi_madt_oem_check
 	if (uv_hub_info->hub_revision == 0)
 		goto badbios;
 
+	switch (uv_hub_info->hub_revision) {
+	case UV4_HUB_REVISION_BASE:
+		uv_hubbed_system = 0x11;
+		break;
+
+	case UV3_HUB_REVISION_BASE:
+		uv_hubbed_system = 0x9;
+		break;
+
+	case UV2_HUB_REVISION_BASE:
+		uv_hubbed_system = 0x5;
+		break;
+
+	case UV1_HUB_REVISION_BASE:
+		uv_hubbed_system = 0x3;
+		break;
+	}
+
 	pnodeid = early_get_pnodeid();
 	early_get_apic_socketid_shift();
 
@@ -359,6 +378,12 @@ int is_uv_system(void)
 }
 EXPORT_SYMBOL_GPL(is_uv_system);
 
+int is_uv_hubbed(int uvtype)
+{
+	return (uv_hubbed_system & uvtype);
+}
+EXPORT_SYMBOL_GPL(is_uv_hubbed);
+
 int is_uv_hubless(int uvtype)
 {
 	return (uv_hubless_system & uvtype);
@@ -1457,6 +1482,68 @@ static void __init build_socket_tables(v
 	}
 }
 
+/* Setup user proc fs files */
+static int proc_hubbed_show(struct seq_file *file, void *data)
+{
+	seq_printf(file, "0x%x\n", uv_hubbed_system);
+	return 0;
+}
+
+static int proc_hubless_show(struct seq_file *file, void *data)
+{
+	seq_printf(file, "0x%x\n", uv_hubless_system);
+	return 0;
+}
+
+static int proc_oemid_show(struct seq_file *file, void *data)
+{
+	seq_printf(file, "%s/%s\n", oem_id, oem_table_id);
+	return 0;
+}
+
+static int proc_hubbed_open(struct inode *inode, struct file *file)
+{
+	return single_open(file, proc_hubbed_show, (void *)NULL);
+}
+
+static int proc_hubless_open(struct inode *inode, struct file *file)
+{
+	return single_open(file, proc_hubless_show, (void *)NULL);
+}
+
+static int proc_oemid_open(struct inode *inode, struct file *file)
+{
+	return single_open(file, proc_oemid_show, (void *)NULL);
+}
+
+/* (struct is "non-const" as open function is set at runtime) */
+static struct file_operations proc_version_fops = {
+	.read		= seq_read,
+	.llseek		= seq_lseek,
+	.release	= single_release,
+};
+
+static const struct file_operations proc_oemid_fops = {
+	.open		= proc_oemid_open,
+	.read		= seq_read,
+	.llseek		= seq_lseek,
+	.release	= single_release,
+};
+
+static __init void uv_setup_proc_files(int hubless)
+{
+	struct proc_dir_entry *pde;
+	char *name = hubless ? "hubless" : "hubbed";
+
+	pde = proc_mkdir(UV_PROC_NODE,
Re: Linux 5.3-rc8
Hi,

On Sun, Sep 08, 2019 at 01:59:27PM -0700, Linus Torvalds wrote:
> So we probably didn't strictly need an rc8 this release, but with LPC
> and the KS conference travel this upcoming week it just makes
> everything easier.
>

The commit b03755ad6f33 (ext4: make __ext4_get_inode_loc plug), [1]
which was merged in v5.3-rc1, *always* leads to a blocked boot on my
system due to low entropy.

The hardware is not a VM: it's a Thinkpad E480 (i5-8250U CPU), with
a standard Arch user-space.

It was discovered through bisecting the problem v5.2 => v5.3-rc1,
since v5.2 never had any similar issues. The issue still persists in
v5.3-rc8: reverting that commit always fixes the problem.

It seems that batching the directory lookup I/O requests (which are
possibly a lot during boot) is minimizing sources of
disk-activity-induced entropy? [2] [3]

Can this even be considered a user-space breakage? I'm honestly not
sure. On my modern RDRAND-capable x86, just running rng-tools rngd(8)
early-on fixes the problem. I'm not sure about the status of older
CPUs though.

Thanks,

[1] commit b03755ad6f33b7b8cd7312a3596a2dbf496de6e7
    Author: zhangjs
    Date: Wed Jun 19 23:41:29 2019 -0400

    ext4: make __ext4_get_inode_loc plug

    Add a blk_plug to prevent the inode table readahead from being
    submitted as small I/O requests.

    Signed-off-by: zhangjs
    Signed-off-by: Theodore Ts'o
    Reviewed-by: Jan Kara

[2] https://lkml.kernel.org/r/20190619122457.gf27...@quack2.suse.cz
[3] block/blk-core.c :: blk_start_plug()

--
darwi
http://darwish.chasingpointers.com
[PATCH] tty: serial: rda: Fix the link time qualifier of 'rda_uart_exit()'
'exit' functions should be marked as __exit, not __init.

Fixes: c10b13325ced ("tty: serial: Add RDA8810PL UART driver")
Signed-off-by: Christophe JAILLET
---
 drivers/tty/serial/rda-uart.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/tty/serial/rda-uart.c b/drivers/tty/serial/rda-uart.c
index c1b0d7662ef9..ff9a27d48bca 100644
--- a/drivers/tty/serial/rda-uart.c
+++ b/drivers/tty/serial/rda-uart.c
@@ -815,7 +815,7 @@ static int __init rda_uart_init(void)
 	return ret;
 }
 
-static void __init rda_uart_exit(void)
+static void __exit rda_uart_exit(void)
 {
 	platform_driver_unregister(&rda_uart_platform_driver);
 	uart_unregister_driver(&rda_uart_driver);
--
2.20.1
[PATCH] tty: serial: owl: Fix the link time qualifier of 'owl_uart_exit()'
'exit' functions should be marked as __exit, not __init.

Fixes: fc60a8b675bd ("tty: serial: owl: Implement console driver")
Signed-off-by: Christophe JAILLET
---
 drivers/tty/serial/owl-uart.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/tty/serial/owl-uart.c b/drivers/tty/serial/owl-uart.c
index 03963af77b15..d2d8b3494685 100644
--- a/drivers/tty/serial/owl-uart.c
+++ b/drivers/tty/serial/owl-uart.c
@@ -740,7 +740,7 @@ static int __init owl_uart_init(void)
 	return ret;
 }
 
-static void __init owl_uart_exit(void)
+static void __exit owl_uart_exit(void)
 {
 	platform_driver_unregister(&owl_uart_platform_driver);
 	uart_unregister_driver(&owl_uart_driver);
--
2.20.1
Re: [PATCH] smack: include linux/watch_queue.h
Hi Arnd, I love your patch! Yet something to improve: [auto build test ERROR on linus/master] [cannot apply to v5.3-rc8 next-20190904] [if your patch is applied to the wrong git tree, please drop us a note to help improve the system] url: https://github.com/0day-ci/linux/commits/Arnd-Bergmann/smack-include-linux-watch_queue-h/20190910-095704 reproduce: # apt-get install sparse # sparse version: make ARCH=x86_64 allmodconfig make C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' If you fix the issue, kindly add following tag Reported-by: kbuild test robot All errors (new ones prefixed by >>): >> security/smack/smack_lsm.c:45:11: sparse: error: unable to open >> 'linux/watch_queue.h' vim +45 security/smack/smack_lsm.c > 45 #include 46 #include "smack.h" 47 --- 0-DAY kernel test infrastructureOpen Source Technology Center https://lists.01.org/pipermail/kbuild-all Intel Corporation .config.gz Description: application/gzip
[PATCH] mips: Loongson: Fix the link time qualifier of 'serial_exit()'
'exit' functions should be marked as __exit, not __init.

Fixes: 85cc028817ef ("mips: make loongsoon serial driver explicitly modular")
Signed-off-by: Christophe JAILLET
---
 arch/mips/loongson64/common/serial.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/mips/loongson64/common/serial.c b/arch/mips/loongson64/common/serial.c
index ffefc1cb2612..98c3a7feb10f 100644
--- a/arch/mips/loongson64/common/serial.c
+++ b/arch/mips/loongson64/common/serial.c
@@ -110,7 +110,7 @@ static int __init serial_init(void)
 }
 module_init(serial_init);
 
-static void __init serial_exit(void)
+static void __exit serial_exit(void)
 {
 	platform_device_unregister(&uart8250_device);
 }
--
2.20.1
Re: [PATCH v3 0/3] kernel/notifier.c: avoid duplicate registration
On 2019/7/17 19:15, Vasily Averin wrote: > On 7/16/19 5:07 PM, Xiaoming Ni wrote: >> On 2019/7/16 18:20, Vasily Averin wrote: >>> On 7/16/19 5:00 AM, Xiaoming Ni wrote: On 2019/7/15 13:38, Vasily Averin wrote: > On 7/14/19 5:45 AM, Xiaoming Ni wrote: >> On 2019/7/12 22:07, gre...@linuxfoundation.org wrote: >>> On Fri, Jul 12, 2019 at 09:11:57PM +0800, Xiaoming Ni wrote: On 2019/7/11 21:57, Vasily Averin wrote: > On 7/11/19 4:55 AM, Nixiaoming wrote: >> On Wed, July 10, 2019 1:49 PM Vasily Averin wrote: >>> On 7/10/19 6:09 AM, Xiaoming Ni wrote: ... So in these two cases, is it more reasonable to trigger BUG() directly when checking for duplicate registration ? But why does current notifier_chain_register() just trigger WARN() without exiting ? notifier_chain_cond_register() direct exit without triggering WARN() ? >>> >>> It should recover from this, if it can be detected. The main point is >>> that not all apis have to be this "robust" when used within the kernel >>> as we do allow for the callers to know what they are doing :) >>> >> In the notifier_chain_register(), the condition ( (*nl) == n) is the >> same registration of the same hook. >> We can intercept this situation and avoid forming a linked list ring to >> make the API more rob ... ... > Yes, I'm agree, at present there are no difference between > notifier_chain_cond_register() and notifier_chain_register() > > Question is -- how to improve it. > You propose to remove notifier_chain_cond_register() by some way. > Another option is return an error, for some abstract callers who expect > possible double registration. > > Frankly speaking I prefer second one, > however because of kernel do not have any such callers right now seems you > are right, > and we can delete notifier_chain_cond_register(). > > So let me finally accept your patch-set. > > Thank you, > Vasily Averin > > . > Dear Greg Kroah-Hartman is there any other opinion on this patch set? can you pick this series? thanks Xiaoming Ni
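The duplicate check discussed in this thread (the "(*nl) == n" condition) can be illustrated outside the kernel. What follows is a minimal user-space sketch, not the code in kernel/notifier.c: struct notifier_block is reduced to two fields, chain_register() and chain_length() are hypothetical names, and returning -EEXIST on a duplicate is an assumption made for the example.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for the kernel's struct notifier_block. */
struct notifier_block {
	int priority;
	struct notifier_block *next;
};

/*
 * Insert n into the priority-ordered list at *nl (highest priority first).
 * If n is already on the list, leave the list unchanged and report
 * -EEXIST (an assumed return value for this sketch) instead of linking
 * n a second time, which would turn the chain into a ring.
 */
static int chain_register(struct notifier_block **nl, struct notifier_block *n)
{
	while (*nl != NULL) {
		if (*nl == n)
			return -EEXIST;	/* same hook registered twice */
		if (n->priority > (*nl)->priority)
			break;
		nl = &(*nl)->next;
	}
	n->next = *nl;
	*nl = n;
	return 0;
}

/* Count the entries; this only terminates if the list is not a ring. */
static int chain_length(const struct notifier_block *head)
{
	int len = 0;

	for (; head != NULL; head = head->next)
		len++;
	return len;
}
```

With the check in place, registering the same block twice leaves the chain length unchanged. Whether a duplicate should also warn, or be treated as a hard error, is the policy question the thread is debating and is left open here.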
Re: [PATCH 1/1] mm/pgtable/debug: Add test validating architecture page table helpers
On 09/09/2019 08:43 PM, Kirill A. Shutemov wrote: > On Mon, Sep 09, 2019 at 11:56:50AM +0530, Anshuman Khandual wrote: >> >> >> On 09/07/2019 12:33 AM, Gerald Schaefer wrote: >>> On Fri, 6 Sep 2019 11:58:59 +0530 >>> Anshuman Khandual wrote: >>> On 09/05/2019 10:36 PM, Gerald Schaefer wrote: > On Thu, 5 Sep 2019 14:48:14 +0530 > Anshuman Khandual wrote: > >>> [...] + +#if !defined(__PAGETABLE_PMD_FOLDED) && !defined(__ARCH_HAS_4LEVEL_HACK) +static void pud_clear_tests(pud_t *pudp) +{ + memset(pudp, RANDOM_NZVALUE, sizeof(pud_t)); + pud_clear(pudp); + WARN_ON(!pud_none(READ_ONCE(*pudp))); +} >>> >>> For pgd/p4d/pud_clear(), we only clear if the page table level is >>> present >>> and not folded. The memset() here overwrites the table type bits, so >>> pud_clear() will not clear anything on s390 and the pud_none() check >>> will >>> fail. >>> Would it be possible to OR a (larger) random value into the table, so >>> that >>> the lower 12 bits would be preserved? >> >> So the suggestion is instead of doing memset() on entry with >> RANDOM_NZVALUE, >> it should OR a large random value preserving lower 12 bits. Hmm, this >> should >> still do the trick for other platforms, they just need non zero value. >> So on >> s390, the lower 12 bits on the page table entry already has valid value >> while >> entering this function which would make sure that pud_clear() really does >> clear the entry ? > > Yes, in theory the table entry on s390 would have the type set in the last > 4 bits, so preserving those would be enough. If it does not conflict with > others, I would still suggest preserving all 12 bits since those would > contain > arch-specific flags in general, just to be sure. For s390, the pte/pmd > tests > would also work with the memset, but for consistency I think the same > logic > should be used in all pxd_clear_tests. Makes sense but.. There is a small challenge with this. 
Modifying individual bits on a given page table entry from generic code like this test case is bit tricky. That is because there are not enough helpers to create entries with an absolute value. This would have been easier if all the platforms provided functions like __pxx() which is not the case now. Otherwise something like this should have worked. pud_t pud = READ_ONCE(*pudp); pud = __pud(pud_val(pud) | RANDOM_VALUE (keeping lower 12 bits 0)) WRITE_ONCE(*pudp, pud); But __pud() will fail to build in many platforms. >>> >>> Hmm, I simply used this on my system to make pud_clear_tests() work, not >>> sure if it works on all archs: >>> >>> pud_val(*pudp) |= RANDOM_NZVALUE; >> >> Which compiles on arm64 but then fails on x86 because of the way pmd_val() >> has been defined there. > > Use instead > > *pudp = __pud(pud_val(*pudp) | RANDOM_NZVALUE); Agreed. As I had mentioned before this would have been really the cleanest approach. > > It *should* be more portable. Not really, because not all the platforms have __pxx() definitions right now. Going with these will clearly cause build failures on affected platforms. Lets examine __pud() for instance. It is defined only on these platforms. 
arch/arm64/include/asm/pgtable-types.h: #define __pud(x) ((pud_t) { (x) } ) arch/mips/include/asm/pgtable-64.h: #define __pud(x) ((pud_t) { (x) }) arch/powerpc/include/asm/pgtable-be-types.h:#define __pud(x) ((pud_t) { cpu_to_be64(x) }) arch/powerpc/include/asm/pgtable-types.h: #define __pud(x) ((pud_t) { (x) }) arch/s390/include/asm/page.h: #define __pud(x) ((pud_t) { (x) } ) arch/sparc/include/asm/page_64.h: #define __pud(x) ((pud_t) { (x) } ) arch/sparc/include/asm/page_64.h: #define __pud(x) (x) arch/x86/include/asm/pgtable.h: #define __pud(x) native_make_pud(x) Similarly for __pmd() arch/alpha/include/asm/page.h: #define __pmd(x) ((pmd_t) { (x) } ) arch/arm/include/asm/page-nommu.h: #define __pmd(x) (x) arch/arm/include/asm/pgtable-2level-types.h:#define __pmd(x) ((pmd_t) { (x) } ) arch/arm/include/asm/pgtable-2level-types.h:#define __pmd(x) (x) arch/arm/include/asm/pgtable-3level-types.h:#define __pmd(x) ((pmd_t) { (x) } ) arch/arm/include/asm/pgtable-3level-types.h:#define __pmd(x) (x) arch/arm64/include/asm/pgtable-types.h: #define __pmd(x) ((pmd_t) { (x) } ) arch/m68k/include/asm/page.h: #define __pmd(x) ((pmd_t) { { (x) }, }) arch/mips/include/asm/pgtable-64.h: #define __pmd(x) ((pmd_t) { (x) } ) arch/nds32/include/asm/page.h: #define __pmd(x) (x) arch/parisc/include/a
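The bit manipulation under discussion (OR a random value into the entry while leaving the lower 12 bits alone) can be shown on a plain integer. This is only a user-space sketch: the entry is modelled as a uint64_t, and LOW_BITS_MASK, RANDOM_PATTERN and scramble_entry() are hypothetical names, not helpers from the patch. It deliberately sidesteps the real difficulty raised above, which is that generic code has no portable __pxx() constructor on every architecture.

```c
#include <stdint.h>

/* Assumption: the low 12 bits carry arch-specific type and flag bits. */
#define LOW_BITS_MASK	UINT64_C(0xfff)
/* Arbitrary non-zero pattern for this sketch; not the patch's RANDOM_NZVALUE. */
#define RANDOM_PATTERN	UINT64_C(0x123456789abcdef0)

/* OR the pattern into the entry without touching the low 12 bits. */
static uint64_t scramble_entry(uint64_t entry)
{
	return entry | (RANDOM_PATTERN & ~LOW_BITS_MASK);
}
```

An entry whose low bits are 0x003 keeps them after scrambling, while the upper bits become non-zero, which is the property the clear test needs.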
Re: [PATCH] arm64: fix unreachable code issue with cmpxchg
On Mon, Sep 09, 2019 at 10:21:35PM +0200, Arnd Bergmann wrote: > On arm64 build with clang, sometimes the __cmpxchg_mb is not inlined > when CONFIG_OPTIMIZE_INLINING is set. > Clang then fails a compile-time assertion, because it cannot tell at > compile time what the size of the argument is: > > mm/memcontrol.o: In function `__cmpxchg_mb': > memcontrol.c:(.text+0x1a4c): undefined reference to `__compiletime_assert_175' > memcontrol.c:(.text+0x1a4c): relocation truncated to fit: R_AARCH64_CALL26 > against undefined symbol `__compiletime_assert_175' > > Mark all of the cmpxchg() style functions as __always_inline to > ensure that the compiler can see the result. > > Signed-off-by: Arnd Bergmann Reviewed-by: Nathan Chancellor Tested-by: Nathan Chancellor
Re: [PATCH] smack: include linux/watch_queue.h
Hi Arnd, I love your patch! Yet something to improve: [auto build test ERROR on linus/master] [cannot apply to v5.3-rc8 next-20190904] [if your patch is applied to the wrong git tree, please drop us a note to help improve the system] url: https://github.com/0day-ci/linux/commits/Arnd-Bergmann/smack-include-linux-watch_queue-h/20190910-095704 config: ia64-allmodconfig (attached as .config) compiler: ia64-linux-gcc (GCC) 7.4.0 reproduce: wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # save the attached .config to linux build tree GCC_VERSION=7.4.0 make.cross ARCH=ia64 If you fix the issue, kindly add following tag Reported-by: kbuild test robot All errors (new ones prefixed by >>): >> security/smack/smack_lsm.c:45:10: fatal error: linux/watch_queue.h: No such >> file or directory #include ^ compilation terminated. vim +45 security/smack/smack_lsm.c > 45 #include 46 #include "smack.h" 47 --- 0-DAY kernel test infrastructureOpen Source Technology Center https://lists.01.org/pipermail/kbuild-all Intel Corporation .config.gz Description: application/gzip
possible deadlock in shmem_fallocate (3)
Hello, syzbot found the following crash on: HEAD commit:6d028043 Add linux-next specific files for 20190830 git tree: linux-next console output: https://syzkaller.appspot.com/x/log.txt?x=12359ec660 kernel config: https://syzkaller.appspot.com/x/.config?x=82a6bec43ab0cb69 dashboard link: https://syzkaller.appspot.com/bug?extid=5d04068d02b9da8a0947 compiler: gcc (GCC) 9.0.0 20181231 (experimental) Unfortunately, I don't have any reproducer for this crash yet. IMPORTANT: if you fix the bug, please add the following tag to the commit: Reported-by: syzbot+5d04068d02b9da8a0...@syzkaller.appspotmail.com == WARNING: possible circular locking dependency detected 5.3.0-rc6-next-20190830 #75 Not tainted -- kswapd0/1770 is trying to acquire lock: 8880a0b9b780 (&sb->s_type->i_mutex_key#13){+.+.}, at: inode_lock include/linux/fs.h:789 [inline] 8880a0b9b780 (&sb->s_type->i_mutex_key#13){+.+.}, at: shmem_fallocate+0x15a/0xc60 mm/shmem.c:2728 but task is already holding lock: 89042f80 (fs_reclaim){+.+.}, at: __fs_reclaim_acquire+0x0/0x30 mm/page_alloc.c:4889 which lock already depends on the new lock. 
the existing dependency chain (in reverse order) is: -> #1 (fs_reclaim){+.+.}: __fs_reclaim_acquire mm/page_alloc.c:4075 [inline] fs_reclaim_acquire.part.0+0x24/0x30 mm/page_alloc.c:4086 fs_reclaim_acquire mm/page_alloc.c:4662 [inline] prepare_alloc_pages mm/page_alloc.c:4659 [inline] __alloc_pages_nodemask+0x52f/0x900 mm/page_alloc.c:4711 alloc_pages_vma+0x1bc/0x3f0 mm/mempolicy.c:2114 shmem_alloc_page+0xbd/0x180 mm/shmem.c:1496 shmem_alloc_and_acct_page+0x165/0x990 mm/shmem.c:1521 shmem_getpage_gfp+0x598/0x2680 mm/shmem.c:1835 shmem_getpage mm/shmem.c:152 [inline] shmem_write_begin+0x105/0x1e0 mm/shmem.c:2480 generic_perform_write+0x23b/0x540 mm/filemap.c:3304 __generic_file_write_iter+0x25e/0x630 mm/filemap.c:3433 generic_file_write_iter+0x420/0x690 mm/filemap.c:3465 call_write_iter include/linux/fs.h:1890 [inline] new_sync_write+0x4d3/0x770 fs/read_write.c:483 __vfs_write+0xe1/0x110 fs/read_write.c:496 vfs_write+0x268/0x5d0 fs/read_write.c:558 ksys_write+0x14f/0x290 fs/read_write.c:611 __do_sys_write fs/read_write.c:623 [inline] __se_sys_write fs/read_write.c:620 [inline] __x64_sys_write+0x73/0xb0 fs/read_write.c:620 do_syscall_64+0xfa/0x760 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe -> #0 (&sb->s_type->i_mutex_key#13){+.+.}: check_prev_add kernel/locking/lockdep.c:2476 [inline] check_prevs_add kernel/locking/lockdep.c:2581 [inline] validate_chain kernel/locking/lockdep.c:2971 [inline] __lock_acquire+0x2596/0x4a00 kernel/locking/lockdep.c:3955 lock_acquire+0x190/0x410 kernel/locking/lockdep.c:4487 down_write+0x93/0x150 kernel/locking/rwsem.c:1534 inode_lock include/linux/fs.h:789 [inline] shmem_fallocate+0x15a/0xc60 mm/shmem.c:2728 ashmem_shrink_scan drivers/staging/android/ashmem.c:462 [inline] ashmem_shrink_scan+0x370/0x510 drivers/staging/android/ashmem.c:437 do_shrink_slab+0x40f/0xa30 mm/vmscan.c:560 shrink_slab mm/vmscan.c:721 [inline] shrink_slab+0x19a/0x680 mm/vmscan.c:694 shrink_node+0x223/0x12e0 mm/vmscan.c:2807 
kswapd_shrink_node mm/vmscan.c:3549 [inline] balance_pgdat+0x57c/0xea0 mm/vmscan.c:3707 kswapd+0x5c3/0xf30 mm/vmscan.c:3958 kthread+0x361/0x430 kernel/kthread.c:255 ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352 other info that might help us debug this: Possible unsafe locking scenario: CPU0CPU1 lock(fs_reclaim); lock(&sb->s_type->i_mutex_key#13); lock(fs_reclaim); lock(&sb->s_type->i_mutex_key#13); *** DEADLOCK *** 2 locks held by kswapd0/1770: #0: 89042f80 (fs_reclaim){+.+.}, at: __fs_reclaim_acquire+0x0/0x30 mm/page_alloc.c:4889 #1: 8901ffe8 (shrinker_rwsem){}, at: shrink_slab mm/vmscan.c:711 [inline] #1: 8901ffe8 (shrinker_rwsem){}, at: shrink_slab+0xe6/0x680 mm/vmscan.c:694 stack backtrace: CPU: 0 PID: 1770 Comm: kswapd0 Not tainted 5.3.0-rc6-next-20190830 #75 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x172/0x1f0 lib/dump_stack.c:113 print_circular_bug.isra.0.cold+0x163/0x172 kernel/locking/lockdep.c:1685 check_noncircular+0x32e/0x3e0 kernel/locking/lockdep.c:1809 check_prev_add kernel/locking/lockdep.c:2476 [inline] check_prevs_add kernel/locking/lockdep.c:2581 [inline] validate_chain kernel/locking/lockdep.c:2971 [in
[PATCH 1/2] arm64: dts: imx8mn: Add system counter node
Add i.MX8MN system counter node to enable timer-imx-sysctr broadcast timer driver. Signed-off-by: Anson Huang --- arch/arm64/boot/dts/freescale/imx8mn.dtsi | 8 1 file changed, 8 insertions(+) diff --git a/arch/arm64/boot/dts/freescale/imx8mn.dtsi b/arch/arm64/boot/dts/freescale/imx8mn.dtsi index d94db95..0166f8c 100644 --- a/arch/arm64/boot/dts/freescale/imx8mn.dtsi +++ b/arch/arm64/boot/dts/freescale/imx8mn.dtsi @@ -428,6 +428,14 @@ #pwm-cells = <2>; status = "disabled"; }; + + system_counter: timer@306a { + compatible = "nxp,sysctr-timer"; + reg = <0x306a 0x2>; + interrupts = ; + clocks = <&osc_24m>; + clock-names = "per"; + }; }; aips3: bus@3080 { -- 2.7.4
[PATCH 2/2] arm64: dts: imx8mn: Enable cpu-idle driver
Enable i.MX8MN cpu-idle using generic ARM cpu-idle driver, 2 states are supported, details as below: root@imx8mnevk:~# cat /sys/devices/system/cpu/cpu0/cpuidle/state0/name WFI root@imx8mnevk:~# cat /sys/devices/system/cpu/cpu0/cpuidle/state0/usage 3098 root@imx8mnevk:~# cat /sys/devices/system/cpu/cpu0/cpuidle/state1/name cpu-pd-wait root@imx8mnevk:~# cat /sys/devices/system/cpu/cpu0/cpuidle/state1/usage 3078 Signed-off-by: Anson Huang --- arch/arm64/boot/dts/freescale/imx8mn.dtsi | 17 + 1 file changed, 17 insertions(+) diff --git a/arch/arm64/boot/dts/freescale/imx8mn.dtsi b/arch/arm64/boot/dts/freescale/imx8mn.dtsi index 0166f8c..e4efe8d 100644 --- a/arch/arm64/boot/dts/freescale/imx8mn.dtsi +++ b/arch/arm64/boot/dts/freescale/imx8mn.dtsi @@ -43,6 +43,19 @@ #address-cells = <1>; #size-cells = <0>; + idle-states { + entry-method = "psci"; + + cpu_pd_wait: cpu-pd-wait { + compatible = "arm,idle-state"; + arm,psci-suspend-param = <0x0010033>; + local-timer-stop; + entry-latency-us = <1000>; + exit-latency-us = <700>; + min-residency-us = <2700>; + }; + }; + A53_0: cpu@0 { device_type = "cpu"; compatible = "arm,cortex-a53"; @@ -54,6 +67,7 @@ operating-points-v2 = <&a53_opp_table>; nvmem-cells = <&cpu_speed_grade>; nvmem-cell-names = "speed_grade"; + cpu-idle-states = <&cpu_pd_wait>; }; A53_1: cpu@1 { @@ -65,6 +79,7 @@ enable-method = "psci"; next-level-cache = <&A53_L2>; operating-points-v2 = <&a53_opp_table>; + cpu-idle-states = <&cpu_pd_wait>; }; A53_2: cpu@2 { @@ -76,6 +91,7 @@ enable-method = "psci"; next-level-cache = <&A53_L2>; operating-points-v2 = <&a53_opp_table>; + cpu-idle-states = <&cpu_pd_wait>; }; A53_3: cpu@3 { @@ -87,6 +103,7 @@ enable-method = "psci"; next-level-cache = <&A53_L2>; operating-points-v2 = <&a53_opp_table>; + cpu-idle-states = <&cpu_pd_wait>; }; A53_L2: l2-cache0 { -- 2.7.4
Re: [PATCH 2/4] mmc: Add virtual command queue support
On Mon, 9 Sep 2019 at 20:45, Adrian Hunter wrote: > > On 9/09/19 3:16 PM, Baolin Wang wrote: > > Hi Adrian, > > > > On Mon, 9 Sep 2019 at 20:02, Adrian Hunter wrote: > >> > >> On 6/09/19 6:52 AM, Baolin Wang wrote: > >>> Now the MMC read/write stack will always wait for previous request is > >>> completed by mmc_blk_rw_wait(), before sending a new request to hardware, > >>> or queue a work to complete request, that will bring context switching > >>> overhead, especially for high I/O per second rates, to affect the IO > >>> performance. > >>> > >>> Thus this patch introduces virtual command queue interface, which is > >>> similar with the hardware command queue engine's idea, that can remove > >>> the context switching. > >> > >> CQHCI is a hardware interface for eMMC's that support command queuing. > >> What > >> you are doing is a software issue queue, unrelated to CQHCI. I think you > > > > Yes. > > > >> should avoid all reference to CQHCI i.e. call it something else. > > > > Since its process is similar with CQHCI and re-use the CQHCI's > > interfaces, I called it virtual command queue. I am not sure what else > > name is better, any thoughts? VCQHCI? Thanks. > > What about swq for software queue. Maybe Ulf can suggest something? Um, though changing to use swq, still need reuse command queue's interfaces, like 'mq->use-cqe', 'host->cqe_depth' and cqe ops and so on, looks a little weird for me. But if you all agree with this name, then I am okay. Ulf, what do you suggest? -- Baolin Wang Best Regards
Re: [RFC PATCH v4 0/9] printk: new ringbuffer implementation
On (09/06/19 16:01), Peter Zijlstra wrote: > In fact, i've gotten output that is plain impossible with > the current junk. Peter, can you post any of those backtraces? Very curious. -ss
Re: [RFC PATCH v4 9/9] printk: use a new ringbuffer implementation
On (08/10/19 07:53), Thomas Gleixner wrote: > > Right now we have an implementation for serial only, but that already is > useful. I nicely got (minimaly garbled) crash dumps out of an NMI > handler. With the current mainline console code the machine just hung. > Thomas, any chance you can post backtraces? Just curious where exactly current printk() and console_drivers() hung. -ss
Re: [PATCH 1/4] softirq: implement IRQ flood detection mechanism
Hey Ming, Ok, so the real problem is per-cpu bounded tasks. I share Thomas opinion about a NAPI like approach. We already have that, its irq_poll, but it seems that for this use-case, we get lower performance for some reason. I'm not entirely sure why that is, maybe its because we need to mask interrupts because we don't have an "arm" register in nvme like network devices have? Long observed that IOPS drops much too by switching to threaded irq. If softirqd is waken up for handing softirq, the performance shouldn't be better than threaded irq. Its true that it shouldn't be any faster, but what irqpoll already has and we don't need to reinvent is a proper budgeting mechanism that needs to occur when multiple devices map irq vectors to the same cpu core. irqpoll already maintains a percpu list and dispatch the ->poll with a budget that the backend enforces and irqpoll multiplexes between them. Having this mechanism in irq (hard or threaded) context sounds unnecessary a bit. It seems like we're attempting to stay in irq context for as long as we can instead of scheduling to softirq/thread context if we have more than a minimal amount of work to do. Without at least understanding why softirq/thread degrades us so much this code seems like the wrong approach to me. Interrupt context will always be faster, but it is not a sufficient reason to spend as much time as possible there, is it? We should also keep in mind, that the networking stack has been doing this for years, I would try to understand why this cannot work for nvme before dismissing. Especially, Long found that context switch is increased a lot after applying your irq poll patch. http://lists.infradead.org/pipermail/linux-nvme/2019-August/026788.html Oh, I didn't see that one, wonder why... thanks! 5% improvement, I guess we can buy that for other users as is :) If we suffer from lots of context switches while the CPU is flooded with interrupts, then I would argue that we're re-raising softirq too much. 
In this use-case, my assumption is that the cpu cannot keep up with the interrupts and not that it doesn't reap enough (we also reap the first batch in interrupt context...) Perhaps making irqpoll continue until it must resched would improve things further? Although this is a latency vs. efficiency tradeoff, looks like MAX_SOFTIRQ_TIME is set to 2ms: " * The MAX_SOFTIRQ_TIME provides a nice upper bound in most cases, but in * certain cases, such as stop_machine(), jiffies may cease to * increment and so we need the MAX_SOFTIRQ_RESTART limit as * well to make sure we eventually return from this method. * * These limits have been established via experimentation. * The two things to balance is latency against fairness - * we want to handle softirqs as soon as possible, but they * should not be able to lock up the box. " Long, does this patch make any difference? -- diff --git a/lib/irq_poll.c b/lib/irq_poll.c index 2f17b488d58e..d8eab563fa77 100644 --- a/lib/irq_poll.c +++ b/lib/irq_poll.c @@ -12,8 +12,6 @@ #include #include -static unsigned int irq_poll_budget __read_mostly = 256; - static DEFINE_PER_CPU(struct list_head, blk_cpu_iopoll); /** @@ -77,42 +75,29 @@ EXPORT_SYMBOL(irq_poll_complete); static void __latent_entropy irq_poll_softirq(struct softirq_action *h) { - struct list_head *list = this_cpu_ptr(&blk_cpu_iopoll); - int rearm = 0, budget = irq_poll_budget; - unsigned long start_time = jiffies; + struct list_head *irqpoll_list = this_cpu_ptr(&blk_cpu_iopoll); + LIST_HEAD(list); local_irq_disable(); + list_splice_init(irqpoll_list, &list); + local_irq_enable(); - while (!list_empty(list)) { + while (!list_empty(&list)) { struct irq_poll *iop; int work, weight; - /* -* If softirq window is exhausted then punt. 
-*/ - if (budget <= 0 || time_after(jiffies, start_time)) { - rearm = 1; - break; - } - - local_irq_enable(); - /* Even though interrupts have been re-enabled, this * access is safe because interrupts can only add new * entries to the tail of this list, and only ->poll() * calls can remove this head entry from the list. */ - iop = list_entry(list->next, struct irq_poll, list); + iop = list_first_entry(&list, struct irq_poll, list); weight = iop->weight; work = 0; if (test_bit(IRQ_POLL_F_SCHED, &iop->state)) work = iop->poll(iop, weight); - budget -= work; - - local_irq_disable(); - /* * Drivers must not modify the iopoll state, if they * consume their assigned weight (or more, some drivers can't @@ -125,11 +110,21
Hello! Are you interested in client databases?
Hello! Are you interested in client databases?
[PATCH 0/2] Add bounds check for Hotplugged memory
From: Alastair D'Silva

This series adds bounds checks for hotplugged memory, ensuring that it is within the physically addressable range (for platforms that define MAX_(POSSIBLE_)PHYSMEM_BITS). This allows for early failure, rather than attempting to access bogus section numbers.

Alastair D'Silva (2):
  memory_hotplug: Add a bounds check to check_hotplug_memory_range()
  mm: Add a bounds check in devm_memremap_pages()

 include/linux/memory_hotplug.h | 1 +
 mm/memory_hotplug.c            | 19 ++-
 mm/memremap.c                  | 8
 3 files changed, 27 insertions(+), 1 deletion(-)

--
2.21.0
[PATCH 2/2] mm: Add a bounds check in devm_memremap_pages()
From: Alastair D'Silva

The call to check_hotplug_memory_addressable() validates that the memory is fully addressable. Without this call, it is possible that we may remap pages that are not physically addressable, resulting in bogus section numbers being returned from __section_nr().

Signed-off-by: Alastair D'Silva
---
 mm/memremap.c | 8
 1 file changed, 8 insertions(+)

diff --git a/mm/memremap.c b/mm/memremap.c
index 86432650f829..fd00993caa3e 100644
--- a/mm/memremap.c
+++ b/mm/memremap.c
@@ -269,6 +269,13 @@ void *devm_memremap_pages(struct device *dev, struct dev_pagemap *pgmap)
 
 	mem_hotplug_begin();
 
+	error = check_hotplug_memory_addressable(res->start,
+						 resource_size(res));
+	if (error) {
+		mem_hotplug_done();
+		goto err_checkrange;
+	}
+
 	/*
 	 * For device private memory we call add_pages() as we only need to
 	 * allocate and initialize struct page for the device memory. More-
@@ -324,6 +331,7 @@ void *devm_memremap_pages(struct device *dev, struct dev_pagemap *pgmap)
 
  err_add_memory:
 	kasan_remove_zero_shadow(__va(res->start), resource_size(res));
+ err_checkrange:
  err_kasan:
 	untrack_pfn(NULL, PHYS_PFN(res->start), resource_size(res));
  err_pfn_remap:
--
2.21.0
[PATCH 1/2] memory_hotplug: Add a bounds check to check_hotplug_memory_range()
From: Alastair D'Silva

On PowerPC, the address ranges allocated to OpenCAPI LPC memory are allocated from firmware. These address ranges may be higher than what older kernels permit, as we increased the maximum permissible address in commit 4ffe713b7587 ("powerpc/mm: Increase the max addressable memory to 2PB"). It is possible that the addressable range may change again in the future.

In this scenario, we end up with a bogus section returned from __section_nr (see the discussion on the thread "mm: Trigger bug on if a section is not found in __section_nr").

Adding a check here means that we fail early and have an opportunity to handle the error gracefully, rather than rumbling on and potentially accessing an incorrect section.

Further discussion is also on the thread ("powerpc: Perform a bounds check in arch_add_memory").

Signed-off-by: Alastair D'Silva
---
 include/linux/memory_hotplug.h | 1 +
 mm/memory_hotplug.c            | 19 ++-
 2 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index f46ea71b4ffd..bc477e98a310 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -110,6 +110,7 @@ extern void __online_page_increment_counters(struct page *page);
 extern void __online_page_free(struct page *page);
 
 extern int try_online_node(int nid);
 
+int check_hotplug_memory_addressable(u64 start, u64 size);
 extern int arch_add_memory(int nid, u64 start, u64 size,
 			struct mhp_restrictions *restrictions);
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index c73f09913165..3c5428b014f9 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1030,6 +1030,23 @@ int try_online_node(int nid)
 	return ret;
 }
 
+#ifndef MAX_POSSIBLE_PHYSMEM_BITS
+#ifdef MAX_PHYSMEM_BITS
+#define MAX_POSSIBLE_PHYSMEM_BITS MAX_PHYSMEM_BITS
+#endif
+#endif
+
+int check_hotplug_memory_addressable(u64 start, u64 size)
+{
+#ifdef MAX_POSSIBLE_PHYSMEM_BITS
+	if ((start + size - 1) >> MAX_POSSIBLE_PHYSMEM_BITS)
+		return -E2BIG;
+#endif
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(check_hotplug_memory_addressable);
+
 static int check_hotplug_memory_range(u64 start, u64 size)
 {
 	/* memory range must be block size aligned */
@@ -1040,7 +1057,7 @@ static int check_hotplug_memory_range(u64 start, u64 size)
 		return -EINVAL;
 	}
 
-	return 0;
+	return check_hotplug_memory_addressable(start, size);
 }
 
 static int online_memory_block(struct memory_block *mem, void *arg)
--
2.21.0
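The shift-based test in check_hotplug_memory_addressable() can be tried in user space. The sketch below mirrors the expression from the patch, but the 46-bit limit (SKETCH_MAX_PHYSMEM_BITS) is an assumed value chosen for illustration and check_addressable() is a hypothetical name; real platforms supply their own MAX_PHYSMEM_BITS.

```c
#include <errno.h>
#include <stdint.h>

/* Assumed limit for illustration only; each platform defines its own. */
#define SKETCH_MAX_PHYSMEM_BITS 46

/*
 * Return 0 if the last byte of [start, start + size) lies below
 * 2^SKETCH_MAX_PHYSMEM_BITS, otherwise -E2BIG.
 * Like the expression it mirrors, this does not guard against
 * size == 0 or against start + size wrapping around.
 */
static int check_addressable(uint64_t start, uint64_t size)
{
	/* Any bit set at or above the limit means the range is too high. */
	if ((start + size - 1) >> SKETCH_MAX_PHYSMEM_BITS)
		return -E2BIG;
	return 0;
}
```

A range that ends exactly at the limit passes, because the check looks at the address of the last byte (start + size - 1) rather than at the end address itself.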
RE: [PATCH] clk: imx: lpcg: write twice when writing lpcg regs
> On Sat, Sep 7, 2019 at 9:47 PM Stephen Boyd wrote:
> >
> > Quoting Peng Fan (2019-08-27 01:17:50)
> > > From: Peng Fan
> > >
> > > There is hardware issue that:
> > > The output clock the LPCG cell will not turn back on as expected,
> > > even though a read of the IPG registers in the LPCG indicates that
> > > the clock should be enabled.
> > >
> > > The software workaround is to write twice to enable the LPCG clock
> > > output.
> > >
> > > Signed-off-by: Peng Fan
> >
> > Does this need a Fixes tag?
>
> Not sure as it's not code logic issue but a hardware bug.
> And 4.19 LTS still have not this driver support.

Looks like there is an errata for this issue, and Ranjani just sent a patch for review internally,

Back-to-back LPCG writes can be ignored by the LPCG register due to a HW bug. The writes need to be separated by at least 4 cycles of the gated clock. The workaround is implemented as follows:

1. For clocks running greater than 50MHz no delay is required as the delay in accessing the LPCG register is sufficient.
2. For clocks running greater than 23MHz, a read followed by the write will provide the sufficient delay.
3. For clocks running below 23MHz, LPCG is not used.

Need double check?

Anson.
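The three cases of the workaround amount to a decision on the gated clock rate. A minimal sketch of that decision follows; it is not the internal patch mentioned above, the enum and function names are hypothetical, and treating a clock of exactly 23MHz as "not used" is an assumption, since the errata text only says "greater than" and "below".

```c
/* Hypothetical names; the errata text does not define an API. */
enum lpcg_write_strategy {
	LPCG_NOT_USED,		/* at or below 23 MHz: LPCG is not used */
	LPCG_READ_THEN_WRITE,	/* above 23 MHz: a read before the write gives the delay */
	LPCG_PLAIN_WRITE,	/* above 50 MHz: register access latency is enough */
};

/* Pick the write strategy for a gated clock running at rate_hz. */
static enum lpcg_write_strategy lpcg_pick_strategy(unsigned long rate_hz)
{
	if (rate_hz > 50000000UL)
		return LPCG_PLAIN_WRITE;
	if (rate_hz > 23000000UL)
		return LPCG_READ_THEN_WRITE;
	return LPCG_NOT_USED;
}
```

The sketch only classifies the clock rate; the register accesses themselves would live in the clock driver.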
Re: [PATCH v1 1/1] ARM: dts: rockchip: set crypto default disabled on rk3288
Hi Heiko, On 9/1/2019 07:04, Heiko Stuebner wrote: Hi Elon, Am Donnerstag, 29. August 2019, 13:31:00 CEST schrieb Elon Zhang: On 8/27/2019 22:28, Heiko Stuebner wrote: Am Dienstag, 27. August 2019, 09:14:39 CEST schrieb Elon Zhang: Not every board needs to enable crypto node, so the node should be set default disabled in rk3288.dtsi and enabled in specific board dts file. Can you give a bit more rationale here? There would need to be a very specific reason because of the following: The crypto module is not wired to some board-specific components, so its usability does not depend on the specific board at all. Instead every board can just use it out of the box and the devicetree is supposed to describe the hardware and is _not_ meant as a space for user configuration. Right for almost all normal hardware modules but the crypto module was designed for secure world. As a result, the crypto module will become inaccessible for linux kernel if secure world enable it. We plan to enable the crypto module in secure world so we should set crypto module default disabled in linux kernel. ok ... I'm halfway convinced ;-) . The big thing I want to see is that secure setting in the actual firmware. Aka right now you probably have that in your Rockchip-specific ATF fork and I really want to see the relevant change for public uboot or ATF. I don't necessarily require it to be fully merged before taking this, but I really want to see the change either on a mailing list or atf gerrit instance [that makes the crypto engine secure only]. Rationale behind this is that we don't care very much about private stuff that the general ecosystem doesn't benefit from. Now the crypto security property setting is done in the rockchip private code, which is not opensource. So if you don't care about private stuff and the change in private stuff will not affect the upstream kernel, the crypto can be enabled in upstream kernel? 
Thanks Heiko So in fact the status property should probably go away completely from the crypto node, as it's usable out of the box in all cases. Heiko Signed-off-by: Elon Zhang --- arch/arm/boot/dts/rk3288.dtsi | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/arm/boot/dts/rk3288.dtsi b/arch/arm/boot/dts/rk3288.dtsi index cc893e154fe5..d509aa24177c 100644 --- a/arch/arm/boot/dts/rk3288.dtsi +++ b/arch/arm/boot/dts/rk3288.dtsi @@ -984,7 +984,7 @@ clock-names = "aclk", "hclk", "sclk", "apb_pclk"; resets = <&cru SRST_CRYPTO>; reset-names = "crypto-rst"; - status = "okay"; + status = "disabled"; }; iep_mmu: iommu@ff900800 {
[PATCH] arm64: dts: imx8mn: Add "fsl,imx8mq-src" as src's fallback compatible
i.MX8MN can reuse i.MX8MQ's src driver, so add "fsl,imx8mq-src" as the src node's fallback compatible to enable it.

Signed-off-by: Anson Huang
---
 arch/arm64/boot/dts/freescale/imx8mn.dtsi | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/boot/dts/freescale/imx8mn.dtsi b/arch/arm64/boot/dts/freescale/imx8mn.dtsi
index 785f4c4..d94db95 100644
--- a/arch/arm64/boot/dts/freescale/imx8mn.dtsi
+++ b/arch/arm64/boot/dts/freescale/imx8mn.dtsi
@@ -371,7 +371,7 @@
 			};

 			src: reset-controller@3039 {
-				compatible = "fsl,imx8mn-src", "syscon";
+				compatible = "fsl,imx8mn-src", "fsl,imx8mq-src", "syscon";
 				reg = <0x3039 0x1>;
 				interrupts = ;
 				#reset-cells = <1>;
-- 
2.7.4
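For readers less familiar with fallback compatibles, here is a simplified userspace sketch of the idea. It is not the kernel's matching code: `first_known_compatible()` is a hypothetical helper, and the one-entry driver table is an assumption made only for illustration. It shows that a driver which only knows "fsl,imx8mq-src" binds to the node once the fallback string is listed.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: walk the node's compatible list in order (most
 * specific first) and return the first string the driver also lists,
 * or NULL when nothing matches.
 */
static const char *first_known_compatible(const char *const *node, size_t n_node,
					  const char *const *driver, size_t n_driver)
{
	for (size_t i = 0; i < n_node; i++)
		for (size_t j = 0; j < n_driver; j++)
			if (strcmp(node[i], driver[j]) == 0)
				return node[i];
	return NULL;
}

/* Assumed for illustration: the driver only knows the i.MX8MQ string. */
static const char *const src_driver_table[] = { "fsl,imx8mq-src" };

/* The src node's compatible list before and after the patch. */
static const char *const src_node_before[] = { "fsl,imx8mn-src", "syscon" };
static const char *const src_node_after[] = {
	"fsl,imx8mn-src", "fsl,imx8mq-src", "syscon",
};
```

Calling the helper with `src_node_before` finds no match, while `src_node_after` matches on "fsl,imx8mq-src"; the SoC-specific string stays first so a dedicated driver could still take precedence later.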
[PATCH] proc:fix confusing macro arg name
state_size and ops are in the wrong position, fix it.

Signed-off-by: Miaohe Lin
---
 include/linux/proc_fs.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/linux/proc_fs.h b/include/linux/proc_fs.h
index a705aa2d03f9..0640be56dcbd 100644
--- a/include/linux/proc_fs.h
+++ b/include/linux/proc_fs.h
@@ -58,8 +58,8 @@ extern int remove_proc_subtree(const char *, struct proc_dir_entry *);
 struct proc_dir_entry *proc_create_net_data(const char *name, umode_t mode,
 		struct proc_dir_entry *parent, const struct seq_operations *ops,
 		unsigned int state_size, void *data);
-#define proc_create_net(name, mode, parent, state_size, ops) \
-	proc_create_net_data(name, mode, parent, state_size, ops, NULL)
+#define proc_create_net(name, mode, parent, ops, state_size) \
+	proc_create_net_data(name, mode, parent, ops, state_size, NULL)
 struct proc_dir_entry *proc_create_net_single(const char *name, umode_t mode,
 		struct proc_dir_entry *parent,
 		int (*show)(struct seq_file *, void *), void *data);
-- 
2.21.GIT
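A small sketch of why this is a naming fix rather than a behaviour change: macro parameters are substituted by position, so both spellings forward the same arguments in the same order. The `fake_*` names below are hypothetical stand-ins, not the real proc_fs API.

```c
#include <stddef.h>

/* Hypothetical stand-in for proc_create_net_data(): it only records
 * the two arguments whose macro parameter names were swapped. */
struct fake_entry {
	const void *ops;
	unsigned int state_size;
};

static struct fake_entry fake_create_net_data(const void *ops,
					       unsigned int state_size)
{
	struct fake_entry e = { ops, state_size };
	return e;
}

/* Old spelling: the parameter called "state_size" actually receives ops. */
#define fake_create_net_old(state_size, ops) \
	fake_create_net_data(state_size, ops)

/* New spelling: parameter names agree with what the callee expects. */
#define fake_create_net_new(ops, state_size) \
	fake_create_net_data(ops, state_size)
```

Both macros expand to the same call for the same argument list, which is why existing callers kept working with the confusing names.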
RE: [EXT] Re: [PATCH 3/3] ASoC: fsl_asrc: Fix error with S24_3LE format bitstream in i.MX8
Hi

> On Mon, Sep 09, 2019 at 06:33:21PM -0400, Shengjiu Wang wrote:
> > There is error "aplay: pcm_write:2023: write error: Input/output error"
> > on i.MX8QM/i.MX8QXP platform for S24_3LE format.
> >
> > In i.MX8QM/i.MX8QXP, the DMA is EDMA, which don't support 24bit
> > sample, but we didn't add any constraint, that cause issues.
> >
> > So we need to query the caps of dma, then update the hw parameters
> > according to the caps.
>
> > @@ -285,8 +293,81 @@ static int fsl_asrc_dma_startup(struct snd_pcm_substream *substream)
> >
> >       runtime->private_data = pair;
> >
> > -     snd_pcm_hw_constraint_integer(substream->runtime,
> > -                                   SNDRV_PCM_HW_PARAM_PERIODS);
> > +     ret = snd_pcm_hw_constraint_integer(substream->runtime,
> > +                                         SNDRV_PCM_HW_PARAM_PERIODS);
> > +     if (ret < 0) {
> > +             dev_err(dev, "failed to set pcm hw params periods\n");
> > +             return ret;
> > +     }
> > +
> > +     dma_data = snd_soc_dai_get_dma_data(rtd->cpu_dai, substream);
> > +
> > +     /* Request a temp pair, which is release in the end */
> > +     fsl_asrc_request_pair(1, pair);
>
> Not sure if it'd be practical, but a pair request could fail. Will probably
> need to check return value.
>
> And a quick feeling is that below code is mostly identical to what is in the
> soc-generic-dmaengine-pcm.c file. So I'm wondering if we could abstract a
> helper function somewhere in the ASoC core: Mark?
>
> Thanks
> Nicolin

Yes, it is based on the code in soc-generic-dmaengine-pcm.c. A common API would be helpful.

Best regards
Wang shengjiu
Re: [PATCH 1/2] export.h: remove defined(__KERNEL__)
On Tue, 10 Sep 2019, Masahiro Yamada wrote:

> On Tue, Sep 10, 2019 at 1:06 AM Nicolas Pitre wrote:
> >
> > On Mon, 9 Sep 2019, Masahiro Yamada wrote:
> >
> > > Hi Nicolas,
> > >
> > > On Mon, Sep 9, 2019 at 10:48 PM Nicolas Pitre wrote:
> > > >
> > > > On Mon, 9 Sep 2019, Masahiro Yamada wrote:
> > > >
> > > > > This line was touched by commit f235541699bc ("export.h: allow for
> > > > > per-symbol configurable EXPORT_SYMBOL()"), but the commit log did
> > > > > not explain why.
> > > > >
> > > > > CONFIG_TRIM_UNUSED_KSYMS works for me without defined(__KERNEL__).
> > > >
> > > > I'm pretty sure it was needed back then so not to interfere with users
> > > > of this file. My fault for not documenting it.
> > >
> > > Hmm, I did not see a problem in my quick build test.
> > >
> > > Do you remember which file was causing the problem?
> >
> > If you build commit 7ec925701f5f with CONFIG_TRIM_UNUSED_KSYMS=y and the
> > defined(__KERNEL__) test removed then you'll get:
> >
> >   HOSTCC  scripts/mod/modpost.o
> > In file included from scripts/mod/modpost.c:24:
> > scripts/mod/../../include/linux/export.h:81:10: fatal error:
> > linux/kconfig.h: No such file or directory
> >
> >
> > Nicolas
>
>
> Thanks for explaining this.
>
> It is not the case any more.
>
>
> I will reword the commit message as follows:
>
> >8---
> export.h: remove defined(__KERNEL__), which is no longer needed
>
> The conditional defined(__KERNEL__) was added by commit f235541699bc
> ("export.h: allow for per-symbol configurable EXPORT_SYMBOL()").
>
> It was needed at that time to avoid the build error of modpost
> with CONFIG_TRIM_UNUSED_KSYMS=y.
>
> Since commit b2c5cdcfd4bc ("modpost: remove symbol prefix support"),
> modpost no longer includes linux/export.h, thus the defined(__KERNEL__)
> is unneeded.
> >8---
>

Acked-by: Nicolas Pitre


Nicolas
RE: [EXT] Re: [PATCH 1/3] ASoC: fsl_asrc: Use in(out)put_format instead of in(out)put_word_width
Hi

> On Mon, Sep 09, 2019 at 06:33:19PM -0400, Shengjiu Wang wrote:
> > snd_pcm_format_t is more formal than enum asrc_word_width, which has
> > two property, width and physical width, which is more accurate than
> > enum asrc_word_width. So it is better to use in(out)put_format instead
> > of in(out)put_word_width.
>
> Hmm...I don't really see the benefit of using snd_pcm_format_t here...I
> mean, I know it's a generic one, and would understand if we use it as a
> param for a common API. But this patch merely packs the "width" by
> intentionally using this snd_pcm_format_t and then adds another
> translation to unpack it.. I feel it's a bit overcomplicated. Or am I missing
> something?
>
> And I feel it's not necessary to use ALSA common format in our own "struct
> asrc_config" since it is more IP/register specific.
>
> Thanks
> Nicolin

As you know, we have another M2M function internally. When the user wants to set the format through the M2M API, it is better to use snd_pcm_format_t instead of the width, because snd_pcm_format_t carries two properties: data width and physical width. Some places in the driver need the data width and others need the physical width. For example, to distinguish S24_LE from S24_3LE in the driver, the DMA setting needs the physical width, but the ASRC needs the data width.

Another purpose is that we have a newly designed ASRC which supports more formats, and I would like it to share the same API with this ASRC. With snd_pcm_format_t we can use the common API, such as snd_pcm_format_linear and snd_pcm_format_big_endian, to get the properties of the format that the driver needs.

Best regards
Wang shengjiu
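To make the S24_LE versus S24_3LE point concrete, here is a minimal userspace sketch. The table and `fmt_lookup()` are hypothetical helpers written for this note; the numbers are the usual ALSA values that snd_pcm_format_width() and snd_pcm_format_physical_width() report for these formats.

```c
#include <stddef.h>
#include <string.h>

struct fmt_widths {
	const char *name;
	int width;          /* significant bits per sample */
	int physical_width; /* bits each sample occupies in memory */
};

/* Illustrative subset only; not a copy of any kernel table. */
static const struct fmt_widths fmt_table[] = {
	{ "S16_LE",  16, 16 },
	{ "S24_LE",  24, 32 },
	{ "S24_3LE", 24, 24 },
};

/* Hypothetical lookup by format name; returns NULL if unknown. */
static const struct fmt_widths *fmt_lookup(const char *name)
{
	for (size_t i = 0; i < sizeof(fmt_table) / sizeof(fmt_table[0]); i++)
		if (strcmp(fmt_table[i].name, name) == 0)
			return &fmt_table[i];
	return NULL;
}
```

The two 24-bit formats have the same data width, so the ASRC word-width setting is identical, but their physical widths differ (32 versus 24 bits), which is what the DMA configuration has to follow.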
Re: [PATCH] mm: avoid slub allocation while holding list_lock
On Tue, Sep 10, 2019 at 10:41:31AM +0900, Tetsuo Handa wrote:
> Yu Zhao wrote:
> > I think we can safely assume PAGE_SIZE is unsigned long aligned and
> > page->objects is non-zero. But if you don't feel comfortable with these
> > assumptions, I'd be happy to ensure them explicitly.
>
> I know PAGE_SIZE is unsigned long aligned. If someone by chance happens to
> change from "dynamic allocation" to "on stack", get_order() will no longer
> be called and the bug will show up.
>
> I don't know whether __get_free_page(GFP_ATOMIC) can temporarily consume more
> than 4096 bytes, but if it can, we might want to avoid "dynamic allocation".

With GFP_ATOMIC and ~__GFP_HIGHMEM, it shouldn't.

> By the way, if "struct kmem_cache_node" is object which won't have many
> thousands of instances, can't we embed that buffer into
> "struct kmem_cache_node" because max size of that buffer is only 4096 bytes?

It seems to me allocation in error path is better than always keeping a page around. But the latter may still be acceptable given it's done only when debug is on and, of course, on a per-node scale.
Re: [PATCH 1/2] export.h: remove defined(__KERNEL__)
On Tue, Sep 10, 2019 at 1:06 AM Nicolas Pitre wrote:
>
> On Mon, 9 Sep 2019, Masahiro Yamada wrote:
>
> > Hi Nicolas,
> >
> > On Mon, Sep 9, 2019 at 10:48 PM Nicolas Pitre wrote:
> > >
> > > On Mon, 9 Sep 2019, Masahiro Yamada wrote:
> > >
> > > > This line was touched by commit f235541699bc ("export.h: allow for
> > > > per-symbol configurable EXPORT_SYMBOL()"), but the commit log did
> > > > not explain why.
> > > >
> > > > CONFIG_TRIM_UNUSED_KSYMS works for me without defined(__KERNEL__).
> > >
> > > I'm pretty sure it was needed back then so not to interfere with users
> > > of this file. My fault for not documenting it.
> >
> > Hmm, I did not see a problem in my quick build test.
> >
> > Do you remember which file was causing the problem?
>
> If you build commit 7ec925701f5f with CONFIG_TRIM_UNUSED_KSYMS=y and the
> defined(__KERNEL__) test removed then you'll get:
>
>   HOSTCC  scripts/mod/modpost.o
> In file included from scripts/mod/modpost.c:24:
> scripts/mod/../../include/linux/export.h:81:10: fatal error: linux/kconfig.h:
> No such file or directory
>
>
> Nicolas

Thanks for explaining this.

It is not the case any more.

I will reword the commit message as follows:

>8---
export.h: remove defined(__KERNEL__), which is no longer needed

The conditional defined(__KERNEL__) was added by commit f235541699bc
("export.h: allow for per-symbol configurable EXPORT_SYMBOL()").

It was needed at that time to avoid the build error of modpost
with CONFIG_TRIM_UNUSED_KSYMS=y.

Since commit b2c5cdcfd4bc ("modpost: remove symbol prefix support"),
modpost no longer includes linux/export.h, thus the defined(__KERNEL__)
is unneeded.
>8---

-- 
Best Regards
Masahiro Yamada
Re: [PATCH 2/3] ASoC: fsl_asrc: update supported sample format
Hi

> On Mon, Sep 09, 2019 at 06:33:20PM -0400, Shengjiu Wang wrote:
> > The ASRC support 24bit/16bit/8bit input width, so S20_3LE format
> > should not be supported, it is word width is 20bit.
>
> I thought 3LE used 24-bit physical width. And the driver assigns
> ASRC_WIDTH_24_BIT to "width" for all non-16bit cases, so 20-bit would go
> for that 24-bit slot also. I don't clearly recall if I had explicitly tested
> S20_3LE, but I feel it should work since I put there...
>
> Thanks
> Nicolin

For S20_3LE, the data width is 20 bits, but the ASRC only supports 24 bit. If we set ASRMCR1n.IWD to 24 bit while the actual width is 20 bits, the volume is lower than expected: it behaves like 24-bit data shifted right by 4 bits. So it is not supported.

Best regards
Wang shengjiu
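The volume drop described above can be checked with a little arithmetic. This is only a sketch of the numbers, not driver code, and both helper names are made up for the example: a 20-bit sample left in the low bits of a 24-bit slot peaks at about 1/16 of 24-bit full scale.

```c
#include <stdint.h>

/* Largest positive value of a signed sample that is `bits` wide. */
static int32_t full_scale(int bits)
{
	return (int32_t)((UINT32_C(1) << (bits - 1)) - 1);
}

/*
 * Factor by which a `data_bits` sample is too quiet when it is
 * interpreted, unshifted, as a `slot_bits` sample.
 */
static int32_t attenuation_factor(int data_bits, int slot_bits)
{
	return (int32_t)(UINT32_C(1) << (slot_bits - data_bits));
}
```

For 20-bit data in a 24-bit slot the factor is 16, i.e. the 4-bit right shift mentioned in the reply, or roughly 24 dB of attenuation.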
[PATCH] reset: uniphier-glue: Add Pro5 USB3 support
Pro5 SoC has same scheme of USB3 reset as Pro4, so the data for Pro5 is equivalent to Pro4.

Signed-off-by: Kunihiko Hayashi
---
 Documentation/devicetree/bindings/reset/uniphier-reset.txt | 5 +++--
 drivers/reset/reset-uniphier-glue.c                        | 4
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/Documentation/devicetree/bindings/reset/uniphier-reset.txt b/Documentation/devicetree/bindings/reset/uniphier-reset.txt
index ea00517..e320a8c 100644
--- a/Documentation/devicetree/bindings/reset/uniphier-reset.txt
+++ b/Documentation/devicetree/bindings/reset/uniphier-reset.txt
@@ -130,6 +130,7 @@ this layer. These clocks and resets should be described in each property.
 Required properties:
 - compatible: Should be
     "socionext,uniphier-pro4-usb3-reset" - for Pro4 SoC USB3
+    "socionext,uniphier-pro5-usb3-reset" - for Pro5 SoC USB3
     "socionext,uniphier-pxs2-usb3-reset" - for PXs2 SoC USB3
     "socionext,uniphier-ld20-usb3-reset" - for LD20 SoC USB3
     "socionext,uniphier-pxs3-usb3-reset" - for PXs3 SoC USB3
@@ -141,12 +142,12 @@ Required properties:
 - clocks: A list of phandles to the clock gate for the glue layer.
 	According to the clock-names, appropriate clocks are required.
 - clock-names: Should contain
-    "gio", "link" - for Pro4 SoC
+    "gio", "link" - for Pro4 and Pro5 SoCs
     "link"        - for others
 - resets: A list of phandles to the reset control for the glue layer.
 	According to the reset-names, appropriate resets are required.
 - reset-names: Should contain
-    "gio", "link" - for Pro4 SoC
+    "gio", "link" - for Pro4 and Pro5 SoCs
     "link"        - for others

 Example:
diff --git a/drivers/reset/reset-uniphier-glue.c b/drivers/reset/reset-uniphier-glue.c
index a45923f..2b188b3bb 100644
--- a/drivers/reset/reset-uniphier-glue.c
+++ b/drivers/reset/reset-uniphier-glue.c
@@ -141,6 +141,10 @@ static const struct of_device_id uniphier_glue_reset_match[] = {
 		.data = &uniphier_pro4_data,
 	},
 	{
+		.compatible = "socionext,uniphier-pro5-usb3-reset",
+		.data = &uniphier_pro4_data,
+	},
+	{
 		.compatible = "socionext,uniphier-pxs2-usb3-reset",
 		.data = &uniphier_pxs2_data,
 	},
-- 
2.7.4
Re: [PATCH 3/3] ASoC: fsl_asrc: Fix error with S24_3LE format bitstream in i.MX8
On Mon, Sep 09, 2019 at 06:33:21PM -0400, Shengjiu Wang wrote: > There is error "aplay: pcm_write:2023: write error: Input/output error" > on i.MX8QM/i.MX8QXP platform for S24_3LE format. > > In i.MX8QM/i.MX8QXP, the DMA is EDMA, which don't support 24bit > sample, but we didn't add any constraint, that cause issues. > > So we need to query the caps of dma, then update the hw parameters > according to the caps. > @@ -285,8 +293,81 @@ static int fsl_asrc_dma_startup(struct snd_pcm_substream > *substream) > > runtime->private_data = pair; > > - snd_pcm_hw_constraint_integer(substream->runtime, > - SNDRV_PCM_HW_PARAM_PERIODS); > + ret = snd_pcm_hw_constraint_integer(substream->runtime, > + SNDRV_PCM_HW_PARAM_PERIODS); > + if (ret < 0) { > + dev_err(dev, "failed to set pcm hw params periods\n"); > + return ret; > + } > + > + dma_data = snd_soc_dai_get_dma_data(rtd->cpu_dai, substream); > + > + /* Request a temp pair, which is release in the end */ > + fsl_asrc_request_pair(1, pair); Not sure if it'd be practical, but a pair request could fail. Will probably need to check return value. And a quick feeling is that below code is mostly identical to what is in the soc-generic-dmaengine-pcm.c file. So I'm wondering if we could abstract a helper function somewhere in the ASoC core: Mark? 
Thanks Nicolin > + tmp_chan = fsl_asrc_get_dma_channel(pair, dir); > + if (!tmp_chan) { > + dev_err(dev, "can't get dma channel\n"); > + return -EINVAL; > + } > + > + ret = dma_get_slave_caps(tmp_chan, &dma_caps); > + if (ret == 0) { > + if (dma_caps.cmd_pause) > + snd_imx_hardware.info |= SNDRV_PCM_INFO_PAUSE | > + SNDRV_PCM_INFO_RESUME; > + if (dma_caps.residue_granularity <= > + DMA_RESIDUE_GRANULARITY_SEGMENT) > + snd_imx_hardware.info |= SNDRV_PCM_INFO_BATCH; > + > + if (substream->stream == SNDRV_PCM_STREAM_PLAYBACK) > + addr_widths = dma_caps.dst_addr_widths; > + else > + addr_widths = dma_caps.src_addr_widths; > + } > + > + /* > + * If SND_DMAENGINE_PCM_DAI_FLAG_PACK is set keep > + * hw.formats set to 0, meaning no restrictions are in place. > + * In this case it's the responsibility of the DAI driver to > + * provide the supported format information. > + */ > + if (!(dma_data->flags & SND_DMAENGINE_PCM_DAI_FLAG_PACK)) > + /* > + * Prepare formats mask for valid/allowed sample types. If the > + * dma does not have support for the given physical word size, > + * it needs to be masked out so user space can not use the > + * format which produces corrupted audio. > + * In case the dma driver does not implement the slave_caps the > + * default assumption is that it supports 1, 2 and 4 bytes > + * widths. > + */ > + for (i = 0; i <= SNDRV_PCM_FORMAT_LAST; i++) { > + int bits = snd_pcm_format_physical_width(i); > + > + /* > + * Enable only samples with DMA supported physical > + * widths > + */ > + switch (bits) { > + case 8: > + case 16: > + case 24: > + case 32: > + case 64: > + if (addr_widths & (1 << (bits / 8))) > + snd_imx_hardware.formats |= (1LL << i); > + break; > + default: > + /* Unsupported types */ > + break; > + } > + } > + > + if (tmp_chan) > + dma_release_channel(tmp_chan); > + fsl_asrc_release_pair(pair); > + > snd_soc_set_runtime_hwparams(substream, &snd_imx_hardware); > > return 0; > -- > 2.21.0 >
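For anyone following the quoted loop, here is a standalone sketch of the masking step, written as plain C outside the kernel. `build_format_mask()` and the caller-supplied `phys_bits` array are hypothetical; the bit layout assumes, as in the quoted code, that bit N of `addr_widths` means "N-byte accesses supported".

```c
#include <stddef.h>
#include <stdint.h>

/*
 * phys_bits[i] is the physical width in bits of format number i.
 * Returns a mask with bit i set only when the DMA can move samples of
 * that physical size.
 */
static uint64_t build_format_mask(const int *phys_bits, size_t n_formats,
				  uint32_t addr_widths)
{
	uint64_t formats = 0;

	for (size_t i = 0; i < n_formats; i++) {
		int bits = phys_bits[i];

		switch (bits) {
		case 8:
		case 16:
		case 24:
		case 32:
		case 64:
			/* bits / 8 is the access size in bytes */
			if (addr_widths & (UINT32_C(1) << (bits / 8)))
				formats |= (UINT64_C(1) << i);
			break;
		default:
			/* Unsupported physical width: leave masked out */
			break;
		}
	}
	return formats;
}
```

With a DMA that reports 1-, 2- and 4-byte widths but no 3-byte width, a format with a 24-bit physical width (such as S24_3LE) drops out of the mask, which is the constraint the patch is after.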
Re: [RFC PATCH untested] vhost: block speculation of translated descriptors
On 2019/9/9 10:45 PM, Michael S. Tsirkin wrote:

On Mon, Sep 09, 2019 at 03:19:55PM +0800, Jason Wang wrote:

On 2019/9/8 7:05 PM, Michael S. Tsirkin wrote:

iovec addresses coming from vhost are assumed to be pre-validated, but in fact can be speculated to a value out of range.

Userspace addresses are later validated with array_index_nospec so we can be sure kernel info does not leak through these addresses, but vhost must also not leak userspace info outside the allowed memory table to guests.

Following the defence in depth principle, make sure the address is not validated out of node range.

Signed-off-by: Michael S. Tsirkin
---
 drivers/vhost/vhost.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 5dc174ac8cac..0ee375fb7145 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -2072,7 +2072,9 @@ static int translate_desc(struct vhost_virtqueue *vq, u64 addr, u32 len,
 		size = node->size - addr + node->start;
 		_iov->iov_len = min((u64)len - s, size);
 		_iov->iov_base = (void __user *)(unsigned long)
-			(node->userspace_addr + addr - node->start);
+			(node->userspace_addr +
+			 array_index_nospec(addr - node->start,
+					    node->size));
 		s += size;
 		addr += size;
 		++ret;

I've tried this on Kaby Lake smap off metadata acceleration off using testpmd (virtio-user) + vhost_net. I don't see obvious performance difference with TX PPS.

Thanks

Should I push this to Linus right now then? It's a security thing so maybe we better do it ASAP ... what's your opinion?

Yes, you can.

Acked-by: Jason Wang
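As background on what the added call does, below is a userspace sketch modelled on the generic C fallback of array_index_mask_nospec() in include/linux/nospec.h, as far as I remember it. The `*_sketch` names are mine, the kernel version additionally hides the index from the optimizer, and some architectures use their own assembly instead, so treat this as an illustration of the arithmetic only.

```c
#include <limits.h>

#define BITS_PER_LONG_SKETCH (sizeof(long) * CHAR_BIT)

/*
 * Returns ~0UL when index < size and 0 otherwise, without a branch.
 * Assumes two's complement and an arithmetic right shift of negative
 * values (true for gcc and clang), and size <= LONG_MAX.
 */
static unsigned long index_mask_nospec_sketch(unsigned long index,
					      unsigned long size)
{
	return ~(long)(index | (size - 1UL - index)) >>
	       (BITS_PER_LONG_SKETCH - 1);
}

/* Clamp: in-range indexes pass through, out-of-range ones become 0. */
static unsigned long index_nospec_sketch(unsigned long index,
					 unsigned long size)
{
	return index & index_mask_nospec_sketch(index, size);
}
```

In the patch, `index` corresponds to `addr - node->start` and `size` to `node->size`, so an offset speculated past the end of the node collapses to offset 0 inside the same node rather than pointing outside it.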
[PATCH] regulator: uniphier: Add Pro5 USB3 VBUS support
Pro5 SoC has same scheme of USB3 VBUS as Pro4, so the data for Pro5 is equivalent to Pro4.

Signed-off-by: Kunihiko Hayashi
---
 Documentation/devicetree/bindings/regulator/uniphier-regulator.txt | 5 +++--
 drivers/regulator/uniphier-regulator.c                             | 4
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/Documentation/devicetree/bindings/regulator/uniphier-regulator.txt b/Documentation/devicetree/bindings/regulator/uniphier-regulator.txt
index c9919f4..94fd38b 100644
--- a/Documentation/devicetree/bindings/regulator/uniphier-regulator.txt
+++ b/Documentation/devicetree/bindings/regulator/uniphier-regulator.txt
@@ -13,6 +13,7 @@ this layer. These clocks and resets should be described in each property.
 Required properties:
 - compatible: Should be
     "socionext,uniphier-pro4-usb3-regulator" - for Pro4 SoC
+    "socionext,uniphier-pro5-usb3-regulator" - for Pro5 SoC
     "socionext,uniphier-pxs2-usb3-regulator" - for PXs2 SoC
     "socionext,uniphier-ld20-usb3-regulator" - for LD20 SoC
     "socionext,uniphier-pxs3-usb3-regulator" - for PXs3 SoC
@@ -20,12 +21,12 @@ Required properties:
 - clocks: A list of phandles to the clock gate for USB3 glue layer.
 	According to the clock-names, appropriate clocks are required.
 - clock-names: Should contain
-    "gio", "link" - for Pro4 SoC
+    "gio", "link" - for Pro4 and Pro5 SoCs
     "link"        - for others
 - resets: A list of phandles to the reset control for USB3 glue layer.
 	According to the reset-names, appropriate resets are required.
 - reset-names: Should contain
-    "gio", "link" - for Pro4 SoC
+    "gio", "link" - for Pro4 and Pro5 SoCs
     "link"        - for others

 See Documentation/devicetree/bindings/regulator/regulator.txt
diff --git a/drivers/regulator/uniphier-regulator.c b/drivers/regulator/uniphier-regulator.c
index 9026d5a..2311924 100644
--- a/drivers/regulator/uniphier-regulator.c
+++ b/drivers/regulator/uniphier-regulator.c
@@ -186,6 +186,10 @@ static const struct of_device_id uniphier_regulator_match[] = {
 		.data = &uniphier_pro4_usb3_data,
 	},
 	{
+		.compatible = "socionext,uniphier-pro5-usb3-regulator",
+		.data = &uniphier_pro4_usb3_data,
+	},
+	{
 		.compatible = "socionext,uniphier-pxs2-usb3-regulator",
 		.data = &uniphier_pxs2_usb3_data,
 	},
-- 
2.7.4