date:20190909

Re: [PATCH] serial/sifive: select SERIAL_EARLYCON

2019-09-09 Thread Andreas Schwab

On Sep 10 2019, Christoph Hellwig  wrote:

> The sifive serial driver implements earlycon support,

It should probably be documented in admin-guide/kernel-parameters.txt.

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."

Re: [PATCH] media: vimc: fla: Add virtual flash subdevice

2019-09-09 Thread Hans Verkuil

On 9/10/19 1:00 AM, Lucas Magalhães wrote:
> Hi Hans,
> Thanks for the review. I fixed most of the issues you found. Just have
> the question below.
> 
> On Mon, Sep 2, 2019 at 9:04 AM Hans Verkuil  wrote:
>>
>>> +
>>> +int vimc_fla_add(struct vimc_device *vimc, struct vimc_ent_config *vcfg)
>>> +{
>>> + struct v4l2_device *v4l2_dev = &vimc->v4l2_dev;
>>> + struct vimc_fla_device *vfla;
>>> + int ret;
>>> +
>>> + /* Allocate the vfla struct */
>>> + vfla = kzalloc(sizeof(*vfla), GFP_KERNEL);
>>> + if (!vfla)
>>> + return -ENOMEM;
>>> +
>>> + v4l2_ctrl_handler_init(&vfla->hdl, 4);
>>> +
>>> + v4l2_ctrl_new_std_menu(&vfla->hdl, &vimc_fla_ctrl_ops,
>>> +V4L2_CID_FLASH_LED_MODE,
>>> +V4L2_FLASH_LED_MODE_TORCH, ~0x7,
>>> +V4L2_FLASH_LED_MODE_NONE);
>>> + v4l2_ctrl_new_std_menu(&vfla->hdl, &vimc_fla_ctrl_ops,
>>> +V4L2_CID_FLASH_STROBE_SOURCE, 0x1, ~0x3,
>>> +V4L2_FLASH_STROBE_SOURCE_SOFTWARE);
>>> + v4l2_ctrl_new_std(&vfla->hdl, &vimc_fla_ctrl_ops,
>>> +   V4L2_CID_FLASH_STROBE, 0, 0, 0, 0);
>>> + v4l2_ctrl_new_std(&vfla->hdl, &vimc_fla_ctrl_ops,
>>> +   V4L2_CID_FLASH_STROBE_STOP, 0, 0, 0, 0);
>>> + v4l2_ctrl_new_std(&vfla->hdl, &vimc_fla_ctrl_ops,
>>> +   V4L2_CID_FLASH_TIMEOUT, 1, 10, 1, 10);
>>> + v4l2_ctrl_new_std(&vfla->hdl, &vimc_fla_ctrl_ops,
>>> +   V4L2_CID_FLASH_TORCH_INTENSITY, 0, 255, 1, 255);
>>> + v4l2_ctrl_new_std(&vfla->hdl, &vimc_fla_ctrl_ops,
>>> +   V4L2_CID_FLASH_INTENSITY, 0, 255, 1, 255);
>>> + v4l2_ctrl_new_std(&vfla->hdl, &vimc_fla_ctrl_ops,
>>> +   V4L2_CID_FLASH_INDICATOR_INTENSITY, 0, 255, 1, 255);
>>> + v4l2_ctrl_new_std(&vfla->hdl, &vimc_fla_ctrl_ops,
>>> +   V4L2_CID_FLASH_STROBE_STATUS, 0, 0, 0, 0);
>>
>> It would be nice if this would actually reflect the actual strobe status.
>>
> Regarding the strobe status, I was reading the code and found out that
> V4L2_CID_FLASH_STROBE_STATUS is a V4L2_CTRL_FLAG_READ_ONLY control
> but not a V4L2_CTRL_FLAG_VOLATILE one. I found this intriguing. How am
> I supposed to get it if it's not volatile? As I understand it, it changes
> over time when the strobe starts and the timeout expires, doesn't it?
> Shouldn't it be volatile if so?

A non-volatile read-only control is set deterministically by the driver.
So the driver calls v4l2_ctrl_s_ctrl() to change the control's value.

A volatile read-only control is one where the value is read from a hardware
register that is continuously changing. E.g. if autogain is on, then the gain
register in a device contains the currently calculated gain, but that might be
changed the next time the register is read.
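
The distinction above can be sketched with a small user-space model. This is
only an illustration under stated assumptions: the names model_ctrl,
model_driver_set(), model_get() and fake_gain_register() are hypothetical and
are not part of the V4L2 control framework. A non-volatile control returns the
value the driver last stored; a volatile one asks the "hardware" on every read.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space model; not the V4L2 control framework API. */
struct model_ctrl {
	int32_t cached;                  /* value last stored by the driver */
	int is_volatile;                 /* models V4L2_CTRL_FLAG_VOLATILE */
	int32_t (*read_hw)(void *priv);  /* only used for volatile controls */
	void *priv;
};

/* Driver side: models the driver pushing a new value into the control. */
void model_driver_set(struct model_ctrl *c, int32_t val)
{
	assert(c != NULL);
	c->cached = val;
}

/* Reader side: models a "get control" request from user space. */
int32_t model_get(struct model_ctrl *c)
{
	assert(c != NULL);
	if (c->is_volatile && c->read_hw)
		return c->read_hw(c->priv);  /* re-read on every access */
	return c->cached;                    /* deterministic, driver-set value */
}

/* Stand-in for a register whose content changes between reads. */
int32_t fake_gain_register(void *priv)
{
	int32_t *reg = priv;

	return (*reg)++;
}
```

In this model a strobe-status control would be the non-volatile kind: the
driver stores 1 when the strobe starts and 0 when the timeout fires, and
readers simply see the stored value.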

Regards,

Hans

> 
> I've already made a simple implementation where V4L2_CID_FLASH_STROBE_STATUS
> returns true after calling V4L2_CID_FLASH_STROBE and becomes false after the
> timeout time passes.
> 
> Thanks!
>

[tip: core/objtool] objtool: Clobber user CFLAGS variable

2019-09-09 Thread tip-bot2 for Josh Poimboeuf

The following commit has been merged into the core/objtool branch of tip:

Commit-ID:     f73b3cc39c84220e6dccd463b5c8279b03514646
Gitweb:        https://git.kernel.org/tip/f73b3cc39c84220e6dccd463b5c8279b03514646
Author:        Josh Poimboeuf 
AuthorDate:    Thu, 29 Aug 2019 18:28:49 -05:00
Committer:     Ingo Molnar 
CommitterDate: Tue, 10 Sep 2019 08:49:52 +02:00

objtool: Clobber user CFLAGS variable

If the build user has the CFLAGS variable set in their environment,
objtool blindly appends to it, which can cause unexpected behavior.

Clobber CFLAGS to ensure consistent objtool compilation behavior.

Reported-by: Valdis Kletnieks 
Tested-by: Valdis Kletnieks 
Signed-off-by: Josh Poimboeuf 
Cc: Linus Torvalds 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Link: https://lkml.kernel.org/r/83a276df209962e6058fcb6c615eef9d401c21bc.1567121311.git.jpoim...@redhat.com
Signed-off-by: Ingo Molnar 
---
 tools/objtool/Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/objtool/Makefile b/tools/objtool/Makefile
index 8815823..20f67fc 100644
--- a/tools/objtool/Makefile
+++ b/tools/objtool/Makefile
@@ -35,7 +35,7 @@ INCLUDES := -I$(srctree)/tools/include \
 	-I$(srctree)/tools/arch/$(HOSTARCH)/include/uapi \
 	-I$(srctree)/tools/objtool/arch/$(ARCH)/include
 WARNINGS := $(EXTRA_WARNINGS) -Wno-switch-default -Wno-switch-enum -Wno-packed
-CFLAGS   += -Werror $(WARNINGS) $(KBUILD_HOSTCFLAGS) -g $(INCLUDES) $(LIBELF_FLAGS)
+CFLAGS   := -Werror $(WARNINGS) $(KBUILD_HOSTCFLAGS) -g $(INCLUDES) $(LIBELF_FLAGS)
 LDFLAGS  += $(LIBELF_LIBS) $(LIBSUBCMD) $(KBUILD_HOSTLDFLAGS)
 
 # Allow old libelf to be used:

[PATCH net 2/2] sctp: destroy bucket if failed to bind addr

2019-09-09 Thread Mao Wenan

There is one memory leak bug report:
BUG: memory leak
unreferenced object 0x8881dc4c5ec0 (size 40):
  comm "syz-executor.0", pid 5673, jiffies 4298198457 (age 27.578s)
  hex dump (first 32 bytes):
02 00 00 00 81 88 ff ff 00 00 00 00 00 00 00 00  
f8 63 3d c1 81 88 ff ff 00 00 00 00 00 00 00 00  .c=.
  backtrace:
[<72006339>] sctp_get_port_local+0x2a1/0xa00 [sctp]
[] sctp_do_bind+0x176/0x2c0 [sctp]
[<5be274a2>] sctp_bind+0x5a/0x80 [sctp]
[] inet6_bind+0x59/0xd0 [ipv6]
[] __sys_bind+0x120/0x1f0 net/socket.c:1647
[<4513635b>] __do_sys_bind net/socket.c:1658 [inline]
[<4513635b>] __se_sys_bind net/socket.c:1656 [inline]
[<4513635b>] __x64_sys_bind+0x3e/0x50 net/socket.c:1656
[<61f2501e>] do_syscall_64+0x72/0x2e0 arch/x86/entry/common.c:296
[<03d1e05e>] entry_SYSCALL_64_after_hwframe+0x49/0xbe

This happens because, in sctp_do_bind(), sctp_get_port_local() may
successfully create a hash bucket, after which sctp_add_bind_addr() can
fail to bind the address (e.g. return -ENOMEM). The allocated bucket is
then leaked, so it needs to be destroyed on that error path.
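
The error-path pattern described above can be sketched in user space as
follows. This is a minimal model, not the SCTP code: struct model_sock,
model_get_port(), model_put_port() and model_add_bind_addr() are hypothetical
stand-ins for the socket, sctp_get_port_local(), sctp_put_port() and
sctp_add_bind_addr(), and fail_add_addr is just a knob to force the failure.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for the socket and its port hash bucket. */
struct model_sock {
	void *bucket;       /* resource acquired by the first step */
	int fail_add_addr;  /* knob: force the second step to fail */
};

/* Step 1: acquire the bucket (models the port lookup/creation). */
int model_get_port(struct model_sock *sk)
{
	sk->bucket = malloc(40);
	return sk->bucket ? 0 : -ENOMEM;
}

/* Undo step 1. */
void model_put_port(struct model_sock *sk)
{
	free(sk->bucket);
	sk->bucket = NULL;
}

/* Step 2: may fail after step 1 has already succeeded. */
int model_add_bind_addr(struct model_sock *sk)
{
	return sk->fail_add_addr ? -ENOMEM : 0;
}

int model_do_bind(struct model_sock *sk)
{
	int ret;

	assert(sk != NULL);
	if (model_get_port(sk))
		return -EADDRINUSE;

	ret = model_add_bind_addr(sk);
	if (ret) {
		model_put_port(sk);  /* without this, the bucket leaks */
		return ret;
	}
	return 0;
}
```

The point of the sketch is only the ordering: once the first step has taken a
resource, every later failure has to release it before returning.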

Reported-by: Hulk Robot 
Signed-off-by: Mao Wenan 
---
 net/sctp/socket.c | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index 766b68b55ebe..ab37fc1f7bb6 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -412,11 +412,13 @@ static int sctp_do_bind(struct sock *sk, union sctp_addr *addr, int len)
ret = sctp_add_bind_addr(bp, addr, af->sockaddr_len,
 SCTP_ADDR_SRC, GFP_ATOMIC);
 
-   /* Copy back into socket for getsockname() use. */
-   if (!ret) {
-   inet_sk(sk)->inet_sport = htons(inet_sk(sk)->inet_num);
-   sp->pf->to_sk_saddr(addr, sk);
+   if (ret) {
+   sctp_put_port(sk);
+   return ret;
}
+   /* Copy back into socket for getsockname() use. */
+   inet_sk(sk)->inet_sport = htons(inet_sk(sk)->inet_num);
+   sp->pf->to_sk_saddr(addr, sk);
 
return ret;
 }
-- 
2.20.1

[PATCH net 0/2] fix memory leak for sctp_do_bind

2019-09-09 Thread Mao Wenan

The first patch is a cleanup that removes a redundant assignment;
the second patch fixes a memory leak in sctp_do_bind() when binding
the address fails.

Mao Wenan (2):
  sctp: remove redundant assignment when call sctp_get_port_local
  sctp: destroy bucket if failed to bind addr

 net/sctp/socket.c | 13 +++--
 1 file changed, 7 insertions(+), 6 deletions(-)

-- 
2.20.1

[PATCH net 1/2] sctp: remove redundant assignment when call sctp_get_port_local

2019-09-09 Thread Mao Wenan

The if clause that calls sctp_get_port_local() in sctp_do_bind() has
redundant parentheses and a redundant assignment to 'ret'. This patch
cleans both up.

Signed-off-by: Mao Wenan 
---
 net/sctp/socket.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index 9d1f83b10c0a..766b68b55ebe 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -399,9 +399,8 @@ static int sctp_do_bind(struct sock *sk, union sctp_addr *addr, int len)
 * detection.
 */
addr->v4.sin_port = htons(snum);
-   if ((ret = sctp_get_port_local(sk, addr))) {
+   if (sctp_get_port_local(sk, addr))
return -EADDRINUSE;
-   }
 
/* Refresh ephemeral port.  */
if (!bp->port)
-- 
2.20.1

[PATCH v3][RESEND] scripts: use pkg-config to locate libcrypto

2019-09-09 Thread Rolf Eike Beer

Otherwise build fails if the headers are not in the default location. While at
it also ask pkg-config for the libs, with fallback to the existing value.

Signed-off-by: Rolf Eike Beer 
Cc: sta...@vger.kernel.org
---
 scripts/Makefile | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/scripts/Makefile b/scripts/Makefile
index 16bcb8087899..1715adcd8f81 100644
--- a/scripts/Makefile
+++ b/scripts/Makefile
@@ -8,7 +8,11 @@
 # conmakehash:   Create chartable
 # conmakehash:  Create arrays for initializing the kernel console tables
 
+PKG_CONFIG?= pkg-config
+
 HOST_EXTRACFLAGS += -I$(srctree)/tools/include
+CRYPTO_LIBS = $(shell $(PKG_CONFIG) --libs libcrypto 2> /dev/null || echo -lcrypto)
+CRYPTO_CFLAGS = $(shell $(PKG_CONFIG) --cflags libcrypto 2> /dev/null)
 
 hostprogs-$(CONFIG_BUILD_BIN2C)  += bin2c
 hostprogs-$(CONFIG_KALLSYMS) += kallsyms
@@ -23,8 +27,9 @@ hostprogs-$(CONFIG_SYSTEM_EXTRA_CERTIFICATE) += insert-sys-cert
 
 HOSTCFLAGS_sortextable.o = -I$(srctree)/tools/include
 HOSTCFLAGS_asn1_compiler.o = -I$(srctree)/include
-HOSTLDLIBS_sign-file = -lcrypto
-HOSTLDLIBS_extract-cert = -lcrypto
+HOSTLDLIBS_sign-file = $(CRYPTO_LIBS)
+HOSTCFLAGS_extract-cert.o = $(CRYPTO_CFLAGS)
+HOSTLDLIBS_extract-cert = $(CRYPTO_LIBS)
 
 always := $(hostprogs-y) $(hostprogs-m)
 
-- 
2.23.0

Re: [RFC PATCH untested] vhost: block speculation of translated descriptors

2019-09-09 Thread Michael S. Tsirkin

On Tue, Sep 10, 2019 at 09:52:10AM +0800, Jason Wang wrote:
> 
> On 2019/9/9 10:45 PM, Michael S. Tsirkin wrote:
> > On Mon, Sep 09, 2019 at 03:19:55PM +0800, Jason Wang wrote:
> > > On 2019/9/8 7:05 PM, Michael S. Tsirkin wrote:
> > > > iovec addresses coming from vhost are assumed to be
> > > > pre-validated, but in fact can be speculated to a value
> > > > out of range.
> > > > 
> > > > Userspace address are later validated with array_index_nospec so we can
> > > > be sure kernel info does not leak through these addresses, but vhost
> > > > must also not leak userspace info outside the allowed memory table to
> > > > guests.
> > > > 
> > > > Following the defence in depth principle, make sure
> > > > the address is not validated out of node range.
> > > > 
> > > > Signed-off-by: Michael S. Tsirkin 
> > > > ---
> > > >drivers/vhost/vhost.c | 4 +++-
> > > >1 file changed, 3 insertions(+), 1 deletion(-)
> > > > 
> > > > diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> > > > index 5dc174ac8cac..0ee375fb7145 100644
> > > > --- a/drivers/vhost/vhost.c
> > > > +++ b/drivers/vhost/vhost.c
> > > > @@ -2072,7 +2072,9 @@ static int translate_desc(struct vhost_virtqueue *vq, u64 addr, u32 len,
> > > > size = node->size - addr + node->start;
> > > > _iov->iov_len = min((u64)len - s, size);
> > > > _iov->iov_base = (void __user *)(unsigned long)
> > > > -   (node->userspace_addr + addr - node->start);
> > > > +   (node->userspace_addr +
> > > > +array_index_nospec(addr - node->start,
> > > > +   node->size));
> > > > s += size;
> > > > addr += size;
> > > > ++ret;
> > > 
> > > I've tried this on Kaby Lake smap off metadata acceleration off using
> > > testpmd (virtio-user) + vhost_net. I don't see obvious performance
> > > difference with TX PPS.
> > > 
> > > Thanks
> > Should I push this to Linus right now then? It's a security thing so
> > maybe we better do it ASAP ... what's your opinion?
> 
> 
> Yes, you can.
> 
> Acked-by: Jason Wang 


And should I include

Tested-by: Jason Wang 

?

> 
> 
> >

Re: [PATCH] lib/Kconfig: fix OBJAGG in lib/ menu structure

2019-09-09 Thread Ido Schimmel

On Mon, Sep 09, 2019 at 02:54:21PM -0700, Randy Dunlap wrote:
> From: Randy Dunlap 
> 
> Keep the "Library routines" menu intact by moving OBJAGG into it.
> Otherwise OBJAGG is displayed/presented as an orphan in the
> various config menus.
> 
> Fixes: 0a020d416d0a ("lib: introduce initial implementation of object aggregation manager")
> Signed-off-by: Randy Dunlap 
> Cc: Jiri Pirko 
> Cc: Ido Schimmel 
> Cc: David S. Miller 

Tested-by: Ido Schimmel 

Thanks!

Re: [PATCH] ocfs2: Fix passing zero to 'PTR_ERR' warning

2019-09-09 Thread Joseph Qi



On 19/9/9 18:04, Ding Xiang wrote:
> Fix a static code checker warning:
> fs/ocfs2/acl.c:331
>   ocfs2_acl_chmod() warn: passing zero to 'PTR_ERR'
> 
> Fixes: 5ee0fbd50fd ("ocfs2: revert using ocfs2_acl_chmod to avoid inode cluster lock hang")
> Signed-off-by: Ding Xiang 

Reviewed-by: Joseph Qi 
> ---
>  fs/ocfs2/acl.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/ocfs2/acl.c b/fs/ocfs2/acl.c
> index 3e7da39..bb981ec 100644
> --- a/fs/ocfs2/acl.c
> +++ b/fs/ocfs2/acl.c
> @@ -327,8 +327,8 @@ int ocfs2_acl_chmod(struct inode *inode, struct buffer_head *bh)
>   down_read(&OCFS2_I(inode)->ip_xattr_sem);
>   acl = ocfs2_get_acl_nolock(inode, ACL_TYPE_ACCESS, bh);
>   up_read(&OCFS2_I(inode)->ip_xattr_sem);
> - if (IS_ERR(acl) || !acl)
> - return PTR_ERR(acl);
> + if (IS_ERR_OR_NULL(acl))
> + return PTR_ERR_OR_ZERO(acl);
>   ret = __posix_acl_chmod(&acl, GFP_KERNEL, inode->i_mode);
>   if (ret)
>   return ret;
>
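
To see why the checker complains and why the replacement is equivalent, here
is a user-space model of the error-pointer helpers. It is a sketch only: the
model_* names are hypothetical, and the encoding (error codes stored in the
top 4095 values of the address space) is an assumption that mirrors how the
kernel's ERR_PTR()/IS_ERR() are commonly described.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
void *model_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

/* Decode a pointer back into a long; for NULL this yields 0. */
long model_ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

/* True only for pointers in the reserved "error" range at the top. */
int model_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-(intptr_t)MODEL_MAX_ERRNO;
}

int model_is_err_or_null(const void *p)
{
	return p == NULL || model_is_err(p);
}

/* Error code for error pointers, 0 for everything else (including NULL). */
long model_ptr_err_or_zero(const void *p)
{
	return model_is_err(p) ? model_ptr_err(p) : 0;
}
```

In the old code a NULL acl reached PTR_ERR() and came back as 0, which is what
the checker flags. The new pair of helpers returns the same values for NULL
and for error pointers, but states the "NULL means return 0" intent directly.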

Re: [PATCH] driver core: ensure a device has valid node id in device_add()

2019-09-09 Thread Yunsheng Lin

On 2019/9/9 17:53, Greg KH wrote:
> On Mon, Sep 09, 2019 at 02:04:23PM +0800, Yunsheng Lin wrote:
>> Currently a device does not belong to any of the numa nodes
>> (dev->numa_node is NUMA_NO_NODE) when the node id is neither
>> specified by fw nor by virtual device layer and the device has
>> no parent device.
> 
> Is this really a problem?

Not really.
Someone needs to guess the node id when it is not specified, right?
This patch chooses to guess the node id in the driver core.

> 
>> According to discussion in [1]:
>> Even if a device's numa node is not specified, the device really
>> does belong to a node.
> 
> But as we do not know the node, can we cause more harm by randomly
> picking one (i.e. putting it all in node 0)?
If we do not pick node 0 for a device with an invalid node, then each caller
needs to check the node id and pick one itself, and currently different
callers do different checks:

1) some do a "< 0" check;
2) some do a "== NUMA_NO_NODE" check;
3) some do a ">= MAX_NUMNODES" check;
4) some do a "< 0 || >= MAX_NUMNODES || !node_online(node)" check.

A caller of dev_to_node() may then pick a node in one of the following ways
if dev_to_node() returns a node that is invalid according to the checks above:
1) based on numa_mem_id();
2) pick a random one, as in workqueue_select_cpu_near().

If we pick node 0 for a device with an invalid node in device_add(), callers
can avoid the differing checks and picks above, but we may lose some caller
context. For example, a user may want to use the node of the CPU on which
the process using the device is running, so that resources are allocated
close to the process, or a user may pick a random node if they know what
they are doing.

It seems there is a trade-off here. As far as I can see, we can guess and
pick the node at different stages when it is not specified:
1. Guess and pick node 0 in device_add(). This has the advantage of ensuring
   that all devices have a valid node from the very beginning of device
   creation, so the user does not have to check and guess one, but the user
   might lose the opportunity to do their own guessing and picking.

2. Maybe provide a dev_to_valid_node() that always returns a valid node id,
   for example returning numa_mem_id() if dev->numa_node is not valid.
   Users who know what they are doing can still use dev_to_node().

3. Callers of dev_to_node() do their own checking and picking, which
   might lead to adding more differing and duplicated checks like those above.
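
Option 2 above could look roughly like the following user-space sketch. Every
name here is hypothetical: dev_to_valid_node() does not exist in the kernel as
far as this thread shows, MODEL_MAX_NUMNODES is an arbitrary small constant,
online_mask stands in for node_online(), and local_node stands in for the
value numa_mem_id() would return.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_NUMA_NO_NODE  (-1)
#define MODEL_MAX_NUMNODES  4

/* Hypothetical stand-in for struct device; only the node id matters here. */
struct model_device {
	int numa_node;
};

/* Combines the four checks listed earlier into a single predicate. */
int model_node_valid(int node, unsigned int online_mask)
{
	return node >= 0 && node < MODEL_MAX_NUMNODES &&
	       ((online_mask >> node) & 1u);
}

/* Always returns a usable node: the device's own, or the caller's local one. */
int model_dev_to_valid_node(const struct model_device *dev,
			    unsigned int online_mask, int local_node)
{
	assert(dev != NULL);
	if (model_node_valid(dev->numa_node, online_mask))
		return dev->numa_node;
	return local_node;
}
```

Compared with forcing node 0 in device_add(), a helper of this shape keeps
dev->numa_node untouched, so callers that want to apply their own policy can
still see that no node was specified.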

[tip: x86/asm] x86/umip: Add emulation (spoofing) for UMIP covered instructions in 64-bit processes as well

2019-09-09 Thread tip-bot2 for Brendan Shanks

The following commit has been merged into the x86/asm branch of tip:

Commit-ID:     e86c2c8b9380440bbe761b8e2f63ab6b04a45ac2
Gitweb:        https://git.kernel.org/tip/e86c2c8b9380440bbe761b8e2f63ab6b04a45ac2
Author:        Brendan Shanks 
AuthorDate:    Thu, 05 Sep 2019 16:22:21 -07:00
Committer:     Ingo Molnar 
CommitterDate: Tue, 10 Sep 2019 08:36:16 +02:00

x86/umip: Add emulation (spoofing) for UMIP covered instructions in 64-bit processes as well

Add emulation (spoofing) of the SGDT, SIDT, and SMSW instructions for 64-bit
processes.

Wine users have encountered a number of 64-bit Windows games that use
these instructions (particularly SGDT), and were crashing when run on
UMIP-enabled systems.

Originally-by: Ricardo Neri 
Signed-off-by: Brendan Shanks 
Reviewed-by: Ricardo Neri 
Reviewed-by: H. Peter Anvin (Intel) 
Cc: Andy Lutomirski 
Cc: Borislav Petkov 
Cc: Brian Gerst 
Cc: Denys Vlasenko 
Cc: Eric W. Biederman 
Cc: Linus Torvalds 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Link: https://lkml.kernel.org/r/2019090523.14900-1-bsha...@codeweavers.com
[ Minor edits: capitalization, added 'spoofing' wording. ]
Signed-off-by: Ingo Molnar 
---
 arch/x86/kernel/umip.c | 65 +++--
 1 file changed, 38 insertions(+), 27 deletions(-)

diff --git a/arch/x86/kernel/umip.c b/arch/x86/kernel/umip.c
index 5b345ad..548fefe 100644
--- a/arch/x86/kernel/umip.c
+++ b/arch/x86/kernel/umip.c
@@ -19,7 +19,7 @@
 /** DOC: Emulation for User-Mode Instruction Prevention (UMIP)
  *
  * The feature User-Mode Instruction Prevention present in recent Intel
- * processor prevents a group of instructions (sgdt, sidt, sldt, smsw, and str)
+ * processor prevents a group of instructions (SGDT, SIDT, SLDT, SMSW and STR)
  * from being executed with CPL > 0. Otherwise, a general protection fault is
  * issued.
  *
@@ -36,8 +36,8 @@
  * DOSEMU2) rely on this subset of instructions to function.
  *
  * The instructions protected by UMIP can be split in two groups. Those which
- * return a kernel memory address (sgdt and sidt) and those which return a
- * value (sldt, str and smsw).
+ * return a kernel memory address (SGDT and SIDT) and those which return a
+ * value (SLDT, STR and SMSW).
  *
  * For the instructions that return a kernel memory address, applications
  * such as WineHQ rely on the result being located in the kernel memory space,
@@ -45,15 +45,13 @@
  * value that, lies close to the top of the kernel memory. The limit for the GDT
  * and the IDT are set to zero.
  *
- * Given that sldt and str are not commonly used in programs that run on WineHQ
+ * Given that SLDT and STR are not commonly used in programs that run on WineHQ
  * or DOSEMU2, they are not emulated.
  *
  * The instruction smsw is emulated to return the value that the register CR0
  * has at boot time as set in the head_32.
  *
- * Also, emulation is provided only for 32-bit processes; 64-bit processes
- * that attempt to use the instructions that UMIP protects will receive the
- * SIGSEGV signal issued as a consequence of the general protection fault.
+ * Emulation is provided for both 32-bit and 64-bit processes.
  *
  * Care is taken to appropriately emulate the results when segmentation is
  * used. That is, rather than relying on USER_DS and USER_CS, the function
@@ -63,17 +61,18 @@
  * application uses a local descriptor table.
  */
 
-#define UMIP_DUMMY_GDT_BASE 0xfffe
-#define UMIP_DUMMY_IDT_BASE 0x
+#define UMIP_DUMMY_GDT_BASE 0xfffeULL
+#define UMIP_DUMMY_IDT_BASE 0xULL
 
 /*
  * The SGDT and SIDT instructions store the contents of the global descriptor
  * table and interrupt table registers, respectively. The destination is a
  * memory operand of X+2 bytes. X bytes are used to store the base address of
- * the table and 2 bytes are used to store the limit. In 32-bit processes, the
- * only processes for which emulation is provided, X has a value of 4.
+ * the table and 2 bytes are used to store the limit. In 32-bit processes X
+ * has a value of 4, in 64-bit processes X has a value of 8.
  */
-#define UMIP_GDT_IDT_BASE_SIZE 4
+#define UMIP_GDT_IDT_BASE_SIZE_64BIT 8
+#define UMIP_GDT_IDT_BASE_SIZE_32BIT 4
 #define UMIP_GDT_IDT_LIMIT_SIZE 2
 
 #define UMIP_INST_SGDT  0   /* 0F 01 /0 */
@@ -189,6 +188,7 @@ static int identify_insn(struct insn *insn)
  * @umip_inst: A constant indicating the instruction to emulate
  * @data:  Buffer into which the dummy result is stored
  * @data_size: Size of the emulated result
+ * @x86_64:true if process is 64-bit, false otherwise
  *
  * Emulate an instruction protected by UMIP and provide a dummy result. The
  * result of the emulation is saved in @data. The size of the results depends
@@ -202,11 +202,8 @@ static int identify_insn(struct insn *insn)
  * 0 on success, -EINVAL on error while emulating.
  */
 static int emulate_umip_insn(struct insn *insn, int umip_inst,
-unsigned char *da

[PATCH v2] x86/umip: Add emulation for 64-bit processes

2019-09-09 Thread Ingo Molnar



* h...@zytor.com  wrote:

> On September 10, 2019 7:28:28 AM GMT+01:00, Ingo Molnar wrote:
> >
> >* h...@zytor.com  wrote:
> >
> >> I would strongly suggest that we change the term "emulation" to 
> >> "spoofing" for these instructions. We need to explain that we do *not*
> >> execute these instructions the way the CPU would have, and unlike the
> >> native instructions do not leak kernel information.
> >
> >Ok, I've edited the patch to add the 'spoofing' wording where 
> >appropriate, and I also made minor fixes such as consistently 
> >capitalizing instruction names.
> >
> >Can I also add your Reviewed-by tag?
> >
> >So the patch should show up in tip:x86/asm today-ish, and barring any 
> >complications is v5.4 material.
> >
> >Thanks,
> >
> > Ingo
> 
> Yes, please do.
> 
> Reviewed-by: H. Peter Anvin (Intel) 

Thanks!

I've attached the updated version of the patch I'm testing.

Ingo

==>
>From e86c2c8b9380440bbe761b8e2f63ab6b04a45ac2 Mon Sep 17 00:00:00 2001
From: Brendan Shanks 
Date: Thu, 5 Sep 2019 16:22:21 -0700
Subject: [PATCH] x86/umip: Add emulation (spoofing) for UMIP covered instructions in 64-bit processes as well

Add emulation (spoofing) of the SGDT, SIDT, and SMSW instructions for 64-bit
processes.

Wine users have encountered a number of 64-bit Windows games that use
these instructions (particularly SGDT), and were crashing when run on
UMIP-enabled systems.

Originally-by: Ricardo Neri 
Signed-off-by: Brendan Shanks 
Reviewed-by: Ricardo Neri 
Reviewed-by: H. Peter Anvin (Intel) 
Cc: Andy Lutomirski 
Cc: Borislav Petkov 
Cc: Brian Gerst 
Cc: Denys Vlasenko 
Cc: Eric W. Biederman 
Cc: Linus Torvalds 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Link: https://lkml.kernel.org/r/2019090523.14900-1-bsha...@codeweavers.com
[ Minor edits: capitalization, added 'spoofing' wording. ]
Signed-off-by: Ingo Molnar 
---
 arch/x86/kernel/umip.c | 65 +-
 1 file changed, 38 insertions(+), 27 deletions(-)

diff --git a/arch/x86/kernel/umip.c b/arch/x86/kernel/umip.c
index 5b345add550f..548fefed71ee 100644
--- a/arch/x86/kernel/umip.c
+++ b/arch/x86/kernel/umip.c
@@ -19,7 +19,7 @@
 /** DOC: Emulation for User-Mode Instruction Prevention (UMIP)
  *
  * The feature User-Mode Instruction Prevention present in recent Intel
- * processor prevents a group of instructions (sgdt, sidt, sldt, smsw, and str)
+ * processor prevents a group of instructions (SGDT, SIDT, SLDT, SMSW and STR)
  * from being executed with CPL > 0. Otherwise, a general protection fault is
  * issued.
  *
@@ -36,8 +36,8 @@
  * DOSEMU2) rely on this subset of instructions to function.
  *
  * The instructions protected by UMIP can be split in two groups. Those which
- * return a kernel memory address (sgdt and sidt) and those which return a
- * value (sldt, str and smsw).
+ * return a kernel memory address (SGDT and SIDT) and those which return a
+ * value (SLDT, STR and SMSW).
  *
  * For the instructions that return a kernel memory address, applications
  * such as WineHQ rely on the result being located in the kernel memory space,
@@ -45,15 +45,13 @@
  * value that, lies close to the top of the kernel memory. The limit for the GDT
  * and the IDT are set to zero.
  *
- * Given that sldt and str are not commonly used in programs that run on WineHQ
+ * Given that SLDT and STR are not commonly used in programs that run on WineHQ
  * or DOSEMU2, they are not emulated.
  *
  * The instruction smsw is emulated to return the value that the register CR0
  * has at boot time as set in the head_32.
  *
- * Also, emulation is provided only for 32-bit processes; 64-bit processes
- * that attempt to use the instructions that UMIP protects will receive the
- * SIGSEGV signal issued as a consequence of the general protection fault.
+ * Emulation is provided for both 32-bit and 64-bit processes.
  *
  * Care is taken to appropriately emulate the results when segmentation is
  * used. That is, rather than relying on USER_DS and USER_CS, the function
@@ -63,17 +61,18 @@
  * application uses a local descriptor table.
  */
 
-#define UMIP_DUMMY_GDT_BASE 0xfffe
-#define UMIP_DUMMY_IDT_BASE 0x
+#define UMIP_DUMMY_GDT_BASE 0xfffeULL
+#define UMIP_DUMMY_IDT_BASE 0xULL
 
 /*
  * The SGDT and SIDT instructions store the contents of the global descriptor
  * table and interrupt table registers, respectively. The destination is a
  * memory operand of X+2 bytes. X bytes are used to store the base address of
- * the table and 2 bytes are used to store the limit. In 32-bit processes, the
- * only processes for which emulation is provided, X has a value of 4.
+ * the table and 2 bytes are used to store the limit. In 32-bit processes X
+ * has a value of 4, in 64-bit processes X has a value of 8.
  */
-#define UMIP_GDT_IDT_BASE_SIZE 4
+#define UMIP_GDT_IDT_BASE_SIZE_64BIT 8
+#define UMIP_GDT_IDT_BASE_SIZE_32BIT 4
 #define UMIP_GDT_IDT_LIMIT_S

Re: [PATCH] KVM: x86: Manually calculate reserved bits when loading PDPTRS

2019-09-09 Thread Peter Xu

On Tue, Sep 03, 2019 at 04:36:45PM -0700, Sean Christopherson wrote:
> Manually generate the PDPTR reserved bit mask when explicitly loading
> PDPTRs.  The reserved bits that are being tracked by the MMU reflect the
> current paging mode, which is unlikely to be PAE paging in the vast
> majority of flows that use load_pdptrs(), e.g. CR0 and CR4 emulation,
> __set_sregs(), etc...  This can cause KVM to incorrectly signal a bad
> PDPTR, or more likely, miss a reserved bit check and subsequently fail
> a VM-Enter due to a bad VMCS.GUEST_PDPTR.
> 
> Add a one off helper to generate the reserved bits instead of sharing
> code across the MMU's calculations and the PDPTR emulation.  The PDPTR
> reserved bits are basically set in stone, and pushing a helper into
> the MMU's calculation adds unnecessary complexity without improving
> readability.
> 
> Opportunistically fix/update the comment for load_pdptrs().
> 
> Note, the buggy commit also introduced a deliberate functional change,
> "Also remove bit 5-6 from rsvd_bits_mask per latest SDM.", which was
> effectively (and correctly) reverted by commit cd9ae5fe47df ("KVM: x86:
> Fix page-tables reserved bits").  A bit of SDM archaeology shows that
> the SDM from late 2008 had a bug (likely a copy+paste error) where it
> listed bits 6:5 as AVL and A for PDPTEs used for 4k entries but reserved
> for 2mb entries.  I.e. the SDM contradicted itself, and bits 6:5 are and
> always have been reserved.
> 
> Fixes: 20c466b56168d ("KVM: Use rsvd_bits_mask in load_pdptrs()")
> Cc: sta...@vger.kernel.org
> Cc: Nadav Amit 
> Reported-by: Doug Reiland 
> Signed-off-by: Sean Christopherson 

Maybe adding a test case would be even better?  FWIW:

Reviewed-by: Peter Xu 

-- 
Peter Xu
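
As a rough illustration of the "one off helper" the commit message describes,
the PAE PDPTE reserved-bit mask can be sketched in user space like this. It is
a sketch under assumptions: the model_* names are hypothetical, and the bit
layout used (bits 2:1, bits 8:5 and bits 63:MAXPHYADDR reserved, checked only
for present entries) is the commonly documented PAE layout, consistent with
the commit message's statement that bits 6:5 are reserved.

```c
#include <assert.h>
#include <stdint.h>

/* Mask with bits s..e (inclusive) set; 0 for an empty or invalid range. */
uint64_t model_rsvd_bits(unsigned int s, unsigned int e)
{
	if (e < s || e > 63)
		return 0;
	return ((2ULL << (e - s)) - 1) << s;
}

/* Reserved bits of a PAE PDPTE for a CPU with the given MAXPHYADDR. */
uint64_t model_pdpte_rsvd_bits(unsigned int maxphyaddr)
{
	return model_rsvd_bits(maxphyaddr, 63) |
	       model_rsvd_bits(5, 8) |
	       model_rsvd_bits(1, 2);
}

/* A PDPTE is acceptable if it is not present or has no reserved bit set. */
int model_pdpte_ok(uint64_t pdpte, unsigned int maxphyaddr)
{
	if (!(pdpte & 1))  /* bit 0 is the present bit */
		return 1;
	return (pdpte & model_pdpte_rsvd_bits(maxphyaddr)) == 0;
}
```

Because the mask depends only on MAXPHYADDR and not on the current paging
mode, computing it on the spot avoids the mismatch described in the commit
message, where the MMU's tracked masks reflect a non-PAE mode.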

Re: [PATCH] x86/umip: Add emulation for 64-bit processes

2019-09-09 Thread hpa

On September 10, 2019 7:28:28 AM GMT+01:00, Ingo Molnar wrote:
>
>* h...@zytor.com  wrote:
>
>> I would strongly suggest that we change the term "emulation" to 
>> "spoofing" for these instructions. We need to explain that we do *not*
>> execute these instructions the way the CPU would have, and unlike the
>> native instructions do not leak kernel information.
>
>Ok, I've edited the patch to add the 'spoofing' wording where 
>appropriate, and I also made minor fixes such as consistently 
>capitalizing instruction names.
>
>Can I also add your Reviewed-by tag?
>
>So the patch should show up in tip:x86/asm today-ish, and barring any 
>complications is v5.4 material.
>
>Thanks,
>
>   Ingo

Yes, please do.

Reviewed-by: H. Peter Anvin (Intel) 
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.

Re: [PATCH v6 0/3] genirq/vfio: Introduce irq_update_devid() and optimize VFIO irq ops

2019-09-09 Thread Ben Luo


A friendly reminder.


Thanks,

    Ben

On 2019/9/2 12:01 PM, Ben Luo wrote:

Currently, VFIO uses a free-then-request-irq approach for interrupt
affinity setting and masking/unmasking for a VM with device passthrough
via VFIO. Sometimes this only changes the cookie data of the irqaction, or
even changes nothing. The free-then-request-irq approach not only adds
latency, but also increases the risk of losing an interrupt, which may lead
to a VM hanging forever while waiting for IO completion.

This patchset solves the issue as follows:
Patch 2 introduces irq_update_devid() to update only the dev_id of an irqaction
Patch 3 makes use of this function and optimizes irq operations in VFIO

changes from v5:
  - Patch 3: remove an error log to avoid potential DDoS attacks
  - Patch 3: fix typo in comment

changes from v4:
  - Patch 3: follow the previous behavior to disable interrupt on error path
  - Patch 3: do irqbypass registration before update or free the interrupt
  - Patch 3: add more comments

changes from v3:
  - Patch 2: rename the new function to irq_update_devid()
  - Patch 2: use disable_irq() to avoid a twist for threaded interrupt
  - ALL: amend commit messages and code comments

changes from v2:
  - reformat to avoid quoted string split across lines and etc.

changes from v1:
  - add Patch 1 to enhance error recovery etc. in free irq per tglx's comments
  - enhance error recovery code and debugging info in irq_update_devid
  - use __must_check in external referencing of this function
  - use EXPORT_SYMBOL_GPL for irq_update_devid
  - reformat code of patch 3 for better readability

Ben Luo (3):
   genirq: enhance error recovery code in free irq
   genirq: introduce irq_update_devid()
   vfio/pci: make use of irq_update_devid() and optimize irq ops

  drivers/vfio/pci/vfio_pci_intrs.c | 118 ++
  include/linux/interrupt.h |   3 +
  kernel/irq/manage.c   | 105 +
  3 files changed, 177 insertions(+), 49 deletions(-)

Re: [vfs] 8bb3c61baf: vm-scalability.median -23.7% regression

2019-09-09 Thread Hugh Dickins

On Mon, 9 Sep 2019, Al Viro wrote:
> 
> Anyway, see vfs.git#uncertain.shmem for what I've got with those folded in.
> Do you see any problems with that one?  That's the last 5 commits in there...

It's mostly fine, I've no problem with going your way instead of what
we had in mmotm; but I have seen some problems with it, and had been
intending to send you a fixup patch tonight (shmem_reconfigure() missing
unlock on error is the main problem, but there are other fixes needed).

But I'm growing tired. I've a feeling my "swap" of the mpols, instead
of immediate mpol_put(), was necessary to protect against a race with
shmem_get_sbmpol(), but I'm not clear-headed enough to trust myself on
that now.  And I've a mystery to solve, that shmem_reconfigure() gets
stuck into showing the wrong error message.

Tomorrow

Oh, and my first attempt to build and boot that series over 5.3-rc5
wouldn't boot. Luckily there was a tell-tale "i915" in the stacktrace,
which reminded me of the drivers/gpu/drm/i915/gem/i915_gemfs.c fix
we discussed earlier in the cycle.  That is of course in linux-next
by now, but I wonder if your branch ought to contain a duplicate of
that fix, so that people with i915 doing bisections on 5.4-rc do not
fall into an unbootable hole between vfs and gpu merges.

Hugh

Re: [PATCH] x86/umip: Add emulation for 64-bit processes

2019-09-09 Thread Ingo Molnar

* h...@zytor.com  wrote:

> I would strongly suggest that we change the term "emulation" to 
> "spoofing" for these instructions. We need to explain that we do *not* 
> execute these instructions the way the CPU would have, and unlike the 
> native instructions do not leak kernel information.

Ok, I've edited the patch to add the 'spoofing' wording where 
appropriate, and I also made minor fixes such as consistently 
capitalizing instruction names.

Can I also add your Reviewed-by tag?

So the patch should show up in tip:x86/asm today-ish, and barring any 
complications is v5.4 material.

Thanks,

Ingo

Re: [PATCH 4.19 19/57] Bluetooth: hidp: Let hidp_send_message return number of queued bytes

2019-09-09 Thread Fabian Henneke




On 10.09.19 00:59, Greg Kroah-Hartman wrote:
> On Mon, Sep 09, 2019 at 03:00:46PM +0200, Fabian Henneke wrote:
>> Hi,
>>
>> On Mon, Sep 9, 2019 at 2:15 PM Pavel Machek  wrote:
>>
>>> Hi!
>>>
>>>> [ Upstream commit 48d9cc9d85dde37c87abb7ac9bbec6598ba44b56 ]
>>>>
>>>> Let hidp_send_message return the number of successfully queued bytes
>>>> instead of an unconditional 0.
>>>>
>>>> With the return value fixed to 0, other drivers relying on hidp, such as
>>>> hidraw, can not return meaningful values from their respective
>>>> implementations of write(). In particular, with the current behavior, a
>>>> hidraw device's write() will have different return values depending on
>>>> whether the device is connected via USB or Bluetooth, which makes it
>>>> harder to abstract away the transport layer.
>>>
>>> So, does this change any actual behaviour?
>>>
>>> Is it fixing a bug, or is it just preparation for a patch that is not
>>> going to make it to stable?
>>>
>>
>> I created this patch specifically in order to ensure that user space
>> applications can use HID devices with hidraw without needing to care about
>> whether the transport is USB or Bluetooth. Without the patch, every
>> hidraw-backed Bluetooth device needs to be treated specially as its write()
>> violates the usual return value contract, which could be viewed as a bug.
>>
>> Please note that a later patch (
>> https://www.spinics.net/lists/linux-input/msg63291.html) fixes some
>> important error checks that were relying on the old behavior (and were
>> unfortunately missed by me).
> 
> As that patch doesn't seem to be in Linus's tree yet, we should postpone
> taking this one in the stable tree right now, correct?
> 
> thanks,
> 
> greg k-h
> 

Yes, please wait for the other patch if it's not in his tree yet and apply the 
two together.

Thank you,
Fabian
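
To illustrate why the return value matters to user space, here is a minimal
sketch of how a transport-agnostic caller might check write() on a hidraw
file descriptor. The helper name send_report() is made up for illustration
and is not part of any patch in this thread. With a driver that returns 0
unconditionally, a check like this would treat every such write as a
failure.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Write one report and check the usual write() contract:
 * a successful call returns the number of bytes accepted.
 * Returns 0 on success, -1 on error or short write.
 * (send_report() is a hypothetical helper, not an existing API.)
 */
static int send_report(int fd, const unsigned char *report, size_t len)
{
	ssize_t n = write(fd, report, len);

	if (n < 0)
		return -1;		/* errno was set by write() */
	if ((size_t)n != len) {
		errno = EIO;		/* short or zero-length write */
		return -1;
	}
	return 0;
}
```

The same helper then works whether the descriptor is backed by USB or
Bluetooth, provided both transports report the number of queued bytes.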

Re: [RFC PATCH 0/2] Fix SEV user-space mapping of unencrypted coherent memory

2019-09-09 Thread VMware


On 9/10/19 8:11 AM, Christoph Hellwig wrote:
> On Thu, Sep 05, 2019 at 04:23:11AM -0700, Christoph Hellwig wrote:
>> This looks fine from the DMA POV.  I'll let the x86 guys comment on the
>> rest.
>
> Do we want to pick this series up for 5.4?  Should I queue it up in
> the dma-mapping tree?


Hi, Christoph

I think the DMA change is pretty uncontroversial.

There are still some questions about the x86 change: after digging a bit 
deeper into the mm code, I think Dave is correct that we should 
include sme_me_mask in _PAGE_CHG_MASK.


I'll respin that patch and then I guess we need an ack from the x86 people.

Thanks,
Thomas

[PATCH v2 1/3] regulator: fixed: add possibility to enable by clock

2019-09-09 Thread Philippe Schenker

This commit adds the possibility to choose the compatible
"regulator-fixed-clock" in devicetree.

This is a special regulator-fixed that has to have a clock, from which
the regulator gets switched on and off.

Signed-off-by: Philippe Schenker 

---

Changes in v2:
- return priv->clk_enable_counter > 0 directly.

 drivers/regulator/fixed.c | 83 +--
 1 file changed, 80 insertions(+), 3 deletions(-)

diff --git a/drivers/regulator/fixed.c b/drivers/regulator/fixed.c
index 999547dde99d..d90a6fd8cbc7 100644
--- a/drivers/regulator/fixed.c
+++ b/drivers/regulator/fixed.c
@@ -23,14 +23,63 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
+#include 
+
 
 struct fixed_voltage_data {
struct regulator_desc desc;
struct regulator_dev *dev;
+
+   struct clk *enable_clock;
+   unsigned int clk_enable_counter;
 };
 
+struct fixed_dev_type {
+   bool has_enable_clock;
+};
+
+static const struct fixed_dev_type fixed_voltage_data = {
+   .has_enable_clock = false,
+};
+
+static const struct fixed_dev_type fixed_clkenable_data = {
+   .has_enable_clock = true,
+};
+
+static int reg_clock_enable(struct regulator_dev *rdev)
+{
+   struct fixed_voltage_data *priv = rdev_get_drvdata(rdev);
+   int ret = 0;
+
+   ret = clk_prepare_enable(priv->enable_clock);
+   if (ret)
+   return ret;
+
+   priv->clk_enable_counter++;
+
+   return ret;
+}
+
+static int reg_clock_disable(struct regulator_dev *rdev)
+{
+   struct fixed_voltage_data *priv = rdev_get_drvdata(rdev);
+
+   clk_disable_unprepare(priv->enable_clock);
+   priv->clk_enable_counter--;
+
+   return 0;
+}
+
+static int reg_clock_is_enabled(struct regulator_dev *rdev)
+{
+   struct fixed_voltage_data *priv = rdev_get_drvdata(rdev);
+
+   return priv->clk_enable_counter > 0;
+}
+
 
 /**
  * of_get_fixed_voltage_config - extract fixed_voltage_config structure info
@@ -84,10 +133,19 @@ of_get_fixed_voltage_config(struct device *dev,
 static struct regulator_ops fixed_voltage_ops = {
 };
 
+static struct regulator_ops fixed_voltage_clkenabled_ops = {
+   .enable = reg_clock_enable,
+   .disable = reg_clock_disable,
+   .is_enabled = reg_clock_is_enabled,
+};
+
 static int reg_fixed_voltage_probe(struct platform_device *pdev)
 {
+   struct device *dev = &pdev->dev;
struct fixed_voltage_config *config;
struct fixed_voltage_data *drvdata;
+   const struct fixed_dev_type *drvtype =
+   of_match_device(dev->driver->of_match_table, dev)->data;
struct regulator_config cfg = { };
enum gpiod_flags gflags;
int ret;
@@ -118,7 +176,18 @@ static int reg_fixed_voltage_probe(struct platform_device 
*pdev)
}
drvdata->desc.type = REGULATOR_VOLTAGE;
drvdata->desc.owner = THIS_MODULE;
-   drvdata->desc.ops = &fixed_voltage_ops;
+
+   if (drvtype->has_enable_clock) {
+   drvdata->desc.ops = &fixed_voltage_clkenabled_ops;
+
+   drvdata->enable_clock = devm_clk_get(dev, NULL);
+   if (IS_ERR(drvdata->enable_clock)) {
+   dev_err(dev, "Cant get enable-clock from devicetree\n");
+   return -ENOENT;
+   }
+   } else {
+   drvdata->desc.ops = &fixed_voltage_ops;
+   }
 
drvdata->desc.enable_time = config->startup_delay;
 
@@ -191,8 +260,16 @@ static int reg_fixed_voltage_probe(struct platform_device 
*pdev)
 
 #if defined(CONFIG_OF)
 static const struct of_device_id fixed_of_match[] = {
-   { .compatible = "regulator-fixed", },
-   {},
+   {
+   .compatible = "regulator-fixed",
+   .data = &fixed_voltage_data,
+   },
+   {
+   .compatible = "regulator-fixed-clock",
+   .data = &fixed_clkenable_data,
+   },
+   {
+   },
 };
 MODULE_DEVICE_TABLE(of, fixed_of_match);
 #endif
-- 
2.23.0

[PATCH v2 0/3] Add new binding regulator-fixed-clock to regulator-fixed

2019-09-09 Thread Philippe Schenker



Our hardware has a FET that is switching power rail of the ethernet PHY
on and off. This switching enable signal is a clock from the SoC.

There is no possibility in regulator subsystem to have this hardware
reflected in software.

I already discussed with Mark Brown about possible solutions and he
suggested to create at least a new compatible. [1]
This discussion includes also a better explanation of our circuit as
well as schematics. So please refer to that link if you have questions
about that.

In this first attempt I created a new binding "regulator-fixed-clock"
that can take a clock from devicetree. This is a simple addition to
regulator-fixed. If the binding regulator-fixed-clock is given, the
clock is simply enabled on regulator enable and disabled on regulator
disable.
To be able to have multiple consumers a counter variable is also given
that tells how many consumers need power from this regulator.

Best regards,
Philippe

[1] https://lkml.org/lkml/2019/8/7/78


Changes in v2:
- return priv->clk_enable_counter > 0 directly.
- Change select: to if:
- Change items: to enum:
- Defined how many clocks should be given

Philippe Schenker (3):
  regulator: fixed: add possibility to enable by clock
  ARM: dts: imx6ull-colibri: add phy-supply and respective regulator
  dt-bindings: regulator: add regulator-fixed-clock binding

 .../bindings/regulator/fixed-regulator.yaml   | 19 -
 arch/arm/boot/dts/imx6ull-colibri.dtsi| 12 +++
 drivers/regulator/fixed.c | 83 ++-
 3 files changed, 110 insertions(+), 4 deletions(-)

-- 
2.23.0

[PATCH v2 2/3] ARM: dts: imx6ull-colibri: add phy-supply and respective regulator

2019-09-09 Thread Philippe Schenker

This adds a regulator-fixed-clock node, a fixed regulator that is switched
on and off by a clock, and uses it as the supply of the Ethernet PHY.

Signed-off-by: Philippe Schenker 
---

Changes in v2: None

 arch/arm/boot/dts/imx6ull-colibri.dtsi | 12 
 1 file changed, 12 insertions(+)

diff --git a/arch/arm/boot/dts/imx6ull-colibri.dtsi 
b/arch/arm/boot/dts/imx6ull-colibri.dtsi
index d56728f03c35..76021b842a97 100644
--- a/arch/arm/boot/dts/imx6ull-colibri.dtsi
+++ b/arch/arm/boot/dts/imx6ull-colibri.dtsi
@@ -47,6 +47,17 @@
states = <180 0x1 330 0x0>;
vin-supply = <&reg_module_3v3>;
};
+
+   reg_eth_phy: regulator-eth-phy {
+   compatible = "regulator-fixed-clock";
+   regulator-boot-on;
+   regulator-name = "eth_phy";
+   regulator-min-microvolt = <330>;
+   regulator-max-microvolt = <330>;
+   clocks = <&clks IMX6UL_CLK_ENET2_REF_125M>;
+   startup-delay-us = <15>;
+   vin-supply = <&reg_module_3v3>;
+   };
 };
 
 &adc1 {
@@ -66,6 +77,7 @@
pinctrl-0 = <&pinctrl_enet2>;
phy-mode = "rmii";
phy-handle = <&ethphy1>;
+   phy-supply = <&reg_eth_phy>;
status = "okay";
 
mdio {
-- 
2.23.0

Re: [PATCH 2/3] soc: amazon: al-pos: Introduce Amazon's Annapurna Labs POS driver

2019-09-09 Thread Shenhar, Talel




On 9/9/2019 6:16 PM, Arnd Bergmann wrote:
> On Mon, Sep 9, 2019 at 4:11 PM Shenhar, Talel  wrote:
>> On 9/9/2019 4:41 PM, Arnd Bergmann wrote:
>>
>> In current implementation of v1, I am not doing any read barrier, Hence,
>> using the non-relaxed will add unneeded memory barrier.
>>
>> I have no strong objection moving to the non-relaxed version and have an
>> unneeded memory barrier, as this path is not "hot" one.
>
> Ok, then please add it.

ok, shall be part of v2

>> Beside of avoiding the unneeded memory barrier, I would be happy to keep
>> common behavior for our drivers:
>>
>> e.g.
>>
>> https://github.com/torvalds/linux/blob/master/drivers/irqchip/irq-al-fic.c#L49
>>
>> So what do you think we should go with? relaxed or non-relaxed?
>
> The al_fic_set_trigger() function is clearly a slow-path and should use the
> non-relaxed functions. In case of al_fic_irq_handler(), the extra barrier
> might introduce a measurable overhead, but at the same time I'm
> not sure if that one is correct without the barrier:
>
> If you have an MSI-type interrupt for notifying a device driver of
> a DMA completion, there might not be any other barrier between
> the arrival of the MSI message and the CPU accessing the data.
> Depending on how strict the hardware implements MSI and how
> the IRQ is chained, this could lead to data corruption.
>
> If the interrupt is only used for level or edge triggered interrupts,
> this is ok since you already need another register read in
> the driver before it can safely access a DMA buffer.
>
> In either case, if you can prove that it's safe to use the relaxed
> version here and you think that it may help, it would be good to
> add a comment explaining the reasoning.

I decided to go with the non-relaxed version, as this is not a hot path and 
the non-relaxed version is likely to be clearer to the common reader.


Arnd

[PATCH v2 3/3] dt-bindings: regulator: add regulator-fixed-clock binding

2019-09-09 Thread Philippe Schenker

This adds the documentation to the compatible regulator-fixed-clock.
This binding is a special binding of regulator-fixed and adds the
ability to add a clock to regulator-fixed, so the regulator can be
enabled and disabled with that clock. If the special compatible
regulator-fixed-clock is used it is mandatory to supply a clock.

Signed-off-by: Philippe Schenker 

---

Changes in v2:
- Change select: to if:
- Change items: to enum:
- Defined how many clocks should be given

 .../bindings/regulator/fixed-regulator.yaml   | 19 ++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/Documentation/devicetree/bindings/regulator/fixed-regulator.yaml 
b/Documentation/devicetree/bindings/regulator/fixed-regulator.yaml
index a650b457085d..a78150c47aa2 100644
--- a/Documentation/devicetree/bindings/regulator/fixed-regulator.yaml
+++ b/Documentation/devicetree/bindings/regulator/fixed-regulator.yaml
@@ -19,9 +19,19 @@ description:
 allOf:
   - $ref: "regulator.yaml#"
 
+if:
+  properties:
+compatible:
+  contains:
+const: regulator-fixed-clock
+  required:
+- clocks
+
 properties:
   compatible:
-const: regulator-fixed
+enum:
+  - const: regulator-fixed
+  - const: regulator-fixed-clock
 
   regulator-name: true
 
@@ -29,6 +39,13 @@ properties:
 description: gpio to use for enable control
 maxItems: 1
 
+  clocks:
+description:
+  clock to use for enable control. This binding is only available if
+  the compatible is chosen to regulator-fixed-clock. The clock binding
+  is mandatory if compatible is chosen to regulator-fixed-clock.
+maxItems: 1
+
   startup-delay-us:
 description: startup time in microseconds
 $ref: /schemas/types.yaml#/definitions/uint32
-- 
2.23.0

[PATCH v3] Staging: gasket: Use temporaries to reduce line length.

2019-09-09 Thread Sandro Volery

Using temporaries for gasket_page_table entries to remove scnprintf()
statements and reduce line length, as suggested by Joe Perches. Thanks!

Signed-off-by: Sandro Volery 
---
v3: Fixed faulty copy/paste of function
v2: Attempt to fix
v1: Original patch


 drivers/staging/gasket/apex_driver.c | 20 +---
 1 file changed, 9 insertions(+), 11 deletions(-)

diff --git a/drivers/staging/gasket/apex_driver.c 
b/drivers/staging/gasket/apex_driver.c
index 2973bb920a26..46199c8ca441 100644
--- a/drivers/staging/gasket/apex_driver.c
+++ b/drivers/staging/gasket/apex_driver.c
@@ -509,6 +509,8 @@ static ssize_t sysfs_show(struct device *device, struct 
device_attribute *attr,
struct gasket_dev *gasket_dev;
struct gasket_sysfs_attribute *gasket_attr;
enum sysfs_attribute_type type;
+   struct gasket_page_table *gpt;
+   uint val;
 
gasket_dev = gasket_sysfs_get_device_data(device);
if (!gasket_dev) {
@@ -524,29 +526,25 @@ static ssize_t sysfs_show(struct device *device, struct 
device_attribute *attr,
}
 
type = (enum sysfs_attribute_type)gasket_attr->data.attr_type;
+   gpt = gasket_dev->page_table[0];
switch (type) {
case ATTR_KERNEL_HIB_PAGE_TABLE_SIZE:
-   ret = scnprintf(buf, PAGE_SIZE, "%u\n",
-   gasket_page_table_num_entries(
-   gasket_dev->page_table[0]));
+   val = gasket_page_table_num_entries(gpt);
break;
case ATTR_KERNEL_HIB_SIMPLE_PAGE_TABLE_SIZE:
-   ret = scnprintf(buf, PAGE_SIZE, "%u\n",
-   gasket_page_table_num_simple_entries(
-   gasket_dev->page_table[0]));
+   val = gasket_page_table_num_simple_entries(gpt);
break;
case ATTR_KERNEL_HIB_NUM_ACTIVE_PAGES:
-   ret = scnprintf(buf, PAGE_SIZE, "%u\n",
-   gasket_page_table_num_active_pages(
-   gasket_dev->page_table[0]));
+   val = gasket_page_table_num_active_pages(gpt);
break;
default:
dev_dbg(gasket_dev->dev, "Unknown attribute: %s\n",
attr->attr.name);
ret = 0;
-   break;
+   goto exit;
}
-
+   ret = scnprintf(buf, PAGE_SIZE, "%u\n", val);
+exit:
gasket_sysfs_put_attr(device, gasket_attr);
gasket_sysfs_put_device_data(device, gasket_dev);
return ret;
-- 
2.23.0

Re: [PATCH 3/3] arm64: alpine: select AL_POS

2019-09-09 Thread Shenhar, Talel




On 9/9/2019 6:08 PM, Arnd Bergmann wrote:
> On Mon, Sep 9, 2019 at 3:59 PM Shenhar, Talel  wrote:
>> On 9/9/2019 4:45 PM, Arnd Bergmann wrote:
>>
>> Its not that something will get broken. its error event detector for POS
>> events which allows seeing bad accesses to registers.
>>
>> What is the general rule of which configs to put under select and which
>> under defconfig?
>>
>> I was thinking that "general" SoC support is good under select - those
>> things that we always want.
>
> I generally want as little as possible to be selected, basically only
> things that are required for linking the kernel and booting it without
> potentially destroying the hardware.
>
> In particular, I want most drivers to be enabled as loadable modules
> if possible. When you have general-purpose distributions support
> your platform, there is no need to have this module built-in while
> running on a different chip, even if you always want to load the
> module when it's running on yours.
>
>> And specific features, e.g. RAID support or features that supported only
>> on specific HW shall go under defconfig.
>>
>> Similar, I see ARCH_LAYERSCAPE selecting EDAC_SUPPORT.
>
> I think this was done to avoid a link failure. It's also possible that this
> is a mistake and just did not get caught in review.
>
> Arnd



I see.

Will remove this from v2.

Re: [PATCH] x86/boot/64: Make level2_kernel_pgt pages invalid outside kernel area.

2019-09-09 Thread Ingo Molnar



* Kirill A. Shutemov  wrote:

> On Fri, Sep 06, 2019 at 04:29:50PM -0500, Steve Wahl wrote:
> > Our hardware (UV aka Superdome Flex) has address ranges marked
> > reserved by the BIOS. These ranges can cause the system to halt if
> > accessed.
> > 
> > During kernel initialization, the processor was speculating into
> > reserved memory causing system halts.  The processor speculation is
> > enabled because the reserved memory is being mapped by the kernel.
> > 
> > The page table level2_kernel_pgt is 1 GiB in size, and had all pages
> > initially marked as valid, and the kernel is placed anywhere in this
> > range depending on the virtual address selected by KASLR.  Later on in
> > the boot process, the valid area gets trimmed back to the space
> > occupied by the kernel.
> > 
> > But during the interval of time when the full 1 GiB space was marked
> > as valid, if the kernel physical address chosen by KASLR was close
> > enough to our reserved memory regions, the valid pages outside the
> > actual kernel space were allowing the processor to issue speculative
> > accesses to the reserved space, causing the system to halt.
> > 
> > This was encountered somewhat rarely on a normal system boot, and
> > somewhat more often when starting the crash kernel if
> > "crashkernel=512M,high" was specified on the command line (because
> > this heavily restricts the physical address of the crash kernel,
> > usually to within 1 GiB of our reserved space).
> > 
> > The answer is to invalidate the pages of this table outside the
> > address range occupied by the kernel before the page table is
> > activated.  This patch has been validated to fix this problem on our
> > hardware.
> 
> If the goal is to avoid *any* mapping of the reserved region to stop
> speculation, I don't think this patch will do the job. We still (likely)
> have the same memory mapped as part of the identity mapping. And it
> happens at least in two places: here and before on decompression stage.

Yeah, this really needs a fix at the KASLR level: it should only ever map 
into regions that are fully RAM backed.

Is the problem that the 1 GiB mapping is a direct mapping, which can be 
speculated into? I presume KASLR won't accidentally map the kernel into 
the reserved region, right?

Thanks,

Ingo
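
For readers following along, the approach described in the quoted patch
(invalidate the level2_kernel_pgt entries outside the kernel image before
the table goes live) can be sketched in user space roughly as follows. This
is an illustration only: the 512 entries of 2 MiB each match the 1 GiB size
mentioned above, but PRESENT_BIT, trim_pgt() and the index arithmetic are
assumptions made for the sketch, not the kernel's code.

```c
#include <stddef.h>
#include <stdint.h>

#define ENTRIES     512u            /* 512 entries x 2 MiB = 1 GiB */
#define ENTRY_SHIFT 21u             /* each entry maps 2 MiB */
#define PRESENT_BIT ((uint64_t)1)   /* hypothetical stand-in for _PAGE_PRESENT */

/*
 * Clear the present bit on every entry that does not overlap
 * [image_start, image_end), both given as byte offsets into the
 * 1 GiB window. Returns the number of entries left valid.
 */
static size_t trim_pgt(uint64_t pgt[ENTRIES],
		       uint64_t image_start, uint64_t image_end)
{
	size_t first = (size_t)(image_start >> ENTRY_SHIFT);
	/* Round the end up so a partially covered entry stays mapped. */
	size_t last = (size_t)((image_end + (1ull << ENTRY_SHIFT) - 1)
			       >> ENTRY_SHIFT);
	size_t valid = 0;

	for (size_t i = 0; i < ENTRIES; i++) {
		if (i >= first && i < last) {
			if (pgt[i] & PRESENT_BIT)
				valid++;
		} else {
			pgt[i] &= ~PRESENT_BIT;	/* outside the image: invalidate */
		}
	}
	return valid;
}
```

As Kirill points out above, trimming this one table does not remove other
mappings of the same physical range, so the sketch only covers the part of
the problem the patch itself addresses.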

Re: [PATCH v3] KVM: x86: Disable posted interrupts for odd IRQs

2019-09-09 Thread Christoph Hellwig

And what about even ones? :)

Sorry, just joking, but the "odd" qualifier here looks a little weird,
maybe something like "non-standard delivery modes" might make sense
here.

Re: [RFC 02/19] ktf: Introduce the main part of the kernel side of ktf

2019-09-09 Thread Knut Omang

On Sun, 2019-09-08 at 18:23 -0700, Brendan Higgins wrote:
> On Tue, Aug 13, 2019 at 08:09:17AM +0200, Knut Omang wrote:
> 
> Sorry, it's taken me way too long to get down to a proper code review on
> this. I was hoping to send you something a couple weeks ago in
> preparation for Tuesday, but I have been crazy busy.
> 
> > The ktf module itself and basic data structures for management
> > of test cases and tests and contexts for tests.
> > Also contains the top level include file for kernel clients
> > in ktf.h.
> > 
> > More elaborate documentation follows towards the end of the
> > patch set.
> > 
> > This patch set contains both user level and kernel code,
> > we'll provide the full implementation of ktf on the kernel side in
> > this and forthcoming patches, then the user space code to execute
> > tests within the kernel and report results, then documentation
> > before introducing a small self test suite of tests to test ktf
> > itself, and some very simple additional example tests.
> > 
> > ktf.h:   Defines the KTF user API for kernel clients
> > ktf_test.c:  Kernel side code for tracking and reporting ktf test
> > results
> > 
> > Signed-off-by: Knut Omang 
> > ---
> >  tools/testing/selftests/ktf/kernel/Makefile  |  15 +-
> >  tools/testing/selftests/ktf/kernel/ktf.h | 604 -
> >  tools/testing/selftests/ktf/kernel/ktf_context.c | 409 +++-
> >  tools/testing/selftests/ktf/kernel/ktf_test.c| 397 +++-
> >  tools/testing/selftests/ktf/kernel/ktf_test.h| 381 ++-
> >  5 files changed, 1806 insertions(+)
> >  create mode 100644 tools/testing/selftests/ktf/kernel/Makefile
> >  create mode 100644 tools/testing/selftests/ktf/kernel/ktf.h
> >  create mode 100644 tools/testing/selftests/ktf/kernel/ktf_context.c
> >  create mode 100644 tools/testing/selftests/ktf/kernel/ktf_test.c
> >  create mode 100644 tools/testing/selftests/ktf/kernel/ktf_test.h
> [...]
> > diff --git a/tools/testing/selftests/ktf/kernel/ktf.h
> > b/tools/testing/selftests/ktf/kernel/ktf.h
> > new file mode 100644
> > index 000..ea270e7
> > --- /dev/null
> > +++ b/tools/testing/selftests/ktf/kernel/ktf.h
> > @@ -0,0 +1,604 @@
> > +/*
> > + * Copyright (c) 2018, Oracle and/or its affiliates. All rights reserved.
> > + *Author: Knut Omang 
> > + *
> > + * SPDX-License-Identifier: GPL-2.0
> > + *
> > + * ktf.h: Defines the KTF user API for kernel clients
> > + */
> > +#ifndef _KTF_H
> > +#define _KTF_H
> > +
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include "ktf_test.h"
> > +#include "ktf_override.h"
> > +#include "ktf_map.h"
> 
> Where do you add this file? I don't see any definitions of
> `struct ktf_map` in either this or the preceding patches, so I don't
> think that this will compile.

Compiling is not enabled until patch 17, so that should not be a problem.
I wanted to convey the core KTF API early in the set to not let it "drown" 
in the utilities, but I get your point. One way to do it would be to move 
the header files up front and keep the implementations in the later patches,
even though that would separate the API definition from the implementation.

> > +#include "ktf_unlproto.h"
> 
> Same here. This looks important for understanding what you presented
> here.

yes

Thanks,
Knut

> > +#define KTF_MAX_LOG 2048
> > +
> > +/* Type for an optional configuration callback for contexts.
> > + * Implementations should copy and store data into their private
> > + * extensions of the context structure. The data pointer is
> > + * only valid inside the callback:
> > + */
> > +typedef int (*ktf_config_cb)(struct ktf_context *ctx, const void* data,
> > size_t data_sz);
> > +typedef void (*ktf_context_cb)(struct ktf_context *ctx);
> > +
> > +struct ktf_context_type;
> > +
> > +struct ktf_context {
> > +   struct ktf_map_elem elem;  /* Linkage for ctx_map in handle */
> > +   char name[KTF_MAX_KEY];/* Context name used in map */
> > +   struct ktf_handle *handle; /* Owner of this context */
> > +   ktf_config_cb config_cb;   /* Optional configuration callback */
> > +   ktf_context_cb cleanup;/* Optional callback upon context release
> > */
> > +   int config_errno;  /* If config_cb set: state of configuration 
> > */
> > +   struct ktf_context_type *type; /* Associated type, must be set */
> > +};
> > +
> > +typedef struct ktf_context* (*ktf_context_alloc)(struct ktf_context_type
> > *ct);
> > +
> > +struct ktf_context_type {
> > +   struct ktf_map_elem elem;  /* Linkage for map in handle */
> > +   char name[KTF_MAX_KEY];/* Context type name */
> > +   struct ktf_handle *handle; /* Owner of this context type */
> > +   ktf_context_alloc alloc;   /* Allocate a new context of this type */
> > +   ktf_config_cb config_cb;   /* Configuration callback */
> > +   ktf_context_cb cleanup;/* Optional callback upon context release
> > */
> > +};
> > +
> > +#include "ktf_netctx.h"
> > +
> > +/* type for a

Re: [PATCH 1/3] regulator: fixed: add possibility to enable by clock

2019-09-09 Thread Philippe Schenker

On Tue, 2019-09-10 at 06:08 +, Philippe Schenker wrote:
> On Thu, 2019-09-05 at 19:06 +0100, Mark Brown wrote:
> > On Tue, Sep 03, 2019 at 08:03:46AM +, Philippe Schenker wrote:
> > > This commit adds the possibility to choose the compatible
> > > "regulator-fixed-clock" in devicetree.
> > > 
> > > This is a special regulator-fixed that has to have a clock, from
> > > which
> > > the regulator gets switched on and off.
> > 
> > This seems conceptually fine.  Minor issues though:
> 
> Thanks for your comments and I'm glad you like it! I will send a v2
> shortly, also with Rob's fixes in. Can I expect it to be pulled for
> 5.4?

I meant 5.5 of course.

> 
> Best regards,
> Philippe
> 
> > > +static int reg_clock_is_enabled(struct regulator_dev *rdev)
> > > +{
> > > + struct fixed_voltage_data *priv = rdev_get_drvdata(rdev);
> > > +
> > > + if (priv->clk_enable_counter > 0)
> > > + return 1;
> > > +
> > > + return 0;
> > > +}
> > 
> > This could just be return priv->clk_enable_counter > 0 - ideally the
> > clock API would let us query if the clock is enabled but that might
> > be
> > a
> > bit confused anyway given that it's possibly shared.

Re: [PATCH] riscv: dts: sifive: Add ethernet0 to the aliases node

2019-09-09 Thread Christoph Hellwig

On Thu, Sep 05, 2019 at 05:46:14AM -0700, Bin Meng wrote:
> U-Boot expects this alias to be in place in order to fix up the mac
> address of the ethernet node.
> 
> Signed-off-by: Bin Meng 

Looks good:

Reviewed-by: Christoph Hellwig

Re: [PATCH v2] riscv: dts: sifive: Drop "clock-frequency" property of cpu nodes

2019-09-09 Thread Christoph Hellwig

On Thu, Sep 05, 2019 at 05:45:53AM -0700, Bin Meng wrote:
> The "clock-frequency" property of cpu nodes isn't required. Drop it.
> 
> Signed-off-by: Bin Meng 

Looks good:

Reviewed-by: Christoph Hellwig

Re: [RFC PATCH 0/2] Fix SEV user-space mapping of unencrypted coherent memory

2019-09-09 Thread Christoph Hellwig

On Thu, Sep 05, 2019 at 04:23:11AM -0700, Christoph Hellwig wrote:
> This looks fine from the DMA POV.  I'll let the x86 guys comment on the
> rest.

Do we want to pick this series up for 5.4?  Should I queue it up in
the dma-mapping tree?

Re: [PATCH 1/3] regulator: fixed: add possibility to enable by clock

2019-09-09 Thread Philippe Schenker

On Thu, 2019-09-05 at 19:06 +0100, Mark Brown wrote:
> On Tue, Sep 03, 2019 at 08:03:46AM +, Philippe Schenker wrote:
> > This commit adds the possibility to choose the compatible
> > "regulator-fixed-clock" in devicetree.
> > 
> > This is a special regulator-fixed that has to have a clock, from
> > which
> > the regulator gets switched on and off.
> 
> This seems conceptually fine.  Minor issues though:

Thanks for your comments and I'm glad you like it! I will send a v2
shortly, also with Rob's fixes in. Can I expect it to be pulled for 5.4?

Best regards,
Philippe

> 
> > +static int reg_clock_is_enabled(struct regulator_dev *rdev)
> > +{
> > +   struct fixed_voltage_data *priv = rdev_get_drvdata(rdev);
> > +
> > +   if (priv->clk_enable_counter > 0)
> > +   return 1;
> > +
> > +   return 0;
> > +}
> 
> This could just be return priv->clk_enable_counter > 0 - ideally the
> clock API would let us query if the clock is enabled but that might be
> a
> bit confused anyway given that it's possibly shared.

Re: [PATCH AUTOSEL 5.2 06/12] configfs_register_group() shouldn't be (and isn't) called in rmdirable parts

2019-09-09 Thread Christoph Hellwig

Please stop selectively backporting parts of random series.  We'll
need the full series from Al in -stable instead.

[PATCH] serial/sifive: select SERIAL_EARLYCON

2019-09-09 Thread Christoph Hellwig

The sifive serial driver implements earlycon support, but unless
another built-in driver also selects earlycon support it won't
be usable.  Explicitly select SERIAL_EARLYCON instead.

Signed-off-by: Christoph Hellwig 
---
 drivers/tty/serial/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/tty/serial/Kconfig b/drivers/tty/serial/Kconfig
index 530cb966092f..6b77a72278e3 100644
--- a/drivers/tty/serial/Kconfig
+++ b/drivers/tty/serial/Kconfig
@@ -1075,6 +1075,7 @@ config SERIAL_SIFIVE_CONSOLE
bool "Console on SiFive UART"
depends on SERIAL_SIFIVE=y
select SERIAL_CORE_CONSOLE
+   select SERIAL_EARLYCON
help
  Select this option if you would like to use a SiFive UART as the
  system console.
-- 
2.20.1

Re: [v5 PATCH] RISC-V: Fix unsupported isa string info.

2019-09-09 Thread h...@infradead.org

On Fri, Sep 06, 2019 at 11:27:57PM +, Atish Patra wrote:
> > Agreed. May be something like this ?
> > 
> > Let's say f/d is enabled in kernel but cpu doesn't support it.
> > "unsupported isa" will only appear if there are any unsupported isa.
> > 
> > processor   : 3
> > hart: 4
> > isa : rv64imac
> > unsupported isa : fd
> > mmu : sv39
> > uarch   : sifive,u54-mc
> > 
> > May be I am just trying over optimize one corner case :) :).
> > /proc/cpuinfo should just print all the isa string. That's it.
> > 
> 
> Ping ?

Yes, I agree with the "dumb" reporting of all capabilities.

[PATCH] of/fdt: don't ignore errors from of_setup_earlycon

2019-09-09 Thread Christoph Hellwig

If of_setup_earlycon fails we should keep on iterating earlycon options
instead of breaking out of the loop.

Signed-off-by: Christoph Hellwig 
---
 drivers/of/fdt.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c
index 9cdf14b9aaab..2f6bd03d8e27 100644
--- a/drivers/of/fdt.c
+++ b/drivers/of/fdt.c
@@ -946,8 +946,8 @@ int __init early_init_dt_scan_chosen_stdout(void)
if (fdt_node_check_compatible(fdt, offset, match->compatible))
continue;
 
-   of_setup_earlycon(match, offset, options);
-   return 0;
+   if (of_setup_earlycon(match, offset, options) == 0)
+   return 0;
}
return -ENODEV;
 }
-- 
2.20.1
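
The behavioural change can be illustrated outside the kernel with a small
sketch. All names here (struct candidate, try_setup(), pick_console()) are
hypothetical stand-ins for the match table and of_setup_earlycon(); unlike
the real function, the sketch returns the index of the chosen entry so the
effect is observable. The only point is that a failed setup moves on to the
next candidate instead of ending the search.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical match-table entry with a canned setup outcome. */
struct candidate {
	const char *compatible;
	int setup_result;	/* 0 on success, negative errno on failure */
};

/* Stand-in for of_setup_earlycon(): just report the canned result. */
static int try_setup(const struct candidate *c)
{
	return c->setup_result;
}

/*
 * Return the index of the first candidate whose setup succeeds,
 * or -ENODEV if none does.
 */
static int pick_console(const struct candidate *tbl, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (try_setup(&tbl[i]) == 0)
			return (int)i;
		/* Setup failed: keep iterating rather than giving up. */
	}
	return -ENODEV;
}
```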

Re: [PATCH] Revert "locking/pvqspinlock: Don't wait if vCPU is preempted"

2019-09-09 Thread Wanpeng Li

On Mon, 9 Sep 2019 at 18:56, Waiman Long  wrote:
>
> On 9/9/19 2:40 AM, Wanpeng Li wrote:
> > From: Wanpeng Li 
> >
> > This patch reverts commit 75437bb304b20 (locking/pvqspinlock: Don't wait if
> > vCPU is preempted), we found great regression caused by this commit.
> >
> > Xeon Skylake box, 2 sockets, 40 cores, 80 threads, three VMs, each is 80 
> > vCPUs.
> > The score of ebizzy -M can reduce from 13000-14000 records/s to 1700-1800
> > records/s with this commit.
> >
> >   Host                        Guest             score
> >
> > vanilla + w/o kvm optimizes   vanilla           1700-1800 records/s
> > vanilla + w/o kvm optimizes   vanilla + revert  13000-14000 records/s
> > vanilla + w/ kvm optimizes    vanilla           4500-5000 records/s
> > vanilla + w/ kvm optimizes    vanilla + revert  14000-15500 records/s
> >
> > Exit from aggressive wait-early mechanism can result in yield premature and
> > incur extra scheduling latency in over-subscribe scenario.
> >
> > kvm optimizes:
> > [1] commit d73eb57b80b (KVM: Boost vCPUs that are delivering interrupts)
> > [2] commit 266e85a5ec9 (KVM: X86: Boost queue head vCPU to mitigate lock 
> > waiter preemption)
> >
> > Tested-by: loobin...@tencent.com
> > Cc: Peter Zijlstra 
> > Cc: Thomas Gleixner 
> > Cc: Ingo Molnar 
> > Cc: Waiman Long 
> > Cc: Paolo Bonzini 
> > Cc: Radim Krčmář 
> > Cc: loobin...@tencent.com
> > Cc: sta...@vger.kernel.org
> > Fixes: 75437bb304b20 (locking/pvqspinlock: Don't wait if vCPU is preempted)
> > Signed-off-by: Wanpeng Li 
> > ---
> >  kernel/locking/qspinlock_paravirt.h | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/kernel/locking/qspinlock_paravirt.h 
> > b/kernel/locking/qspinlock_paravirt.h
> > index 89bab07..e84d21a 100644
> > --- a/kernel/locking/qspinlock_paravirt.h
> > +++ b/kernel/locking/qspinlock_paravirt.h
> > @@ -269,7 +269,7 @@ pv_wait_early(struct pv_node *prev, int loop)
> >   if ((loop & PV_PREV_CHECK_MASK) != 0)
> >   return false;
> >
> > - return READ_ONCE(prev->state) != vcpu_running || 
> > vcpu_is_preempted(prev->cpu);
> > + return READ_ONCE(prev->state) != vcpu_running;
> >  }
> >
> >  /*
>
> There are several possibilities for this performance regression:
>
> 1) Multiple vcpus calling vcpu_is_preempted() repeatedly may cause some
> cacheline contention issue depending on how that callback is implemented.
>
> 2) KVM may set the preempt flag for a short period whenever a vmexit
> happens even if a vmenter is executed shortly after. In this case, we
> may want to use a more durable vcpu suspend flag that indicates the vcpu
> won't get a real vcpu back for a longer period of time.
>
> Perhaps you can add a lock event counter to count the number of
> wait_early events caused by vcpu_is_preempted() being true to see if it
> really causes a lot more wait_early than without the vcpu_is_preempted()
> call.

pv_wait_again:1:179
pv_wait_early:1:189429
pv_wait_head:1:263
pv_wait_node:1:189429
pv_vcpu_is_preempted:1:45588
=sleep 5
pv_wait_again:1:181
pv_wait_early:1:202574
pv_wait_head:1:267
pv_wait_node:1:202590
pv_vcpu_is_preempted:1:46336

The sampling period is 5s; about 6% of the wait_early events in that
period were caused by vcpu_is_preempted() being true.

Wanpeng

[PATCH v2 2/2] nvmem: sprd: Add Spreadtrum SoCs eFuse support

2019-09-09 Thread Baolin Wang

From: Freeman Liu 

The Spreadtrum eFuse controller is widely used to dump chip ID,
configuration setting, function select and so on, as well as
supporting one-time programming.

Signed-off-by: Freeman Liu 
Signed-off-by: Baolin Wang 
---
Changes from v1:
 - None
---
 drivers/nvmem/Kconfig  |   11 ++
 drivers/nvmem/Makefile |2 +
 drivers/nvmem/sprd-efuse.c |  424 
 3 files changed, 437 insertions(+)
 create mode 100644 drivers/nvmem/sprd-efuse.c

diff --git a/drivers/nvmem/Kconfig b/drivers/nvmem/Kconfig
index c2ec750..8fd425d 100644
--- a/drivers/nvmem/Kconfig
+++ b/drivers/nvmem/Kconfig
@@ -230,4 +230,15 @@ config NVMEM_ZYNQMP
 
  If sure, say yes. If unsure, say no.
 
+config SPRD_EFUSE
+   tristate "Spreadtrum SoC eFuse Support"
+   depends on ARCH_SPRD || COMPILE_TEST
+   depends on HAS_IOMEM
+   help
+ This is a simple driver to dump specified values of Spreadtrum
+ SoCs from eFuse.
+
+ This driver can also be built as a module. If so, the module
+ will be called nvmem-sprd-efuse.
+
 endif
diff --git a/drivers/nvmem/Makefile b/drivers/nvmem/Makefile
index e5c153d..7c19870 100644
--- a/drivers/nvmem/Makefile
+++ b/drivers/nvmem/Makefile
@@ -50,3 +50,5 @@ obj-$(CONFIG_SC27XX_EFUSE)+= nvmem-sc27xx-efuse.o
 nvmem-sc27xx-efuse-y   := sc27xx-efuse.o
 obj-$(CONFIG_NVMEM_ZYNQMP) += nvmem_zynqmp_nvmem.o
 nvmem_zynqmp_nvmem-y   := zynqmp_nvmem.o
+obj-$(CONFIG_SPRD_EFUSE)   += nvmem_sprd_efuse.o
+nvmem_sprd_efuse-y := sprd-efuse.o
diff --git a/drivers/nvmem/sprd-efuse.c b/drivers/nvmem/sprd-efuse.c
new file mode 100644
index 000..2f1e0fb
--- /dev/null
+++ b/drivers/nvmem/sprd-efuse.c
@@ -0,0 +1,424 @@
+// SPDX-License-Identifier: GPL-2.0
+// Copyright (C) 2019 Spreadtrum Communications Inc.
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define SPRD_EFUSE_ENABLE  0x20
+#define SPRD_EFUSE_ERR_FLAG0x24
+#define SPRD_EFUSE_ERR_CLR 0x28
+#define SPRD_EFUSE_MAGIC_NUM   0x2c
+#define SPRD_EFUSE_FW_CFG  0x50
+#define SPRD_EFUSE_PW_SWT  0x54
+#define SPRD_EFUSE_MEM(val)(0x1000 + ((val) << 2))
+
+#define SPRD_EFUSE_VDD_EN  BIT(0)
+#define SPRD_EFUSE_AUTO_CHECK_EN   BIT(1)
+#define SPRD_EFUSE_DOUBLE_EN   BIT(2)
+#define SPRD_EFUSE_MARGIN_RD_ENBIT(3)
+#define SPRD_EFUSE_LOCK_WR_EN  BIT(4)
+
+#define SPRD_EFUSE_ERR_CLR_MASKGENMASK(13, 0)
+
+#define SPRD_EFUSE_ENK1_ON BIT(0)
+#define SPRD_EFUSE_ENK2_ON BIT(1)
+#define SPRD_EFUSE_PROG_EN BIT(2)
+
+#define SPRD_EFUSE_MAGIC_NUMBER0x8810
+
+/* Block width (bytes) definitions */
+#define SPRD_EFUSE_BLOCK_WIDTH 4
+
+/*
+ * The Spreadtrum AP efuse contains 2 parts: normal efuse and secure efuse,
+ * and we can only access the normal efuse in kernel. So define the normal
+ * block offset index and normal block numbers.
+ */
+#define SPRD_EFUSE_NORMAL_BLOCK_NUMS   24
+#define SPRD_EFUSE_NORMAL_BLOCK_OFFSET 72
+
+/* Timeout (ms) for the trylock of hardware spinlocks */
+#define SPRD_EFUSE_HWLOCK_TIMEOUT  5000
+
+/*
+ * Different Spreadtrum SoCs can have different normal block counts and
+ * offsets. Some SoCs also support the block double feature: when reading or
+ * writing efuse memory, the controller keeps two copies of the data in case
+ * one copy becomes incorrect after a long period.
+ *
+ * Thus we save these per-SoC settings in the device data structure.
+ */
+struct sprd_efuse_variant_data {
+   u32 blk_nums;
+   u32 blk_offset;
+   bool blk_double;
+};
+
+struct sprd_efuse {
+   struct device *dev;
+   struct clk *clk;
+   struct hwspinlock *hwlock;
+   struct mutex mutex;
+   void __iomem *base;
+   const struct sprd_efuse_variant_data *data;
+};
+
+static const struct sprd_efuse_variant_data ums312_data = {
+   .blk_nums = SPRD_EFUSE_NORMAL_BLOCK_NUMS,
+   .blk_offset = SPRD_EFUSE_NORMAL_BLOCK_OFFSET,
+   .blk_double = false,
+};
+
+/*
+ * On Spreadtrum platforms, multiple subsystems access the single efuse
+ * controller, so we need a hardware spinlock to synchronize between
+ * those subsystems.
+ */
+static int sprd_efuse_lock(struct sprd_efuse *efuse)
+{
+   int ret;
+
+   mutex_lock(&efuse->mutex);
+
+   ret = hwspin_lock_timeout_raw(efuse->hwlock,
+ SPRD_EFUSE_HWLOCK_TIMEOUT);
+   if (ret) {
+   dev_err(efuse->dev, "timeout get the hwspinlock\n");
+   mutex_unlock(&efuse->mutex);
+   return ret;
+   }
+
+   return 0;
+}
+
+static void sprd_efuse_unlock(struct sprd_efuse *efuse)
+{
+   hwspin_unlock_raw(efuse->hwlock);
+   mutex_unlock(&efuse->mutex);
+}
+
+static void sprd_efuse_set_prog_

[PATCH v2 1/2] dt-bindings: nvmem: Add Spreadtrum eFuse controller documentation

2019-09-09 Thread Baolin Wang

From: Freeman Liu 

This patch adds the binding documentation for Spreadtrum eFuse controller.

Signed-off-by: Freeman Liu 
Signed-off-by: Baolin Wang 
Reviewed-by: Rob Herring 
---
Changes from v1:
 - Add reviewed tag from Rob.
---
 .../devicetree/bindings/nvmem/sprd-efuse.txt   |   39 
 1 file changed, 39 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/nvmem/sprd-efuse.txt

diff --git a/Documentation/devicetree/bindings/nvmem/sprd-efuse.txt 
b/Documentation/devicetree/bindings/nvmem/sprd-efuse.txt
new file mode 100644
index 000..96b6fee
--- /dev/null
+++ b/Documentation/devicetree/bindings/nvmem/sprd-efuse.txt
@@ -0,0 +1,39 @@
+= Spreadtrum eFuse device tree bindings =
+
+Required properties:
+- compatible: Should be "sprd,ums312-efuse".
+- reg: Specify the address offset of efuse controller.
+- clock-names: Should be "enable".
+- clocks: The phandle and specifier referencing the controller's clock.
+- hwlocks: Reference to a phandle of a hwlock provider node.
+
+= Data cells =
+Are child nodes of eFuse, bindings of which as described in
+bindings/nvmem/nvmem.txt
+
+Example:
+
+   ap_efuse: efuse@3224 {
+   compatible = "sprd,ums312-efuse";
+   reg = <0 0x3224 0 0x1>;
+   clock-names = "enable";
+   hwlocks = <&hwlock 8>;
+   clocks = <&aonapb_gate CLK_EFUSE_EB>;
+
+   /* Data cells */
+   thermal_calib: calib@10 {
+   reg = <0x10 0x2>;
+   };
+   };
+
+= Data consumers =
+Are device nodes which consume nvmem data cells.
+
+Example:
+
+   thermal {
+   ...
+
+   nvmem-cells = <&thermal_calib>;
+   nvmem-cell-names = "calibration";
+   };
-- 
1.7.9.5

Re: [PATCH 1/1] mm/pgtable/debug: Add test validating architecture page table helpers

2019-09-09 Thread Anshuman Khandual




On 09/10/2019 10:15 AM, Christophe Leroy wrote:
> 
> 
> On 09/10/2019 03:56 AM, Anshuman Khandual wrote:
>>
>>
>> On 09/09/2019 08:43 PM, Kirill A. Shutemov wrote:
>>> On Mon, Sep 09, 2019 at 11:56:50AM +0530, Anshuman Khandual wrote:


 On 09/07/2019 12:33 AM, Gerald Schaefer wrote:
> On Fri, 6 Sep 2019 11:58:59 +0530
> Anshuman Khandual  wrote:
>
>> On 09/05/2019 10:36 PM, Gerald Schaefer wrote:
>>> On Thu, 5 Sep 2019 14:48:14 +0530
>>> Anshuman Khandual  wrote:
>>>   
> [...]
>> +
>> +#if !defined(__PAGETABLE_PMD_FOLDED) && 
>> !defined(__ARCH_HAS_4LEVEL_HACK)
>> +static void pud_clear_tests(pud_t *pudp)
>> +{
>> +    memset(pudp, RANDOM_NZVALUE, sizeof(pud_t));
>> +    pud_clear(pudp);
>> +    WARN_ON(!pud_none(READ_ONCE(*pudp)));
>> +}
>
> For pgd/p4d/pud_clear(), we only clear if the page table level is 
> present
> and not folded. The memset() here overwrites the table type bits, so
> pud_clear() will not clear anything on s390 and the pud_none() check 
> will
> fail.
> Would it be possible to OR a (larger) random value into the table, so 
> that
> the lower 12 bits would be preserved?

 So the suggestion is instead of doing memset() on entry with 
 RANDOM_NZVALUE,
 it should OR a large random value preserving lower 12 bits. Hmm, this 
 should
 still do the trick for other platforms, they just need non zero value. 
 So on
 s390, the lower 12 bits on the page table entry already has valid 
 value while
 entering this function which would make sure that pud_clear() really 
 does
 clear the entry ?
>>>
>>> Yes, in theory the table entry on s390 would have the type set in the 
>>> last
>>> 4 bits, so preserving those would be enough. If it does not conflict 
>>> with
>>> others, I would still suggest preserving all 12 bits since those would 
>>> contain
>>> arch-specific flags in general, just to be sure. For s390, the pte/pmd 
>>> tests
>>> would also work with the memset, but for consistency I think the same 
>>> logic
>>> should be used in all pxd_clear_tests.
>>
>> Makes sense but..
>>
>> There is a small challenge with this. Modifying individual bits on a 
>> given
>> page table entry from generic code like this test case is bit tricky. 
>> That
>> is because there are not enough helpers to create entries with an 
>> absolute
>> value. This would have been easier if all the platforms provided 
>> functions
>> like __pxx() which is not the case now. Otherwise something like this 
>> should
>> have worked.
>>
>>
>> pud_t pud = READ_ONCE(*pudp);
>> pud = __pud(pud_val(pud) | RANDOM_VALUE (keeping lower 12 bits 0))
>> WRITE_ONCE(*pudp, pud);
>>
>> But __pud() will fail to build in many platforms.
>
> Hmm, I simply used this on my system to make pud_clear_tests() work, not
> sure if it works on all archs:
>
> pud_val(*pudp) |= RANDOM_NZVALUE;

 Which compiles on arm64 but then fails on x86 because of the way pmd_val()
 has been defined there.
>>>
>>> Use instead
>>>
>>> *pudp = __pud(pud_val(*pudp) | RANDOM_NZVALUE);
>>
>> Agreed.
>>
>> As I had mentioned before this would have been really the cleanest approach.
>>
>>>
>>> It *should* be more portable.
>>
>> Not really, because not all the platforms have __pxx() definitions right now.
>> Going with these will clearly cause build failures on affected platforms. 
>> Lets
>> examine __pud() for instance. It is defined only on these platforms.
>>
>> arch/arm64/include/asm/pgtable-types.h:    #define __pud(x) ((pud_t) { 
>> (x) } )
>> arch/mips/include/asm/pgtable-64.h:    #define __pud(x) ((pud_t) { (x) })
>> arch/powerpc/include/asm/pgtable-be-types.h:    #define __pud(x) ((pud_t) { 
>> cpu_to_be64(x) })
>> arch/powerpc/include/asm/pgtable-types.h:    #define __pud(x) ((pud_t) { (x) 
>> })
>> arch/s390/include/asm/page.h:    #define __pud(x) ((pud_t) { (x) } )
>> arch/sparc/include/asm/page_64.h:    #define __pud(x) ((pud_t) { (x) } )
>> arch/sparc/include/asm/page_64.h:    #define __pud(x) (x)
>> arch/x86/include/asm/pgtable.h:    #define __pud(x) 
>> native_make_pud(x)
> 
> You missed:
> arch/x86/include/asm/paravirt.h:static inline pud_t __pud(pudval_t val)
> include/asm-generic/pgtable-nop4d-hack.h:#define __pud(x)    
> ((pud_t) { __pgd(x) })
> include/asm-generic/pgtable-nopud.h:#define __pud(x)    ((pud_t) { 
> __p4d(x) })
> 
>>
>> Similarly for __pmd()
>>
>> arch/alpha/include/asm/page.h:    #define __pmd(x)  ((pmd_t) { (x) } 
>> )
>> arch/arm/include/asm/page-nommu.h:    #define __pmd(x)  (x)
>> arch/arm/include/asm/pgtable-2level

Re: [PATCH v6 00/12] implement KASLR for powerpc/fsl_booke/32

2019-09-09 Thread Jason Yan


Hi Scott,

On 2019/8/28 12:05, Scott Wood wrote:

On Fri, 2019-08-09 at 18:07 +0800, Jason Yan wrote:

This series implements KASLR for powerpc/fsl_booke/32, as a security
feature that deters exploit attempts relying on knowledge of the location
of kernel internals.

Since CONFIG_RELOCATABLE is already supported, what we need to do is
map or copy kernel to a proper place and relocate.


Have you tested this with a kernel that was loaded at a non-zero address?  I
tried loading a kernel at 0x0400 (by changing the address in the uImage,
and setting bootm_low to 0400 in U-Boot), and it works without
CONFIG_RANDOMIZE and fails with.



How did you change the load address of the uImage, by changing the
kernel config CONFIG_PHYSICAL_START or the "-a/-e" parameter of mkimage?
I tried both, but it did not work with or without CONFIG_RANDOMIZE.


Thanks,
Jason


  Freescale Book-E
parts expect lowmem to be mapped by fixed TLB entries(TLB1). The TLB1
entries are not suitable to map the kernel directly in a randomized
region, so we chose to copy the kernel to a proper place and restart to
relocate.

Entropy is derived from the banner and timer base, which will change every
build and boot. This not so much safe so additionally the bootloader may
pass entropy via the /chosen/kaslr-seed node in device tree.


How complicated would it be to directly access the HW RNG (if present) that
early in the boot?  It'd be nice if a U-Boot update weren't required (and
particularly concerning that KASLR would appear to work without a U-Boot
update, but without decent entropy).

-Scott



.

Re: [PATCH v2] Staging: gasket: Use temporaries to reduce line length.

2019-09-09 Thread Sandro Volery LKML

Wow... I checked, compiled and still sent the wrong thing again. I'm gonna
have to give this up soon if I can't get it right.

Sandro V

> On 10 Sep 2019, at 07:06, Sandro Volery  wrote:
> 
> Using temporaries for gasket_page_table entries to remove scnprintf()
> statements and reduce line length, as suggested by Joe Perches. Thanks!
> 
> Signed-off-by: Sandro Volery 
> ---
> drivers/staging/gasket/apex_driver.c | 20 +---
> 1 file changed, 9 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/staging/gasket/apex_driver.c 
> b/drivers/staging/gasket/apex_driver.c
> index 2973bb920a26..16ac4329d65f 100644
> --- a/drivers/staging/gasket/apex_driver.c
> +++ b/drivers/staging/gasket/apex_driver.c
> @@ -509,6 +509,8 @@ static ssize_t sysfs_show(struct device *device, struct 
> device_attribute *attr,
>struct gasket_dev *gasket_dev;
>struct gasket_sysfs_attribute *gasket_attr;
>enum sysfs_attribute_type type;
> +struct gasket_page_table *gpt;
> +uint val;
> 
>gasket_dev = gasket_sysfs_get_device_data(device);
>if (!gasket_dev) {
> @@ -524,29 +526,25 @@ static ssize_t sysfs_show(struct device *device, struct 
> device_attribute *attr,
>}
> 
>type = (enum sysfs_attribute_type)gasket_attr->data.attr_type;
> +gpt = gasket_dev->page_table[0];
>switch (type) {
>case ATTR_KERNEL_HIB_PAGE_TABLE_SIZE:
> -ret = scnprintf(buf, PAGE_SIZE, "%u\n",
> -gasket_page_table_num_entries(
> -gasket_dev->page_table[0]));
> +val = gasket_page_table_num_simple_entries(gpt);
>break;
>case ATTR_KERNEL_HIB_SIMPLE_PAGE_TABLE_SIZE:
> -ret = scnprintf(buf, PAGE_SIZE, "%u\n",
> -gasket_page_table_num_simple_entries(
> -gasket_dev->page_table[0]));
> +val = gasket_page_table_num_simple_entries(gpt);
>break;
>case ATTR_KERNEL_HIB_NUM_ACTIVE_PAGES:
> -ret = scnprintf(buf, PAGE_SIZE, "%u\n",
> -gasket_page_table_num_active_pages(
> -gasket_dev->page_table[0]));
> +val = gasket_page_table_num_active_pages(gpt);
>break;
>default:
>dev_dbg(gasket_dev->dev, "Unknown attribute: %s\n",
>attr->attr.name);
>ret = 0;
> -break;
> +goto exit;
>}
> -
> +ret = scnprintf(buf, PAGE_SIZE, "%u\n", val);
> +exit:
>gasket_sysfs_put_attr(device, gasket_attr);
>gasket_sysfs_put_device_data(device, gasket_dev);
>return ret;
> -- 
> 2.23.0
>

[PATCH v2] Staging: gasket: Use temporaries to reduce line length.

2019-09-09 Thread Sandro Volery

Using temporaries for gasket_page_table entries to remove scnprintf()
statements and reduce line length, as suggested by Joe Perches. Thanks!

Signed-off-by: Sandro Volery 
---
 drivers/staging/gasket/apex_driver.c | 20 +---
 1 file changed, 9 insertions(+), 11 deletions(-)

diff --git a/drivers/staging/gasket/apex_driver.c 
b/drivers/staging/gasket/apex_driver.c
index 2973bb920a26..16ac4329d65f 100644
--- a/drivers/staging/gasket/apex_driver.c
+++ b/drivers/staging/gasket/apex_driver.c
@@ -509,6 +509,8 @@ static ssize_t sysfs_show(struct device *device, struct 
device_attribute *attr,
struct gasket_dev *gasket_dev;
struct gasket_sysfs_attribute *gasket_attr;
enum sysfs_attribute_type type;
+   struct gasket_page_table *gpt;
+   uint val;
 
gasket_dev = gasket_sysfs_get_device_data(device);
if (!gasket_dev) {
@@ -524,29 +526,25 @@ static ssize_t sysfs_show(struct device *device, struct 
device_attribute *attr,
}
 
type = (enum sysfs_attribute_type)gasket_attr->data.attr_type;
+   gpt = gasket_dev->page_table[0];
switch (type) {
case ATTR_KERNEL_HIB_PAGE_TABLE_SIZE:
-   ret = scnprintf(buf, PAGE_SIZE, "%u\n",
-   gasket_page_table_num_entries(
-   gasket_dev->page_table[0]));
+   val = gasket_page_table_num_simple_entries(gpt);
break;
case ATTR_KERNEL_HIB_SIMPLE_PAGE_TABLE_SIZE:
-   ret = scnprintf(buf, PAGE_SIZE, "%u\n",
-   gasket_page_table_num_simple_entries(
-   gasket_dev->page_table[0]));
+   val = gasket_page_table_num_simple_entries(gpt);
break;
case ATTR_KERNEL_HIB_NUM_ACTIVE_PAGES:
-   ret = scnprintf(buf, PAGE_SIZE, "%u\n",
-   gasket_page_table_num_active_pages(
-   gasket_dev->page_table[0]));
+   val = gasket_page_table_num_active_pages(gpt);
break;
default:
dev_dbg(gasket_dev->dev, "Unknown attribute: %s\n",
attr->attr.name);
ret = 0;
-   break;
+   goto exit;
}
-
+   ret = scnprintf(buf, PAGE_SIZE, "%u\n", val);
+exit:
gasket_sysfs_put_attr(device, gasket_attr);
gasket_sysfs_put_device_data(device, gasket_dev);
return ret;
-- 
2.23.0

Re: [PATCH] Staging: gasket: Use temporaries to reduce line length.

2019-09-09 Thread Sandro Volery




> On 10 Sep 2019, at 00:30, Joe Perches  wrote:
> 
> On Mon, 2019-09-09 at 22:28 +0200, Sandro Volery wrote:
>> Using temporaries for gasket_page_table entries to remove scnprintf()
>> statements and reduce line length, as suggested by Joe Perches. Thanks!
> 
> nak.  Slow down.  You broke the code.
> 
> Please be _way_ more careful and verify for yourself
> the code you submit _before_ you submit it.
> 
> compile/test/verify, twice if necessary.
> 

Shoot. I'm sorry I'm just really trying to get into all this...


> You also should have cc'd me on this patch.
> 

Will do! I'll submit v2 this afternoon.

>> diff --git a/drivers/staging/gasket/apex_driver.c 
>> b/drivers/staging/gasket/apex_driver.c
> []
>> @@ -524,29 +526,25 @@ static ssize_t sysfs_show(struct device *device, 
>> struct device_attribute *attr,
>>}
>> 
>>type = (enum sysfs_attribute_type)gasket_attr->data.attr_type;
>> +gpt = gasket_dev->page_table[0];
>>switch (type) {
>>case ATTR_KERNEL_HIB_PAGE_TABLE_SIZE:
>> -ret = scnprintf(buf, PAGE_SIZE, "%u\n",
>> -gasket_page_table_num_entries(
>> -gasket_dev->page_table[0]));
>> +val = gasket_page_table_num_simple_entries(gpt);
> 
> You likely duplicated this line via copy/paste.
> This should be:
>val = gasket_page_table_num_entries(gpt);
> 

Thanks for being patient with me so far... I'd imagine others would've freaked 
out at me by now :)

Re: [PATCH 1/1] mm/pgtable/debug: Add test validating architecture page table helpers

2019-09-09 Thread Christophe Leroy





On 09/10/2019 03:56 AM, Anshuman Khandual wrote:



On 09/09/2019 08:43 PM, Kirill A. Shutemov wrote:

On Mon, Sep 09, 2019 at 11:56:50AM +0530, Anshuman Khandual wrote:



On 09/07/2019 12:33 AM, Gerald Schaefer wrote:

On Fri, 6 Sep 2019 11:58:59 +0530
Anshuman Khandual  wrote:


On 09/05/2019 10:36 PM, Gerald Schaefer wrote:

On Thu, 5 Sep 2019 14:48:14 +0530
Anshuman Khandual  wrote:
   

[...]

+
+#if !defined(__PAGETABLE_PMD_FOLDED) && !defined(__ARCH_HAS_4LEVEL_HACK)
+static void pud_clear_tests(pud_t *pudp)
+{
+   memset(pudp, RANDOM_NZVALUE, sizeof(pud_t));
+   pud_clear(pudp);
+   WARN_ON(!pud_none(READ_ONCE(*pudp)));
+}


For pgd/p4d/pud_clear(), we only clear if the page table level is present
and not folded. The memset() here overwrites the table type bits, so
pud_clear() will not clear anything on s390 and the pud_none() check will
fail.
Would it be possible to OR a (larger) random value into the table, so that
the lower 12 bits would be preserved?


So the suggestion is instead of doing memset() on entry with RANDOM_NZVALUE,
it should OR a large random value preserving lower 12 bits. Hmm, this should
still do the trick for other platforms, they just need non zero value. So on
s390, the lower 12 bits on the page table entry already has valid value while
entering this function which would make sure that pud_clear() really does
clear the entry ?


Yes, in theory the table entry on s390 would have the type set in the last
4 bits, so preserving those would be enough. If it does not conflict with
others, I would still suggest preserving all 12 bits since those would contain
arch-specific flags in general, just to be sure. For s390, the pte/pmd tests
would also work with the memset, but for consistency I think the same logic
should be used in all pxd_clear_tests.


Makes sense but..

There is a small challenge with this. Modifying individual bits on a given
page table entry from generic code like this test case is bit tricky. That
is because there are not enough helpers to create entries with an absolute
value. This would have been easier if all the platforms provided functions
like __pxx() which is not the case now. Otherwise something like this should
have worked.


pud_t pud = READ_ONCE(*pudp);
pud = __pud(pud_val(pud) | RANDOM_VALUE (keeping lower 12 bits 0))
WRITE_ONCE(*pudp, pud);

But __pud() will fail to build in many platforms.


Hmm, I simply used this on my system to make pud_clear_tests() work, not
sure if it works on all archs:

pud_val(*pudp) |= RANDOM_NZVALUE;


Which compiles on arm64 but then fails on x86 because of the way pmd_val()
has been defined there.


Use instead

*pudp = __pud(pud_val(*pudp) | RANDOM_NZVALUE);


Agreed.

As I had mentioned before this would have been really the cleanest approach.



It *should* be more portable.


Not really, because not all the platforms have __pxx() definitions right now.
Going with these will clearly cause build failures on affected platforms. Lets
examine __pud() for instance. It is defined only on these platforms.

arch/arm64/include/asm/pgtable-types.h: #define __pud(x) ((pud_t) { (x) 
} )
arch/mips/include/asm/pgtable-64.h: #define __pud(x) ((pud_t) { (x) 
})
arch/powerpc/include/asm/pgtable-be-types.h:#define __pud(x) ((pud_t) { 
cpu_to_be64(x) })
arch/powerpc/include/asm/pgtable-types.h:   #define __pud(x) ((pud_t) { (x) 
})
arch/s390/include/asm/page.h:   #define __pud(x) ((pud_t) { (x) 
} )
arch/sparc/include/asm/page_64.h:   #define __pud(x) ((pud_t) { (x) 
} )
arch/sparc/include/asm/page_64.h:   #define __pud(x) (x)
arch/x86/include/asm/pgtable.h: #define __pud(x) 
native_make_pud(x)


You missed:
arch/x86/include/asm/paravirt.h:static inline pud_t __pud(pudval_t val)
include/asm-generic/pgtable-nop4d-hack.h:#define __pud(x) 
   ((pud_t) { __pgd(x) })
include/asm-generic/pgtable-nopud.h:#define __pud(x) 
   ((pud_t) { __p4d(x) })




Similarly for __pmd()

arch/alpha/include/asm/page.h:  #define __pmd(x)  ((pmd_t) { 
(x) } )
arch/arm/include/asm/page-nommu.h:  #define __pmd(x)  (x)
arch/arm/include/asm/pgtable-2level-types.h:#define __pmd(x)  ((pmd_t) { 
(x) } )
arch/arm/include/asm/pgtable-2level-types.h:#define __pmd(x)  (x)
arch/arm/include/asm/pgtable-3level-types.h:#define __pmd(x)  ((pmd_t) { 
(x) } )
arch/arm/include/asm/pgtable-3level-types.h:#define __pmd(x)  (x)
arch/arm64/include/asm/pgtable-types.h: #define __pmd(x)  ((pmd_t) { 
(x) } )
arch/m68k/include/asm/page.h:   #define __pmd(x)  ((pmd_t) { { 
(x) }, })
arch/mips/include/asm/pgtable-64.h: #define __pmd(x)  ((pmd_t) { 
(x) } )
arch/nds32/include/asm/page.h:  #define __pmd(x)  (x)
arch/parisc/include/asm/page.h: #define __pmd(x)  ((pmd_t) { 
(x) } )
arch/parisc/include/asm/page.h: #define __pmd(x)  (x)
arch/powerp

[PATCH 1/8] x86/platform/uv: Save OEM_ID from ACPI MADT probe

2019-09-09 Thread Mike Travis

Save the OEM_ID and OEM_TABLE_ID passed to the apic driver probe function
for later use.  Also, convert the char list arg passed from the kernel
to a true null-terminated string.

Signed-off-by: Mike Travis 
Reviewed-by: Steve Wahl 
Reviewed-by: Dimitri Sivanich 
To: Thomas Gleixner 
To: Ingo Molnar 
To: H. Peter Anvin 
To: Andrew Morton 
To: Borislav Petkov 
To: Christoph Hellwig 
To: Sasha Levin 
Cc: Dimitri Sivanich 
Cc: Russ Anderson 
Cc: Hedi Berriche 
Cc: Steve Wahl 
Cc: Justin Ernst 
Cc: x...@kernel.org
Cc: linux-kernel@vger.kernel.org
---
 arch/x86/kernel/apic/x2apic_uv_x.c |   16 +++-
 1 file changed, 15 insertions(+), 1 deletion(-)

--- linux.orig/arch/x86/kernel/apic/x2apic_uv_x.c
+++ linux/arch/x86/kernel/apic/x2apic_uv_x.c
@@ -14,6 +14,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -31,6 +32,10 @@ static u64   gru_dist_base, gru_first_no
 static u64 gru_dist_lmask, gru_dist_umask;
 static union uvh_apiciduvh_apicid;
 
+/* Unpack OEM/TABLE ID's to be NULL terminated strings */
+static u8 oem_id[ACPI_OEM_ID_SIZE + 1];
+static u8 oem_table_id[ACPI_OEM_TABLE_ID_SIZE + 1];
+
 /* Information derived from CPUID: */
 static struct {
unsigned int apicid_shift;
@@ -248,11 +253,20 @@ static void __init uv_set_apicid_hibit(v
}
 }
 
-static int __init uv_acpi_madt_oem_check(char *oem_id, char *oem_table_id)
+static void __init uv_stringify(int len, char *to, char *from)
+{
+   /* Relies on 'to' being NULL chars so result will be NULL terminated */
+   strncpy(to, from, len-1);
+}
+
+static int __init uv_acpi_madt_oem_check(char *_oem_id, char *_oem_table_id)
 {
int pnodeid;
int uv_apic;
 
+   uv_stringify(sizeof(oem_id), oem_id, _oem_id);
+   uv_stringify(sizeof(oem_table_id), oem_table_id, _oem_table_id);
+
if (strncmp(oem_id, "SGI", 3) != 0) {
if (strncmp(oem_id, "NSGI", 4) == 0) {
uv_hubless_system = true;

--

[PATCH V2 6/8] x86/platform/uv: Decode UVsystab Info

2019-09-09 Thread Mike Travis

Decode the hubless UVsystab passed from BIOS to the kernel saving
pertinent info in a similar manner that hubbed UVsystabs are decoded.

Signed-off-by: Mike Travis 
Reviewed-by: Steve Wahl 
Reviewed-by: Dimitri Sivanich 
To: Thomas Gleixner 
To: Ingo Molnar 
To: H. Peter Anvin 
To: Andrew Morton 
To: Borislav Petkov 
To: Christoph Hellwig 
To: Sasha Levin 
Cc: Dimitri Sivanich 
Cc: Russ Anderson 
Cc: Hedi Berriche 
Cc: Steve Wahl 
Cc: Justin Ernst 
Cc: x...@kernel.org
Cc: linux-kernel@vger.kernel.org
---
V2: Removed redundant error message after call to uv_bios_init.
Removed redundant error message after call to decode_uv_systab.
Clarify selection of UV4 and higher when checking for extended UVsystab
in decode_uv_systab().
---
 arch/x86/kernel/apic/x2apic_uv_x.c |   12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

--- linux.orig/arch/x86/kernel/apic/x2apic_uv_x.c
+++ linux/arch/x86/kernel/apic/x2apic_uv_x.c
@@ -1303,7 +1303,8 @@ static int __init decode_uv_systab(void)
struct uv_systab *st;
int i;
 
-   if (uv_hub_info->hub_revision < UV4_HUB_REVISION_BASE)
+   /* If system is uv3 or lower, there is no extended UVsystab */
+   if (is_uv_hubbed(0xfe) < uv(4) && is_uv_hubless(0xfe) < uv(4))
return 0;   /* No extended UVsystab required */
 
st = uv_systab;
@@ -1554,8 +1555,15 @@ static __init int uv_system_init_hubless
 
/* Init kernel/BIOS interface */
rc = uv_bios_init();
+   if (rc < 0)
+   return rc;
 
-   /* Create user access node if UVsystab available */
+   /* Process UVsystab */
+   rc = decode_uv_systab();
+   if (rc < 0)
+   return rc;
+
+   /* Create user access node */
if (rc >= 0)
uv_setup_proc_files(1);
 

--

[PATCH 3/8] x86/platform/uv: Add return code to UV BIOS Init function

2019-09-09 Thread Mike Travis

Add a return code to the UV BIOS init function that indicates the 
successful initialization of the kernel/BIOS callback interface.

Signed-off-by: Mike Travis 
Reviewed-by: Steve Wahl 
Reviewed-by: Dimitri Sivanich 
To: Thomas Gleixner 
To: Ingo Molnar 
To: H. Peter Anvin 
To: Andrew Morton 
To: Borislav Petkov 
To: Christoph Hellwig 
To: Sasha Levin 
Cc: Dimitri Sivanich 
Cc: Russ Anderson 
Cc: Hedi Berriche 
Cc: Steve Wahl 
Cc: Justin Ernst 
Cc: x...@kernel.org
Cc: linux-kernel@vger.kernel.org
---
 arch/x86/include/asm/uv/bios.h |2 +-
 arch/x86/platform/uv/bios_uv.c |9 +
 2 files changed, 6 insertions(+), 5 deletions(-)

--- linux.orig/arch/x86/include/asm/uv/bios.h
+++ linux/arch/x86/include/asm/uv/bios.h
@@ -138,7 +138,7 @@ extern s64 uv_bios_change_memprotect(u64
 extern s64 uv_bios_reserved_page_pa(u64, u64 *, u64 *, u64 *);
 extern int uv_bios_set_legacy_vga_target(bool decode, int domain, int bus);
 
-extern void uv_bios_init(void);
+extern int uv_bios_init(void);
 
 extern unsigned long sn_rtc_cycles_per_second;
 extern int uv_type;
--- linux.orig/arch/x86/platform/uv/bios_uv.c
+++ linux/arch/x86/platform/uv/bios_uv.c
@@ -184,20 +184,20 @@ int uv_bios_set_legacy_vga_target(bool d
 }
 EXPORT_SYMBOL_GPL(uv_bios_set_legacy_vga_target);
 
-void uv_bios_init(void)
+int uv_bios_init(void)
 {
uv_systab = NULL;
if ((uv_systab_phys == EFI_INVALID_TABLE_ADDR) ||
!uv_systab_phys || efi_runtime_disabled()) {
pr_crit("UV: UVsystab: missing\n");
-   return;
+   return -EEXIST;
}
 
uv_systab = ioremap(uv_systab_phys, sizeof(struct uv_systab));
if (!uv_systab || strncmp(uv_systab->signature, UV_SYSTAB_SIG, 4)) {
pr_err("UV: UVsystab: bad signature!\n");
iounmap(uv_systab);
-   return;
+   return -EINVAL;
}
 
/* Starting with UV4 the UV systab size is variable */
@@ -208,8 +208,9 @@ void uv_bios_init(void)
uv_systab = ioremap(uv_systab_phys, size);
if (!uv_systab) {
pr_err("UV: UVsystab: ioremap(%d) failed!\n", size);
-   return;
+   return -EFAULT;
}
}
pr_info("UV: UVsystab: Revision:%x\n", uv_systab->revision);
+   return 0;
 }

--

[PATCH V2 5/8] x86/platform/uv: Add UV Hubbed/Hubless Proc FS Files

2019-09-09 Thread Mike Travis

Indicate to UV user utilities that UV hubless support is available on
this system via the existing /proc interface.  The current interface is
maintained with the addition of new /proc leaves ("hubbed", "hubless",
and "oemid") that contain the specific type of UV arch this one is.

Signed-off-by: Mike Travis 
Reviewed-by: Steve Wahl 
Reviewed-by: Dimitri Sivanich 
To: Thomas Gleixner 
To: Ingo Molnar 
To: H. Peter Anvin 
To: Andrew Morton 
To: Borislav Petkov 
To: Christoph Hellwig 
To: Sasha Levin 
Cc: Dimitri Sivanich 
Cc: Russ Anderson 
Cc: Hedi Berriche 
Cc: Steve Wahl 
Cc: Justin Ernst 
Cc: x...@kernel.org
Cc: linux-kernel@vger.kernel.org
---
V2: Remove is_uv_hubbed define
Remove leading '_' from _is_uv_hubbed
---
 arch/x86/include/asm/uv/uv.h   |4 +
 arch/x86/kernel/apic/x2apic_uv_x.c |   93 -
 2 files changed, 96 insertions(+), 1 deletion(-)

--- linux.orig/arch/x86/include/asm/uv/uv.h
+++ linux/arch/x86/include/asm/uv/uv.h
@@ -12,6 +12,8 @@ struct mm_struct;
 #ifdef CONFIG_X86_UV
 #include 
 
+#define UV_PROC_NODE "sgi_uv"
+
 static inline int uv(int uvtype)
 {
/* uv(0) is "any" */
@@ -28,6 +30,7 @@ static inline bool is_early_uv_system(vo
return uv_systab_phys && uv_systab_phys != EFI_INVALID_TABLE_ADDR;
 }
 extern int is_uv_system(void);
+extern int is_uv_hubbed(int uvtype);
 extern int is_uv_hubless(int uvtype);
 extern void uv_cpu_init(void);
 extern void uv_nmi_init(void);
@@ -40,6 +43,7 @@ extern const struct cpumask *uv_flush_tl
 static inline enum uv_system_type get_uv_system_type(void) { return UV_NONE; }
 static inline bool is_early_uv_system(void){ return 0; }
 static inline int is_uv_system(void)   { return 0; }
+static inline int is_uv_hubbed(int uv) { return 0; }
 static inline int is_uv_hubless(int uv) { return 0; }
 static inline void uv_cpu_init(void)   { }
 static inline void uv_system_init(void){ }
--- linux.orig/arch/x86/kernel/apic/x2apic_uv_x.c
+++ linux/arch/x86/kernel/apic/x2apic_uv_x.c
@@ -26,6 +26,7 @@
 static DEFINE_PER_CPU(int, x2apic_extra_bits);
 
 static enum uv_system_type uv_system_type;
+static int uv_hubbed_system;
 static int uv_hubless_system;
 static u64 gru_start_paddr, gru_end_paddr;
 static u64 gru_dist_base, gru_first_node_paddr = -1LL, gru_last_node_paddr;
@@ -309,6 +310,24 @@ static int __init uv_acpi_madt_oem_check
if (uv_hub_info->hub_revision == 0)
goto badbios;
 
+   switch (uv_hub_info->hub_revision) {
+   case UV4_HUB_REVISION_BASE:
+   uv_hubbed_system = 0x11;
+   break;
+
+   case UV3_HUB_REVISION_BASE:
+   uv_hubbed_system = 0x9;
+   break;
+
+   case UV2_HUB_REVISION_BASE:
+   uv_hubbed_system = 0x5;
+   break;
+
+   case UV1_HUB_REVISION_BASE:
+   uv_hubbed_system = 0x3;
+   break;
+   }
+
pnodeid = early_get_pnodeid();
early_get_apic_socketid_shift();
 
@@ -359,6 +378,12 @@ int is_uv_system(void)
 }
 EXPORT_SYMBOL_GPL(is_uv_system);
 
+int is_uv_hubbed(int uvtype)
+{
+   return (uv_hubbed_system & uvtype);
+}
+EXPORT_SYMBOL_GPL(is_uv_hubbed);
+
 int is_uv_hubless(int uvtype)
 {
return (uv_hubless_system & uvtype);
@@ -1457,6 +1482,68 @@ static void __init build_socket_tables(v
}
 }
 
+/* Setup user proc fs files */
+static int proc_hubbed_show(struct seq_file *file, void *data)
+{
+   seq_printf(file, "0x%x\n", uv_hubbed_system);
+   return 0;
+}
+
+static int proc_hubless_show(struct seq_file *file, void *data)
+{
+   seq_printf(file, "0x%x\n", uv_hubless_system);
+   return 0;
+}
+
+static int proc_oemid_show(struct seq_file *file, void *data)
+{
+   seq_printf(file, "%s/%s\n", oem_id, oem_table_id);
+   return 0;
+}
+
+static int proc_hubbed_open(struct inode *inode, struct file *file)
+{
+   return single_open(file, proc_hubbed_show, (void *)NULL);
+}
+
+static int proc_hubless_open(struct inode *inode, struct file *file)
+{
+   return single_open(file, proc_hubless_show, (void *)NULL);
+}
+
+static int proc_oemid_open(struct inode *inode, struct file *file)
+{
+   return single_open(file, proc_oemid_show, (void *)NULL);
+}
+
+/* (struct is "non-const" as open function is set at runtime) */
+static struct file_operations proc_version_fops = {
+   .read   = seq_read,
+   .llseek = seq_lseek,
+   .release= single_release,
+};
+
+static const struct file_operations proc_oemid_fops = {
+   .open   = proc_oemid_open,
+   .read   = seq_read,
+   .llseek = seq_lseek,
+   .release= single_release,
+};
+
+static __init void uv_setup_proc_files(int hubless)
+{
+   struct proc_dir_entry *pde;
+   char *name = hubless ? "hubless" : "hubbed";
+
+   pde = proc_mkdir(UV_PROC_NODE,

[PATCH V2 2/8] x86/platform/uv: Return UV Hubless System Type

2019-09-09 Thread Mike Travis

Return the type of UV hubless system for UV specific code that depends
on that.  Use a define to indicate the change in arg type for this
function in uv.h.  Add a function to convert UV system type to bit
pattern needed for is_uv_hubless().

Signed-off-by: Mike Travis 
Reviewed-by: Steve Wahl 
Reviewed-by: Dimitri Sivanich 
To: Thomas Gleixner 
To: Ingo Molnar 
To: H. Peter Anvin 
To: Andrew Morton 
To: Borislav Petkov 
To: Christoph Hellwig 
To: Sasha Levin 
Cc: Dimitri Sivanich 
Cc: Russ Anderson 
Cc: Hedi Berriche 
Cc: Steve Wahl 
Cc: Justin Ernst 
Cc: x...@kernel.org
Cc: linux-kernel@vger.kernel.org
---
V2: Remove is_uv_hubless define
Remove leading '_' from _is_uv_hubless
---
 arch/x86/include/asm/uv/uv.h   |   12 ++--
 arch/x86/kernel/apic/x2apic_uv_x.c |   27 ++-
 2 files changed, 28 insertions(+), 11 deletions(-)

--- linux.orig/arch/x86/include/asm/uv/uv.h
+++ linux/arch/x86/include/asm/uv/uv.h
@@ -12,6 +12,14 @@ struct mm_struct;
 #ifdef CONFIG_X86_UV
 #include 
 
+static inline int uv(int uvtype)
+{
+   /* uv(0) is "any" */
+   if (uvtype >= 0 && uvtype <= 30)
+   return 1 << uvtype;
+   return 1;
+}
+
 extern unsigned long uv_systab_phys;
 
 extern enum uv_system_type get_uv_system_type(void);
@@ -20,7 +28,7 @@ static inline bool is_early_uv_system(vo
return uv_systab_phys && uv_systab_phys != EFI_INVALID_TABLE_ADDR;
 }
 extern int is_uv_system(void);
-extern int is_uv_hubless(void);
+extern int is_uv_hubless(int uvtype);
 extern void uv_cpu_init(void);
 extern void uv_nmi_init(void);
 extern void uv_system_init(void);
@@ -32,7 +40,7 @@ extern const struct cpumask *uv_flush_tl
 static inline enum uv_system_type get_uv_system_type(void) { return UV_NONE; }
 static inline bool is_early_uv_system(void){ return 0; }
 static inline int is_uv_system(void)   { return 0; }
-static inline int is_uv_hubless(void)  { return 0; }
+static inline int is_uv_hubless(int uv) { return 0; }
 static inline void uv_cpu_init(void)   { }
 static inline void uv_system_init(void){ }
 static inline const struct cpumask *
--- linux.orig/arch/x86/kernel/apic/x2apic_uv_x.c
+++ linux/arch/x86/kernel/apic/x2apic_uv_x.c
@@ -26,7 +26,7 @@
 static DEFINE_PER_CPU(int, x2apic_extra_bits);
 
 static enum uv_system_type uv_system_type;
-static bool uv_hubless_system;
+static int uv_hubless_system;
 static u64 gru_start_paddr, gru_end_paddr;
 static u64 gru_dist_base, gru_first_node_paddr = -1LL, gru_last_node_paddr;
 static u64 gru_dist_lmask, gru_dist_umask;
@@ -268,11 +268,20 @@ static int __init uv_acpi_madt_oem_check
uv_stringify(sizeof(oem_table_id), oem_table_id, _oem_table_id);
 
if (strncmp(oem_id, "SGI", 3) != 0) {
-   if (strncmp(oem_id, "NSGI", 4) == 0) {
-   uv_hubless_system = true;
-   pr_info("UV: OEM IDs %s/%s, HUBLESS\n",
-   oem_id, oem_table_id);
-   }
+   if (strncmp(oem_id, "NSGI", 4) != 0)
+   return 0;
+
+   /* UV4 Hubless, CH, (0x11:UV4+Any) */
+   if (strncmp(oem_id, "NSGI4", 5) == 0)
+   uv_hubless_system = 0x11;
+
+   /* UV3 Hubless, UV300/MC990X w/o hub (0x9:UV3+Any) */
+   else
+   uv_hubless_system = 0x9;
+
+   pr_info("UV: OEM IDs %s/%s, HUBLESS(0x%x)\n",
+   oem_id, oem_table_id, uv_hubless_system);
+
return 0;
}
 
@@ -350,9 +359,9 @@ int is_uv_system(void)
 }
 EXPORT_SYMBOL_GPL(is_uv_system);
 
-int is_uv_hubless(void)
+int is_uv_hubless(int uvtype)
 {
-   return uv_hubless_system;
+   return (uv_hubless_system & uvtype);
 }
 EXPORT_SYMBOL_GPL(is_uv_hubless);
 
@@ -1592,7 +1601,7 @@ static void __init uv_system_init_hub(vo
  */
 void __init uv_system_init(void)
 {
-   if (likely(!is_uv_system() && !is_uv_hubless()))
+   if (likely(!is_uv_system() && !is_uv_hubless(1)))
return;
 
if (is_uv_system())
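
The bit-pattern scheme used by this patch can be tried out in userspace. The sketch below copies uv() from the diff; hubless_mask_from_oem_id() and is_uv_hubless_model() are hypothetical helpers that take the mask as a parameter instead of using the kernel's file-scope uv_hubless_system variable.

```c
#include <assert.h>
#include <string.h>

/* Same logic as uv() in the patch: bit N selects UV type N, bit 0 is "any". */
static inline int uv(int uvtype)
{
	if (uvtype >= 0 && uvtype <= 30)
		return 1 << uvtype;
	return 1;
}

/* Hypothetical helper mirroring the OEM ID test in uv_acpi_madt_oem_check(). */
static int hubless_mask_from_oem_id(const char *oem_id)
{
	assert(oem_id != NULL);

	if (strncmp(oem_id, "NSGI", 4) != 0)
		return 0;		/* not a hubless UV system */
	if (strncmp(oem_id, "NSGI4", 5) == 0)
		return 0x11;		/* UV4 bit + "any" bit */
	return 0x9;			/* UV3 bit + "any" bit */
}

/* Hypothetical model of is_uv_hubless(): nonzero when any requested bit is set. */
static int is_uv_hubless_model(int hubless_mask, int uvtype)
{
	return hubless_mask & uvtype;
}
```

Because bit 0 is set for every hubless type, a caller that only cares whether the system is hubless at all passes 1 (as uv_system_init() does above), while a caller that needs a specific generation passes uv(3) or uv(4).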

--

[PATCH 4/8] x86/platform/uv: Setup UV functions for Hubless UV Systems

2019-09-09 Thread Mike Travis

Add more support for UV systems that do not contain a UV Hub (AKA
"hubless").  This update adds support for additional functions required:

Use PCH NMI handler instead of a UV Hub NMI handler.

Initialize the UV BIOS callback interface used to support specific
UV functions.

Signed-off-by: Mike Travis 
Reviewed-by: Steve Wahl 
Reviewed-by: Dimitri Sivanich 
To: Thomas Gleixner 
To: Ingo Molnar 
To: H. Peter Anvin 
To: Andrew Morton 
To: Borislav Petkov 
To: Christoph Hellwig 
To: Sasha Levin 
Cc: Dimitri Sivanich 
Cc: Russ Anderson 
Cc: Hedi Berriche 
Cc: Steve Wahl 
Cc: Justin Ernst 
Cc: x...@kernel.org
Cc: linux-kernel@vger.kernel.org
---
 arch/x86/kernel/apic/x2apic_uv_x.c |   20 +---
 1 file changed, 17 insertions(+), 3 deletions(-)

--- linux.orig/arch/x86/kernel/apic/x2apic_uv_x.c
+++ linux/arch/x86/kernel/apic/x2apic_uv_x.c
@@ -1457,6 +1457,20 @@ static void __init build_socket_tables(v
}
 }
 
+/* Initialize UV hubless systems */
+static __init int uv_system_init_hubless(void)
+{
+   int rc;
+
+   /* Setup PCH NMI handler */
+   uv_nmi_setup_hubless();
+
+   /* Init kernel/BIOS interface */
+   rc = uv_bios_init();
+
+   return rc;
+}
+
 static void __init uv_system_init_hub(void)
 {
struct uv_hub_info_s hub_info = {0};
@@ -1596,8 +1610,8 @@ static void __init uv_system_init_hub(vo
 }
 
 /*
- * There is a small amount of UV specific code needed to initialize a
- * UV system that does not have a "UV HUB" (referred to as "hubless").
+ * There is a different code path needed to initialize a UV system that does
+ * not have a "UV HUB" (referred to as "hubless").
  */
 void __init uv_system_init(void)
 {
@@ -1607,7 +1621,7 @@ void __init uv_system_init(void)
if (is_uv_system())
uv_system_init_hub();
else
-   uv_nmi_setup_hubless();
+   uv_system_init_hubless();
 }
 
 apic_driver(apic_x2apic_uv_x);

--

[PATCH 7/8] x86/platform/uv: Check EFI Boot to set reboot type

2019-09-09 Thread Mike Travis

Check the EFI boot type instead of checking whether this is a
KDUMP kernel.  This allows for KDUMP kernels that can handle
EFI reboots.

Signed-off-by: Mike Travis 
Reviewed-by: Steve Wahl 
Reviewed-by: Dimitri Sivanich 
To: Thomas Gleixner 
To: Ingo Molnar 
To: H. Peter Anvin 
To: Andrew Morton 
To: Borislav Petkov 
To: Christoph Hellwig 
To: Sasha Levin 
Cc: Dimitri Sivanich 
Cc: Russ Anderson 
Cc: Hedi Berriche 
Cc: Steve Wahl 
Cc: Justin Ernst 
Cc: x...@kernel.org
Cc: linux-kernel@vger.kernel.org
---
 arch/x86/kernel/apic/x2apic_uv_x.c |   18 --
 1 file changed, 12 insertions(+), 6 deletions(-)

--- linux.orig/arch/x86/kernel/apic/x2apic_uv_x.c
+++ linux/arch/x86/kernel/apic/x2apic_uv_x.c
@@ -15,6 +15,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -1483,6 +1484,14 @@ static void __init build_socket_tables(v
}
 }
 
+/* Check which reboot to use */
+static void check_efi_reboot(void)
+{
+   /* If EFI reboot not available, use ACPI reboot */
+   if (!efi_enabled(EFI_BOOT))
+   reboot_type = BOOT_ACPI;
+}
+
 /* Setup user proc fs files */
 static int proc_hubbed_show(struct seq_file *file, void *data)
 {
@@ -1571,6 +1580,8 @@ static __init int uv_system_init_hubless
if (rc >= 0)
uv_setup_proc_files(1);
 
+   check_efi_reboot();
+
return rc;
 }
 
@@ -1704,12 +1715,7 @@ static void __init uv_system_init_hub(vo
/* Register Legacy VGA I/O redirection handler: */
pci_register_set_vga_state(uv_set_vga_state);
 
-   /*
-* For a kdump kernel the reset must be BOOT_ACPI, not BOOT_EFI, as
-* EFI is not enabled in the kdump kernel:
-*/
-   if (is_kdump_kernel())
-   reboot_type = BOOT_ACPI;
+   check_efi_reboot();
 }
 
 /*

--

[PATCH V2 8/8] x86/platform/uv: Account for UV Hubless in is_uvX_hub Ops

2019-09-09 Thread Mike Travis

The is_uvX_hub() functions reference the uv_hub_info pointer, which
will be NULL when the system is hubless.  This change avoids that
NULL dereference.  It is also a performance optimization.

Signed-off-by: Mike Travis 
Reviewed-by: Steve Wahl 
Reviewed-by: Dimitri Sivanich 
To: Thomas Gleixner 
To: Ingo Molnar 
To: H. Peter Anvin 
To: Andrew Morton 
To: Borislav Petkov 
To: Christoph Hellwig 
To: Sasha Levin 
Cc: Dimitri Sivanich 
Cc: Russ Anderson 
Cc: Hedi Berriche 
Cc: Steve Wahl 
Cc: Justin Ernst 
Cc: x...@kernel.org
Cc: linux-kernel@vger.kernel.org
---
V2: Add WARNING that the UVx_HUB_IS_SUPPORTED defines will be removed.
---
 arch/x86/include/asm/uv/.uv_hub.h.swp |binary
 arch/x86/include/asm/uv/uv_hub.h |   61 ---
 1 file changed, 20 insertions(+), 41 deletions(-)

Binary files linux.orig/arch/x86/include/asm/uv/.uv_hub.h.swp and linux/arch/x86/include/asm/uv/.uv_hub.h.swp differ
--- linux.orig/arch/x86/include/asm/uv/uv_hub.h
+++ linux/arch/x86/include/asm/uv/uv_hub.h
@@ -19,6 +19,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -243,83 +244,61 @@ static inline int uv_hub_info_check(int
 #define UV4_HUB_REVISION_BASE  7
 #define UV4A_HUB_REVISION_BASE 8   /* UV4 (fixed) rev 2 */
 
-#ifdef UV1_HUB_IS_SUPPORTED
+/* WARNING: UVx_HUB_IS_SUPPORTED defines are deprecated and will be removed */
 static inline int is_uv1_hub(void)
 {
-   return uv_hub_info->hub_revision < UV2_HUB_REVISION_BASE;
-}
+#ifdef UV1_HUB_IS_SUPPORTED
+   return is_uv_hubbed(uv(1));
 #else
-static inline int is_uv1_hub(void)
-{
return 0;
-}
 #endif
+}
 
-#ifdef UV2_HUB_IS_SUPPORTED
 static inline int is_uv2_hub(void)
 {
-   return ((uv_hub_info->hub_revision >= UV2_HUB_REVISION_BASE) &&
-   (uv_hub_info->hub_revision < UV3_HUB_REVISION_BASE));
-}
+#ifdef UV2_HUB_IS_SUPPORTED
+   return is_uv_hubbed(uv(2));
 #else
-static inline int is_uv2_hub(void)
-{
return 0;
-}
 #endif
+}
 
-#ifdef UV3_HUB_IS_SUPPORTED
 static inline int is_uv3_hub(void)
 {
-   return ((uv_hub_info->hub_revision >= UV3_HUB_REVISION_BASE) &&
-   (uv_hub_info->hub_revision < UV4_HUB_REVISION_BASE));
-}
+#ifdef UV3_HUB_IS_SUPPORTED
+   return is_uv_hubbed(uv(3));
 #else
-static inline int is_uv3_hub(void)
-{
return 0;
-}
 #endif
+}
 
 /* First test "is UV4A", then "is UV4" */
-#ifdef UV4A_HUB_IS_SUPPORTED
-static inline int is_uv4a_hub(void)
-{
-   return (uv_hub_info->hub_revision >= UV4A_HUB_REVISION_BASE);
-}
-#else
 static inline int is_uv4a_hub(void)
 {
+#ifdef UV4A_HUB_IS_SUPPORTED
+   if (is_uv_hubbed(uv(4)))
+   return (uv_hub_info->hub_revision == UV4A_HUB_REVISION_BASE);
+#endif
return 0;
 }
-#endif
 
-#ifdef UV4_HUB_IS_SUPPORTED
 static inline int is_uv4_hub(void)
 {
-   return uv_hub_info->hub_revision >= UV4_HUB_REVISION_BASE;
-}
+#ifdef UV4_HUB_IS_SUPPORTED
+   return is_uv_hubbed(uv(4));
 #else
-static inline int is_uv4_hub(void)
-{
return 0;
-}
 #endif
+}
 
 static inline int is_uvx_hub(void)
 {
-   if (uv_hub_info->hub_revision >= UV2_HUB_REVISION_BASE)
-   return uv_hub_info->hub_revision;
-
-   return 0;
+   return (is_uv_hubbed(-2) >= uv(2));
 }
 
 static inline int is_uv_hub(void)
 {
-#ifdef UV1_HUB_IS_SUPPORTED
-   return uv_hub_info->hub_revision;
-#endif
-   return is_uvx_hub();
+   return is_uv1_hub() || is_uvx_hub();
 }
 
 union uvh_apicid {
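
The mask arithmetic behind the new is_uvx_hub() and is_uv_hub() can be checked in isolation. In this sketch uv() is copied from patch 2/8, and the *_model functions are hypothetical: they take the hubbed mask as a parameter instead of calling is_uv_hubbed(). It assumes two's complement integers, so -2 has every bit set except bit 0.

```c
#include <assert.h>

/* Same logic as uv() in patch 2/8. */
static int uv(int uvtype)
{
	if (uvtype >= 0 && uvtype <= 30)
		return 1 << uvtype;
	return 1;
}

/* Hypothetical model of is_uv1_hub() with UV1_HUB_IS_SUPPORTED defined. */
static int is_uv1_hub_model(int hubbed_mask)
{
	return hubbed_mask & uv(1);
}

/* Hypothetical model of is_uvx_hub(): true for UV2 and later. */
static int is_uvx_hub_model(int hubbed_mask)
{
	assert(hubbed_mask >= 0);

	/* -2 drops the "any" bit; what remains is the type bit, if any. */
	return (hubbed_mask & -2) >= uv(2);
}

/* Hypothetical model of is_uv_hub(). */
static int is_uv_hub_model(int hubbed_mask)
{
	return is_uv1_hub_model(hubbed_mask) || is_uvx_hub_model(hubbed_mask);
}
```

With the values assigned in patch 5/8 (0x3, 0x5, 0x9, 0x11), only the UV1 mask fails the UVx test, and a hubless system (mask 0) fails both tests without uv_hub_info being dereferenced.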

--

[PATCH V2 0/8] x86/platform/UV: Update UV Hubless System Support

2019-09-09 Thread Mike Travis



On 9/5/2019 11:47 AM, Mike Travis wrote:
> 
> These patches support upcoming UV systems that do not have a UV HUB.
> 
> [1/8] Save OEM_ID from ACPI MADT probe
>
> [2/8] Return UV Hubless System Type
V2: Remove is_uv_hubless define
Remove leading '_' from _is_uv_hubless

> [3/8] Add return code to UV BIOS Init function
>
> [4/8] Setup UV functions for Hubless UV Systems
>
> [5/8] Add UV Hubbed/Hubless Proc FS Files
V2: Remove is_uv_hubbed define
Remove leading '_' from _is_uv_hubbed

> [6/8] Decode UVsystab Info
V2: Removed redundant error message after call to uv_bios_init.
Removed redundant error message after call to decode_uv_systab.
Clarify selection of UV4 and higher when checking for extended UVsystab
in decode_uv_systab().

> [7/8] Check EFI Boot to set reboot type
>
> [8/8] Account for UV Hubless in is_uvX_hub Ops
V2: Add WARNING that the UVx_HUB_IS_SUPPORTED defines will be removed.

--

[PATCH V2 6/8] x86/platform/uv: Decode UVsystab Info

2019-09-09 Thread Mike Travis

Decode the hubless UVsystab passed from BIOS to the kernel, saving
pertinent info in a manner similar to how hubbed UVsystabs are decoded.

Signed-off-by: Mike Travis 
Reviewed-by: Steve Wahl 
Reviewed-by: Dimitri Sivanich 
To: Thomas Gleixner 
To: Ingo Molnar 
To: H. Peter Anvin 
To: Andrew Morton 
To: Borislav Petkov 
To: Christoph Hellwig 
To: Sasha Levin 
Cc: Dimitri Sivanich 
Cc: Russ Anderson 
Cc: Hedi Berriche 
Cc: Steve Wahl 
Cc: Justin Ernst 
Cc: x...@kernel.org
Cc: linux-kernel@vger.kernel.org
---
V2: Removed redundant error message after call to uv_bios_init.
Removed redundant error message after call to decode_uv_systab.
Clarify selection of UV4 and higher when checking for extended UVsystab
in decode_uv_systab().
---
 arch/x86/kernel/apic/x2apic_uv_x.c |   12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

--- linux.orig/arch/x86/kernel/apic/x2apic_uv_x.c
+++ linux/arch/x86/kernel/apic/x2apic_uv_x.c
@@ -1303,7 +1303,8 @@ static int __init decode_uv_systab(void)
struct uv_systab *st;
int i;
 
-   if (uv_hub_info->hub_revision < UV4_HUB_REVISION_BASE)
+   /* If system is uv3 or lower, there is no extended UVsystab */
+   if (is_uv_hubbed(0xfe) < uv(4) && is_uv_hubless(0xfe) < uv(4))
return 0;   /* No extended UVsystab required */
 
st = uv_systab;
@@ -1554,8 +1555,15 @@ static __init int uv_system_init_hubless
 
/* Init kernel/BIOS interface */
rc = uv_bios_init();
+   if (rc < 0)
+   return rc;
 
-   /* Create user access node if UVsystab available */
+   /* Process UVsystab */
+   rc = decode_uv_systab();
+   if (rc < 0)
+   return rc;
+
+   /* Create user access node */
if (rc >= 0)
uv_setup_proc_files(1);
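
The new test at the top of decode_uv_systab() can be modeled as follows. This is a sketch under assumptions: needs_extended_systab() is a hypothetical name, and it takes the hubbed and hubless masks as parameters rather than calling is_uv_hubbed() and is_uv_hubless(). uv() is copied from patch 2/8.

```c
#include <assert.h>
#include <stdbool.h>

/* Same logic as uv() in patch 2/8. */
static int uv(int uvtype)
{
	if (uvtype >= 0 && uvtype <= 30)
		return 1 << uvtype;
	return 1;
}

/* Hypothetical model: true when the extended UVsystab should be decoded. */
static bool needs_extended_systab(int hubbed_mask, int hubless_mask)
{
	assert(hubbed_mask >= 0 && hubless_mask >= 0);

	/* 0xfe keeps type bits 1..7 and drops bit 0 ("any"). */
	if ((hubbed_mask & 0xfe) < uv(4) && (hubless_mask & 0xfe) < uv(4))
		return false;	/* UV3 or lower: no extended UVsystab */

	return true;
}
```

Masking with 0xfe leaves only the type bit, so comparing the result against uv(4) reads as "UV4 or higher" for both the hubbed and the hubless case.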
 

--

[PATCH 1/8] x86/platform/uv: Save OEM_ID from ACPI MADT probe

2019-09-09 Thread Mike Travis

Save the OEM_ID and OEM_TABLE_ID passed to the apic driver probe function
for later use.  Also, convert the char list arg passed from the kernel
to a true null-terminated string.

Signed-off-by: Mike Travis 
Reviewed-by: Steve Wahl 
Reviewed-by: Dimitri Sivanich 
To: Thomas Gleixner 
To: Ingo Molnar 
To: H. Peter Anvin 
To: Andrew Morton 
To: Borislav Petkov 
To: Christoph Hellwig 
To: Sasha Levin 
Cc: Dimitri Sivanich 
Cc: Russ Anderson 
Cc: Hedi Berriche 
Cc: Steve Wahl 
Cc: Justin Ernst 
Cc: x...@kernel.org
Cc: linux-kernel@vger.kernel.org
---
 arch/x86/kernel/apic/x2apic_uv_x.c |   16 +++-
 1 file changed, 15 insertions(+), 1 deletion(-)

--- linux.orig/arch/x86/kernel/apic/x2apic_uv_x.c
+++ linux/arch/x86/kernel/apic/x2apic_uv_x.c
@@ -14,6 +14,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -31,6 +32,10 @@ static u64   gru_dist_base, gru_first_no
 static u64 gru_dist_lmask, gru_dist_umask;
 static union uvh_apicid uvh_apicid;
 
+/* Unpack OEM/TABLE ID's to be NULL terminated strings */
+static u8 oem_id[ACPI_OEM_ID_SIZE + 1];
+static u8 oem_table_id[ACPI_OEM_TABLE_ID_SIZE + 1];
+
 /* Information derived from CPUID: */
 static struct {
unsigned int apicid_shift;
@@ -248,11 +253,20 @@ static void __init uv_set_apicid_hibit(v
}
 }
 
-static int __init uv_acpi_madt_oem_check(char *oem_id, char *oem_table_id)
+static void __init uv_stringify(int len, char *to, char *from)
+{
+   /* Relies on 'to' being NULL chars so result will be NULL terminated */
+   strncpy(to, from, len-1);
+}
+
+static int __init uv_acpi_madt_oem_check(char *_oem_id, char *_oem_table_id)
 {
int pnodeid;
int uv_apic;
 
+   uv_stringify(sizeof(oem_id), oem_id, _oem_id);
+   uv_stringify(sizeof(oem_table_id), oem_table_id, _oem_table_id);
+
if (strncmp(oem_id, "SGI", 3) != 0) {
if (strncmp(oem_id, "NSGI", 4) == 0) {
uv_hubless_system = true;
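
The comment in uv_stringify() says the result is NUL terminated only because the destination starts out zeroed. A userspace sketch of that pattern follows. The names uv_stringify_model(), save_oem_id() and OEM_ID_SIZE are hypothetical; the value 6 is an assumption standing in for ACPI_OEM_ID_SIZE.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define OEM_ID_SIZE 6	/* assumed stand-in for ACPI_OEM_ID_SIZE */

/* Static storage is zero-initialized, so the last byte starts as '\0'. */
static char oem_id_copy[OEM_ID_SIZE + 1];

/* Copy at most len-1 bytes; the final byte of 'to' is never written. */
static void uv_stringify_model(size_t len, char *to, const char *from)
{
	assert(to != NULL && from != NULL && len > 0);
	strncpy(to, from, len - 1);
}

/* Hypothetical wrapper: store a fixed-width OEM ID field as a C string. */
static const char *save_oem_id(const char *raw)
{
	uv_stringify_model(sizeof(oem_id_copy), oem_id_copy, raw);
	return oem_id_copy;
}
```

The source field may fill all six bytes without a terminator; since strncpy() is limited to len-1 bytes, the extra byte in the destination keeps its initial zero and later strncmp() or "%s" uses are safe.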

--

[PATCH V2 2/8] x86/platform/uv: Return UV Hubless System Type

2019-09-09 Thread Mike Travis

Return the type of UV hubless system for UV specific code that depends
on that.  Use a define to indicate the change in arg type for this
function in uv.h.  Add a function to convert UV system type to bit
pattern needed for is_uv_hubless().

Signed-off-by: Mike Travis 
Reviewed-by: Steve Wahl 
Reviewed-by: Dimitri Sivanich 
To: Thomas Gleixner 
To: Ingo Molnar 
To: H. Peter Anvin 
To: Andrew Morton 
To: Borislav Petkov 
To: Christoph Hellwig 
To: Sasha Levin 
Cc: Dimitri Sivanich 
Cc: Russ Anderson 
Cc: Hedi Berriche 
Cc: Steve Wahl 
Cc: Justin Ernst 
Cc: x...@kernel.org
Cc: linux-kernel@vger.kernel.org
---
V2: Remove is_uv_hubless define
Remove leading '_' from _is_uv_hubless
---
 arch/x86/include/asm/uv/uv.h   |   12 ++--
 arch/x86/kernel/apic/x2apic_uv_x.c |   27 ++-
 2 files changed, 28 insertions(+), 11 deletions(-)

--- linux.orig/arch/x86/include/asm/uv/uv.h
+++ linux/arch/x86/include/asm/uv/uv.h
@@ -12,6 +12,14 @@ struct mm_struct;
 #ifdef CONFIG_X86_UV
 #include 
 
+static inline int uv(int uvtype)
+{
+   /* uv(0) is "any" */
+   if (uvtype >= 0 && uvtype <= 30)
+   return 1 << uvtype;
+   return 1;
+}
+
 extern unsigned long uv_systab_phys;
 
 extern enum uv_system_type get_uv_system_type(void);
@@ -20,7 +28,7 @@ static inline bool is_early_uv_system(vo
return uv_systab_phys && uv_systab_phys != EFI_INVALID_TABLE_ADDR;
 }
 extern int is_uv_system(void);
-extern int is_uv_hubless(void);
+extern int is_uv_hubless(int uvtype);
 extern void uv_cpu_init(void);
 extern void uv_nmi_init(void);
 extern void uv_system_init(void);
@@ -32,7 +40,7 @@ extern const struct cpumask *uv_flush_tl
 static inline enum uv_system_type get_uv_system_type(void) { return UV_NONE; }
 static inline bool is_early_uv_system(void){ return 0; }
 static inline int is_uv_system(void)   { return 0; }
-static inline int is_uv_hubless(void)  { return 0; }
+static inline int is_uv_hubless(int uv) { return 0; }
 static inline void uv_cpu_init(void)   { }
 static inline void uv_system_init(void){ }
 static inline const struct cpumask *
--- linux.orig/arch/x86/kernel/apic/x2apic_uv_x.c
+++ linux/arch/x86/kernel/apic/x2apic_uv_x.c
@@ -26,7 +26,7 @@
 static DEFINE_PER_CPU(int, x2apic_extra_bits);
 
 static enum uv_system_type uv_system_type;
-static booluv_hubless_system;
+static int uv_hubless_system;
 static u64 gru_start_paddr, gru_end_paddr;
 static u64 gru_dist_base, gru_first_node_paddr = -1LL, 
gru_last_node_paddr;
 static u64 gru_dist_lmask, gru_dist_umask;
@@ -268,11 +268,20 @@ static int __init uv_acpi_madt_oem_check
uv_stringify(sizeof(oem_table_id), oem_table_id, _oem_table_id);
 
if (strncmp(oem_id, "SGI", 3) != 0) {
-   if (strncmp(oem_id, "NSGI", 4) == 0) {
-   uv_hubless_system = true;
-   pr_info("UV: OEM IDs %s/%s, HUBLESS\n",
-   oem_id, oem_table_id);
-   }
+   if (strncmp(oem_id, "NSGI", 4) != 0)
+   return 0;
+
+   /* UV4 Hubless, CH, (0x11:UV4+Any) */
+   if (strncmp(oem_id, "NSGI4", 5) == 0)
+   uv_hubless_system = 0x11;
+
+   /* UV3 Hubless, UV300/MC990X w/o hub (0x9:UV3+Any) */
+   else
+   uv_hubless_system = 0x9;
+
+   pr_info("UV: OEM IDs %s/%s, HUBLESS(0x%x)\n",
+   oem_id, oem_table_id, uv_hubless_system);
+
return 0;
}
 
@@ -350,9 +359,9 @@ int is_uv_system(void)
 }
 EXPORT_SYMBOL_GPL(is_uv_system);
 
-int is_uv_hubless(void)
+int is_uv_hubless(int uvtype)
 {
-   return uv_hubless_system;
+   return (uv_hubless_system & uvtype);
 }
 EXPORT_SYMBOL_GPL(is_uv_hubless);
 
@@ -1592,7 +1601,7 @@ static void __init uv_system_init_hub(vo
  */
 void __init uv_system_init(void)
 {
-   if (likely(!is_uv_system() && !is_uv_hubless()))
+   if (likely(!is_uv_system() && !is_uv_hubless(1)))
return;
 
if (is_uv_system())

--
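
A note on the type encoding used in this patch: the values stored in
uv_hubless_system (0x11 and 0x9) read as bitmasks in which bit 0 means
"any UV" and a higher bit names the generation, so is_uv_hubless(1) asks
"is this any hubless UV system?". Below is a minimal userspace sketch of
that reading. The enum names and both helper functions are hypothetical
and are not part of the patch.

```c
#include <string.h>

/* Hypothetical names for the bits implied by 0x9 and 0x11 in the patch. */
enum {
	UV_TYPE_ANY = 0x01,	/* set for every recognized UV system */
	UV_TYPE_UV3 = 0x08,
	UV_TYPE_UV4 = 0x10,
};

/* Mirrors the hubless OEM ID checks in uv_acpi_madt_oem_check(). */
static int hubless_type_from_oem_id(const char *oem_id)
{
	if (strncmp(oem_id, "NSGI", 4) != 0)
		return 0;				/* not a hubless UV system */
	if (strncmp(oem_id, "NSGI4", 5) == 0)
		return UV_TYPE_UV4 | UV_TYPE_ANY;	/* 0x11 */
	return UV_TYPE_UV3 | UV_TYPE_ANY;		/* 0x9 */
}

/* Same test as is_uv_hubless(uvtype): nonzero when a requested bit is set. */
static int type_matches(int system_type, int uvtype)
{
	return system_type & uvtype;
}
```

With this encoding, an "NSGI4" system matches UV_TYPE_ANY, while an
"NSGI" system asked for UV_TYPE_UV4 yields 0.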

[PATCH V2 5/8] x86/platform/uv: Add UV Hubbed/Hubless Proc FS Files

2019-09-09 Thread Mike Travis

Indicate to UV user utilities that UV hubless support is available on
this system via the existing /proc interface.  The current interface is
maintained with the addition of new /proc leaves ("hubbed", "hubless",
and "oemid") that contain the specific type of UV arch this one is.

Signed-off-by: Mike Travis 
Reviewed-by: Steve Wahl 
Reviewed-by: Dimitri Sivanich 
To: Thomas Gleixner 
To: Ingo Molnar 
To: H. Peter Anvin 
To: Andrew Morton 
To: Borislav Petkov 
To: Christoph Hellwig 
To: Sasha Levin 
Cc: Dimitri Sivanich 
Cc: Russ Anderson 
Cc: Hedi Berriche 
Cc: Steve Wahl 
Cc: Justin Ernst 
Cc: x...@kernel.org
Cc: linux-kernel@vger.kernel.org
---
V2: Remove is_uv_hubbed define
Remove leading '_' from _is_uv_hubbed
---
 arch/x86/include/asm/uv/uv.h   |4 +
 arch/x86/kernel/apic/x2apic_uv_x.c |   93 -
 2 files changed, 96 insertions(+), 1 deletion(-)

--- linux.orig/arch/x86/include/asm/uv/uv.h
+++ linux/arch/x86/include/asm/uv/uv.h
@@ -12,6 +12,8 @@ struct mm_struct;
 #ifdef CONFIG_X86_UV
 #include 
 
+#define UV_PROC_NODE "sgi_uv"
+
 static inline int uv(int uvtype)
 {
/* uv(0) is "any" */
@@ -28,6 +30,7 @@ static inline bool is_early_uv_system(vo
return uv_systab_phys && uv_systab_phys != EFI_INVALID_TABLE_ADDR;
 }
 extern int is_uv_system(void);
+extern int is_uv_hubbed(int uvtype);
 extern int is_uv_hubless(int uvtype);
 extern void uv_cpu_init(void);
 extern void uv_nmi_init(void);
@@ -40,6 +43,7 @@ extern const struct cpumask *uv_flush_tl
 static inline enum uv_system_type get_uv_system_type(void) { return UV_NONE; }
 static inline bool is_early_uv_system(void){ return 0; }
 static inline int is_uv_system(void)   { return 0; }
+static inline int is_uv_hubbed(int uv) { return 0; }
 static inline int is_uv_hubless(int uv) { return 0; }
 static inline void uv_cpu_init(void)   { }
 static inline void uv_system_init(void){ }
--- linux.orig/arch/x86/kernel/apic/x2apic_uv_x.c
+++ linux/arch/x86/kernel/apic/x2apic_uv_x.c
@@ -26,6 +26,7 @@
 static DEFINE_PER_CPU(int, x2apic_extra_bits);
 
 static enum uv_system_type uv_system_type;
+static int uv_hubbed_system;
 static int uv_hubless_system;
 static u64 gru_start_paddr, gru_end_paddr;
 static u64 gru_dist_base, gru_first_node_paddr = -1LL, gru_last_node_paddr;
@@ -309,6 +310,24 @@ static int __init uv_acpi_madt_oem_check
if (uv_hub_info->hub_revision == 0)
goto badbios;
 
+   switch (uv_hub_info->hub_revision) {
+   case UV4_HUB_REVISION_BASE:
+   uv_hubbed_system = 0x11;
+   break;
+
+   case UV3_HUB_REVISION_BASE:
+   uv_hubbed_system = 0x9;
+   break;
+
+   case UV2_HUB_REVISION_BASE:
+   uv_hubbed_system = 0x5;
+   break;
+
+   case UV1_HUB_REVISION_BASE:
+   uv_hubbed_system = 0x3;
+   break;
+   }
+
pnodeid = early_get_pnodeid();
early_get_apic_socketid_shift();
 
@@ -359,6 +378,12 @@ int is_uv_system(void)
 }
 EXPORT_SYMBOL_GPL(is_uv_system);
 
+int is_uv_hubbed(int uvtype)
+{
+   return (uv_hubbed_system & uvtype);
+}
+EXPORT_SYMBOL_GPL(is_uv_hubbed);
+
 int is_uv_hubless(int uvtype)
 {
return (uv_hubless_system & uvtype);
@@ -1457,6 +1482,68 @@ static void __init build_socket_tables(v
}
 }
 
+/* Setup user proc fs files */
+static int proc_hubbed_show(struct seq_file *file, void *data)
+{
+   seq_printf(file, "0x%x\n", uv_hubbed_system);
+   return 0;
+}
+
+static int proc_hubless_show(struct seq_file *file, void *data)
+{
+   seq_printf(file, "0x%x\n", uv_hubless_system);
+   return 0;
+}
+
+static int proc_oemid_show(struct seq_file *file, void *data)
+{
+   seq_printf(file, "%s/%s\n", oem_id, oem_table_id);
+   return 0;
+}
+
+static int proc_hubbed_open(struct inode *inode, struct file *file)
+{
+   return single_open(file, proc_hubbed_show, (void *)NULL);
+}
+
+static int proc_hubless_open(struct inode *inode, struct file *file)
+{
+   return single_open(file, proc_hubless_show, (void *)NULL);
+}
+
+static int proc_oemid_open(struct inode *inode, struct file *file)
+{
+   return single_open(file, proc_oemid_show, (void *)NULL);
+}
+
+/* (struct is "non-const" as open function is set at runtime) */
+static struct file_operations proc_version_fops = {
+   .read   = seq_read,
+   .llseek = seq_lseek,
+   .release= single_release,
+};
+
+static const struct file_operations proc_oemid_fops = {
+   .open   = proc_oemid_open,
+   .read   = seq_read,
+   .llseek = seq_lseek,
+   .release= single_release,
+};
+
+static __init void uv_setup_proc_files(int hubless)
+{
+   struct proc_dir_entry *pde;
+   char *name = hubless ? "hubless" : "hubbed";
+
+   pde = proc_mkdir(UV_PROC_NODE,

[PATCH V2 0/8] x86/platform/UV: Update UV Hubless System Support

2019-09-09 Thread Mike Travis



On 9/5/2019 11:47 AM, Mike Travis wrote:
> 
> These patches support upcoming UV systems that do not have a UV HUB.
> 
> [1/8] Save OEM_ID from ACPI MADT probe
>
> [2/8] Return UV Hubless System Type
V2: Remove is_uv_hubless define
Remove leading '_' from _is_uv_hubless

> [3/8] Add return code to UV BIOS Init function
>
> [4/8] Setup UV functions for Hubless UV Systems
>
> [5/8] Add UV Hubbed/Hubless Proc FS Files
V2: Remove is_uv_hubbed define
Remove leading '_' from _is_uv_hubbed

> [6/8] Decode UVsystab Info
V2: Removed redundant error message after call to uv_bios_init.
Removed redundant error message after call to decode_uv_systab.
Clarify selection of UV4 and higher when checking for extended UVsystab
in decode_uv_systab().

> [7/8] Check EFI Boot to set reboot type
>
> [8/8] Account for UV Hubless in is_uvX_hub Ops
V2: Add WARNING that the is UVx supported defines will be removed.

--

Re: Linux 5.3-rc8

2019-09-09 Thread Ahmed S. Darwish

Hi,

On Sun, Sep 08, 2019 at 01:59:27PM -0700, Linus Torvalds wrote:
> So we probably didn't strictly need an rc8 this release, but with LPC
> and the KS conference travel this upcoming week it just makes
> everything easier.
>

The commit b03755ad6f33 (ext4: make __ext4_get_inode_loc plug), [1]
which was merged in v5.3-rc1, *always* leads to a blocked boot on my
system due to low entropy.

The hardware is not a VM: it's a Thinkpad E480 (i5-8250U CPU), with
a standard Arch user-space.

It was discovered through bisecting the problem v5.2 => v5.3-rc1,
since v5.2 never had any similar issues. The issue still persists in
v5.3-rc8: reverting that commit always fixes the problem.

It seems that batching the directory lookup I/O requests (which are
possibly a lot during boot) is minimizing sources of disk-activity-
induced entropy? [2] [3]

Can this even be considered a user-space breakage? I'm honestly not
sure. On my modern RDRAND-capable x86, just running rng-tools rngd(8)
early-on fixes the problem. I'm not sure about the status of older
CPUs though.

Thanks,

[1]
  commit b03755ad6f33b7b8cd7312a3596a2dbf496de6e7
  Author: zhangjs 
  Date:   Wed Jun 19 23:41:29 2019 -0400

  ext4: make __ext4_get_inode_loc plug

  Add a blk_plug to prevent the inode table readahead from being
  submitted as small I/O requests.

  Signed-off-by: zhangjs 
  Signed-off-by: Theodore Ts'o 
  Reviewed-by: Jan Kara 

[2] https://lkml.kernel.org/r/20190619122457.gf27...@quack2.suse.cz

[3] block/blk-core.c :: blk_start_plug()

--
darwi
http://darwish.chasingpointers.com

[PATCH] tty: serial: rda: Fix the link time qualifier of 'rda_uart_exit()'

2019-09-09 Thread Christophe JAILLET

'exit' functions should be marked as __exit, not __init.

Fixes: c10b13325ced ("tty: serial: Add RDA8810PL UART driver")
Signed-off-by: Christophe JAILLET 
---
 drivers/tty/serial/rda-uart.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/tty/serial/rda-uart.c b/drivers/tty/serial/rda-uart.c
index c1b0d7662ef9..ff9a27d48bca 100644
--- a/drivers/tty/serial/rda-uart.c
+++ b/drivers/tty/serial/rda-uart.c
@@ -815,7 +815,7 @@ static int __init rda_uart_init(void)
return ret;
 }
 
-static void __init rda_uart_exit(void)
+static void __exit rda_uart_exit(void)
 {
platform_driver_unregister(&rda_uart_platform_driver);
uart_unregister_driver(&rda_uart_driver);
-- 
2.20.1
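
As background for this fix: code marked __init is discarded once
initialization has finished, so a teardown function placed there may no
longer exist when it is called. The sketch below is a userspace stand-in
and not kernel code. __init and __exit are defined as empty macros here
(in the kernel they come from <linux/init.h>), and the demo_* names are
hypothetical.

```c
/* Userspace stand-ins; in the kernel these come from <linux/init.h>. */
#define __init	/* kernel: init section, freed after initialization */
#define __exit	/* kernel: exit section, only needed for unloading */

/* Hypothetical state standing in for driver registration. */
static int demo_registered;

static int __init demo_uart_init(void)
{
	demo_registered = 1;
	return 0;
}

/* Teardown runs long after init memory is gone, so it is __exit, not __init. */
static void __exit demo_uart_exit(void)
{
	demo_registered = 0;
}
```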

[PATCH] tty: serial: owl: Fix the link time qualifier of 'owl_uart_exit()'

2019-09-09 Thread Christophe JAILLET

'exit' functions should be marked as __exit, not __init.

Fixes: fc60a8b675bd ("tty: serial: owl: Implement console driver")
Signed-off-by: Christophe JAILLET 
---
 drivers/tty/serial/owl-uart.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/tty/serial/owl-uart.c b/drivers/tty/serial/owl-uart.c
index 03963af77b15..d2d8b3494685 100644
--- a/drivers/tty/serial/owl-uart.c
+++ b/drivers/tty/serial/owl-uart.c
@@ -740,7 +740,7 @@ static int __init owl_uart_init(void)
return ret;
 }
 
-static void __init owl_uart_exit(void)
+static void __exit owl_uart_exit(void)
 {
platform_driver_unregister(&owl_uart_platform_driver);
uart_unregister_driver(&owl_uart_driver);
-- 
2.20.1

Re: [PATCH] smack: include linux/watch_queue.h

2019-09-09 Thread kbuild test robot

Hi Arnd,

I love your patch! Yet something to improve:

[auto build test ERROR on linus/master]
[cannot apply to v5.3-rc8 next-20190904]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Arnd-Bergmann/smack-include-linux-watch_queue-h/20190910-095704
reproduce:
# apt-get install sparse
# sparse version: 
make ARCH=x86_64 allmodconfig
make C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__'

If you fix the issue, kindly add following tag
Reported-by: kbuild test robot 

All errors (new ones prefixed by >>):

>> security/smack/smack_lsm.c:45:11: sparse: error: unable to open 
>> 'linux/watch_queue.h'

vim +45 security/smack/smack_lsm.c

  > 45  #include 
46  #include "smack.h"
47  

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation


.config.gz
Description: application/gzip

[PATCH] mips: Loongson: Fix the link time qualifier of 'serial_exit()'

2019-09-09 Thread Christophe JAILLET

'exit' functions should be marked as __exit, not __init.

Fixes: 85cc028817ef ("mips: make loongsoon serial driver explicitly modular")
Signed-off-by: Christophe JAILLET 
---
 arch/mips/loongson64/common/serial.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/mips/loongson64/common/serial.c 
b/arch/mips/loongson64/common/serial.c
index ffefc1cb2612..98c3a7feb10f 100644
--- a/arch/mips/loongson64/common/serial.c
+++ b/arch/mips/loongson64/common/serial.c
@@ -110,7 +110,7 @@ static int __init serial_init(void)
 }
 module_init(serial_init);
 
-static void __init serial_exit(void)
+static void __exit serial_exit(void)
 {
platform_device_unregister(&uart8250_device);
 }
-- 
2.20.1

Re: [PATCH v3 0/3] kernel/notifier.c: avoid duplicate registration

2019-09-09 Thread Xiaoming Ni



On 2019/7/17 19:15, Vasily Averin wrote:
> On 7/16/19 5:07 PM, Xiaoming Ni wrote:
>> On 2019/7/16 18:20, Vasily Averin wrote:
>>> On 7/16/19 5:00 AM, Xiaoming Ni wrote:
>>>> On 2019/7/15 13:38, Vasily Averin wrote:
>>>>> On 7/14/19 5:45 AM, Xiaoming Ni wrote:
>>>>>> On 2019/7/12 22:07, gre...@linuxfoundation.org wrote:
>>>>>>> On Fri, Jul 12, 2019 at 09:11:57PM +0800, Xiaoming Ni wrote:
>>>>>>>> On 2019/7/11 21:57, Vasily Averin wrote:
>>>>>>>>> On 7/11/19 4:55 AM, Nixiaoming wrote:
>>>>>>>>>> On Wed, July 10, 2019 1:49 PM Vasily Averin wrote:
>>>>>>>>>>> On 7/10/19 6:09 AM, Xiaoming Ni wrote:

...
>>>>>>>> So in these two cases, is it more reasonable to trigger BUG() directly 
>>>>>>>> when checking for duplicate registration ?
>>>>>>>> But why does current notifier_chain_register() just trigger WARN() 
>>>>>>>> without exiting ?
>>>>>>>> notifier_chain_cond_register() direct exit without triggering WARN() ?
>>>>>>>
>>>>>>> It should recover from this, if it can be detected.  The main point is
>>>>>>> that not all apis have to be this "robust" when used within the kernel
>>>>>>> as we do allow for the callers to know what they are doing :)
>>>>>>>
>>>>>> In the notifier_chain_register(), the condition ( (*nl) == n) is the 
>>>>>> same registration of the same hook.
>>>>>>  We can intercept this situation and avoid forming a linked list ring to 
>>>>>> make the API more rob
...
...

> Yes, I agree: at present there is no difference between
> notifier_chain_cond_register() and notifier_chain_register()
> 
> Question is -- how to improve it.
> You propose to remove notifier_chain_cond_register() in some way.
> Another option is to return an error, for some abstract callers who expect 
> possible double registration.
> 
> Frankly speaking I prefer the second one;
> however, because the kernel does not have any such callers right now, it seems you 
> are right, 
> and we can delete notifier_chain_cond_register().
> 
> So let me finally accept your patch-set.
> 
> Thank you,
>   Vasily Averin
> 
> .
>

Dear Greg Kroah-Hartman,
Is there any other opinion on this patch set?
Could you pick up this series?

thanks
Xiaoming Ni
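
To make the duplicate-registration case discussed above concrete, here
is a minimal userspace sketch of a priority-ordered chain that refuses
to link the same block twice. It is not the kernel implementation: the
struct is a reduced stand-in, chain_register() is a hypothetical name,
and returning -EEXIST is only one of the options weighed in the thread.

```c
#include <errno.h>
#include <stddef.h>

/* Reduced stand-in for the kernel's struct notifier_block. */
struct notifier_block {
	int priority;
	struct notifier_block *next;
};

/*
 * Insert n into the chain in descending priority order.
 * If n is already linked, bail out instead of re-linking it, which
 * would otherwise make n->next point back at n and form a ring.
 */
static int chain_register(struct notifier_block **nl, struct notifier_block *n)
{
	while (*nl != NULL) {
		if (*nl == n)
			return -EEXIST;	/* assumption: report the duplicate */
		if (n->priority > (*nl)->priority)
			break;
		nl = &(*nl)->next;
	}
	n->next = *nl;
	*nl = n;
	return 0;
}
```

The duplicate check runs before the priority comparison, so a block that
is already on the chain is always found before any insertion point.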

Re: [PATCH 1/1] mm/pgtable/debug: Add test validating architecture page table helpers

2019-09-09 Thread Anshuman Khandual




On 09/09/2019 08:43 PM, Kirill A. Shutemov wrote:
> On Mon, Sep 09, 2019 at 11:56:50AM +0530, Anshuman Khandual wrote:
>>
>>
>> On 09/07/2019 12:33 AM, Gerald Schaefer wrote:
>>> On Fri, 6 Sep 2019 11:58:59 +0530
>>> Anshuman Khandual  wrote:
>>>
>>>> On 09/05/2019 10:36 PM, Gerald Schaefer wrote:
>>>>> On Thu, 5 Sep 2019 14:48:14 +0530
>>>>> Anshuman Khandual  wrote:
>>>>>   
>>>>>>> [...]
>>>>>>>> +
>>>>>>>> +#if !defined(__PAGETABLE_PMD_FOLDED) && 
>>>>>>>> !defined(__ARCH_HAS_4LEVEL_HACK)
>>>>>>>> +static void pud_clear_tests(pud_t *pudp)
>>>>>>>> +{
>>>>>>>> +  memset(pudp, RANDOM_NZVALUE, sizeof(pud_t));
>>>>>>>> +  pud_clear(pudp);
>>>>>>>> +  WARN_ON(!pud_none(READ_ONCE(*pudp)));
>>>>>>>> +}
>>>>>>>
>>>>>>> For pgd/p4d/pud_clear(), we only clear if the page table level is 
>>>>>>> present
>>>>>>> and not folded. The memset() here overwrites the table type bits, so
>>>>>>> pud_clear() will not clear anything on s390 and the pud_none() check 
>>>>>>> will
>>>>>>> fail.
>>>>>>> Would it be possible to OR a (larger) random value into the table, so 
>>>>>>> that
>>>>>>> the lower 12 bits would be preserved?
>>>>>>
>>>>>> So the suggestion is instead of doing memset() on entry with 
>>>>>> RANDOM_NZVALUE,
>>>>>> it should OR a large random value preserving lower 12 bits. Hmm, this 
>>>>>> should
>>>>>> still do the trick for other platforms, they just need non zero value. 
>>>>>> So on
>>>>>> s390, the lower 12 bits on the page table entry already has valid value 
>>>>>> while
>>>>>> entering this function which would make sure that pud_clear() really does
>>>>>> clear the entry ?  
>>>>>
>>>>> Yes, in theory the table entry on s390 would have the type set in the last
>>>>> 4 bits, so preserving those would be enough. If it does not conflict with
>>>>> others, I would still suggest preserving all 12 bits since those would 
>>>>> contain
>>>>> arch-specific flags in general, just to be sure. For s390, the pte/pmd 
>>>>> tests
>>>>> would also work with the memset, but for consistency I think the same 
>>>>> logic
>>>>> should be used in all pxd_clear_tests.  
>>>>
>>>> Makes sense but..
>>>>
>>>> There is a small challenge with this. Modifying individual bits on a given
>>>> page table entry from generic code like this test case is bit tricky. That
>>>> is because there are not enough helpers to create entries with an absolute
>>>> value. This would have been easier if all the platforms provided functions
>>>> like __pxx() which is not the case now. Otherwise something like this 
>>>> should
>>>> have worked.
>>>>
>>>>
>>>> pud_t pud = READ_ONCE(*pudp);
>>>> pud = __pud(pud_val(pud) | RANDOM_VALUE (keeping lower 12 bits 0))
>>>> WRITE_ONCE(*pudp, pud);
>>>>
>>>> But __pud() will fail to build in many platforms.
>>>
>>> Hmm, I simply used this on my system to make pud_clear_tests() work, not
>>> sure if it works on all archs:
>>>
>>> pud_val(*pudp) |= RANDOM_NZVALUE;
>>
>> Which compiles on arm64 but then fails on x86 because of the way pmd_val()
>> has been defined there.
> 
> Use instead
> 
>   *pudp = __pud(pud_val(*pudp) | RANDOM_NZVALUE);

Agreed.

As I had mentioned before this would have been really the cleanest approach.
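
For illustration, the suggestion of OR-ing in a random value while
preserving the lower 12 bits can be sketched in userspace as follows.
This assumes a platform where pud_t wraps a single 64-bit value. The
pud_t, __pud() and pud_val() definitions are local stand-ins, and
LOW12_MASK and pud_fill_keep_low_bits() are hypothetical names.

```c
#include <stdint.h>

/* Local stand-ins for the arch-provided type and accessors. */
typedef struct { uint64_t pud; } pud_t;
#define __pud(x)	((pud_t) { (x) })
#define pud_val(p)	((p).pud)

/* Hypothetical mask for the arch-specific flag bits to leave untouched. */
#define LOW12_MASK	0xfffULL

/* OR a random pattern into the entry without disturbing the low 12 bits. */
static void pud_fill_keep_low_bits(pud_t *pudp, uint64_t random_value)
{
	*pudp = __pud(pud_val(*pudp) | (random_value & ~LOW12_MASK));
}
```

Because the pattern is masked before the OR, whatever type or flag bits
the entry already carried in bits 0..11 survive unchanged.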

> 
> It *should* be more portable.

Not really, because not all the platforms have __pxx() definitions right now.
Going with these will clearly cause build failures on affected platforms. Let's
examine __pud() for instance. It is defined only on these platforms.

arch/arm64/include/asm/pgtable-types.h:      #define __pud(x) ((pud_t) { (x) } )
arch/mips/include/asm/pgtable-64.h:          #define __pud(x) ((pud_t) { (x) })
arch/powerpc/include/asm/pgtable-be-types.h: #define __pud(x) ((pud_t) { cpu_to_be64(x) })
arch/powerpc/include/asm/pgtable-types.h:    #define __pud(x) ((pud_t) { (x) })
arch/s390/include/asm/page.h:                #define __pud(x) ((pud_t) { (x) } )
arch/sparc/include/asm/page_64.h:            #define __pud(x) ((pud_t) { (x) } )
arch/sparc/include/asm/page_64.h:            #define __pud(x) (x)
arch/x86/include/asm/pgtable.h:              #define __pud(x) native_make_pud(x)

Similarly for __pmd()

arch/alpha/include/asm/page.h:               #define __pmd(x)  ((pmd_t) { (x) } )
arch/arm/include/asm/page-nommu.h:           #define __pmd(x)  (x)
arch/arm/include/asm/pgtable-2level-types.h: #define __pmd(x)  ((pmd_t) { (x) } )
arch/arm/include/asm/pgtable-2level-types.h: #define __pmd(x)  (x)
arch/arm/include/asm/pgtable-3level-types.h: #define __pmd(x)  ((pmd_t) { (x) } )
arch/arm/include/asm/pgtable-3level-types.h: #define __pmd(x)  (x)
arch/arm64/include/asm/pgtable-types.h:      #define __pmd(x)  ((pmd_t) { (x) } )
arch/m68k/include/asm/page.h:                #define __pmd(x)  ((pmd_t) { { (x) }, })
arch/mips/include/asm/pgtable-64.h:          #define __pmd(x)  ((pmd_t) { (x) } )
arch/nds32/include/asm/page.h:               #define __pmd(x)  (x)
arch/parisc/include/a

Re: [PATCH] arm64: fix unreachable code issue with cmpxchg

2019-09-09 Thread Nathan Chancellor

On Mon, Sep 09, 2019 at 10:21:35PM +0200, Arnd Bergmann wrote:
> On arm64 build with clang, sometimes the __cmpxchg_mb is not inlined
> when CONFIG_OPTIMIZE_INLINING is set.
> Clang then fails a compile-time assertion, because it cannot tell at
> compile time what the size of the argument is:
> 
> mm/memcontrol.o: In function `__cmpxchg_mb':
> memcontrol.c:(.text+0x1a4c): undefined reference to `__compiletime_assert_175'
> memcontrol.c:(.text+0x1a4c): relocation truncated to fit: R_AARCH64_CALL26 
> against undefined symbol `__compiletime_assert_175'
> 
> Mark all of the cmpxchg() style functions as __always_inline to
> ensure that the compiler can see the result.
> 
> Signed-off-by: Arnd Bergmann 

Reviewed-by: Nathan Chancellor 
Tested-by: Nathan Chancellor
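
The pattern at issue can be sketched in userspace as below: a helper
that dispatches on the operand size only collapses to a single case when
it is inlined into a caller that passes sizeof(*ptr). This is an
illustration and not the arm64 code. ALWAYS_INLINE and cmpxchg_sized()
are hypothetical names, the GCC/Clang __sync builtin stands in for the
real atomics, and abort() stands in for the build-time assertion.

```c
#include <stdlib.h>

/* Userspace spelling of a forced-inline annotation (GCC/Clang). */
#define ALWAYS_INLINE inline __attribute__((__always_inline__))

/*
 * Size-dispatching compare-and-swap. Once inlined into a caller that
 * passes a constant size, the switch reduces to one case at compile time.
 */
static ALWAYS_INLINE unsigned long
cmpxchg_sized(volatile void *ptr, unsigned long old, unsigned long newval,
	      int size)
{
	switch (size) {
	case 4:
		return __sync_val_compare_and_swap((volatile unsigned int *)ptr,
						   (unsigned int)old,
						   (unsigned int)newval);
	case 8:
		return __sync_val_compare_and_swap(
				(volatile unsigned long long *)ptr,
				(unsigned long long)old,
				(unsigned long long)newval);
	default:
		abort();	/* the kernel asserts at build time instead */
	}
}
```

If the helper were left out of line, the compiler could not know the
size at the point of the default case, which is where the unresolved
__compiletime_assert reference in the report comes from.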

Re: [PATCH] smack: include linux/watch_queue.h

2019-09-09 Thread kbuild test robot

Hi Arnd,

I love your patch! Yet something to improve:

[auto build test ERROR on linus/master]
[cannot apply to v5.3-rc8 next-20190904]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Arnd-Bergmann/smack-include-linux-watch_queue-h/20190910-095704
config: ia64-allmodconfig (attached as .config)
compiler: ia64-linux-gcc (GCC) 7.4.0
reproduce:
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
GCC_VERSION=7.4.0 make.cross ARCH=ia64 

If you fix the issue, kindly add following tag
Reported-by: kbuild test robot 

All errors (new ones prefixed by >>):

>> security/smack/smack_lsm.c:45:10: fatal error: linux/watch_queue.h: No such 
>> file or directory
#include 
 ^
   compilation terminated.

vim +45 security/smack/smack_lsm.c

  > 45  #include 
46  #include "smack.h"
47  

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation


.config.gz
Description: application/gzip

possible deadlock in shmem_fallocate (3)

2019-09-09 Thread syzbot


Hello,

syzbot found the following crash on:

HEAD commit:6d028043 Add linux-next specific files for 20190830
git tree:   linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=12359ec660
kernel config:  https://syzkaller.appspot.com/x/.config?x=82a6bec43ab0cb69
dashboard link: https://syzkaller.appspot.com/bug?extid=5d04068d02b9da8a0947
compiler:   gcc (GCC) 9.0.0 20181231 (experimental)

Unfortunately, I don't have any reproducer for this crash yet.

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+5d04068d02b9da8a0...@syzkaller.appspotmail.com

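======================================================
WARNING: possible circular locking dependency detected
5.3.0-rc6-next-20190830 #75 Not tainted
------------------------------------------------------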
kswapd0/1770 is trying to acquire lock:
8880a0b9b780 (&sb->s_type->i_mutex_key#13){+.+.}, at: inode_lock  
include/linux/fs.h:789 [inline]
8880a0b9b780 (&sb->s_type->i_mutex_key#13){+.+.}, at:  
shmem_fallocate+0x15a/0xc60 mm/shmem.c:2728


but task is already holding lock:
89042f80 (fs_reclaim){+.+.}, at: __fs_reclaim_acquire+0x0/0x30  
mm/page_alloc.c:4889


which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #1 (fs_reclaim){+.+.}:
   __fs_reclaim_acquire mm/page_alloc.c:4075 [inline]
   fs_reclaim_acquire.part.0+0x24/0x30 mm/page_alloc.c:4086
   fs_reclaim_acquire mm/page_alloc.c:4662 [inline]
   prepare_alloc_pages mm/page_alloc.c:4659 [inline]
   __alloc_pages_nodemask+0x52f/0x900 mm/page_alloc.c:4711
   alloc_pages_vma+0x1bc/0x3f0 mm/mempolicy.c:2114
   shmem_alloc_page+0xbd/0x180 mm/shmem.c:1496
   shmem_alloc_and_acct_page+0x165/0x990 mm/shmem.c:1521
   shmem_getpage_gfp+0x598/0x2680 mm/shmem.c:1835
   shmem_getpage mm/shmem.c:152 [inline]
   shmem_write_begin+0x105/0x1e0 mm/shmem.c:2480
   generic_perform_write+0x23b/0x540 mm/filemap.c:3304
   __generic_file_write_iter+0x25e/0x630 mm/filemap.c:3433
   generic_file_write_iter+0x420/0x690 mm/filemap.c:3465
   call_write_iter include/linux/fs.h:1890 [inline]
   new_sync_write+0x4d3/0x770 fs/read_write.c:483
   __vfs_write+0xe1/0x110 fs/read_write.c:496
   vfs_write+0x268/0x5d0 fs/read_write.c:558
   ksys_write+0x14f/0x290 fs/read_write.c:611
   __do_sys_write fs/read_write.c:623 [inline]
   __se_sys_write fs/read_write.c:620 [inline]
   __x64_sys_write+0x73/0xb0 fs/read_write.c:620
   do_syscall_64+0xfa/0x760 arch/x86/entry/common.c:290
   entry_SYSCALL_64_after_hwframe+0x49/0xbe

-> #0 (&sb->s_type->i_mutex_key#13){+.+.}:
   check_prev_add kernel/locking/lockdep.c:2476 [inline]
   check_prevs_add kernel/locking/lockdep.c:2581 [inline]
   validate_chain kernel/locking/lockdep.c:2971 [inline]
   __lock_acquire+0x2596/0x4a00 kernel/locking/lockdep.c:3955
   lock_acquire+0x190/0x410 kernel/locking/lockdep.c:4487
   down_write+0x93/0x150 kernel/locking/rwsem.c:1534
   inode_lock include/linux/fs.h:789 [inline]
   shmem_fallocate+0x15a/0xc60 mm/shmem.c:2728
   ashmem_shrink_scan drivers/staging/android/ashmem.c:462 [inline]
   ashmem_shrink_scan+0x370/0x510 drivers/staging/android/ashmem.c:437
   do_shrink_slab+0x40f/0xa30 mm/vmscan.c:560
   shrink_slab mm/vmscan.c:721 [inline]
   shrink_slab+0x19a/0x680 mm/vmscan.c:694
   shrink_node+0x223/0x12e0 mm/vmscan.c:2807
   kswapd_shrink_node mm/vmscan.c:3549 [inline]
   balance_pgdat+0x57c/0xea0 mm/vmscan.c:3707
   kswapd+0x5c3/0xf30 mm/vmscan.c:3958
   kthread+0x361/0x430 kernel/kthread.c:255
   ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352

other info that might help us debug this:

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(fs_reclaim);
                               lock(&sb->s_type->i_mutex_key#13);
                               lock(fs_reclaim);
  lock(&sb->s_type->i_mutex_key#13);

 *** DEADLOCK ***

2 locks held by kswapd0/1770:
 #0: 89042f80 (fs_reclaim){+.+.}, at: __fs_reclaim_acquire+0x0/0x30  
mm/page_alloc.c:4889
 #1: 8901ffe8 (shrinker_rwsem){}, at: shrink_slab  
mm/vmscan.c:711 [inline]
 #1: 8901ffe8 (shrinker_rwsem){}, at: shrink_slab+0xe6/0x680  
mm/vmscan.c:694


stack backtrace:
CPU: 0 PID: 1770 Comm: kswapd0 Not tainted 5.3.0-rc6-next-20190830 #75
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS  
Google 01/01/2011

Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x172/0x1f0 lib/dump_stack.c:113
 print_circular_bug.isra.0.cold+0x163/0x172 kernel/locking/lockdep.c:1685
 check_noncircular+0x32e/0x3e0 kernel/locking/lockdep.c:1809
 check_prev_add kernel/locking/lockdep.c:2476 [inline]
 check_prevs_add kernel/locking/lockdep.c:2581 [inline]
 validate_chain kernel/locking/lockdep.c:2971 [in
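
The kind of check behind this report can be sketched as a tiny
lock-order graph: each "acquired B while holding A" event adds an edge
A -> B, and a new edge is refused if B can already reach A. This is a
simplified illustration and not lockdep's implementation. All names
below are hypothetical, and the two lock IDs merely stand for the inode
lock and fs_reclaim from the report.

```c
#include <stdbool.h>

#define MAX_LOCKS 8

enum { LOCK_I_MUTEX = 0, LOCK_FS_RECLAIM = 1 };	/* hypothetical lock IDs */

/* dep[a][b] is true when lock b was acquired while lock a was held. */
static bool dep[MAX_LOCKS][MAX_LOCKS];

/* Depth-first search: can 'to' be reached from 'from' along recorded edges? */
static bool reachable(int from, int to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < MAX_LOCKS; next++)
		if (dep[from][next] && !seen[next] && reachable(next, to, seen))
			return true;
	return false;
}

/* Record held -> acquired; returns false if that would close a cycle. */
static bool record_dependency(int held, int acquired)
{
	bool seen[MAX_LOCKS] = { false };

	if (reachable(acquired, held, seen))
		return false;
	dep[held][acquired] = true;
	return true;
}
```

In the terms of the report, the write path first records the inode lock
followed by fs_reclaim, and the later attempt from kswapd to take the
inode lock under fs_reclaim is the edge that gets rejected.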

[PATCH 1/2] arm64: dts: imx8mn: Add system counter node

2019-09-09 Thread Anson Huang

Add i.MX8MN system counter node to enable timer-imx-sysctr
broadcast timer driver.

Signed-off-by: Anson Huang 
---
 arch/arm64/boot/dts/freescale/imx8mn.dtsi | 8 
 1 file changed, 8 insertions(+)

diff --git a/arch/arm64/boot/dts/freescale/imx8mn.dtsi 
b/arch/arm64/boot/dts/freescale/imx8mn.dtsi
index d94db95..0166f8c 100644
--- a/arch/arm64/boot/dts/freescale/imx8mn.dtsi
+++ b/arch/arm64/boot/dts/freescale/imx8mn.dtsi
@@ -428,6 +428,14 @@
#pwm-cells = <2>;
status = "disabled";
};
+
+   system_counter: timer@306a {
+   compatible = "nxp,sysctr-timer";
+   reg = <0x306a 0x2>;
+   interrupts = ;
+   clocks = <&osc_24m>;
+   clock-names = "per";
+   };
};
 
aips3: bus@3080 {
-- 
2.7.4

[PATCH 2/2] arm64: dts: imx8mn: Enable cpu-idle driver

2019-09-09 Thread Anson Huang

Enable i.MX8MN cpu-idle using generic ARM cpu-idle driver, 2 states
are supported, details as below:

root@imx8mnevk:~# cat /sys/devices/system/cpu/cpu0/cpuidle/state0/name
WFI
root@imx8mnevk:~# cat /sys/devices/system/cpu/cpu0/cpuidle/state0/usage
3098
root@imx8mnevk:~# cat /sys/devices/system/cpu/cpu0/cpuidle/state1/name
cpu-pd-wait
root@imx8mnevk:~# cat /sys/devices/system/cpu/cpu0/cpuidle/state1/usage
3078

Signed-off-by: Anson Huang 
---
 arch/arm64/boot/dts/freescale/imx8mn.dtsi | 17 +
 1 file changed, 17 insertions(+)

diff --git a/arch/arm64/boot/dts/freescale/imx8mn.dtsi 
b/arch/arm64/boot/dts/freescale/imx8mn.dtsi
index 0166f8c..e4efe8d 100644
--- a/arch/arm64/boot/dts/freescale/imx8mn.dtsi
+++ b/arch/arm64/boot/dts/freescale/imx8mn.dtsi
@@ -43,6 +43,19 @@
#address-cells = <1>;
#size-cells = <0>;
 
+   idle-states {
+   entry-method = "psci";
+
+   cpu_pd_wait: cpu-pd-wait {
+   compatible = "arm,idle-state";
+   arm,psci-suspend-param = <0x0010033>;
+   local-timer-stop;
+   entry-latency-us = <1000>;
+   exit-latency-us = <700>;
+   min-residency-us = <2700>;
+   };
+   };
+
A53_0: cpu@0 {
device_type = "cpu";
compatible = "arm,cortex-a53";
@@ -54,6 +67,7 @@
operating-points-v2 = <&a53_opp_table>;
nvmem-cells = <&cpu_speed_grade>;
nvmem-cell-names = "speed_grade";
+   cpu-idle-states = <&cpu_pd_wait>;
};
 
A53_1: cpu@1 {
@@ -65,6 +79,7 @@
enable-method = "psci";
next-level-cache = <&A53_L2>;
operating-points-v2 = <&a53_opp_table>;
+   cpu-idle-states = <&cpu_pd_wait>;
};
 
A53_2: cpu@2 {
@@ -76,6 +91,7 @@
enable-method = "psci";
next-level-cache = <&A53_L2>;
operating-points-v2 = <&a53_opp_table>;
+   cpu-idle-states = <&cpu_pd_wait>;
};
 
A53_3: cpu@3 {
@@ -87,6 +103,7 @@
enable-method = "psci";
next-level-cache = <&A53_L2>;
operating-points-v2 = <&a53_opp_table>;
+   cpu-idle-states = <&cpu_pd_wait>;
};
 
A53_L2: l2-cache0 {
-- 
2.7.4

Re: [PATCH 2/4] mmc: Add virtual command queue support

2019-09-09 Thread Baolin Wang

On Mon, 9 Sep 2019 at 20:45, Adrian Hunter  wrote:
>
> On 9/09/19 3:16 PM, Baolin Wang wrote:
> > Hi Adrian,
> >
> > On Mon, 9 Sep 2019 at 20:02, Adrian Hunter  wrote:
> >>
> >> On 6/09/19 6:52 AM, Baolin Wang wrote:
> >>> Currently the MMC read/write stack always waits in mmc_blk_rw_wait() for
> >>> the previous request to complete before sending a new request to the
> >>> hardware, or queues a work item to complete the request. This adds context
> >>> switching overhead, especially at high I/O-per-second rates, and hurts
> >>> I/O performance.
> >>>
> >>> Thus this patch introduces a virtual command queue interface, which is
> >>> similar in idea to the hardware command queue engine and can remove
> >>> the context switching.
> >>
> >> CQHCI is a hardware interface for eMMC's that support command queuing.  
> >> What
> >> you are doing is a software issue queue, unrelated to CQHCI.  I think you
> >
> > Yes.
> >
> >> should avoid all reference to CQHCI i.e. call it something else.
> >
> > Since its process is similar with CQHCI and re-use the CQHCI's
> > interfaces, I called it virtual command queue. I am not sure what else
> > name is better, any thoughts? VCQHCI? Thanks.
>
> What about swq for software queue.  Maybe Ulf can suggest something?

Hmm, even after renaming it to swq, we would still need to reuse the command
queue's interfaces, such as 'mq->use-cqe', 'host->cqe_depth' and the cqe ops,
which looks a little odd to me. But if you all agree with this name,
then I am okay with it. Ulf, what do you suggest?

-- 
Baolin Wang
Best Regards

Re: [RFC PATCH v4 0/9] printk: new ringbuffer implementation

2019-09-09 Thread Sergey Senozhatsky

On (09/06/19 16:01), Peter Zijlstra wrote:
> In fact, i've gotten output that is plain impossible with
> the current junk.

Peter, can you post any of those backtraces? Very curious.

-ss

Re: [RFC PATCH v4 9/9] printk: use a new ringbuffer implementation

2019-09-09 Thread Sergey Senozhatsky

On (08/10/19 07:53), Thomas Gleixner wrote:
> 
> Right now we have an implementation for serial only, but that already is
> useful. I nicely got (minimally garbled) crash dumps out of an NMI
> handler. With the current mainline console code the machine just hung.
> 

Thomas, any chance you can post backtraces? Just curious where exactly
current printk() and console_drivers() hung.

-ss

Re: [PATCH 1/4] softirq: implement IRQ flood detection mechanism

2019-09-09 Thread Sagi Grimberg


Hey Ming,


Ok, so the real problem is per-cpu bounded tasks.

I share Thomas opinion about a NAPI like approach.


We already have that, it's irq_poll, but it seems that for this
use-case, we get lower performance for some reason. I'm not
entirely sure why that is; maybe it's because we need to mask interrupts
because we don't have an "arm" register in nvme like network devices
have?


Long observed that IOPS also drops a lot when switching to threaded irq. If
softirqd is woken up for handling softirq, the performance shouldn't
be better than threaded irq.


It's true that it shouldn't be any faster, but what irqpoll already has
and we don't need to reinvent is a proper budgeting mechanism that
needs to occur when multiple devices map irq vectors to the same cpu
core.

irqpoll already maintains a percpu list and dispatches the ->poll with
a budget that the backend enforces and irqpoll multiplexes between them.
Having this mechanism in irq (hard or threaded) context sounds
a bit unnecessary.
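
As a rough illustration of the budgeting idea described here, the sketch
below splits one shared budget across several devices, giving each at
most its own weight per pass. It is a userspace model and not the
lib/irq_poll.c code. The demo_* names and the struct layout are
hypothetical.

```c
/* Hypothetical stand-in for a device with completions waiting to be reaped. */
struct demo_poll_dev {
	int pending;	/* completions still queued on the device */
	int weight;	/* maximum completions to reap per poll call */
};

/* Reap up to 'quota' completions from one device; returns the work done. */
static int demo_poll_one(struct demo_poll_dev *dev, int quota)
{
	int work = dev->pending < quota ? dev->pending : quota;

	dev->pending -= work;
	return work;
}

/* One pass over all devices sharing a CPU, capped by a common budget. */
static int demo_poll_pass(struct demo_poll_dev *devs, int ndev, int budget)
{
	int remaining = budget;

	for (int i = 0; i < ndev && remaining > 0; i++) {
		int quota = devs[i].weight < remaining ?
			    devs[i].weight : remaining;

		remaining -= demo_poll_one(&devs[i], quota);
	}
	return budget - remaining;
}
```

Once the shared budget runs out the pass stops, and devices that still
have pending work would be handled on the next pass, which is the point
where the latency versus efficiency tradeoff shows up.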

It seems like we're attempting to stay in irq context for as long as we
can instead of scheduling to softirq/thread context if we have more than
a minimal amount of work to do. Without at least understanding why
softirq/thread degrades us so much this code seems like the wrong
approach to me. Interrupt context will always be faster, but it is
not a sufficient reason to spend as much time as possible there, is it?

We should also keep in mind, that the networking stack has been doing
this for years, I would try to understand why this cannot work for nvme
before dismissing.


In particular, Long found that context switches increased a lot
after applying your irq poll patch.

http://lists.infradead.org/pipermail/linux-nvme/2019-August/026788.html


Oh, I didn't see that one, wonder why... thanks!

5% improvement, I guess we can buy that for other users as is :)

If we suffer from lots of context switches while the CPU is flooded with
interrupts, then I would argue that we're re-raising softirq too much.
In this use-case, my assumption is that the cpu cannot keep up with the
interrupts and not that it doesn't reap enough (we also reap the first
batch in interrupt context...)

Perhaps making irqpoll continue until it must resched would improve
things further? Although this is a latency vs. efficiency tradeoff,
looks like MAX_SOFTIRQ_TIME is set to 2ms:

"
 * The MAX_SOFTIRQ_TIME provides a nice upper bound in most cases, but in
 * certain cases, such as stop_machine(), jiffies may cease to
 * increment and so we need the MAX_SOFTIRQ_RESTART limit as
 * well to make sure we eventually return from this method.
 *
 * These limits have been established via experimentation.
 * The two things to balance is latency against fairness -
 * we want to handle softirqs as soon as possible, but they
 * should not be able to lock up the box.
"

Long, does this patch make any difference?
--
diff --git a/lib/irq_poll.c b/lib/irq_poll.c
index 2f17b488d58e..d8eab563fa77 100644
--- a/lib/irq_poll.c
+++ b/lib/irq_poll.c
@@ -12,8 +12,6 @@
 #include 
 #include 

-static unsigned int irq_poll_budget __read_mostly = 256;
-
 static DEFINE_PER_CPU(struct list_head, blk_cpu_iopoll);

 /**
@@ -77,42 +75,29 @@ EXPORT_SYMBOL(irq_poll_complete);

 static void __latent_entropy irq_poll_softirq(struct softirq_action *h)
 {
-   struct list_head *list = this_cpu_ptr(&blk_cpu_iopoll);
-   int rearm = 0, budget = irq_poll_budget;
-   unsigned long start_time = jiffies;
+   struct list_head *irqpoll_list = this_cpu_ptr(&blk_cpu_iopoll);
+   LIST_HEAD(list);

local_irq_disable();
+   list_splice_init(irqpoll_list, &list);
+   local_irq_enable();

-   while (!list_empty(list)) {
+   while (!list_empty(&list)) {
struct irq_poll *iop;
int work, weight;

-   /*
-* If softirq window is exhausted then punt.
-*/
-   if (budget <= 0 || time_after(jiffies, start_time)) {
-   rearm = 1;
-   break;
-   }
-
-   local_irq_enable();
-
/* Even though interrupts have been re-enabled, this
 * access is safe because interrupts can only add new
 * entries to the tail of this list, and only ->poll()
 * calls can remove this head entry from the list.
 */
-   iop = list_entry(list->next, struct irq_poll, list);
+   iop = list_first_entry(&list, struct irq_poll, list);

weight = iop->weight;
work = 0;
if (test_bit(IRQ_POLL_F_SCHED, &iop->state))
work = iop->poll(iop, weight);

-   budget -= work;
-
-   local_irq_disable();
-
/*
 * Drivers must not modify the iopoll state, if they
 * consume their assigned weight (or more, some drivers can't
@@ -125,11 +110,21

[PATCH 0/2] Add bounds check for Hotplugged memory

2019-09-09 Thread Alastair D'Silva

From: Alastair D'Silva 

This series adds bounds checks for hotplugged memory, ensuring that
it is within the physically addressable range (for platforms that
define MAX_(POSSIBLE_)PHYSMEM_BITS).

This allows for early failure, rather than attempting to access
bogus section numbers.

Alastair D'Silva (2):
  memory_hotplug: Add a bounds check to check_hotplug_memory_range()
  mm: Add a bounds check in devm_memremap_pages()

 include/linux/memory_hotplug.h |  1 +
 mm/memory_hotplug.c| 19 ++-
 mm/memremap.c  |  8 
 3 files changed, 27 insertions(+), 1 deletion(-)

-- 
2.21.0

[PATCH 2/2] mm: Add a bounds check in devm_memremap_pages()

2019-09-09 Thread Alastair D'Silva

From: Alastair D'Silva 

The call to check_hotplug_memory_addressable() validates that the memory
is fully addressable.

Without this call, it is possible that we may remap pages that are
not physically addressable, resulting in bogus section numbers
being returned from __section_nr().

Signed-off-by: Alastair D'Silva 
---
 mm/memremap.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/mm/memremap.c b/mm/memremap.c
index 86432650f829..fd00993caa3e 100644
--- a/mm/memremap.c
+++ b/mm/memremap.c
@@ -269,6 +269,13 @@ void *devm_memremap_pages(struct device *dev, struct dev_pagemap *pgmap)
 
mem_hotplug_begin();
 
+   error = check_hotplug_memory_addressable(res->start,
+resource_size(res));
+   if (error) {
+   mem_hotplug_done();
+   goto err_checkrange;
+   }
+
/*
 * For device private memory we call add_pages() as we only need to
 * allocate and initialize struct page for the device memory. More-
@@ -324,6 +331,7 @@ void *devm_memremap_pages(struct device *dev, struct dev_pagemap *pgmap)
 
  err_add_memory:
kasan_remove_zero_shadow(__va(res->start), resource_size(res));
+ err_checkrange:
  err_kasan:
untrack_pfn(NULL, PHYS_PFN(res->start), resource_size(res));
  err_pfn_remap:
-- 
2.21.0

[PATCH 1/2] memory_hotplug: Add a bounds check to check_hotplug_memory_range()

2019-09-09 Thread Alastair D'Silva

From: Alastair D'Silva 

On PowerPC, the address ranges allocated to OpenCAPI LPC memory
are allocated from firmware. These address ranges may be higher
than what older kernels permit, as we increased the maximum
permissible address in commit 4ffe713b7587
("powerpc/mm: Increase the max addressable memory to 2PB"). It is
possible that the addressable range may change again in the
future.

In this scenario, we end up with a bogus section returned from
__section_nr (see the discussion on the thread "mm: Trigger bug on
if a section is not found in __section_nr").

Adding a check here means that we fail early and have an
opportunity to handle the error gracefully, rather than rumbling
on and potentially accessing an incorrect section.

Further discussion is also on the thread ("powerpc: Perform a bounds
check in arch_add_memory").

Signed-off-by: Alastair D'Silva 
---
 include/linux/memory_hotplug.h |  1 +
 mm/memory_hotplug.c| 19 ++-
 2 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index f46ea71b4ffd..bc477e98a310 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -110,6 +110,7 @@ extern void __online_page_increment_counters(struct page *page);
 extern void __online_page_free(struct page *page);
 
 extern int try_online_node(int nid);
+int check_hotplug_memory_addressable(u64 start, u64 size);
 
 extern int arch_add_memory(int nid, u64 start, u64 size,
struct mhp_restrictions *restrictions);
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index c73f09913165..3c5428b014f9 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1030,6 +1030,23 @@ int try_online_node(int nid)
return ret;
 }
 
+#ifndef MAX_POSSIBLE_PHYSMEM_BITS
+#ifdef MAX_PHYSMEM_BITS
+#define MAX_POSSIBLE_PHYSMEM_BITS MAX_PHYSMEM_BITS
+#endif
+#endif
+
+int check_hotplug_memory_addressable(u64 start, u64 size)
+{
+#ifdef MAX_POSSIBLE_PHYSMEM_BITS
+   if ((start + size - 1) >> MAX_POSSIBLE_PHYSMEM_BITS)
+   return -E2BIG;
+#endif
+
+   return 0;
+}
+EXPORT_SYMBOL_GPL(check_hotplug_memory_addressable);
+
 static int check_hotplug_memory_range(u64 start, u64 size)
 {
/* memory range must be block size aligned */
@@ -1040,7 +1057,7 @@ static int check_hotplug_memory_range(u64 start, u64 size)
return -EINVAL;
}
 
-   return 0;
+   return check_hotplug_memory_addressable(start, size);
 }
 
 static int online_memory_block(struct memory_block *mem, void *arg)
-- 
2.21.0
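
For illustration, the shift-based bounds check added in this patch can be exercised in user space. The sketch below mirrors the expression used in check_hotplug_memory_addressable(); the name `sketch_check_addressable()` and the value 46 for `SKETCH_MAX_PHYSMEM_BITS` are assumptions made for the example, not values taken from any particular architecture.

```c
#include <errno.h>
#include <stdint.h>

/* Assumed value for illustration; the kernel takes this from arch headers. */
#define SKETCH_MAX_PHYSMEM_BITS 46

/*
 * Returns 0 if the last byte of [start, start + size) lies below
 * 2^SKETCH_MAX_PHYSMEM_BITS, -E2BIG otherwise.
 */
static int sketch_check_addressable(uint64_t start, uint64_t size)
{
	/* Any bit at or above the limit in the last address means "too big". */
	if ((start + size - 1) >> SKETCH_MAX_PHYSMEM_BITS)
		return -E2BIG;

	return 0;
}
```

A range that ends exactly at the limit passes, while one that starts at the limit or straddles it is rejected. Like the expression it mirrors, the sketch does not guard against start + size wrapping around 64 bits.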

RE: [PATCH] clk: imx: lpcg: write twice when writing lpcg regs

2019-09-09 Thread Anson Huang

> On Sat, Sep 7, 2019 at 9:47 PM Stephen Boyd  wrote:
> >
> > Quoting Peng Fan (2019-08-27 01:17:50)
> > > From: Peng Fan 
> > >
> > > There is hardware issue that:
> > > The output clock the LPCG cell will not turn back on as expected,
> > > even though a read of the IPG registers in the LPCG indicates that
> > > the clock should be enabled.
> > >
> > > The software workaround is to write twice to enable the LPCG clock
> > > output.
> > >
> > > Signed-off-by: Peng Fan 
> >
> > Does this need a Fixes tag?
> 
> Not sure as it's not code logic issue but a hardware bug.
> And 4.19 LTS still have not this driver support.

Looks like there is an erratum for this issue, and Ranjani just sent a patch
for internal review:

Back-to-back LPCG writes can be ignored by the LPCG register due to a
HW bug. The writes need to be separated by at least 4 cycles of the gated clock.
The workaround is implemented as follows:
1. For clocks running greater than 50MHz no delay is required as the
delay in accessing the LPCG register is sufficient.
2. For clocks running greater than 23MHz, a read followed by the write
will provide the sufficient delay.
3. For clocks running below 23MHz, LPCG is not used.

Need double check?

Anson.
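
The three cases of the workaround described above can be sketched as a simple rate-based selection. This is only an illustration: `enum lpcg_write_strategy` and `lpcg_pick_strategy()` are hypothetical names, not part of the clk-lpcg driver, and the handling of exactly 23MHz and exactly 50MHz is an assumption, since the description only says "greater than" and "below".

```c
/* Hypothetical strategies matching the three cases of the workaround. */
enum lpcg_write_strategy {
	LPCG_WRITE_PLAIN,	/* register access latency is enough delay */
	LPCG_WRITE_AFTER_READ,	/* read the register first, then write */
	LPCG_NOT_USED,		/* clock too slow, LPCG is not used */
};

static enum lpcg_write_strategy lpcg_pick_strategy(unsigned long rate_hz)
{
	if (rate_hz > 50000000UL)
		return LPCG_WRITE_PLAIN;
	/* Assumption: exactly 50MHz takes the more conservative path. */
	if (rate_hz > 23000000UL)
		return LPCG_WRITE_AFTER_READ;
	/* Assumption: exactly 23MHz is treated like "below 23MHz". */
	return LPCG_NOT_USED;
}
```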

Re: [PATCH v1 1/1] ARM: dts: rockchip: set crypto default disabled on rk3288

2019-09-09 Thread Elon Zhang


Hi Heiko,

On 9/1/2019 07:04, Heiko Stuebner wrote:

Hi Elon,

On Thursday, 29 August 2019, 13:31:00 CEST, Elon Zhang wrote:

On 8/27/2019 22:28, Heiko Stuebner wrote:

On Tuesday, 27 August 2019, 09:14:39 CEST, Elon Zhang wrote:

Not every board needs to enable crypto node, so the node should
be set default disabled in rk3288.dtsi and enabled in specific
board dts file.

Can you give a bit more rationale here? There would need to be a very
specific reason because of the following:

The crypto module is not wired to some board-specific components,
so its usability does not depend on the specific board at all.
Instead every board can just use it out of the box and the devicetree
is supposed to describe the hardware and is _not_ meant as a space
for user configuration.

Right for almost all normal hardware modules, but the crypto module was
designed for the secure world. As a result, the crypto module will become
inaccessible to the Linux kernel if the secure world enables it.

We plan to enable the crypto module in the secure world, so we should set
the crypto module to disabled by default in the Linux kernel.

ok ... I'm halfway convinced ;-) .

The big thing I want to see is that secure setting in the actual firmware.
Aka right now you probably have that in your Rockchip-specific ATF fork
and I really want to see the relevant change for public uboot or ATF.

I don't necessarily require it to be fully merged before taking this, but
I really want to see the change either on a mailing list or atf gerrit
instance [that makes the crypto engine secure only].

Rationale behind this is that we don't care very much about private stuff
that the general ecosystem doesn't benefit from.


Currently the crypto security property is set in the Rockchip private
code, which is not open source. So if you don't care about private stuff
and the change in private stuff will not affect the upstream kernel, can
the crypto be enabled in the upstream kernel?




Thanks
Heiko



So in fact the status property should probably go away completely from
the crypto node, as it's usable out of the box in all cases.


Heiko




Signed-off-by: Elon Zhang 
---
   arch/arm/boot/dts/rk3288.dtsi | 2 +-
   1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm/boot/dts/rk3288.dtsi b/arch/arm/boot/dts/rk3288.dtsi
index cc893e154fe5..d509aa24177c 100644
--- a/arch/arm/boot/dts/rk3288.dtsi
+++ b/arch/arm/boot/dts/rk3288.dtsi
@@ -984,7 +984,7 @@
clock-names = "aclk", "hclk", "sclk", "apb_pclk";
resets = <&cru SRST_CRYPTO>;
reset-names = "crypto-rst";
-   status = "okay";
+   status = "disabled";
};
   
   	iep_mmu: iommu@ff900800 {

[PATCH] arm64: dts: imx8mn: Add "fsl,imx8mq-src" as src's fallback compatible

2019-09-09 Thread Anson Huang

i.MX8MN can reuse i.MX8MQ's src driver, add "fsl,imx8mq-src" as
src's fallback compatible to enable it.

Signed-off-by: Anson Huang 
---
 arch/arm64/boot/dts/freescale/imx8mn.dtsi | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/boot/dts/freescale/imx8mn.dtsi b/arch/arm64/boot/dts/freescale/imx8mn.dtsi
index 785f4c4..d94db95 100644
--- a/arch/arm64/boot/dts/freescale/imx8mn.dtsi
+++ b/arch/arm64/boot/dts/freescale/imx8mn.dtsi
@@ -371,7 +371,7 @@
};
 
src: reset-controller@3039 {
-   compatible = "fsl,imx8mn-src", "syscon";
+   compatible = "fsl,imx8mn-src", "fsl,imx8mq-src", "syscon";
reg = <0x3039 0x1>;
interrupts = ;
#reset-cells = <1>;
-- 
2.7.4

[PATCH] proc:fix confusing macro arg name

2019-09-09 Thread Miaohe Lin

state_size and ops are in the wrong position, fix it.

Signed-off-by: Miaohe Lin 
---
 include/linux/proc_fs.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/linux/proc_fs.h b/include/linux/proc_fs.h
index a705aa2d03f9..0640be56dcbd 100644
--- a/include/linux/proc_fs.h
+++ b/include/linux/proc_fs.h
@@ -58,8 +58,8 @@ extern int remove_proc_subtree(const char *, struct proc_dir_entry *);
 struct proc_dir_entry *proc_create_net_data(const char *name, umode_t mode,
struct proc_dir_entry *parent, const struct seq_operations *ops,
unsigned int state_size, void *data);
-#define proc_create_net(name, mode, parent, state_size, ops) \
-   proc_create_net_data(name, mode, parent, state_size, ops, NULL)
+#define proc_create_net(name, mode, parent, ops, state_size) \
+   proc_create_net_data(name, mode, parent, ops, state_size, NULL)
 struct proc_dir_entry *proc_create_net_single(const char *name, umode_t mode,
struct proc_dir_entry *parent,
int (*show)(struct seq_file *, void *), void *data);
-- 
2.21.GIT
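
As a side note on why this rename is cosmetic: function-like macro parameters are substituted by position, so the old and new spellings expand to the same call. A minimal sketch follows, using hypothetical stand-ins (`struct seq_ops_sketch`, `create_net_data()`, `create_net_old()`, `create_net_new()`) instead of the real proc_fs declarations.

```c
#include <stddef.h>

struct seq_ops_sketch { int id; };

/* Records what it was called with so the result can be inspected. */
struct created {
	const struct seq_ops_sketch *ops;
	unsigned int state_size;
};

static struct created create_net_data(const struct seq_ops_sketch *ops,
				      unsigned int state_size, void *data)
{
	struct created c = { ops, state_size };

	(void)data;
	return c;
}

/* Old spelling: parameter names are swapped relative to what callers pass. */
#define create_net_old(state_size, ops) create_net_data(state_size, ops, NULL)
/* New spelling: names match the parameters of create_net_data(). */
#define create_net_new(ops, state_size) create_net_data(ops, state_size, NULL)
```

Calling either macro with `(&ops, 16u)` passes the ops pointer first and the size second; only the readability of the macro definition changes.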

RE: [EXT] Re: [PATCH 3/3] ASoC: fsl_asrc: Fix error with S24_3LE format bitstream in i.MX8

2019-09-09 Thread S.j. Wang



Hi

> 
> On Mon, Sep 09, 2019 at 06:33:21PM -0400, Shengjiu Wang wrote:
> > There is error "aplay: pcm_write:2023: write error: Input/output error"
> > on i.MX8QM/i.MX8QXP platform for S24_3LE format.
> >
> > In i.MX8QM/i.MX8QXP, the DMA is EDMA, which don't support 24bit
> > sample, but we didn't add any constraint, that cause issues.
> >
> > So we need to query the caps of dma, then update the hw parameters
> > according to the caps.
> 
> > @@ -285,8 +293,81 @@ static int fsl_asrc_dma_startup(struct snd_pcm_substream *substream)
> >
> >   runtime->private_data = pair;
> >
> > - snd_pcm_hw_constraint_integer(substream->runtime,
> > -   SNDRV_PCM_HW_PARAM_PERIODS);
> > + ret = snd_pcm_hw_constraint_integer(substream->runtime,
> > + SNDRV_PCM_HW_PARAM_PERIODS);
> > + if (ret < 0) {
> > + dev_err(dev, "failed to set pcm hw params periods\n");
> > + return ret;
> > + }
> > +
> > + dma_data = snd_soc_dai_get_dma_data(rtd->cpu_dai, substream);
> > +
> > + /* Request a temp pair, which is release in the end */
> > + fsl_asrc_request_pair(1, pair);
> 
> Not sure if it'd be practical, but a pair request could fail. Will probably
> need to check return value.
> 
> And a quick feeling is that below code is mostly identical to what is in the
> soc-generic-dmaengine-pcm.c file. So I'm wondering if we could abstract a
> helper function somewhere in the ASoC core: Mark?
> 
> Thanks
> Nicolin
> 
Yes, it is based on the code in soc-generic-dmaengine-pcm.c. A common
API would be helpful.

Best regards
Wang shengjiu

Re: [PATCH 1/2] export.h: remove defined(__KERNEL__)

2019-09-09 Thread Nicolas Pitre

On Tue, 10 Sep 2019, Masahiro Yamada wrote:

> On Tue, Sep 10, 2019 at 1:06 AM Nicolas Pitre  wrote:
> >
> > On Mon, 9 Sep 2019, Masahiro Yamada wrote:
> >
> > > Hi Nicolas,
> > >
> > > On Mon, Sep 9, 2019 at 10:48 PM Nicolas Pitre  wrote:
> > > >
> > > > On Mon, 9 Sep 2019, Masahiro Yamada wrote:
> > > >
> > > > > This line was touched by commit f235541699bc ("export.h: allow for
> > > > > per-symbol configurable EXPORT_SYMBOL()"), but the commit log did
> > > > > not explain why.
> > > > >
> > > > > CONFIG_TRIM_UNUSED_KSYMS works for me without defined(__KERNEL__).
> > > >
> > > > I'm pretty sure it was needed back then so not to interfere with users
> > > > of this file. My fault for not documenting it.
> > >
> > > Hmm, I did not see a problem in my quick build test.
> > >
> > > Do you remember which file was causing the problem?
> >
> > If you build commit 7ec925701f5f with CONFIG_TRIM_UNUSED_KSYMS=y and the
> > defined(__KERNEL__) test removed then you'll get:
> >
> >   HOSTCC  scripts/mod/modpost.o
> > In file included from scripts/mod/modpost.c:24:
> > scripts/mod/../../include/linux/export.h:81:10: fatal error: linux/kconfig.h: No such file or directory
> >
> >
> > Nicolas
> 
> 
> Thanks for explaining this.
> 
> It is not the case any more.
> 
> 
> I will reword the commit message as follows:
> 
> >8---
> export.h: remove defined(__KERNEL__), which is no longer needed
> 
> The conditional defined(__KERNEL__) was added by commit f235541699bc
> ("export.h: allow for per-symbol configurable EXPORT_SYMBOL()").
> 
> It was needed at that time to avoid the build error of modpost
> with CONFIG_TRIM_UNUSED_KSYMS=y.
> 
> Since commit b2c5cdcfd4bc ("modpost: remove symbol prefix support"),
> modpost no longer includes linux/export.h, thus the defined(__KERNEL__)
> is unneeded.
> >8---
> 

Acked-by: Nicolas Pitre 


Nicolas

RE: [EXT] Re: [PATCH 1/3] ASoC: fsl_asrc: Use in(out)put_format instead of in(out)put_word_width

2019-09-09 Thread S.j. Wang

Hi

> 
> On Mon, Sep 09, 2019 at 06:33:19PM -0400, Shengjiu Wang wrote:
> > snd_pcm_format_t is more formal than enum asrc_word_width, which has
> > two property, width and physical width, which is more accurate than
> > enum asrc_word_width. So it is better to use in(out)put_format instead
> > of in(out)put_word_width.
> 
> Hmm...I don't really see the benefit of using snd_pcm_format_t here...I
> mean, I know it's a generic one, and would understand if we use it as a
> param for a common API. But this patch merely packs the "width" by
> intentionally using this snd_pcm_format_t and then adds another
> translation to unpack it.. I feel it's a bit overcomplicated. Or am I missing
> something?
> 
> And I feel it's not necessary to use ALSA common format in our own "struct
> asrc_config" since it is more IP/register specific.
> 
> Thanks
> Nicolin
> 

As you know, we have another M2M function internally. When the user wants to
set the format through the M2M API, it is better to use snd_pcm_format_t
instead of the width, because snd_pcm_format_t carries two properties: data
width and physical width. Some places in the driver need the data width,
others need the physical width. For example, to distinguish S24_LE from
S24_3LE in the driver, the DMA setting needs the physical width, but the ASRC
needs the data width.

Another purpose is that we have a newly designed ASRC, which supports more
formats. I would like it to share the same API with this ASRC. By using
snd_pcm_format_t we can use the common API, like snd_pcm_format_linear and
snd_pcm_format_big_endian, to get the properties of the format that are
needed by the driver.
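
To illustrate the width versus physical width distinction mentioned above, here is a small stand-alone table for a few common ALSA formats. It is only a sketch: `struct fmt_widths`, `fmt_table` and `fmt_lookup()` are hypothetical helpers for this example, while in the kernel the same information comes from snd_pcm_format_width() and snd_pcm_format_physical_width().

```c
#include <string.h>

struct fmt_widths {
	const char *name;
	int width;		/* significant bits per sample */
	int physical_width;	/* bits the sample occupies in memory */
};

/* Values as defined for these ALSA sample formats. */
static const struct fmt_widths fmt_table[] = {
	{ "S16_LE",  16, 16 },
	{ "S20_3LE", 20, 24 },
	{ "S24_LE",  24, 32 },
	{ "S24_3LE", 24, 24 },
	{ "S32_LE",  32, 32 },
};

/* Returns the table entry for 'name', or NULL if it is not listed. */
static const struct fmt_widths *fmt_lookup(const char *name)
{
	for (size_t i = 0; i < sizeof(fmt_table) / sizeof(fmt_table[0]); i++)
		if (strcmp(fmt_table[i].name, name) == 0)
			return &fmt_table[i];
	return NULL;
}
```

S24_LE and S24_3LE share the same data width (24) but differ in physical width (32 versus 24), which is why a single "word width" value cannot describe both the ASRC setting and the DMA setting.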

Best regards
Wang shengjiu

Re: [PATCH] mm: avoid slub allocation while holding list_lock

2019-09-09 Thread Yu Zhao

On Tue, Sep 10, 2019 at 10:41:31AM +0900, Tetsuo Handa wrote:
> Yu Zhao wrote:
> > I think we can safely assume PAGE_SIZE is unsigned long aligned and
> > page->objects is non-zero. But if you don't feel comfortable with these
> > assumptions, I'd be happy to ensure them explicitly.
> 
> I know PAGE_SIZE is unsigned long aligned. If someone by chance happens to
> change from "dynamic allocation" to "on stack", get_order() will no longer
> be called and the bug will show up.
> 
> I don't know whether __get_free_page(GFP_ATOMIC) can temporarily consume more
> than 4096 bytes, but if it can, we might want to avoid "dynamic allocation".

With GFP_ATOMIC and ~__GFP_HIGHMEM, it shouldn't.

> By the way, if "struct kmem_cache_node" is object which won't have many thousands
> of instances, can't we embed that buffer into "struct kmem_cache_node" because
> max size of that buffer is only 4096 bytes?

It seems to me allocation in error path is better than always keeping
a page around. But the latter may still be acceptable given it's done
only when debug is on and, of course, on a per-node scale.

Re: [PATCH 1/2] export.h: remove defined(__KERNEL__)

2019-09-09 Thread Masahiro Yamada

On Tue, Sep 10, 2019 at 1:06 AM Nicolas Pitre  wrote:
>
> On Mon, 9 Sep 2019, Masahiro Yamada wrote:
>
> > Hi Nicolas,
> >
> > On Mon, Sep 9, 2019 at 10:48 PM Nicolas Pitre  wrote:
> > >
> > > On Mon, 9 Sep 2019, Masahiro Yamada wrote:
> > >
> > > > This line was touched by commit f235541699bc ("export.h: allow for
> > > > per-symbol configurable EXPORT_SYMBOL()"), but the commit log did
> > > > not explain why.
> > > >
> > > > CONFIG_TRIM_UNUSED_KSYMS works for me without defined(__KERNEL__).
> > >
> > > I'm pretty sure it was needed back then so not to interfere with users
> > > of this file. My fault for not documenting it.
> >
> > Hmm, I did not see a problem in my quick build test.
> >
> > Do you remember which file was causing the problem?
>
> If you build commit 7ec925701f5f with CONFIG_TRIM_UNUSED_KSYMS=y and the
> defined(__KERNEL__) test removed then you'll get:
>
>   HOSTCC  scripts/mod/modpost.o
> In file included from scripts/mod/modpost.c:24:
> scripts/mod/../../include/linux/export.h:81:10: fatal error: linux/kconfig.h: No such file or directory
>
>
> Nicolas


Thanks for explaining this.

It is not the case any more.


I will reword the commit message as follows:

>8---
export.h: remove defined(__KERNEL__), which is no longer needed

The conditional defined(__KERNEL__) was added by commit f235541699bc
("export.h: allow for per-symbol configurable EXPORT_SYMBOL()").

It was needed at that time to avoid the build error of modpost
with CONFIG_TRIM_UNUSED_KSYMS=y.

Since commit b2c5cdcfd4bc ("modpost: remove symbol prefix support"),
modpost no longer includes linux/export.h, thus the defined(__KERNEL__)
is unneeded.
>8---



--
Best Regards
Masahiro Yamada

Re: [PATCH 2/3] ASoC: fsl_asrc: update supported sample format

2019-09-09 Thread S.j. Wang

Hi

> 
> On Mon, Sep 09, 2019 at 06:33:20PM -0400, Shengjiu Wang wrote:
> > The ASRC support 24bit/16bit/8bit input width, so S20_3LE format
> > should not be supported, it is word width is 20bit.
> 
> I thought 3LE used 24-bit physical width. And the driver assigns
> ASRC_WIDTH_24_BIT to "width" for all non-16bit cases, so 20-bit would go
> for that 24-bit slot also. I don't clearly recall if I had explicitly tested
> S20_3LE, but I feel it should work since I put there...
> 
> Thanks
> Nicolin
> 

For S20_3LE, the width is 20 bits, but the ASRC only supports 24 bits. If we
set ASRMCR1n.IWD to 24 bits while the actual width is 20 bits, the volume is
lower than expected; it looks like 24-bit data shifted right by 4 bits.
So it is not supported.
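
The "shifted right by 4 bits" effect can be checked with a little arithmetic. The sketch below is illustrative only; `full_scale()` and `scale_up()` are hypothetical helpers and not part of the driver. A full-scale 20-bit sample read as a 24-bit value reaches only 1/16 of the 24-bit range, which is roughly 24 dB quieter.

```c
#include <stdint.h>

/* Largest positive value of a signed sample with 'bits' significant bits. */
static int32_t full_scale(int bits)
{
	return (int32_t)((1u << (bits - 1)) - 1);
}

/*
 * Scale a sample with 'src_bits' significant bits so that it spans a
 * 'dst_bits' range (dst_bits >= src_bits is assumed). Multiplication is
 * used instead of a left shift so negative samples stay well defined.
 */
static int32_t scale_up(int32_t sample, int src_bits, int dst_bits)
{
	return sample * (1 << (dst_bits - src_bits));
}
```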

Best regards
Wang shengjiu

[PATCH] reset: uniphier-glue: Add Pro5 USB3 support

2019-09-09 Thread Kunihiko Hayashi

Pro5 SoC has same scheme of USB3 reset as Pro4, so the data for Pro5 is
equivalent to Pro4.

Signed-off-by: Kunihiko Hayashi 
---
 Documentation/devicetree/bindings/reset/uniphier-reset.txt | 5 +++--
 drivers/reset/reset-uniphier-glue.c| 4 
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/Documentation/devicetree/bindings/reset/uniphier-reset.txt b/Documentation/devicetree/bindings/reset/uniphier-reset.txt
index ea00517..e320a8c 100644
--- a/Documentation/devicetree/bindings/reset/uniphier-reset.txt
+++ b/Documentation/devicetree/bindings/reset/uniphier-reset.txt
@@ -130,6 +130,7 @@ this layer. These clocks and resets should be described in each property.
 Required properties:
 - compatible: Should be
 "socionext,uniphier-pro4-usb3-reset" - for Pro4 SoC USB3
+"socionext,uniphier-pro5-usb3-reset" - for Pro5 SoC USB3
 "socionext,uniphier-pxs2-usb3-reset" - for PXs2 SoC USB3
 "socionext,uniphier-ld20-usb3-reset" - for LD20 SoC USB3
 "socionext,uniphier-pxs3-usb3-reset" - for PXs3 SoC USB3
@@ -141,12 +142,12 @@ Required properties:
 - clocks: A list of phandles to the clock gate for the glue layer.
According to the clock-names, appropriate clocks are required.
 - clock-names: Should contain
-"gio", "link" - for Pro4 SoC
+"gio", "link" - for Pro4 and Pro5 SoCs
 "link"- for others
 - resets: A list of phandles to the reset control for the glue layer.
According to the reset-names, appropriate resets are required.
 - reset-names: Should contain
-"gio", "link" - for Pro4 SoC
+"gio", "link" - for Pro4 and Pro5 SoCs
 "link"- for others
 
 Example:
diff --git a/drivers/reset/reset-uniphier-glue.c b/drivers/reset/reset-uniphier-glue.c
index a45923f..2b188b3bb 100644
--- a/drivers/reset/reset-uniphier-glue.c
+++ b/drivers/reset/reset-uniphier-glue.c
@@ -141,6 +141,10 @@ static const struct of_device_id uniphier_glue_reset_match[] = {
.data = &uniphier_pro4_data,
},
{
+   .compatible = "socionext,uniphier-pro5-usb3-reset",
+   .data = &uniphier_pro4_data,
+   },
+   {
.compatible = "socionext,uniphier-pxs2-usb3-reset",
.data = &uniphier_pxs2_data,
},
-- 
2.7.4

Re: [PATCH 3/3] ASoC: fsl_asrc: Fix error with S24_3LE format bitstream in i.MX8

2019-09-09 Thread Nicolin Chen

On Mon, Sep 09, 2019 at 06:33:21PM -0400, Shengjiu Wang wrote:
> There is error "aplay: pcm_write:2023: write error: Input/output error"
> on i.MX8QM/i.MX8QXP platform for S24_3LE format.
> 
> In i.MX8QM/i.MX8QXP, the DMA is EDMA, which don't support 24bit
> sample, but we didn't add any constraint, that cause issues.
> 
> So we need to query the caps of dma, then update the hw parameters
> according to the caps.

> @@ -285,8 +293,81 @@ static int fsl_asrc_dma_startup(struct snd_pcm_substream *substream)
>  
>   runtime->private_data = pair;
>  
> - snd_pcm_hw_constraint_integer(substream->runtime,
> -   SNDRV_PCM_HW_PARAM_PERIODS);
> + ret = snd_pcm_hw_constraint_integer(substream->runtime,
> + SNDRV_PCM_HW_PARAM_PERIODS);
> + if (ret < 0) {
> + dev_err(dev, "failed to set pcm hw params periods\n");
> + return ret;
> + }
> +
> + dma_data = snd_soc_dai_get_dma_data(rtd->cpu_dai, substream);
> +
> + /* Request a temp pair, which is release in the end */
> + fsl_asrc_request_pair(1, pair);

Not sure if it'd be practical, but a pair request could fail. Will
probably need to check return value.

And a quick feeling is that below code is mostly identical to what
is in the soc-generic-dmaengine-pcm.c file. So I'm wondering if we
could abstract a helper function somewhere in the ASoC core: Mark?

Thanks
Nicolin

> + tmp_chan = fsl_asrc_get_dma_channel(pair, dir);
> + if (!tmp_chan) {
> + dev_err(dev, "can't get dma channel\n");
> + return -EINVAL;
> + }
> +
> + ret = dma_get_slave_caps(tmp_chan, &dma_caps);
> + if (ret == 0) {
> + if (dma_caps.cmd_pause)
> + snd_imx_hardware.info |= SNDRV_PCM_INFO_PAUSE |
> +  SNDRV_PCM_INFO_RESUME;
> + if (dma_caps.residue_granularity <=
> + DMA_RESIDUE_GRANULARITY_SEGMENT)
> + snd_imx_hardware.info |= SNDRV_PCM_INFO_BATCH;
> +
> + if (substream->stream == SNDRV_PCM_STREAM_PLAYBACK)
> + addr_widths = dma_caps.dst_addr_widths;
> + else
> + addr_widths = dma_caps.src_addr_widths;
> + }
> +
> + /*
> +  * If SND_DMAENGINE_PCM_DAI_FLAG_PACK is set keep
> +  * hw.formats set to 0, meaning no restrictions are in place.
> +  * In this case it's the responsibility of the DAI driver to
> +  * provide the supported format information.
> +  */
> + if (!(dma_data->flags & SND_DMAENGINE_PCM_DAI_FLAG_PACK))
> + /*
> +  * Prepare formats mask for valid/allowed sample types. If the
> +  * dma does not have support for the given physical word size,
> +  * it needs to be masked out so user space can not use the
> +  * format which produces corrupted audio.
> +  * In case the dma driver does not implement the slave_caps the
> +  * default assumption is that it supports 1, 2 and 4 bytes
> +  * widths.
> +  */
> + for (i = 0; i <= SNDRV_PCM_FORMAT_LAST; i++) {
> + int bits = snd_pcm_format_physical_width(i);
> +
> + /*
> +  * Enable only samples with DMA supported physical
> +  * widths
> +  */
> + switch (bits) {
> + case 8:
> + case 16:
> + case 24:
> + case 32:
> + case 64:
> + if (addr_widths & (1 << (bits / 8)))
> + snd_imx_hardware.formats |= (1LL << i);
> + break;
> + default:
> + /* Unsupported types */
> + break;
> + }
> + }
> +
> + if (tmp_chan)
> + dma_release_channel(tmp_chan);
> + fsl_asrc_release_pair(pair);
> +
>   snd_soc_set_runtime_hwparams(substream, &snd_imx_hardware);
>  
>   return 0;
> -- 
> 2.21.0
>
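
The format-mask loop quoted above boils down to one bit test per physical width. A minimal user-space sketch of that test follows; `dma_supports_physical_width()` is a hypothetical helper, and the capability mask used with it is made up to resemble a DMA engine without 3-byte transfers, which is the situation described for the i.MX8 EDMA.

```c
#include <stdint.h>

/*
 * Returns nonzero if a DMA capability mask can move samples whose physical
 * width is 'bits'. Bit N of the mask means "N-byte bus width supported",
 * as in the quoted (1 << (bits / 8)) test.
 */
static int dma_supports_physical_width(uint32_t addr_widths, int bits)
{
	switch (bits) {
	case 8:
	case 16:
	case 24:
	case 32:
	case 64:
		return (addr_widths & (1u << (bits / 8))) != 0;
	default:
		/* Unsupported physical widths are always masked out. */
		return 0;
	}
}
```

With a mask that has bits 1, 2 and 4 set but not bit 3, the 24-bit physical width used by S24_3LE is rejected while 16-bit and 32-bit containers remain available.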

Re: [RFC PATCH untested] vhost: block speculation of translated descriptors

2019-09-09 Thread Jason Wang




On 2019/9/9 10:45 PM, Michael S. Tsirkin wrote:

On Mon, Sep 09, 2019 at 03:19:55PM +0800, Jason Wang wrote:

On 2019/9/8 7:05 PM, Michael S. Tsirkin wrote:

iovec addresses coming from vhost are assumed to be
pre-validated, but in fact can be speculated to a value
out of range.

Userspace addresses are later validated with array_index_nospec so we can
be sure kernel info does not leak through these addresses, but vhost
must also not leak userspace info outside the allowed memory table to
guests.

Following the defence in depth principle, make sure
the address is not validated out of node range.

Signed-off-by: Michael S. Tsirkin 
---
   drivers/vhost/vhost.c | 4 +++-
   1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 5dc174ac8cac..0ee375fb7145 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -2072,7 +2072,9 @@ static int translate_desc(struct vhost_virtqueue *vq, u64 addr, u32 len,
size = node->size - addr + node->start;
_iov->iov_len = min((u64)len - s, size);
_iov->iov_base = (void __user *)(unsigned long)
-   (node->userspace_addr + addr - node->start);
+   (node->userspace_addr +
+array_index_nospec(addr - node->start,
+   node->size));
s += size;
addr += size;
++ret;


I've tried this on Kaby Lake smap off metadata acceleration off using
testpmd (virtio-user) + vhost_net. I don't see obvious performance
difference with TX PPS.

Thanks

Should I push this to Linus right now then? It's a security thing so
maybe we better do it ASAP ... what's your opinion?



Yes, you can.

Acked-by: Jason Wang
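
For readers unfamiliar with array_index_nospec(), the arithmetic behind the kernel's generic fallback mask can be sketched in user space as below. The `sketch_` names are hypothetical. The sketch shows only the branch-free clamping; it omits the optimizer barrier and the architecture-specific versions the kernel uses, and it assumes an arithmetic right shift of negative values (as gcc and clang provide) and a size no larger than LONG_MAX.

```c
#include <limits.h>

#define SKETCH_BITS_PER_LONG (sizeof(long) * CHAR_BIT)

/*
 * All ones if index < size, zero otherwise, computed without a branch.
 * The top bit of (index | (size - 1 - index)) is set exactly when
 * index >= size, so inverting and sign-extending it gives the mask.
 */
static unsigned long sketch_index_mask(unsigned long index, unsigned long size)
{
	return ~(long)(index | (size - 1UL - index)) >> (SKETCH_BITS_PER_LONG - 1);
}

/* In-range indexes pass through unchanged; out-of-range ones become 0. */
static unsigned long sketch_index_nospec(unsigned long index, unsigned long size)
{
	return index & sketch_index_mask(index, size);
}
```

In the patch, the offset `addr - node->start` plays the role of the index and `node->size` the role of the size, so an out-of-range offset collapses to 0 instead of producing an address outside the node.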

[PATCH] regulator: uniphier: Add Pro5 USB3 VBUS support

2019-09-09 Thread Kunihiko Hayashi

Pro5 SoC has same scheme of USB3 VBUS as Pro4, so the data for Pro5 is
equivalent to Pro4.

Signed-off-by: Kunihiko Hayashi 
---
 Documentation/devicetree/bindings/regulator/uniphier-regulator.txt | 5 +++--
 drivers/regulator/uniphier-regulator.c | 4 
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/Documentation/devicetree/bindings/regulator/uniphier-regulator.txt b/Documentation/devicetree/bindings/regulator/uniphier-regulator.txt
index c9919f4..94fd38b 100644
--- a/Documentation/devicetree/bindings/regulator/uniphier-regulator.txt
+++ b/Documentation/devicetree/bindings/regulator/uniphier-regulator.txt
@@ -13,6 +13,7 @@ this layer. These clocks and resets should be described in each property.
 Required properties:
 - compatible: Should be
 "socionext,uniphier-pro4-usb3-regulator" - for Pro4 SoC
+"socionext,uniphier-pro5-usb3-regulator" - for Pro5 SoC
 "socionext,uniphier-pxs2-usb3-regulator" - for PXs2 SoC
 "socionext,uniphier-ld20-usb3-regulator" - for LD20 SoC
 "socionext,uniphier-pxs3-usb3-regulator" - for PXs3 SoC
@@ -20,12 +21,12 @@ Required properties:
 - clocks: A list of phandles to the clock gate for USB3 glue layer.
According to the clock-names, appropriate clocks are required.
 - clock-names: Should contain
-"gio", "link" - for Pro4 SoC
+"gio", "link" - for Pro4 and Pro5 SoCs
 "link"- for others
 - resets: A list of phandles to the reset control for USB3 glue layer.
According to the reset-names, appropriate resets are required.
 - reset-names: Should contain
-"gio", "link" - for Pro4 SoC
+"gio", "link" - for Pro4 and Pro5 SoCs
 "link"- for others
 
 See Documentation/devicetree/bindings/regulator/regulator.txt
diff --git a/drivers/regulator/uniphier-regulator.c b/drivers/regulator/uniphier-regulator.c
index 9026d5a..2311924 100644
--- a/drivers/regulator/uniphier-regulator.c
+++ b/drivers/regulator/uniphier-regulator.c
@@ -186,6 +186,10 @@ static const struct of_device_id uniphier_regulator_match[] = {
.data = &uniphier_pro4_usb3_data,
},
{
+   .compatible = "socionext,uniphier-pro5-usb3-regulator",
+   .data = &uniphier_pro4_usb3_data,
+   },
+   {
.compatible = "socionext,uniphier-pxs2-usb3-regulator",
.data = &uniphier_pxs2_usb3_data,
},
-- 
2.7.4
