date:20200520

[PATCH net 2/3] rxrpc: Trace discarded ACKs [ver #2]

2020-05-20 Thread David Howells

Add a tracepoint to track received ACKs that are discarded due to being
outside of the Tx window.

Signed-off-by: David Howells 
---

 include/trace/events/rxrpc.h |   35 +++
 net/rxrpc/input.c|   12 ++--
 2 files changed, 45 insertions(+), 2 deletions(-)

diff --git a/include/trace/events/rxrpc.h b/include/trace/events/rxrpc.h
index ab75f261f04a..ba9efdc848f9 100644
--- a/include/trace/events/rxrpc.h
+++ b/include/trace/events/rxrpc.h
@@ -1541,6 +1541,41 @@ TRACE_EVENT(rxrpc_notify_socket,
  __entry->serial)
);
 
+TRACE_EVENT(rxrpc_rx_discard_ack,
+   TP_PROTO(unsigned int debug_id, rxrpc_serial_t serial,
+rxrpc_seq_t first_soft_ack, rxrpc_seq_t call_ackr_first,
+rxrpc_seq_t prev_pkt, rxrpc_seq_t call_ackr_prev),
+
+   TP_ARGS(debug_id, serial, first_soft_ack, call_ackr_first,
+   prev_pkt, call_ackr_prev),
+
+   TP_STRUCT__entry(
+   __field(unsigned int,   debug_id)
+   __field(rxrpc_serial_t, serial  )
+   __field(rxrpc_seq_t,first_soft_ack)
+   __field(rxrpc_seq_t,call_ackr_first)
+   __field(rxrpc_seq_t,prev_pkt)
+   __field(rxrpc_seq_t,call_ackr_prev)
+),
+
+   TP_fast_assign(
+   __entry->debug_id   = debug_id;
+   __entry->serial = serial;
+   __entry->first_soft_ack = first_soft_ack;
+   __entry->call_ackr_first= call_ackr_first;
+   __entry->prev_pkt   = prev_pkt;
+   __entry->call_ackr_prev = call_ackr_prev;
+  ),
+
+   TP_printk("c=%08x r=%08x %08x<%08x %08x<%08x",
+ __entry->debug_id,
+ __entry->serial,
+ __entry->first_soft_ack,
+ __entry->call_ackr_first,
+ __entry->prev_pkt,
+ __entry->call_ackr_prev)
+   );
+
 #endif /* _TRACE_RXRPC_H */
 
 /* This part must be outside protection */
diff --git a/net/rxrpc/input.c b/net/rxrpc/input.c
index e438bfd3fdf5..2f22f082a66c 100644
--- a/net/rxrpc/input.c
+++ b/net/rxrpc/input.c
@@ -866,8 +866,12 @@ static void rxrpc_input_ack(struct rxrpc_call *call, struct sk_buff *skb)
 
/* Discard any out-of-order or duplicate ACKs (outside lock). */
if (before(first_soft_ack, call->ackr_first_seq) ||
-   before(prev_pkt, call->ackr_prev_seq))
+   before(prev_pkt, call->ackr_prev_seq)) {
+   trace_rxrpc_rx_discard_ack(call->debug_id, sp->hdr.serial,
+  first_soft_ack, call->ackr_first_seq,
+  prev_pkt, call->ackr_prev_seq);
return;
+   }
 
buf.info.rxMTU = 0;
ioffset = offset + nr_acks + 3;
@@ -879,8 +883,12 @@ static void rxrpc_input_ack(struct rxrpc_call *call, struct sk_buff *skb)
 
/* Discard any out-of-order or duplicate ACKs (inside lock). */
if (before(first_soft_ack, call->ackr_first_seq) ||
-   before(prev_pkt, call->ackr_prev_seq))
+   before(prev_pkt, call->ackr_prev_seq)) {
+   trace_rxrpc_rx_discard_ack(call->debug_id, sp->hdr.serial,
+  first_soft_ack, call->ackr_first_seq,
+  prev_pkt, call->ackr_prev_seq);
goto out;
+   }
call->acks_latest_ts = skb->tstamp;
 
call->ackr_first_seq = first_soft_ack;

[PATCH net 3/3] rxrpc: Fix ack discard [ver #2]

2020-05-20 Thread David Howells

The Rx protocol has a "previousPacket" field in it that is not handled in
the same way by all protocol implementations.  Sometimes it contains the
serial number of the last DATA packet received, sometimes the sequence
number of the last DATA packet received and sometimes the highest sequence
number so far received.

AF_RXRPC is using this to weed out ACKs that are out of date (it's possible
for ACK packets to get reordered on the wire), but this does not work with
OpenAFS which will just stick the sequence number of the last packet seen
into previousPacket.

The issue being seen is that big AFS FS.StoreData RPCs (e.g. of ~256MiB) are
timing out when partly sent.  A trace was captured, with an additional
tracepoint to show ACKs being discarded in rxrpc_input_ack().  Here's an
excerpt showing the problem.

 52873.203230: rxrpc_tx_data: c=04ae DATA ed1a3584:0002 0002449c q=00024499 fl=09

A DATA packet with sequence number 00024499 has been transmitted (the "q="
field).

 ...
 52873.243296: rxrpc_rx_ack: c=04ae 00012a2b DLY r=00024499 f=00024497 p=00024496 n=0
 52873.243376: rxrpc_rx_ack: c=04ae 00012a2c IDL r=0002449b f=00024499 p=00024498 n=0
 52873.243383: rxrpc_rx_ack: c=04ae 00012a2d OOS r=0002449d f=00024499 p=0002449a n=2

The Out-Of-Sequence ACK indicates that the server didn't see DATA sequence
number 00024499, but did see seq 0002449a (previousPacket, shown as "p=",
skipped the number, but firstPacket, "f=", which shows the bottom of the
window is set at that point).

 52873.252663: rxrpc_retransmit: c=04ae q=24499 a=02 xp=14581537
 52873.252664: rxrpc_tx_data: c=04ae DATA ed1a3584:0002 000244bc q=00024499 fl=0b *RETRANS*

The packet has been retransmitted.  Retransmission recurs until the peer
says it got the packet.

 52873.271013: rxrpc_rx_ack: c=04ae 00012a31 OOS r=000244a1 f=00024499 p=0002449e n=6

More OOS ACKs indicate that the other packets that are already in the
transmission pipeline are being received.  The specific-ACK list is up to 6
ACKs and NAKs.

 ...
 52873.284792: rxrpc_rx_ack: c=04ae 00012a49 OOS r=000244b9 f=00024499 p=000244b6 n=30
 52873.284802: rxrpc_retransmit: c=04ae q=24499 a=0a xp=63505500
 52873.284804: rxrpc_tx_data: c=04ae DATA ed1a3584:0002 000244c2 q=00024499 fl=0b *RETRANS*
 52873.287468: rxrpc_rx_ack: c=04ae 00012a4a OOS r=000244ba f=00024499 p=000244b7 n=31
 52873.287478: rxrpc_rx_ack: c=04ae 00012a4b OOS r=000244bb f=00024499 p=000244b8 n=32

At this point, the server's receive window is full (n=32) with presumably 1
NAK'd packet and 31 ACK'd packets.  We can't transmit any more packets.

 52873.287488: rxrpc_retransmit: c=04ae q=24499 a=0a xp=61327980
 52873.287489: rxrpc_tx_data: c=04ae DATA ed1a3584:0002 000244c3 q=00024499 fl=0b *RETRANS*
 52873.293850: rxrpc_rx_ack: c=04ae 00012a4c DLY r=000244bc f=000244a0 p=00024499 n=25

And now we've received an ACK indicating that a DATA retransmission was
received.  7 packets have been processed (the occupied part of the window
moved, as indicated by f= and n=).

 52873.293853: rxrpc_rx_discard_ack: c=04ae r=00012a4c 000244a0<00024499 00024499<000244b8

However, the DLY ACK gets discarded because its previousPacket has gone
backwards (from p=000244b8, in the ACK at 52873.287478 to p=00024499 in the
ACK at 52873.293850).

We then end up in a continuous cycle of retransmit/discard.  kafs fails to
update its window because it's discarding the ACKs and can't transmit an
extra packet that would clear the issue because the window is full.
OpenAFS doesn't change the previousPacket value in the ACKs because no new
DATA packets are received with a different previousPacket number.

Fix this by altering the discard check to only discard an ACK based on
previousPacket if there was no advance in the firstPacket.  This allows us
to transmit a new packet which will cause previousPacket to advance in the
next ACK.

The check, however, needs to allow for the possibility that previousPacket
may actually have had the serial number placed in it instead - in which
case it will go outside the window and we should ignore it.
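
As a rough illustration of the first part of this change, the userspace
sketch below contrasts the old check with one that only consults
previousPacket when firstPacket has not moved.  The names used here
(struct ack_state, seq_before(), old_ack_is_valid(), new_ack_is_valid())
are hypothetical and are not taken from the patch, and the allowance for a
serial number appearing in previousPacket is deliberately left out, so this
is a reading of the description above rather than the kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical snapshot of the ACK state remembered from earlier ACKs. */
struct ack_state {
	uint32_t first_seq;	/* firstPacket of the last accepted ACK */
	uint32_t prev_seq;	/* previousPacket of the last accepted ACK */
};

/* Wraparound-safe "a is before b" for 32-bit sequence numbers. */
static bool seq_before(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) < 0;
}

/* Old behaviour: discard if either field appears to have gone backwards. */
static bool old_ack_is_valid(const struct ack_state *st,
			     uint32_t first_pkt, uint32_t prev_pkt)
{
	assert(st);
	return !seq_before(first_pkt, st->first_seq) &&
	       !seq_before(prev_pkt, st->prev_seq);
}

/*
 * Sketch of the described behaviour: previousPacket is only consulted
 * when firstPacket is unchanged.  (Serial-number handling is omitted.)
 */
static bool new_ack_is_valid(const struct ack_state *st,
			     uint32_t first_pkt, uint32_t prev_pkt)
{
	assert(st);
	if (seq_before(st->first_seq, first_pkt))
		return true;	/* the window advanced */
	if (seq_before(first_pkt, st->first_seq))
		return false;	/* firstPacket regressed */
	return !seq_before(prev_pkt, st->prev_seq);
}
```

Fed with the values from the trace (remembered state f=00024499,
p=000244b8; incoming DLY ACK f=000244a0, p=00024499), the old check
rejects the ACK while the sketched check accepts it because firstPacket
advanced.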

Fixes: 1a2391c30c0b ("rxrpc: Fix detection of out of order acks")
Reported-by: Dave Botsch 
Signed-off-by: David Howells 
---

 net/rxrpc/input.c |   30 ++
 1 file changed, 26 insertions(+), 4 deletions(-)

diff --git a/net/rxrpc/input.c b/net/rxrpc/input.c
index 2f22f082a66c..3be4177baf70 100644
--- a/net/rxrpc/input.c
+++ b/net/rxrpc/input.c
@@ -802,6 +802,30 @@ static void rxrpc_input_soft_acks(struct rxrpc_call *call, u8 *acks,
}
 }
 
+/*
+ * Return true if the ACK is valid - ie. it doesn't appear to have regressed
+ * with respect to the ack state conveyed by preceding ACKs.
+ */
+static bool rxrpc_is_ack_valid(struct rxrpc_call *call,
+  rxrpc_seq_t first_pkt, rxrpc_seq_t prev_pkt)
+{
+   rxrpc_seq_t base = READ_ONCE(call->ackr_first_seq);
+
+   if (aft

[PATCH net 0/3] rxrpc: Fix retransmission timeout and ACK discard [ver #2]

2020-05-20 Thread David Howells



Here are a couple of fixes and an extra tracepoint for AF_RXRPC:

 (1) Calculate the RTO pretty much as TCP does, rather than making
 something up, including an initial 4s timeout (which causes return
 probes from the fileserver to fail if a packet goes missing), and add
 backoff.

 (2) Fix the discarding of out-of-order received ACKs.  We mustn't let the
 hard-ACK point regress, nor do we want to do unnecessary
 retransmission because the soft-ACK list regresses.  This is not
 trivial, however: due to some loose wording in various old protocol
 specs, the ACK field that should be used for this sometimes has the
 wrong information in it.

 (3) Add a tracepoint to log a discarded ACK.

[V2] Fixed a "Fixes" line in a commit message.

The patches are tagged here:

git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git
    rxrpc-fixes-20200520

and can also be found on the following branch:


http://git.kernel.org/cgit/linux/kernel/git/dhowells/linux-fs.git/log/?h=rxrpc-fixes

David
---
David Howells (1):
  rxrpc: Fix ack discard


 fs/afs/fs_probe.c|  18 ++--
 fs/afs/vl_probe.c|  18 ++--
 include/net/af_rxrpc.h   |   2 +-
 include/trace/events/rxrpc.h |  52 +---
 net/rxrpc/Makefile   |   1 +
 net/rxrpc/ar-internal.h  |  25 --
 net/rxrpc/call_accept.c  |   2 +-
 net/rxrpc/call_event.c   |  22 ++---
 net/rxrpc/input.c|  44 --
 net/rxrpc/misc.c |   5 --
 net/rxrpc/output.c   |   9 +-
 net/rxrpc/peer_event.c   |  46 --
 net/rxrpc/peer_object.c  |  12 +--
 net/rxrpc/proc.c |   8 +-
 net/rxrpc/rtt.c  | 195 +++
 net/rxrpc/sendmsg.c  |  26 ++
 net/rxrpc/sysctl.c   |   9 --
 17 files changed, 335 insertions(+), 159 deletions(-)
 create mode 100644 net/rxrpc/rtt.c

[PATCH net 1/3] rxrpc: Fix the excessive initial retransmission timeout [ver #2]

2020-05-20 Thread David Howells

rxrpc currently uses a fixed 4s retransmission timeout until the RTT is
sufficiently sampled.  This can cause problems with some fileservers with
calls to the cache manager in the afs filesystem being dropped from the
fileserver because a packet goes missing and the retransmission timeout is
greater than the call expiry timeout.

Fix this by:

 (1) Copying the RTT/RTO calculation code from Linux's TCP implementation
 and altering it to fit rxrpc.

 (2) Altering the various users of the RTT to make use of the new SRTT
 value.

 (3) Replacing the use of rxrpc_resend_timeout with the calculated RTO
 value (which is needed in jiffies), along with a backoff.

Notes:

 (1) rxrpc provides RTT samples by matching the serial numbers on outgoing
 DATA packets that have the RXRPC_REQUEST_ACK set and PING ACK packets
 against the reference serial number in incoming REQUESTED ACK and
 PING-RESPONSE ACK packets.

 (2) Each packet that is transmitted on an rxrpc connection gets a new
 per-connection serial number, even for retransmissions, so an ACK can
 be cross-referenced to a specific trigger packet.  This allows RTT
 information to be drawn from retransmitted DATA packets also.

 (3) rxrpc maintains the RTT/RTO state on the rxrpc_peer record rather than
 on an rxrpc_call because many RPC calls won't live long enough to
 generate more than one sample.

 (4) The calculated SRTT value is in units of 8ths of a microsecond rather
 than nanoseconds.

The (S)RTT and RTO values are displayed in /proc/net/rxrpc/peers.
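
For readers unfamiliar with the TCP-style calculation mentioned above, here
is a minimal userspace sketch of the textbook RFC 6298 estimator using the
same kind of scaled fixed-point storage (SRTT kept in 8ths of a
microsecond).  It is not the code in net/rxrpc/rtt.c: the struct, the
function names, the 1s initial RTO and the 120s cap are assumptions made
for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RTO_INITIAL_US	1000000u	/* assumed: RFC 6298 initial RTO of 1s */
#define RTO_MAX_US	120000000u	/* assumed cap for this sketch */

/* Hypothetical per-peer RTT state. */
struct rtt_state {
	bool	 have_sample;
	uint32_t srtt_us8;	/* smoothed RTT, in units of 1/8 us */
	uint32_t rttvar_us4;	/* RTT variance, in units of 1/4 us */
};

/* Fold one RTT sample (in microseconds) into the estimator. */
static void rtt_add_sample(struct rtt_state *s, uint32_t sample_us)
{
	assert(s);
	if (!s->have_sample) {
		s->srtt_us8 = sample_us << 3;	/* SRTT = R */
		s->rttvar_us4 = sample_us << 1;	/* RTTVAR = R/2, stored x4 */
		s->have_sample = true;
		return;
	}

	int32_t err = (int32_t)sample_us - (int32_t)(s->srtt_us8 >> 3);
	uint32_t abs_err = (uint32_t)(err < 0 ? -err : err);

	/* RTTVAR = 3/4 * RTTVAR + 1/4 * |SRTT - R|, kept scaled by 4 */
	s->rttvar_us4 = s->rttvar_us4 - (s->rttvar_us4 >> 2) + abs_err;
	/* SRTT = 7/8 * SRTT + 1/8 * R, kept scaled by 8 */
	s->srtt_us8 = (uint32_t)((int32_t)s->srtt_us8 + err);
}

/* RTO = SRTT + 4 * RTTVAR, doubled once per backoff step and capped. */
static uint32_t rto_us(const struct rtt_state *s, unsigned int backoff)
{
	assert(s);
	if (!s->have_sample)
		return RTO_INITIAL_US;
	if (backoff > 16)
		backoff = 16;

	uint64_t rto = (uint64_t)(s->srtt_us8 >> 3) + s->rttvar_us4;

	rto <<= backoff;
	return rto > RTO_MAX_US ? RTO_MAX_US : (uint32_t)rto;
}
```

Because the variance is stored pre-multiplied by 4, the RTO is just the
unscaled SRTT plus the stored variance, which is why a scaled
representation is convenient here.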

Fixes: 17926a79320a ("[AF_RXRPC]: Provide secure RxRPC sockets for use by userspace and kernel both")
Signed-off-by: David Howells 
---

 fs/afs/fs_probe.c|   18 +---
 fs/afs/vl_probe.c|   18 +---
 include/net/af_rxrpc.h   |2 
 include/trace/events/rxrpc.h |   17 ++--
 net/rxrpc/Makefile   |1 
 net/rxrpc/ar-internal.h  |   25 -
 net/rxrpc/call_accept.c  |2 
 net/rxrpc/call_event.c   |   22 ++---
 net/rxrpc/input.c|6 +
 net/rxrpc/misc.c |5 -
 net/rxrpc/output.c   |9 +-
 net/rxrpc/peer_event.c   |   46 --
 net/rxrpc/peer_object.c  |   12 ++-
 net/rxrpc/proc.c |8 +-
 net/rxrpc/rtt.c  |  195 ++
 net/rxrpc/sendmsg.c  |   26 ++
 net/rxrpc/sysctl.c   |9 --
 17 files changed, 266 insertions(+), 155 deletions(-)
 create mode 100644 net/rxrpc/rtt.c

diff --git a/fs/afs/fs_probe.c b/fs/afs/fs_probe.c
index a587767b6ae1..237352d3cb53 100644
--- a/fs/afs/fs_probe.c
+++ b/fs/afs/fs_probe.c
@@ -32,9 +32,8 @@ void afs_fileserver_probe_result(struct afs_call *call)
struct afs_server *server = call->server;
unsigned int server_index = call->server_index;
unsigned int index = call->addr_ix;
-   unsigned int rtt = UINT_MAX;
+   unsigned int rtt_us;
bool have_result = false;
-   u64 _rtt;
int ret = call->error;
 
_enter("%pU,%u", &server->uuid, index);
@@ -93,15 +92,9 @@ void afs_fileserver_probe_result(struct afs_call *call)
}
}
 
-   /* Get the RTT and scale it to fit into a 32-bit value that represents
-* over a minute of time so that we can access it with one instruction
-* on a 32-bit system.
-*/
-   _rtt = rxrpc_kernel_get_rtt(call->net->socket, call->rxcall);
-   _rtt /= 64;
-   rtt = (_rtt > UINT_MAX) ? UINT_MAX : _rtt;
-   if (rtt < server->probe.rtt) {
-   server->probe.rtt = rtt;
+   rtt_us = rxrpc_kernel_get_srtt(call->net->socket, call->rxcall);
+   if (rtt_us < server->probe.rtt) {
+   server->probe.rtt = rtt_us;
alist->preferred = index;
have_result = true;
}
@@ -113,8 +106,7 @@ void afs_fileserver_probe_result(struct afs_call *call)
spin_unlock(&server->probe_lock);
 
_debug("probe [%u][%u] %pISpc rtt=%u ret=%d",
-  server_index, index, &alist->addrs[index].transport,
-  (unsigned int)rtt, ret);
+  server_index, index, &alist->addrs[index].transport, rtt_us, ret);
 
have_result |= afs_fs_probe_done(server);
if (have_result)
diff --git a/fs/afs/vl_probe.c b/fs/afs/vl_probe.c
index 858498cc1b05..e3aa013c2177 100644
--- a/fs/afs/vl_probe.c
+++ b/fs/afs/vl_probe.c
@@ -31,10 +31,9 @@ void afs_vlserver_probe_result(struct afs_call *call)
struct afs_addr_list *alist = call->alist;
struct afs_vlserver *server = call->vlserver;
unsigned int server_index = call->server_index;
+   unsigned int rtt_us = 0;
unsigned int index = call->addr_ix;
-   unsigned int rtt = UINT_MAX;
bool have_result = false;
-   u64 _rtt;
int ret = call->error;
 
_enter("%s,%u,%u,%d,%d", server->name, server_index, index, ret, call->abort_code);
@@ -93

Re: [RESEND PATCH v7 4/5] ndctl/papr_scm,uapi: Add support for PAPR nvdimm specific methods

2020-05-20 Thread Michael Ellerman

Ira Weiny  writes:
> On Wed, May 20, 2020 at 12:30:57AM +0530, Vaibhav Jain wrote:
>> Introduce support for Papr nvDimm Specific Methods (PDSM) in papr_scm
>> modules and add the command family to the white list of NVDIMM command
>> sets. Also advertise support for ND_CMD_CALL for the dimm
>> command mask and implement necessary scaffolding in the module to
>> handle ND_CMD_CALL ioctl and PDSM requests that we receive.
...
>> + *
>> + * Payload Version:
>> + *
>> + * A 'payload_version' field is present in PDSM header that indicates a specific
>> + * version of the structure present in PDSM Payload for a given PDSM command.
>> + * This provides backward compatibility in case the PDSM Payload structure
>> + * evolves and different structures are supported by 'papr_scm' and 'libndctl'.
>> + *
>> + * When sending a PDSM Payload to 'papr_scm', 'libndctl' should send the version
>> + * of the payload struct it supports via 'payload_version' field. The 'papr_scm'
>> + * module when servicing the PDSM envelope checks the 'payload_version' and then
>> + * uses 'payload struct version' == MIN('payload_version field',
>> + * 'max payload-struct-version supported by papr_scm') to service the PDSM.
>> + * After servicing the PDSM, 'papr_scm' put the negotiated version of payload
>> + * struct in returned 'payload_version' field.
>
> FWIW many people believe using a size rather than version is more sustainable.
> It is expected that new payload structures are larger (more features) than the
> previous payload structure.
>
> I can't find references at the moment though.

I think clone_args is a good modern example:

  
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/include/uapi/linux/sched.h#n88

cheers
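
To make the two approaches discussed in this thread concrete, here is a
small userspace sketch.  negotiate_version() shows the MIN() rule from the
quoted comment; copy_struct_sized() shows the size-based rule that
extensible structs such as clone_args rely on (the kernel helper for this
is copy_struct_from_user()): a shorter caller struct is zero-extended, and
a longer one is accepted only if the bytes the receiver does not know about
are all zero.  The function names and the two payload structs are
hypothetical and exist only for this example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical payload layouts: v2 appends a field to v1. */
struct payload_v1 { uint32_t health; };
struct payload_v2 { uint32_t health; uint32_t life_used; };

/* Version-based negotiation: use the lower of the two versions. */
static uint32_t negotiate_version(uint32_t requested, uint32_t max_supported)
{
	return requested < max_supported ? requested : max_supported;
}

/*
 * Size-based negotiation: copy a caller struct of 'usize' bytes into a
 * receiver struct of 'ksize' bytes.  Returns 0 on success or -E2BIG if
 * the caller set fields the receiver cannot represent.
 */
static int copy_struct_sized(void *dst, size_t ksize,
			     const void *src, size_t usize)
{
	const unsigned char *p = src;
	size_t n = usize < ksize ? usize : ksize;

	assert(dst && src);

	/* Unknown trailing bytes must be zero. */
	for (size_t i = ksize; i < usize; i++)
		if (p[i] != 0)
			return -E2BIG;

	memset(dst, 0, ksize);	/* fields the caller omitted default to 0 */
	memcpy(dst, src, n);
	return 0;
}
```

With the size-based scheme no explicit version number is exchanged; the
struct size passed alongside the pointer plays that role.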

[PATCHv2 0/2] optee: register drivers on optee bus

2020-05-20 Thread Maxim Uvarov

v2: - write TEE with capital letters.
- declare __optee_enumerate_device() as static.

Hello,

This patchset fixes issues with probing the tee and optee modules and
optee client drivers when they are compiled into the kernel, built as
modules, or any mixed combination.
These changes require optee-os changes which have already been merged.
Main corresponding commits are:
https://github.com/OP-TEE/optee_os/commit/9389d8030ef198c9d7b8ab7ea8e877e0ace3369d
https://github.com/OP-TEE/optee_os/commit/bc5921cdab538c8ae48422f5ffd600f1cbdd95b2

optee_enumerate_devices(), which discovers Trusted Applications on the TEE
bus, is split into two stages. The first stage probes drivers which do not
require userspace support from tee-supplicant. The second stage probes
drivers which need tee-supplicant and runs only after tee-supplicant starts.

Best regards,
Maxim.

Maxim Uvarov (2):
  optee: do drivers initialization before and after tee-supplicant run
  tpm_ftpm_tee: register driver on TEE bus

 drivers/char/tpm/tpm_ftpm_tee.c   | 69 ++-
 drivers/tee/optee/core.c  | 25 +--
 drivers/tee/optee/device.c| 17 +---
 drivers/tee/optee/optee_private.h |  8 +++-
 4 files changed, 99 insertions(+), 20 deletions(-)

-- 
2.17.1

[PATCHv2 1/2] optee: do drivers initialization before and after tee-supplicant run

2020-05-20 Thread Maxim Uvarov

Some drivers (like ftpm) can operate only after tee-supplicant
runs, because tee-supplicant provides things like storage
services.  This patch splits the probing: drivers that do not depend
on tee-supplicant are probed at an early stage, and the other
drivers are probed after tee-supplicant has started.

Signed-off-by: Maxim Uvarov 
Suggested-by: Sumit Garg 
Suggested-by: Arnd Bergmann 
---
 drivers/tee/optee/core.c  | 25 ++---
 drivers/tee/optee/device.c| 17 +++--
 drivers/tee/optee/optee_private.h |  8 +++-
 3 files changed, 40 insertions(+), 10 deletions(-)

diff --git a/drivers/tee/optee/core.c b/drivers/tee/optee/core.c
index 99698b8a3a74..dd2265c44907 100644
--- a/drivers/tee/optee/core.c
+++ b/drivers/tee/optee/core.c
@@ -17,6 +17,7 @@
 #include 
 #include 
 #include 
+#include 
 #include "optee_private.h"
 #include "optee_smc.h"
 #include "shm_pool.h"
@@ -218,6 +219,15 @@ static void optee_get_version(struct tee_device *teedev,
*vers = v;
 }
 
+static void optee_bus_scan(struct work_struct *work)
+{
+   int rc;
+
+   rc = optee_enumerate_devices(PTA_CMD_GET_DEVICES_SUPP);
+   if (rc)
+   pr_err("optee_enumerate_devices failed %d\n", rc);
+}
+
 static int optee_open(struct tee_context *ctx)
 {
struct optee_context_data *ctxdata;
@@ -241,8 +251,15 @@ static int optee_open(struct tee_context *ctx)
kfree(ctxdata);
return -EBUSY;
}
-   }
 
+   INIT_WORK(&optee->scan_bus_work, optee_bus_scan);
+   optee->scan_bus_wq = create_workqueue("optee_bus_scan");
+   if (!optee->scan_bus_wq) {
+   pr_err("optee: couldn't create workqueue\n");
+   return -ECHILD;
+   }
+   queue_work(optee->scan_bus_wq, &optee->scan_bus_work);
+   }
mutex_init(&ctxdata->mutex);
INIT_LIST_HEAD(&ctxdata->sess_list);
 
@@ -296,8 +313,10 @@ static void optee_release(struct tee_context *ctx)
 
ctx->data = NULL;
 
-   if (teedev == optee->supp_teedev)
+   if (teedev == optee->supp_teedev) {
+   destroy_workqueue(optee->scan_bus_wq);
optee_supp_release(&optee->supp);
+   }
 }
 
 static const struct tee_driver_ops optee_ops = {
@@ -675,7 +694,7 @@ static int optee_probe(struct platform_device *pdev)
 
platform_set_drvdata(pdev, optee);
 
-   rc = optee_enumerate_devices();
+   rc = optee_enumerate_devices(PTA_CMD_GET_DEVICES);
if (rc) {
optee_remove(pdev);
return rc;
diff --git a/drivers/tee/optee/device.c b/drivers/tee/optee/device.c
index e3a148521ec1..d4931dad07aa 100644
--- a/drivers/tee/optee/device.c
+++ b/drivers/tee/optee/device.c
@@ -21,7 +21,6 @@
  * TEE_ERROR_BAD_PARAMETERS - Incorrect input param
  * TEE_ERROR_SHORT_BUFFER - Output buffer size less than required
  */
-#define PTA_CMD_GET_DEVICES0x0
 
 static int optee_ctx_match(struct tee_ioctl_version_data *ver, const void *data)
 {
@@ -32,7 +31,8 @@ static int optee_ctx_match(struct tee_ioctl_version_data *ver, const void *data)
 }
 
 static int get_devices(struct tee_context *ctx, u32 session,
-  struct tee_shm *device_shm, u32 *shm_size)
+  struct tee_shm *device_shm, u32 *shm_size,
+  u32 func)
 {
int ret = 0;
struct tee_ioctl_invoke_arg inv_arg;
@@ -42,7 +42,7 @@ static int get_devices(struct tee_context *ctx, u32 session,
memset(&param, 0, sizeof(param));
 
/* Invoke PTA_CMD_GET_DEVICES function */
-   inv_arg.func = PTA_CMD_GET_DEVICES;
+   inv_arg.func = func;
inv_arg.session = session;
inv_arg.num_params = 4;
 
@@ -87,7 +87,7 @@ static int optee_register_device(const uuid_t *device_uuid, u32 device_id)
return rc;
 }
 
-int optee_enumerate_devices(void)
+static int __optee_enumerate_devices(u32 func)
 {
const uuid_t pta_uuid =
UUID_INIT(0x7011a688, 0xddde, 0x4053,
@@ -118,7 +118,7 @@ int optee_enumerate_devices(void)
goto out_ctx;
}
 
-   rc = get_devices(ctx, sess_arg.session, NULL, &shm_size);
+   rc = get_devices(ctx, sess_arg.session, NULL, &shm_size, func);
if (rc < 0 || !shm_size)
goto out_sess;
 
@@ -130,7 +130,7 @@ int optee_enumerate_devices(void)
goto out_sess;
}
 
-   rc = get_devices(ctx, sess_arg.session, device_shm, &shm_size);
+   rc = get_devices(ctx, sess_arg.session, device_shm, &shm_size, func);
if (rc < 0)
goto out_shm;
 
@@ -158,3 +158,8 @@ int optee_enumerate_devices(void)
 
return rc;
 }
+
+int optee_enumerate_devices(u32 func)
+{
+   return  __optee_enumerate_devices(func);
+}
diff --git a/drivers/tee/optee/optee_private.h b/drivers/tee/optee/optee_private.h
index d9c5037b4e03..6cdac4bb7253 100644
--- a/drivers/tee/optee/optee_pri

[PATCH 2/2] tpm_ftpm_tee: register driver on tee bus

2020-05-20 Thread Maxim Uvarov

Register the driver on the tee bus. The tee module registers the bus,
and the optee module calls optee_enumerate_devices() to scan
all devices on the bus. This TA can be an Early TA (compiled into
optee-os), in which case it will be on the optee bus before Linux
boots. Also, the tee-supplicant application needs to be loaded
between the optee module and the ftpm module to maintain
functionality for the ftpm driver.

Signed-off-by: Maxim Uvarov 
Suggested-by: Sumit Garg 
Suggested-by: Arnd Bergmann 
---
 drivers/char/tpm/tpm_ftpm_tee.c | 69 -
 1 file changed, 59 insertions(+), 10 deletions(-)

diff --git a/drivers/char/tpm/tpm_ftpm_tee.c b/drivers/char/tpm/tpm_ftpm_tee.c
index 22bf553ccf9d..7bb4ce281050 100644
--- a/drivers/char/tpm/tpm_ftpm_tee.c
+++ b/drivers/char/tpm/tpm_ftpm_tee.c
@@ -214,11 +214,10 @@ static int ftpm_tee_match(struct tee_ioctl_version_data *ver, const void *data)
  * Return:
  * On success, 0. On failure, -errno.
  */
-static int ftpm_tee_probe(struct platform_device *pdev)
+static int ftpm_tee_probe(struct device *dev)
 {
int rc;
struct tpm_chip *chip;
-   struct device *dev = &pdev->dev;
struct ftpm_tee_private *pvt_data = NULL;
struct tee_ioctl_open_session_arg sess_arg;
 
@@ -297,6 +296,13 @@ static int ftpm_tee_probe(struct platform_device *pdev)
return rc;
 }
 
+static int ftpm_plat_tee_probe(struct platform_device *pdev)
+{
+   struct device *dev = &pdev->dev;
+
+   return ftpm_tee_probe(dev);
+}
+
 /**
  * ftpm_tee_remove() - remove the TPM device
  * @pdev: the platform_device description.
@@ -304,9 +310,9 @@ static int ftpm_tee_probe(struct platform_device *pdev)
  * Return:
  * 0 always.
  */
-static int ftpm_tee_remove(struct platform_device *pdev)
+static int ftpm_tee_remove(struct device *dev)
 {
-   struct ftpm_tee_private *pvt_data = dev_get_drvdata(&pdev->dev);
+   struct ftpm_tee_private *pvt_data = dev_get_drvdata(dev);
 
/* Release the chip */
tpm_chip_unregister(pvt_data->chip);
@@ -328,11 +334,18 @@ static int ftpm_tee_remove(struct platform_device *pdev)
return 0;
 }
 
+static int ftpm_plat_tee_remove(struct platform_device *pdev)
+{
+   struct device *dev = &pdev->dev;
+
+   return ftpm_tee_remove(dev);
+}
+
 /**
  * ftpm_tee_shutdown() - shutdown the TPM device
  * @pdev: the platform_device description.
  */
-static void ftpm_tee_shutdown(struct platform_device *pdev)
+static void ftpm_plat_tee_shutdown(struct platform_device *pdev)
 {
struct ftpm_tee_private *pvt_data = dev_get_drvdata(&pdev->dev);
 
@@ -347,17 +360,53 @@ static const struct of_device_id of_ftpm_tee_ids[] = {
 };
 MODULE_DEVICE_TABLE(of, of_ftpm_tee_ids);
 
-static struct platform_driver ftpm_tee_driver = {
+static struct platform_driver ftpm_tee_plat_driver = {
.driver = {
.name = "ftpm-tee",
.of_match_table = of_match_ptr(of_ftpm_tee_ids),
},
-   .probe = ftpm_tee_probe,
-   .remove = ftpm_tee_remove,
-   .shutdown = ftpm_tee_shutdown,
+   .shutdown = ftpm_plat_tee_shutdown,
+   .probe = ftpm_plat_tee_probe,
+   .remove = ftpm_plat_tee_remove,
+};
+
+static const struct tee_client_device_id optee_ftpm_id_table[] = {
+   {UUID_INIT(0xbc50d971, 0xd4c9, 0x42c4,
+  0x82, 0xcb, 0x34, 0x3f, 0xb7, 0xf3, 0x78, 0x96)},
+   {}
 };
 
-module_platform_driver(ftpm_tee_driver);
+MODULE_DEVICE_TABLE(tee, optee_ftpm_id_table);
+
+static struct tee_client_driver ftpm_tee_driver = {
+   .id_table   = optee_ftpm_id_table,
+   .driver = {
+   .name   = "optee-ftpm",
+   .bus= &tee_bus_type,
+   .probe  = ftpm_tee_probe,
+   .remove = ftpm_tee_remove,
+   },
+};
+
+static int __init ftpm_mod_init(void)
+{
+   int rc;
+
+   rc = platform_driver_register(&ftpm_tee_plat_driver);
+   if (rc)
+   return rc;
+
+   return driver_register(&ftpm_tee_driver.driver);
+}
+
+static void __exit ftpm_mod_exit(void)
+{
+   platform_driver_unregister(&ftpm_tee_plat_driver);
+   driver_unregister(&ftpm_tee_driver.driver);
+}
+
+module_init(ftpm_mod_init);
+module_exit(ftpm_mod_exit);
 
 MODULE_AUTHOR("Thirupathaiah Annapureddy ");
 MODULE_DESCRIPTION("TPM Driver for fTPM TA in TEE");
-- 
2.17.1

[PATCHv2 2/2] tpm_ftpm_tee: register driver on TEE bus

2020-05-20 Thread Maxim Uvarov

Register the driver on the TEE bus. The tee module registers the bus,
and the optee module calls optee_enumerate_devices() to scan
all devices on the bus. The Trusted Application for this driver
can be an Early TA (compiled into optee-os), in which
case it will be on the OPTEE bus before Linux boots. Also, the
tee-supplicant application needs to be loaded between the
OPTEE module and the ftpm module to maintain functionality
for the fTPM driver.

Signed-off-by: Maxim Uvarov 
Suggested-by: Sumit Garg 
Suggested-by: Arnd Bergmann 
---
 drivers/char/tpm/tpm_ftpm_tee.c | 69 -
 1 file changed, 59 insertions(+), 10 deletions(-)

diff --git a/drivers/char/tpm/tpm_ftpm_tee.c b/drivers/char/tpm/tpm_ftpm_tee.c
index 22bf553ccf9d..7bb4ce281050 100644
--- a/drivers/char/tpm/tpm_ftpm_tee.c
+++ b/drivers/char/tpm/tpm_ftpm_tee.c
@@ -214,11 +214,10 @@ static int ftpm_tee_match(struct tee_ioctl_version_data *ver, const void *data)
  * Return:
  * On success, 0. On failure, -errno.
  */
-static int ftpm_tee_probe(struct platform_device *pdev)
+static int ftpm_tee_probe(struct device *dev)
 {
int rc;
struct tpm_chip *chip;
-   struct device *dev = &pdev->dev;
struct ftpm_tee_private *pvt_data = NULL;
struct tee_ioctl_open_session_arg sess_arg;
 
@@ -297,6 +296,13 @@ static int ftpm_tee_probe(struct platform_device *pdev)
return rc;
 }
 
+static int ftpm_plat_tee_probe(struct platform_device *pdev)
+{
+   struct device *dev = &pdev->dev;
+
+   return ftpm_tee_probe(dev);
+}
+
 /**
  * ftpm_tee_remove() - remove the TPM device
  * @pdev: the platform_device description.
@@ -304,9 +310,9 @@ static int ftpm_tee_probe(struct platform_device *pdev)
  * Return:
  * 0 always.
  */
-static int ftpm_tee_remove(struct platform_device *pdev)
+static int ftpm_tee_remove(struct device *dev)
 {
-   struct ftpm_tee_private *pvt_data = dev_get_drvdata(&pdev->dev);
+   struct ftpm_tee_private *pvt_data = dev_get_drvdata(dev);
 
/* Release the chip */
tpm_chip_unregister(pvt_data->chip);
@@ -328,11 +334,18 @@ static int ftpm_tee_remove(struct platform_device *pdev)
return 0;
 }
 
+static int ftpm_plat_tee_remove(struct platform_device *pdev)
+{
+   struct device *dev = &pdev->dev;
+
+   return ftpm_tee_remove(dev);
+}
+
 /**
  * ftpm_tee_shutdown() - shutdown the TPM device
  * @pdev: the platform_device description.
  */
-static void ftpm_tee_shutdown(struct platform_device *pdev)
+static void ftpm_plat_tee_shutdown(struct platform_device *pdev)
 {
struct ftpm_tee_private *pvt_data = dev_get_drvdata(&pdev->dev);
 
@@ -347,17 +360,53 @@ static const struct of_device_id of_ftpm_tee_ids[] = {
 };
 MODULE_DEVICE_TABLE(of, of_ftpm_tee_ids);
 
-static struct platform_driver ftpm_tee_driver = {
+static struct platform_driver ftpm_tee_plat_driver = {
.driver = {
.name = "ftpm-tee",
.of_match_table = of_match_ptr(of_ftpm_tee_ids),
},
-   .probe = ftpm_tee_probe,
-   .remove = ftpm_tee_remove,
-   .shutdown = ftpm_tee_shutdown,
+   .shutdown = ftpm_plat_tee_shutdown,
+   .probe = ftpm_plat_tee_probe,
+   .remove = ftpm_plat_tee_remove,
+};
+
+static const struct tee_client_device_id optee_ftpm_id_table[] = {
+   {UUID_INIT(0xbc50d971, 0xd4c9, 0x42c4,
+  0x82, 0xcb, 0x34, 0x3f, 0xb7, 0xf3, 0x78, 0x96)},
+   {}
 };
 
-module_platform_driver(ftpm_tee_driver);
+MODULE_DEVICE_TABLE(tee, optee_ftpm_id_table);
+
+static struct tee_client_driver ftpm_tee_driver = {
+   .id_table   = optee_ftpm_id_table,
+   .driver = {
+   .name   = "optee-ftpm",
+   .bus= &tee_bus_type,
+   .probe  = ftpm_tee_probe,
+   .remove = ftpm_tee_remove,
+   },
+};
+
+static int __init ftpm_mod_init(void)
+{
+   int rc;
+
+   rc = platform_driver_register(&ftpm_tee_plat_driver);
+   if (rc)
+   return rc;
+
+   return driver_register(&ftpm_tee_driver.driver);
+}
+
+static void __exit ftpm_mod_exit(void)
+{
+   platform_driver_unregister(&ftpm_tee_plat_driver);
+   driver_unregister(&ftpm_tee_driver.driver);
+}
+
+module_init(ftpm_mod_init);
+module_exit(ftpm_mod_exit);
 
 MODULE_AUTHOR("Thirupathaiah Annapureddy ");
 MODULE_DESCRIPTION("TPM Driver for fTPM TA in TEE");
-- 
2.17.1

Re: [PATCH 2/2] kvm/x86: don't expose MSR_IA32_UMWAIT_CONTROL unconditionally

2020-05-20 Thread Tao Xu





On 5/21/2020 2:37 PM, Xiaoyao Li wrote:

On 5/21/2020 1:28 PM, Tao Xu wrote:



On 5/21/2020 12:33 PM, Xiaoyao Li wrote:

On 5/21/2020 5:05 AM, Paolo Bonzini wrote:

On 20/05/20 18:07, Maxim Levitsky wrote:

This msr is only available when the host supports WAITPKG feature.

This breaks a nested guest, if the L1 hypervisor is set to ignore
unknown msrs, because the only other safety check that the
kernel does is that it attempts to read the msr and
rejects it if it gets an exception.

Fixes: 6e3ba4abce KVM: vmx: Emulate MSR IA32_UMWAIT_CONTROL

Signed-off-by: Maxim Levitsky 
---
  arch/x86/kvm/x86.c | 4 
  1 file changed, 4 insertions(+)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index fe3a24fd6b263..9c507b32b1b77 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5314,6 +5314,10 @@ static void kvm_init_msr_list(void)
  if (msrs_to_save_all[i] - MSR_ARCH_PERFMON_EVENTSEL0 >=
  min(INTEL_PMC_MAX_GENERIC, x86_pmu.num_counters_gp))
  continue;
+    break;
+    case MSR_IA32_UMWAIT_CONTROL:
+    if (!kvm_cpu_cap_has(X86_FEATURE_WAITPKG))
+    continue;
  default:
  break;
  }


The patch is correct, and matches what is done for the other entries of
msrs_to_save_all.  However, while looking at it I noticed that
X86_FEATURE_WAITPKG is actually never added, and that is because it was
also not added to the supported CPUID in commit e69e72faa3a0 ("KVM: x86:
Add support for user wait instructions", 2019-09-24), which was before
the kvm_cpu_cap mechanism was added.

So while at it you should also fix that.  The right way to do that is to
add a

     if (vmx_waitpkg_supported())
         kvm_cpu_cap_check_and_set(X86_FEATURE_WAITPKG);


+ Tao

I remember there is certainly some reason why we don't expose WAITPKG
to guest by default.


Tao, please help clarify it.

Thanks,
-Xiaoyao



Because in a VM, umwait and tpause can put a (physical) CPU into a power
saving state. So from the host's view, this CPU will be 100% in use by the
VM. Although umwait and tpause just cause a short wait (maybe 100
microseconds), we still want to unconditionally expose WAITPKG in VM.


I guess you typed "unconditionally" by mistake that you meant to say 
"conditionally" in fact?


I am sorry, I mean:
By default, we don't expose WAITPKG to guest. For QEMU, we can use 
"-overcommit cpu-pm=on" to use WAITPKG.

Re: [PATCH v2] xfrm: policy: Fix xfrm policy match

2020-05-20 Thread Xin Long

On Tue, May 19, 2020 at 4:53 PM Steffen Klassert
 wrote:
>
> On Fri, May 15, 2020 at 04:39:57PM +0800, Yuehaibing wrote:
> >
> > Friendly ping...
> >
> > Any plan for this issue?
>
> There was still no consensus between you and Xin on how
> to fix this issue. Once this happens, I consider applying
> a fix.
>
Sorry, Yuehaibing, I can't really accept to do: (A->mark.m & A->mark.v)
I'm thinking to change to:

 static bool xfrm_policy_mark_match(struct xfrm_policy *policy,
   struct xfrm_policy *pol)
 {
-   u32 mark = policy->mark.v & policy->mark.m;
-
-   if (policy->mark.v == pol->mark.v && policy->mark.m == pol->mark.m)
-   return true;
-
-   if ((mark & pol->mark.m) == pol->mark.v &&
-   policy->priority == pol->priority)
+   if (policy->mark.v == pol->mark.v &&
+   (policy->mark.m == pol->mark.m ||
+policy->priority == pol->priority))
return true;

return false;

which means we consider (the same value and mask) or
(the same value and priority) as the same one. This will
cover both problems.
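
For illustration only, the comparison proposed above can be sketched as a
stand-alone C function. The struct and function names below
(mark_sketch, policy_sketch, policy_mark_match_sketch) are hypothetical,
cut-down stand-ins and not the kernel definitions; only the condition
itself is taken from the diff:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, cut-down stand-ins for the kernel structures. */
struct mark_sketch {
	uint32_t v;	/* mark value */
	uint32_t m;	/* mark mask */
};

struct policy_sketch {
	struct mark_sketch mark;
	uint32_t priority;
};

/*
 * Two policies count as "the same" when the mark values are equal and
 * either the masks or the priorities are equal as well.
 */
static bool policy_mark_match_sketch(const struct policy_sketch *policy,
				     const struct policy_sketch *pol)
{
	if (policy->mark.v == pol->mark.v &&
	    (policy->mark.m == pol->mark.m ||
	     policy->priority == pol->priority))
		return true;

	return false;
}
```

With this condition, two policies with different mark values never match,
whatever their masks and priorities are.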

[PATCH] powercap: remove unused local MSR define

2020-05-20 Thread Sumeet Pawnikar

Remove the unused local definition of the PLATFORM_POWER_LIMIT MSR from
intel_rapl_common.c. It was left behind when the old RAPL code in
intel_rapl.c was split into two new files, intel_rapl_msr.c and
intel_rapl_common.c, by commit 3382388d7148
("intel_rapl: abstract RAPL common code"). Currently, this #define
is used only in intel_rapl_msr.c, which has its own local
definition.

Signed-off-by: Sumeet Pawnikar 
Reviewed-by: Andy Shevchenko 
---
 drivers/powercap/intel_rapl_common.c |3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/powercap/intel_rapl_common.c 
b/drivers/powercap/intel_rapl_common.c
index eb328655bc01..5527a7c76309 100644
--- a/drivers/powercap/intel_rapl_common.c
+++ b/drivers/powercap/intel_rapl_common.c
@@ -26,9 +26,6 @@
 #include 
 #include 
 
-/* Local defines */
-#define MSR_PLATFORM_POWER_LIMIT   0x065C
-
 /* bitmasks for RAPL MSRs, used by primitive access functions */
 #define ENERGY_STATUS_MASK  0x
 
-- 
1.7.9.5

Re: [PATCH 2/2] kvm/x86: don't expose MSR_IA32_UMWAIT_CONTROL unconditionally

2020-05-20 Thread Xiaoyao Li


On 5/21/2020 1:28 PM, Tao Xu wrote:



On 5/21/2020 12:33 PM, Xiaoyao Li wrote:

On 5/21/2020 5:05 AM, Paolo Bonzini wrote:

On 20/05/20 18:07, Maxim Levitsky wrote:

This msr is only available when the host supports WAITPKG feature.

This breaks a nested guest, if the L1 hypervisor is set to ignore
unknown msrs, because the only other safety check that the
kernel does is that it attempts to read the msr and
rejects it if it gets an exception.

Fixes: 6e3ba4abce KVM: vmx: Emulate MSR IA32_UMWAIT_CONTROL

Signed-off-by: Maxim Levitsky 
---
  arch/x86/kvm/x86.c | 4 
  1 file changed, 4 insertions(+)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index fe3a24fd6b263..9c507b32b1b77 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5314,6 +5314,10 @@ static void kvm_init_msr_list(void)
  if (msrs_to_save_all[i] - MSR_ARCH_PERFMON_EVENTSEL0 >=
  min(INTEL_PMC_MAX_GENERIC, x86_pmu.num_counters_gp))
  continue;
+    break;
+    case MSR_IA32_UMWAIT_CONTROL:
+    if (!kvm_cpu_cap_has(X86_FEATURE_WAITPKG))
+    continue;
  default:
  break;
  }


The patch is correct, and matches what is done for the other entries of
msrs_to_save_all.  However, while looking at it I noticed that
X86_FEATURE_WAITPKG is actually never added, and that is because it was
also not added to the supported CPUID in commit e69e72faa3a0 ("KVM: x86:
Add support for user wait instructions", 2019-09-24), which was before
the kvm_cpu_cap mechanism was added.

So while at it you should also fix that.  The right way to do that is to
add a

 if (vmx_waitpkg_supported())
 kvm_cpu_cap_check_and_set(X86_FEATURE_WAITPKG);


+ Tao

I remember there is certainly some reason why we don't expose WAITPKG 
to guest by default.


Tao, please help clarify it.

Thanks,
-Xiaoyao



Because in VM, umwait and tpause can put a (physical) CPU into a power 
saving state. So from host view, this cpu will be 100% usage by VM. 
Although umwait and tpause just cause short wait (maybe 100 
microseconds), we still want to unconditionally expose WAITPKG in VM.


I guess you typed "unconditionally" by mistake that you meant to say 
"conditionally" in fact?

Re: [PATCH] vt: keyboard: avoid integer overflow in k_ascii

2020-05-20 Thread Dmitry Torokhov

Hi,

On Thu, May 21, 2020 at 01:34:08AM +, Kyungtae Kim wrote:
> FuzzUSB (a variant of syzkaller) found an integer overflow 
> while processing keycode value.
> 
> Reference: https://lkml.org/lkml/2020/3/22/482
> 
> This bug occurs because there is no validity check when operating on keycode values.
> By executing k_ascii() multiple times, npadch can have a large value 
> close to the max of int type e.g., 11. 
> In the following, its multiplication causes an integer overflow.
> 
> This fix prevents the overflow by checking npadch using check_mul_overflow() 
> ahead of its operation.
> 
> 
> UBSAN: Undefined behaviour in drivers/tty/vt/keyboard.c:888:19
> signed integer overflow:
> 10 * 11 cannot be represented in type 'int'
> CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.6.11 #1
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> Call Trace:
>  
>  __dump_stack lib/dump_stack.c:77 [inline]
>  dump_stack+0xce/0x128 lib/dump_stack.c:118
>  ubsan_epilogue+0xe/0x30 lib/ubsan.c:154
>  handle_overflow+0xdc/0xf0 lib/ubsan.c:184
>  __ubsan_handle_mul_overflow+0x2a/0x40 lib/ubsan.c:205
>  k_ascii+0xbf/0xd0 drivers/tty/vt/keyboard.c:888
>  kbd_keycode drivers/tty/vt/keyboard.c:1477 [inline]
>  kbd_event+0x888/0x3be0 drivers/tty/vt/keyboard.c:1495
>  input_to_handler+0x3a9/0x4b0 drivers/input/input.c:118
>  input_pass_values.part.8+0x25e/0x690 drivers/input/input.c:145
>  input_pass_values drivers/input/input.c:193 [inline]
>  input_repeat_key+0x1f8/0x2c0 drivers/input/input.c:194
>  call_timer_fn+0x20e/0x770 kernel/time/timer.c:1404
>  expire_timers kernel/time/timer.c:1449 [inline]
>  __run_timers kernel/time/timer.c:1773 [inline]
>  run_timer_softirq+0x63f/0x13c0 kernel/time/timer.c:1786
>  __do_softirq+0x262/0xb46 kernel/softirq.c:292
>  invoke_softirq kernel/softirq.c:373 [inline]
>  irq_exit+0x161/0x1b0 kernel/softirq.c:413
>  exiting_irq arch/x86/include/asm/apic.h:546 [inline]
>  smp_apic_timer_interrupt+0x137/0x500 arch/x86/kernel/apic/apic.c:1146
>  apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:829
>  
> RIP: 0010:default_idle+0x2d/0x2e0 arch/x86/kernel/process.c:696
> Code: e5 41 57 41 56 65 44 8b 35 30 9d 5d 7a 41 55 41 54 53 0f 1f 44 00 00 e8 
> 11 42 a4 fb e9 07 00 00 00 0f 00 2d d5 29 5e 00 fb f4 <65> 44 8b 35 0b 9d 5d 
> 7a 0f 1f 44 00 00 5b 41 5c 41 5d 41 5e 41 5f
> RSP: 0018:87007ce8 EFLAGS: 0292 ORIG_RAX: ff13
> RAX: 0007 RBX: 87032900 RCX: 
> RDX:  RSI: 0006 RDI: 87033154
> RBP: 87007d10 R08: fbfff0e06521 R09: 
> R10:  R11:  R12: 
> R13: 88c99c00 R14:  R15: 
>  arch_cpu_idle+0xa/0x10 arch/x86/kernel/process.c:686
>  default_idle_call+0x50/0x70 kernel/sched/idle.c:94
>  cpuidle_idle_call kernel/sched/idle.c:154 [inline]
>  do_idle+0x332/0x530 kernel/sched/idle.c:269
>  cpu_startup_entry+0x18/0x20 kernel/sched/idle.c:361
>  rest_init+0x240/0x3d0 init/main.c:660
>  arch_call_rest_init+0xe/0x1b
>  start_kernel+0x7f6/0x81e init/main.c:997
>  x86_64_start_reservations+0x2a/0x2c arch/x86/kernel/head64.c:490
>  x86_64_start_kernel+0x77/0x7a arch/x86/kernel/head64.c:471
>  secondary_startup_64+0xa4/0xb0 arch/x86/kernel/head_64.S:242
> 
> 
> Signed-off-by: Kyungtae Kim 
> Reported-and-tested-by: Kyungtae Kim 
> 
> ---
>  drivers/tty/vt/keyboard.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/tty/vt/keyboard.c b/drivers/tty/vt/keyboard.c
> index 15d33fa0c925..f7e1bb21bd9c 100644
> --- a/drivers/tty/vt/keyboard.c
> +++ b/drivers/tty/vt/keyboard.c
> @@ -869,6 +869,7 @@ static void k_meta(struct vc_data *vc, unsigned char 
> value, char up_flag)
>  static void k_ascii(struct vc_data *vc, unsigned char value, char up_flag)
>  {
>   int base;
> + int bytes;
>  
>   if (up_flag)
>   return;
> @@ -884,6 +885,8 @@ static void k_ascii(struct vc_data *vc, unsigned char 
> value, char up_flag)
>  
>   if (npadch == -1)
>   npadch = value;
> + else if (check_mul_overflow(npadch, base, &bytes) || 
> check_add_overflow(bytes, value, &bytes))
> + return;

Why do we discard the result of calculation and repeat it again below?
Can we say

else if (check_mul_overflow(npadch, base, &new_npadch) ||
check_add_overflow(new_npadch, value, &new_npadch))
npadch = new_npadch;
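
A minimal user-space sketch of the "compute once, then store" idea, under
stated assumptions: accumulate_digit_sketch is a hypothetical helper, and
the GCC/Clang builtins __builtin_mul_overflow() and
__builtin_add_overflow() stand in for the kernel's check_mul_overflow()
and check_add_overflow(). Like the kernel helpers, the builtins return
true when the operation overflowed, so the store happens only when both
report false:

```c
#include <stdbool.h>

/*
 * Hypothetical helper: fold one more digit into an accumulator.
 * Returns true and updates *acc on success; returns false and leaves
 * *acc untouched if acc * base + digit does not fit in an int.
 */
static bool accumulate_digit_sketch(int *acc, int base, int digit)
{
	int tmp;

	/* Both builtins return true on overflow. */
	if (__builtin_mul_overflow(*acc, base, &tmp) ||
	    __builtin_add_overflow(tmp, digit, &tmp))
		return false;

	*acc = tmp;	/* reuse the result that was already computed */
	return true;
}
```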

Thanks.

-- 
Dmitry

[PATCH] scripts: Remove unneeded assignment parentheses

2020-05-20 Thread Xu Wang

Remove unneeded assignment parentheses.

Signed-off-by: Xu Wang 
---
 scripts/extract-cert.c | 2 +-
 scripts/sign-file.c| 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/scripts/extract-cert.c b/scripts/extract-cert.c
index b071bf476fea..8005911926b8 100644
--- a/scripts/extract-cert.c
+++ b/scripts/extract-cert.c
@@ -61,7 +61,7 @@ static void drain_openssl_errors(void)
 
 #define ERR(cond, fmt, ...)\
do {\
-   bool __cond = (cond);   \
+   bool __cond = cond; \
display_openssl_errors(__LINE__);   \
if (__cond) {   \
err(1, fmt, ## __VA_ARGS__);\
diff --git a/scripts/sign-file.c b/scripts/sign-file.c
index fbd34b8e8f57..9ea08b07a0aa 100644
--- a/scripts/sign-file.c
+++ b/scripts/sign-file.c
@@ -104,7 +104,7 @@ static void drain_openssl_errors(void)
 
 #define ERR(cond, fmt, ...)\
do {\
-   bool __cond = (cond);   \
+   bool __cond = cond; \
display_openssl_errors(__LINE__);   \
if (__cond) {   \
err(1, fmt, ## __VA_ARGS__);\
-- 
2.17.1

Re: [PATCH v4 1/2] scripts: Support compiled source, improved precise

2020-05-20 Thread Masahiro Yamada

On Fri, May 15, 2020 at 2:10 AM xujialu  wrote:
>
> Sorry for replying so late.
>
> 
>
> I usually don't run scripts/tags.sh directly. But one day i checked git
> log of scripts/tags.sh, and found this commit c69ef1c87b8c said we may
> run it directly. Then i must took care of that.
>
> Here are some cases that i should write clearly before:
> (I omit COMPILED_SOURCE=1 here just for clear and distinct)
>
> 1) make; make gtags;
> 2) make; ./scripts/tags.sh gtags;
> 3) make O=123; make O=123 gtags;
> 4) make O=123; make gtags;
> 5) make O=123; ./scripts/tags.sh gtags;
> 6) make O=/path/out/of/kernel/; make O=/path/out/of/kernel/ gtags;
> 7) make O=/path/out/of/kernel/; SOMETHING ./scripts/tags.sh gtags;
>
> Assume that we just change directory into kernel root directory and vim
> a source file:
>
> case 1): We have GTAGS generated in current directory, no problem;
> In this case: tree=
> case 2): Same as case 1), no problem;
> In this case: tree=

... if you set SRCARCH.

If SRCARCH is unset, it will get a warning.

$ ./scripts/tags.sh  gtags
find: ‘arch/*.[chS]/’: No such file or directory




> case 3): GTAGS is generated in directory 123; Here comes the problem,
>  gtags will give error "Segmentation fault" (eg. global-5.7.1)
>  or give warning "is out of source tree." (eg. global-6.6.3-2)
>  because 'make O=123' changed to directory 123 and our cute
>  source files are in ../ and gtags seems not to like paths that
>  begin with '../', as they are not in a subdirectory of 123;
>  If above situation is not persuasive, then consider one may
>  want generate gtags.files contains files without '../' in
>  kernel root directory, so that gtags could be useful;
>  And this is why case 4) exist, if case 4) is really bad idea
>  then we must have another way to do this - case 5);
> In this case: tree=../


This is a problem of GNU Global, not of our build system.


"Warning: ... is out of source tree." is listed in
the known bugs:

https://www.gnu.org/software/global/bugs.html


Do not mess up our script.


> case 4): This is not good when we 'make O=123 distclean';
> In this case: tree=../



Of course, "make O=123 distclean" cannot clean up
build artifacts created by "make gtags".

So, what problem are you addressing?



> case 5): Find file '.config' in directory 123, then collect files with
>  path just in current directory; No problem;
> In this case: tree=../

You are misunderstanding.

See the comment at line 8.

# Uses the following environment variables:
# SUBARCH, SRCARCH, srctree


If you want to run this script directly,
you must set all the mentioned environment variables correctly,
and also run this script in the correct working directory.

It will create tag files in the current working directory.
^^

Do not change the working directory internally.





> case 6): What if KBUILD_OUTPUT is out of kernel directory? Assume that
>  we get a gtags.files in that directory, in the gtags.files, the
>  file path all begin with full path, guess what, gtags will give
>  the error or warning described above. Why don't we just
>  generate GTAGS with relative path in kernel root directory?
> In this case: tree=/path/out/of/kernel/

This is the intended behavior for the other TAGS, tags, cscope.

Again, this is a problem of GNU global.


The general rule is like this:

Tag files should be always output to the separate object tree
if O= is given, then file paths should point to the source
tree with either relative or absolute paths.

This is because the source tree is not always writable.
The source tree might be delivered in a DVD-ROM, read-only mounted nfs,
or located under /usr/src/ which is installed by distro source package.




>
> >
> > +   SRCTREE=$(realpath ${tree}.)
> > +
> > +   cd $(dirname $(find -name .config -print -quit).)
> >
> > Why is this needed?
>
> In case 5), the paths of source files collected in .cmd files sometimes
> begin with '../' and sometimes are full paths, eg. /usr/include/stdio.h, so we
> must change to directory 123 as 'make O=123 gtags' does so that we could
> use same method (described below) in both cases.

No.
scripts/tags.sh must be run in the object tree in this case.

5) make O=123; cd 123; ../scripts/tags.sh srctree=.. SRCARCH=x86 gtags




> > Why is --relative-to=${SRCTREE} needed?
> >
> > You are dropping ${SRCTREE} and adding ${ABSPWD}${tree}.
> > I do not understand what this is doing back-and-forth.
>
> These .cmd files also contain the default include dir (eg.
> /usr/include/stdio.h) and even compiler's header files, try make
> ARCH=arm.


I know. Probably, that would not happen
because the following patch is queued up.
https://patchwork.kernel.org/patch/11505807/




> We should first collect files w

linux-next: build failure after merge of the kvm tree

2020-05-20 Thread Stephen Rothwell

Hi all,

After merging the kvm tree, today's linux-next build (x86_64 allmodconfig)
failed like this:

arch/x86/kvm/svm/svm.c: In function 'kvm_machine_check':
arch/x86/kvm/svm/svm.c:1834:2: error: too many arguments to function 
'do_machine_check'
 1834 |  do_machine_check(&regs, 0);
  |  ^~~~
In file included from arch/x86/kvm/svm/svm.c:36:
arch/x86/include/asm/mce.h:254:6: note: declared here
  254 | void do_machine_check(struct pt_regs *pt_regs);
  |  ^~~~

Caused by commit

  1c164cb3ffd0 ("KVM: SVM: Use do_machine_check to pass MCE to the host")

interacting with commit

  aaa4947defff ("x86/entry: Convert Machine Check to IDTENTRY_IST")

from the tip tree.

I added the following merge fix patch.

From: Stephen Rothwell 
Date: Thu, 21 May 2020 16:24:59 +1000
Subject: [PATCH] KVM: SVM: fix up for do_machine_check() API change

Signed-off-by: Stephen Rothwell 
---
 arch/x86/kvm/svm/svm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index ae287980c027..7488c8abe825 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -1831,7 +1831,7 @@ static void kvm_machine_check(void)
.flags = X86_EFLAGS_IF,
};
 
-   do_machine_check(&regs, 0);
+   do_machine_check(&regs);
 #endif
 }
 
-- 
2.26.2

-- 
Cheers,
Stephen Rothwell



[PATCH] [v2] media: staging: tegra-vde: fix runtime pm imbalance on error

2020-05-20 Thread Dinghao Liu

pm_runtime_get_sync() increments the runtime PM usage counter even
when the call returns an error code. Thus a pairing decrement is needed
on the error handling path to keep the counter balanced.

Signed-off-by: Dinghao Liu 
---

Changelog:

v2: - Remove unused label 'unlock'
---
 drivers/staging/media/tegra-vde/vde.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/staging/media/tegra-vde/vde.c 
b/drivers/staging/media/tegra-vde/vde.c
index d3e63512a765..3fdf2cd0b99e 100644
--- a/drivers/staging/media/tegra-vde/vde.c
+++ b/drivers/staging/media/tegra-vde/vde.c
@@ -777,7 +777,7 @@ static int tegra_vde_ioctl_decode_h264(struct tegra_vde 
*vde,
 
ret = pm_runtime_get_sync(dev);
if (ret < 0)
-   goto unlock;
+   goto put_runtime_pm;
 
/*
 * We rely on the VDE registers reset value, otherwise VDE
@@ -843,8 +843,6 @@ static int tegra_vde_ioctl_decode_h264(struct tegra_vde 
*vde,
 put_runtime_pm:
pm_runtime_mark_last_busy(dev);
pm_runtime_put_autosuspend(dev);
-
-unlock:
mutex_unlock(&vde->lock);
 
 release_dpb_frames:
-- 
2.17.1

Re: [RFC PATCH v3 2/2] CPPC: add support for SW BOOST

2020-05-20 Thread Xiongfeng Wang

Hi Viresh,

On 2020/5/20 13:00, Viresh Kumar wrote:
> On 19-05-20, 19:41, Xiongfeng Wang wrote:
>> To add SW BOOST support for CPPC, we need to get the max frequency of
>> boost mode and non-boost mode. ACPI spec 6.2 section 8.4.7.1 describe
>> the following two CPC registers.
>>
>> "Highest performance is the absolute maximum performance an individual
>> processor may reach, assuming ideal conditions. This performance level
>> may not be sustainable for long durations, and may only be achievable if
>> other platform components are in a specific state; for example, it may
>> require other processors be in an idle state.
>>
>> Nominal Performance is the maximum sustained performance level of the
>> processor, assuming ideal operating conditions. In absence of an
>> external constraint (power, thermal, etc.) this is the performance level
>> the platform is expected to be able to maintain continuously. All
>> processors are expected to be able to sustain their nominal performance
>> state simultaneously."
>>
>> To add SW BOOST support for CPPC, we can use Highest Performance as the
>> max performance in boost mode and Nominal Performance as the max
>> performance in non-boost mode. If the Highest Performance is greater
>> than the Nominal Performance, we assume SW BOOST is supported.
>>
>> The current CPPC driver does not support SW BOOST and use 'Highest
>> Performance' as the max performance the CPU can achieve. 'Nominal
>> Performance' is used to convert 'performance' to 'frequency'. That
>> means, if firmware enable boost and provide a value for Highest
>> Performance which is greater than Nominal Performance, boost feature is
>> enabled by default.
>>
>> Because SW BOOST is disabled by default, so, after this patch, boost
>> feature is disabled by default even if boost is enabled by firmware.
>>
>> Signed-off-by: Xiongfeng Wang 
>> ---
>>  drivers/cpufreq/cppc_cpufreq.c | 39 +--
>>  1 file changed, 37 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
>> index bda0b24..792ed9e 100644
>> --- a/drivers/cpufreq/cppc_cpufreq.c
>> +++ b/drivers/cpufreq/cppc_cpufreq.c
>> @@ -37,6 +37,7 @@
>>   * requested etc.
>>   */
>>  static struct cppc_cpudata **all_cpu_data;
>> +static bool boost_supported;
>>  
>>  struct cppc_workaround_oem_info {
>>  char oem_id[ACPI_OEM_ID_SIZE + 1];
>> @@ -310,7 +311,7 @@ static int cppc_cpufreq_cpu_init(struct cpufreq_policy 
>> *policy)
>>   * Section 8.4.7.1.1.5 of ACPI 6.1 spec)
>>   */
>>  policy->min = cppc_cpufreq_perf_to_khz(cpu, 
>> cpu->perf_caps.lowest_nonlinear_perf);
>> -policy->max = cppc_cpufreq_perf_to_khz(cpu, 
>> cpu->perf_caps.highest_perf);
>> +policy->max = cppc_cpufreq_perf_to_khz(cpu, 
>> cpu->perf_caps.nominal_perf);
>>  
>>  /*
>>   * Set cpuinfo.min_freq to Lowest to make the full range of performance
>> @@ -318,7 +319,7 @@ static int cppc_cpufreq_cpu_init(struct cpufreq_policy 
>> *policy)
>>   * nonlinear perf
>>   */
>>  policy->cpuinfo.min_freq = cppc_cpufreq_perf_to_khz(cpu, 
>> cpu->perf_caps.lowest_perf);
>> -policy->cpuinfo.max_freq = cppc_cpufreq_perf_to_khz(cpu, 
>> cpu->perf_caps.highest_perf);
>> +policy->cpuinfo.max_freq = cppc_cpufreq_perf_to_khz(cpu, 
>> cpu->perf_caps.nominal_perf);
>>  
>>  policy->transition_delay_us = 
>> cppc_cpufreq_get_transition_delay_us(cpu_num);
>>  policy->shared_type = cpu->shared_type;
>> @@ -343,6 +344,13 @@ static int cppc_cpufreq_cpu_init(struct cpufreq_policy 
>> *policy)
>>  
>>  cpu->cur_policy = policy;
>>  
>> +/*
>> + * If 'highest_perf' is greater than 'nominal_perf', we assume CPU Boost
>> + * is supported.
>> + */
>> +if (cpu->perf_caps.highest_perf > cpu->perf_caps.nominal_perf)
>> +boost_supported = true;
>> +
>>  /* Set policy->cur to max now. The governors will adjust later. */
>>  policy->cur = cppc_cpufreq_perf_to_khz(cpu,
>>  cpu->perf_caps.highest_perf);
>> @@ -410,6 +418,32 @@ static unsigned int cppc_cpufreq_get_rate(unsigned int 
>> cpunum)
>>  return cppc_get_rate_from_fbctrs(cpu, fb_ctrs_t0, fb_ctrs_t1);
>>  }
>>  
>> +static int cppc_cpufreq_set_boost(struct cpufreq_policy *policy, int state)
>> +{
>> +struct cppc_cpudata *cpudata;
>> +int ret = 0;
> 
> No need to initialize this.

I will change it in the next version.

Thanks for your advice. I will add your 'Suggested-by' for these two patches.


Thanks,
Xiongfeng

> 
>> +
>> +if (!boost_supported) {
>> +pr_err("BOOST not supported by CPU or firmware\n");
>> +return -EINVAL;
>> +}
>> +
>> +cpudata = all_cpu_data[policy->cpu];
>> +if (state)
>> +policy->max = cppc_cpufreq_perf_to_khz(cpudata,
>> +cpudata->perf_caps.highest_perf);
>> +else
>> +policy->max = cppc_cpufreq_perf_to_khz(cpudata,

[PATCH] perf evlist: Ensure grouped events with same cpu map

2020-05-20 Thread Jin Yao

A metric may consist of a core event and an uncore event (or other
per-socket event).

For example, the metric "C2_Pkg_Residency" consists of
"cstate_pkg/c2-residency" and "msr/tsc". The former is per-socket
event and the latter is per-cpu event.

"C2_Pkg_Residency" hits assertion failure on cascadelakex.

 # perf stat -M "C2_Pkg_Residency" -a -- sleep 1
 perf: util/evsel.c:1464: get_group_fd: Assertion `!(fd == -1)' failed.
 Aborted

The root cause is an access violation in get_group_fd().

For a group that mixes a per-socket event and a per-cpu event, with the
per-socket event as group leader, the access violation will happen.

perf_evsel__alloc_fd allocates one FD member for per-socket event.
Only FD(evsel, 0, 0) is valid (suppose one-socket system).

But for per-cpu event, perf_evsel__alloc_fd allocates N FD members
(N = ncpus). For example, if ncpus is 8, FD(evsel, 0, 0) to
FD(evsel, 7, 0) are valid.

get_group_fd(struct evsel *evsel, int cpu, int thread)
{
   struct evsel *leader = evsel->leader;

   fd = FD(leader, cpu, thread);/* access violation */
}

If leader is per-socket event, only FD(leader, 0, 0) is valid.
So when get_group_fd tries to access FD(leader, 1, 0), access
violation will happen.

This patch ensures that the grouped events have the same cpu maps
before we go to get_group_fd.

If the cpu maps do not match, we force the group to be disabled.
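
As a rough illustration of the check described above, here is a
self-contained sketch that compares CPU maps pairwise. It is not the perf
code: cpu_map_sketch and both function names are hypothetical, simplified
stand-ins for the evsel CPU maps used in the patch below.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a perf CPU map. */
struct cpu_map_sketch {
	int nr;		/* number of entries in map[] */
	const int *map;	/* CPU numbers */
};

/* Two maps are equal when they have the same length and entries. */
static bool cpu_maps_equal_sketch(const struct cpu_map_sketch *a,
				  const struct cpu_map_sketch *b)
{
	if (a->nr != b->nr)
		return false;

	for (int i = 0; i < a->nr; i++) {
		if (a->map[i] != b->map[i])
			return false;
	}

	return true;
}

/* True when every map equals its predecessor, i.e. all maps match. */
static bool all_cpu_maps_match_sketch(const struct cpu_map_sketch *maps,
				      size_t n)
{
	for (size_t i = 1; i < n; i++) {
		if (!cpu_maps_equal_sketch(&maps[i - 1], &maps[i]))
			return false;
	}

	return true;
}
```

A per-socket leader with one entry and a per-cpu member with several
entries fail this check, which is the case where grouping would index
past the leader's FD array.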

Fixes: 6a4bb04caacc8 ("perf tools: Enable grouping logic for parsed events")
Signed-off-by: Jin Yao 
---
 tools/perf/builtin-stat.c |  3 +++
 tools/perf/util/evlist.c  | 32 
 tools/perf/util/evlist.h  |  5 +
 3 files changed, 40 insertions(+)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 377e575f9645..0e4fc6b3323c 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -584,6 +584,9 @@ static int __run_perf_stat(int argc, const char **argv, int 
run_idx)
if (affinity__setup(&affinity) < 0)
return -1;
 
+   if (!evlist__cpus_matched(evsel_list))
+   evlist__force_disable_group(evsel_list);
+
evlist__for_each_cpu (evsel_list, i, cpu) {
affinity__set(&affinity, cpu);
 
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 2a9de6491700..fc6e410ca63b 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -1704,3 +1704,35 @@ struct evsel *perf_evlist__reset_weak_group(struct 
evlist *evsel_list,
}
return leader;
 }
+
+bool evlist__cpus_matched(struct evlist *evlist)
+{
+   struct evsel *prev = evlist__first(evlist), *evsel = prev;
+
+   if (prev->core.nr_members <= 1)
+   return true;
+
+   evlist__for_each_entry_continue(evlist, evsel) {
+   if (evsel->core.cpus->nr != prev->core.cpus->nr)
+   return false;
+
+   for (int i = 0; i < evsel->core.cpus->nr; i++) {
+   if (evsel->core.cpus->map[i] != prev->core.cpus->map[i])
+   return false;
+   }
+
+   prev = evsel;
+   }
+
+   return true;
+}
+
+void evlist__force_disable_group(struct evlist *evlist)
+{
+   struct evsel *evsel;
+
+   evlist__for_each_entry(evlist, evsel) {
+   evsel->leader = evsel;
+   evsel->core.nr_members = 0;
+   }
+}
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index b6f325dfb4d2..ea7a53166cbd 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -355,4 +355,9 @@ void perf_evlist__force_leader(struct evlist *evlist);
 struct evsel *perf_evlist__reset_weak_group(struct evlist *evlist,
 struct evsel *evsel,
bool close);
+
+bool evlist__cpus_matched(struct evlist *evlist);
+
+void evlist__force_disable_group(struct evlist *evlist);
+
 #endif /* __PERF_EVLIST_H */
-- 
2.17.1

Re: Re: [PATCH] hwrng: ks-sa - fix runtime pm imbalance on error

2020-05-20 Thread dinghao . liu

Hi Alexander,

There are a large number of cases that assume pm_runtime_get_sync()
will modify the runtime PM usage counter on error. Fixing this in the PM 
subsystem will influence all callers of pm_runtime_get_sync() and
introduce new bugs. Therefore I think the better solution is to fix
misused cases individually.
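
To make the counter behaviour concrete, here is a small user-space mock.
It is only a sketch: mock_dev, mock_get_sync(), mock_put() and
mock_do_work() are hypothetical names that model the usage counter alone;
they are not the runtime PM API.

```c
#include <errno.h>

/* Hypothetical mock device; it only models the usage counter. */
struct mock_dev {
	int usage_count;
	int resume_error;	/* 0, or a negative errno to simulate failure */
};

/* Mimics the semantics discussed here: the counter is bumped even on failure. */
static int mock_get_sync(struct mock_dev *dev)
{
	dev->usage_count++;
	return dev->resume_error;
}

static void mock_put(struct mock_dev *dev)
{
	dev->usage_count--;
}

static int mock_do_work(struct mock_dev *dev)
{
	int ret = mock_get_sync(dev);

	if (ret < 0) {
		mock_put(dev);	/* pairing decrement on the error path */
		return ret;
	}

	/* ... hardware access would go here ... */

	mock_put(dev);
	return 0;
}
```

In this mock the counter returns to zero on both the success and the
failure path; without the decrement on the error path it would stay at
one after a failed call.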

Dinghao

> Hello Dinghao,
> 
> On Wed, 2020-05-20 at 21:29 +0800, Dinghao Liu wrote:
> > pm_runtime_get_sync() increments the runtime PM usage counter even
> > when the call returns an error code. Thus a pairing decrement is needed
> > on the error handling path to keep the counter balanced.
> 
> I believe, this is the wrong place for such kind of fix.
> pm_runtime_get_sync() has obviously a broken semantics with regards to
> your observation but no other driver does what you propose.
> I think the proper fix belong into PM subsystem, please take a look
> onto commit 15bcb91d7e60 "PM / Runtime: Implement autosuspend support".
> 
> > Signed-off-by: Dinghao Liu 
> > ---
> >  drivers/char/hw_random/ks-sa-rng.c | 1 +
> >  1 file changed, 1 insertion(+)
> > 
> > diff --git a/drivers/char/hw_random/ks-sa-rng.c 
> > b/drivers/char/hw_random/ks-sa-rng.c
> > index e2330e757f1f..85c81da4a8af 100644
> > --- a/drivers/char/hw_random/ks-sa-rng.c
> > +++ b/drivers/char/hw_random/ks-sa-rng.c
> > @@ -244,6 +244,7 @@ static int ks_sa_rng_probe(struct platform_device *pdev)
> > ret = pm_runtime_get_sync(dev);
> > if (ret < 0) {
> > dev_err(dev, "Failed to enable SA power-domain\n");
> > +   pm_runtime_put_sync(dev);
> > pm_runtime_disable(dev);
> > return ret;
> > }
> -- 
> Alexander Sverdlin.
>

Re: [PATCH v4 3/9] usb: dwc3: Increase timeout for CmdAct cleared by device controller

2020-05-20 Thread Felipe Balbi


Hi Jun,

Felipe Balbi  writes:
>> In any case, increasing the timeout should be fine with me. It may be 
>> difficult to determine the max timeout based on the slowest clock rate 
>> and number of cycles. Different controllers and controller versions 
>> behave differently and may have different numbers of clock cycles to 
>> complete a command.
>>
>> The RTL engineer recommended the timeout to be at least 1ms (which may be 
>> more than the polling rate of this patch). I'm fine with either the rate 
>> provided by this tested patch or higher.
>
> A whole ms waiting for a command to complete? Wow, that's a lot of time
> blocking the CPU. It looks like, perhaps, we should move to command
> completion interrupts. The difficulty here is that we issue commands
> from within the interrupt handler and, as such, can't
> wait_for_completion().
>
> Meanwhile, we will take the timeout increase I guess, otherwise NXP
> won't have a working setup.

patch 1 in this series doesn't apply to testing/next. Care to rebase and
resend?

Thank you

-- 
balbi



Re: [PATCH v1 2/6] bus: mhi: core: Mark device inactive soon after host issues a shutdown

2020-05-20 Thread kbuild test robot

Hi Bhaumik,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on next-20200519]
[cannot apply to linus/master v5.7-rc6 v5.7-rc5 v5.7-rc4 v5.7-rc6]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest to use '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]

url:
https://github.com/0day-ci/linux/commits/Bhaumik-Bhatt/Bug-fixes-and-bootup-and-shutdown-improvements/20200520-083400
base:fb57b1fabcb28f358901b2df90abd2b48abc1ca8
config: x86_64-allyesconfig (attached as .config)
compiler: clang version 11.0.0 (https://github.com/llvm/llvm-project 
3393cc4cebf9969db94dc424b7a2b6195589c33b)
reproduce (this is a W=1 build):
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# install x86_64 cross compiling tool for clang build
# apt-get install binutils-x86-64-linux-gnu
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=x86_64 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kbuild test robot 

All errors (new ones prefixed by >>, old ones prefixed by <<):

>> drivers/bus/mhi/core/main.c:397:8: error: implicit declaration of function 
>> 'mhi_is_active' [-Werror,-Wimplicit-function-declaration]
if (!mhi_is_active(mhi_cntrl)) {
^
1 error generated.

vim +/mhi_is_active +397 drivers/bus/mhi/core/main.c

   371  
   372  irqreturn_t mhi_intvec_threaded_handler(int irq_number, void *priv)
   373  {
   374  struct mhi_controller *mhi_cntrl = priv;
   375  struct device *dev = &mhi_cntrl->mhi_dev->dev;
   376  enum mhi_state state = MHI_STATE_MAX;
   377  enum mhi_pm_state pm_state = 0;
   378  enum mhi_ee_type ee = 0;
   379  bool handle_rddm = false;
   380  
   381  write_lock_irq(&mhi_cntrl->pm_lock);
   382  if (!MHI_REG_ACCESS_VALID(mhi_cntrl->pm_state)) {
   383  write_unlock_irq(&mhi_cntrl->pm_lock);
   384  goto exit_intvec;
   385  }
   386  
   387  state = mhi_get_mhi_state(mhi_cntrl);
   388  ee = mhi_cntrl->ee;
   389  mhi_cntrl->ee = mhi_get_exec_env(mhi_cntrl);
   390  dev_dbg(dev, "local ee:%s device ee:%s dev_state:%s\n",
   391  TO_MHI_EXEC_STR(mhi_cntrl->ee), TO_MHI_EXEC_STR(ee),
   392  TO_MHI_STATE_STR(state));
   393  
   394   /* If device supports RDDM don't bother processing SYS error */
   395  if (mhi_cntrl->rddm_image) {
   396  /* host may be performing a device power down already */
 > 397  if (!mhi_is_active(mhi_cntrl)) {
   398  write_unlock_irq(&mhi_cntrl->pm_lock);
   399  goto exit_intvec;
   400  }
   401  
   402  if (mhi_cntrl->ee == MHI_EE_RDDM && mhi_cntrl->ee != ee) {
   403  /* prevent clients from queueing any more packets */
   404  pm_state = mhi_tryset_pm_state(mhi_cntrl,
   405  MHI_PM_SYS_ERR_DETECT);
   406  if (pm_state == MHI_PM_SYS_ERR_DETECT)
   407  handle_rddm = true;
   408  }
   409  
   410  write_unlock_irq(&mhi_cntrl->pm_lock);
   411  
   412  if (handle_rddm) {
   413  dev_err(dev, "RDDM event occurred!\n");
   414  mhi_cntrl->status_cb(mhi_cntrl, MHI_CB_EE_RDDM);
   415  wake_up_all(&mhi_cntrl->state_event);
   416  }
   417  goto exit_intvec;
   418  }
   419  
   420  if (state == MHI_STATE_SYS_ERR) {
   421  dev_dbg(dev, "System error detected\n");
   422  pm_state = mhi_tryset_pm_state(mhi_cntrl,
   423 MHI_PM_SYS_ERR_DETECT);
   424  }
   425  
   426  write_unlock_irq(&mhi_cntrl->pm_lock);
   427  
   428  if (pm_state == MHI_PM_SYS_ERR_DETECT) {
   429  wake_up_all(&mhi_cntrl->state_event);
   430  
   431  /* For fatal errors, we let controller decide next step */
   432  if (MHI_IN_PBL(ee))
   433  mhi_cntrl->status_cb(mhi_cntrl, MHI_CB_FATAL_ERROR);
   434  else
   435  mhi_pm_sys_err_handler(mhi_cntrl);
   436  }
   437  
   438  exit_intvec:
   439  
   440  r

Re: [PATCH v4 3/9] usb: dwc3: Increase timeout for CmdAct cleared by device controller

2020-05-20 Thread Felipe Balbi


Hi,

Thinh Nguyen  writes:
 "Power Down Scale (PwrDnScale)
 The USB3 suspend_clk input replaces pipe3_rx_pclk as a clock source
 to a small part of the USB3 controller that operates when the SS
 PHY is in its lowest power (P3) state, and therefore does not provide 
 a clock.
 The Power Down Scale field specifies how many suspend_clk periods
 fit into a 16 kHz clock period. When performing the division, round
 up the remainder.
 For example, when using an 8-bit/16-bit/32-bit PHY and 25-MHz
 Suspend clock, Power Down Scale = 25000 kHz/16 kHz = 13'd1563
 (rounded up)
 Note:
 - Minimum Suspend clock frequency is 32 kHz
 - Maximum Suspend clock frequency is 125 MHz"
>>> Cool, now do we have an upper bound for how many clock cycles it
>>> takes to wake up the PHY?
>> My understanding is this ep command does not wake up the SS PHY, the
>> SS PHY still stays at P3 when execute this ep command. The time
>> required here is to wait controller complete something for this ep
>> command with 32K clock.
> Sorry I made a mistake. You're right. Just checked with one of the RTL
> engineers, and it doesn't need to wake up the phy. However, if it is
> eSS speed, it may take longer time as the command may be completing
> with the suspend clock.
>
 What's the value for GCTL[7:6]?
>>> 2'b00
>>>
>>> Thanks
>>> Li Jun
>> (Sorry for the delay reply)
>>
>> If it's 0, then the ram clock should be the same as the bus_clk, which
>> is odd since you mentioned that the suspend_clk is used instead while in P3.
>
> Just checked with the RTL engineer, even if GCTL[7:6] is set to 0, 
> internally it can still run with suspend clock during P3.
>
>> Anyway, I was looking for a way maybe to improve the speed during
>> issuing a command. One way is to set GUSB3PIPECTL[17]=0, and it should
>> wakeup the phy anytime. I think Felipe suggested it. It's odd that it
>> doesn't work for you. I don't have other ideas beside increasing the
>> command timeout.
>>
>
> In any case, increasing the timeout should be fine with me. It maybe 
> difficult to determine the max timeout base on the slowest clock rate 
> and number of cycles. Different controller and controller versions 
> behave differently and may have different number of clock cycles to 
> complete a command.
>
> The RTL engineer recommended timeout to be at least 1ms (which maybe 
> more than the polling rate of this patch). I'm fine with either the rate 
> provided by this tested patch or higher.

A whole ms waiting for a command to complete? Wow, that's a lot of time
blocking the CPU. It looks like, perhaps, we should move to command
completion interrupts. The difficulty here is that we issue commands
from within the interrupt handler and, as such, can't
wait_for_completion().

Meanwhile, we will take the timeout increase I guess, otherwise NXP
won't have a working setup.
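
For reference, the Power Down Scale arithmetic in the databook text quoted
above is a plain ceiling division. A minimal userspace sketch follows;
pwr_dn_scale() is a hypothetical helper name for illustration, not a
function from the dwc3 driver:

```c
#include <stdint.h>

/*
 * Hypothetical helper (not part of the dwc3 driver): number of
 * suspend_clk periods that fit into one 16 kHz period, rounded up
 * as the quoted databook text asks.
 */
static inline uint32_t pwr_dn_scale(uint32_t suspend_clk_khz)
{
	/* ceiling division: add (divisor - 1) before dividing */
	return (suspend_clk_khz + 16 - 1) / 16;
}
```

With the 25 MHz suspend clock from the quoted example this gives 1563,
matching the databook figure.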

-- 
balbi


[PATCH v2 2/2] arm64: dts: imx8mn-ddr4-evk: correct ldo1/ldo2 voltage range

2020-05-20 Thread Robin Gong

Correct the ldo1 voltage range from the wrong high group (3.0v~3.3v) to
the low group (1.6v~1.9v), because ldo1 should be 1.8v. Since the
bd718x7-regulator driver supports both voltage groups, just correct the
voltage range to 1.6v~3.3v. Correct the voltage range for ldo2@0.8v too.
Otherwise, ldo1 would be kept at 3.0v and ldo2 at 0.9v, which violates the
i.mx8mn datasheet, as the warning log below from the kernel shows:

[0.995524] LDO1: Bringing 180uV into 300-300uV
[0.999196] LDO2: Bringing 80uV into 90-90uV

Signed-off-by: Robin Gong 
---
 arch/arm64/boot/dts/freescale/imx8mn-ddr4-evk.dts | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/boot/dts/freescale/imx8mn-ddr4-evk.dts b/arch/arm64/boot/dts/freescale/imx8mn-ddr4-evk.dts
index d07e0e6..a1e5483 100644
--- a/arch/arm64/boot/dts/freescale/imx8mn-ddr4-evk.dts
+++ b/arch/arm64/boot/dts/freescale/imx8mn-ddr4-evk.dts
@@ -113,7 +113,7 @@
 
ldo1_reg: LDO1 {
regulator-name = "LDO1";
-   regulator-min-microvolt = <300>;
+   regulator-min-microvolt = <160>;
regulator-max-microvolt = <330>;
regulator-boot-on;
regulator-always-on;
@@ -121,7 +121,7 @@
 
ldo2_reg: LDO2 {
regulator-name = "LDO2";
-   regulator-min-microvolt = <90>;
+   regulator-min-microvolt = <80>;
regulator-max-microvolt = <90>;
regulator-boot-on;
regulator-always-on;
-- 
2.7.4

[PATCH v2 1/2] arm64: dts: imx8mm-evk: correct ldo1/ldo2 voltage range

2020-05-20 Thread Robin Gong

Correct the ldo1 voltage range from the wrong high group (3.0v~3.3v) to
the low group (1.6v~1.9v), because ldo1 should be 1.8v. Since the
bd718x7-regulator driver supports both voltage groups, just correct the
voltage range to 1.6v~3.3v. Correct the voltage range for ldo2@0.8v too.
Otherwise, ldo1 would be kept at 3.0v and ldo2 at 0.9v, which violates the
i.mx8mm datasheet, as the warning log below from the kernel shows:

[0.995524] LDO1: Bringing 180uV into 300-300uV
[0.999196] LDO2: Bringing 80uV into 90-90uV

Signed-off-by: Robin Gong 
---
 arch/arm64/boot/dts/freescale/imx8mm-evk.dts | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/boot/dts/freescale/imx8mm-evk.dts b/arch/arm64/boot/dts/freescale/imx8mm-evk.dts
index e5ec832..0f1d7f8 100644
--- a/arch/arm64/boot/dts/freescale/imx8mm-evk.dts
+++ b/arch/arm64/boot/dts/freescale/imx8mm-evk.dts
@@ -208,7 +208,7 @@
 
ldo1_reg: LDO1 {
regulator-name = "LDO1";
-   regulator-min-microvolt = <300>;
+   regulator-min-microvolt = <160>;
regulator-max-microvolt = <330>;
regulator-boot-on;
regulator-always-on;
@@ -216,7 +216,7 @@
 
ldo2_reg: LDO2 {
regulator-name = "LDO2";
-   regulator-min-microvolt = <90>;
+   regulator-min-microvolt = <80>;
regulator-max-microvolt = <90>;
regulator-boot-on;
regulator-always-on;
-- 
2.7.4

Re: [PATCHv3 3/5] Input: EXC3000: add EXC80H60 and EXC80H84 support

2020-05-20 Thread Dmitry Torokhov

On Wed, May 20, 2020 at 11:20:03PM +0200, Sebastian Reichel wrote:
> Hi,
> 
> On Wed, May 20, 2020 at 10:45:19AM -0700, Dmitry Torokhov wrote:
> > Hi Sebastian,
> > 
> > On Wed, May 20, 2020 at 05:39:34PM +0200, Sebastian Reichel wrote:
> > >  
> > >   data->client = client;
> > > + data->info = device_get_match_data(&client->dev);
> 
> The above is for DT (and ACPI, but driver has no ACPI table).
> 
> > > + if (!data->info) {
> > > + enum eeti_dev_id eeti_dev_id =
> > > + i2c_match_id(exc3000_id, client)->driver_data;
> > 
> > I believe i2c devices can be instantiated via sysfs, so I think we
> > better handle case where we can't find matching id. Also driver_data is
> > enough to store a pointer, maybe we can have individual structures
> > instead of using an array and indexing here?
> 
> The above code is only for exactly this usecase (loading via sysfs).
> There is zero chance, that we cannot find matching id. The sysfs
> based probing works by providing the device address and the name
> listed in driver's id_table. I took the above code style from
> drivers/i2c/muxes/i2c-mux-ltc4306.c.

Ah, OK, so i2c does not provide the "new_id" attribute to extend the
match table.

> 
> We can store the pointer directly in i2c_device_id's driver_data
> field, but that requires two type casts (field is ulong instead
> of pointer). The array variant feels a bit cleaner to me.

OK, fair enough. I'll wait for Rob to ack the yaml conversion and will
apply.
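
For readers following the driver_data discussion, here is a rough
userspace sketch of the two options weighed above: storing an array index
versus storing a pointer in an unsigned long field. All names (chip_info,
chip_table, info_from_index, pack_pointer, info_from_pointer) and the
table values are made up for illustration and are not taken from the
exc3000 driver:

```c
#include <stddef.h>

/* Hypothetical per-chip data; values are illustrative only. */
struct chip_info {
	const char *name;
	int max_xy;
};

enum chip_id { CHIP_A, CHIP_B };

static const struct chip_info chip_table[] = {
	[CHIP_A] = { "chip-a", 4095 },
	[CHIP_B] = { "chip-b", 16383 },
};

/* Array variant: driver_data holds an index, no casts needed. */
static const struct chip_info *info_from_index(unsigned long driver_data)
{
	if (driver_data >= sizeof(chip_table) / sizeof(chip_table[0]))
		return NULL;
	return &chip_table[driver_data];
}

/*
 * Pointer variant: one cast when filling the id table, one when reading
 * it back. Assumes unsigned long is wide enough to hold a pointer, as it
 * is for the kernel's targets.
 */
static unsigned long pack_pointer(const struct chip_info *info)
{
	return (unsigned long)info;
}

static const struct chip_info *info_from_pointer(unsigned long driver_data)
{
	return (const struct chip_info *)driver_data;
}
```

Both variants resolve to the same table entry; the difference is only
where the casts live.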

Thanks.

-- 
Dmitry

Re: [PATCH v3/RESEND 0/3] Even moar rpmh cleanups

2020-05-20 Thread Randy Dunlap

On 5/20/20 11:04 PM, Stephen Boyd wrote:
> (Resent with more Ccs and To lines)
> 
> We remove the tcs_is_free() API and then do super micro optimizations on
> the irq handler. I haven't tested anything here so most likely there's a
> bug (again again)!
> 


Subject: s/moar/more/


-- 
~Randy

Re: [PATCH v2] KVM: PPC: Book3S HV: relax check on H_SVM_INIT_ABORT

2020-05-20 Thread Ram Pai

On Wed, May 20, 2020 at 07:43:08PM +0200, Laurent Dufour wrote:
> The commit 8c47b6ff29e3 ("KVM: PPC: Book3S HV: Check caller of H_SVM_*
> Hcalls") added checks of secure bit of SRR1 to filter out the Hcall
> reserved to the Ultravisor.
> 
> However, the Hcall H_SVM_INIT_ABORT is made by the Ultravisor passing the
> context of the VM calling UV_ESM. This allows the Hypervisor to return to
> the guest without going through the Ultravisor. Thus the Secure bit of SRR1
> is not set in that particular case.
> 
> In the case a regular VM is calling H_SVM_INIT_ABORT, this hcall will be
> filtered out in kvmppc_h_svm_init_abort() because kvm->arch.secure_guest is
> not set in that case.
> 
> Fixes: 8c47b6ff29e3 ("KVM: PPC: Book3S HV: Check caller of H_SVM_* Hcalls")
> Signed-off-by: Laurent Dufour 


Reviewed-by: Ram Pai 

> ---
>  arch/powerpc/kvm/book3s_hv.c | 9 ++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
> index 93493f0cbfe8..6ad1a3b14300 100644
> --- a/arch/powerpc/kvm/book3s_hv.c
> +++ b/arch/powerpc/kvm/book3s_hv.c
> @@ -1099,9 +1099,12 @@ int kvmppc_pseries_do_hcall(struct kvm_vcpu *vcpu)
>   ret = kvmppc_h_svm_init_done(vcpu->kvm);
>   break;
>   case H_SVM_INIT_ABORT:
> - ret = H_UNSUPPORTED;
> - if (kvmppc_get_srr1(vcpu) & MSR_S)
> - ret = kvmppc_h_svm_init_abort(vcpu->kvm);
> + /*
> +  * Even if that call is made by the Ultravisor, the SSR1 value
> +  * is the guest context one, with the secure bit clear as it has
> +  * not yet been secured. So we can't check it here.
> +  */

Frankly speaking, the comment above, when read in isolation, i.e. without
the deleted code above, feels out of place. The reasoning for the change is
anyway captured in the changelog. So, I think, we should delete this
comment.

Also, the comment above assumes the Ultravisor will call H_SVM_INIT_ABORT
with the SRR1(S) bit not set, which may or may not be true. Regardless of
who calls H_SVM_INIT_ABORT and how, we should just call
kvmppc_h_svm_init_abort() and let it deal with the complexities.


RP

Re: [PATCH v10 1/5] usb: xhci: Change the XHCI link order in the Makefile

2020-05-20 Thread Greg Kroah-Hartman

A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing in e-mail?

A: No.
Q: Should I include quotations after my reply?

http://daringfireball.net/2007/07/on_top

On Wed, May 20, 2020 at 01:29:45PM -0400, Alan Cooper wrote:
> Greg, Alan,
> 
> The other 4 related patches were accepted into usb-next and I just
> realized that this one didn't make it. This patch will not fix the
> "insmod out of order" issue, but will help our controllers work with
> some poorly behaved USB devices when the drivers are builtin.

As it doesn't solve the real issue, I did not accept this so that you
all can continue to work on creating a real solution that works for both
situations (built in and as modules.)

I thought I said that already...

thanks,

greg k-h

RE: [PATCH v1 2/2] arm64: dts: imx8mn-ddr4-evk: correct ldo1/ldo2 voltage range

2020-05-20 Thread Robin Gong

2020/05/21 14:02 Peng Fan  wrote:
> > Subject: [PATCH v1 2/2] arm64: dts: imx8mn-ddr4-evk: correct ldo1/ldo2
> > voltage range
> >
> > Correct ldo1 voltage range from wrong high group(3.0v~3.3v) to low
> > group
> > (1.6v~1.9v) because the ldo1 should be 1.8v. Actually, two voltage
> > groups have been supported at bd718x7-regulator driver, hence, just
> > corrrect the voltage range to 1.6v~3.3v. For ldo2@0.8v, correct voltage 
> > range
> too.
> > Otherwise, ldo1 would be kept @3.0v and ldo2@0.9v which violate
> > i.mx8mm datasheet as the below warning log in kernel:
> >
> > [0.995524] LDO1: Bringing 180uV into 300-300uV
> > [0.999196] LDO2: Bringing 80uV into 90-90uV
> >
> > Signed-off-by: Robin Gong 
> > ---
> >  arch/arm64/boot/dts/freescale/imx8mn-ddr4-evk.dts | 4 ++--
> >  arch/arm64/boot/dts/freescale/imx8mn-evk.dts  | 9 +
> >  2 files changed, 11 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/arm64/boot/dts/freescale/imx8mn-ddr4-evk.dts
> > b/arch/arm64/boot/dts/freescale/imx8mn-ddr4-evk.dts
> > index d07e0e6..a1e5483 100644
> > --- a/arch/arm64/boot/dts/freescale/imx8mn-ddr4-evk.dts
> > +++ b/arch/arm64/boot/dts/freescale/imx8mn-ddr4-evk.dts
> > @@ -113,7 +113,7 @@
> >
> > ldo1_reg: LDO1 {
> > regulator-name = "LDO1";
> > -   regulator-min-microvolt = <300>;
> > +   regulator-min-microvolt = <160>;
> > regulator-max-microvolt = <330>;
> > regulator-boot-on;
> > regulator-always-on;
> > @@ -121,7 +121,7 @@
> >
> > ldo2_reg: LDO2 {
> > regulator-name = "LDO2";
> > -   regulator-min-microvolt = <90>;
> > +   regulator-min-microvolt = <80>;
> > regulator-max-microvolt = <90>;
> > regulator-boot-on;
> > regulator-always-on;
> > diff --git a/arch/arm64/boot/dts/freescale/imx8mn-evk.dts
> > b/arch/arm64/boot/dts/freescale/imx8mn-evk.dts
> > index 61f3519..117ff4b 100644
> > --- a/arch/arm64/boot/dts/freescale/imx8mn-evk.dts
> > +++ b/arch/arm64/boot/dts/freescale/imx8mn-evk.dts
> > @@ -13,6 +13,15 @@
> > compatible = "fsl,imx8mn-evk", "fsl,imx8mn";  };
> >
> > +&ecspi1 {
> > +   status = "okay";
> > +spidev0: spi@0 {
> > +   compatible = "ge,achc";
> > +   reg = <0>;
> > +   spi-max-frequency = <100>;
> > +   };
> > +};
> > +
> 
> This was added by mistake?
Sorry, will send out v2.

[PATCHv3/RESEND 2/3] soc: qcom: rpmh-rsc: Loop over fewer bits in irq handler

2020-05-20 Thread Stephen Boyd

readl() returns a u32, and BITS_PER_LONG is different on 32-bit vs.
64-bit architectures. Let's loop over the possible bits set in that type
instead of looping over more bits than we ever may need to.
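
To illustrate the point (a userspace sketch, not the driver code): a u32
value widened into an unsigned long never has a bit set above bit 31, so
scanning BITS_PER_TYPE(u32) bits visits the same set bits as scanning the
whole long. collect_set_bits() is a made-up stand-in for the
for_each_set_bit() loop:

```c
#include <stdint.h>

/* Same idea as the kernel macro; assumes 8-bit bytes. */
#define BITS_PER_TYPE(type) (sizeof(type) * 8)

/*
 * Hypothetical stand-in for a for_each_set_bit() loop: record the index
 * of every set bit among the low nbits bits of word. nbits must not
 * exceed the width of unsigned long. Returns how many were found.
 */
static unsigned int collect_set_bits(unsigned long word, unsigned int nbits,
				     unsigned int *idx)
{
	unsigned int i, n = 0;

	for (i = 0; i < nbits; i++)
		if (word & (1UL << i))
			idx[n++] = i;
	return n;
}
```

On a 64-bit build the shorter bound simply skips 32 iterations that could
never match.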

Cc: Maulik Shah 
Reviewed-by: Douglas Anderson 
Reviewed-by: Bjorn Andersson 
Signed-off-by: Stephen Boyd 
---
 drivers/soc/qcom/rpmh-rsc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/soc/qcom/rpmh-rsc.c b/drivers/soc/qcom/rpmh-rsc.c
index 60fc56987659..ce725d4ff097 100644
--- a/drivers/soc/qcom/rpmh-rsc.c
+++ b/drivers/soc/qcom/rpmh-rsc.c
@@ -383,7 +383,7 @@ static irqreturn_t tcs_tx_done(int irq, void *p)
 
irq_status = readl_relaxed(drv->tcs_base + RSC_DRV_IRQ_STATUS);
 
-   for_each_set_bit(i, &irq_status, BITS_PER_LONG) {
+   for_each_set_bit(i, &irq_status, BITS_PER_TYPE(u32)) {
req = get_req_from_tcs(drv, i);
if (!req) {
WARN_ON(1);
-- 
Sent by a computer, using git, on the internet

[PATCH v3/RESEND 0/3] Even moar rpmh cleanups

2020-05-20 Thread Stephen Boyd

(Resent with more Ccs and To lines)

We remove the tcs_is_free() API and then do super micro optimizations on
the irq handler. I haven't tested anything here so most likely there's a
bug (again again)!

Changes from v2:
 * Went back in time and used the v1 patch for the first patch with
   the fixes to make it not so complicated

Changes from v1:
 * First patch became even moar complicated because it combines
   find_free_tcs() with the check for a request in flight
 * Fixed subject in patch 2
 * Put back unsigned long for bitmap operation to silence compiler
   warning
 * Picked up review tags

Stephen Boyd (3):
  soc: qcom: rpmh-rsc: Remove tcs_is_free() API
  soc: qcom: rpmh-rsc: Loop over fewer bits in irq handler
  soc: qcom: rpmh-rsc: Fold WARN_ON() into if condition

 drivers/soc/qcom/rpmh-rsc.c | 65 +
 1 file changed, 22 insertions(+), 43 deletions(-)

Cc: Maulik Shah 
Cc: Douglas Anderson 

base-commit: 1f7a3eb785e4a4e196729cd3d5ec97bd5f9f2940
-- 
Sent by a computer, using git, on the internet

[PATCHv3/RESEND 3/3] soc: qcom: rpmh-rsc: Fold WARN_ON() into if condition

2020-05-20 Thread Stephen Boyd

Move the WARN_ON() into the if condition so the compiler can see that
the branch is unlikely() and possibly optimize it better.
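
A simplified userspace sketch of why this works: the kernel's WARN_ON()
evaluates to its condition wrapped in unlikely(), so it can sit directly
in the if. The macro below is a cut-down imitation, and warn_on_report(),
warn_count and handle_req() are invented names for illustration only:

```c
#include <stdio.h>

/* Branch hint; __builtin_expect is a GCC/Clang builtin. */
#define unlikely(x) __builtin_expect(!!(x), 0)

static int warn_count;

/* Hypothetical reporting helper: log when cond is true, return cond. */
static inline int warn_on_report(int cond, const char *file, int line)
{
	if (cond) {
		warn_count++;
		fprintf(stderr, "WARNING at %s:%d\n", file, line);
	}
	return cond;
}

/* Cut-down imitation of WARN_ON(): warns and yields the condition. */
#define WARN_ON(cond) \
	unlikely(warn_on_report(!!(cond), __FILE__, __LINE__))

/* Mirrors the shape of the patched branch: warn and bail out in one go. */
static int handle_req(const int *req)
{
	if (WARN_ON(!req))
		return -1;	/* corresponds to the "goto skip" path */
	return *req;
}
```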

Cc: Maulik Shah 
Reviewed-by: Douglas Anderson 
Reviewed-by: Bjorn Andersson 
Signed-off-by: Stephen Boyd 
---
 drivers/soc/qcom/rpmh-rsc.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/soc/qcom/rpmh-rsc.c b/drivers/soc/qcom/rpmh-rsc.c
index ce725d4ff097..8381bd012de4 100644
--- a/drivers/soc/qcom/rpmh-rsc.c
+++ b/drivers/soc/qcom/rpmh-rsc.c
@@ -385,10 +385,8 @@ static irqreturn_t tcs_tx_done(int irq, void *p)
 
for_each_set_bit(i, &irq_status, BITS_PER_TYPE(u32)) {
req = get_req_from_tcs(drv, i);
-   if (!req) {
-   WARN_ON(1);
+   if (WARN_ON(!req))
goto skip;
-   }
 
err = 0;
for (j = 0; j < req->num_cmds; j++) {
-- 
Sent by a computer, using git, on the internet

[PATCHv3/RESEND 1/3] soc: qcom: rpmh-rsc: Remove tcs_is_free() API

2020-05-20 Thread Stephen Boyd

This API does very little. Let's replace all the callsites with the
normal operations that would be done on top of the bitmap that
tcs_in_use is. This simplifies and reduces the code size.
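
As a rough illustration of the replacement pattern (userspace sketch on a
single-word bitmap, not the driver code): next_zero_bit() and
next_set_bit() below are simplified stand-ins for the kernel's
find_next_zero_bit() and find_next_bit(), and find_free() / is_busy() are
hypothetical names mirroring what find_free_tcs() and
rpmh_rsc_ctrlr_is_busy() do after this patch:

```c
#include <errno.h>

/*
 * Simplified single-word stand-ins. Both return the first matching bit
 * index in [start, size), or size if there is none. They assume size is
 * no larger than the number of bits in unsigned long.
 */
static unsigned long next_zero_bit(unsigned long map, unsigned long size,
				   unsigned long start)
{
	unsigned long i;

	for (i = start; i < size; i++)
		if (!(map & (1UL << i)))
			return i;
	return size;
}

static unsigned long next_set_bit(unsigned long map, unsigned long size,
				  unsigned long start)
{
	unsigned long i;

	for (i = start; i < size; i++)
		if (map & (1UL << i))
			return i;
	return size;
}

/* First free slot in the group [offset, offset + num), or -EBUSY. */
static int find_free(unsigned long in_use, unsigned long offset,
		     unsigned long num)
{
	unsigned long max = offset + num;
	unsigned long i = next_zero_bit(in_use, max, offset);

	return i >= max ? -EBUSY : (int)i;
}

/* True if any slot in the group is claimed. */
static int is_busy(unsigned long in_use, unsigned long offset,
		   unsigned long num)
{
	unsigned long max = offset + num;

	return next_set_bit(in_use, max, offset) < max;
}
```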

Cc: Maulik Shah 
Cc: Douglas Anderson 
Signed-off-by: Stephen Boyd 
---
 drivers/soc/qcom/rpmh-rsc.c | 59 +
 1 file changed, 20 insertions(+), 39 deletions(-)

diff --git a/drivers/soc/qcom/rpmh-rsc.c b/drivers/soc/qcom/rpmh-rsc.c
index 076fd27f3081..60fc56987659 100644
--- a/drivers/soc/qcom/rpmh-rsc.c
+++ b/drivers/soc/qcom/rpmh-rsc.c
@@ -184,22 +184,6 @@ static void write_tcs_reg_sync(const struct rsc_drv *drv, int reg, int tcs_id,
   data, tcs_id, reg);
 }
 
-/**
- * tcs_is_free() - Return if a TCS is totally free.
- * @drv:The RSC controller.
- * @tcs_id: The global ID of this TCS.
- *
- * Returns true if nobody has claimed this TCS (by setting tcs_in_use).
- *
- * Context: Must be called with the drv->lock held.
- *
- * Return: true if the given TCS is free.
- */
-static bool tcs_is_free(struct rsc_drv *drv, int tcs_id)
-{
-   return !test_bit(tcs_id, drv->tcs_in_use);
-}
-
 /**
  * tcs_invalidate() - Invalidate all TCSes of the given type (sleep or wake).
  * @drv:  The RSC controller.
@@ -512,7 +496,7 @@ static void __tcs_buffer_write(struct rsc_drv *drv, int tcs_id, int cmd_id,
  *
  * Return: 0 if nothing in flight or -EBUSY if we should try again later.
  * The caller must re-enable interrupts between tries since that's
- * the only way tcs_is_free() will ever return true and the only way
+ * the only way tcs_in_use will ever be updated and the only way
  * RSC_DRV_CMD_ENABLE will ever be cleared.
  */
 static int check_for_req_inflight(struct rsc_drv *drv, struct tcs_group *tcs,
@@ -520,17 +504,14 @@ static int check_for_req_inflight(struct rsc_drv *drv, struct tcs_group *tcs,
 {
unsigned long curr_enabled;
u32 addr;
-   int i, j, k;
-   int tcs_id = tcs->offset;
+   int j, k;
+   int i = tcs->offset;
 
-   for (i = 0; i < tcs->num_tcs; i++, tcs_id++) {
-   if (tcs_is_free(drv, tcs_id))
-   continue;
-
-   curr_enabled = read_tcs_reg(drv, RSC_DRV_CMD_ENABLE, tcs_id);
+   for_each_set_bit_from(i, drv->tcs_in_use, tcs->offset + tcs->num_tcs) {
+   curr_enabled = read_tcs_reg(drv, RSC_DRV_CMD_ENABLE, i);
 
for_each_set_bit(j, &curr_enabled, MAX_CMDS_PER_TCS) {
-   addr = read_tcs_cmd(drv, RSC_DRV_CMD_ADDR, tcs_id, j);
+   addr = read_tcs_cmd(drv, RSC_DRV_CMD_ADDR, i, j);
for (k = 0; k < msg->num_cmds; k++) {
if (addr == msg->cmds[k].addr)
return -EBUSY;
@@ -548,18 +529,19 @@ static int check_for_req_inflight(struct rsc_drv *drv, struct tcs_group *tcs,
  *
  * Must be called with the drv->lock held since that protects tcs_in_use.
  *
- * Return: The first tcs that's free.
+ * Return: The first tcs that's free or -EBUSY if all in use.
  */
 static int find_free_tcs(struct tcs_group *tcs)
 {
-   int i;
+   const struct rsc_drv *drv = tcs->drv;
+   unsigned long i;
+   unsigned long max = tcs->offset + tcs->num_tcs;
 
-   for (i = 0; i < tcs->num_tcs; i++) {
-   if (tcs_is_free(tcs->drv, tcs->offset + i))
-   return tcs->offset + i;
-   }
+   i = find_next_zero_bit(drv->tcs_in_use, max, tcs->offset);
+   if (i >= max)
+   return -EBUSY;
 
-   return -EBUSY;
+   return i;
 }
 
 /**
@@ -757,8 +739,9 @@ int rpmh_rsc_write_ctrl_data(struct rsc_drv *drv, const struct tcs_request *msg)
  */
 static bool rpmh_rsc_ctrlr_is_busy(struct rsc_drv *drv)
 {
-   int m;
-   struct tcs_group *tcs = &drv->tcs[ACTIVE_TCS];
+   unsigned long set;
+   const struct tcs_group *tcs = &drv->tcs[ACTIVE_TCS];
+   unsigned long max;
 
/*
 * If we made an active request on a RSC that does not have a
@@ -769,12 +752,10 @@ static bool rpmh_rsc_ctrlr_is_busy(struct rsc_drv *drv)
if (!tcs->num_tcs)
tcs = &drv->tcs[WAKE_TCS];
 
-   for (m = tcs->offset; m < tcs->offset + tcs->num_tcs; m++) {
-   if (!tcs_is_free(drv, m))
-   return true;
-   }
+   max = tcs->offset + tcs->num_tcs;
+   set = find_next_bit(drv->tcs_in_use, max, tcs->offset);
 
-   return false;
+   return set < max;
 }
 
 /**
-- 
Sent by a computer, using git, on the internet

RE: [PATCH v1 2/2] arm64: dts: imx8mn-ddr4-evk: correct ldo1/ldo2 voltage range

2020-05-20 Thread Peng Fan

> Subject: [PATCH v1 2/2] arm64: dts: imx8mn-ddr4-evk: correct ldo1/ldo2
> voltage range
> 
> Correct ldo1 voltage range from wrong high group(3.0v~3.3v) to low group
> (1.6v~1.9v) because the ldo1 should be 1.8v. Actually, two voltage groups
> have been supported at bd718x7-regulator driver, hence, just corrrect the
> voltage range to 1.6v~3.3v. For ldo2@0.8v, correct voltage range too.
> Otherwise, ldo1 would be kept @3.0v and ldo2@0.9v which violate i.mx8mm
> datasheet as the below warning log in kernel:
> 
> [0.995524] LDO1: Bringing 180uV into 300-300uV
> [0.999196] LDO2: Bringing 80uV into 90-90uV
> 
> Signed-off-by: Robin Gong 
> ---
>  arch/arm64/boot/dts/freescale/imx8mn-ddr4-evk.dts | 4 ++--
>  arch/arm64/boot/dts/freescale/imx8mn-evk.dts  | 9 +
>  2 files changed, 11 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm64/boot/dts/freescale/imx8mn-ddr4-evk.dts
> b/arch/arm64/boot/dts/freescale/imx8mn-ddr4-evk.dts
> index d07e0e6..a1e5483 100644
> --- a/arch/arm64/boot/dts/freescale/imx8mn-ddr4-evk.dts
> +++ b/arch/arm64/boot/dts/freescale/imx8mn-ddr4-evk.dts
> @@ -113,7 +113,7 @@
> 
>   ldo1_reg: LDO1 {
>   regulator-name = "LDO1";
> - regulator-min-microvolt = <300>;
> + regulator-min-microvolt = <160>;
>   regulator-max-microvolt = <330>;
>   regulator-boot-on;
>   regulator-always-on;
> @@ -121,7 +121,7 @@
> 
>   ldo2_reg: LDO2 {
>   regulator-name = "LDO2";
> - regulator-min-microvolt = <90>;
> + regulator-min-microvolt = <80>;
>   regulator-max-microvolt = <90>;
>   regulator-boot-on;
>   regulator-always-on;
> diff --git a/arch/arm64/boot/dts/freescale/imx8mn-evk.dts
> b/arch/arm64/boot/dts/freescale/imx8mn-evk.dts
> index 61f3519..117ff4b 100644
> --- a/arch/arm64/boot/dts/freescale/imx8mn-evk.dts
> +++ b/arch/arm64/boot/dts/freescale/imx8mn-evk.dts
> @@ -13,6 +13,15 @@
>   compatible = "fsl,imx8mn-evk", "fsl,imx8mn";  };
> 
> +&ecspi1 {
> + status = "okay";
> +spidev0: spi@0 {
> + compatible = "ge,achc";
> + reg = <0>;
> + spi-max-frequency = <100>;
> + };
> +};
> +

This was added by mistake?

Regards,
Peng.

>  &A53_0 {
>   /delete-property/operating-points-v2;
>  };
> --
> 2.7.4

[PATCH v3] kvm/x86 : Remove redundant function implement

2020-05-20 Thread Richard

pic_in_kernel(), ioapic_in_kernel() and irqchip_kernel() have the
same implementation, so make the first two simply call irqchip_kernel().

Signed-off-by: Peng Hao 
---
 arch/x86/kvm/ioapic.h  |  8 ++--
 arch/x86/kvm/irq.h | 14 --
 arch/x86/kvm/lapic.c   |  1 +
 arch/x86/kvm/mmu/mmu.c |  1 +
 arch/x86/kvm/x86.c |  1 +
 5 files changed, 9 insertions(+), 16 deletions(-)

diff --git a/arch/x86/kvm/ioapic.h b/arch/x86/kvm/ioapic.h
index 2fb2e3c..7a3c53b 100644
--- a/arch/x86/kvm/ioapic.h
+++ b/arch/x86/kvm/ioapic.h
@@ -5,7 +5,7 @@
 #include 

 #include 
-
+#include "irq.h"
 struct kvm;
 struct kvm_vcpu;

@@ -108,11 +108,7 @@ do {\

 static inline int ioapic_in_kernel(struct kvm *kvm)
 {
-int mode = kvm->arch.irqchip_mode;
-
-/* Matches smp_wmb() when setting irqchip_mode */
-smp_rmb();
-return mode == KVM_IRQCHIP_KERNEL;
+return irqchip_kernel(kvm);
 }

 void kvm_rtc_eoi_tracking_restore_one(struct kvm_vcpu *vcpu);
diff --git a/arch/x86/kvm/irq.h b/arch/x86/kvm/irq.h
index f173ab6..e133c1a 100644
--- a/arch/x86/kvm/irq.h
+++ b/arch/x86/kvm/irq.h
@@ -16,7 +16,6 @@
 #include 

 #include 
-#include "ioapic.h"
 #include "lapic.h"

 #define PIC_NUM_PINS 16
@@ -66,15 +65,6 @@ void kvm_pic_destroy(struct kvm *kvm);
 int kvm_pic_read_irq(struct kvm *kvm);
 void kvm_pic_update_irq(struct kvm_pic *s);

-static inline int pic_in_kernel(struct kvm *kvm)
-{
-int mode = kvm->arch.irqchip_mode;
-
-/* Matches smp_wmb() when setting irqchip_mode */
-smp_rmb();
-return mode == KVM_IRQCHIP_KERNEL;
-}
-
 static inline int irqchip_split(struct kvm *kvm)
 {
 int mode = kvm->arch.irqchip_mode;
@@ -93,6 +83,10 @@ static inline int irqchip_kernel(struct kvm *kvm)
 return mode == KVM_IRQCHIP_KERNEL;
 }

+static inline int pic_in_kernel(struct kvm *kvm)
+{
+return irqchip_kernel(kvm);
+}
 static inline int irqchip_in_kernel(struct kvm *kvm)
 {
 int mode = kvm->arch.irqchip_mode;
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 9af25c9..de4d046 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -36,6 +36,7 @@
 #include 
 #include "kvm_cache_regs.h"
 #include "irq.h"
+#include "ioapic.h"
 #include "trace.h"
 #include "x86.h"
 #include "cpuid.h"
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 8071952..6133f69 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -16,6 +16,7 @@
  */

 #include "irq.h"
+#include "ioapic.h"
 #include "mmu.h"
 #include "x86.h"
 #include "kvm_cache_regs.h"
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index d786c7d..c8b62ac 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -18,6 +18,7 @@

 #include 
 #include "irq.h"
+#include "ioapic.h"
 #include "mmu.h"
 #include "i8254.h"
 #include "tss.h"
--
2.7.4


Re: [PATCH] kthread: Use TASK_IDLE state for newly created kernel threads

2020-05-20 Thread Greg Kroah-Hartman

On Thu, May 21, 2020 at 07:05:44AM +0530, Pavan Kondeti wrote:
> On Wed, May 20, 2020 at 08:18:58PM +0200, Greg Kroah-Hartman wrote:
> > On Wed, May 20, 2020 at 05:25:09PM +0530, Pavankumar Kondeti wrote:
> > > When kernel threads are created for later use, they will be in
> > > TASK_UNINTERRUPTIBLE state until they are woken up. This results
> > > in increased loadavg and false hung task reports. To fix this,
> > > use TASK_IDLE state instead of TASK_UNINTERRUPTIBLE when
> > > a kernel thread schedules out for the first time.
> > > 
> > > Signed-off-by: Pavankumar Kondeti 
> > > ---
> > >  kernel/kthread.c | 6 +++---
> > >  1 file changed, 3 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/kernel/kthread.c b/kernel/kthread.c
> > > index bfbfa48..b74ed8e 100644
> > > --- a/kernel/kthread.c
> > > +++ b/kernel/kthread.c
> > > @@ -250,7 +250,7 @@ static int kthread(void *_create)
> > >   current->vfork_done = &self->exited;
> > >  
> > >   /* OK, tell user we're spawned, wait for stop or wakeup */
> > > - __set_current_state(TASK_UNINTERRUPTIBLE);
> > > + __set_current_state(TASK_IDLE);
> > >   create->result = current;
> > >   /*
> > >* Thread is going to call schedule(), do not preempt it,
> > > @@ -428,7 +428,7 @@ static void __kthread_bind(struct task_struct *p, 
> > > unsigned int cpu, long state)
> > >  
> > >  void kthread_bind_mask(struct task_struct *p, const struct cpumask *mask)
> > >  {
> > > - __kthread_bind_mask(p, mask, TASK_UNINTERRUPTIBLE);
> > > + __kthread_bind_mask(p, mask, TASK_IDLE);
> > >  }
> > >  
> > >  /**
> > > @@ -442,7 +442,7 @@ void kthread_bind_mask(struct task_struct *p, const 
> > > struct cpumask *mask)
> > >   */
> > >  void kthread_bind(struct task_struct *p, unsigned int cpu)
> > >  {
> > > - __kthread_bind(p, cpu, TASK_UNINTERRUPTIBLE);
> > > + __kthread_bind(p, cpu, TASK_IDLE);
> > >  }
> > >  EXPORT_SYMBOL(kthread_bind);
> > 
> > It's as if people never read mailing lists:
> > 
> > https://lore.kernel.org/r/dm6pr11mb3531d3b164357b2dc476102ddf...@dm6pr11mb3531.namprd11.prod.outlook.com
> > 
> > Given that this is an identical resend of the previous patch, why are
> > you doing so, and what has changed since that original rejection?
> > 
> I did not know that it is attempted before. Thanks for pointing to the
> previous discussion. 
> 
> We have seen hung task reports from customers and it is due to a downstream
> change which create bunch of kernel threads for later use.

Do you have a pointer to that specific change?

> From Peter's reply, I understood that one must wake up the kthread
> after creation and put it in INTERRUPTIBLE sleep. I will pass on the
> message.

Just go fix that code, it sounds like it's in your tree already :)

thanks,

greg k-h

Re: [PATCHv3 4/5] Input: EXC3000: Add support to query model and fw_version

2020-05-20 Thread Dmitry Torokhov

On Wed, May 20, 2020 at 11:25:40PM +0200, Sebastian Reichel wrote:
> Hi,
> 
> On Wed, May 20, 2020 at 10:49:52AM -0700, Dmitry Torokhov wrote:
> > Hi Sebastian,
> > 
> > On Wed, May 20, 2020 at 05:39:35PM +0200, Sebastian Reichel wrote:
> > > Expose model and fw_version via sysfs. Also query the model
> > > in probe to make sure, that the I2C communication with the
> > > device works before successfully probing the driver.
> > > 
> > > This is a bit complicated, since EETI devices do not have
> > > a sync interface. Sending the commands and directly reading
> > > does not work. Sending the command and waiting for some time
> > > is also not an option, since there might be touch events in
> > > the mean time.
> > > 
> > > Last but not least we do not cache the results, since this
> > > interface can be used to check the I2C communication is still
> > > working as expected.
> > > 
> > > Signed-off-by: Sebastian Reichel 
> > > ---
> > >  .../ABI/testing/sysfs-driver-input-exc3000|  15 ++
> > >  drivers/input/touchscreen/exc3000.c   | 145 +-
> > >  2 files changed, 159 insertions(+), 1 deletion(-)
> > >  create mode 100644 Documentation/ABI/testing/sysfs-driver-input-exc3000
> > > 
> > > diff --git a/Documentation/ABI/testing/sysfs-driver-input-exc3000 
> > > b/Documentation/ABI/testing/sysfs-driver-input-exc3000
> > > new file mode 100644
> > > index ..d79da4f869af
> > > --- /dev/null
> > > +++ b/Documentation/ABI/testing/sysfs-driver-input-exc3000
> > > @@ -0,0 +1,15 @@
> > > +What:/sys/class/input/inputX/fw_version
> > > +Date:May 2020
> > > +Contact: linux-in...@vger.kernel.org
> > > +Description: Reports the firmware version provided by the 
> > > touchscreen, for example "00_T6" on a EXC80H60
> > > +
> > > + Access: Read
> > > + Valid values: Represented as string
> > > +
> > > +What:/sys/class/input/inputX/model
> > > +Date:May 2020
> > > +Contact: linux-in...@vger.kernel.org
> > > +Description: Reports the model identification provided by the 
> > > touchscreen, for example "Orion_1320" on a EXC80H60
> > > +
> > > + Access: Read
> > > + Valid values: Represented as string
> > 
> > These are properties of the controller (i2c device), not input
> > abstraction class on top of it, so the attributes should be attached to
> > i2c_client instance.
> > 
> > Please use devm_device_add_group() in probe to instantiate them at the
> > proper level.
> 
> As written in the cover letter using devm_device_add_group() in
> probe routine results in a udev race condition:
> 
> http://kroah.com/log/blog/2013/06/26/how-to-create-a-sysfs-file-correctly/

This race has been solved with the addition of KOBJ_BIND/KOBJ_UNBIND
uevents that signal when driver is bound or unbound from the device.
Granted, current systemd/udev drops them as it does not know how to
"add" to the device state, but this is on systemd to solve.

Thanks.

-- 
Dmitry

[RFC PATCH] optee: __optee_enumerate_devices() can be static

2020-05-20 Thread kbuild test robot



Signed-off-by: kbuild test robot 
---
 device.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/tee/optee/device.c b/drivers/tee/optee/device.c
index 8263b308efd56..d4931dad07aaa 100644
--- a/drivers/tee/optee/device.c
+++ b/drivers/tee/optee/device.c
@@ -87,7 +87,7 @@ static int optee_register_device(const uuid_t *device_uuid, 
u32 device_id)
return rc;
 }
 
-int __optee_enumerate_devices(u32 func)
+static int __optee_enumerate_devices(u32 func)
 {
const uuid_t pta_uuid =
UUID_INIT(0x7011a688, 0xddde, 0x4053,

Re: [PATCH 2/6] soc: ti: omap-prm: Add basic power domain support

2020-05-20 Thread kbuild test robot

Hi Tony,

I love your patch! Perhaps something to improve:

[auto build test WARNING on omap/for-next]
[also build test WARNING on robh/for-next keystone/next v5.7-rc6 next-20200519]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest to use '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]

url:
https://github.com/0day-ci/linux/commits/Tony-Lindgren/Add-initial-genpd-support-for-omap-PRM-driver/20200521-063328
base:   https://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap.git 
for-next
config: arm-defconfig (attached as .config)
compiler: arm-linux-gnueabi-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=arm 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kbuild test robot 

All warnings (new ones prefixed by >>, old ones prefixed by <<):

drivers/soc/ti/omap_prm.c: In function 'omap_prm_domain_detach_dev':
>> drivers/soc/ti/omap_prm.c:347:26: warning: variable 'prmd' set but not used 
>> [-Wunused-but-set-variable]
347 |  struct omap_prm_domain *prmd;
|  ^~~~
At top level:
drivers/soc/ti/omap_prm.c:123:21: warning: 'omap_prm_onoff_noauto' defined but 
not used [-Wunused-const-variable=]
123 | omap_prm_domain_map omap_prm_onoff_noauto = {
| ^
drivers/soc/ti/omap_prm.c:115:21: warning: 'omap_prm_nooff' defined but not 
used [-Wunused-const-variable=]
115 | omap_prm_domain_map omap_prm_nooff = {
| ^~
drivers/soc/ti/omap_prm.c:107:21: warning: 'omap_prm_noinact' defined but not 
used [-Wunused-const-variable=]
107 | omap_prm_domain_map omap_prm_noinact = {
| ^~~~
drivers/soc/ti/omap_prm.c:99:21: warning: 'omap_prm_all' defined but not used 
[-Wunused-const-variable=]
99 | omap_prm_domain_map omap_prm_all = {
| ^~~~

vim +/prmd +347 drivers/soc/ti/omap_prm.c

   342  
   343  static void omap_prm_domain_detach_dev(struct generic_pm_domain *domain,
   344 struct device *dev)
   345  {
   346  struct generic_pm_domain_data *genpd_data;
 > 347  struct omap_prm_domain *prmd;
   348  
   349  prmd = genpd_to_prm_domain(domain);
   350  
   351  genpd_data = dev_gpd_data(dev);
   352  genpd_data->data = NULL;
   353  }
   354  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org
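
The first warning in the report above is raised because 'prmd' is assigned but never read. The following is a minimal user-space sketch of one possible fix, which simply drops the unused local. The struct and function names are hypothetical stand-ins, not the real genpd definitions, and the actual driver fix may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; not the real definitions. */
struct pm_domain_data { void *data; };
struct device_model   { struct pm_domain_data gpd; };

/*
 * Fixed variant: the 'prmd' local that was set but never used is gone,
 * so -Wunused-but-set-variable has nothing to flag in a W=1 build.
 */
static void detach_dev(struct device_model *dev)
{
	assert(dev != NULL);
	dev->gpd.data = NULL;
}
```

If the domain pointer is needed later (for example for per-domain bookkeeping), keeping the variable and actually using it would be the alternative.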



Re: [PATCH] clk: qcom: gcc: Fix parent for gpll0_out_even

2020-05-20 Thread Bjorn Andersson

On Wed 20 May 22:27 PDT 2020, Vinod Koul wrote:

> Documentation says that gpll0 is parent of gpll0_out_even, somehow
> driver coded that as bi_tcxo, so fix it
> 
> Fixes: 2a1d7eb854bb ("clk: qcom: gcc: Add global clock controller driver for 
> SM8150")
> Reported-by: Jonathan Marek 
> Signed-off-by: Vinod Koul 

Reviewed-by: Bjorn Andersson 

Regards,
Bjorn

> ---
>  drivers/clk/qcom/gcc-sm8150.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/drivers/clk/qcom/gcc-sm8150.c b/drivers/clk/qcom/gcc-sm8150.c
> index 2bc08e7125bf..72524cf11048 100644
> --- a/drivers/clk/qcom/gcc-sm8150.c
> +++ b/drivers/clk/qcom/gcc-sm8150.c
> @@ -76,8 +76,7 @@ static struct clk_alpha_pll_postdiv gpll0_out_even = {
>   .clkr.hw.init = &(struct clk_init_data){
>   .name = "gpll0_out_even",
>   .parent_data = &(const struct clk_parent_data){
> - .fw_name = "bi_tcxo",
> - .name = "bi_tcxo",
> + .hw = &gpll0.clkr.hw,
>   },
>   .num_parents = 1,
>   .ops = &clk_trion_pll_postdiv_ops,
> -- 
> 2.25.4
>

Re: [PATCH 2/2] kvm/x86: don't expose MSR_IA32_UMWAIT_CONTROL unconditionally

2020-05-20 Thread Tao Xu





On 5/21/2020 12:33 PM, Xiaoyao Li wrote:

On 5/21/2020 5:05 AM, Paolo Bonzini wrote:

On 20/05/20 18:07, Maxim Levitsky wrote:

This msr is only available when the host supports WAITPKG feature.

This breaks a nested guest, if the L1 hypervisor is set to ignore
unknown msrs, because the only other safety check that the
kernel does is that it attempts to read the msr and
rejects it if it gets an exception.

Fixes: 6e3ba4abce KVM: vmx: Emulate MSR IA32_UMWAIT_CONTROL

Signed-off-by: Maxim Levitsky 
---
  arch/x86/kvm/x86.c | 4 
  1 file changed, 4 insertions(+)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index fe3a24fd6b263..9c507b32b1b77 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5314,6 +5314,10 @@ static void kvm_init_msr_list(void)
  if (msrs_to_save_all[i] - MSR_ARCH_PERFMON_EVENTSEL0 >=
  min(INTEL_PMC_MAX_GENERIC, x86_pmu.num_counters_gp))
  continue;
+    break;
+    case MSR_IA32_UMWAIT_CONTROL:
+    if (!kvm_cpu_cap_has(X86_FEATURE_WAITPKG))
+    continue;
  default:
  break;
  }


The patch is correct, and matches what is done for the other entries of
msrs_to_save_all.  However, while looking at it I noticed that
X86_FEATURE_WAITPKG is actually never added, and that is because it was
also not added to the supported CPUID in commit e69e72faa3a0 ("KVM: x86:
Add support for user wait instructions", 2019-09-24), which was before
the kvm_cpu_cap mechanism was added.

So while at it you should also fix that.  The right way to do that is to
add a

 if (vmx_waitpkg_supported())
 kvm_cpu_cap_check_and_set(X86_FEATURE_WAITPKG);


+ Tao

I remember there is certainly some reason why we don't expose WAITPKG to 
guest by default.


Tao, please help clarify it.

Thanks,
-Xiaoyao



Because in a VM, umwait and tpause can put a (physical) CPU into a power 
saving state. So from the host's view, this CPU will be 100% used by the VM. 
Although umwait and tpause just cause a short wait (maybe 100 
microseconds), we still do not want to unconditionally expose WAITPKG in a VM.
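
The filtering idea in the quoted patch (drop an MSR from the save list when the host lacks the matching feature) can be sketched in user space as below. This is only a model under stated assumptions: build_msr_list() and its host_has_waitpkg parameter are hypothetical, and MSR_IA32_TSC is used purely as a filler entry that is always kept. Unlike the quoted hunk, the new case ends with an explicit break instead of relying on fall-through into default.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MSR_IA32_TSC            0x10u
#define MSR_IA32_UMWAIT_CONTROL 0xe1u

/*
 * Copy into 'out' every MSR from 'all' that the (modelled) host supports.
 * Returns the number of entries written. Hypothetical helper, not KVM code.
 */
static size_t build_msr_list(const uint32_t *all, size_t n,
			     bool host_has_waitpkg, uint32_t *out)
{
	size_t kept = 0;

	assert(all != NULL && out != NULL);

	for (size_t i = 0; i < n; i++) {
		switch (all[i]) {
		case MSR_IA32_UMWAIT_CONTROL:
			if (!host_has_waitpkg)
				continue;	/* feature absent: skip this MSR */
			break;			/* explicit break, no fall-through */
		default:
			break;
		}
		out[kept++] = all[i];
	}
	return kept;
}
```

Note that 'continue' inside the switch applies to the enclosing for loop, which is what lets a single case veto an entry.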

[PATCH] platform: cros_ec_debugfs: control uptime information request

2020-05-20 Thread Gwendal Grignou

When the EC does not support the uptime command (EC_CMD_GET_UPTIME_INFO),
return -EPROTO on reads of /sys/kernel/debug/cros_ec/uptime without
calling the EC again after the first attempt.

The EC console log will not contain EC_CMD_GET_UPTIME_INFO anymore.

Signed-off-by: Gwendal Grignou 
---
 drivers/platform/chrome/cros_ec_debugfs.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/drivers/platform/chrome/cros_ec_debugfs.c 
b/drivers/platform/chrome/cros_ec_debugfs.c
index 6ae484989d1f5..70a29afb6d9e7 100644
--- a/drivers/platform/chrome/cros_ec_debugfs.c
+++ b/drivers/platform/chrome/cros_ec_debugfs.c
@@ -49,6 +49,8 @@ struct cros_ec_debugfs {
struct delayed_work log_poll_work;
/* EC panicinfo */
struct debugfs_blob_wrapper panicinfo_blob;
+   /* EC uptime */
+   bool uptime_supported;
 };
 
 /*
@@ -256,12 +258,19 @@ static ssize_t cros_ec_uptime_read(struct file *file, 
char __user *user_buf,
char read_buf[32];
int ret;
 
+   if (!debug_info->uptime_supported)
+   return -EPROTO;
+
resp = (struct ec_response_uptime_info *)&msg.resp;
 
msg.cmd.command = EC_CMD_GET_UPTIME_INFO;
msg.cmd.insize = sizeof(*resp);
 
ret = cros_ec_cmd_xfer_status(ec_dev, &msg.cmd);
+   if (ret == -EPROTO && msg.cmd.result == EC_RES_INVALID_COMMAND) {
+   debug_info->uptime_supported = false;
+   return ret;
+   }
if (ret < 0)
return ret;
 
@@ -434,6 +443,9 @@ static int cros_ec_debugfs_probe(struct platform_device *pd)
debug_info->ec = ec;
debug_info->dir = debugfs_create_dir(name, NULL);
 
+   /* Give uptime a chance to run. */
+   debug_info->uptime_supported = true;
+
ret = cros_ec_create_panicinfo(debug_info);
if (ret)
goto remove_debugfs;
-- 
2.26.2.761.g0e0b3e54be-goog
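
The "try once, then remember the command is unsupported" behaviour of the patch above can be modelled in user space as follows. This is a sketch only: struct fake_ec and fake_ec_get_uptime() are hypothetical stand-ins for the real EC transfer path, and the fake always reports the command as unknown.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the EC: counts how often it is contacted. */
struct fake_ec { int xfer_calls; };

struct debug_info {
	struct fake_ec *ec;
	bool uptime_supported;	/* set to true at probe time */
};

/* Models an EC that answers the uptime command with "invalid command". */
static int fake_ec_get_uptime(struct fake_ec *ec)
{
	ec->xfer_calls++;
	return -EPROTO;
}

static int uptime_read(struct debug_info *info)
{
	int ret;

	assert(info != NULL && info->ec != NULL);

	if (!info->uptime_supported)
		return -EPROTO;		/* fail fast, no EC traffic */

	ret = fake_ec_get_uptime(info->ec);
	if (ret == -EPROTO)
		info->uptime_supported = false;	/* remember for later reads */
	return ret;
}
```

With this shape, only the first read reaches the EC; later reads return the same error without generating EC console noise.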

[PATCH] clk: qcom: gcc: Fix parent for gpll0_out_even

2020-05-20 Thread Vinod Koul

Documentation says that gpll0 is parent of gpll0_out_even, somehow
driver coded that as bi_tcxo, so fix it

Fixes: 2a1d7eb854bb ("clk: qcom: gcc: Add global clock controller driver for 
SM8150")
Reported-by: Jonathan Marek 
Signed-off-by: Vinod Koul 
---
 drivers/clk/qcom/gcc-sm8150.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/clk/qcom/gcc-sm8150.c b/drivers/clk/qcom/gcc-sm8150.c
index 2bc08e7125bf..72524cf11048 100644
--- a/drivers/clk/qcom/gcc-sm8150.c
+++ b/drivers/clk/qcom/gcc-sm8150.c
@@ -76,8 +76,7 @@ static struct clk_alpha_pll_postdiv gpll0_out_even = {
.clkr.hw.init = &(struct clk_init_data){
.name = "gpll0_out_even",
.parent_data = &(const struct clk_parent_data){
-   .fw_name = "bi_tcxo",
-   .name = "bi_tcxo",
+   .hw = &gpll0.clkr.hw,
},
.num_parents = 1,
.ops = &clk_trion_pll_postdiv_ops,
-- 
2.25.4

Re: Re: [PATCH] Input: omap-keypad - fix runtime pm imbalance on error

2020-05-20 Thread dinghao . liu

Fixing this in the PM core will influence all callers of pm_runtime_get_sync().
Therefore I think the better solution is to fix its misused callers.

Regards,
Dinghao

"Dmitry Torokhov" wrote:
> Hi Dinghao,
> 
> On Wed, May 20, 2020 at 6:35 AM Dinghao Liu  wrote:
> >
> > pm_runtime_get_sync() increments the runtime PM usage counter even
> > when the call returns an error code. Thus a pairing decrement is needed
> > on the error handling path to keep the counter balanced.
> 
> This is a very surprising behavior and I wonder if this should be
> fixed in the PM core (or the required cleanup steps need to be called
> out in the function description). I also see that a few drivers that
> handle this situation correctly (?) call pm_runtime_put_noidle()
> instead of pm_runtime_put_sync() in the error path.
> 
> Rafael, do you have any guidance here?
> 
> Thanks.
> 
> -- 
> Dmitry

[PATCH v1 1/1] drivers: mtd: spi-nor: update read capabilities for w25q64 and s25fl064k

2020-05-20 Thread Rayagonda Kokatanur

Both w25q64 and s25fl064k NOR flash support the QUAD and DUAL read
commands, hence update the flash_info table accordingly.

Signed-off-by: Rayagonda Kokatanur 
---
 drivers/mtd/spi-nor/spansion.c | 3 ++-
 drivers/mtd/spi-nor/winbond.c  | 3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/mtd/spi-nor/spansion.c b/drivers/mtd/spi-nor/spansion.c
index 6756202ace4b..c91bbb8d9cd6 100644
--- a/drivers/mtd/spi-nor/spansion.c
+++ b/drivers/mtd/spi-nor/spansion.c
@@ -52,7 +52,8 @@ static const struct flash_info spansion_parts[] = {
 SECT_4K | SPI_NOR_DUAL_READ | SPI_NOR_QUAD_READ) },
{ "s25fl016k",  INFO(0xef4015,  0,  64 * 1024,  32,
 SECT_4K | SPI_NOR_DUAL_READ | SPI_NOR_QUAD_READ) },
-   { "s25fl064k",  INFO(0xef4017,  0,  64 * 1024, 128, SECT_4K) },
+   { "s25fl064k",  INFO(0xef4017,  0,  64 * 1024, 128,
+SECT_4K | SPI_NOR_DUAL_READ | SPI_NOR_QUAD_READ) },
{ "s25fl116k",  INFO(0x014015,  0,  64 * 1024,  32,
 SECT_4K | SPI_NOR_DUAL_READ | SPI_NOR_QUAD_READ) },
{ "s25fl132k",  INFO(0x014016,  0,  64 * 1024,  64, SECT_4K) },
diff --git a/drivers/mtd/spi-nor/winbond.c b/drivers/mtd/spi-nor/winbond.c
index 17deabad57e1..2028cab3eff9 100644
--- a/drivers/mtd/spi-nor/winbond.c
+++ b/drivers/mtd/spi-nor/winbond.c
@@ -39,7 +39,8 @@ static const struct flash_info winbond_parts[] = {
SECT_4K | SPI_NOR_DUAL_READ | SPI_NOR_QUAD_READ |
SPI_NOR_HAS_LOCK | SPI_NOR_HAS_TB) },
{ "w25x64", INFO(0xef3017, 0, 64 * 1024, 128, SECT_4K) },
-   { "w25q64", INFO(0xef4017, 0, 64 * 1024, 128, SECT_4K) },
+   { "w25q64", INFO(0xef4017, 0, 64 * 1024, 128,
+SECT_4K | SPI_NOR_DUAL_READ | SPI_NOR_QUAD_READ) },
{ "w25q64dw", INFO(0xef6017, 0, 64 * 1024, 128,
   SECT_4K | SPI_NOR_DUAL_READ | SPI_NOR_QUAD_READ |
   SPI_NOR_HAS_LOCK | SPI_NOR_HAS_TB) },
-- 
2.17.1
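
To show why the capability flags matter, here is a simplified user-space model of picking a read width from a flash entry's flags. The flag values and pick_read_width() are hypothetical; the real selection logic in the spi-nor core is more involved and uses different bit assignments.

```c
#include <assert.h>

/* Hypothetical flag values; the kernel's flash_info flags use other bits. */
#define SECT_4K            (1u << 0)
#define SPI_NOR_DUAL_READ  (1u << 1)
#define SPI_NOR_QUAD_READ  (1u << 2)

/*
 * Number of data lines used for reads, given what the flash entry
 * advertises and the widest bus the controller can drive (1, 2 or 4).
 */
static unsigned int pick_read_width(unsigned int flash_flags,
				    unsigned int controller_max_width)
{
	assert(controller_max_width == 1 || controller_max_width == 2 ||
	       controller_max_width == 4);

	if ((flash_flags & SPI_NOR_QUAD_READ) && controller_max_width >= 4)
		return 4;
	if ((flash_flags & SPI_NOR_DUAL_READ) && controller_max_width >= 2)
		return 2;
	return 1;
}
```

In this model, an entry that carries only SECT_4K always falls back to single-line reads, even on a quad-capable controller.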

Re: [PATCH v3 03/14] remoteproc: Add new operation and flags for synchronistation

2020-05-20 Thread Bjorn Andersson

On Wed 20 May 15:06 PDT 2020, Mathieu Poirier wrote:

> On Mon, May 18, 2020 at 05:55:00PM -0700, Bjorn Andersson wrote:
> > On Fri 15 May 12:24 PDT 2020, Mathieu Poirier wrote:
> > 
> > > Good day Bjorn,
> > > 
> > > On Wed, May 13, 2020 at 06:32:24PM -0700, Bjorn Andersson wrote:
> > > > On Fri 08 May 14:01 PDT 2020, Mathieu Poirier wrote:
> > > > 
> > > > > On Tue, May 05, 2020 at 05:22:53PM -0700, Bjorn Andersson wrote:
> > > > > > On Fri 24 Apr 13:01 PDT 2020, Mathieu Poirier wrote:
[..]
> > > > > > > + bool after_crash;
> > > > > > 
> > > > > > Similarly what is the expected steps to be taken by the core when 
> > > > > > this
> > > > > > is true? Should rproc_report_crash() simply stop/start the 
> > > > > > subdevices
> > > > > > and upon one of the ops somehow tell the remote controller that it 
> > > > > > can
> > > > > > proceed with the recovery?
> > > > > 
> > > > > The exact same sequence of steps will be carried out as they are 
> > > > > today, except
> > > > > that if after_crash == true, the remoteproc core won't be switching 
> > > > > the remote
> > > > > processor on, exactly as it would do when on_init == true.
> > > > > 
> > > > 
> > > > Just to make sure we're on the same page:
> > > > 
> > > > after_crash = false is what we have today, and would mean:
> > > > 
> > > > 1) stop subdevices
> > > > 2) power off
> > > > 3) unprepare subdevices
> > > > 4) generate coredump
> > > > 5) request firmware
> > > > 6) load segments
> > > > 7) find resource table
> > > > 8) prepare subdevices
> > > > 9) "boot"
> > > > 10) start subdevices
> > > 
> > > Exactly
> > > 
> > > > 
> > > > after_crash = true would mean:
> > > > 
> > > > 1) stop subdevices
> > > > 2) "detach"
> > > > 3) unprepare subdevices
> > > > 4) prepare subdevices
> > > > 5) "attach"
> > > > 6) start subdevices
> > > >
> > > 
> > > Yes
> > >  
> > > > State diagram wise both of these would represent the transition RUNNING
> > > > -> CRASHED -> RUNNING, but somehow the platform driver needs to be able
> > > > to specify which of these sequences to perform. Per your naming
> > > > suggestion above, this does sound like a "autonomous_recovery" boolean
> > > > to me.
> > > 
> > > Right, semantically "rproc->autonomous" would apply quite well.
> > > 
> > > In function rproc_crash_handler_work(), a call to rproc_set_sync_flag() 
> > > has been
> > > strategically placed to set the value of rproc->autonomous based on
> > > "after_crash".  From there the core knows which rproc_ops to use.  Here 
> > > too we
> > > have to rely on the rproc_ops provided by the platform to do the right 
> > > thing
> > > based on the scenario to enact.
> > > 
> > 
> > Do you think that autonomous_recovery would be something that changes
> > for a given remoteproc instance? I envisioned it as something that you
> > know at registration time, but perhaps I'm missing some details here.
> 
> I don't envision any of the transition flags to change once they are set by 
> the
> platform.   The same applies to the new rproc_ops, it can be set only once.
> Otherwise combination of possible scenarios becomes too hard to manage, 
> leading
> to situations where the core and MCU get out of sync and can't talk to each
> other.
> 

Sounds good, I share this expectation, just wanted to check with you.
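
The two recovery sequences agreed on in the quoted exchange can be written down as data. The sketch below is a user-space model only: the enum names and recovery_steps() are hypothetical, and the 'autonomous' flag stands for the after_crash / autonomous_recovery choice made once by the platform driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum step {
	STOP_SUBDEVS, POWER_OFF, DETACH, UNPREPARE_SUBDEVS, COREDUMP,
	REQUEST_FW, LOAD_SEGMENTS, FIND_RSC_TABLE, PREPARE_SUBDEVS,
	BOOT, ATTACH, START_SUBDEVS,
};

#define MAX_STEPS 10

/*
 * Fill 'out' with the crash-recovery steps for the chosen mode and
 * return how many steps were written. Hypothetical helper.
 */
static size_t recovery_steps(bool autonomous, enum step out[MAX_STEPS])
{
	/* after_crash == false: the core fully restarts the remote processor. */
	static const enum step full[] = {
		STOP_SUBDEVS, POWER_OFF, UNPREPARE_SUBDEVS, COREDUMP,
		REQUEST_FW, LOAD_SEGMENTS, FIND_RSC_TABLE, PREPARE_SUBDEVS,
		BOOT, START_SUBDEVS,
	};
	/* after_crash == true: the core only detaches and re-attaches. */
	static const enum step sync[] = {
		STOP_SUBDEVS, DETACH, UNPREPARE_SUBDEVS, PREPARE_SUBDEVS,
		ATTACH, START_SUBDEVS,
	};
	const enum step *src = autonomous ? sync : full;
	size_t n = autonomous ? sizeof(sync) / sizeof(sync[0])
			      : sizeof(full) / sizeof(full[0]);

	assert(n <= MAX_STEPS);
	for (size_t i = 0; i < n; i++)
		out[i] = src[i];
	return n;
}
```

Both sequences start and end with the subdevice stop/start pair; only the middle differs, which matches the RUNNING -> CRASHED -> RUNNING view in the thread.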

> > 
> > > > 
> > > > > These flags are there to indicate how to set rproc::sync_with_rproc 
> > > > > after
> > > > > different events, that is when the remoteproc core boots, when the 
> > > > > remoteproc
> > > > > has been stopped or when it has crashed.
> > > > > 
> > > > 
> > > > Right, that was clear from your patches. Sorry that my reply didn't
> > > > convey the information that I had understood this.
> > > > 
> > > > > > 
> > > > > > > +};
> > > > > > > +
> > > > > > >  /**
> > > > > > >   * struct rproc_ops - platform-specific device handlers
> > > > > > >   * @start:   power on the device and boot it
> > > > > > > @@ -459,6 +476,9 @@ struct rproc_dump_segment {
> > > > > > >   * @firmware: name of firmware file to be loaded
> > > > > > >   * @priv: private data which belongs to the platform-specific 
> > > > > > > rproc module
> > > > > > >   * @ops: platform-specific start/stop rproc handlers
> > > > > > > + * @sync_ops: platform-specific start/stop rproc handlers when
> > > > > > > + * synchronising with a remote processor.
> > > > > > > + * @sync_flags: Determine the rproc_ops to choose in specific 
> > > > > > > states.
> > > > > > >   * @dev: virtual device for refcounting and common remoteproc 
> > > > > > > behavior
> > > > > > >   * @power: refcount of users who need this rproc powered up
> > > > > > >   * @state: state of the device
> > > > > > > @@ -482,6 +502,7 @@ struct rproc_dump_segment {
> > > > > > >   * @table_sz: size of @cached_table
> > > > > > >   * @has_iommu: flag to indicate if remote processor is behind an 
> > > > > > > MMU
> > > > > > >   * @auto_boot: flag to indicate if remote processor should be 
> > > > > > > auto-s

Re: [RFC][PATCH 3/5] thermal: Add support for setting notification thresholds

2020-05-20 Thread Amit Kucheria

Hi Srinivas,

On Wed, May 20, 2020 at 11:46 PM Srinivas Pandruvada
 wrote:
>
> On Wed, 2020-05-20 at 09:58 +0530, Amit Kucheria wrote:
> > On Tue, May 19, 2020 at 5:10 AM Srinivas Pandruvada
> >  wrote:
> > > On Mon, 2020-05-18 at 18:37 +0200, Daniel Lezcano wrote:
> > > > On 04/05/2020 20:16, Srinivas Pandruvada wrote:
> > > > > > Add new attributes in thermal sysfs when a thermal driver
> > > > > > provides
> > > > > > callbacks for them and CONFIG_THERMAL_USER_EVENT_INTERFACE is
> > > > > > defined.
> > > > > >
> > > > > > These attributes allow user space to stop polling for
> > > > > > temperature.
> > > > > >
> > > > > > These attributes are:
> > > > > > - temp_thres_low: Specify a notification temperature for a low
> > > > > > temperature threshold event.
> > > > > > - temp_thres_high: Specify a notification temperature for a high
> > > > > > temperature threshold event.
> > > > > > - temp_thres_hyst: Specify a change in temperature to send
> > > > > > notification
> > > > > > again.
> > > > >
> > > > > This is implemented by adding additional sysfs attribute group.
> > > > > The
> > > > > changes in this patch are trivial to add new attributes in
> > > > > thermal
> > > > > sysfs as done for other attributes.
> > > >
> > > > Isn't it duplicate with the trip point?
> > > A trip point is where an in-kernel governor takes some action. This
> > > is
> > > not same as a notification temperature. For example at trip point
> > > configured by ACPI at 85C, the thermal governor may start
> > > aggressive
> > > throttling.
> > > But a user space can set a notification threshold at 80C and start
> > > some
> > > active controls like activate some fan to reduce the impact of
> > > passive
> > > control on performance.
> >
> > Then what is the use of thermal trip type "ACTIVE" ?
> This is an example.
> The defaults are set by the OEMs via ACPI. User can't modify that if
> they want to optimize for their usage on Linux. There are fan control
> daemons which users run on top.

-ENOPARSE. Are you saying users "can" modify these?

In any case, how is what you described earlier not possible with an
ACTIVE trip point directly wired to the fan as a cooling device or
with a HOT trip point that causes the platform driver to send
notification to userspace where a fan control daemon can do what it
needs to?

Basically, I think the issue of polling is orthogonal to the
introduction of the new attributes introduced in this patch and I
don't understand the reason for these attributes from your commit
description.
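
For reference, the threshold attributes named in the quoted commit message could behave roughly as modelled below. This is a sketch built on assumptions: the quoted text does not spell out the exact semantics, so treating temp_thres_hyst as "minimum change since the last notification" is a guess, and struct thresholds and should_notify() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical model of the three attributes, in millidegrees Celsius. */
struct thresholds {
	int low;
	int high;
	int hyst;
	bool notified;		/* a notification has been sent before */
	int last_temp;		/* temperature at the last notification */
};

/*
 * Returns true when user space should be notified for 'temp'.
 * Assumed semantics: notify when outside [low, high], and notify again
 * only after the temperature moved by at least 'hyst'.
 */
static bool should_notify(struct thresholds *t, int temp)
{
	bool outside = temp <= t->low || temp >= t->high;

	assert(t->hyst >= 0);

	if (!outside)
		return false;
	if (t->notified && abs(temp - t->last_temp) < t->hyst)
		return false;

	t->notified = true;
	t->last_temp = temp;
	return true;
}
```

Whether this is distinct enough from trip points is exactly the question raised in the reply above; the model only illustrates the notification side.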

> > > We need a way to distinguish between temperature notification
> > > threshold
> > > and actual trip point. Changing a trip point means that user wants
> > > kernel to throttle at temperature.
>

RE: [PATCH 2/2] soundwire: intel: transition to 3 steps initialization

2020-05-20 Thread Liao, Bard

> -Original Message-
> From: Vinod Koul 
> Sent: Thursday, May 21, 2020 12:37 PM
> To: Liao, Bard 
> Cc: Bard Liao ; alsa-de...@alsa-project.org;
> linux-kernel@vger.kernel.org; ti...@suse.de; broo...@kernel.org;
> gre...@linuxfoundation.org; j...@cadence.com;
> srinivas.kandaga...@linaro.org; rander.w...@linux.intel.com;
> ranjani.sridha...@linux.intel.com; hui.w...@canonical.com; pierre-
> louis.boss...@linux.intel.com; Kale, Sanyog R ;
> Blauciak, Slawomir ; Lin, Mengdong
> 
> Subject: Re: [PATCH 2/2] soundwire: intel: transition to 3 steps 
> initialization
> 
> On 21-05-20, 02:23, Liao, Bard wrote:
> > > -Original Message-
> > > From: Vinod Koul 
> > > Sent: Wednesday, May 20, 2020 9:54 PM
> > > To: Bard Liao 
> > > Cc: alsa-de...@alsa-project.org; linux-kernel@vger.kernel.org;
> > > ti...@suse.de; broo...@kernel.org; gre...@linuxfoundation.org;
> > > j...@cadence.com; srinivas.kandaga...@linaro.org;
> > > rander.w...@linux.intel.com; ranjani.sridha...@linux.intel.com;
> > > hui.w...@canonical.com; pierre- louis.boss...@linux.intel.com; Kale,
> > > Sanyog R ; Blauciak, Slawomir
> > > ; Lin, Mengdong
> > > ; Liao, Bard 
> > > Subject: Re: [PATCH 2/2] soundwire: intel: transition to 3 steps
> > > initialization
> > >
> > > On 20-05-20, 03:19, Bard Liao wrote:
> > > > From: Pierre-Louis Bossart 
> > > >
> > > > Rather than a plain-vanilla init/exit, this patch provides 3 steps
> > > > in the initialization (ACPI scan, probe, startup) which makes it
> > > > easier to detect platform support for SoundWire, allocate required
> > > > resources as early as possible, and conversely help make the
> > > > startup() callback lighter-weight with only hardware register setup.
> > >
> > > Okay but can you add details in changelog on what each step would do?
> >
> > Sure. Will do.
> >
> > >
> > > > @@ -1134,25 +1142,15 @@ static int intel_probe(struct
> > > > platform_device
> > > *pdev)
> > > >
> > > > intel_pdi_ch_update(sdw);
> > > >
> > > > -   /* Acquire IRQ */
> > > > -   ret = request_threaded_irq(sdw->link_res->irq,
> > > > -  sdw_cdns_irq, sdw_cdns_thread,
> > > > -  IRQF_SHARED, KBUILD_MODNAME, &sdw-
> > > >cdns);
> > >
> > > This is removed here but not added anywhere else, do we have no irq
> > > after this patch?
> >
> > We use a single irq for all Intel Audio DSP events and it will be
> > requested in the SOF driver.
> 
> And how will the irq be propagated to sdw/cdns drivers here?

We export the handler and call it from the SOF driver.

> 
> --
> ~Vinod

powerpc64-linux-ld: mm/page_alloc.o:undefined reference to `node_reclaim_distance'

2020-05-20 Thread kbuild test robot

tree:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 
master
head:   b85051e755b0e9d6dd8f17ef1da083851b83287d
commit: a55c7454a8c887b226a01d7eed088ccb5374d81e sched/topology: Improve load 
balancing on AMD EPYC systems
date:   9 months ago
config: powerpc-randconfig-c004-20200520 (attached as .config)
compiler: powerpc64-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
git checkout a55c7454a8c887b226a01d7eed088ccb5374d81e
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross 
ARCH=powerpc 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kbuild test robot 

All errors (new ones prefixed by >>, old ones prefixed by <<):

powerpc64-linux-ld: warning: orphan section `.gnu.hash' from `linker stubs' 
being placed in section `.gnu.hash'
>> powerpc64-linux-ld: mm/page_alloc.o:(.toc+0x0): undefined reference to 
>> `node_reclaim_distance'

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org



Re: Re: [PATCH] wlcore: fix runtime pm imbalance in wl1271_op_suspend

2020-05-20 Thread dinghao . liu

There is a check against ret after the out_sleep label. If
wl1271_configure_suspend_ap() returns an error code, ret will be caught by
this check and a warning will be issued.


"Tony Lindgren" wrote:
> * Dinghao Liu  [200520 12:58]:
> > When wlcore_hw_interrupt_notify() returns an error code,
> > a pairing runtime PM usage counter decrement is needed to
> > keep the counter balanced.
> 
> We should probably keep the warning though, nothing will
> get shown for wl1271_configure_suspend_ap() errors.
> 
> Otherwise looks good to me.
> 
> Regards,
> 
> Tony

Re: linux-next: manual merge of the rcu tree with the powerpc tree

2020-05-20 Thread Stephen Rothwell

Hi all,

On Tue, 19 May 2020 17:23:16 +1000 Stephen Rothwell  
wrote:
>
> Today's linux-next merge of the rcu tree got a conflict in:
> 
>   arch/powerpc/kernel/traps.c
> 
> between commit:
> 
>   116ac378bb3f ("powerpc/64s: machine check interrupt update NMI accounting")
> 
> from the powerpc tree and commit:
> 
>   187416eeb388 ("hardirq/nmi: Allow nested nmi_enter()")
> 
> from the rcu tree.
> 
> I fixed it up (I used the powerpc tree version for now) and can carry the
> fix as necessary. This is now fixed as far as linux-next is concerned,
> but any non trivial conflicts should be mentioned to your upstream
> maintainer when your tree is submitted for merging.  You may also want
> to consider cooperating with the maintainer of the conflicting tree to
> minimise any particularly complex conflicts.

This is now a conflict between the powerpc commit and commit

  69ea03b56ed2 ("hardirq/nmi: Allow nested nmi_enter()")

from the tip tree.  I assume that the rcu and tip trees are sharing
some patches (but not commits) :-(

-- 
Cheers,
Stephen Rothwell



Re: [PATCH v6 12/12] mmap locking API: convert mmap_sem comments

2020-05-20 Thread Michel Lespinasse

Looks good, thanks !

On Wed, May 20, 2020 at 8:22 PM Andrew Morton  wrote:
> On Tue, 19 May 2020 22:29:08 -0700 Michel Lespinasse  
> wrote:
> > Convert comments that reference mmap_sem to reference mmap_lock instead.
>
> This may not be complete..
>
> From: Andrew Morton 
> Subject: mmap-locking-api-convert-mmap_sem-comments-fix
>
> fix up linux-next leftovers
>
> Cc: Daniel Jordan 
> Cc: Davidlohr Bueso 
> Cc: David Rientjes 
> Cc: Hugh Dickins 
> Cc: Jason Gunthorpe 
> Cc: Jerome Glisse 
> Cc: John Hubbard 
> Cc: Laurent Dufour 
> Cc: Liam Howlett 
> Cc: Matthew Wilcox 
> Cc: Michel Lespinasse 
> Cc: Peter Zijlstra 
> Cc: Vlastimil Babka 
> Cc: Ying Han 
> Signed-off-by: Andrew Morton 

Reviewed-by: Michel Lespinasse

Re: [PATCH v6 05/12] mmap locking API: convert mmap_sem call sites missed by coccinelle

2020-05-20 Thread Michel Lespinasse

Looks good. I'm not sure if you need a review, but just in case:

On Wed, May 20, 2020 at 8:23 PM Andrew Morton  wrote:
> On Tue, 19 May 2020 22:29:01 -0700 Michel Lespinasse  
> wrote:
>
> > Convert the last few remaining mmap_sem rwsem calls to use the new
> > mmap locking API. These were missed by coccinelle for some reason
> > (I think coccinelle does not support some of the preprocessor
> > constructs in these files ?)
>
> From: Andrew Morton 
> Subject: mmap-locking-api-convert-mmap_sem-call-sites-missed-by-coccinelle-fix
>
> convert linux-next leftovers
>
> Cc: Michel Lespinasse 
> Cc: Daniel Jordan 
> Cc: Laurent Dufour 
> Cc: Vlastimil Babka 
> Cc: Davidlohr Bueso 
> Cc: David Rientjes 
> Cc: Hugh Dickins 
> Cc: Jason Gunthorpe 
> Cc: Jerome Glisse 
> Cc: John Hubbard 
> Cc: Liam Howlett 
> Cc: Matthew Wilcox 
> Cc: Peter Zijlstra 
> Cc: Ying Han 
> Signed-off-by: Andrew Morton 

Reviewed-by: Michel Lespinasse

Re: [PATCH] HID: intel-ish-hid: Replace PCI_DEV_FLAGS_NO_D3 with pci_save_state

2020-05-20 Thread Kai-Heng Feng

Hi Srinivas,

> On May 9, 2020, at 01:45, Srinivas Pandruvada 
>  wrote:
> 
> On Tue, 2020-05-05 at 21:17 +0800, Kai-Heng Feng wrote:
>> PCI_DEV_FLAGS_NO_D3 should not be used outside of PCI core.
>> 
>> Instead, we can use pci_save_state() to hint PCI core that the device
>> should stay at D0 during suspend.
> 
> Your changes are doing more than just changing the flag. Can you
> explain more about the other changes?

By using pci_save_state(), in addition to keeping the device itself at D0, the 
parent bridge will also stay at D0.
So it's a better approach to achieve the same thing.

> Also make sure that you test on both platforms which have regular S3 and
> S0ix (modern standby system).

Actually I don't have any physical hardware to test the patch; I found the 
issue when I searched for D3 quirks through the source code.

Can you guys do a quick smoketest for this patch?

Kai-Heng

> 
> Thanks,
> Srinivas
> 
> 
>> 
>> Signed-off-by: Kai-Heng Feng 
>> ---
>> drivers/hid/intel-ish-hid/ipc/pci-ish.c | 15 ++-
>> 1 file changed, 10 insertions(+), 5 deletions(-)
>> 
>> diff --git a/drivers/hid/intel-ish-hid/ipc/pci-ish.c
>> b/drivers/hid/intel-ish-hid/ipc/pci-ish.c
>> index f491d8b4e24c..ab588b9c8d09 100644
>> --- a/drivers/hid/intel-ish-hid/ipc/pci-ish.c
>> +++ b/drivers/hid/intel-ish-hid/ipc/pci-ish.c
>> @@ -106,6 +106,11 @@ static inline bool ish_should_enter_d0i3(struct
>> pci_dev *pdev)
>>  return !pm_suspend_via_firmware() || pdev->device ==
>> CHV_DEVICE_ID;
>> }
>> 
>> +static inline bool ish_should_leave_d0i3(struct pci_dev *pdev)
>> +{
>> +return !pm_resume_via_firmware() || pdev->device ==
>> CHV_DEVICE_ID;
>> +}
>> +
>> /**
>>  * ish_probe() - PCI driver probe callback
>>  * @pdev:pci device
>> @@ -215,9 +220,7 @@ static void __maybe_unused
>> ish_resume_handler(struct work_struct *work)
>>  struct ishtp_device *dev = pci_get_drvdata(pdev);
>>  int ret;
>> 
>> -/* Check the NO_D3 flag to distinguish the resume paths */
>> -if (pdev->dev_flags & PCI_DEV_FLAGS_NO_D3) {
>> -pdev->dev_flags &= ~PCI_DEV_FLAGS_NO_D3;
>> +if (ish_should_leave_d0i3(pdev) && !dev->suspend_flag) {
>>  disable_irq_wake(pdev->irq);
>> 
>>  ishtp_send_resume(dev);
>> @@ -281,8 +284,10 @@ static int __maybe_unused ish_suspend(struct
>> device *device)
>>   */
>>  ish_disable_dma(dev);
>>  } else {
>> -/* Set the NO_D3 flag, the ISH would enter D0i3
>> */
>> -pdev->dev_flags |= PCI_DEV_FLAGS_NO_D3;
>> +/* Save state so PCI core will keep the device
>> at D0,
>> + * the ISH would enter D0i3
>> + */
>> +pci_save_state(pdev);
>> 
> Did you test on some C
> 
> 
>>  enable_irq_wake(pdev->irq);
>>  }

Re: [RFC PATCH 2/2] init: Allow multi-line output of kernel command line

2020-05-20 Thread Andrew Morton

On Thu, 21 May 2020 13:36:28 +0900 Sergey Senozhatsky 
 wrote:

> On (20/05/20 18:00), Andrew Morton wrote:
> [..]
> > I'm wondering if we should add a kernel puts() (putsk()?  yuk) which can
> > puts() a string of any length.
> > 
> > I'm counting around 150 instances of printk("%s", ...) and pr_foo("%s",
> > ...) which could perhaps be converted, thus saving an argument.
> 
> Can you point me at some examples?
> 

./arch/powerpc/kernel/udbg.c:   printk("%s", s);
./arch/powerpc/xmon/nonstdio.c: printk("%s", xmon_outbuf);
./arch/um/os-Linux/drivers/ethertap_user.c: printk("%s", output);
./arch/um/os-Linux/drivers/ethertap_user.c: printk("%s", output);
./arch/um/os-Linux/drivers/tuntap_user.c:   printk("%s", out

etc.

My point is, if we created a length-unlimited puts() function for printing the
kernel command line, it could be reused in such places, resulting in a
smaller kernel.

Re: [PATCH v3] /dev/mem: Revoke mappings when a driver claims the region

2020-05-20 Thread Dan Williams

On Wed, May 20, 2020 at 9:37 PM Dan Williams  wrote:
>
> On Wed, May 20, 2020 at 7:26 PM Matthew Wilcox  wrote:
> >
> > On Wed, May 20, 2020 at 06:35:25PM -0700, Dan Williams wrote:
> > > +static struct inode *devmem_inode;
> > > +
> > > +#ifdef CONFIG_IO_STRICT_DEVMEM
> > > +void revoke_devmem(struct resource *res)
> > > +{
> > > + struct inode *inode = READ_ONCE(devmem_inode);
> > > +
> > > + /*
> > > +  * Check that the initialization has completed. Losing the race
> > > +  * is ok because it means drivers are claiming resources before
> > > +  * the fs_initcall level of init and prevent /dev/mem from
> > > +  * establishing mappings.
> > > +  */
> > > + smp_rmb();
> > > + if (!inode)
> > > + return;
> >
> > But we don't need the smp_rmb() here, right?  READ_ONCE and WRITE_ONCE
> > are a DATA DEPENDENCY barrier (in Documentation/memory-barriers.txt 
> > parlance)
> > so the smp_rmb() is superfluous ...
>
> Is it? I did not grok that from Documentation/memory-barriers.txt.
> READ_ONCE and WRITE_ONCE are certainly ordered with respect to each
> other in the same function, but I thought they still depend on
> barriers for smp ordering?
>
> >
> > > + /*
> > > +  * Use a unified address space to have a single point to manage
> > > +  * revocations when drivers want to take over a /dev/mem mapped
> > > +  * range.
> > > +  */
> > > + inode->i_mapping = devmem_inode->i_mapping;
> > > + inode->i_mapping->host = devmem_inode;
> >
> > umm ... devmem_inode->i_mapping->host doesn't already point to devmem_inode?
>
> Not if inode is coming from:
>
>  mknod ./newmem c 1 1
>
> ...that's the problem that a unified inode solves. You can mknod all
> you want, but mapping and mapping->host will point to a common
> instance.
>
> >
> > > +
> > > + /* publish /dev/mem initialized */
> > > + smp_wmb();
> > > + WRITE_ONCE(devmem_inode, inode);
> >
> > As above, unnecessary barrier, I think.
>
> Well, if you're not sure, how sure should I be?

I'm pretty sure they are needed, because I need the prior writes to
initialize the inode to be fenced before the final write to publish
the inode. I don't think WRITE_ONCE() enforces that prior writes have
completed.
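
The publish-after-initialize ordering argued for above can be sketched in
userspace with C11 atomics. This is only an analogy, not the kernel
primitives: the release store plays roughly the role of smp_wmb() followed by
WRITE_ONCE(), and the acquire load is the conservative choice on the reader
side (whether the kernel reader needs an explicit barrier is exactly what the
thread is debating). struct fake_inode, publish_inode() and
read_mapping_ready() are hypothetical names used for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for the inode that /dev/mem publishes. */
struct fake_inode {
	int mapping_ready;	/* plain field, initialized before publication */
};

static struct fake_inode storage;
static _Atomic(struct fake_inode *) published;	/* NULL until init completes */

/*
 * Writer: initialize first, then publish. The release store keeps the
 * initializing write from being reordered after the pointer store.
 */
static void publish_inode(void)
{
	storage.mapping_ready = 1;
	atomic_store_explicit(&published, &storage, memory_order_release);
}

/* Reader: returns -1 if nothing is published yet, else the field value. */
static int read_mapping_ready(void)
{
	struct fake_inode *inode =
		atomic_load_explicit(&published, memory_order_acquire);

	if (!inode)
		return -1;	/* lost the race with init: nothing to revoke */
	return inode->mapping_ready;
}
```

A reader that observes the non-NULL pointer is then guaranteed to observe the
initialized field as well, which is the property the publishing barrier is
meant to provide.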

Re: [PATCH v3] /dev/mem: Revoke mappings when a driver claims the region

2020-05-20 Thread Dan Williams

On Wed, May 20, 2020 at 7:26 PM Matthew Wilcox  wrote:
>
> On Wed, May 20, 2020 at 06:35:25PM -0700, Dan Williams wrote:
> > +static struct inode *devmem_inode;
> > +
> > +#ifdef CONFIG_IO_STRICT_DEVMEM
> > +void revoke_devmem(struct resource *res)
> > +{
> > + struct inode *inode = READ_ONCE(devmem_inode);
> > +
> > + /*
> > +  * Check that the initialization has completed. Losing the race
> > +  * is ok because it means drivers are claiming resources before
> > +  * the fs_initcall level of init and prevent /dev/mem from
> > +  * establishing mappings.
> > +  */
> > + smp_rmb();
> > + if (!inode)
> > + return;
>
> But we don't need the smp_rmb() here, right?  READ_ONCE and WRITE_ONCE
> are a DATA DEPENDENCY barrier (in Documentation/memory-barriers.txt parlance)
> so the smp_rmb() is superfluous ...

Is it? I did not grok that from Documentation/memory-barriers.txt.
READ_ONCE and WRITE_ONCE are certainly ordered with respect to each
other in the same function, but I thought they still depend on
barriers for smp ordering?

>
> > + /*
> > +  * Use a unified address space to have a single point to manage
> > +  * revocations when drivers want to take over a /dev/mem mapped
> > +  * range.
> > +  */
> > + inode->i_mapping = devmem_inode->i_mapping;
> > + inode->i_mapping->host = devmem_inode;
>
> umm ... devmem_inode->i_mapping->host doesn't already point to devmem_inode?

Not if inode is coming from:

 mknod ./newmem c 1 1

...that's the problem that a unified inode solves. You can mknod all
you want, but mapping and mapping->host will point to a common
instance.

>
> > +
> > + /* publish /dev/mem initialized */
> > + smp_wmb();
> > + WRITE_ONCE(devmem_inode, inode);
>
> As above, unnecessary barrier, I think.

Well, if you're not sure, how sure should I be?

Re: [PATCH] perf evsel: Get group fd from CPU0 for system wide event

2020-05-20 Thread Jin, Yao


Hi Jiri,

On 5/20/2020 3:50 PM, Jiri Olsa wrote:

On Wed, May 20, 2020 at 01:36:40PM +0800, Jin, Yao wrote:

Hi Jiri,

On 5/18/2020 11:28 AM, Jin, Yao wrote:

Hi Jiri,

On 5/15/2020 4:33 PM, Jiri Olsa wrote:

On Fri, May 15, 2020 at 02:04:57PM +0800, Jin, Yao wrote:

SNIP


I think I get the root cause. That should be a serious bug in get_group_fd, 
access violation!

For a group mixed with system-wide event and per-core event and the group
leader is system-wide event, access violation will happen.

perf_evsel__alloc_fd allocates one FD member for system-wide event (only 
FD(evsel, 0, 0) is valid).

But for per core event, perf_evsel__alloc_fd allocates N FD members (N =
ncpus). For example, for ncpus is 8, FD(evsel, 0, 0) to FD(evsel, 7, 0) are
valid.

get_group_fd(struct evsel *evsel, int cpu, int thread)
{
  struct evsel *leader = evsel->leader;

  fd = FD(leader, cpu, thread);    /* access violation may happen here */
}

If leader is system-wide event, only the FD(leader, 0, 0) is valid.

When get_group_fd accesses FD(leader, 1, 0), access violation happens.

My fix is:

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 28683b0eb738..db05b8a1e1a8 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1440,6 +1440,9 @@ static int get_group_fd(struct evsel *evsel, int cpu, int 
thread)
  if (evsel__is_group_leader(evsel))
  return -1;

+   if (leader->core.system_wide && !evsel->core.system_wide)
+   return -2;


so this effectively stops grouping system_wide events with others,
and I think it's correct, how about events that differ in cpumask?



My understanding for the events that differ in cpumask is, if the
leader's cpumask is not fully matched with the evsel's cpumask then we
stop the grouping. Is this understanding correct?

I have done some tests and get some conclusions:

1. If the group is mixed with core and uncore events, the system_wide checking 
can distinguish them.

2. If the group is mixed with core and uncore events and "-a" is
specified, the system_wide for core event is also false. So system_wide
checking can distinguish them too

3. In my test, the issue only occurs when we collect the metric which is
mixed with uncore event and core event, so maybe checking the
system_wide is OK.


should we perhaps ensure this before we call open? go through all
groups and check they are on the same cpus?



The issue doesn't happen most of the time (only for a metric consisting
of an uncore event and a core event), so falling back to ungrouping when
the open call fails looks reasonable.

Thanks
Jin Yao


thanks,
jirka



+
  /*
   * Leader must be already processed/open,
   * if not it's a bug.
@@ -1665,6 +1668,11 @@ static int evsel__open_cpu(struct evsel *evsel, struct 
perf_cpu_map *cpus,
  pid = perf_thread_map__pid(threads, thread);

  group_fd = get_group_fd(evsel, cpu, thread);
+   if (group_fd == -2) {
+   errno = EINVAL;
+   err = -EINVAL;
+   goto out_close;
+   }
   retry_open:
  test_attr__ready();

It enables the perf_evlist__reset_weak_group. And in the second_pass (in
__run_perf_stat), the events will be opened successfully.
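
The mismatch described in this message can be modelled in a few lines of
userspace C. This is a simplified sketch, not the perf code: struct
model_evsel, MAX_CPUS and model_get_group_fd() are hypothetical, and the fd
table is a fixed array so the model itself never reads out of bounds. The -2
return mirrors the check proposed in the diff in this message.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_CPUS 8

/* Simplified, hypothetical model of an evsel; not the perf structure. */
struct model_evsel {
	bool system_wide;		/* true: only fd[0] was allocated */
	int nr_fds;			/* number of valid entries in fd[] */
	int fd[MAX_CPUS];
	struct model_evsel *leader;
};

/*
 * Returns the leader's fd for this cpu, -1 if evsel is the leader itself,
 * or -2 if the leader has a single fd while evsel is opened per cpu.
 */
static int model_get_group_fd(const struct model_evsel *evsel, int cpu)
{
	const struct model_evsel *leader = evsel->leader;

	if (leader == evsel)
		return -1;

	/* Without this guard, cpu >= 1 would index past the leader's table. */
	if (leader->system_wide && !evsel->system_wide)
		return -2;

	assert(cpu >= 0 && cpu < leader->nr_fds);
	return leader->fd[cpu];
}
```

In the model, a caller that sees -2 would fail the open with EINVAL, which is
what lets the weak-group fallback described above retry the events
ungrouped.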

I have tested OK for this fix on cascadelakex.

Thanks
Jin Yao





Is this fix OK?

Another thing: do you think we need to rename
"evsel->core.system_wide" to "evsel->core.has_cpumask"?

The name "system_wide" may be misleading.

evsel->core.system_wide = pmu ? pmu->is_uncore : false;

"pmu->is_uncore" is true if PMU has a "cpumask". But it's not just uncore
PMU which has cpumask. Some other PMUs, e.g. cstate_pkg, also have cpumask.
So for this case, "has_cpumask" should be better.


so those flags are checked in many places in the code so I don't
think it's wise to mess with them

what I meant before was that the cpumask could be different for
different events so even when both events are 'system_wide' the
leader 'fd' might not exist for the grouped events and vice versa

so maybe we should ensure that we are grouping events with same
cpu maps before we go for open, so the get_group_fd stays simple
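
One way to picture the "same cpu maps before open" check suggested here is
the sketch below. It is an illustration only: struct model_cpu_map and both
helpers are hypothetical and assume each map is a sorted list of CPU numbers;
perf's real struct perf_cpu_map and its helpers are not used.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical CPU map: a sorted array of CPU numbers. */
struct model_cpu_map {
	const int *cpus;
	size_t nr;
};

/* Two maps match only if they list exactly the same CPUs. */
static bool model_cpu_maps_equal(const struct model_cpu_map *a,
				 const struct model_cpu_map *b)
{
	if (a->nr != b->nr)
		return false;
	for (size_t i = 0; i < a->nr; i++)
		if (a->cpus[i] != b->cpus[i])
			return false;
	return true;
}

/* A group is kept only if every member's map matches the leader's map. */
static bool model_group_is_consistent(const struct model_cpu_map *leader,
				      const struct model_cpu_map *members,
				      size_t nr_members)
{
	for (size_t i = 0; i < nr_members; i++)
		if (!model_cpu_maps_equal(leader, &members[i]))
			return false;
	return true;
}
```

Running such a check before any event is opened would let the group be
broken up early, so the fd lookup never has to handle a leader whose fd table
is shaped differently from the member's.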



Thanks for the comments. I'm preparing the patch according to this idea.



But I'm not sure if the change is OK for other case, e.g. PT, which also
uses "evsel->core.system_wide".


plz CC Adrian Hunter  on next patches
if you are touching this



I will not touch "evsel->core.system_wide" in the new patch.

Thanks
Jin Yao


thanks,
jirka

Re: [PATCH 2/2] soundwire: intel: transition to 3 steps initialization

2020-05-20 Thread Vinod Koul

On 21-05-20, 02:23, Liao, Bard wrote:
> > -Original Message-
> > From: Vinod Koul 
> > Sent: Wednesday, May 20, 2020 9:54 PM
> > To: Bard Liao 
> > Cc: alsa-de...@alsa-project.org; linux-kernel@vger.kernel.org; 
> > ti...@suse.de;
> > broo...@kernel.org; gre...@linuxfoundation.org; j...@cadence.com;
> > srinivas.kandaga...@linaro.org; rander.w...@linux.intel.com;
> > ranjani.sridha...@linux.intel.com; hui.w...@canonical.com; pierre-
> > louis.boss...@linux.intel.com; Kale, Sanyog R ;
> > Blauciak, Slawomir ; Lin, Mengdong
> > ; Liao, Bard 
> > Subject: Re: [PATCH 2/2] soundwire: intel: transition to 3 steps 
> > initialization
> > 
> > On 20-05-20, 03:19, Bard Liao wrote:
> > > From: Pierre-Louis Bossart 
> > >
> > > Rather than a plain-vanilla init/exit, this patch provides 3 steps in
> > > the initialization (ACPI scan, probe, startup) which makes it easier to
> > > detect platform support for SoundWire, allocate required resources as
> > > early as possible, and conversely help make the startup() callback
> > > lighter-weight with only hardware register setup.
> > 
> > Okay but can you add details in changelog on what each step would do?
> 
> Sure. Will do.
> 
> > 
> > > @@ -1134,25 +1142,15 @@ static int intel_probe(struct platform_device
> > *pdev)
> > >
> > >   intel_pdi_ch_update(sdw);
> > >
> > > - /* Acquire IRQ */
> > > - ret = request_threaded_irq(sdw->link_res->irq,
> > > -sdw_cdns_irq, sdw_cdns_thread,
> > > -IRQF_SHARED, KBUILD_MODNAME, &sdw-
> > >cdns);
> > 
> > This is removed here but not added anywhere else, do we have no irq
> > after this patch?
> 
> We use a single irq for all Intel Audio DSP events and it will
> be requested in the SOF driver.

And how will the irq be propagated to sdw/cdns drivers here?

-- 
~Vinod

Re: [RFC PATCH 2/2] init: Allow multi-line output of kernel command line

2020-05-20 Thread Sergey Senozhatsky

On (20/05/20 18:00), Andrew Morton wrote:
[..]
> I'm wondering if we should add a kernel puts() (putsk()?  yuk) which can
> puts() a string of any length.
> 
> I'm counting around 150 instances of printk("%s", ...) and pr_foo("%s",
> ...) which could perhaps be converted, thus saving an argument.

Can you point me at some examples?

-ss

Re: [PATCH 2/2] kvm/x86: don't expose MSR_IA32_UMWAIT_CONTROL unconditionally

2020-05-20 Thread Xiaoyao Li


On 5/21/2020 5:05 AM, Paolo Bonzini wrote:

On 20/05/20 18:07, Maxim Levitsky wrote:

This msr is only available when the host supports WAITPKG feature.

This breaks a nested guest, if the L1 hypervisor is set to ignore
unknown msrs, because the only other safety check that the
kernel does is that it attempts to read the msr and
rejects it if it gets an exception.

Fixes: 6e3ba4abce KVM: vmx: Emulate MSR IA32_UMWAIT_CONTROL

Signed-off-by: Maxim Levitsky 
---
  arch/x86/kvm/x86.c | 4 
  1 file changed, 4 insertions(+)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index fe3a24fd6b263..9c507b32b1b77 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5314,6 +5314,10 @@ static void kvm_init_msr_list(void)
if (msrs_to_save_all[i] - MSR_ARCH_PERFMON_EVENTSEL0 >=
min(INTEL_PMC_MAX_GENERIC, x86_pmu.num_counters_gp))
continue;
+   break;
+   case MSR_IA32_UMWAIT_CONTROL:
+   if (!kvm_cpu_cap_has(X86_FEATURE_WAITPKG))
+   continue;
default:
break;
}


The patch is correct, and matches what is done for the other entries of
msrs_to_save_all.  However, while looking at it I noticed that
X86_FEATURE_WAITPKG is actually never added, and that is because it was
also not added to the supported CPUID in commit e69e72faa3a0 ("KVM: x86:
Add support for user wait instructions", 2019-09-24), which was before
the kvm_cpu_cap mechanism was added.

So while at it you should also fix that.  The right way to do that is to
add a

 if (vmx_waitpkg_supported())
 kvm_cpu_cap_check_and_set(X86_FEATURE_WAITPKG);


+ Tao

I remember there is certainly some reason why we don't expose WAITPKG to 
guest by default.


Tao, please help clarify it.

Thanks,
-Xiaoyao



in vmx_set_cpu_caps.

Thanks,

Paolo

[PATCH] kbuild: doc: remove documentation about copying Module.symvers around

2020-05-20 Thread Masahiro Yamada

This is a left-over of commit 39808e451fdf ("kbuild: do not read
$(KBUILD_EXTMOD)/Module.symvers").

Kbuild no longer supports this approach.

Signed-off-by: Masahiro Yamada 
---

 Documentation/kbuild/modules.rst | 12 
 1 file changed, 12 deletions(-)

diff --git a/Documentation/kbuild/modules.rst b/Documentation/kbuild/modules.rst
index e0b45a257f21..a45cccff467d 100644
--- a/Documentation/kbuild/modules.rst
+++ b/Documentation/kbuild/modules.rst
@@ -528,18 +528,6 @@ build.
will then do the expected and compile both modules with
full knowledge of symbols from either module.
 
-   Use an extra Module.symvers file
-   When an external module is built, a Module.symvers file
-   is generated containing all exported symbols which are
-   not defined in the kernel. To get access to symbols
-   from bar.ko, copy the Module.symvers file from the
-   compilation of bar.ko to the directory where foo.ko is
-   built. During the module build, kbuild will read the
-   Module.symvers file in the directory of the external
-   module, and when the build is finished, a new
-   Module.symvers file is created containing the sum of
-   all symbols defined and not part of the kernel.
-
Use "make" variable KBUILD_EXTRA_SYMBOLS
If it is impractical to add a top-level kbuild file,
you can assign a space separated list
-- 
2.25.1

Re: [RFC PATCH 2/2] init: Allow multi-line output of kernel command line

2020-05-20 Thread Sergey Senozhatsky

On (20/05/20 13:36), Joe Perches wrote:
> > We can split command line in a loop - memchr(pos, ' ') - and
> > pr_cont() parts of the command line. pr_cont() has overflow
> > control and it flushes cont buffer before it overflows, so
> > we should not lose anything.
> 
> It doesn't matter much here, but I believe
> there's an 8k max buffer for pr_cont output.
> 
> include/linux/printk.h:#define CONSOLE_EXT_LOG_MAX  8192

This is for extended payload - the key:value dictionaries
which device core appends to normal printk() messages. We
don't have that many consoles that handle extended output
(netcon and, maybe, a few more).
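
The splitting loop quoted at the top of this message (memchr(pos, ' ') plus
one continuation print per piece) can be sketched in userspace as follows.
The emit callback is a hypothetical stand-in for pr_cont(), and
emit_cmdline_pieces(), struct collect and collect_piece() are illustrative
names, not kernel functions.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical sink standing in for pr_cont(). */
typedef void (*emit_fn)(const char *piece, size_t len, void *ctx);

/*
 * Walk the command line and hand each space-terminated piece to emit().
 * Returns the number of pieces emitted.
 */
static size_t emit_cmdline_pieces(const char *cmdline, emit_fn emit, void *ctx)
{
	const char *pos = cmdline;
	size_t remaining = strlen(cmdline);
	size_t pieces = 0;

	while (remaining > 0) {
		const char *space = memchr(pos, ' ', remaining);
		/* Keep the space so the pieces concatenate back to the input. */
		size_t len = space ? (size_t)(space - pos) + 1 : remaining;

		emit(pos, len, ctx);
		pieces++;
		pos += len;
		remaining -= len;
	}
	return pieces;
}

/* Example sink: append every piece to a fixed buffer. */
struct collect {
	char buf[256];
	size_t used;
};

static void collect_piece(const char *piece, size_t len, void *ctx)
{
	struct collect *c = ctx;

	if (c->used + len < sizeof(c->buf)) {
		memcpy(c->buf + c->used, piece, len);
		c->used += len;
		c->buf[c->used] = '\0';
	}
}
```

Because each piece is short, no single call has to carry the whole command
line, which is the point of splitting at spaces.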

-ss

[PATCH v2] libata: Use per port sync for detach

2020-05-20 Thread Kai-Heng Feng

Commit 130f4caf145c ("libata: Ensure ata_port probe has completed before
detach") may cause system freeze during suspend.

Using async_synchronize_full() in PM callbacks is wrong, since async
callbacks that are already scheduled may wait for not-yet-scheduled
callbacks, causing a circular dependency.

Instead of using a big hammer like async_synchronize_full(), use the async
cookie to make sure port probes are synced, without affecting other
scheduled PM callbacks.

Fixes: 130f4caf145c ("libata: Ensure ata_port probe has completed before 
detach")
BugLink: https://bugs.launchpad.net/bugs/1867983
Suggested-by: John Garry 
Signed-off-by: Kai-Heng Feng 
---
v2:
 - Sync up to cookie + 1.
 - Squash the synchronization into the same loop.

 drivers/ata/libata-core.c | 9 -
 include/linux/libata.h| 3 +++
 2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index beca5f91bb4c..b6be84f2cecb 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -42,7 +42,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -5778,7 +5777,7 @@ int ata_host_register(struct ata_host *host, struct 
scsi_host_template *sht)
/* perform each probe asynchronously */
for (i = 0; i < host->n_ports; i++) {
struct ata_port *ap = host->ports[i];
-   async_schedule(async_port_probe, ap);
+   ap->cookie = async_schedule(async_port_probe, ap);
}
 
return 0;
@@ -5921,10 +5920,10 @@ void ata_host_detach(struct ata_host *host)
int i;
 
/* Ensure ata_port probe has completed */
-   async_synchronize_full();
-
-   for (i = 0; i < host->n_ports; i++)
+   for (i = 0; i < host->n_ports; i++) {
+   async_synchronize_cookie(host->ports[i]->cookie + 1);
ata_port_detach(host->ports[i]);
+   }
 
/* the host is dead now, dissociate ACPI */
ata_acpi_dissociate(host);
diff --git a/include/linux/libata.h b/include/linux/libata.h
index cffa4714bfa8..ae6dfc107ea8 100644
--- a/include/linux/libata.h
+++ b/include/linux/libata.h
@@ -22,6 +22,7 @@
 #include 
 #include 
 #include 
+#include 
 
 /*
  * Define if arch has non-standard setup.  This is a _PCI_ standard
@@ -872,6 +873,8 @@ struct ata_port {
struct timer_list   fastdrain_timer;
unsigned long   fastdrain_cnt;
 
+   async_cookie_t  cookie;
+
int em_message_type;
void*private_data;
 
-- 
2.17.1
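
The "cookie + 1" in the patch above is easier to see with a small model. The
sketch assumes that a wait on a cookie returns once every async call
submitted before that cookie has finished, which is how I read
async_synchronize_cookie(); the types and helpers below are hypothetical and
only model that rule in userspace.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

typedef unsigned long long model_cookie_t;

/* Hypothetical model: cookies of the async calls that are still running. */
struct model_async_domain {
	model_cookie_t pending[8];
	size_t nr_pending;
};

static model_cookie_t model_lowest_in_flight(const struct model_async_domain *d)
{
	model_cookie_t lowest = ULLONG_MAX;

	for (size_t i = 0; i < d->nr_pending; i++)
		if (d->pending[i] < lowest)
			lowest = d->pending[i];
	return lowest;
}

/* True when a wait on `cookie` would return: all earlier calls are done. */
static bool model_cookie_synchronized(const struct model_async_domain *d,
				      model_cookie_t cookie)
{
	return model_lowest_in_flight(d) >= cookie;
}

/* Mark one call as finished by dropping its cookie from the pending set. */
static void model_complete(struct model_async_domain *d, model_cookie_t cookie)
{
	for (size_t i = 0; i < d->nr_pending; i++) {
		if (d->pending[i] == cookie) {
			d->pending[i] = d->pending[--d->nr_pending];
			return;
		}
	}
}
```

With calls 5 and 7 pending, a wait on 5 returns immediately because call 5
itself is not covered, while a wait on 5 + 1 blocks until call 5 completes
and never waits for the later call 7.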

Re: [PATCH RFC] sched: Add a per-thread core scheduling interface(Internet mail)

2020-05-20 Thread 蒋彪




> On May 21, 2020, at 6:26 AM, Joel Fernandes (Google)  
> wrote:
> 
> Add a per-thread core scheduling interface which allows a thread to tag
> itself and enable core scheduling. Based on discussion at OSPM with
> maintainers, we propose a prctl(2) interface accepting values of 0 or 1.
> 1 - enable core scheduling for the task.
> 0 - disable core scheduling for the task.
> 
> Special cases:
> (1)
> The core-scheduling patchset contains a CGroup interface as well. In
> order for us to respect users of that interface, we avoid overriding the
> tag if a task was CGroup-tagged because the task becomes inconsistent
> with the CGroup tag. Instead return -EBUSY.
> 
> (2)
> If a task is prctl-tagged, allow the CGroup interface to override
> the task's tag.
> 
> ChromeOS will use core-scheduling to securely enable hyperthreading.
> This cuts down the keypress latency in Google docs from 150ms to 50ms
> while improving the camera streaming frame rate by ~3%.
Hi,
Are the performance improvements compared to the hyperthreading disabled 
scenario or not?
Could you help to explain how the keypress latency improvement comes with 
core-scheduling?

Thanks a lot.

Regards,
Jiang

Re: [RFC V2] mm/vmstat: Add events for PMD based THP migration without split

2020-05-20 Thread Anshuman Khandual

On 05/20/2020 12:45 PM, HORIGUCHI NAOYA(堀口　直也) wrote:
> On Mon, May 18, 2020 at 12:12:36PM +0530, Anshuman Khandual wrote:
>> This adds the following two new VM events which will help in validating PMD
>> based THP migration without split. Statistics reported through these events
>> will help in performance debugging.
>>
>> 1. THP_PMD_MIGRATION_SUCCESS
>> 2. THP_PMD_MIGRATION_FAILURE
>>
>> Cc: Naoya Horiguchi 
>> Cc: Zi Yan 
>> Cc: John Hubbard 
>> Cc: Andrew Morton 
>> Cc: linux...@kvack.org
>> Cc: linux-kernel@vger.kernel.org
>> Signed-off-by: Anshuman Khandual 
> 
> Hi Anshuman,

Hi Naoya,

> 
> I'm neutral for additional lines in /proc/vmstat. It's a classic (so widely
> used) but inflexible interface. Users disabling thp are not happy with many
> thp-related lines, but judging from the fact that we already have many

Right, for similar reason, I am not too keen on enabling these counters
without migration being enabled with ARCH_ENABLE_THP_MIGRATION.

> thp-related lines some users really need them. So I feel hard to decide to
> agree or disagree with additional lines.

Currently these are conditional on ARCH_ENABLE_THP_MIGRATION. So we are
not adding these new lines unless migration is available and enabled.

> 
> I think that tracepoints are the more flexible interfaces for monitoring,
> so I'm interested more in whether thp migration could be monitorable via
> tracepoint. Do you have any idea/plan on it?

Sure, we can add some trace points as well which can give more granular
details regarding THP migration mechanism itself e.g setting and removing
PMD migration entries etc probably with (vaddr, pmdp, pmd) details.

But we will still need /proc/vmstat entries that will be available right
away without requiring additional steps. This simplicity is essential for
folks to consider using these events more often.

Sure, will look into what trace points can be added for THP migration but
in a subsequent patch.

- Anshuman

Re: [PATCH 09/29] kbuild: disallow multi-word in M= or KBUILD_EXTMOD

2020-05-20 Thread Masahiro Yamada

On Sun, May 17, 2020 at 9:33 PM David Laight  wrote:
>
> From: Masahiro Yamada
> > Sent: 17 May 2020 10:49
> > $(firstword ...) in scripts/Makefile.modpost was added by commit
> > 3f3fd3c05585 ("[PATCH] kbuild: allow multi-word $M in Makefile.modpost")
> > to build multiple external module directories.
> >
> > This feature has been broken for a while. Remove the bitrotten code, and
> > stop parsing if M or KBUILD_EXTMOD contains multiple words.
>
> ISTR that one of the kernel documentation files says that it is possible
> to build multiple modules together in order to avoid 'faffing' with
> exported symbol lists.
>
> So the docs need updating to match.


Do you remember which doc mentions it?



> David
>
> -
> Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 
> 1PT, UK
> Registration No: 1397386 (Wales)
>


-- 
Best Regards
Masahiro Yamada

[PATCH] ASoC: dt-bindings: simple-card: care missing address #address-cells

2020-05-20 Thread Kuninori Morimoto

From: Kuninori Morimoto 

The current simple-card binding example produces the warnings below,
because some parts of it do not take #address-cells into account.

DTC 
Documentation/devicetree/bindings/sound/simple-card.example.dt.yaml

Documentation/devicetree/bindings/sound/simple-card.example.dts:171.46-173.15: \
Warning (unit_address_vs_reg): 
/example-4/sound/simple-audio-card,cpu@0: \
node has a unit name, but no reg or ranges property

Documentation/devicetree/bindings/sound/simple-card.example.dts:175.37-177.15: \
Warning (unit_address_vs_reg): 
/example-4/sound/simple-audio-card,cpu@1: \
node has a unit name, but no reg or ranges property
...

This patch fixes the issue.

Signed-off-by: Kuninori Morimoto 
---
 .../bindings/sound/simple-card.yaml   | 25 ++-
 1 file changed, 24 insertions(+), 1 deletion(-)

diff --git a/Documentation/devicetree/bindings/sound/simple-card.yaml 
b/Documentation/devicetree/bindings/sound/simple-card.yaml
index cb2bb5fac0e1..6c4c2c6d6d3c 100644
--- a/Documentation/devicetree/bindings/sound/simple-card.yaml
+++ b/Documentation/devicetree/bindings/sound/simple-card.yaml
@@ -208,6 +208,11 @@ patternProperties:
   reg:
 maxItems: 1
 
+  "#address-cells":
+const: 1
+  "#size-cells":
+const: 0
+
   # common properties
   frame-master:
 $ref: "#/definitions/frame-master"
@@ -288,7 +293,6 @@ examples:
 
 #address-cells = <1>;
 #size-cells = <0>;
-
 simple-audio-card,dai-link@0 { /* I2S - HDMI */
 reg = <0>;
 format = "i2s";
@@ -392,11 +396,15 @@ examples:
 simple-audio-card,routing = "ak4642 Playback", "DAI0 Playback",
 "ak4642 Playback", "DAI1 Playback";
 
+#address-cells = <1>;
+#size-cells = <0>;
 dpcmcpu: simple-audio-card,cpu@0 {
+reg = <0>;
 sound-dai = <&rcar_sound 0>;
 };
 
 simple-audio-card,cpu@1 {
+reg = <1>;
 sound-dai = <&rcar_sound 1>;
 };
 
@@ -427,7 +435,12 @@ examples:
 "pcm3168a Playback", "DAI3 Playback",
 "pcm3168a Playback", "DAI4 Playback";
 
+#address-cells = <1>;
+#size-cells = <0>;
+
 simple-audio-card,dai-link@0 {
+reg = <0>;
+
 format = "left_j";
 bitclock-master = <&sndcpu0>;
 frame-master = <&sndcpu0>;
@@ -441,22 +454,30 @@ examples:
 };
 
 simple-audio-card,dai-link@1 {
+reg = <1>;
+
 format = "i2s";
 bitclock-master = <&sndcpu1>;
 frame-master = <&sndcpu1>;
 
 convert-channels = <8>; /* TDM Split */
 
+#address-cells = <1>;
+#size-cells = <0>;
 sndcpu1: cpu@0 {
+reg = <0>;
 sound-dai = <&rcar_sound 1>;
 };
 cpu@1 {
+reg = <1>;
 sound-dai = <&rcar_sound 2>;
 };
 cpu@2 {
+reg = <2>;
 sound-dai = <&rcar_sound 3>;
 };
 cpu@3 {
+reg = <3>;
 sound-dai = <&rcar_sound 4>;
 };
 codec {
@@ -468,6 +489,8 @@ examples:
 };
 
 simple-audio-card,dai-link@2 {
+reg = <2>;
+
 format = "i2s";
 bitclock-master = <&sndcpu2>;
 frame-master = <&sndcpu2>;
-- 
2.17.1

Re: XHCI vs PCM2903B/PCM2904 part 2

2020-05-20 Thread Rik van Riel

On Wed, 2020-05-20 at 16:34 -0400, Alan Stern wrote:
> On Wed, May 20, 2020 at 03:21:44PM -0400, Rik van Riel wrote:
> > 
> > Interesting. That makes me really curious why things are
> > getting stuck, now...
> 
> This could be a bug in xhci-hcd.  Perhaps the controller's endpoint 
> state needs to be updated after one of these errors occurs.  Mathias 
> will know all about that.

I am seeing something potentially interesting in the
giant trace. First the final enqueue/dequeue before
the babble error:

  -0 [005] d.s. 776367.638233: xhci_inc_enq: ISOC
33a6879e: enq 0x001014070420(0x00101407) deq
0x001014070360(0x00101407) segs 2 stream 0 free_trbs 497
bounce 196 cycle 1

The next reference to 0x001014070360 is the babble error,
and some info on the ISOC buffer itself:

  -0 [005] d.h. 776367.639187: xhci_handle_event:
EVENT: TRB 001014070360 status 'Babble Detected' len 196 slot 15 ep
9 type 'Transfer Event' flags e:C
  -0 [005] d.h. 776367.639195: xhci_handle_transfer:
ISOC: Buffer 000e2676f400 length 196 TD size 0 intr 0 type 'Isoch'
flags b:i:I:c:s:I:e:C

Immediately after the babble error, the next request is enqueued,
and the doorbell is rung:

  -0 [005] d.h. 776367.639196: xhci_inc_deq: ISOC 
33a6879e: enq 0x001014070420(0x00101407) deq 
0x001014070370(0x00101407) segs 2 stream 0 free_trbs 498 bounce 196 
cycle 1
  -0 [005] d.h. 776367.639197: xhci_urb_giveback: ep4in-isoc: 
urb 72126553 pipe 135040 slot 15 length 196/196 sgs 0/0 stream 0 flags 
0206
  -0 [005] d.h. 776367.639197: xhci_inc_deq: EVENT 
97f84b16: enq 0x0010170b5000(0x0010170b5000) deq 
0x0010170b5670(0x0010170b5000) segs 1 stream 0 free_trbs 254 bounce 0 
cycle 1
  -0 [005] ..s. 776367.639212: xhci_urb_enqueue: ep4in-isoc: 
urb 72126553 pipe 135040 slot 15 length 0/196 sgs 0/0 stream 0 flags 
0206
  -0 [005] d.s. 776367.639214: xhci_queue_trb: ISOC: Buffer 
000e2676f400 length 196 TD size 0 intr 0 type 'Isoch' flags b:i:I:c:s:I:e:c
  -0 [005] d.s. 776367.639214: xhci_inc_enq: ISOC 
33a6879e: enq 0x001014070430(0x00101407) deq 
0x001014070370(0x00101407) segs 2 stream 0 free_trbs 497 bounce 196 
cycle 1
  -0 [005] d.s. 776367.639215: xhci_ring_ep_doorbell: Ring 
doorbell for Slot 15 ep4in

However, after that point, no more xhci_handle_transfer: ISOC
lines are seen in the log. The doorbell line above is the last
line in the log for ep4in.

Is this some area where USB3 and USB2 behave differently?

dmesg: 
https://drive.google.com/open?id=1S2Qc8lroqA5-RMukuLBLWFGx10vEjG-i

usb trace, as requested by Mathias: 
https://drive.google.com/open?id=1cbLcOnAtQRW0Chgak6PNC0l4yJv__4uO

-- 
All Rights Reversed.


Re: Re: [PATCH] media: staging: tegra-vde: fix runtime pm imbalance on error

2020-05-20 Thread dinghao . liu

Hi, Dan,

I agree the best solution is to fix __pm_runtime_resume(). But there are also
many cases that assume pm_runtime_get_sync() will change the PM usage
counter on error. According to my static analysis results, these "right"
cases outnumber the buggy ones. Adjusting __pm_runtime_resume() directly would
introduce more new bugs. Therefore I think we should resolve the "bug" cases
individually.
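
The per-caller fix discussed in this thread can be illustrated with a small
userspace model. It assumes the behavior described in the quoted messages:
the get call raises the usage counter even when it fails, and it can return 1
on success. struct model_dev and the model_* functions are hypothetical
stand-ins, not the runtime PM API.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical device model; not struct device. */
struct model_dev {
	int usage_count;
	int resume_result;	/* <0 error, 0 resumed, 1 already active */
};

/* Models pm_runtime_get_sync(): the count goes up whatever the result. */
static int model_get_sync(struct model_dev *dev)
{
	dev->usage_count++;
	return dev->resume_result;
}

/* Models pm_runtime_put_noidle(): drops the count, no idle check. */
static void model_put_noidle(struct model_dev *dev)
{
	dev->usage_count--;
}

/* Caller pattern: only negative values are errors; rebalance on error. */
static int model_start_work(struct model_dev *dev)
{
	int ret = model_get_sync(dev);

	if (ret < 0) {
		model_put_noidle(dev);
		return ret;
	}
	return 0;	/* 0 and 1 both mean the device is usable */
}
```

A caller written as "if (ret)" would treat the return value 1 as a failure,
which is the second class of bug listed in the quoted message below.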

I think that Dmitry's patch is more reasonable than mine. 

Dinghao

"Dan Carpenter" wrote:
> On Wed, May 20, 2020 at 01:15:44PM +0300, Dmitry Osipenko wrote:
> > 20.05.2020 12:51, Dinghao Liu wrote:
> > > pm_runtime_get_sync() increments the runtime PM usage counter even
> > > when it returns an error code. Thus a pairing decrement is needed on
> > > the error handling path to keep the counter balanced.
> > > 
> > > Signed-off-by: Dinghao Liu 
> > > ---
> > >  drivers/staging/media/tegra-vde/vde.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/staging/media/tegra-vde/vde.c 
> > > b/drivers/staging/media/tegra-vde/vde.c
> > > index d3e63512a765..dd134a3a15c7 100644
> > > --- a/drivers/staging/media/tegra-vde/vde.c
> > > +++ b/drivers/staging/media/tegra-vde/vde.c
> > > @@ -777,7 +777,7 @@ static int tegra_vde_ioctl_decode_h264(struct 
> > > tegra_vde *vde,
> > >  
> > >   ret = pm_runtime_get_sync(dev);
> > >   if (ret < 0)
> > > - goto unlock;
> > > + goto put_runtime_pm;
> > >  
> > >   /*
> > >* We rely on the VDE registers reset value, otherwise VDE
> > > 
> > 
> > Hello Dinghao,
> > 
> > Thank you for the patch. I sent out a similar patch a week ago [1].
> > 
> > [1]
> > https://patchwork.ozlabs.org/project/linux-tegra/patch/20200514210847.9269-2-dig...@gmail.com/
> > 
> > The pm_runtime_put_noidle() should have the same effect as your
> > variant, although my variant won't change the last_busy RPM time, which
> > I think is a bit more appropriate behavior.
> 
> I don't think either patch is correct.  The right thing to do is to fix
> __pm_runtime_resume() so it doesn't leak a reference count on error.
> 
> The problem is that a lot of functions don't check the return so
> possibly we are relying on that behavior.  We may need to introduce a
> new function which cleans up properly instead of leaking reference
> counts?
> 
> Also it's not documented that pm_runtime_get_sync() returns 1 sometimes
> on success so it leads to a few bugs.
> 
> drivers/gpu/drm/stm/ltdc.c: ret = pm_runtime_get_sync(ddev->dev);
> drivers/gpu/drm/stm/ltdc.c- if (ret) {
> --
> drivers/gpu/drm/stm/ltdc.c: ret = pm_runtime_get_sync(ddev->dev);
> drivers/gpu/drm/stm/ltdc.c- if (ret) {
> 
> drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_pm.c:  ret = 
> pm_runtime_get_sync(pm->dev);
> drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_pm.c-  if (ret)
> 
> drivers/media/platform/ti-vpe/cal.c:ret = pm_runtime_get_sync(&pdev->dev);
> drivers/media/platform/ti-vpe/cal.c-if (ret)
> 
> drivers/mfd/arizona-core.c: ret = 
> pm_runtime_get_sync(arizona->dev);
> drivers/mfd/arizona-core.c- if (ret != 0)
> 
> drivers/remoteproc/qcom_q6v5_adsp.c:ret = pm_runtime_get_sync(adsp->dev);
> drivers/remoteproc/qcom_q6v5_adsp.c-if (ret)
> 
> drivers/spi/spi-img-spfi.c: ret = pm_runtime_get_sync(dev);
> drivers/spi/spi-img-spfi.c- if (ret)
> 
> drivers/usb/dwc3/dwc3-pci.c:ret = pm_runtime_get_sync(&dwc3->dev);
> drivers/usb/dwc3/dwc3-pci.c-if (ret)
> 
> drivers/watchdog/rti_wdt.c: ret = pm_runtime_get_sync(dev);
> drivers/watchdog/rti_wdt.c- if (ret) {
> 
> regards,
> dan carpenter
> 
> diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c
> index 99c7da112c95..e280991a977d 100644
> --- a/drivers/base/power/runtime.c
> +++ b/drivers/base/power/runtime.c
> @@ -1082,6 +1082,9 @@ int __pm_runtime_resume(struct device *dev, int 
> rpmflags)
>   retval = rpm_resume(dev, rpmflags);
>   spin_unlock_irqrestore(&dev->power.lock, flags);
>  
> + if (retval < 0 && rpmflags & RPM_GET_PUT)
> + atomic_dec(&dev->power.usage_count);
> +
>   return retval;
>  }
>  EXPORT_SYMBOL_GPL(__pm_runtime_resume);
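
The two points raised above (the usage count stays elevated when the resume fails, and a return value of 1 also means success) can be illustrated with a small userspace model. This is only a sketch: struct mock_dev and the mock_* helpers are hypothetical stand-ins, not the kernel API, and they only imitate the behaviour described in this thread.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical userspace stand-in for a device's runtime-PM state. */
struct mock_dev {
	int usage_count;	/* models dev->power.usage_count */
	int active;		/* 1 if the device is already resumed */
	int resume_error;	/* negative errno the mock resume fails with, or 0 */
};

/*
 * Models the behaviour discussed in the thread: the count is bumped
 * first and is NOT dropped when the resume fails; 1 means the device
 * was already active, which is still success.
 */
static int mock_pm_runtime_get_sync(struct mock_dev *dev)
{
	dev->usage_count++;
	if (dev->resume_error)
		return dev->resume_error;
	if (dev->active)
		return 1;
	dev->active = 1;
	return 0;
}

static void mock_pm_runtime_put_noidle(struct mock_dev *dev)
{
	dev->usage_count--;
}

/*
 * Caller pattern: only ret < 0 is an error, and the error path drops
 * the reference that the failed get left behind.
 */
static int mock_driver_start(struct mock_dev *dev)
{
	int ret = mock_pm_runtime_get_sync(dev);

	if (ret < 0) {
		mock_pm_runtime_put_noidle(dev);
		return ret;
	}
	return 0;
}
```

A caller written as "if (ret)" would treat the already-active case (return value 1) as a failure, which is the class of bug listed in the grep output above.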

[PATCH] init/do_mounts: fix a coding style error

2020-05-20 Thread zhouchuangao

Fix code style errors reported by scripts/checkpatch.pl.

Signed-off-by: zhouchuangao 
---
 init/do_mounts.c | 52 ++--
 1 file changed, 26 insertions(+), 26 deletions(-)

diff --git a/init/do_mounts.c b/init/do_mounts.c
index 29d326b..2f8bd41 100644
--- a/init/do_mounts.c
+++ b/init/do_mounts.c
@@ -249,7 +249,7 @@ dev_t name_to_dev_t(const char *name)
 #endif
 
if (strncmp(name, "/dev/", 5) != 0) {
-   unsigned maj, min, offset;
+   unsigned int maj, min, offset;
char dummy;
 
if ((sscanf(name, "%u:%u%c", &maj, &min, &dummy) == 2) ||
@@ -412,8 +412,7 @@ static int __init do_mount_root(const char *name, const 
char *fs,
ksys_chdir("/root");
s = current->fs->pwd.dentry->d_sb;
ROOT_DEV = s->s_dev;
-   printk(KERN_INFO
-  "VFS: Mounted root (%s filesystem)%s on device %u:%u.\n",
+   pr_info("VFS: Mounted root (%s filesystem)%s on device %u:%u.\n",
   s->s_type->name,
   sb_rdonly(s) ? " readonly" : "",
   MAJOR(ROOT_DEV), MINOR(ROOT_DEV));
@@ -437,25 +436,26 @@ void __init mount_block_root(char *name, int flags)
 retry:
for (p = fs_names; *p; p += strlen(p)+1) {
int err = do_mount_root(name, p, flags, root_mount_data);
+
switch (err) {
-   case 0:
-   goto out;
-   case -EACCES:
-   case -EINVAL:
-   continue;
+   case 0:
+   goto out;
+   case -EACCES:
+   case -EINVAL:
+   continue;
}
-   /*
+   /*
 * Allow the user to distinguish between failed sys_open
 * and bad superblock on root device.
 * and give them a list of the available devices
 */
-   printk("VFS: Cannot open root device \"%s\" or %s: error %d\n",
+   pr_info("VFS: Cannot open root device \"%s\" or %s: error %d\n",
root_device_name, b, err);
-   printk("Please append a correct \"root=\" boot option; here are 
the available partitions:\n");
+   pr_info("Please append a correct \"root=\" boot option; here 
are the available partitions:\n");
 
printk_all_partitions();
 #ifdef CONFIG_DEBUG_BLOCK_EXT_DEVT
-   printk("DEBUG_BLOCK_EXT_DEVT is enabled, you need to specify "
+   pr_info("DEBUG_BLOCK_EXT_DEVT is enabled, you need to specify "
   "explicit textual name for \"root=\" boot option.\n");
 #endif
panic("VFS: Unable to mount root fs on %s", b);
@@ -465,17 +465,17 @@ void __init mount_block_root(char *name, int flags)
goto retry;
}
 
-   printk("List of all partitions:\n");
+   pr_info("List of all partitions:\n");
printk_all_partitions();
-   printk("No filesystem could mount root, tried: ");
+   pr_info("No filesystem could mount root, tried: ");
for (p = fs_names; *p; p += strlen(p)+1)
-   printk(" %s", p);
-   printk("\n");
+   pr_info(" %s", p);
+   pr_info("\n");
panic("VFS: Unable to mount root fs on %s", b);
 out:
put_page(page);
 }
- 
+
 #ifdef CONFIG_ROOT_NFS
 
 #define NFSROOT_TIMEOUT_MIN5
@@ -560,6 +560,7 @@ void __init change_floppy(char *fmt, ...)
char c;
int fd;
va_list args;
+
va_start(args, fmt);
vsprintf(buf, fmt, args);
va_end(args);
@@ -568,7 +569,7 @@ void __init change_floppy(char *fmt, ...)
ksys_ioctl(fd, FDEJECT, 0);
ksys_close(fd);
}
-   printk(KERN_NOTICE "VFS: Insert %s and press ENTER\n", buf);
+   pr_notice("VFS: Insert %s and press ENTER\n", buf);
fd = ksys_open("/dev/console", O_RDWR, 0);
if (fd >= 0) {
ksys_ioctl(fd, TCGETS, (long)&termios);
@@ -585,27 +586,27 @@ void __init change_floppy(char *fmt, ...)
 void __init mount_root(void)
 {
 #ifdef CONFIG_ROOT_NFS
-   if (ROOT_DEV == Root_NFS) {
+   if (Root_NFS == ROOT_DEV) {
if (mount_nfs_root())
return;
 
-   printk(KERN_ERR "VFS: Unable to mount root fs via NFS, trying 
floppy.\n");
+   pr_err("VFS: Unable to mount root fs via NFS, trying 
floppy.\n");
ROOT_DEV = Root_FD0;
}
 #endif
 #ifdef CONFIG_CIFS_ROOT
-   if (ROOT_DEV == Root_CIFS) {
+   if (Root_CIFS == ROOT_DEV) {
if (mount_cifs_root())
return;
 
-   printk(KERN_ERR "VFS: Unable to mount root fs via SMB, trying 
floppy.\n");
+   pr_err("VFS: Unable to mount root fs via SMB, trying 
floppy.\n");
ROOT_DEV = Root_FD0;
}

Re: [PATCH] arm64: dts: qcom: sc7180: Move mss node to the right place

2020-05-20 Thread Sibi Sankar


On 2020-05-21 06:33, Stephen Boyd wrote:
The modem node has an address of 408 and thus should come after tlmm
and before gpu. Move the node to the right place to maintain proper
address sort order.

Cc: Evan Green 
Cc: Sibi Sankar 
Fixes: e14a15eba89a ("arm64: dts: qcom: sc7180: Add Q6V5 MSS node")
Signed-off-by: Stephen Boyd 


Reviewed-by: Sibi Sankar 


---
 arch/arm64/boot/dts/qcom/sc7180.dtsi | 102 +--
 1 file changed, 51 insertions(+), 51 deletions(-)

diff --git a/arch/arm64/boot/dts/qcom/sc7180.dtsi
b/arch/arm64/boot/dts/qcom/sc7180.dtsi
index 6b12c60c37fb..1027ef70f8db 100644
--- a/arch/arm64/boot/dts/qcom/sc7180.dtsi
+++ b/arch/arm64/boot/dts/qcom/sc7180.dtsi
@@ -1459,6 +1459,57 @@ pinconf-sd-cd {
};
};

+   remoteproc_mpss: remoteproc@408 {
+   compatible = "qcom,sc7180-mpss-pas";
+   reg = <0 0x0408 0 0x4040>, <0 0x0418 0 0x48>;
+   reg-names = "qdsp6", "rmb";
+
+   interrupts-extended = <&intc GIC_SPI 266 
IRQ_TYPE_EDGE_RISING>,
+ <&modem_smp2p_in 0 
IRQ_TYPE_EDGE_RISING>,
+ <&modem_smp2p_in 1 
IRQ_TYPE_EDGE_RISING>,
+ <&modem_smp2p_in 2 
IRQ_TYPE_EDGE_RISING>,
+ <&modem_smp2p_in 3 
IRQ_TYPE_EDGE_RISING>,
+ <&modem_smp2p_in 7 
IRQ_TYPE_EDGE_RISING>;
+   interrupt-names = "wdog", "fatal", "ready", "handover",
+ "stop-ack", "shutdown-ack";
+
+   clocks = <&gcc GCC_MSS_CFG_AHB_CLK>,
+<&gcc GCC_MSS_Q6_MEMNOC_AXI_CLK>,
+<&gcc GCC_MSS_NAV_AXI_CLK>,
+<&gcc GCC_MSS_SNOC_AXI_CLK>,
+<&gcc GCC_MSS_MFAB_AXIS_CLK>,
+<&rpmhcc RPMH_CXO_CLK>;
+   clock-names = "iface", "bus", "nav", "snoc_axi",
+ "mnoc_axi", "xo";
+
+   power-domains = <&aoss_qmp AOSS_QMP_LS_MODEM>,
+   <&rpmhpd SC7180_CX>,
+   <&rpmhpd SC7180_MX>,
+   <&rpmhpd SC7180_MSS>;
+   power-domain-names = "load_state", "cx", "mx", "mss";
+
+   memory-region = <&mpss_mem>;
+
+   qcom,smem-states = <&modem_smp2p_out 0>;
+   qcom,smem-state-names = "stop";
+
+   resets = <&aoss_reset AOSS_CC_MSS_RESTART>,
+<&pdc_reset PDC_MODEM_SYNC_RESET>;
+   reset-names = "mss_restart", "pdc_reset";
+
+   qcom,halt-regs = <&tcsr_mutex_regs 0x23000 0x25000 
0x24000>;
+   qcom,spare-regs = <&tcsr_regs 0xb3e4>;
+
+   status = "disabled";
+
+   glink-edge {
+   interrupts = ;
+   label = "modem";
+   qcom,remote-pid = <1>;
+   mboxes = <&apss_shared 12>;
+   };
+   };
+
gpu: gpu@500 {
compatible = "qcom,adreno-618.0", "qcom,adreno";
#stream-id-cells = <16>;
@@ -2054,57 +2105,6 @@ apss_merge_funnel_in: endpoint {
};
};

-   remoteproc_mpss: remoteproc@408 {
-   compatible = "qcom,sc7180-mpss-pas";
-   reg = <0 0x0408 0 0x4040>, <0 0x0418 0 0x48>;
-   reg-names = "qdsp6", "rmb";
-
-   interrupts-extended = <&intc GIC_SPI 266 
IRQ_TYPE_EDGE_RISING>,
- <&modem_smp2p_in 0 
IRQ_TYPE_EDGE_RISING>,
- <&modem_smp2p_in 1 
IRQ_TYPE_EDGE_RISING>,
- <&modem_smp2p_in 2 
IRQ_TYPE_EDGE_RISING>,
- <&modem_smp2p_in 3 
IRQ_TYPE_EDGE_RISING>,
- <&modem_smp2p_in 7 
IRQ_TYPE_EDGE_RISING>;
-   interrupt-names = "wdog", "fatal", "ready", "handover",
- "stop-ack", "shutdown-ack";
-
-   clocks = <&gcc GCC_MSS_CFG_AHB_CLK>,
-<&gcc GCC_MSS_Q6_MEMNOC_AXI_CLK>,
-<&gcc GCC_MSS_NAV_AXI_CLK>,
-<&gcc GCC_MSS_SNOC_AXI_CLK>,
-<&gcc GCC_MSS_MFAB_AXIS_CLK>,
-<&rpmhcc RPMH_CX

[PATCH v5 3/4] mm/memory.c: Add memory read privilege on page fault handling

2020-05-20 Thread Bibo Mao

Add a pte_sw_mkyoung() function that makes the page readable on the
MIPS platform during page fault handling. This patch improves page
fault latency by about 10% on my MIPS machine with the lmbench
lat_pagefault case.

It is a no-op function on other arches, so there is no negative
influence on those architectures.

Signed-off-by: Bibo Mao 
---
 arch/mips/include/asm/pgtable.h |  2 ++
 include/asm-generic/pgtable.h   | 16 
 mm/memory.c |  3 +++
 3 files changed, 21 insertions(+)

diff --git a/arch/mips/include/asm/pgtable.h b/arch/mips/include/asm/pgtable.h
index 5f610ec..9cd811e 100644
--- a/arch/mips/include/asm/pgtable.h
+++ b/arch/mips/include/asm/pgtable.h
@@ -414,6 +414,8 @@ static inline pte_t pte_mkyoung(pte_t pte)
return pte;
 }
 
+#define pte_sw_mkyoung pte_mkyoung
+
 #ifdef CONFIG_MIPS_HUGE_TLB_SUPPORT
 static inline int pte_huge(pte_t pte)  { return pte_val(pte) & _PAGE_HUGE; }
 
diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
index 329b8c8..7dcfa30 100644
--- a/include/asm-generic/pgtable.h
+++ b/include/asm-generic/pgtable.h
@@ -227,6 +227,22 @@ static inline void ptep_set_wrprotect(struct mm_struct 
*mm, unsigned long addres
 }
 #endif
 
+/*
+ * On some architectures hardware does not set page access bit when accessing
+ * memory page, it is responsibility of software setting this bit. It brings
+ * out extra page fault penalty to track page access bit. For optimization page
+ * access bit can be set during all page fault flow on these arches.
+ * To be differentiated from macro pte_mkyoung, this macro is used on platforms
+ * where software maintains page access bit.
+ */
+#ifndef pte_sw_mkyoung
+static inline pte_t pte_sw_mkyoung(pte_t pte)
+{
+   return pte;
+}
+#define pte_sw_mkyoung pte_sw_mkyoung
+#endif
+
 #ifndef pte_savedwrite
 #define pte_savedwrite pte_write
 #endif
diff --git a/mm/memory.c b/mm/memory.c
index 9e2be4a..33d3b4c 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2704,6 +2704,7 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf)
}
flush_cache_page(vma, vmf->address, pte_pfn(vmf->orig_pte));
entry = mk_pte(new_page, vma->vm_page_prot);
+   entry = pte_sw_mkyoung(entry);
entry = maybe_mkwrite(pte_mkdirty(entry), vma);
/*
 * Clear the pte entry and flush it first, before updating the
@@ -3378,6 +3379,7 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
__SetPageUptodate(page);
 
entry = mk_pte(page, vma->vm_page_prot);
+   entry = pte_sw_mkyoung(entry);
if (vma->vm_flags & VM_WRITE)
entry = pte_mkwrite(pte_mkdirty(entry));
 
@@ -3660,6 +3662,7 @@ vm_fault_t alloc_set_pte(struct vm_fault *vmf, struct 
mem_cgroup *memcg,
 
flush_icache_page(vma, page);
entry = mk_pte(page, vma->vm_page_prot);
+   entry = pte_sw_mkyoung(entry);
if (write)
entry = maybe_mkwrite(pte_mkdirty(entry), vma);
/* copy-on-write page */
-- 
1.8.3.1
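
The override mechanism used by this patch (an arch header defines pte_sw_mkyoung, and the generic header supplies a no-op only when nothing was defined) can be tried in isolation. The following is a minimal userspace sketch: the one-field pte_t, MOCK_PAGE_ACCESSED and make_fault_entry() are hypothetical, and the mock pte_mkyoung() is simpler than the real MIPS one.

```c
#include <assert.h>

/* Hypothetical one-field PTE, just enough to show the pattern. */
typedef struct { unsigned long val; } pte_t;

#define MOCK_PAGE_ACCESSED 0x1UL	/* hypothetical bit position */

/* Simplified stand-in for the arch's pte_mkyoung(). */
static inline pte_t pte_mkyoung(pte_t pte)
{
	pte.val |= MOCK_PAGE_ACCESSED;
	return pte;
}

/*
 * "Arch header" part: an arch where software maintains the accessed
 * bit maps the new hook onto pte_mkyoung. Removing this line models
 * an arch that leaves the hook undefined.
 */
#define pte_sw_mkyoung pte_mkyoung

/* "Generic header" part: no-op fallback, used only if not overridden. */
#ifndef pte_sw_mkyoung
static inline pte_t pte_sw_mkyoung(pte_t pte)
{
	return pte;
}
#define pte_sw_mkyoung pte_sw_mkyoung
#endif

/* Models the fault path: build the entry, then apply the hook. */
static pte_t make_fault_entry(unsigned long prot)
{
	pte_t entry = { prot };

	return pte_sw_mkyoung(entry);
}
```

With the arch define in place the accessed bit is set while the entry is built; without it the entry passes through unchanged, which is the no-op behaviour the changelog describes for other arches.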

[PATCH v5 1/4] MIPS: Do not flush tlb page when updating PTE entry

2020-05-20 Thread Bibo Mao

It is not necessary to flush tlb page on all CPUs if suitable PTE
entry exists already during page fault handling, just updating
TLB is fine.

Here redefine flush_tlb_fix_spurious_fault as empty on MIPS system.
V5:
- Define update_mmu_cache function specified on MIPS platform, and
  add page fault smp-race stats info
V4:
- add pte_sw_mkyoung function to implement readable privilege, and
  this function is only in effect on MIPS system.
- add page valid bit judgement in function pte_modify
V3:
- add detailed changelog, modify typo issue in patch V2
v2:
- split flush_tlb_fix_spurious_fault and tlb update into two patches
- comments typo modification
- separate tlb update and add pte readable privilege into two patches

Signed-off-by: Bibo Mao 
---
 arch/mips/include/asm/pgtable.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/mips/include/asm/pgtable.h b/arch/mips/include/asm/pgtable.h
index 9b01d2d..0d625c2 100644
--- a/arch/mips/include/asm/pgtable.h
+++ b/arch/mips/include/asm/pgtable.h
@@ -478,6 +478,8 @@ static inline pgprot_t pgprot_writecombine(pgprot_t _prot)
return __pgprot(prot);
 }
 
+#define flush_tlb_fix_spurious_fault(vma, address) do { } while (0)
+
 /*
  * Conversion functions: convert a page and protection to a page entry,
  * and a page entry and page directory to the page they refer to.
-- 
1.8.3.1

Re: [PATCH v3 4/4] PCI: cadence: Use "dma-ranges" instead of "cdns,no-bar-match-nbits" property

2020-05-20 Thread Kishon Vijay Abraham I

Hi Rob,

On 5/19/2020 10:41 PM, Rob Herring wrote:
> On Fri, May 8, 2020 at 7:07 AM Kishon Vijay Abraham I  wrote:
>>
>> Cadence PCIe core driver (host mode) uses "cdns,no-bar-match-nbits"
>> property to configure the number of bits passed through from PCIe
>> address to internal address in Inbound Address Translation register.
>> This only used the NO MATCH BAR.
>>
>> However standard PCI dt-binding already defines "dma-ranges" to
>> describe the address ranges accessible by PCIe controller. Add support
>> in Cadence PCIe host driver to parse dma-ranges and configure the
>> inbound regions for BAR0, BAR1 and NO MATCH BAR. Cadence IP specifies
>> maximum size for BAR0 as 256GB, maximum size for BAR1 as 2 GB, so if
>> the dma-ranges specifies a size larger than the maximum allowed, the
>> driver will split and configure the BARs.
> 
> Would be useful to know what your dma-ranges contains now.
> 
> 
>> Legacy device tree binding compatibility is maintained by retaining
>> support for "cdns,no-bar-match-nbits".
>>
>> Signed-off-by: Kishon Vijay Abraham I 
>> ---
>>  .../controller/cadence/pcie-cadence-host.c| 141 --
>>  drivers/pci/controller/cadence/pcie-cadence.h |  17 ++-
>>  2 files changed, 141 insertions(+), 17 deletions(-)
>>
>> diff --git a/drivers/pci/controller/cadence/pcie-cadence-host.c 
>> b/drivers/pci/controller/cadence/pcie-cadence-host.c
>> index 6ecebb79057a..2485ecd8434d 100644
>> --- a/drivers/pci/controller/cadence/pcie-cadence-host.c
>> +++ b/drivers/pci/controller/cadence/pcie-cadence-host.c
>> @@ -11,6 +11,12 @@
>>
>>  #include "pcie-cadence.h"
>>
>> +static u64 cdns_rp_bar_max_size[] = {
>> +   [RP_BAR0] = _ULL(128 * SZ_2G),
>> +   [RP_BAR1] = SZ_2G,
>> +   [RP_NO_BAR] = SZ_64T,
>> +};
>> +
>>  void __iomem *cdns_pci_map_bus(struct pci_bus *bus, unsigned int devfn,
>>int where)
>>  {
>> @@ -106,6 +112,117 @@ static int cdns_pcie_host_init_root_port(struct 
>> cdns_pcie_rc *rc)
>> return 0;
>>  }
>>
>> +static void cdns_pcie_host_bar_ib_config(struct cdns_pcie_rc *rc,
>> +enum cdns_pcie_rp_bar bar,
>> +u64 cpu_addr, u32 aperture)
>> +{
>> +   struct cdns_pcie *pcie = &rc->pcie;
>> +   u32 addr0, addr1;
>> +
>> +   addr0 = CDNS_PCIE_AT_IB_RP_BAR_ADDR0_NBITS(aperture) |
>> +   (lower_32_bits(cpu_addr) & GENMASK(31, 8));
>> +   addr1 = upper_32_bits(cpu_addr);
>> +   cdns_pcie_writel(pcie, CDNS_PCIE_AT_IB_RP_BAR_ADDR0(bar), addr0);
>> +   cdns_pcie_writel(pcie, CDNS_PCIE_AT_IB_RP_BAR_ADDR1(bar), addr1);
>> +}
>> +
>> +static int cdns_pcie_host_bar_config(struct cdns_pcie_rc *rc,
>> +struct resource_entry *entry,
>> +enum cdns_pcie_rp_bar *index)
>> +{
>> +   u64 cpu_addr, pci_addr, size, winsize;
>> +   struct cdns_pcie *pcie = &rc->pcie;
>> +   struct device *dev = pcie->dev;
>> +   enum cdns_pcie_rp_bar bar;
>> +   unsigned long flags;
>> +   u32 aperture;
>> +   u32 value;
>> +
>> +   cpu_addr = entry->res->start;
>> +   flags = entry->res->flags;
>> +   pci_addr = entry->res->start - entry->offset;
>> +   size = resource_size(entry->res);
>> +   bar = *index;
>> +
>> +   if (entry->offset) {
>> +   dev_err(dev, "Cannot map PCI addr: %llx to CPU addr: %llx\n",
>> +   pci_addr, cpu_addr);
> 
> Would be a bit more clear to say PCI addr must equal CPU addr.
> 
>> +   return -EINVAL;
>> +   }
>> +
>> +   value = cdns_pcie_readl(pcie, CDNS_PCIE_LM_RC_BAR_CFG);
>> +   while (size > 0) {
>> +   if (bar > RP_NO_BAR) {
>> +   dev_err(dev, "Failed to map inbound regions!\n");
>> +   return -EINVAL;
>> +   }
>> +
>> +   winsize = size;
>> +   if (size > cdns_rp_bar_max_size[bar])
>> +   winsize = cdns_rp_bar_max_size[bar];
>> +
>> +   aperture = ilog2(winsize);
>> +
>> +   cdns_pcie_host_bar_ib_config(rc, bar, cpu_addr, aperture);
>> +
>> +   if (bar == RP_NO_BAR)
>> +   break;
>> +
>> +   if (winsize + cpu_addr >= SZ_4G) {
>> +   if (!(flags & IORESOURCE_PREFETCH))
>> +   value |= LM_RC_BAR_CFG_CTRL_MEM_64BITS(bar);
>> +   value |= LM_RC_BAR_CFG_CTRL_PREF_MEM_64BITS(bar);
>> +   } else {
>> +   if (!(flags & IORESOURCE_PREFETCH))
>> +   value |= LM_RC_BAR_CFG_CTRL_MEM_32BITS(bar);
>> +   value |= LM_RC_BAR_CFG_CTRL_PREF_MEM_32BITS(bar);
>> +   }
>> +
>> +   value |= LM_RC_BAR_CFG_APERTURE(bar, aperture);
>> +
>> +   size -= winsize;
>> +   cpu_addr += winsize;
>> +
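
The window-splitting loop in the quoted hunk can be modelled in userspace to see how one dma-ranges entry is spread over BAR0, BAR1 and the NO MATCH BAR. This is a sketch under assumptions: the quoted hunk is cut off before the point where the BAR index advances, so the bar++ step below is a guess, and struct window, split_range() and ilog2_u64() are hypothetical names. The maximum sizes are taken from cdns_rp_bar_max_size[] in the patch.

```c
#include <assert.h>
#include <stdint.h>

enum rp_bar { RP_BAR0, RP_BAR1, RP_NO_BAR, RP_BAR_COUNT };

#define SZ_2G	(1ULL << 31)
#define SZ_64T	(1ULL << 46)

/* Per-BAR maximum window sizes, as listed in the patch. */
static const uint64_t bar_max_size[RP_BAR_COUNT] = {
	[RP_BAR0]   = 128 * SZ_2G,	/* 256 GiB */
	[RP_BAR1]   = SZ_2G,
	[RP_NO_BAR] = SZ_64T,
};

/* Hypothetical record of one programmed inbound window. */
struct window {
	int bar;
	uint64_t cpu_addr;
	unsigned int aperture;	/* log2 of the window size */
};

/* Floor of log2(v) for v > 0, like the kernel's ilog2(). */
static unsigned int ilog2_u64(uint64_t v)
{
	unsigned int n = 0;

	while (v >>= 1)
		n++;
	return n;
}

/* Splits one range across the BARs; returns the number of windows used. */
static int split_range(uint64_t cpu_addr, uint64_t size, struct window *out)
{
	int bar = RP_BAR0;
	int n = 0;

	while (size > 0) {
		uint64_t winsize = size;

		if (winsize > bar_max_size[bar])
			winsize = bar_max_size[bar];

		out[n].bar = bar;
		out[n].cpu_addr = cpu_addr;
		out[n].aperture = ilog2_u64(winsize);
		n++;

		/* The NO MATCH BAR is the last resort and takes the rest. */
		if (bar == RP_NO_BAR)
			break;

		size -= winsize;
		cpu_addr += winsize;
		bar++;	/* assumption: the truncated hunk advances here */
	}
	return n;
}
```

For example, a 258 GiB range starting at address 0 ends up as a 256 GiB window in BAR0 (aperture 38) followed by a 2 GiB window in BAR1 (aperture 31) starting at 256 GiB. Since ilog2 rounds down, a window size that is not a power of two would get a smaller aperture than its size; whether the driver handles that elsewhere is not visible in the quoted part.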

Re: [tip: locking/kcsan] READ_ONCE: Use data_race() to avoid KCSAN instrumentation

2020-05-20 Thread Nathan Chancellor

On Thu, May 21, 2020 at 12:17:12AM +0200, Borislav Petkov wrote:
> Hi,
> 
> On Tue, May 12, 2020 at 02:36:53PM -, tip-bot2 for Will Deacon wrote:
> > The following commit has been merged into the locking/kcsan branch of tip:
> > 
> > Commit-ID: cdd28ad2d8110099e43527e96d059c5639809680
> > Gitweb:
> > https://git.kernel.org/tip/cdd28ad2d8110099e43527e96d059c5639809680
> > Author:Will Deacon 
> > AuthorDate:Mon, 11 May 2020 21:41:49 +01:00
> > Committer: Thomas Gleixner 
> > CommitterDate: Tue, 12 May 2020 11:04:17 +02:00
> > 
> > READ_ONCE: Use data_race() to avoid KCSAN instrumentation
> > 
> > > Rather than open-code the disabling/enabling of KCSAN across the guts of
> > {READ,WRITE}_ONCE(), defer to the data_race() macro instead.
> > 
> > Signed-off-by: Will Deacon 
> > Signed-off-by: Thomas Gleixner 
> > Acked-by: Peter Zijlstra (Intel) 
> > Cc: Marco Elver 
> > Link: https://lkml.kernel.org/r/20200511204150.27858-18-w...@kernel.org
> 
> so this commit causes a kernel build slowdown depending on the .config
> of between 50% and over 100%. I just bisected locking/kcsan and got
> 
> NOT_OK:   cdd28ad2d811 READ_ONCE: Use data_race() to avoid KCSAN 
> instrumentation
> OK:   88f1be32068d kcsan: Rework data_race() so that it can be used by 
> READ_ONCE()
> 
> with a simple:
> 
> $ git clean -dqfx && mk defconfig
> $ time make -j
> 
> I'm not even booting the kernels - simply checking out the above commits
> and building the target kernels. I.e., something in that commit is
> making gcc go nuts in the compilation phases.
> 
> -- 
> Regards/Gruss,
> Boris.
> 
> https://people.kernel.org/tglx/notes-about-netiquette

For what it's worth, I also noticed the same thing with clang. I only
verified the issue in one of my first build targets, an arm defconfig
build, which regressed from 2.5 minutes to 10+ minutes.

More details available on our issue tracker (Nick did some more
profiling on other configs with both clang and gcc):

https://github.com/ClangBuiltLinux/linux/issues/1032

More than happy to do further triage as time permits. I do note Marco's
message about the upcoming series to eliminate this but it would be nice
if this did not regress in the meantime.

Cheers,
Nathan

[PATCH v5 2/4] mm/memory.c: Update local TLB if PTE entry exists

2020-05-20 Thread Bibo Mao

If two threads concurrently fault at the same address, the thread that
won the race updates the PTE and its local TLB. For now, the other
thread gives up, simply does nothing, and continues.

It could happen that this second thread triggers another fault, whereby
it only updates its local TLB while handling the fault. Instead of
triggering another fault, let's directly update the local TLB of the
second thread.

This is only useful on architectures where software can update the TLB;
it may have some negative effect if update_mmu_cache is also used for
other purposes. Multiple threads seldom access the same page at the same
time, so the negative effect on other arches is limited.

With the specjvm2008 workload, the smp-race pgfault count is about 3% to
4% of the total pgfault count, according to /proc/vmstat.

Signed-off-by: Bibo Mao 
---
 arch/mips/include/asm/pgtable.h | 20 
 mm/memory.c | 27 +++
 2 files changed, 39 insertions(+), 8 deletions(-)

diff --git a/arch/mips/include/asm/pgtable.h b/arch/mips/include/asm/pgtable.h
index 0d625c2..5f610ec 100644
--- a/arch/mips/include/asm/pgtable.h
+++ b/arch/mips/include/asm/pgtable.h
@@ -480,6 +480,26 @@ static inline pgprot_t pgprot_writecombine(pgprot_t _prot)
 
 #define flush_tlb_fix_spurious_fault(vma, address) do { } while (0)
 
+#define __HAVE_ARCH_PTE_SAME
+static inline int pte_same(pte_t pte_a, pte_t pte_b)
+{
+   return pte_val(pte_a) == pte_val(pte_b);
+}
+
+#define __HAVE_ARCH_PTEP_SET_ACCESS_FLAGS
+static inline int ptep_set_access_flags(struct vm_area_struct *vma,
+   unsigned long address, pte_t *ptep,
+   pte_t entry, int dirty)
+{
+   if (!pte_same(*ptep, entry))
+   set_pte_at(vma->vm_mm, address, ptep, entry);
+   /*
+* update_mmu_cache will unconditionally execute, handling both
+* the case that the PTE changed and the spurious fault case.
+*/
+   return true;
+}
+
 /*
  * Conversion functions: convert a page and protection to a page entry,
  * and a page entry and page directory to the page they refer to.
diff --git a/mm/memory.c b/mm/memory.c
index f703fe8..9e2be4a 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2436,10 +2436,9 @@ static inline bool cow_user_page(struct page *dst, 
struct page *src,
if (!likely(pte_same(*vmf->pte, vmf->orig_pte))) {
/*
 * Other thread has already handled the fault
-* and we don't need to do anything. If it's
-* not the case, the fault will be triggered
-* again on the same address.
+* and update local tlb only
 */
+   update_mmu_cache(vma, addr, vmf->pte);
ret = false;
goto pte_unlock;
}
@@ -2463,7 +2462,8 @@ static inline bool cow_user_page(struct page *dst, struct 
page *src,
vmf->pte = pte_offset_map_lock(mm, vmf->pmd, addr, &vmf->ptl);
locked = true;
if (!likely(pte_same(*vmf->pte, vmf->orig_pte))) {
-   /* The PTE changed under us. Retry page fault. */
+   /* The PTE changed under us, update local tlb */
+   update_mmu_cache(vma, addr, vmf->pte);
ret = false;
goto pte_unlock;
}
@@ -2752,6 +2752,7 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf)
new_page = old_page;
page_copied = 1;
} else {
+   update_mmu_cache(vma, vmf->address, vmf->pte);
mem_cgroup_cancel_charge(new_page, memcg, false);
}
 
@@ -2812,6 +2813,7 @@ vm_fault_t finish_mkwrite_fault(struct vm_fault *vmf)
 * pte_offset_map_lock.
 */
if (!pte_same(*vmf->pte, vmf->orig_pte)) {
+   update_mmu_cache(vmf->vma, vmf->address, vmf->pte);
pte_unmap_unlock(vmf->pte, vmf->ptl);
return VM_FAULT_NOPAGE;
}
@@ -2936,6 +2938,7 @@ static vm_fault_t do_wp_page(struct vm_fault *vmf)
vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd,
vmf->address, &vmf->ptl);
if (!pte_same(*vmf->pte, vmf->orig_pte)) {
+   update_mmu_cache(vma, vmf->address, vmf->pte);
unlock_page(vmf->page);
pte_unmap_unlock(vmf->pte, vmf->ptl);
put_page(vmf->page);
@@ -3341,8 +3344,10 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
vma->vm_page_prot));
vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd,

[PATCH v5 4/4] MIPS: mm: add page valid judgement in function pte_modify

2020-05-20 Thread Bibo Mao

If original PTE has _PAGE_ACCESSED bit set, and new pte has no
_PAGE_NO_READ bit set, we can add _PAGE_SILENT_READ bit to enable
page valid bit.

Signed-off-by: Bibo Mao 
---
 arch/mips/include/asm/pgtable.h | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/arch/mips/include/asm/pgtable.h b/arch/mips/include/asm/pgtable.h
index 9cd811e..ef26552 100644
--- a/arch/mips/include/asm/pgtable.h
+++ b/arch/mips/include/asm/pgtable.h
@@ -529,8 +529,11 @@ static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
 #else
 static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
 {
-   return __pte((pte_val(pte) & _PAGE_CHG_MASK) |
-(pgprot_val(newprot) & ~_PAGE_CHG_MASK));
+   pte_val(pte) &= _PAGE_CHG_MASK;
+   pte_val(pte) |= pgprot_val(newprot) & ~_PAGE_CHG_MASK;
+   if ((pte_val(pte) & _PAGE_ACCESSED) && !(pte_val(pte) & _PAGE_NO_READ))
+   pte_val(pte) |= _PAGE_SILENT_READ;
+   return pte;
 }
 #endif
 
-- 
1.8.3.1
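
The rule added to pte_modify() can be checked with plain integers. The sketch below assumes a made-up bit layout: the PAGE_* values and the contents of PAGE_CHG_MASK are hypothetical (the real MIPS values depend on the configuration), and pte_modify_sketch() works on unsigned long instead of pte_t.

```c
#include <assert.h>

/* Hypothetical bit layout, for illustration only. */
#define PAGE_ACCESSED		0x01UL
#define PAGE_NO_READ		0x02UL
#define PAGE_SILENT_READ	0x04UL
#define PAGE_WRITE		0x08UL
#define PAGE_PFN_MASK		(~0xfffUL)

/* Bits that survive a protection change (assumed: PFN and accessed). */
#define PAGE_CHG_MASK		(PAGE_PFN_MASK | PAGE_ACCESSED)

static unsigned long pte_modify_sketch(unsigned long pte,
				       unsigned long newprot)
{
	/* Keep the preserved bits, take everything else from newprot. */
	pte &= PAGE_CHG_MASK;
	pte |= newprot & ~PAGE_CHG_MASK;

	/*
	 * Already accessed and still readable: set the valid bit now
	 * instead of taking one more fault to set it later.
	 */
	if ((pte & PAGE_ACCESSED) && !(pte & PAGE_NO_READ))
		pte |= PAGE_SILENT_READ;
	return pte;
}
```

The three interesting cases are an accessed page that stays readable (the silent-read bit is added), an accessed page whose new protection forbids reads (the bit is not added), and a page that was never accessed (the bit is not added either).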

Re: [PATCH v14 03/11] soc: mediatek: Add basic_clk_name to scp_power_data

2020-05-20 Thread Weiyi Lu

On Mon, 2020-05-18 at 19:52 +0200, Enric Balletbo i Serra wrote:
> Hi Weiyi,
> 
> On 15/5/20 5:35, Weiyi Lu wrote:
> > On Mon, 2020-05-11 at 14:02 +0800, Weiyi Lu wrote:
> >> On Wed, 2020-05-06 at 23:01 +0200, Enric Balletbo i Serra wrote:
> >>> Hi Weiyi,
> >>>
> >>> Thank you for your patch.
> >>>
> >>> On 6/5/20 10:15, Weiyi Lu wrote:
>  Try to stop extending the clk_id or clk_names if there are
>  more and more new BASIC clocks. To get its own clocks by the
>  basic_clk_name of each power domain.
>  And then use basic_clk_name strings for all compatibles, instead of
>  mixing clk_id and clk_name.
> 
>  Signed-off-by: Weiyi Lu 
>  Reviewed-by: Nicolas Boichat 
>  ---
>   drivers/soc/mediatek/mtk-scpsys.c | 134 
>  --
>   1 file changed, 41 insertions(+), 93 deletions(-)
> 
>  diff --git a/drivers/soc/mediatek/mtk-scpsys.c 
>  b/drivers/soc/mediatek/mtk-scpsys.c
>  index f669d37..c9c3cf7 100644
>  --- a/drivers/soc/mediatek/mtk-scpsys.c
>  +++ b/drivers/soc/mediatek/mtk-scpsys.c
>  @@ -78,34 +78,6 @@
>   #define PWR_STATUS_HIF1 BIT(26) /* MT7622 */
>   #define PWR_STATUS_WB   BIT(27) /* MT7622 */
>   
>  -enum clk_id {
>  -CLK_NONE,
>  -CLK_MM,
>  -CLK_MFG,
>  -CLK_VENC,
>  -CLK_VENC_LT,
>  -CLK_ETHIF,
>  -CLK_VDEC,
>  -CLK_HIFSEL,
>  -CLK_JPGDEC,
>  -CLK_AUDIO,
>  -CLK_MAX,
>  -};
>  -
>  -static const char * const clk_names[] = {
>  -NULL,
>  -"mm",
>  -"mfg",
>  -"venc",
>  -"venc_lt",
>  -"ethif",
>  -"vdec",
>  -"hif_sel",
>  -"jpgdec",
>  -"audio",
>  -NULL,
>  -};
>  -
>   #define MAX_CLKS3
>   
>   /**
>  @@ -116,7 +88,7 @@ enum clk_id {
>    * @sram_pdn_bits: The mask for sram power control bits.
>    * @sram_pdn_ack_bits: The mask for sram power control acked bits.
>    * @bus_prot_mask: The mask for single step bus protection.
>  - * @clk_id: The basic clocks required by this power domain.
>  + * @basic_clk_name: The basic clocks required by this power domain.
>    * @caps: The flag for active wake-up action.
>    */
>   struct scp_domain_data {
>  @@ -126,7 +98,7 @@ struct scp_domain_data {
>   u32 sram_pdn_bits;
>   u32 sram_pdn_ack_bits;
>   u32 bus_prot_mask;
>  -enum clk_id clk_id[MAX_CLKS];
>  +const char *basic_clk_name[MAX_CLKS];
> >>>
> >>> I only reviewed v13, so sorry if this was already discussed. I am
> >>> wondering if it would be better to take advantage of the
> >>> devm_clk_bulk_get() function instead of kind of reimplementing the
> >>> same, something like this
> >>>
> >>>   const struct clk_bulk_data *basic_clocks;
> >>>
> >>
> >> I thought it should be const struct clk_bulk_data
> >> basic_clocks[MAX_CLKS]; instead of const struct clk_bulk_data
> >> *basic_clocks; in struct scp_domain_data data type
> >>
>   u8 caps;
>   };
>   
>  @@ -411,12 +383,19 @@ static int scpsys_power_off(struct 
>  generic_pm_domain *genpd)
>   return ret;
>   }
>   
>  -static void init_clks(struct platform_device *pdev, struct clk **clk)
>  +static int init_basic_clks(struct platform_device *pdev, struct clk 
>  **clk,
>  +const char * const *name)
>   {
>   int i;
>   
>  -for (i = CLK_NONE + 1; i < CLK_MAX; i++)
>  -clk[i] = devm_clk_get(&pdev->dev, clk_names[i]);
>  +for (i = 0; i < MAX_CLKS && name[i]; i++) {
>  +clk[i] = devm_clk_get(&pdev->dev, name[i]);
>  +
>  +if (IS_ERR(clk[i]))
>  +return PTR_ERR(clk[i]);
>  +}
> >>>
> >>> You will be able to remove this function, see below ...
> >>>
>  +
>  +return 0;
>   }
>   
>   static struct scp *init_scp(struct platform_device *pdev,
>  @@ -426,9 +405,8 @@ static struct scp *init_scp(struct platform_device 
>  *pdev,
>   {
>   struct genpd_onecell_data *pd_data;
>   struct resource *res;
>  -int i, j;
>  +int i, ret;
>   struct scp *scp;
>  -struct clk *clk[CLK_MAX];
>   
>   scp = devm_kzalloc(&pdev->dev, sizeof(*scp), GFP_KERNEL);
>   if (!scp)
>  @@ -481,8 +459,6 @@ static struct scp *init_scp(struct platform_device 
>  *pdev,
>   
>   pd_data->num_domains = num;
>   
>  -init_clks(pdev, clk);
>  -
>   for (i = 0; i < num; i++) {
>   struct scp
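
The per-domain lookup done by init_basic_clks() in the quoted patch (walk at most MAX_CLKS names, stop at the first NULL name, fail on the first clock that cannot be found) can be modelled in userspace as below. This is only a sketch: struct mock_clk, mock_clk_get() and the clock table are hypothetical and stand in for devm_clk_get(); the -ENOENT return value is an assumption as well.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define MAX_CLKS 3

/* Hypothetical clock object and provider table. */
struct mock_clk {
	const char *name;
};

static struct mock_clk mock_clocks[] = {
	{ "mm" }, { "mfg" }, { "venc" },
};

/* Stand-in for devm_clk_get(): returns NULL when the name is unknown. */
static struct mock_clk *mock_clk_get(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(mock_clocks) / sizeof(mock_clocks[0]); i++)
		if (strcmp(mock_clocks[i].name, name) == 0)
			return &mock_clocks[i];
	return NULL;
}

/*
 * Same loop shape as init_basic_clks() in the patch: the name array
 * is NULL-terminated unless all MAX_CLKS slots are in use.
 */
static int init_basic_clks_sketch(struct mock_clk **clk,
				  const char * const *name)
{
	int i;

	for (i = 0; i < MAX_CLKS && name[i]; i++) {
		clk[i] = mock_clk_get(name[i]);
		if (!clk[i])
			return -ENOENT;
	}
	return 0;
}
```

A domain that lists { "mm", "venc", NULL } gets two clocks and leaves the third slot untouched; a domain that names an unknown clock fails at that entry.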

[PATCH v1 2/2] arm64: dts: imx8mn-ddr4-evk: correct ldo1/ldo2 voltage range

2020-05-20 Thread Robin Gong

Correct ldo1 voltage range from wrong high group(3.0v~3.3v) to low group
(1.6v~1.9v) because the ldo1 should be 1.8v. Actually, two voltage groups
have been supported at bd718x7-regulator driver, hence, just correct the
voltage range to 1.6v~3.3v. For ldo2@0.8v, correct voltage range too.
Otherwise, ldo1 would be kept @3.0v and ldo2@0.9v which violate i.mx8mm
datasheet as the below warning log in kernel:

[0.995524] LDO1: Bringing 180uV into 300-300uV
[0.999196] LDO2: Bringing 80uV into 90-90uV

Signed-off-by: Robin Gong 
---
 arch/arm64/boot/dts/freescale/imx8mn-ddr4-evk.dts | 4 ++--
 arch/arm64/boot/dts/freescale/imx8mn-evk.dts  | 9 +
 2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/boot/dts/freescale/imx8mn-ddr4-evk.dts 
b/arch/arm64/boot/dts/freescale/imx8mn-ddr4-evk.dts
index d07e0e6..a1e5483 100644
--- a/arch/arm64/boot/dts/freescale/imx8mn-ddr4-evk.dts
+++ b/arch/arm64/boot/dts/freescale/imx8mn-ddr4-evk.dts
@@ -113,7 +113,7 @@
 
ldo1_reg: LDO1 {
regulator-name = "LDO1";
-   regulator-min-microvolt = <300>;
+   regulator-min-microvolt = <160>;
regulator-max-microvolt = <330>;
regulator-boot-on;
regulator-always-on;
@@ -121,7 +121,7 @@
 
ldo2_reg: LDO2 {
regulator-name = "LDO2";
-   regulator-min-microvolt = <90>;
+   regulator-min-microvolt = <80>;
regulator-max-microvolt = <90>;
regulator-boot-on;
regulator-always-on;
diff --git a/arch/arm64/boot/dts/freescale/imx8mn-evk.dts 
b/arch/arm64/boot/dts/freescale/imx8mn-evk.dts
index 61f3519..117ff4b 100644
--- a/arch/arm64/boot/dts/freescale/imx8mn-evk.dts
+++ b/arch/arm64/boot/dts/freescale/imx8mn-evk.dts
@@ -13,6 +13,15 @@
compatible = "fsl,imx8mn-evk", "fsl,imx8mn";
 };
 
+&ecspi1 {
+   status = "okay";
+spidev0: spi@0 {
+   compatible = "ge,achc";
+   reg = <0>;
+   spi-max-frequency = <100>;
+   };
+};
+
 &A53_0 {
/delete-property/operating-points-v2;
 };
-- 
2.7.4

[PATCH v1 1/2] arm64: dts: imx8mm-evk: correct ldo1/ldo2 voltage range

2020-05-20 Thread Robin Gong

Correct ldo1 voltage range from wrong high group(3.0v~3.3v) to low group
(1.6v~1.9v) because the ldo1 should be 1.8v. Actually, two voltage groups
have been supported at bd718x7-regulator driver, hence, just correct the
voltage range to 1.6v~3.3v. For ldo2@0.8v, correct voltage range too.
Otherwise, ldo1 would be kept @3.0v and ldo2@0.9v which violate i.mx8mm
datasheet as the below warning log in kernel:

[0.995524] LDO1: Bringing 180uV into 300-300uV
[0.999196] LDO2: Bringing 80uV into 90-90uV

Signed-off-by: Robin Gong 
---
 arch/arm64/boot/dts/freescale/imx8mm-evk.dts | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/boot/dts/freescale/imx8mm-evk.dts 
b/arch/arm64/boot/dts/freescale/imx8mm-evk.dts
index e5ec832..0f1d7f8 100644
--- a/arch/arm64/boot/dts/freescale/imx8mm-evk.dts
+++ b/arch/arm64/boot/dts/freescale/imx8mm-evk.dts
@@ -208,7 +208,7 @@
 
ldo1_reg: LDO1 {
regulator-name = "LDO1";
-   regulator-min-microvolt = <300>;
+   regulator-min-microvolt = <160>;
regulator-max-microvolt = <330>;
regulator-boot-on;
regulator-always-on;
@@ -216,7 +216,7 @@
 
ldo2_reg: LDO2 {
regulator-name = "LDO2";
-   regulator-min-microvolt = <90>;
+   regulator-min-microvolt = <80>;
regulator-max-microvolt = <90>;
regulator-boot-on;
regulator-always-on;
-- 
2.7.4

Re: [PATCH] soc: fsl: qe: Replace one-element array and use struct_size() helper

2020-05-20 Thread Kees Cook

On Wed, May 20, 2020 at 06:52:21PM -0500, Li Yang wrote:
> On Mon, May 18, 2020 at 5:57 PM Kees Cook  wrote:
> > Hm, looking at this code, I see a few other things that need to be
> > fixed:
> >
> > 1) drivers/tty/serial/ucc_uart.c does not do a be32_to_cpu() conversion
> >on the length test (understandably, a little-endian system has never run
> >this code since it's ppc specific), but it's still wrong:
> >
> > if (firmware->header.length != fw->size) {
> >
> >compare to the firmware loader:
> >
> > length = be32_to_cpu(hdr->length);
> >
> > 2) drivers/soc/fsl/qe/qe.c does not perform bounds checking on the
> >per-microcode offsets, so the uploader might send data outside the
> >firmware buffer. Perhaps:
> 
> We do validate the CRC for each microcode, it is unlikely the CRC
> check can pass if the offset or length is not correct.  But you are
> probably right that it will be safer to check the boundary and fail

Right, but a malicious firmware file could still match CRC but trick the
kernel code.

> quicker before we actually start the CRC check.  Will you come up with
> a formal patch or you want us to deal with it?

It sounds like Gustavo will be sending one, though I don't think either
of us have the hardware to test it with, so if you could do that part,
that would be great! :)

-- 
Kees Cook
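
The two issues raised in this thread (a big-endian length field compared
without conversion, and per-microcode offsets used without a bounds check) can
be sketched in user-space C as below. This is only an illustration, not the
eventual kernel patch: read_be32() and ucode_in_bounds() are hypothetical
helper names, and the assumption that a microcode length is counted in 32-bit
words is mine rather than something stated in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for be32_to_cpu(): decode a big-endian 32-bit
 * field byte by byte, so the result is the same on any host endianness.
 */
static uint32_t read_be32(const uint8_t *p)
{
	assert(p != NULL);
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Hypothetical bounds check: does a microcode blob of `count` 32-bit
 * words starting at byte `offset` lie entirely inside a firmware buffer
 * of `fw_size` bytes? The comparison is arranged so that neither
 * offset + length nor count * 4 can overflow.
 */
static bool ucode_in_bounds(size_t fw_size, uint32_t offset, uint32_t count)
{
	if (offset > fw_size)
		return false;
	return count <= (fw_size - offset) / sizeof(uint32_t);
}
```

A loader following this sketch would decode the header fields first and refuse
the file as soon as one blob falls outside the buffer, before any CRC work is
done.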

Re: [PATCH -V2] swap: Reduce lock contention on swap cache from swap slots allocation

2020-05-20 Thread Huang, Ying

Andrew Morton  writes:

> On Wed, 20 May 2020 11:15:02 +0800 Huang Ying  wrote:
>
>> In some swap scalability test, it is found that there are heavy lock
>> contention on swap cache even if we have split one swap cache radix
>> tree per swap device to one swap cache radix tree every 64 MB trunk in
>> commit 4b3ef9daa4fc ("mm/swap: split swap cache into 64MB trunks").
>> 
>> The reason is as follow.  After the swap device becomes fragmented so
>> that there's no free swap cluster, the swap device will be scanned
>> linearly to find the free swap slots.  swap_info_struct->cluster_next
>> is the next scanning base that is shared by all CPUs.  So nearby free
>> swap slots will be allocated for different CPUs.  The probability for
>> multiple CPUs to operate on the same 64 MB trunk is high.  This causes
>> the lock contention on the swap cache.
>> 
>> To solve the issue, in this patch, for SSD swap device, a percpu
>> version next scanning base (cluster_next_cpu) is added.  Every CPU
>> will use its own per-cpu next scanning base.  And after finishing
>> scanning a 64MB trunk, the per-cpu scanning base will be changed to
>> the beginning of another randomly selected 64MB trunk.  In this way,
>> the probability for multiple CPUs to operate on the same 64 MB trunk
>> is reduced greatly.  Thus the lock contention is reduced too.  For
>> HDD, because sequential access is more important for IO performance,
>> the original shared next scanning base is used.
>> 
>> To test the patch, we have run 16-process pmbench memory benchmark on
>> a 2-socket server machine with 48 cores.  One ram disk is configured
>
> What does "ram disk" mean here?  Which drivers(s) are in use and backed
> by what sort of memory?

We use the following kernel command line

memmap=48G!6G memmap=48G!68G

to create 2 DRAM based /dev/pmem disks (48GB each).  Then we use these
ram disks as swap devices.

>> as the swap device per socket.  The pmbench working-set size is much
>> larger than the available memory so that swapping is triggered.  The
>> memory read/write ratio is 80/20 and the accessing pattern is random.
>> In the original implementation, the lock contention on the swap cache
>> is heavy.  The perf profiling data of the lock contention code path is
>> as following,
>> 
>> _raw_spin_lock_irq.add_to_swap_cache.add_to_swap.shrink_page_list:  7.91
>> _raw_spin_lock_irqsave.__remove_mapping.shrink_page_list:   7.11
>> _raw_spin_lock.swapcache_free_entries.free_swap_slot.__swap_entry_free: 2.51
>> _raw_spin_lock_irqsave.swap_cgroup_record.mem_cgroup_uncharge_swap: 1.66
>> _raw_spin_lock_irq.shrink_inactive_list.shrink_lruvec.shrink_node:  1.29
>> _raw_spin_lock.free_pcppages_bulk.drain_pages_zone.drain_pages: 1.03
>> _raw_spin_lock_irq.shrink_active_list.shrink_lruvec.shrink_node:0.93
>> 
>> After applying this patch, it becomes,
>> 
>> _raw_spin_lock.swapcache_free_entries.free_swap_slot.__swap_entry_free: 3.58
>> _raw_spin_lock_irq.shrink_inactive_list.shrink_lruvec.shrink_node:  2.3
>> _raw_spin_lock_irqsave.swap_cgroup_record.mem_cgroup_uncharge_swap: 2.26
>> _raw_spin_lock_irq.shrink_active_list.shrink_lruvec.shrink_node:1.8
>> _raw_spin_lock.free_pcppages_bulk.drain_pages_zone.drain_pages: 1.19
>> 
>> The lock contention on the swap cache is almost eliminated.
>> 
>> And the pmbench score increases 18.5%.  The swapin throughput
>> increases 18.7% from 2.96 GB/s to 3.51 GB/s.  While the swapout
>> throughput increases 18.5% from 2.99 GB/s to 3.54 GB/s.
>
> If this was backed by plain old RAM, can we assume that the performance
> improvement on SSD swap is still good?

We need really fast disk to show the benefit.  I have tried this on 2
Intel P3600 NVMe disks.  The performance improvement is only about 1%.
The improvement should be better on the faster disks, such as Intel
Optane disk.  I will try to find some to test.

> Does the ram disk actually set SWP_SOLIDSTATE?

Yes.  "blk_queue_flag_set(QUEUE_FLAG_NONROT, q)" is called in
drivers/nvdimm/pmem.c.

Best Regards,
Huang, Ying
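
The per-CPU scanning base described in the quoted changelog can be sketched in
user-space C as follows. This is a simplified model under stated assumptions,
not the patch itself: struct swap_dev_demo and next_scan_base() are
hypothetical names, a 4 KB slot size is assumed (so a 64 MB trunk holds 1 << 14
slots), and rand() stands in for whatever random source the kernel uses.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define SLOTS_PER_TRUNK (1u << 14)	/* 64 MB trunk, assuming 4 KB slots */
#define NR_CPUS_DEMO 4			/* hypothetical CPU count */

/* Hypothetical, stripped-down model of a swap device. */
struct swap_dev_demo {
	uint32_t nr_slots;				/* total slots on the device */
	uint32_t cluster_next_cpu[NR_CPUS_DEMO];	/* per-CPU scan base */
};

/*
 * Return the slot this CPU should scan next and advance its private base.
 * When the base reaches the end of its 64 MB trunk (or the end of the
 * device), jump to the start of a randomly chosen trunk, so that
 * different CPUs tend to work in different trunks.
 */
static uint32_t next_scan_base(struct swap_dev_demo *dev, unsigned int cpu)
{
	uint32_t cur, next, nr_trunks;

	assert(cpu < NR_CPUS_DEMO && dev->nr_slots >= SLOTS_PER_TRUNK);

	cur = dev->cluster_next_cpu[cpu];
	next = cur + 1;
	if (next % SLOTS_PER_TRUNK == 0 || next >= dev->nr_slots) {
		nr_trunks = dev->nr_slots / SLOTS_PER_TRUNK;
		next = ((uint32_t)rand() % nr_trunks) * SLOTS_PER_TRUNK;
	}
	dev->cluster_next_cpu[cpu] = next;
	return cur;
}
```

Because each CPU only touches its own entry of cluster_next_cpu[], two CPUs
share a trunk only when the random choice happens to coincide, which is the
effect the changelog attributes the reduced contention to.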

Re: [PATCH v6 05/12] mmap locking API: convert mmap_sem call sites missed by coccinelle

2020-05-20 Thread Andrew Morton

On Tue, 19 May 2020 22:29:01 -0700 Michel Lespinasse  wrote:

> Convert the last few remaining mmap_sem rwsem calls to use the new
> mmap locking API. These were missed by coccinelle for some reason
> (I think coccinelle does not support some of the preprocessor
> constructs in these files ?)


From: Andrew Morton 
Subject: mmap-locking-api-convert-mmap_sem-call-sites-missed-by-coccinelle-fix

convert linux-next leftovers

Cc: Michel Lespinasse 
Cc: Daniel Jordan 
Cc: Laurent Dufour 
Cc: Vlastimil Babka 
Cc: Davidlohr Bueso 
Cc: David Rientjes 
Cc: Hugh Dickins 
Cc: Jason Gunthorpe 
Cc: Jerome Glisse 
Cc: John Hubbard 
Cc: Liam Howlett 
Cc: Matthew Wilcox 
Cc: Peter Zijlstra 
Cc: Ying Han 
Signed-off-by: Andrew Morton 
---

 arch/arm64/kvm/mmu.c |   14 +++---
 lib/test_hmm.c   |   14 +++---
 2 files changed, 14 insertions(+), 14 deletions(-)

--- a/lib/test_hmm.c~mmap-locking-api-convert-mmap_sem-call-sites-missed-by-coccinelle-fix
+++ a/lib/test_hmm.c
@@ -243,9 +243,9 @@ static int dmirror_range_fault(struct dm
}
 
range->notifier_seq = mmu_interval_read_begin(range->notifier);
-   down_read(&mm->mmap_sem);
+   mmap_read_lock(mm);
ret = hmm_range_fault(range);
-   up_read(&mm->mmap_sem);
+   mmap_read_unlock(mm);
if (ret) {
if (ret == -EBUSY)
continue;
@@ -684,7 +684,7 @@ static int dmirror_migrate(struct dmirro
if (!mmget_not_zero(mm))
return -EINVAL;
 
-   down_read(&mm->mmap_sem);
+   mmap_read_lock(mm);
for (addr = start; addr < end; addr = next) {
vma = find_vma(mm, addr);
if (!vma || addr < vma->vm_start ||
@@ -711,7 +711,7 @@ static int dmirror_migrate(struct dmirro
dmirror_migrate_finalize_and_map(&args, dmirror);
migrate_vma_finalize(&args);
}
-   up_read(&mm->mmap_sem);
+   mmap_read_unlock(mm);
mmput(mm);
 
/* Return the migrated data for verification. */
@@ -731,7 +731,7 @@ static int dmirror_migrate(struct dmirro
return ret;
 
 out:
-   up_read(&mm->mmap_sem);
+   mmap_read_unlock(mm);
mmput(mm);
return ret;
 }
@@ -823,9 +823,9 @@ static int dmirror_range_snapshot(struct
 
range->notifier_seq = mmu_interval_read_begin(range->notifier);
 
-   down_read(&mm->mmap_sem);
+   mmap_read_lock(mm);
ret = hmm_range_fault(range);
-   up_read(&mm->mmap_sem);
+   mmap_read_unlock(mm);
if (ret) {
if (ret == -EBUSY)
continue;
--- a/arch/arm64/kvm/mmu.c~mmap-locking-api-convert-mmap_sem-call-sites-missed-by-coccinelle-fix
+++ a/arch/arm64/kvm/mmu.c
@@ -1084,7 +1084,7 @@ void stage2_unmap_vm(struct kvm *kvm)
int idx;
 
idx = srcu_read_lock(&kvm->srcu);
-   down_read(&current->mm->mmap_sem);
+   mmap_read_lock(current->mm);
spin_lock(&kvm->mmu_lock);
 
slots = kvm_memslots(kvm);
@@ -1092,7 +1092,7 @@ void stage2_unmap_vm(struct kvm *kvm)
stage2_unmap_memslot(kvm, memslot);
 
spin_unlock(&kvm->mmu_lock);
-   up_read(&current->mm->mmap_sem);
+   mmap_read_unlock(current->mm);
srcu_read_unlock(&kvm->srcu, idx);
 }
 
@@ -1848,11 +1848,11 @@ static int user_mem_abort(struct kvm_vcp
}
 
/* Let's check if we will get back a huge page backed by hugetlbfs */
-   down_read(&current->mm->mmap_sem);
+   mmap_read_lock(current->mm);
vma = find_vma_intersection(current->mm, hva, hva + 1);
if (unlikely(!vma)) {
kvm_err("Failed to find VMA for hva 0x%lx\n", hva);
-   up_read(&current->mm->mmap_sem);
+   mmap_read_unlock(current->mm);
return -EFAULT;
}
 
@@ -1879,7 +1879,7 @@ static int user_mem_abort(struct kvm_vcp
if (vma_pagesize == PMD_SIZE ||
(vma_pagesize == PUD_SIZE && kvm_stage2_has_pmd(kvm)))
gfn = (fault_ipa & huge_page_mask(hstate_vma(vma))) >> PAGE_SHIFT;
-   up_read(&current->mm->mmap_sem);
+   mmap_read_unlock(current->mm);
 
/* We need minimum second+third level pages */
ret = mmu_topup_memory_cache(memcache, kvm_mmu_cache_min_pages(kvm),
@@ -2456,7 +2456,7 @@ int kvm_arch_prepare_memory_region(struc
(kvm_phys_size(kvm) >> PAGE_SHIFT))
return -EFAULT;
 
-   down_read(&current->mm->mmap_sem);
+   mmap_read_lock(current->mm);
/*
 * A memory region could potentially cover multiple VMAs, and any holes
 * between them, so iterate over all of them to find out if we can map
@@ -2515,7 +2515,7 @@ int kvm_arch_prepare_memory_region(struc
stage2_flush_memslot(kvm, memslot);
spin_unlock(&kvm->mmu_lock);
 out:
-   up_read(&current->mm->m
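
The conversion in the hunks above is mechanical: every down_read()/up_read()
on mm->mmap_sem becomes a call to a wrapper that takes the mm itself. A minimal
user-space sketch of that wrapper pattern is shown here. The types and the
*_demo functions are hypothetical stand-ins (a plain reader counter instead of
a real rw_semaphore); only the wrapper names mirror the ones used in the patch.

```c
#include <assert.h>

/* Hypothetical stand-in for struct rw_semaphore: just counts readers. */
struct rwsem_demo {
	int readers;
};

/* Hypothetical stand-in for struct mm_struct. */
struct mm_demo {
	struct rwsem_demo mmap_sem;
};

static void down_read_demo(struct rwsem_demo *sem)
{
	sem->readers++;
}

static void up_read_demo(struct rwsem_demo *sem)
{
	assert(sem->readers > 0);	/* unlock without a matching lock */
	sem->readers--;
}

/*
 * Wrappers in the style of the mmap locking API: callers pass the mm and
 * never name the lock field, so the field can later be renamed or
 * replaced without touching the call sites again.
 */
static inline void mmap_read_lock(struct mm_demo *mm)
{
	down_read_demo(&mm->mmap_sem);
}

static inline void mmap_read_unlock(struct mm_demo *mm)
{
	up_read_demo(&mm->mmap_sem);
}
```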

Re: [PATCH v6 12/12] mmap locking API: convert mmap_sem comments

2020-05-20 Thread Andrew Morton

On Tue, 19 May 2020 22:29:08 -0700 Michel Lespinasse  wrote:

> Convert comments that reference mmap_sem to reference mmap_lock instead.

This may not be complete..

From: Andrew Morton 
Subject: mmap-locking-api-convert-mmap_sem-comments-fix

fix up linux-next leftovers

Cc: Daniel Jordan 
Cc: Davidlohr Bueso 
Cc: David Rientjes 
Cc: Hugh Dickins 
Cc: Jason Gunthorpe 
Cc: Jerome Glisse 
Cc: John Hubbard 
Cc: Laurent Dufour 
Cc: Liam Howlett 
Cc: Matthew Wilcox 
Cc: Michel Lespinasse 
Cc: Peter Zijlstra 
Cc: Vlastimil Babka 
Cc: Ying Han 
Signed-off-by: Andrew Morton 
---

 arch/powerpc/mm/fault.c |2 +-
 include/linux/pgtable.h |6 +++---
 2 files changed, 4 insertions(+), 4 deletions(-)

--- a/arch/powerpc/mm/fault.c~mmap-locking-api-convert-mmap_sem-comments-fix
+++ a/arch/powerpc/mm/fault.c
@@ -138,7 +138,7 @@ static noinline int bad_access_pkey(stru
 * 2. T1   : set AMR to deny access to pkey=4, touches, page
 * 3. T1   : faults...
 * 4.T2: mprotect_key(foo, PAGE_SIZE, pkey=5);
-* 5. T1   : enters fault handler, takes mmap_sem, etc...
+* 5. T1   : enters fault handler, takes mmap_lock, etc...
 * 6. T1   : reaches here, sees vma_pkey(vma)=5, when we really
 *   faulted on a pte with its pkey=4.
 */
--- a/include/linux/pgtable.h~mmap-locking-api-convert-mmap_sem-comments-fix
+++ a/include/linux/pgtable.h
@@ -1101,11 +1101,11 @@ static inline pmd_t pmd_read_atomic(pmd_
 #endif
 /*
  * This function is meant to be used by sites walking pagetables with
- * the mmap_sem hold in read mode to protect against MADV_DONTNEED and
+ * the mmap_lock held in read mode to protect against MADV_DONTNEED and
  * transhuge page faults. MADV_DONTNEED can convert a transhuge pmd
  * into a null pmd and the transhuge page fault can convert a null pmd
  * into an hugepmd or into a regular pmd (if the hugepage allocation
- * fails). While holding the mmap_sem in read mode the pmd becomes
+ * fails). While holding the mmap_lock in read mode the pmd becomes
  * stable and stops changing under us only if it's not null and not a
  * transhuge pmd. When those races occurs and this function makes a
  * difference vs the standard pmd_none_or_clear_bad, the result is
@@ -1115,7 +1115,7 @@ static inline pmd_t pmd_read_atomic(pmd_
  *
  * For 32bit kernels with a 64bit large pmd_t this automatically takes
  * care of reading the pmd atomically to avoid SMP race conditions
- * against pmd_populate() when the mmap_sem is hold for reading by the
+ * against pmd_populate() when the mmap_lock is hold for reading by the
  * caller (a special atomic read not done by "gcc" as in the generic
  * version above, is also needed when THP is disabled because the page
  * fault can populate the pmd from under us).
_

Re: [PATCH] arm64/cpufeature: Move BUG_ON() inside get_arm64_ftr_reg()

2020-05-20 Thread Anshuman Khandual




On 05/20/2020 11:09 PM, Will Deacon wrote:
> On Wed, May 20, 2020 at 04:47:11PM +0100, Catalin Marinas wrote:
>> On Wed, May 20, 2020 at 01:20:13PM +0100, Will Deacon wrote:
>>> On Wed, May 20, 2020 at 06:52:54AM +0530, Anshuman Khandual wrote:
>>>> There is no way to proceed when requested register could not be searched in
>>>> arm64_ftr_reg[]. Requesting for a non present register would be an error as
>>>> well. Hence lets just BUG_ON() when the search fails in get_arm64_ftr_reg()
>>>> rather than checking for return value and doing the same in some individual
>>>> callers.
>>>>
>>>> But there are some callers that dont BUG_ON() upon search failure. It adds
>>>> an argument 'failsafe' that provides required switch between callers based
>>>> on whether they could proceed or not.
>>>>
>>>> Cc: Catalin Marinas 
>>>> Cc: Will Deacon 
>>>> Cc: Suzuki K Poulose 
>>>> Cc: Mark Brown 
>>>> Cc: linux-arm-ker...@lists.infradead.org
>>>> Cc: linux-kernel@vger.kernel.org
>>>>
>>>> Signed-off-by: Anshuman Khandual 
>>>> ---
>>>> Applies on next-20200518 that has recent cpufeature changes from Will.
>>>>
>>>>  arch/arm64/kernel/cpufeature.c | 26 +-
>>>>  1 file changed, 13 insertions(+), 13 deletions(-)
>>>>
>>>> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
>>>> index bc5048f152c1..62767cc540c3 100644
>>>> --- a/arch/arm64/kernel/cpufeature.c
>>>> +++ b/arch/arm64/kernel/cpufeature.c
>>>> @@ -557,7 +557,7 @@ static int search_cmp_ftr_reg(const void *id, const void *regp)
>>>>   * - NULL on failure. It is upto the caller to decide
>>>>   * the impact of a failure.
>>>>   */
>>>> -static struct arm64_ftr_reg *get_arm64_ftr_reg(u32 sys_id)
>>>> +static struct arm64_ftr_reg *get_arm64_ftr_reg(u32 sys_id, bool failsafe)
>>>
>>> Generally, I'm not a big fan of boolean arguments because they are really
>>> opaque at the callsite. It also seems bogus to me that we don't trust the
>>> caller to pass a valid sys_id, but we trust it to get "failsafe" right,
>>> which seems to mean "I promise to check the result isn't NULL before
>>> dereferencing it."
>>>
>>> So I don't see how this patch improves anything. I'd actually be more
>>> inclined to stick a WARN() in get_arm64_ftr_reg() when it returns NULL and
>>> have the callers handle NULL by returning early, getting rid of all the
>>> BUG_ONs in here. Sure, the system might end up in a funny state, but we
>>> WARN()d about it and tried to keep going (and Linus has some strong opinions
>>> on this too).
>>
>> Such WARN can be triggered by the user via emulate_sys_reg(), so we
>> can't really have it in get_arm64_ftr_reg() without a 'failsafe' option.
> 
> Ah yes, that would be bad. In which case, I don't think the existing code
> should change.

The existing code has BUG_ON() in three different callers doing exactly the
same thing, which can easily be taken care of in get_arm64_ftr_reg() itself. As
mentioned before, an enum variable (as preferred over a bool) can still
preserve the existing behavior for emulate_sys_reg().

IMHO these are very good reasons for us to change the code, which will make
it cleaner while also removing three redundant BUG_ON() instances. Hence I
would request you to please reconsider this proposal.

- Anshuman

[PATCH] [v2] PCI: tegra194: Fix runtime PM imbalance on error

2020-05-20 Thread Dinghao Liu

pm_runtime_get_sync() increments the runtime PM usage counter even
when it returns an error code. Thus a pairing decrement is needed on
the error handling path to keep the counter balanced.

Signed-off-by: Dinghao Liu 
---
 drivers/pci/controller/dwc/pcie-tegra194.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/pci/controller/dwc/pcie-tegra194.c b/drivers/pci/controller/dwc/pcie-tegra194.c
index ae30a2fd3716..2c0d2ce16b47 100644
--- a/drivers/pci/controller/dwc/pcie-tegra194.c
+++ b/drivers/pci/controller/dwc/pcie-tegra194.c
@@ -1623,7 +1623,7 @@ static int tegra_pcie_config_rp(struct tegra_pcie_dw *pcie)
ret = pinctrl_pm_select_default_state(dev);
if (ret < 0) {
dev_err(dev, "Failed to configure sideband pins: %d\n", ret);
-   goto fail_pinctrl;
+   goto fail_pm_get_sync;
}
 
tegra_pcie_init_controller(pcie);
@@ -1650,9 +1650,8 @@ static int tegra_pcie_config_rp(struct tegra_pcie_dw *pcie)
 
 fail_host_init:
tegra_pcie_deinit_controller(pcie);
-fail_pinctrl:
-   pm_runtime_put_sync(dev);
 fail_pm_get_sync:
+   pm_runtime_put_sync(dev);
pm_runtime_disable(dev);
return ret;
 }
-- 
2.17.1
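
The imbalance this patch fixes can be modelled in a few lines of user-space C.
The sketch below is an illustration only: struct dev_demo and the *_demo
functions are hypothetical mocks, not the runtime PM API. The one property
taken from the commit message is that the "get" call bumps the usage counter
even when it returns an error, so every exit path after it needs a matching
"put".

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical mock device: a usage counter plus two failure switches. */
struct dev_demo {
	int usage_count;
	int fail_resume;	/* make the "get" call fail */
	int fail_pinctrl;	/* make the pin configuration step fail */
};

/* Mock "get": increments the counter even when it reports an error. */
static int pm_get_sync_demo(struct dev_demo *dev)
{
	dev->usage_count++;
	return dev->fail_resume ? -EIO : 0;
}

/* Mock "put": drops one reference. */
static void pm_put_sync_demo(struct dev_demo *dev)
{
	assert(dev->usage_count > 0);
	dev->usage_count--;
}

/*
 * Error handling in the shape the patch ends up with: both failure
 * points jump to one label, and that label performs the decrement.
 */
static int config_rp_demo(struct dev_demo *dev)
{
	int ret;

	ret = pm_get_sync_demo(dev);
	if (ret < 0)
		goto fail_pm_get_sync;

	ret = dev->fail_pinctrl ? -EINVAL : 0;
	if (ret < 0)
		goto fail_pm_get_sync;

	return 0;	/* success: the reference is kept on purpose */

fail_pm_get_sync:
	pm_put_sync_demo(dev);
	return ret;
}
```

With the decrement placed before the label instead of after it, the first
failure path would leave usage_count at 1, which is the leak described in the
commit message.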

Re: [PATCH bpf] security: Fix hook iteration for secid_to_secctx

2020-05-20 Thread Alexei Starovoitov

On Wed, May 20, 2020 at 7:02 PM James Morris  wrote:
>
> On Wed, 20 May 2020, Alexei Starovoitov wrote:
>
> > On Wed, May 20, 2020 at 8:15 AM Casey Schaufler  
> > wrote:
> > >
> > >
> > > On 5/20/2020 5:56 AM, KP Singh wrote:
> > > > From: KP Singh 
> > > >
> > > > secid_to_secctx is not stackable, and since the BPF LSM registers this
> > > > hook by default, the call_int_hook logic is not suitable which
> > > > "bails-on-fail" and causes issues when other LSMs register this hook and
> > > > eventually breaks Audit.
> > > >
> > > > In order to fix this, directly iterate over the security hooks instead
> > > > of using call_int_hook as suggested in:
> > > >
> > > > https://lore.kernel.org/bpf/9d0eb6c6-803a-ff3a-5603-9ad6d9edf...@schaufler-ca.com/#t
> > > >
> > > > Fixes: 98e828a0650f ("security: Refactor declaration of LSM hooks")
> > > > Fixes: 625236ba3832 ("security: Fix the default value of 
> > > > secid_to_secctx hook"
> > > > Reported-by: Alexei Starovoitov 
> > > > Signed-off-by: KP Singh 
> > >
> > > This looks fine.
> >
> > Tested. audit works now.
> > I fixed missing ')' in the commit log
> > and applied to bpf tree.
> > It will be on the way to Linus tree soon.
>
> Please add:
>
>
> Acked-by: James Morris 

Thank you. Done.

Re: [PATCH] arm64/cpufeature: Move BUG_ON() inside get_arm64_ftr_reg()

2020-05-20 Thread Anshuman Khandual




On 05/20/2020 05:50 PM, Will Deacon wrote:
> Hi Anshuman,
> 
> On Wed, May 20, 2020 at 06:52:54AM +0530, Anshuman Khandual wrote:
>> There is no way to proceed when requested register could not be searched in
>> arm64_ftr_reg[]. Requesting for a non present register would be an error as
>> well. Hence lets just BUG_ON() when the search fails in get_arm64_ftr_reg()
>> rather than checking for return value and doing the same in some individual
>> callers.
>>
>> But there are some callers that dont BUG_ON() upon search failure. It adds
>> an argument 'failsafe' that provides required switch between callers based
>> on whether they could proceed or not.
>>
>> Cc: Catalin Marinas 
>> Cc: Will Deacon 
>> Cc: Suzuki K Poulose 
>> Cc: Mark Brown 
>> Cc: linux-arm-ker...@lists.infradead.org
>> Cc: linux-kernel@vger.kernel.org
>>
>> Signed-off-by: Anshuman Khandual 
>> ---
>> Applies on next-20200518 that has recent cpufeature changes from Will.
>>
>>  arch/arm64/kernel/cpufeature.c | 26 +-
>>  1 file changed, 13 insertions(+), 13 deletions(-)
>>
>> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
>> index bc5048f152c1..62767cc540c3 100644
>> --- a/arch/arm64/kernel/cpufeature.c
>> +++ b/arch/arm64/kernel/cpufeature.c
>> @@ -557,7 +557,7 @@ static int search_cmp_ftr_reg(const void *id, const void *regp)
>>   * - NULL on failure. It is upto the caller to decide
>>   *   the impact of a failure.
>>   */
>> -static struct arm64_ftr_reg *get_arm64_ftr_reg(u32 sys_id)
>> +static struct arm64_ftr_reg *get_arm64_ftr_reg(u32 sys_id, bool failsafe)
> 
> Generally, I'm not a big fan of boolean arguments because they are really
> opaque at the callsite. It also seems bogus to me that we don't trust the

If preferred, we could replace with an enum variable here with some
more context e.g

enum ftr_reg_search {
FTR_REG_SEARCH_SAFE,
FTR_REG_SEARCH_UNSAFE,
};

> caller to pass a valid sys_id, but we trust it to get "failsafe" right,

If we really trust the callers, then why are the BUG_ON() checks present in
the first place? It is always prudent to protect against the unexpected.

> which seems to mean "I promise to check the result isn't NULL before
> dereferencing it."

Not sure I got this. Do you mean all the present BUG_ON() are trying to
check that the returned arm64_ftr_reg is valid before dereferencing it? If
there is real trust in the callers that a non-present sys_id will never
get requested, then none of the present BUG_ON() instances should be there.

Either we trust the callers and drop all BUG_ON() and WARN_ON() instances,
or we don't and consolidate the BUG_ON() and WARN_ON() instances appropriately.

> 
> So I don't see how this patch improves anything. I'd actually be more

It consolidates multiple BUG_ON() in various callers which are not really
required. Code consolidation and reduction especially BUG_ON() instances,
is invariably a good thing.

> inclined to stick a WARN() in get_arm64_ftr_reg() when it returns NULL and

AFAICS emulate_sys_reg(), where the user can request non-present sys_id
registers that eventually get emulated, should not trigger a WARN_ON()
as the caller did not do anything wrong.

> have the callers handle NULL by returning early, getting rid of all the
> BUG_ONs in here. Sure, the system might end up in a funny state, but we
> WARN()d about it and tried to keep going (and Linus has some strong opinions
> on this too).

Sure, we could go with a WARN_ON() instead, if acceptable and preferred.
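
To make the enum proposal from this reply concrete, here is a user-space
sketch of a lookup that takes the search mode as an explicit argument. It is
an illustration under assumptions, not the proposed patch: the table, struct
ftr_reg_demo and get_ftr_reg_demo() are hypothetical, abort() stands in for
BUG_ON(), and the reading of the two enum values (SAFE means the caller
handles NULL, UNSAFE means the caller cannot proceed without the register) is
my interpretation, since the thread does not spell it out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

enum ftr_reg_search {
	FTR_REG_SEARCH_SAFE,	/* assumed: caller checks for NULL itself */
	FTR_REG_SEARCH_UNSAFE,	/* assumed: a miss is fatal for the caller */
};

/* Hypothetical register table entry. */
struct ftr_reg_demo {
	uint32_t sys_id;
	const char *name;
};

/* Must stay sorted by sys_id because the lookup uses bsearch(). */
static const struct ftr_reg_demo demo_regs[] = {
	{ 0x10, "REG_A" },
	{ 0x20, "REG_B" },
	{ 0x30, "REG_C" },
};

/* bsearch() comparator: the key is a sys_id, the element a table entry. */
static int search_cmp_demo(const void *id, const void *regp)
{
	uint32_t key = *(const uint32_t *)id;
	const struct ftr_reg_demo *reg = regp;

	return (key > reg->sys_id) - (key < reg->sys_id);
}

static const struct ftr_reg_demo *
get_ftr_reg_demo(uint32_t sys_id, enum ftr_reg_search mode)
{
	const struct ftr_reg_demo *reg;

	reg = bsearch(&sys_id, demo_regs,
		      sizeof(demo_regs) / sizeof(demo_regs[0]),
		      sizeof(demo_regs[0]), search_cmp_demo);

	/* Single place for the fatal check instead of one per caller. */
	if (!reg && mode == FTR_REG_SEARCH_UNSAFE)
		abort();

	return reg;
}
```

Compared with a bare bool, the enum value reads unambiguously at the call
site, which addresses the opacity concern quoted above, although it does not
by itself settle whether the fatal check belongs in the lookup at all.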

Re: [PATCH v3 01/16] spi: dw: Add Tx/Rx finish wait methods to the MID DMA

2020-05-20 Thread Feng Tang

Hi Serge,

On Thu, May 21, 2020 at 04:21:51AM +0300, Serge Semin wrote:
> Since DMA transfers are performed asynchronously with actual SPI
> transaction, then even if DMA transfers are finished it doesn't mean
> all data is actually pushed to the SPI bus. Some data might still be
> in the controller FIFO. This is specifically true for Tx-only
> transfers. In this case if the next SPI transfer is recharged while
> a tail of the previous one is still in FIFO, we'll lose that tail
> data. In order to fix this let's add the wait procedure of the Tx/Rx
> SPI transfers completion after the corresponding DMA transactions
> are finished.
> 
> Co-developed-by: Georgy Vlasov 
> Signed-off-by: Georgy Vlasov 
> Signed-off-by: Serge Semin 
> Fixes: 7063c0d942a1 ("spi/dw_spi: add DMA support")
> Cc: Ramil Zaripov 
> Cc: Alexey Malahov 
> Cc: Thomas Bogendoerfer 
> Cc: Paul Burton 
> Cc: Ralf Baechle 
> Cc: Arnd Bergmann 
> Cc: Andy Shevchenko 
> Cc: Rob Herring 
> Cc: linux-m...@vger.kernel.org
> Cc: devicet...@vger.kernel.org
> 
> ---
> 
> Changelog v2:
> - Use conditional statement instead of the ternary operator in the ref
>   clock getter.
> - Move the patch to the head of the series so one could be picked up to
>   the stable kernels as a fix.
> 
> Changelog v3:
> - Use spi_delay_exec() method to wait for the current operation completion.
> ---
>  drivers/spi/spi-dw-mid.c | 69 
>  drivers/spi/spi-dw.h | 10 ++
>  2 files changed, 79 insertions(+)
> 
> diff --git a/drivers/spi/spi-dw-mid.c b/drivers/spi/spi-dw-mid.c
> index f9757a370699..3526b196a7fc 100644
> --- a/drivers/spi/spi-dw-mid.c
> +++ b/drivers/spi/spi-dw-mid.c
> @@ -17,6 +17,7 @@
>  #include 
>  #include 
>  
> +#define WAIT_RETRIES 5
>  #define RX_BUSY  0
>  #define TX_BUSY  1
>  
> @@ -143,6 +144,47 @@ static enum dma_slave_buswidth convert_dma_width(u32 dma_width) {
>   return DMA_SLAVE_BUSWIDTH_UNDEFINED;
>  }
>  
> +static void dw_spi_dma_calc_delay(struct dw_spi *dws, u32 nents,
> +   struct spi_delay *delay)
> +{
> + unsigned long ns, us;
> +
> + ns = (NSEC_PER_SEC / spi_get_clk(dws)) * nents * dws->n_bytes *
> +  BITS_PER_BYTE;
> +
> + if (ns <= NSEC_PER_USEC) {
> + delay->unit = SPI_DELAY_UNIT_NSECS;
> + delay->value = ns;
> + } else {
> + us = DIV_ROUND_UP(ns, NSEC_PER_USEC);
> + delay->unit = SPI_DELAY_UNIT_USECS;
> + delay->value = clamp_val(us, 0, USHRT_MAX);
> + }
> +}
> +
> +static inline bool dw_spi_dma_tx_busy(struct dw_spi *dws)
> +{
> + return !(dw_readl(dws, DW_SPI_SR) & SR_TF_EMPT);
> +}
> +
> +static void dw_spi_dma_wait_tx_done(struct dw_spi *dws)
> +{
> + int retry = WAIT_RETRIES;
> + struct spi_delay delay;
> + u32 nents;
> +
> + nents = dw_readl(dws, DW_SPI_TXFLR);
> + dw_spi_dma_calc_delay(dws, nents, &delay);
> +
> + while (dw_spi_dma_tx_busy(dws) && retry--)
> + spi_delay_exec(&delay, NULL);
> +
> + if (retry < 0) {
> + dev_err(&dws->master->dev, "Tx hanged up\n");
> + dws->master->cur_msg->status = -EIO;
> + }
> +}
> +
>  /*
>   * dws->dma_chan_busy is set before the dma transfer starts, callback for tx
>   * channel will clear a corresponding bit.
> @@ -151,6 +193,8 @@ static void dw_spi_dma_tx_done(void *arg)
>  {
>   struct dw_spi *dws = arg;
>  
> + dw_spi_dma_wait_tx_done(dws);
> +
>   clear_bit(TX_BUSY, &dws->dma_chan_busy);
>   if (test_bit(RX_BUSY, &dws->dma_chan_busy))
>   return;
> @@ -192,6 +236,29 @@ static struct dma_async_tx_descriptor *dw_spi_dma_prepare_tx(struct dw_spi *dws,
>   return txdesc;
>  }
>  
> +static inline bool dw_spi_dma_rx_busy(struct dw_spi *dws)
> +{
> + return !!(dw_readl(dws, DW_SPI_SR) & SR_RF_NOT_EMPT);
> +}
> +
> +static void dw_spi_dma_wait_rx_done(struct dw_spi *dws)
> +{
> + int retry = WAIT_RETRIES;
> + struct spi_delay delay;
> + u32 nents;
> +
> + nents = dw_readl(dws, DW_SPI_RXFLR);
> + dw_spi_dma_calc_delay(dws, nents, &delay);
> +
> + while (dw_spi_dma_rx_busy(dws) && retry--)
> + spi_delay_exec(&delay, NULL);
> +
> + if (retry < 0) {
> + dev_err(&dws->master->dev, "Rx hanged up\n");
> + dws->master->cur_msg->status = -EIO;
> + }
> +}
> +
>  /*
>   * dws->dma_chan_busy is set before the dma transfer starts, callback for rx
>   * channel will clear a corresponding bit.
> @@ -200,6 +267,8 @@ static void dw_spi_dma_rx_done(void *arg)
>  {
>   struct dw_spi *dws = arg;
>  
> + dw_spi_dma_wait_rx_done(dws);

I can understand the problem about TX, but I don't see how RX
will get hurt, can you elaborate more? thanks

- Feng


> +
>   clear_bit(RX_BUSY, &dws->dma_chan_busy);
>   if (test_bit(TX_BUSY, &dws->dma_chan_busy))
>   return;
> diff --git a/drivers/spi/spi-dw.h b/drivers/spi/spi-dw.h
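
The delay arithmetic in the quoted dw_spi_dma_calc_delay() can be tried out in
user space with the sketch below. It follows the same formula (time per bit,
times FIFO entries, times bytes per entry, times eight bits) and the same
nanosecond/microsecond split, but everything else is an assumption made for
illustration: struct delay_demo and its unit enum are hypothetical stand-ins
for struct spi_delay, and the clock rate is passed in as clk_hz instead of
being read from the controller.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

#define NSEC_PER_SEC	1000000000ULL
#define NSEC_PER_USEC	1000ULL
#define BITS_PER_BYTE	8

/* Hypothetical stand-ins for the SPI delay unit and struct spi_delay. */
enum delay_unit_demo { DELAY_UNIT_NSECS_DEMO, DELAY_UNIT_USECS_DEMO };

struct delay_demo {
	enum delay_unit_demo unit;
	uint64_t value;
};

/*
 * Estimate how long `nents` FIFO entries of `n_bytes` bytes each need to
 * drain at `clk_hz` bits per second. Short waits are reported in
 * nanoseconds; longer ones are rounded up to microseconds and clamped to
 * USHRT_MAX, as in the quoted patch.
 */
static void calc_fifo_drain_delay(uint32_t clk_hz, uint32_t nents,
				  uint32_t n_bytes, struct delay_demo *delay)
{
	uint64_t ns, us;

	assert(clk_hz > 0);

	/* Integer division first, so ns-per-bit is truncated for odd clocks. */
	ns = (NSEC_PER_SEC / clk_hz) * nents * n_bytes * BITS_PER_BYTE;

	if (ns <= NSEC_PER_USEC) {
		delay->unit = DELAY_UNIT_NSECS_DEMO;
		delay->value = ns;
	} else {
		us = (ns + NSEC_PER_USEC - 1) / NSEC_PER_USEC;
		if (us > USHRT_MAX)
			us = USHRT_MAX;
		delay->unit = DELAY_UNIT_USECS_DEMO;
		delay->value = us;
	}
}
```

For example, four one-byte entries at 50 MHz come out as 640 ns, while sixteen
two-byte entries at 1 MHz come out as 256 us.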

Re: [PATCH v3 0/3] Even moar rpmh cleanups

2020-05-20 Thread Bjorn Andersson

On Wed 20 May 18:21 PDT 2020, Stephen Boyd wrote:

> We remove the tcs_is_free() API and then do super micro optimizations on
> the irq handler. I haven't tested anything here so most likely there's a
> bug (again again)!
> 
> Changes from v2:
>  * Went back in time and used the v1 patch for the first patch with
>the fixes to make it not so complicated
> 
> Changes from v1:
>  * First patch became even moar complicated because it combines
>find_free_tcs() with the check for a request in flight
>  * Fixed subject in patch 2
>  * Put back unsigned long for bitmap operation to silence compiler
>warning
>  * Picked up review tags
> 

Can you please resend this series with both linux-arm-msm and myself on
Cc for all three patches?

Thanks,
Bjorn

> Stephen Boyd (3):
>   soc: qcom: rpmh-rsc: Remove tcs_is_free() API
>   soc: qcom: rpmh-rsc: Loop over fewer bits in irq handler
>   soc: qcom: rpmh-rsc: Fold WARN_ON() into if condition
> 
>  drivers/soc/qcom/rpmh-rsc.c | 65 +
>  1 file changed, 22 insertions(+), 43 deletions(-)
> 
> Cc: Maulik Shah 
> Cc: Douglas Anderson 
> 
> base-commit: 1f7a3eb785e4a4e196729cd3d5ec97bd5f9f2940
> -- 
> Sent by a computer, using git, on the internet
>

Re: Re: [PATCH] PCI: tegra: fix runtime pm imbalance on error

2020-05-20 Thread dinghao . liu

Thank you for your advice. I will fix these problems in the next version of
the patch.

"Thierry Reding" wrote:
> On Wed, May 20, 2020 at 04:52:23PM +0800, Dinghao Liu wrote:
> > pm_runtime_get_sync() increments the runtime PM usage counter even
> > it returns an error code. Thus a pairing decrement is needed on
> 
> s/even it/even when it/
> 
> Might also be a good idea to use a different subject prefix because I
> was almost not going to look at the other patch, taking this to be a
> replacement for it.
> 
> Although, looking at the log we have used this same prefix for both
> drivers in the past...
> 
> > the error handling path to keep the counter balanced.
> > 
> > Signed-off-by: Dinghao Liu 
> > ---
> >  drivers/pci/controller/dwc/pcie-tegra194.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/drivers/pci/controller/dwc/pcie-tegra194.c b/drivers/pci/controller/dwc/pcie-tegra194.c
> > index ae30a2fd3716..a69f9e49dcb5 100644
> > --- a/drivers/pci/controller/dwc/pcie-tegra194.c
> > +++ b/drivers/pci/controller/dwc/pcie-tegra194.c
> > @@ -1651,8 +1651,8 @@ static int tegra_pcie_config_rp(struct tegra_pcie_dw *pcie)
> >  fail_host_init:
> > tegra_pcie_deinit_controller(pcie);
> >  fail_pinctrl:
> > -   pm_runtime_put_sync(dev);
> >  fail_pm_get_sync:
> 
> Either of those two labels is now no longer needed. Of course it'll now
> be odd to jump to fail_pm_get_sync on pinctrl_pm_select_default_state()
> failure, but that's one of the reasons why labels should have names
> describing what they do rather than the failure location. I
> guess we can live with that for now. I'll make a note to send a cleanup
> patch for that later on.
> 
> With the fixup in the commit message and either of the labels removed:
> 
> Acked-by: Thierry Reding

RE: [PATCH 0/3] arm64: perf: Add support for Perf NMI interrupts

2020-05-20 Thread Song Bao Hua (Barry Song)



> -Original Message-
> From: linux-arm-kernel [mailto:linux-arm-kernel-boun...@lists.infradead.org]
> On Behalf Of Alexandru Elisei
> Sent: Wednesday, May 20, 2020 10:31 PM
>
> Hi,
> 
> On 5/18/20 12:17 PM, Alexandru Elisei wrote:
> > Hi,
> >
> > On 5/18/20 11:45 AM, Mark Rutland wrote:
> >> Hi all,
> >>
> >> On Mon, May 18, 2020 at 02:26:00PM +0800, Lecopzer Chen wrote:
> >>> HI Sumit,
> >>>
> >>> Thanks for your information.
> >>>
> >>> I've already implemented IPI (same as you did [1], little difference
> >>> in detail), hardlockup detector and perf in last year(2019) for
> >>> debuggability.
> >>> And now we tend to upstream to reduce kernel maintaining effort.
> >>> I'm glad if someone in ARM can do this work :)
> >>>
> >>> Hi Julien,
> >>>
> >>> Does any Arm maintainers can proceed this action?
> >> Alexandru (Cc'd) has been rebasing and reworking Julien's patches,
> >> which is my preferred approach.
> >>
> >> I understand that's not quite ready for posting since he's
> >> investigating some of the nastier subtleties (e.g. mutual exclusion
> >> with the NMI), but maybe we can put the work-in-progress patches
> >> somewhere in the mean time.
> >>
> >> Alexandru, do you have an idea of what needs to be done, and/or when
> >> you expect you could post that?
> > I'm currently working on rebasing the patches on top of 5.7-rc5, when
> > I have something usable I'll post a link (should be a couple of days).
> > After that I will address the review comments, and I plan to do a
> > thorough testing because I'm not 100% confident that some of the
> > assumptions around the locks that were removed are correct. My guess is
> this will take a few weeks.
> 
> Pushed a WIP branch on linux-arm.org [1]:
> 
> git clone -b WIP-pmu-nmi git://linux-arm.org/linux-ae
> 
> Practically untested, I only did perf record on a defconfig kernel running on
> the model.
> 
> [1]
> http://www.linux-arm.org/git?p=linux-ae.git;a=shortlog;h=refs/heads/WIP-pmu-nmi

Fortunately, it does work. I used this tree to perf annotate
arm_smmu_cmdq_issue_cmdlist(), which runs with IRQs completely disabled,
and it reports correct data. Before that, perf reported that all of the
time was spent in the code that enabled IRQs.


Barry

> 
> Thanks,
> Alex
> >
> > Thanks,
> > Alex
> >> Thanks,
> >> Mark.
> >>
> >>> This is really useful in debugging.
> >>> Thank you!!
> >>>
> >>>
> >>>
> >>> [1] https://lkml.org/lkml/2020/4/24/328
> >>>
> >>>
> >>> Lecopzer
> >>>
> >>> On Monday, May 18, 2020 at 1:46 PM, Sumit Garg wrote:
>  + Julien
> 
>  Hi Lecopzer,
> 
>  On Sat, 16 May 2020 at 18:20, Lecopzer Chen wrote:
> > This series implements Perf NMI functionality and depends on
> > Pseudo NMI [1], which has been upstreamed.
> >
> > On arm64 with GICv3, Pseudo NMI was implemented for NMI-like
> > interrupts. That can be extended to Perf NMI, which is the
> > prerequisite for the hard-lockup detector, which already has a
> > standard interface inside Linux.
> >
> > Thus, as a first step, we need to implement the perf NMI interface
> > and make sure it works fine.
> >
>  This is something that is already implemented via Julien's
>  patch-set [1]. Its v4 has been floating since July, 2019 and I
>  couldn't find any major blocking comments but not sure why things
>  haven't progressed further.
> 
>  Maybe Julien or Arm maintainers can provide updates on existing
>  patch-set [1] and how we should proceed further with this
>  interesting feature.
> 
>  And regarding hard-lockup detection, I have been able to enable it
>  based on perf NMI events using Julien's perf patch-set [1]. Have a
>  look at the patch here [2].
> 
>  [1] https://patchwork.kernel.org/cover/11047407/
>  [2] http://lists.infradead.org/pipermail/linux-arm-kernel/2020-May/732227.html
> 
>  -Sumit
> 
> > Perf NMI has been tested with dd if=/dev/urandom of=/dev/null, as the
> > link [2] did.
> >
> > [1] https://lkml.org/lkml/2019/1/31/535
> > [2] https://www.linaro.org/blog/debugging-arm-kernels-using-nmifiq
> >
> >
> > Lecopzer Chen (3):
> >   arm_pmu: Add support for perf NMI interrupts registration
> >   arm64: perf: Support NMI context for perf event ISR
> >   arm64: Kconfig: Add support for the Perf NMI
> >
> >  arch/arm64/Kconfig | 10 +++
> >  arch/arm64/kernel/perf_event.c | 36 ++--
> >  drivers/perf/arm_pmu.c | 51 ++
> >  include/linux/perf/arm_pmu.h   |  6 
> >  4 files changed, 88 insertions(+), 15 deletions(-)
> >
> > --
> > 2.25.1

Re: [PATCH] KVM: PPC: Book3S HV: relax check on H_SVM_INIT_ABORT

2020-05-20 Thread Greg Kurz

On Wed, 20 May 2020 18:51:10 +0200
Laurent Dufour  wrote:

> The commit 8c47b6ff29e3 ("KVM: PPC: Book3S HV: Check caller of H_SVM_*
> Hcalls") added checks of secure bit of SRR1 to filter out the Hcall
> reserved to the Ultravisor.
> 
> However, the Hcall H_SVM_INIT_ABORT is made by the Ultravisor passing the
> context of the VM calling UV_ESM. This allows the Hypervisor to return to
> the guest without going through the Ultravisor. Thus the Secure bit of SRR1
> is not set in that particular case.
> 
> In the case a regular VM is calling H_SVM_INIT_ABORT, this hcall will be
> filtered out in kvmppc_h_svm_init_abort() because kvm->arch.secure_guest is
> not set in that case.
> 

Why not check vcpu->kvm->arch.secure_guest then?

> Fixes: 8c47b6ff29e3 ("KVM: PPC: Book3S HV: Check caller of H_SVM_* Hcalls")
> Signed-off-by: Laurent Dufour 
> ---
>  arch/powerpc/kvm/book3s_hv.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
> 
> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
> index 93493f0cbfe8..eb1f96cb7b72 100644
> --- a/arch/powerpc/kvm/book3s_hv.c
> +++ b/arch/powerpc/kvm/book3s_hv.c
> @@ -1099,9 +1099,7 @@ int kvmppc_pseries_do_hcall(struct kvm_vcpu *vcpu)
>   ret = kvmppc_h_svm_init_done(vcpu->kvm);
>   break;
>   case H_SVM_INIT_ABORT:
> - ret = H_UNSUPPORTED;
> - if (kvmppc_get_srr1(vcpu) & MSR_S)
> - ret = kvmppc_h_svm_init_abort(vcpu->kvm);

or at least put a comment to explain why H_SVM_INIT_ABORT
doesn't have the same sanity check as the other SVM hcalls.

> + ret = kvmppc_h_svm_init_abort(vcpu->kvm);
>   break;
>  
>   default:
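
To make the question above concrete, the two candidate checks can be compared in a small userspace model. This is only a sketch: the struct, the flag values and the model_* names are invented for illustration and are not the kernel's definitions. Only the idea (filter on the VM's secure state rather than on the secure bit of SRR1) comes from this thread.

```c
#include <assert.h>
#include <stddef.h>

/* All names and values below are made up for this model. */
#define MODEL_H_SUCCESS          0L
#define MODEL_H_UNSUPPORTED      (-1L)
#define MODEL_MSR_S              (1UL << 0)  /* stand-in for the SRR1 secure bit */
#define MODEL_SECURE_INIT_START  (1UL << 0)  /* stand-in for "secure transition started" */

struct model_vcpu {
	unsigned long srr1;          /* saved MSR of the caller */
	unsigned long secure_guest;  /* per-VM secure state flags */
};

/* Check removed by the patch: only callers with the secure bit set pass. */
long model_abort_check_srr1(const struct model_vcpu *vcpu)
{
	assert(vcpu != NULL);
	return (vcpu->srr1 & MODEL_MSR_S) ? MODEL_H_SUCCESS : MODEL_H_UNSUPPORTED;
}

/* Alternative raised in the reply: filter on the VM's secure state instead. */
long model_abort_check_secure_guest(const struct model_vcpu *vcpu)
{
	assert(vcpu != NULL);
	return (vcpu->secure_guest & MODEL_SECURE_INIT_START) ?
		MODEL_H_SUCCESS : MODEL_H_UNSUPPORTED;
}
```

In this model, a VM that is in the middle of UV_ESM (secure bit clear in SRR1, secure transition started) is rejected by the first check and accepted by the second, while a regular VM is rejected by both.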

Re: Re: [PATCH] PCI: tegra: fix runtime pm imbalance on error

2020-05-20 Thread dinghao . liu

Thank you for your advice. I think tegra194 is a good choice and
I will use it in the next version of the patch.

"Bjorn Helgaas" wrote:
> On Wed, May 20, 2020 at 11:59:08AM +0200, Thierry Reding wrote:
> > On Wed, May 20, 2020 at 04:52:23PM +0800, Dinghao Liu wrote:
> > > pm_runtime_get_sync() increments the runtime PM usage counter even
> > > it returns an error code. Thus a pairing decrement is needed on
> > 
> > s/even it/even when it/
> > 
> > Might also be a good idea to use a different subject prefix because I
> > was almost not going to look at the other patch, taking this to be a
> > replacement for it.
> 
> Amen.  This would be a good change to start using "PCI: tegra194" or
> something for pcie-tegra194.c.  Or will there be tegra195, tegra 196,
> etc added to this driver?
> 
> Also, please capitalize the first word and "PM" in the subjects:
> 
>   PCI: tegra194: Fix runtime PM imbalance on error
> 
> Bjorn
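
The rule behind the patch under discussion (pm_runtime_get_sync() increments the usage counter even when it returns an error, so the error path needs a pairing decrement) can be illustrated with a userspace model. This is a sketch under stated assumptions: struct model_dev and every model_* function are hypothetical stand-ins, not the kernel's runtime PM API, and the error value is arbitrary.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical device model: a usage counter and a switch to force failure. */
struct model_dev {
	int usage_count;
	int resume_should_fail;
};

/* Mimics the behaviour described in the thread: the counter is bumped
 * first and stays bumped even when the resume itself fails. */
int model_pm_runtime_get_sync(struct model_dev *dev)
{
	assert(dev != NULL);
	dev->usage_count++;
	return dev->resume_should_fail ? -5 : 0;
}

/* Drops the reference without doing anything else. */
void model_pm_runtime_put_noidle(struct model_dev *dev)
{
	assert(dev != NULL);
	dev->usage_count--;
}

/* Probe keeps one reference on success and none on failure. */
int model_probe(struct model_dev *dev)
{
	int ret = model_pm_runtime_get_sync(dev);

	if (ret < 0) {
		/* Pairing decrement: without it the counter would stay at 1. */
		model_pm_runtime_put_noidle(dev);
		return ret;
	}
	return 0;
}
```

With the decrement in place, a failed model_probe() leaves usage_count at 0, and a successful one leaves it at 1.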

Re: [PATCH -V2] swap: Reduce lock contention on swap cache from swap slots allocation

2020-05-20 Thread Andrew Morton

On Wed, 20 May 2020 11:15:02 +0800 Huang Ying  wrote:

> In some swap scalability test, it is found that there are heavy lock
> contention on swap cache even if we have split one swap cache radix
> tree per swap device to one swap cache radix tree every 64 MB trunk in
> commit 4b3ef9daa4fc ("mm/swap: split swap cache into 64MB trunks").
> 
> The reason is as follow.  After the swap device becomes fragmented so
> that there's no free swap cluster, the swap device will be scanned
> linearly to find the free swap slots.  swap_info_struct->cluster_next
> is the next scanning base that is shared by all CPUs.  So nearby free
> swap slots will be allocated for different CPUs.  The probability for
> multiple CPUs to operate on the same 64 MB trunk is high.  This causes
> the lock contention on the swap cache.
> 
> To solve the issue, in this patch, for SSD swap device, a percpu
> version next scanning base (cluster_next_cpu) is added.  Every CPU
> will use its own per-cpu next scanning base.  And after finishing
> scanning a 64MB trunk, the per-cpu scanning base will be changed to
> the beginning of another randomly selected 64MB trunk.  In this way,
> the probability for multiple CPUs to operate on the same 64 MB trunk
> is reduced greatly.  Thus the lock contention is reduced too.  For
> HDD, because sequential access is more important for IO performance,
> the original shared next scanning base is used.
> 
> To test the patch, we have run 16-process pmbench memory benchmark on
> a 2-socket server machine with 48 cores.  One ram disk is configured

What does "ram disk" mean here?  Which driver(s) are in use and backed
by what sort of memory?

> as the swap device per socket.  The pmbench working-set size is much
> larger than the available memory so that swapping is triggered.  The
> memory read/write ratio is 80/20 and the accessing pattern is random.
> In the original implementation, the lock contention on the swap cache
> is heavy.  The perf profiling data of the lock contention code path is
> as following,
> 
> _raw_spin_lock_irq.add_to_swap_cache.add_to_swap.shrink_page_list:  7.91
> _raw_spin_lock_irqsave.__remove_mapping.shrink_page_list:   7.11
> _raw_spin_lock.swapcache_free_entries.free_swap_slot.__swap_entry_free: 2.51
> _raw_spin_lock_irqsave.swap_cgroup_record.mem_cgroup_uncharge_swap: 1.66
> _raw_spin_lock_irq.shrink_inactive_list.shrink_lruvec.shrink_node:  1.29
> _raw_spin_lock.free_pcppages_bulk.drain_pages_zone.drain_pages: 1.03
> _raw_spin_lock_irq.shrink_active_list.shrink_lruvec.shrink_node:0.93
> 
> After applying this patch, it becomes,
> 
> _raw_spin_lock.swapcache_free_entries.free_swap_slot.__swap_entry_free: 3.58
> _raw_spin_lock_irq.shrink_inactive_list.shrink_lruvec.shrink_node:  2.3
> _raw_spin_lock_irqsave.swap_cgroup_record.mem_cgroup_uncharge_swap: 2.26
> _raw_spin_lock_irq.shrink_active_list.shrink_lruvec.shrink_node:1.8
> _raw_spin_lock.free_pcppages_bulk.drain_pages_zone.drain_pages: 1.19
> 
> The lock contention on the swap cache is almost eliminated.
> 
> And the pmbench score increases 18.5%.  The swapin throughput
> increases 18.7% from 2.96 GB/s to 3.51 GB/s.  While the swapout
> throughput increases 18.5% from 2.99 GB/s to 3.54 GB/s.

If this was backed by plain old RAM, can we assume that the performance
improvement on SSD swap is still good?

Does the ram disk actually set SWP_SOLIDSTATE?
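
For readers following the quoted changelog, the per-CPU scanning base it describes can be sketched as a userspace model. This is not the patch's code: the struct, the function name, the use of rand() and the assumption of 4 KB pages (so 16384 slots per 64 MB trunk) are all illustrative choices.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Assumes 4 KB pages: 64 MB / 4 KB = 16384 swap slots per trunk. */
#define MODEL_SLOTS_PER_TRUNK 16384UL

/* Hypothetical per-CPU state: each CPU owns its next scanning base. */
struct model_cpu_scan {
	unsigned long next;
};

/* Return the slot this CPU should look at now and advance its base.
 * When the base reaches the end of a 64 MB trunk, it jumps to the start
 * of a randomly selected trunk (which may be the same one in this model). */
unsigned long model_next_slot(struct model_cpu_scan *cpu, unsigned long nr_slots)
{
	unsigned long nr_trunks = nr_slots / MODEL_SLOTS_PER_TRUNK;
	unsigned long slot;

	assert(cpu != NULL && nr_trunks > 0);

	slot = cpu->next;
	cpu->next = slot + 1;
	if (cpu->next % MODEL_SLOTS_PER_TRUNK == 0)
		cpu->next = ((unsigned long)rand() % nr_trunks) * MODEL_SLOTS_PER_TRUNK;

	return slot;
}
```

Because every CPU advances its own base and lands on a random trunk after finishing one, two CPUs are less likely to work inside the same trunk at the same time, which is the effect the changelog attributes the reduced lock contention to.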
