Re: [PATCH 1/4] io_uring: fix open/close/statx with {SQ,IO}POLL

2020-06-02 Thread Pavel Begunkov
On 02/06/2020 15:34, Pavel Begunkov wrote:
> Trying to use them with IORING_SETUP_IOPOLL:
> 
> RIP: 0010:io_iopoll_getevents+0x111/0x5a0
> Call Trace:
>  ? _raw_spin_unlock_irqrestore+0x24/0x40
>  ? do_send_sig_info+0x64/0x90
>  io_iopoll_reap_events.part.0+0x5e/0xa0
>  io_ring_ctx_wait_and_kill+0x132/0x1c0
>  io_uring_release+0x20/0x30
>  __fput+0xcd/0x230
>  fput+0xe/0x10
>  task_work_run+0x67/0xa0
>  do_exit+0x353/0xb10
>  ? handle_mm_fault+0xd4/0x200
>  ? syscall_trace_enter+0x18c/0x2c0
>  do_group_exit+0x43/0xa0
>  __x64_sys_exit_group+0x18/0x20
>  do_syscall_64+0x60/0x1e0
>  entry_SYSCALL_64_after_hwframe+0x44/0xa9

io_do_iopoll()
{
...
	/*
	 * ki_filp is only valid for read/write style requests; for
	 * open/close/statx the request union holds other data, so this
	 * dereference is what produces the oops above.
	 */
	ret = kiocb->ki_filp->f_op->iopoll(kiocb, spin);
}

Hmm, I'll double check later that only read*/write* can be done
with IOPOLL, and send a follow-up patch if necessary.
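
For illustration, the new behaviour is easy to see from userspace with a
minimal liburing snippet (hedged sketch; only the ring flag and the -EINVAL
result come from this patch, the rest is boilerplate):

#include <fcntl.h>
#include <stdio.h>
#include <liburing.h>

int main(void)
{
	struct io_uring ring;
	struct io_uring_sqe *sqe;
	struct io_uring_cqe *cqe;

	/* IOPOLL ring: only pollable read/write requests make sense here */
	if (io_uring_queue_init(8, &ring, IORING_SETUP_IOPOLL) < 0)
		return 1;

	sqe = io_uring_get_sqe(&ring);
	io_uring_prep_openat(sqe, AT_FDCWD, "/tmp/f", O_RDONLY, 0);

	io_uring_submit(&ring);
	io_uring_wait_cqe(&ring, &cqe);
	printf("openat res: %d\n", cqe->res);	/* -EINVAL with this patch */

	io_uring_cqe_seen(&ring, cqe);
	io_uring_queue_exit(&ring);
	return 0;
}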

> 
> Also SQPOLL thread can't know which file table to use with
> open/close. Disallow all these cases.
> 
> Signed-off-by: Pavel Begunkov 
> ---
>  fs/io_uring.c | 8 
>  1 file changed, 8 insertions(+)
> 
> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index 732ec73ec3c0..7208f91e9e77 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -2990,6 +2990,8 @@ static int io_openat_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
>   const char __user *fname;
>   int ret;
>  
> + if (unlikely(req->ctx->flags & (IORING_SETUP_IOPOLL|IORING_SETUP_SQPOLL)))
> + return -EINVAL;
>   if (sqe->ioprio || sqe->buf_index)
>   return -EINVAL;
>   if (req->flags & REQ_F_FIXED_FILE)
> @@ -3023,6 +3025,8 @@ static int io_openat2_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
>   size_t len;
>   int ret;
>  
> + if (unlikely(req->ctx->flags & (IORING_SETUP_IOPOLL|IORING_SETUP_SQPOLL)))
> + return -EINVAL;
>   if (sqe->ioprio || sqe->buf_index)
>   return -EINVAL;
>   if (req->flags & REQ_F_FIXED_FILE)
> @@ -3373,6 +3377,8 @@ static int io_fadvise(struct io_kiocb *req, bool force_nonblock)
>  
>  static int io_statx_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
>  {
> + if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
> + return -EINVAL;
>   if (sqe->ioprio || sqe->buf_index)
>   return -EINVAL;
>   if (req->flags & REQ_F_FIXED_FILE)
> @@ -3417,6 +3423,8 @@ static int io_close_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
>    */
>   req->work.flags |= IO_WQ_WORK_NO_CANCEL;
>  
> + if (unlikely(req->ctx->flags & (IORING_SETUP_IOPOLL|IORING_SETUP_SQPOLL)))
> + return -EINVAL;
>   if (sqe->ioprio || sqe->off || sqe->addr || sqe->len ||
>   sqe->rw_flags || sqe->buf_index)
>   return -EINVAL;
> 

-- 
Pavel Begunkov


Re: [PATCH v3 032/105] drm/vc4: crtc: Enable and disable the PV in atomic_enable / disable

2020-06-02 Thread Maxime Ripard
Hi Eric

On Wed, May 27, 2020 at 09:54:44AM -0700, Eric Anholt wrote:
> On Wed, May 27, 2020 at 8:50 AM Maxime Ripard  wrote:
> >
> > The VIDEN bit in the pixelvalve, currently being used to enable or disable
> > the pixelvalve, seems to not be enough in some situations, which will end
> > up with the pixelvalve stalling.
> >
> > In such a case, even re-enabling VIDEN doesn't bring it back and we need to
> > clear the FIFO. This can only be done if the pixelvalve is disabled though.
> >
> > In order to overcome this, we can configure the pixelvalve during
> > mode_set_no_fb, but only enable it in atomic_enable and flush the FIFO
> > there, and in atomic_disable disable the pixelvalve again.
> 
> What displays has this been tested with?  Getting this sequencing
> right is so painful, and things like DSI are tricky to get to light
> up.

That FIFO is between the HVS and the HDMI PVs, so this was obviously
tested against that. Dave also tested the DSI output IIRC, so we should
be covered here.

Maxime




Re: [PATCH] spi: spi-ti-qspi: call pm_runtime_put on pm_runtime_get failure

2020-06-02 Thread Mark Brown
On Tue, Jun 02, 2020 at 12:02:11PM +0200, Markus Elfring wrote:

> > The original changelog is perfectly fine, please stop sending these.

> I find this commit message improvable also according to Linux software
> development documentation.

Causing people to send out new versions of things for tweaks to the
commit log consumes time for them and everyone they're sending changes
to.  Pushing people to make trivial rewordings of their commit logs to
match your particular taste is not a good use of people's time.




[PATCH v2] gpiolib: split character device into gpiolib-cdev

2020-06-02 Thread Kent Gibson
Split the cdev specific functionality out of gpiolib.c and into
gpiolib-cdev.c. This improves the readability and maintainability of both
the cdev and core gpiolib code.

Suggested-by: Bartosz Golaszewski 
Signed-off-by: Kent Gibson 

---
While this patch is complete and ready for review, I don't expect it to
be applied as is. There are a few cdev patches pending merge into 
gpio/devel that are sure to conflict, and it makes more sense to
rebase this on top of them than vice versa. But I thought it would
be worthwhile to get it out for review so it can be ready to be rebased
when the time is right.

Also, this is a naive split. That is, if applied as is, it will lose the
line history of the cdev code.  This is not what I intend, and I
understand it can be avoided by encouraging git to remember the history
with a few moves, but I'm unsure how the maintainers would prefer that
to be done.

Bart,
 As this was your idea, I've taken the liberty of adding the Suggested-by.
 I hope that is ok.

Changes in v2:
 - rebased to latest gpio/devel and added base-commit to placate the
  build bot.  The comments above still apply, as there are still a
  couple of commits in gpio/fixes that will conflict.

Kent.

 drivers/gpio/Makefile   |1 +
 drivers/gpio/gpiolib-cdev.c | 1148 +++
 drivers/gpio/gpiolib-cdev.h |   11 +
 drivers/gpio/gpiolib.c  | 1112 +
 4 files changed, 1164 insertions(+), 1108 deletions(-)
 create mode 100644 drivers/gpio/gpiolib-cdev.c
 create mode 100644 drivers/gpio/gpiolib-cdev.h
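
For orientation, the 11-line gpiolib-cdev.h in the diffstat amounts to a
small registration interface between the core and the cdev code, roughly
the following (reconstructed sketch; the exact names are not shown in this
excerpt):

/* SPDX-License-Identifier: GPL-2.0 */

#ifndef GPIOLIB_CDEV_H
#define GPIOLIB_CDEV_H

#include <linux/device.h>

struct gpio_device;

int gpiolib_cdev_register(struct gpio_device *gdev, dev_t devt);
void gpiolib_cdev_unregister(struct gpio_device *gdev);

#endif /* GPIOLIB_CDEV_H */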

diff --git a/drivers/gpio/Makefile b/drivers/gpio/Makefile
index 65bf3940e33c..b5b58b624f37 100644
--- a/drivers/gpio/Makefile
+++ b/drivers/gpio/Makefile
@@ -7,6 +7,7 @@ obj-$(CONFIG_GPIOLIB)   += gpiolib.o
 obj-$(CONFIG_GPIOLIB)  += gpiolib-devres.o
 obj-$(CONFIG_GPIOLIB)  += gpiolib-legacy.o
 obj-$(CONFIG_GPIOLIB)  += gpiolib-devprop.o
+obj-$(CONFIG_GPIOLIB)  += gpiolib-cdev.o
 obj-$(CONFIG_OF_GPIO)  += gpiolib-of.o
 obj-$(CONFIG_GPIO_SYSFS)   += gpiolib-sysfs.o
 obj-$(CONFIG_GPIO_ACPI)+= gpiolib-acpi.o
diff --git a/drivers/gpio/gpiolib-cdev.c b/drivers/gpio/gpiolib-cdev.c
new file mode 100644
index ..971470bdc9c9
--- /dev/null
+++ b/drivers/gpio/gpiolib-cdev.c
@@ -0,0 +1,1148 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+
+#include "gpiolib.h"
+
+/* Implementation infrastructure for GPIO interfaces.
+ *
+ * The GPIO programming interface allows for inlining speed-critical
+ * get/set operations for common cases, so that access to SOC-integrated
+ * GPIOs can sometimes cost only an instruction or two per bit.
+ */
+
+/*
+ * GPIO line handle management
+ */
+
+/**
+ * struct linehandle_state - contains the state of a userspace handle
+ * @gdev: the GPIO device the handle pertains to
+ * @label: consumer label used to tag descriptors
+ * @descs: the GPIO descriptors held by this handle
+ * @numdescs: the number of descriptors held in the descs array
+ */
+struct linehandle_state {
+   struct gpio_device *gdev;
+   const char *label;
+   struct gpio_desc *descs[GPIOHANDLES_MAX];
+   u32 numdescs;
+};
+
+#define GPIOHANDLE_REQUEST_VALID_FLAGS \
+   (GPIOHANDLE_REQUEST_INPUT | \
+   GPIOHANDLE_REQUEST_OUTPUT | \
+   GPIOHANDLE_REQUEST_ACTIVE_LOW | \
+   GPIOHANDLE_REQUEST_BIAS_PULL_UP | \
+   GPIOHANDLE_REQUEST_BIAS_PULL_DOWN | \
+   GPIOHANDLE_REQUEST_BIAS_DISABLE | \
+   GPIOHANDLE_REQUEST_OPEN_DRAIN | \
+   GPIOHANDLE_REQUEST_OPEN_SOURCE)
+
+static int linehandle_validate_flags(u32 flags)
+{
+   /* Return an error if an unknown flag is set */
+   if (flags & ~GPIOHANDLE_REQUEST_VALID_FLAGS)
+   return -EINVAL;
+
+   /*
+* Do not allow both INPUT & OUTPUT flags to be set as they are
+* contradictory.
+*/
+   if ((flags & GPIOHANDLE_REQUEST_INPUT) &&
+   (flags & GPIOHANDLE_REQUEST_OUTPUT))
+   return -EINVAL;
+
+   /*
+* Do not allow OPEN_SOURCE & OPEN_DRAIN flags in a single request. If
+* the hardware actually supports enabling both at the same time the
+* electrical result would be disastrous.
+*/
+   if ((flags & GPIOHANDLE_REQUEST_OPEN_DRAIN) &&
+   (flags & GPIOHANDLE_REQUEST_OPEN_SOURCE))
+   return -EINVAL;
+
+   /* OPEN_DRAIN and OPEN_SOURCE flags only make sense for output mode. */
+   if (!(flags & GPIOHANDLE_REQUEST_OUTPUT) &&
+   ((flags & GPIOHANDLE_REQUEST_OPEN_DRAIN) ||
+(flags & GPIOHANDLE_REQUEST_OPEN_SOURCE)))
+   return -EINVAL;
+
+   /* Bias flags only allowed for input or output mode. */
+   if (!((flags & 

Re: [PATCH 1/2] perf tools: check libasan and libubsan in Makefile.config

2020-06-02 Thread Arnaldo Carvalho de Melo
Em Tue, Jun 02, 2020 at 12:15:03PM +0800, Tiezhu Yang escreveu:
> When building perf with ASan or UBSan, if libasan or libubsan cannot be
> found, feature-glibc is 0 and the following error message is printed, which
> is wrong, because gnu/libc-version.h is present in /usr/include and
> glibc-devel is also installed.

I'll check this later,

- Arnaldo
 
> [yangtiezhu@linux perf]$ make DEBUG=1 EXTRA_CFLAGS='-fno-omit-frame-pointer -fsanitize=address'
>   BUILD:   Doing 'make -j4' parallel build
>   HOSTCC   fixdep.o
>   HOSTLD   fixdep-in.o
>   LINK fixdep
> :1:0: warning: -fsanitize=address and -fsanitize=kernel-address are not supported for this target
> :1:0: warning: -fsanitize=address not supported for this target
> 
> Auto-detecting system features:
> ... dwarf: [ OFF ]
> ...dwarf_getlocations: [ OFF ]
> ... glibc: [ OFF ]
> ...  gtk2: [ OFF ]
> ...  libaudit: [ OFF ]
> ...libbfd: [ OFF ]
> ...libcap: [ OFF ]
> ...libelf: [ OFF ]
> ...   libnuma: [ OFF ]
> ...numa_num_possible_cpus: [ OFF ]
> ...   libperl: [ OFF ]
> ... libpython: [ OFF ]
> ... libcrypto: [ OFF ]
> ... libunwind: [ OFF ]
> ...libdw-dwarf-unwind: [ OFF ]
> ...  zlib: [ OFF ]
> ...  lzma: [ OFF ]
> ... get_cpuid: [ OFF ]
> ...   bpf: [ OFF ]
> ...libaio: [ OFF ]
> ...   libzstd: [ OFF ]
> ...disassembler-four-args: [ OFF ]
> 
> Makefile.config:393: *** No gnu/libc-version.h found, please install glibc-dev[el].  Stop.
> Makefile.perf:224: recipe for target 'sub-make' failed
> make[1]: *** [sub-make] Error 2
> Makefile:69: recipe for target 'all' failed
> make: *** [all] Error 2
> [yangtiezhu@linux perf]$ ls /usr/include/gnu/libc-version.h
> /usr/include/gnu/libc-version.h
> 
> After installing libasan and libubsan, feature-glibc is 1 and the build
> succeeds, so the cause is related to libasan or libubsan; we should check
> for them and print an error log that reflects reality.
> 
> Signed-off-by: Tiezhu Yang 
> ---
>  tools/perf/Makefile.config | 6 ++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config
> index 12a8204..b699d21 100644
> --- a/tools/perf/Makefile.config
> +++ b/tools/perf/Makefile.config
> @@ -387,6 +387,12 @@ else
>NO_LIBBPF := 1
>NO_JVMTI := 1
>  else
> +  ifneq ($(shell ldconfig -p | grep libasan >/dev/null 2>&1; echo $$?), 0)
> +    msg := $(error No libasan found, please install libasan);
> +  endif
> +  ifneq ($(shell ldconfig -p | grep libubsan >/dev/null 2>&1; echo $$?), 0)
> +    msg := $(error No libubsan found, please install libubsan);
> +  endif
>ifneq ($(filter s% -static%,$(LDFLAGS),),)
>  msg := $(error No static glibc found, please install glibc-static);
>else
> -- 
> 2.1.0
> 

-- 

- Arnaldo


Re: [PATCH 2/2] perf tools: Remove some duplicated includes

2020-06-02 Thread Arnaldo Carvalho de Melo
Em Tue, Jun 02, 2020 at 12:15:04PM +0800, Tiezhu Yang escreveu:
> There exists some duplicated includes in tools/perf, remove them.

Applied, thanks.

- Arnaldo
 
> Signed-off-by: Tiezhu Yang 
> ---
>  tools/perf/builtin-report.c | 1 -
>  tools/perf/util/annotate.c  | 1 -
>  tools/perf/util/auxtrace.c  | 1 -
>  tools/perf/util/config.c| 1 -
>  tools/perf/util/session.c   | 1 -
>  5 files changed, 5 deletions(-)
> 
> diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
> index ba63390..5425a2c 100644
> --- a/tools/perf/builtin-report.c
> +++ b/tools/perf/builtin-report.c
> @@ -47,7 +47,6 @@
>  #include "util/time-utils.h"
>  #include "util/auxtrace.h"
>  #include "util/units.h"
> -#include "util/branch.h"
>  #include "util/util.h" // perf_tip()
>  #include "ui/ui.h"
>  #include "ui/progress.h"
> diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
> index d828c2d..76bfb4a 100644
> --- a/tools/perf/util/annotate.c
> +++ b/tools/perf/util/annotate.c
> @@ -41,7 +41,6 @@
>  #include 
>  #include 
>  #include 
> -#include 
>  #include 
>  #include 
>  
> diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
> index 749487a..94a8f4f 100644
> --- a/tools/perf/util/auxtrace.c
> +++ b/tools/perf/util/auxtrace.c
> @@ -55,7 +55,6 @@
>  #include "util/mmap.h"
>  
>  #include 
> -#include 
>  #include "symbol/kallsyms.h"
>  #include 
>  
> diff --git a/tools/perf/util/config.c b/tools/perf/util/config.c
> index ef38eba..64f14a5 100644
> --- a/tools/perf/util/config.c
> +++ b/tools/perf/util/config.c
> @@ -20,7 +20,6 @@
>  #include "build-id.h"
>  #include "debug.h"
>  #include "config.h"
> -#include "debug.h"
>  #include 
>  #include 
>  #include 
> diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
> index c11d89e..5550e26e 100644
> --- a/tools/perf/util/session.c
> +++ b/tools/perf/util/session.c
> @@ -33,7 +33,6 @@
>  #include "../perf.h"
>  #include "arch/common.h"
>  #include 
> -#include 
>  
>  #ifdef HAVE_ZSTD_SUPPORT
>  static int perf_session__process_compressed_event(struct perf_session 
> *session,
> -- 
> 2.1.0
> 

-- 

- Arnaldo


Re: [PATCH] hwmon: bt1-pvt: Declare Temp- and Volt-to-N poly when alarms are enabled

2020-06-02 Thread Guenter Roeck
On Tue, Jun 02, 2020 at 12:12:19PM +0300, Serge Semin wrote:
> Clang-based kernel building with W=1 warns that some static const
> variables are unused:
> 
> drivers/hwmon/bt1-pvt.c:67:30: warning: unused variable 'poly_temp_to_N' [-Wunused-const-variable]
> static const struct pvt_poly poly_temp_to_N = {
>                              ^
> drivers/hwmon/bt1-pvt.c:99:30: warning: unused variable 'poly_volt_to_N' [-Wunused-const-variable]
> static const struct pvt_poly poly_volt_to_N = {
>                              ^
> 
> Indeed these polynomials are utilized only when the PVT sensor alarms are
> enabled. In that case they are used to convert the temperature and
> voltage alarm limits from normal quantities (Volts and degrees Celsius) to
> the sensor data representation N = [0, 1023]. Otherwise, when alarms are
> disabled, the driver only converts the detected data to a human-readable
> form and doesn't need those polynomials defined. So let's declare
> the Temp-to-N and Volt-to-N polynomials only if the PVT alarms are
> switched on at compile time.
> 
> Note gcc with W=1 doesn't notice the problem.
> 
> Fixes: 87976ce2825d ("hwmon: Add Baikal-T1 PVT sensor driver")
> Reported-by: kbuild test robot 
> Signed-off-by: Serge Semin 
> Cc: Maxim Kaurkin 
> Cc: Alexey Malahov 

I don't really like the added #if. Can you use __maybe_unused instead ?

Thanks,
Guenter
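
Concretely, the suggestion is to keep the definitions unconditional and just
annotate them, e.g. (untested sketch; only the term visible in the patch
context below is shown):

/*
 * __maybe_unused keeps the definition but silences the clang
 * -Wunused-const-variable warning when alarms are compiled out.
 */
static const struct pvt_poly __maybe_unused poly_temp_to_N = {
	.total_divider = 1,
	.terms = {
		{0, 1720400, 1, 1}
	}
};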

> ---
>  drivers/hwmon/bt1-pvt.c | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/hwmon/bt1-pvt.c b/drivers/hwmon/bt1-pvt.c
> index 1a9772fb1f73..1a5212c04549 100644
> --- a/drivers/hwmon/bt1-pvt.c
> +++ b/drivers/hwmon/bt1-pvt.c
> @@ -64,6 +64,7 @@ static const struct pvt_sensor_info pvt_info[] = {
>   * 48380,
>   * where T = [-48380, 147438] mC and N = [0, 1023].
>   */
> +#if defined(CONFIG_SENSORS_BT1_PVT_ALARMS)
>  static const struct pvt_poly poly_temp_to_N = {
>   .total_divider = 1,
>   .terms = {
> @@ -74,6 +75,7 @@ static const struct pvt_poly poly_temp_to_N = {
>   {0, 1720400, 1, 1}
>   }
>  };
> +#endif /* CONFIG_SENSORS_BT1_PVT_ALARMS */
>  
>  static const struct pvt_poly poly_N_to_temp = {
>   .total_divider = 1,
> @@ -96,6 +98,7 @@ static const struct pvt_poly poly_N_to_temp = {
>   * N = (18658e-3*V - 11572) / 10,
>   * V = N * 10^5 / 18658 + 11572 * 10^4 / 18658.
>   */
> +#if defined(CONFIG_SENSORS_BT1_PVT_ALARMS)
>  static const struct pvt_poly poly_volt_to_N = {
>   .total_divider = 10,
>   .terms = {
> @@ -103,6 +106,7 @@ static const struct pvt_poly poly_volt_to_N = {
>   {0, -11572, 1, 1}
>   }
>  };
> +#endif /* CONFIG_SENSORS_BT1_PVT_ALARMS */
>  
>  static const struct pvt_poly poly_N_to_volt = {
>   .total_divider = 10,


Re: [PATCH] perf tools: Fix kernel maps for kcore and eBPF

2020-06-02 Thread Arnaldo Carvalho de Melo
Em Tue, Jun 02, 2020 at 02:25:05PM +0300, Adrian Hunter escreveu:
> Adjust pgoff also when moving a map's start address.
> 
> Example with v5.4.34 based kernel:
> 
>   Before:
> 
> $ sudo tools/perf/perf record -a --kcore -e intel_pt//k sleep 1
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 1.958 MB perf.data ]
> $ sudo tools/perf/perf script --itrace=e >/dev/null
> Warning:
> 961 instruction trace errors
> 
>   After:
> 
> $ sudo tools/perf/perf script --itrace=e >/dev/null
> $

Thanks, tested on a kaby lake system (i5-7500) and applied,

- Arnaldo
 
> Signed-off-by: Adrian Hunter 
> Fixes: fb5a88d4131a ("perf tools: Preserve eBPF maps when loading kcore")
> Cc: sta...@vger.kernel.org
> ---
>  tools/perf/util/symbol.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
> index 57cbe7a29868..5ddf84dcbae7 100644
> --- a/tools/perf/util/symbol.c
> +++ b/tools/perf/util/symbol.c
> @@ -1224,6 +1224,7 @@ int maps__merge_in(struct maps *kmaps, struct map *new_map)
>  
>   m->end = old_map->start;
>   list_add_tail(&m->node, &merged);
> + new_map->pgoff += old_map->end - new_map->start;
>   new_map->start = old_map->end;
>   }
>   } else {
> @@ -1244,6 +1245,7 @@ int maps__merge_in(struct maps *kmaps, struct map *new_map)
>    *  |new..| -> |new...|
>    * |old|-> |old|
>    */
> + new_map->pgoff += old_map->end - new_map->start;
>   new_map->start = old_map->end;
>   }
>   }
> -- 
> 2.17.1
> 

-- 

- Arnaldo


Re: [PATCHv3] perf stat: Ensure group is defined on top of the same cpu mask

2020-06-02 Thread Jiri Olsa
On Tue, Jun 02, 2020 at 10:42:56AM -0300, Arnaldo Carvalho de Melo wrote:
> Em Tue, Jun 02, 2020 at 12:17:36PM +0200, Jiri Olsa escreveu:
> > Jin Yao reported the issue (and posted first versions of this change)
> > with groups being defined over events with different cpu mask.
>  
> > This causes assert aborts in get_group_fd, like:
>  
> >   # perf stat -M "C2_Pkg_Residency" -a -- sleep 1
> >   perf: util/evsel.c:1464: get_group_fd: Assertion `!(fd == -1)' failed.
> >   Aborted
>  
> > All the events in the group have to be defined over the same
> > cpus so the group_fd can be found for every leader/member pair.
>  
> > Adding check to ensure this condition is met and removing the
> > group (with warning) if we detect mixed cpus, like:
>  
> >   $ sudo perf stat -e '{power/energy-cores/,cycles},{instructions,power/energy-cores/}'
> >   WARNING: event cpu maps do not match, disabling group:
> > anon group { power/energy-cores/, cycles }
> > anon group { instructions, power/energy-cores/ }
> 
> So it doesn't disable the 'group', it disables the 'grouping' of those
> events, right? I.e. reading the WARNING, I thought that it would count
> nothing, since it lists both groups as being disabled, but when I tested
> I noticed that:
> 
>   [root@seventh ~]# perf stat -e '{power/energy-cores/,cycles},{instructions,power/energy-cores/}'
>   WARNING: grouped events cpus do not match, disabling group:
>     anon group { power/energy-cores/, cycles }
>     anon group { instructions, power/energy-cores/ }
>   ^C
>    Performance counter stats for 'system wide':
>
>             12.62 Joules power/energy-cores/
>       106,920,637        cycles
>        80,228,899        instructions   #  0.75  insn per cycle
>             12.62 Joules power/energy-cores/
>
>      14.514476987 seconds time elapsed
>
>
>   [root@seventh ~]#
> 
>   I.e. it counted the events, ungrouped, or am I missing something?

right, it disables 'grouping', events are scheduled/counted individually

this way we will not hit the issue of looking for a group_fd FD
that doesn't exist, because of different cpu maps
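
(for reference, the abort comes from this helper in util/evsel.c, paraphrased
here; the leader's fd array is indexed with the member's cpu index, which has
no counterpart when the cpu maps differ:)

static int get_group_fd(struct evsel *evsel, int cpu, int thread)
{
	struct evsel *leader = evsel->leader;
	int fd;

	if (evsel__is_group_leader(evsel))
		return -1;

	/*
	 * The leader is opened first, so its fd array exists; with
	 * mismatched cpu maps FD() yields -1 for the missing cpu and
	 * the assertion below fires.
	 */
	fd = FD(leader, cpu, thread);
	BUG_ON(fd == -1);

	return fd;
}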

> 
> If I do:
> 
>   [root@seventh ~]# perf stat -e '{power/energy-cores/,power/energy-ram/},{instructions,cycles}' -a sleep 2
>
>    Performance counter stats for 'system wide':
>
>              1.73 Joules power/energy-cores/
>              0.92 Joules power/energy-ram/
>        12,191,658        instructions   #  0.67  insn per cycle
>        18,275,233        cycles
>
>       2.001272492 seconds time elapsed
>
>   [root@seventh ~]#
> 
> It works, grouped. One observation, shouldn't we somehow show in the
> output that the first two were indeed grouped, ditto for the second two?

yea, we don't display groups in the output.. also there's no number
for the group, it's still separate event numbers in the output;
grouping is only used when creating events

> 
> Also, this needs improvement:
> 
>   [root@seventh ~]# perf stat -e '{power/energy-cores/,power/energy-ram/},{instructions,cycles}' sleep 2
>   Error:
>   The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (power/energy-cores/).
>   /bin/dmesg | grep -i perf may provide additional information.

yes, power events don't work without a cpu being defined,
which is what we do for a 'workload' session.. we should either check
for that and display some sensible error for power events

or perhaps check if we could monitor like perf record does, creating
events for the task and every cpu in the system

thanks,
jirka



Re: kobject_init_and_add is easy to misuse

2020-06-02 Thread Greg Kroah-Hartman
On Tue, Jun 02, 2020 at 05:10:35AM -0700, Matthew Wilcox wrote:
> On Tue, Jun 02, 2020 at 07:50:33PM +0800, Wang Hai wrote:
> > syzkaller reports for memory leak when kobject_init_and_add()
> > returns an error in the function sysfs_slab_add() [1]
> > 
> > When this happened, the function kobject_put() is not called for the
> > corresponding kobject, which potentially leads to memory leak.
> > 
> > This patch fixes the issue by calling kobject_put() even if
> > kobject_init_and_add() fails.
> 
> I think this speaks to a deeper problem with kobject_init_and_add()
> -- the need to call kobject_put() if it fails is not readily apparent
> to most users.  This same bug appears in the first three users of
> kobject_init_and_add() that I checked --
> arch/ia64/kernel/topology.c
> drivers/firmware/dmi-sysfs.c
> drivers/firmware/efi/esrt.c
> drivers/scsi/iscsi_boot_sysfs.c
> 
> Some do get it right --
> arch/powerpc/kernel/cacheinfo.c
> drivers/gpu/drm/ttm/ttm_bo.c
> drivers/gpu/drm/ttm/ttm_memory.c
> drivers/infiniband/hw/mlx4/sysfs.c

Why are random individual drivers calling kobject* functions?  That
speaks to a larger problem here...

Anyway, yes, it's a tricky function, but the issue usually is that the
kobject is embedded in something else and if you call init_and_add() you
want to tear things down _before_ the final put happens.

The good thing is, that function is really hard to get to fail except if
you abuse it with syzkaller :)
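
For anyone following along, the pattern the fix implements is roughly this
(sketch; foo and foo_ktype are illustrative, not from the patch):

struct foo {
	struct kobject kobj;
	/* ... */
};

static int foo_register(struct foo *foo, struct kobject *parent)
{
	int err;

	err = kobject_init_and_add(&foo->kobj, &foo_ktype, parent, "foo");
	if (err) {
		/*
		 * Even on failure the kobject has been initialized, so
		 * its reference must still be dropped; the ktype's
		 * release() then frees the containing object.
		 */
		kobject_put(&foo->kobj);
		return err;
	}
	return 0;
}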

> I'd argue that the current behaviour is wrong, that kobject_init_and_add()
> should call kobject_put() if the add fails.  This would need a tree-wide
> audit.  But somebody needs to do that anyway because based on my random
> sampling, half of the users currently get it wrong.

As said above, this is "tricky", and might break things.

thanks,

greg k-h


Re: [PATCH v4 4/4] kdb: Switch to use safer dbg_io_ops over console APIs

2020-06-02 Thread Sumit Garg
On Tue, 2 Jun 2020 at 19:16, Daniel Thompson  wrote:
>
> On Fri, May 29, 2020 at 04:56:47PM +0530, Sumit Garg wrote:
> > In kgdb context, calling console handlers aren't safe due to locks used
> > in those handlers which could in turn lead to a deadlock. Although, using
> > oops_in_progress increases the chance to bypass locks in most console
> > handlers but it might not be sufficient enough in case a console uses
> > more locks (VT/TTY is good example).
> >
> > Currently when a driver provides both polling I/O and a console then kdb
> > will output using the console. We can increase robustness by using the
> > currently active polling I/O driver (which should be lockless) instead
> > of the corresponding console. For several common cases (e.g. an
> > embedded system with a single serial port that is used both for console
> > output and debugger I/O) this will result in no console handler being
> > used.
> >
> > In order to achieve this we need to reverse the order of preference to
> > use dbg_io_ops (uses polling I/O mode) over console APIs. So we just
> > store "struct console" that represents debugger I/O in dbg_io_ops and
> > while emitting kdb messages, skip console that matches dbg_io_ops
> > console in order to avoid duplicate messages. After this change,
> > "is_console" param becomes redundant and hence removed.
> >
> > Suggested-by: Daniel Thompson 
> > Signed-off-by: Sumit Garg 
>
> Looking good, only one minor comment left on my side (including the
> three patches prior).
>
> > diff --git a/kernel/debug/kdb/kdb_io.c b/kernel/debug/kdb/kdb_io.c
> > index 9e5a40d..5e00bc8 100644
> > --- a/kernel/debug/kdb/kdb_io.c
> > +++ b/kernel/debug/kdb/kdb_io.c
> > @@ -560,12 +560,14 @@ static void kdb_msg_write(char *msg, int msg_len)
> >   if (msg_len == 0)
> >   return;
> >
> > - if (dbg_io_ops && !dbg_io_ops->is_console)
> > + if (dbg_io_ops)
> >   kdb_io_write(msg, msg_len);
>
> Since this now slots in so cleanly and there are not multiple calls
> to kdb_io_write() then I think perhaps factoring this out into its
> own function (in patch 1) is no longer necessary. The character write
> loop can go directly into this function.
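
Something like this, presumably (untested sketch; the name of the console
field in dbg_io_ops is assumed from the series description):

static void kdb_msg_write(char *msg, int msg_len)
{
	struct console *c;

	if (msg_len == 0)
		return;

	if (dbg_io_ops) {
		const char *cp = msg;
		int len = msg_len;

		/* the write loop from patch 1, folded back in */
		while (len--) {
			dbg_io_ops->write_char(*cp);
			cp++;
		}
	}

	for_each_console(c) {
		if (dbg_io_ops && c == dbg_io_ops->cons)
			continue;	/* already emitted via polling I/O */
		c->write(c, msg, msg_len);
	}
}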
>

Okay, will update it in the next version.

-Sumit

>
> Daniel.


[PATCH 4/4] serial: core: drop redundant sysrq checks

2020-06-02 Thread Johan Hovold
The sysrq timestamp will never be set unless port->has_sysrq, so drop the
redundant checks that were added by commit 1997e9dfdc84 ("serial_core:
Un-ifdef sysrq SUPPORT_SYSRQ").

Signed-off-by: Johan Hovold 
---
 include/linux/serial_core.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/linux/serial_core.h b/include/linux/serial_core.h
index a8a213b71553..2f6c3cfe2ae7 100644
--- a/include/linux/serial_core.h
+++ b/include/linux/serial_core.h
@@ -468,7 +468,7 @@ bool uart_try_toggle_sysrq(struct uart_port *port, unsigned int ch);
 
 static inline int uart_handle_sysrq_char(struct uart_port *port, unsigned int ch)
 {
-   if (!port->has_sysrq || !port->sysrq)
+   if (!port->sysrq)
return 0;
 
if (ch && time_before(jiffies, port->sysrq)) {
@@ -487,7 +487,7 @@ static inline int uart_handle_sysrq_char(struct uart_port *port, unsigned int ch
 
 static inline int uart_prepare_sysrq_char(struct uart_port *port, unsigned int ch)
 {
-   if (!port->has_sysrq || !port->sysrq)
+   if (!port->sysrq)
return 0;
 
if (ch && time_before(jiffies, port->sysrq)) {
-- 
2.26.2



[PATCH 2/4] serial: core: fix broken sysrq port unlock

2020-06-02 Thread Johan Hovold
Commit d6e1935819db ("serial: core: Allow processing sysrq at port
unlock time") worked around a circular locking dependency by adding
helpers used to defer sysrq processing to when the port lock was
released.

A later commit unfortunately converted these inline helpers to exported
functions despite the fact that the unlock helper was restoring irq
flags, something which needs to be done in the same function that saved
them (e.g. on SPARC).
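
(The constraint, illustrated: the flags cookie from spin_lock_irqsave() may
only be passed to spin_unlock_irqrestore() from the same stack frame, so an
out-of-line unlock helper is the broken shape. Hedged sketch, names
illustrative:)

#include <linux/serial_core.h>

static void demo_rx_irq(struct uart_port *port)
{
	unsigned long flags;

	spin_lock_irqsave(&port->lock, flags);
	/* ... receive characters ... */

	/*
	 * With uart_unlock_and_check_sysrq() out of line, the restore
	 * of 'flags' happens in a different function than the save,
	 * which is not allowed on e.g. SPARC; inlining it again (this
	 * series) keeps save and restore in the same frame.
	 */
	uart_unlock_and_check_sysrq(port, flags);
}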

Fixes: 8e20fc391711 ("serial_core: Move sysrq functions from header file")
Cc: stable  # 5.6
Cc: Dmitry Safonov <0x7f454...@gmail.com>
Signed-off-by: Johan Hovold 
---
 drivers/tty/serial/serial_core.c | 19 ---
 include/linux/serial_core.h  | 21 +++--
 2 files changed, 19 insertions(+), 21 deletions(-)

diff --git a/drivers/tty/serial/serial_core.c b/drivers/tty/serial/serial_core.c
index edfb7bc14bbf..f6cf9cc4ce69 100644
--- a/drivers/tty/serial/serial_core.c
+++ b/drivers/tty/serial/serial_core.c
@@ -3239,25 +3239,6 @@ int uart_prepare_sysrq_char(struct uart_port *port, unsigned int ch)
 }
 EXPORT_SYMBOL_GPL(uart_prepare_sysrq_char);
 
-void uart_unlock_and_check_sysrq(struct uart_port *port, unsigned long irqflags)
-{
-   int sysrq_ch;
-
-   if (!port->has_sysrq) {
-   spin_unlock_irqrestore(&port->lock, irqflags);
-   return;
-   }
-
-   sysrq_ch = port->sysrq_ch;
-   port->sysrq_ch = 0;
-
-   spin_unlock_irqrestore(&port->lock, irqflags);
-
-   if (sysrq_ch)
-   handle_sysrq(sysrq_ch);
-}
-EXPORT_SYMBOL_GPL(uart_unlock_and_check_sysrq);
-
 /*
  * We do the SysRQ and SAK checking like this...
  */
diff --git a/include/linux/serial_core.h b/include/linux/serial_core.h
index 1f4443db5474..858c5dd926ad 100644
--- a/include/linux/serial_core.h
+++ b/include/linux/serial_core.h
@@ -462,8 +462,25 @@ extern void uart_insert_char(struct uart_port *port, unsigned int status,
 
 extern int uart_handle_sysrq_char(struct uart_port *port, unsigned int ch);
 extern int uart_prepare_sysrq_char(struct uart_port *port, unsigned int ch);
-extern void uart_unlock_and_check_sysrq(struct uart_port *port,
-   unsigned long irqflags);
+
+static inline void uart_unlock_and_check_sysrq(struct uart_port *port, unsigned long irqflags)
+{
+   int sysrq_ch;
+
+   if (!port->has_sysrq) {
+   spin_unlock_irqrestore(&port->lock, irqflags);
+   return;
+   }
+
+   sysrq_ch = port->sysrq_ch;
+   port->sysrq_ch = 0;
+
+   spin_unlock_irqrestore(&port->lock, irqflags);
+
+   if (sysrq_ch)
+   handle_sysrq(sysrq_ch);
+}
+
 extern int uart_handle_break(struct uart_port *port);
 
 /*
-- 
2.26.2



[PATCH 0/4] serial: core: fix up sysrq regressions

2020-06-02 Thread Johan Hovold
This series fixes a few regressions introduced by the recent sysrq
rework that went into 5.6.

The port unlock fix is tagged for stable, and the fix for the
unnecessary per-character overhead probably could be as well although
it is a bit more intrusive.

Johan


Johan Hovold (4):
  Revert "serial: core: Refactor uart_unlock_and_check_sysrq()"
  serial: core: fix broken sysrq port unlock
  serial: core: fix sysrq overhead regression
  serial: core: drop redundant sysrq checks

 drivers/tty/serial/serial_core.c |  96 +---
 include/linux/serial_core.h  | 105 +--
 2 files changed, 103 insertions(+), 98 deletions(-)

-- 
2.26.2



[PATCH 1/4] Revert "serial: core: Refactor uart_unlock_and_check_sysrq()"

2020-06-02 Thread Johan Hovold
This reverts commit da9a5aa3402db0ff3b57216d8dbf2478e1046cae.

In order to ease backporting a fix for broken port unlock, revert this
rewrite which was since added on top.

The other sysrq helpers bail out early when sysrq is not enabled; it's
better to keep that pattern here as well.

Note that the __releases() attribute won't be needed after the follow-on
fix either.

Signed-off-by: Johan Hovold 
---
 drivers/tty/serial/serial_core.c | 23 +--
 include/linux/serial_core.h  |  3 ++-
 2 files changed, 15 insertions(+), 11 deletions(-)

diff --git a/drivers/tty/serial/serial_core.c b/drivers/tty/serial/serial_core.c
index 66a5e2faf57e..edfb7bc14bbf 100644
--- a/drivers/tty/serial/serial_core.c
+++ b/drivers/tty/serial/serial_core.c
@@ -3239,19 +3239,22 @@ int uart_prepare_sysrq_char(struct uart_port *port, unsigned int ch)
 }
 EXPORT_SYMBOL_GPL(uart_prepare_sysrq_char);
 
-void uart_unlock_and_check_sysrq(struct uart_port *port, unsigned long flags)
-__releases(&port->lock)
+void uart_unlock_and_check_sysrq(struct uart_port *port, unsigned long irqflags)
 {
-   if (port->has_sysrq) {
-   int sysrq_ch = port->sysrq_ch;
+   int sysrq_ch;
 
-   port->sysrq_ch = 0;
-   spin_unlock_irqrestore(&port->lock, flags);
-   if (sysrq_ch)
-   handle_sysrq(sysrq_ch);
-   } else {
-   spin_unlock_irqrestore(&port->lock, flags);
+   if (!port->has_sysrq) {
+   spin_unlock_irqrestore(&port->lock, irqflags);
+   return;
 }
+
+   sysrq_ch = port->sysrq_ch;
+   port->sysrq_ch = 0;
+
+   spin_unlock_irqrestore(&port->lock, irqflags);
+
+   if (sysrq_ch)
+   handle_sysrq(sysrq_ch);
 }
 EXPORT_SYMBOL_GPL(uart_unlock_and_check_sysrq);
 
diff --git a/include/linux/serial_core.h b/include/linux/serial_core.h
index 92f5eba86052..1f4443db5474 100644
--- a/include/linux/serial_core.h
+++ b/include/linux/serial_core.h
@@ -462,7 +462,8 @@ extern void uart_insert_char(struct uart_port *port, unsigned int status,
 
 extern int uart_handle_sysrq_char(struct uart_port *port, unsigned int ch);
 extern int uart_prepare_sysrq_char(struct uart_port *port, unsigned int ch);
-extern void uart_unlock_and_check_sysrq(struct uart_port *port, unsigned long flags);
+extern void uart_unlock_and_check_sysrq(struct uart_port *port,
+   unsigned long irqflags);
+extern void uart_unlock_and_check_sysrq(struct uart_port *port,
+   unsigned long irqflags);
 extern int uart_handle_break(struct uart_port *port);
 
 /*
-- 
2.26.2



[PATCH 3/4] serial: core: fix sysrq overhead regression

2020-06-02 Thread Johan Hovold
Commit 8e20fc391711 ("serial_core: Move sysrq functions from header
file") converted the inline sysrq helpers to exported functions which
are now called for every received character and break signal also on
systems without CONFIG_MAGIC_SYSRQ_SERIAL instead of being optimised
away by the compiler.

Inlining these helpers again also avoids the function call overhead when
CONFIG_MAGIC_SYSRQ_SERIAL is enabled.

Fixes: 8e20fc391711 ("serial_core: Move sysrq functions from header file")
Cc: Dmitry Safonov <0x7f454...@gmail.com>
Signed-off-by: Johan Hovold 
---
 drivers/tty/serial/serial_core.c | 80 +-
 include/linux/serial_core.h  | 85 ++--
 2 files changed, 84 insertions(+), 81 deletions(-)

diff --git a/drivers/tty/serial/serial_core.c b/drivers/tty/serial/serial_core.c
index f6cf9cc4ce69..a714402dcf4e 100644
--- a/drivers/tty/serial/serial_core.c
+++ b/drivers/tty/serial/serial_core.c
@@ -41,8 +41,6 @@ static struct lock_class_key port_lock_key;
 
 #define HIGH_BITS_OFFSET   ((sizeof(long)-sizeof(int))*8)
 
-#define SYSRQ_TIMEOUT  (HZ * 5)
-
 static void uart_change_speed(struct tty_struct *tty, struct uart_state *state,
struct ktermios *old_termios);
 static void uart_wait_until_sent(struct tty_struct *tty, int timeout);
@@ -3163,7 +3161,7 @@ static DECLARE_WORK(sysrq_enable_work, uart_sysrq_on);
  * Returns false if @ch is out of enabling sequence and should be
  * handled some other way, true if @ch was consumed.
  */
-static bool uart_try_toggle_sysrq(struct uart_port *port, unsigned int ch)
+bool uart_try_toggle_sysrq(struct uart_port *port, unsigned int ch)
 {
int sysrq_toggle_seq_len = strlen(sysrq_toggle_seq);
 
@@ -3186,83 +3184,9 @@ static bool uart_try_toggle_sysrq(struct uart_port *port, unsigned int ch)
port->sysrq = 0;
return true;
 }
-#else
-static inline bool uart_try_toggle_sysrq(struct uart_port *port, unsigned int ch)
-{
-   return false;
-}
+EXPORT_SYMBOL_GPL(uart_try_toggle_sysrq);
 #endif
 
-int uart_handle_sysrq_char(struct uart_port *port, unsigned int ch)
-{
-   if (!IS_ENABLED(CONFIG_MAGIC_SYSRQ_SERIAL))
-   return 0;
-
-   if (!port->has_sysrq || !port->sysrq)
-   return 0;
-
-   if (ch && time_before(jiffies, port->sysrq)) {
-   if (sysrq_mask()) {
-   handle_sysrq(ch);
-   port->sysrq = 0;
-   return 1;
-   }
-   if (uart_try_toggle_sysrq(port, ch))
-   return 1;
-   }
-   port->sysrq = 0;
-
-   return 0;
-}
-EXPORT_SYMBOL_GPL(uart_handle_sysrq_char);
-
-int uart_prepare_sysrq_char(struct uart_port *port, unsigned int ch)
-{
-   if (!IS_ENABLED(CONFIG_MAGIC_SYSRQ_SERIAL))
-   return 0;
-
-   if (!port->has_sysrq || !port->sysrq)
-   return 0;
-
-   if (ch && time_before(jiffies, port->sysrq)) {
-   if (sysrq_mask()) {
-   port->sysrq_ch = ch;
-   port->sysrq = 0;
-   return 1;
-   }
-   if (uart_try_toggle_sysrq(port, ch))
-   return 1;
-   }
-   port->sysrq = 0;
-
-   return 0;
-}
-EXPORT_SYMBOL_GPL(uart_prepare_sysrq_char);
-
-/*
- * We do the SysRQ and SAK checking like this...
- */
-int uart_handle_break(struct uart_port *port)
-{
-   struct uart_state *state = port->state;
-
-   if (port->handle_break)
-   port->handle_break(port);
-
-   if (port->has_sysrq && uart_console(port)) {
-   if (!port->sysrq) {
-   port->sysrq = jiffies + SYSRQ_TIMEOUT;
-   return 1;
-   }
-   port->sysrq = 0;
-   }
-
-   if (port->flags & UPF_SAK)
-   do_SAK(state->port.tty);
-   return 0;
-}
-EXPORT_SYMBOL_GPL(uart_handle_break);
-
 EXPORT_SYMBOL(uart_write_wakeup);
 EXPORT_SYMBOL(uart_register_driver);
 EXPORT_SYMBOL(uart_unregister_driver);
diff --git a/include/linux/serial_core.h b/include/linux/serial_core.h
index 858c5dd926ad..a8a213b71553 100644
--- a/include/linux/serial_core.h
+++ b/include/linux/serial_core.h
@@ -460,8 +460,49 @@ extern void uart_handle_cts_change(struct uart_port *uport,
 extern void uart_insert_char(struct uart_port *port, unsigned int status,
 unsigned int overrun, unsigned int ch, unsigned int flag);
 
-extern int uart_handle_sysrq_char(struct uart_port *port, unsigned int ch);
-extern int uart_prepare_sysrq_char(struct uart_port *port, unsigned int ch);
+#ifdef CONFIG_MAGIC_SYSRQ_SERIAL
+
+#define SYSRQ_TIMEOUT  (HZ * 5)
+
+bool uart_try_toggle_sysrq(struct uart_port *port, unsigned int ch);
+
+static inline int uart_handle_sysrq_char(struct uart_port *port, unsigned int ch)
+{
+   if (!port->has_sysrq || !port->sysrq)
+   return 0;
+
+   if (ch 

Re: [PATCH 07/15] drm/amdgpu: use PCI_IRQ_MSI_TYPES where appropriate

2020-06-02 Thread Andy Shevchenko
On Tue, Jun 2, 2020 at 4:38 PM Ruhl, Michael J  wrote:
> >-Original Message-
> >From: dri-devel  On Behalf Of
> >Piotr Stankiewicz
> >Sent: Tuesday, June 2, 2020 5:21 AM
> >To: Alex Deucher ; Christian König
> >; David Zhou ; David
> >Airlie ; Daniel Vetter 
> >Cc: Stankiewicz, Piotr ; dri-
> >de...@lists.freedesktop.org; amd-...@lists.freedesktop.org; linux-
> >ker...@vger.kernel.org
> >Subject: [PATCH 07/15] drm/amdgpu: use PCI_IRQ_MSI_TYPES where
> >appropriate

...

> >   int nvec = pci_msix_vec_count(adev->pdev);
> >   unsigned int flags;
> >
> >-  if (nvec <= 0) {
> >+  if (nvec > 0)
> >+  flags = PCI_IRQ_MSI_TYPES;
> >+  else
> >   flags = PCI_IRQ_MSI;
> >-  } else {
> >-  flags = PCI_IRQ_MSI | PCI_IRQ_MSIX;
> >-  }
>
> Minor nit:
>
> Is it really necessary to do this check?  Can flags just
> be set?
>
> I.e.:
> flags = PCI_IRQ_MSI_TYPES;
>
> pci_alloc_irq_vector() tries stuff in order.  If MSIX is not available,
> it will try MSI.

That's also what I proposed earlier. But I suggested as well to wait
for AMD people to confirm that neither pci_msix_vec_count() nor flags
is needed and we can directly supply MSI_TYPES to the below call.

> >   /* we only need one vector */
> >   nvec = pci_alloc_irq_vectors(adev->pdev, 1, 1, flags);
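
Concretely, the simplification being discussed would be (sketch only, pending
confirmation that the pci_msix_vec_count() probe really is unnecessary):

	/* pci_alloc_irq_vectors() tries MSI-X first and falls back to
	 * MSI on its own, so no need to probe the vector count here */
	nvec = pci_alloc_irq_vectors(adev->pdev, 1, 1, PCI_IRQ_MSI_TYPES);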

-- 
With Best Regards,
Andy Shevchenko


Re: [PATCH] spi: sprd: call pm_runtime_put if pm_runtime_get_sync fails

2020-06-02 Thread Baolin Wang
Hi,

On Tue, Jun 2, 2020 at 1:20 PM Navid Emamdoost
 wrote:
>
> A call to pm_runtime_get_sync() increments the usage counter even in
> case of failure, leading to an incorrect ref count.
> Call pm_runtime_put_noidle() if pm_runtime_get_sync() fails.
>
> Signed-off-by: Navid Emamdoost 

Looks good to me. Thanks
Reviewed-by: Baolin Wang 

> ---
>  drivers/spi/spi-sprd.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/drivers/spi/spi-sprd.c b/drivers/spi/spi-sprd.c
> index 6678f1cbc566..860032af4b98 100644
> --- a/drivers/spi/spi-sprd.c
> +++ b/drivers/spi/spi-sprd.c
> @@ -1018,6 +1018,7 @@ static int sprd_spi_remove(struct platform_device *pdev)
> ret = pm_runtime_get_sync(ss->dev);
> if (ret < 0) {
> dev_err(ss->dev, "failed to resume SPI controller\n");
> +   pm_runtime_put_noidle(&pdev->dev);
> return ret;
> }
>
> --
> 2.17.1
>


-- 
Baolin Wang


[PATCH v4 1/6] arm64: Detect the ARMv8.4 TTL feature

2020-06-02 Thread Zhenyu Ye
From: Marc Zyngier 

In order to reduce the cost of TLB invalidation, the ARMv8.4 TTL
feature allows TLBI instructions to be issued with a level hint,
allowing for quicker invalidation.

Let's detect the feature for now. Further patches will implement
its actual usage.

Signed-off-by: Marc Zyngier 
Signed-off-by: Zhenyu Ye 
Reviewed-by: Catalin Marinas 
---
 arch/arm64/include/asm/cpucaps.h |  3 ++-
 arch/arm64/include/asm/sysreg.h  |  1 +
 arch/arm64/kernel/cpufeature.c   | 11 +++
 3 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h
index 8eb5a088ae65..cabb0c49a1d1 100644
--- a/arch/arm64/include/asm/cpucaps.h
+++ b/arch/arm64/include/asm/cpucaps.h
@@ -61,7 +61,8 @@
 #define ARM64_HAS_AMU_EXTN 51
 #define ARM64_HAS_ADDRESS_AUTH 52
 #define ARM64_HAS_GENERIC_AUTH 53
+#define ARM64_HAS_ARMv8_4_TTL  54
 
-#define ARM64_NCAPS54
+#define ARM64_NCAPS55
 
 #endif /* __ASM_CPUCAPS_H */
diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index c4ac0ac25a00..477d84ba1056 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -725,6 +725,7 @@
 
 /* id_aa64mmfr2 */
 #define ID_AA64MMFR2_E0PD_SHIFT60
+#define ID_AA64MMFR2_TTL_SHIFT 48
 #define ID_AA64MMFR2_FWB_SHIFT 40
 #define ID_AA64MMFR2_AT_SHIFT  32
 #define ID_AA64MMFR2_LVA_SHIFT 16
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 9fac745aa7bb..d993dc6dc7d5 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -244,6 +244,7 @@ static const struct arm64_ftr_bits ftr_id_aa64mmfr1[] = {
 
 static const struct arm64_ftr_bits ftr_id_aa64mmfr2[] = {
	ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64MMFR2_E0PD_SHIFT, 4, 0),
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64MMFR2_TTL_SHIFT, 4, 0),
	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64MMFR2_FWB_SHIFT, 4, 0),
	ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64MMFR2_AT_SHIFT, 4, 0),
	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64MMFR2_LVA_SHIFT, 4, 0),
@@ -1622,6 +1623,16 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
.matches = has_cpuid_feature,
.cpu_enable = cpu_has_fwb,
},
+   {
+   .desc = "ARMv8.4 Translation Table Level",
+   .type = ARM64_CPUCAP_SYSTEM_FEATURE,
+   .capability = ARM64_HAS_ARMv8_4_TTL,
+   .sys_reg = SYS_ID_AA64MMFR2_EL1,
+   .sign = FTR_UNSIGNED,
+   .field_pos = ID_AA64MMFR2_TTL_SHIFT,
+   .min_field_value = 1,
+   .matches = has_cpuid_feature,
+   },
 #ifdef CONFIG_ARM64_HW_AFDBM
{
/*
-- 
2.19.1




[PATCH v4 3/6] arm64: Add tlbi_user_level TLB invalidation helper

2020-06-02 Thread Zhenyu Ye
Add a level-hinted parameter to __tlbi_user, which only gets used
if ARMv8.4-TTL gets detected.

ARMv8.4-TTL provides the TTL field in tlbi instruction to indicate
the level of translation table walk holding the leaf entry for the
address that is being invalidated.

This patch sets the default level value of flush_tlb_range() to 0,
which will be updated in future patches.  It also sets the ttl value of
flush_tlb_page_nosync() to 3, because that function is only called to
flush a single pte page.

Signed-off-by: Zhenyu Ye 
---
 arch/arm64/include/asm/tlbflush.h | 19 +--
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
index 8adbd6fd8489..bfb58e62c127 100644
--- a/arch/arm64/include/asm/tlbflush.h
+++ b/arch/arm64/include/asm/tlbflush.h
@@ -88,6 +88,12 @@
__tlbi(op,  arg);   \
 } while (0)
 
+#define __tlbi_user_level(op, arg, level) do { \
+   if (arm64_kernel_unmapped_at_el0()) \
+   __tlbi_level(op, (arg | USER_ASID_FLAG), level);\
+} while (0)
+
+
 /*
  * TLB Invalidation
  * 
@@ -189,8 +195,9 @@ static inline void flush_tlb_page_nosync(struct vm_area_struct *vma,
unsigned long addr = __TLBI_VADDR(uaddr, ASID(vma->vm_mm));
 
dsb(ishst);
-   __tlbi(vale1is, addr);
-   __tlbi_user(vale1is, addr);
+   /* This function is only called on a small page */
+   __tlbi_level(vale1is, addr, 3);
+   __tlbi_user_level(vale1is, addr, 3);
 }
 
 static inline void flush_tlb_page(struct vm_area_struct *vma,
@@ -230,11 +237,11 @@ static inline void __flush_tlb_range(struct vm_area_struct *vma,
dsb(ishst);
for (addr = start; addr < end; addr += stride) {
if (last_level) {
-   __tlbi(vale1is, addr);
-   __tlbi_user(vale1is, addr);
+   __tlbi_level(vale1is, addr, 0);
+   __tlbi_user_level(vale1is, addr, 0);
} else {
-   __tlbi(vae1is, addr);
-   __tlbi_user(vae1is, addr);
+   __tlbi_level(vae1is, addr, 0);
+   __tlbi_user_level(vae1is, addr, 0);
}
}
dsb(ish);
-- 
2.19.1




[PATCH v4 5/6] arm64: tlb: Set the TTL field in flush_tlb_range

2020-06-02 Thread Zhenyu Ye
This patch uses the cleared_* in struct mmu_gather to set the
TTL field in flush_tlb_range().

Signed-off-by: Zhenyu Ye 
Reviewed-by: Catalin Marinas 
---
 arch/arm64/include/asm/tlb.h  | 29 -
 arch/arm64/include/asm/tlbflush.h | 14 --
 2 files changed, 36 insertions(+), 7 deletions(-)

diff --git a/arch/arm64/include/asm/tlb.h b/arch/arm64/include/asm/tlb.h
index b76df828e6b7..61c97d3b58c7 100644
--- a/arch/arm64/include/asm/tlb.h
+++ b/arch/arm64/include/asm/tlb.h
@@ -21,11 +21,37 @@ static void tlb_flush(struct mmu_gather *tlb);
 
 #include 
 
+/*
+ * Get the TLBI level for arm64.  The default value is 0 if more than
+ * one of cleared_* is set, or if none is set.
+ * Arm64 doesn't support p4ds now.
+ */
+static inline int tlb_get_level(struct mmu_gather *tlb)
+{
+   if (tlb->cleared_ptes && !(tlb->cleared_pmds ||
+  tlb->cleared_puds ||
+  tlb->cleared_p4ds))
+   return 3;
+
+   if (tlb->cleared_pmds && !(tlb->cleared_ptes ||
+  tlb->cleared_puds ||
+  tlb->cleared_p4ds))
+   return 2;
+
+   if (tlb->cleared_puds && !(tlb->cleared_ptes ||
+  tlb->cleared_pmds ||
+  tlb->cleared_p4ds))
+   return 1;
+
+   return 0;
+}
+
 static inline void tlb_flush(struct mmu_gather *tlb)
 {
struct vm_area_struct vma = TLB_FLUSH_VMA(tlb->mm, 0);
bool last_level = !tlb->freed_tables;
unsigned long stride = tlb_get_unmap_size(tlb);
+   int tlb_level = tlb_get_level(tlb);
 
/*
 * If we're tearing down the address space then we only care about
@@ -38,7 +64,8 @@ static inline void tlb_flush(struct mmu_gather *tlb)
return;
}
 
-   __flush_tlb_range(&vma, tlb->start, tlb->end, stride, last_level);
+   __flush_tlb_range(&vma, tlb->start, tlb->end, stride,
+ last_level, tlb_level);
 }
 
 static inline void __pte_free_tlb(struct mmu_gather *tlb, pgtable_t pte,
diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
index bfb58e62c127..84cb98b60b7b 100644
--- a/arch/arm64/include/asm/tlbflush.h
+++ b/arch/arm64/include/asm/tlbflush.h
@@ -215,7 +215,8 @@ static inline void flush_tlb_page(struct vm_area_struct *vma,
 
 static inline void __flush_tlb_range(struct vm_area_struct *vma,
 unsigned long start, unsigned long end,
-unsigned long stride, bool last_level)
+unsigned long stride, bool last_level,
+int tlb_level)
 {
unsigned long asid = ASID(vma->vm_mm);
unsigned long addr;
@@ -237,11 +238,11 @@ static inline void __flush_tlb_range(struct vm_area_struct *vma,
dsb(ishst);
for (addr = start; addr < end; addr += stride) {
if (last_level) {
-   __tlbi_level(vale1is, addr, 0);
-   __tlbi_user_level(vale1is, addr, 0);
+   __tlbi_level(vale1is, addr, tlb_level);
+   __tlbi_user_level(vale1is, addr, tlb_level);
} else {
-   __tlbi_level(vae1is, addr, 0);
-   __tlbi_user_level(vae1is, addr, 0);
+   __tlbi_level(vae1is, addr, tlb_level);
+   __tlbi_user_level(vae1is, addr, tlb_level);
}
}
dsb(ish);
@@ -253,8 +254,9 @@ static inline void flush_tlb_range(struct vm_area_struct *vma,
	/*
	 * We cannot use leaf-only invalidation here, since we may be invalidating
	 * table entries as part of collapsing hugepages or moving page tables.
+	 * Set the tlb_level to 0 because we cannot get enough information here.
	 */
-   __flush_tlb_range(vma, start, end, PAGE_SIZE, false);
+   __flush_tlb_range(vma, start, end, PAGE_SIZE, false, 0);
 }
 
 static inline void flush_tlb_kernel_range(unsigned long start, unsigned long end)
-- 
2.19.1




[PATCH v4 4/6] tlb: mmu_gather: add tlb_flush_*_range APIs

2020-06-02 Thread Zhenyu Ye
From: "Peter Zijlstra (Intel)" 

tlb_flush_{pte|pmd|pud|p4d}_range() adjust the tlb->start and
tlb->end, then set corresponding cleared_*.

Signed-off-by: Peter Zijlstra (Intel) 
Signed-off-by: Zhenyu Ye 
Acked-by: Catalin Marinas 
---
 include/asm-generic/tlb.h | 55 ---
 1 file changed, 40 insertions(+), 15 deletions(-)

diff --git a/include/asm-generic/tlb.h b/include/asm-generic/tlb.h
index 3f1649a8cf55..ef75ec86f865 100644
--- a/include/asm-generic/tlb.h
+++ b/include/asm-generic/tlb.h
@@ -512,6 +512,38 @@ static inline void tlb_end_vma(struct mmu_gather *tlb, struct vm_area_struct *vm
 }
 #endif
 
+/*
+ * tlb_flush_{pte|pmd|pud|p4d}_range() adjust the tlb->start and tlb->end,
+ * and set corresponding cleared_*.
+ */
+static inline void tlb_flush_pte_range(struct mmu_gather *tlb,
+unsigned long address, unsigned long size)
+{
+   __tlb_adjust_range(tlb, address, size);
+   tlb->cleared_ptes = 1;
+}
+
+static inline void tlb_flush_pmd_range(struct mmu_gather *tlb,
+unsigned long address, unsigned long size)
+{
+   __tlb_adjust_range(tlb, address, size);
+   tlb->cleared_pmds = 1;
+}
+
+static inline void tlb_flush_pud_range(struct mmu_gather *tlb,
+unsigned long address, unsigned long size)
+{
+   __tlb_adjust_range(tlb, address, size);
+   tlb->cleared_puds = 1;
+}
+
+static inline void tlb_flush_p4d_range(struct mmu_gather *tlb,
+unsigned long address, unsigned long size)
+{
+   __tlb_adjust_range(tlb, address, size);
+   tlb->cleared_p4ds = 1;
+}
+
 #ifndef __tlb_remove_tlb_entry
 #define __tlb_remove_tlb_entry(tlb, ptep, address) do { } while (0)
 #endif
@@ -525,19 +557,17 @@ static inline void tlb_end_vma(struct mmu_gather *tlb, struct vm_area_struct *vm
  */
 #define tlb_remove_tlb_entry(tlb, ptep, address)   \
do {\
-   __tlb_adjust_range(tlb, address, PAGE_SIZE);\
-   tlb->cleared_ptes = 1;  \
+   tlb_flush_pte_range(tlb, address, PAGE_SIZE);   \
__tlb_remove_tlb_entry(tlb, ptep, address); \
} while (0)
 
 #define tlb_remove_huge_tlb_entry(h, tlb, ptep, address)   \
do {\
unsigned long _sz = huge_page_size(h);  \
-   __tlb_adjust_range(tlb, address, _sz);  \
if (_sz == PMD_SIZE)\
-   tlb->cleared_pmds = 1;  \
+   tlb_flush_pmd_range(tlb, address, _sz); \
else if (_sz == PUD_SIZE)   \
-   tlb->cleared_puds = 1;  \
+   tlb_flush_pud_range(tlb, address, _sz); \
__tlb_remove_tlb_entry(tlb, ptep, address); \
} while (0)
 
@@ -551,8 +581,7 @@ static inline void tlb_end_vma(struct mmu_gather *tlb, struct vm_area_struct *vm
 
 #define tlb_remove_pmd_tlb_entry(tlb, pmdp, address)   \
do {\
-   __tlb_adjust_range(tlb, address, HPAGE_PMD_SIZE);   \
-   tlb->cleared_pmds = 1;  \
+   tlb_flush_pmd_range(tlb, address, HPAGE_PMD_SIZE);  \
__tlb_remove_pmd_tlb_entry(tlb, pmdp, address); \
} while (0)
 
@@ -566,8 +595,7 @@ static inline void tlb_end_vma(struct mmu_gather *tlb, struct vm_area_struct *vm
 
 #define tlb_remove_pud_tlb_entry(tlb, pudp, address)   \
do {\
-   __tlb_adjust_range(tlb, address, HPAGE_PUD_SIZE);   \
-   tlb->cleared_puds = 1;  \
+   tlb_flush_pud_range(tlb, address, HPAGE_PUD_SIZE);  \
__tlb_remove_pud_tlb_entry(tlb, pudp, address); \
} while (0)
 
@@ -592,9 +620,8 @@ static inline void tlb_end_vma(struct mmu_gather *tlb, struct vm_area_struct *vm
 #ifndef pte_free_tlb
 #define pte_free_tlb(tlb, ptep, address)   \
do {\
-   __tlb_adjust_range(tlb, address, PAGE_SIZE);\
+   tlb_flush_pmd_range(tlb, address, PAGE_SIZE);   \
tlb->freed_tables = 1;  \
-   tlb->cleared_pmds = 1;  \
__pte_free_tlb(tlb, ptep, address); \
} while (0)
 #endif
@@ -602,9 +629,8 @@ static inline void tlb_end_vma(struct mmu_gather *tlb, struct vm_area_struct *vm
 #ifndef pmd_free_tlb
 #define pmd_free_tlb(tlb, pmdp, address)   

[PATCH v4 6/6] arm64: tlb: Set the TTL field in flush_*_tlb_range

2020-06-02 Thread Zhenyu Ye
This patch implements flush_{pmd|pud}_tlb_range() on arm64 by
calling __flush_tlb_range() with the corresponding stride and
tlb_level values.

Signed-off-by: Zhenyu Ye 
---
 arch/arm64/include/asm/pgtable.h | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 538c85e62f86..bc59814eda64 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -40,6 +40,16 @@ extern void __pmd_error(const char *file, int line, unsigned long val);
 extern void __pud_error(const char *file, int line, unsigned long val);
 extern void __pgd_error(const char *file, int line, unsigned long val);
 
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+#define __HAVE_ARCH_FLUSH_PMD_TLB_RANGE
+
+/* Set stride and tlb_level in flush_*_tlb_range */
+#define flush_pmd_tlb_range(vma, addr, end)\
+   __flush_tlb_range(vma, addr, end, PMD_SIZE, false, 2)
+#define flush_pud_tlb_range(vma, addr, end)\
+   __flush_tlb_range(vma, addr, end, PUD_SIZE, false, 1)
+#endif /* CONFIG_TRANSPARENT_HUGEPAGE */
+
 /*
  * ZERO_PAGE is a global shared page that is always zero: used
  * for zero-mapped memory areas etc..
-- 
2.19.1




[PATCH v4 2/6] arm64: Add level-hinted TLB invalidation helper

2020-06-02 Thread Zhenyu Ye
From: Marc Zyngier 

Add a level-hinted TLB invalidation helper that only gets used if
ARMv8.4-TTL gets detected.

Signed-off-by: Marc Zyngier 
Signed-off-by: Zhenyu Ye 
Reviewed-by: Catalin Marinas 
---
 arch/arm64/include/asm/tlbflush.h | 29 +
 1 file changed, 29 insertions(+)

diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
index bc3949064725..8adbd6fd8489 100644
--- a/arch/arm64/include/asm/tlbflush.h
+++ b/arch/arm64/include/asm/tlbflush.h
@@ -10,6 +10,7 @@
 
 #ifndef __ASSEMBLY__
 
+#include <linux/bitfield.h>
 #include 
 #include 
 #include 
@@ -59,6 +60,34 @@
__ta;   \
})
 
+#define TLBI_TTL_MASK  GENMASK_ULL(47, 44)
+
+#define __tlbi_level(op, addr, level) do { \
+   u64 arg = addr; \
+   \
+   if (cpus_have_const_cap(ARM64_HAS_ARMv8_4_TTL) &&   \
+   level) {\
+   u64 ttl = level;\
+   \
+   switch (PAGE_SIZE) {\
+   case SZ_4K: \
+   ttl |= 1 << 2;  \
+   break;  \
+   case SZ_16K:\
+   ttl |= 2 << 2;  \
+   break;  \
+   case SZ_64K:\
+   ttl |= 3 << 2;  \
+   break;  \
+   }   \
+   \
+   arg &= ~TLBI_TTL_MASK;  \
+   arg |= FIELD_PREP(TLBI_TTL_MASK, ttl);  \
+   }   \
+   \
+   __tlbi(op,  arg);   \
+} while (0)
+
 /*
  * TLB Invalidation
  * 
-- 
2.19.1




Re: [PATCH 1/7] drm/rockchip: prepare common code for cdns and rk dpi/dp driver

2020-06-02 Thread Emil Velikov
Hi Sandor Yu,

On Mon, 1 Jun 2020 at 07:29,  wrote:
>
> From: Sandor Yu 
>
> - Extracted common fields from cdn_dp_device to a new cdns_mhdp_device
>   structure which will be used by two separate drivers later on.
> - Moved some datatypes (audio_format, audio_info, vic_pxl_encoding_format,
>   video_info) from cdn-dp-core.c to cdn-dp-reg.h.
> - Changed prefixes from cdn_dp to cdns_mhdp
> cdn -> cdns to match the other Cadence drivers
> dp -> mhdp to distinguish it from "just a DP", as the IP underneath
>   this register map can be an HDMI one (which is internally different,
>   but the interface for commands and events is pretty much the same).
> - Modified cdn-dp-core.c to use the new driver structure and new function
>   names.
> - writel and readl are replaced by cdns_mhdp_bus_write and
>   cdns_mhdp_bus_read.
>
The high-level idea is great - split, refactor and reuse the existing drivers.

Although looking at the patches themselves - they seem to be doing
multiple things at once,
as indicated by the extensive list in the commit log.

I would suggest splitting those up a bit, roughly in line with the
itemisation in the commit message.

Here is one hand wavy way to chunk this patch:
 1) use put_unaligned*
 2) 'use local variable dev' style of changes (as seem in cdn_dp_clk_enable)
 3) add writel/readl wrappers
 4) hookup struct cdns_mhdp_device, keep dp->mhdp detail internal.
The cdn-dp-reg.h function names/signatures will stay the same.
 5) finalize the helpers - use mhdp directly, rename

HTH
Emil

Examples:
4)
 static int cdn_dp_mailbox_read(struct cdn_dp_device *dp)
 {
+"  struct cdns_mhdp_device *mhdp = dp->mhdp;
   int val, ret;

-  ret = readx_poll_timeout(readl, dp->regs + MAILBOX_EMPTY_ADDR,
+  ret = readx_poll_timeout(readl, mhdp->regs_base + MAILBOX_EMPTY_ADDR,
...
   return fancy_readl(dp, MAILBOX0_RD_DATA) & 0xff;
 }

5)
-static int cdn_dp_mailbox_read(struct cdn_dp_device *dp)
+static int mhdp_mailbox_read(struct cdns_mhdp_device *mhdp)
 {
-  struct cdns_mhdp_device *mhdp = dp->mhdp;
   int val, ret;
...
-  return fancy_readl(dp, MAILBOX0_RD_DATA) & 0xff;
+  return cdns_mhdp_bus_read(mhdp, MAILBOX0_RD_DATA) & 0xff;
 }


[PATCH v4 0/6] arm64: tlb: add support for TTL feature

2020-06-02 Thread Zhenyu Ye
In order to reduce the cost of TLB invalidation, ARMv8.4 provides
the TTL field in TLBI instruction.  The TTL field indicates the
level of page table walk holding the leaf entry for the address
being invalidated.  This series provide support for this feature.

When ARMv8.4-TTL is implemented, the operand for TLBIs looks like
below:

* +----------+-------+----------+
* |   ASID   |  TTL  |  BADDR   |
* +----------+-------+----------+
* |63      48|47   44|43       0|

See patches for details, Thanks.

--
ChangeList:
v4:
implement flush_*_tlb_range only on arm64.

v3:
minor changes: reduce the indentation levels of __tlbi_level().

v2:
rebase series on Linux 5.7-rc1 and simplify the code implementation.

v1:
add support for TTL feature in arm64.


Marc Zyngier (2):
  arm64: Detect the ARMv8.4 TTL feature
  arm64: Add level-hinted TLB invalidation helper

Peter Zijlstra (Intel) (1):
  tlb: mmu_gather: add tlb_flush_*_range APIs

Zhenyu Ye (3):
  arm64: Add tlbi_user_level TLB invalidation helper
  arm64: tlb: Set the TTL field in flush_tlb_range
  arm64: tlb: Set the TTL field in flush_*_tlb_range

 arch/arm64/include/asm/cpucaps.h  |  3 +-
 arch/arm64/include/asm/pgtable.h  | 10 ++
 arch/arm64/include/asm/sysreg.h   |  1 +
 arch/arm64/include/asm/tlb.h  | 29 +++-
 arch/arm64/include/asm/tlbflush.h | 54 +-
 arch/arm64/kernel/cpufeature.c| 11 +++
 include/asm-generic/tlb.h | 55 ++-
 7 files changed, 138 insertions(+), 25 deletions(-)

-- 
2.19.1




Re: linux-next: manual merge of the hyperv tree with the kvm tree

2020-06-02 Thread Wei Liu
On Tue, Jun 02, 2020 at 05:18:02PM +1000, Stephen Rothwell wrote:
> Hi all,
> 
> Today's linux-next merge of the hyperv tree got a conflict in:
> 
>   arch/x86/include/asm/hyperv-tlfs.h
> 
> between commit:
> 
>   22ad0026d097 ("x86/hyper-v: Add synthetic debugger definitions")
> 

Paolo

As far as I can tell you merged that series a few days ago. Do you plan
to submit it to Linus in this merge window? How do you want to proceed
to fix the conflict?

Wei.

> from the kvm tree and commit:
> 
>   c55a844f46f9 ("x86/hyperv: Split hyperv-tlfs.h into arch dependent and 
> independent files")
> 
> from the hyperv tree.
> 
> I fixed it up (I removed the conficting bits from that file and added
> the following patch) and can carry the fix as necessary. This is now fixed
> as far as linux-next is concerned, but any non trivial conflicts should
> be mentioned to your upstream maintainer when your tree is submitted for
> merging.  You may also want to consider cooperating with the maintainer
> of the conflicting tree to minimise any particularly complex conflicts.
> 
> From: Stephen Rothwell 
> Date: Tue, 2 Jun 2020 17:15:49 +1000
> Subject: [PATCH] x86/hyperv: merge fix for hyperv-tlfs.h split
> 
> Signed-off-by: Stephen Rothwell 
> ---
>  include/asm-generic/hyperv-tlfs.h | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/include/asm-generic/hyperv-tlfs.h 
> b/include/asm-generic/hyperv-tlfs.h
> index 262fae9526b1..e73a11850055 100644
> --- a/include/asm-generic/hyperv-tlfs.h
> +++ b/include/asm-generic/hyperv-tlfs.h
> @@ -145,6 +145,9 @@ struct ms_hyperv_tsc_page {
>  #define HVCALL_SET_VP_REGISTERS  0x0051
>  #define HVCALL_POST_MESSAGE  0x005c
>  #define HVCALL_SIGNAL_EVENT  0x005d
> +#define HVCALL_POST_DEBUG_DATA   0x0069
> +#define HVCALL_RETRIEVE_DEBUG_DATA   0x006a
> +#define HVCALL_RESET_DEBUG_SESSION   0x006b
>  #define HVCALL_RETARGET_INTERRUPT0x007e
>  #define HVCALL_FLUSH_GUEST_PHYSICAL_ADDRESS_SPACE 0x00af
>  #define HVCALL_FLUSH_GUEST_PHYSICAL_ADDRESS_LIST 0x00b0
> @@ -177,6 +180,7 @@ enum HV_GENERIC_SET_FORMAT {
>  #define HV_STATUS_INVALID_HYPERCALL_INPUT3
>  #define HV_STATUS_INVALID_ALIGNMENT  4
>  #define HV_STATUS_INVALID_PARAMETER  5
> +#define HV_STATUS_OPERATION_DENIED   8
>  #define HV_STATUS_INSUFFICIENT_MEMORY11
>  #define HV_STATUS_INVALID_PORT_ID17
>  #define HV_STATUS_INVALID_CONNECTION_ID  18
> -- 
> 2.26.2
> 
> -- 
> Cheers,
> Stephen Rothwell




[PATCH v2] crypto: hisilicon - allow smaller reads in debugfs

2020-06-02 Thread Dan Carpenter
Originally this code rejected any read less than 256 bytes.  There
is no need for this artificial limit.  We should just use the normal
helper functions to read a string from the kernel.

Signed-off-by: Dan Carpenter 
---
v2: Use simple_read_from_buffer().  The v1 was slightly half arsed
because I left the original check for:

if (*pos)
return 0;

So it could result in partial reads.  The new code means that if you
want to read the buffer one byte at a time, that's fine or if you want
to read it in one 256 byte chunk that's also fine.  Plus it deletes 21
lines of code and is a lot cleaner.
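
For reference, the semantics that make both read styles work (a sketch
with illustrative values, not taken from the patch):

    char buf[8] = "status\n";
    int len = 7;

    /* read #1: count=4 -> copies "stat", returns 4, *pos becomes 4 */
    /* read #2: count=4 -> copies "us\n", returns 3, *pos becomes 7 */
    /* read #3: any count -> returns 0 (EOF), *pos stays at 7       */
    return simple_read_from_buffer(buffer, count, pos, buf, len);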

 drivers/crypto/hisilicon/qm.c | 33 ++---
 1 file changed, 6 insertions(+), 27 deletions(-)

diff --git a/drivers/crypto/hisilicon/qm.c b/drivers/crypto/hisilicon/qm.c
index 9bb263cec6c30..13ccb9e29a2e1 100644
--- a/drivers/crypto/hisilicon/qm.c
+++ b/drivers/crypto/hisilicon/qm.c
@@ -1064,19 +1064,10 @@ static ssize_t qm_cmd_read(struct file *filp, char 
__user *buffer,
char buf[QM_DBG_READ_LEN];
int len;
 
-   if (*pos)
-   return 0;
-
-   if (count < QM_DBG_READ_LEN)
-   return -ENOSPC;
+   len = scnprintf(buf, QM_DBG_READ_LEN, "%s\n",
+   "Please echo help to cmd to get help information");
 
-   len = snprintf(buf, QM_DBG_READ_LEN, "%s\n",
-  "Please echo help to cmd to get help information");
-
-   if (copy_to_user(buffer, buf, len))
-   return -EFAULT;
-
-   return (*pos = len);
+   return simple_read_from_buffer(buffer, count, pos, buf, len);
 }
 
 static void *qm_ctx_alloc(struct hisi_qm *qm, size_t ctx_size,
@@ -2691,24 +2682,12 @@ static ssize_t qm_status_read(struct file *filp, char 
__user *buffer,
 {
struct hisi_qm *qm = filp->private_data;
char buf[QM_DBG_READ_LEN];
-   int val, cp_len, len;
-
-   if (*pos)
-   return 0;
-
-   if (count < QM_DBG_READ_LEN)
-   return -ENOSPC;
+   int val, len;
 
	val = atomic_read(&qm->status.flags);
-   len = snprintf(buf, QM_DBG_READ_LEN, "%s\n", qm_s[val]);
-   if (!len)
-   return -EFAULT;
-
-   cp_len = copy_to_user(buffer, buf, len);
-   if (cp_len)
-   return -EFAULT;
+   len = scnprintf(buf, QM_DBG_READ_LEN, "%s\n", qm_s[val]);
 
-   return (*pos = len);
+   return simple_read_from_buffer(buffer, count, pos, buf, len);
 }
 
 static const struct file_operations qm_status_fops = {
-- 
2.26.2



Re: [PATCH] ARM: dts: AM33xx-l4: add gpio-ranges

2020-06-02 Thread Tony Lindgren
* Grygorii Strashko  [200602 13:44]:
> 
> 
> On 02/06/2020 16:14, Drew Fustini wrote:
> > Add gpio-ranges properties to the gpio controller nodes.
> > 
> > These gpio-ranges were created based on "Table 9-10. CONTROL_MODULE
> > REGISTERS" in the  "AM335x Technical Reference Manual" [0] and "Table
> > 4-2. Pin Attributes" in the "AM335x Sitara Processor datasheet" [1].
> > A csv file with this data is available for reference [2].
> 
> It will be good if you can explain not only "what" is changed, but
> also "why" it's needed in commit message.

Also, please check (again) that this is the same for all the am3
variants. For omap3, we had different pad assignments even between
SoC revisions. Different pad routings should be easy to deal with
in the dts if needed though.

Regards,

Tony


[PATCH] workqueue: ensure all flush_work() completed when being destroyed

2020-06-02 Thread Lai Jiangshan
In the old days, worker threads were not shared among different
workqueues, and destroy_workqueue() used kthread_stop() to destroy
all workers before tearing down the workqueue structures.
kthread_stop() ensured that the scheduled (worker->scheduled)
work items and the linked work items queued by flush_work()
had completed.

For wq users, a workqueue being done/unused means that all queued
works have completed and all flush_work() calls have returned, at
which point it is legitimate to pass the workqueue to
destroy_workqueue().

But
e22bee782b3b("workqueue: implement concurrency managed dynamic worker pool")
made worker pools and workers shared among different
workqueues, and kthread_stop() is no longer used to sync the
completion of the work items. destroy_workqueue() uses
drain_workqueue() to drain user work items, but internal work items
queued by flush_work() are not drained because they don't have colors.

So problems may occur when wq_barrier_func() does complete(&barr->done)
and the woken-up wq-user code does destroy_workqueue().
destroy_workqueue() can be scheduled earlier than process_one_work()
doing the put_pwq(), so the sanity check in destroy_workqueue() can
see the not-yet-put pwq->refcnt and complain with a false positive.

The problem could be fixed by removing WORK_NO_COLOR and making the
internal work item queued by flush_work() inherit the color of the
work items to be flushed, but that would revert the design and lose
the benefits of WORK_NO_COLOR.

The patch simply adds an atomic counter for in-flight flush_work()
and a completion for destroy_workqueue() waiting for them.

Signed-off-by: Lai Jiangshan 
---
Changed from V1:
Change from flush_no_color based mechanism to atomic+completion
based as TJ suggested.

 kernel/workqueue.c | 21 -
 1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 1921c982f920..71272beb8e01 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -253,6 +253,9 @@ struct workqueue_struct {
int nr_drainers;/* WQ: drain in progress */
int saved_max_active; /* WQ: saved pwq max_active */
 
+   atomic_tnr_flush_work;  /* flush work in progress */
+   struct completion   flush_work_done; /* sync flush_work() */
+
struct workqueue_attrs  *unbound_attrs; /* PW: only for unbound wqs */
struct pool_workqueue   *dfl_pwq;   /* PW: only for unbound wqs */
 
@@ -1154,6 +1157,12 @@ static void pwq_activate_first_delayed(struct 
pool_workqueue *pwq)
pwq_activate_delayed_work(work);
 }
 
+static void dec_nr_in_flight_flush_work(struct workqueue_struct *wq)
+{
+   if (atomic_dec_and_test(&wq->nr_flush_work))
+   complete(&wq->flush_work_done);
+}
+
 /**
  * pwq_dec_nr_in_flight - decrement pwq's nr_in_flight
  * @pwq: pwq of interest
@@ -1168,8 +1177,10 @@ static void pwq_activate_first_delayed(struct 
pool_workqueue *pwq)
 static void pwq_dec_nr_in_flight(struct pool_workqueue *pwq, int color)
 {
/* uncolored work items don't participate in flushing or nr_active */
-   if (color == WORK_NO_COLOR)
+   if (color == WORK_NO_COLOR) {
+   dec_nr_in_flight_flush_work(pwq->wq);
goto out_put;
+   }
 
pwq->nr_in_flight[color]--;
 
@@ -2682,6 +2693,7 @@ static void insert_wq_barrier(struct pool_workqueue *pwq,
}
 
	debug_work_activate(&barr->work);
+   atomic_inc(&pwq->wq->nr_flush_work);
	insert_work(pwq, &barr->work, head,
work_color_to_flags(WORK_NO_COLOR) | linked);
 }
@@ -4278,6 +4290,9 @@ struct workqueue_struct *alloc_workqueue(const char *fmt,
wq_init_lockdep(wq);
	INIT_LIST_HEAD(&wq->list);
 
+   atomic_set(&wq->nr_flush_work, 1);
+   init_completion(&wq->flush_work_done);
+
if (alloc_and_link_pwqs(wq) < 0)
goto err_unreg_lockdep;
 
@@ -4354,6 +4369,10 @@ void destroy_workqueue(struct workqueue_struct *wq)
/* drain it before proceeding with destruction */
drain_workqueue(wq);
 
+   /* flush all uncompleted internal work items queued by flush_work() */
+   dec_nr_in_flight_flush_work(wq);
+   wait_for_completion(&wq->flush_work_done);
+
/* kill rescuer, if sanity checks fail, leave it w/o rescuer */
if (wq->rescuer) {
struct worker *rescuer = wq->rescuer;
-- 
2.20.1
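
To make the race concrete, the wq-user pattern described in the commit
message looks like this (my_work/my_wq are hypothetical, sketch only):

    flush_work(&my_work);     /* wq_barrier_func() completes barr->done
                               * and wakes this thread ...            */
    destroy_workqueue(my_wq); /* ... but process_one_work() may not
                               * have done put_pwq() for the barrier
                               * item yet, so the sanity check sees a
                               * stale pwq->refcnt                    */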



Re: kobject_init_and_add is easy to misuse

2020-06-02 Thread Konstantin Khlebnikov

On 02/06/2020 15.10, Matthew Wilcox wrote:

On Tue, Jun 02, 2020 at 07:50:33PM +0800, Wang Hai wrote:

syzkaller reports a memory leak when kobject_init_and_add()
returns an error in the function sysfs_slab_add() [1]

When this happened, the function kobject_put() is not called for the
corresponding kobject, which potentially leads to memory leak.

This patch fixes the issue by calling kobject_put() even if
kobject_init_and_add() fails.


I think this speaks to a deeper problem with kobject_init_and_add()
-- the need to call kobject_put() if it fails is not readily apparent
to most users.  This same bug appears in the first three users of
kobject_init_and_add() that I checked --
arch/ia64/kernel/topology.c
drivers/firmware/dmi-sysfs.c
drivers/firmware/efi/esrt.c
drivers/scsi/iscsi_boot_sysfs.c

Some do get it right --
arch/powerpc/kernel/cacheinfo.c
drivers/gpu/drm/ttm/ttm_bo.c
drivers/gpu/drm/ttm/ttm_memory.c
drivers/infiniband/hw/mlx4/sysfs.c

I'd argue that the current behaviour is wrong, that kobject_init_and_add()
should call kobject_put() if the add fails.  This would need a tree-wide
audit.  But somebody needs to do that anyway because based on my random
sampling, half of the users currently get it wrong.
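
For reference, the pattern the correct callers use looks like this (a
minimal sketch; foo and foo_ktype are hypothetical):

    err = kobject_init_and_add(&foo->kobj, &foo_ktype, parent, "foo");
    if (err) {
            /* init took a reference even though add failed; dropping
             * it runs the ktype's release() and frees the object */
            kobject_put(&foo->kobj);
            return err;
    }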



At this point kobject doesn't own the kmem-cache structure itself yet.

So calling kobject_put() will free the kmem-cache and then it will be
freed a second time on the error path in create_cache().

I suppose freeing in case of error should be pushed from the common
create_cache() into the slab-specific __kmem_cache_create().


[PATCHv5 1/1] ext4: mballoc: Use raw_cpu_ptr instead of this_cpu_ptr

2020-06-02 Thread Ritesh Harjani
It doesn't really matter in ext4_mb_new_blocks() whether the code is
rescheduled on another cpu due to preemption, because we care about
discard_pa_seq only when the block allocation fails, and even then we
sum the seq counters of all cpus and compare against the initially
sampled one to check whether anyone has freed blocks while we were
allocating.

So just use raw_cpu_ptr instead of this_cpu_ptr to avoid this BUG.

BUG: using smp_processor_id() in preemptible [00000000] code: syz-fuzzer/6927
caller is ext4_mb_new_blocks+0xa4d/0x3b70 fs/ext4/mballoc.c:4711
CPU: 1 PID: 6927 Comm: syz-fuzzer Not tainted 5.7.0-next-20200602-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 
01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x18f/0x20d lib/dump_stack.c:118
 check_preemption_disabled+0x20d/0x220 lib/smp_processor_id.c:48
 ext4_mb_new_blocks+0xa4d/0x3b70 fs/ext4/mballoc.c:4711
 ext4_ext_map_blocks+0x201b/0x33e0 fs/ext4/extents.c:4244
 ext4_map_blocks+0x4cb/0x1640 fs/ext4/inode.c:626
 ext4_getblk+0xad/0x520 fs/ext4/inode.c:833
 ext4_bread+0x7c/0x380 fs/ext4/inode.c:883
 ext4_append+0x153/0x360 fs/ext4/namei.c:67
 ext4_init_new_dir fs/ext4/namei.c:2757 [inline]
 ext4_mkdir+0x5e0/0xdf0 fs/ext4/namei.c:2802
 vfs_mkdir+0x419/0x690 fs/namei.c:3632
 do_mkdirat+0x21e/0x280 fs/namei.c:3655
 do_syscall_64+0x60/0xe0 arch/x86/entry/common.c:359
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

Signed-off-by: Ritesh Harjani 
Reported-by: syzbot+82f324bb69744c5f6...@syzkaller.appspotmail.com
---
 fs/ext4/mballoc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index a9083113a8c0..b79b32dbe3ea 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -4708,7 +4708,7 @@ ext4_fsblk_t ext4_mb_new_blocks(handle_t *handle,
}
 
ac->ac_op = EXT4_MB_HISTORY_PREALLOC;
-   seq = *this_cpu_ptr(&discard_pa_seq);
+   seq = *raw_cpu_ptr(&discard_pa_seq);
if (!ext4_mb_use_preallocated(ac)) {
ac->ac_op = EXT4_MB_HISTORY_ALLOC;
ext4_mb_normalize_request(ac, ar);
-- 
2.21.3



Re: [PATCH v4 4/4] kdb: Switch to use safer dbg_io_ops over console APIs

2020-06-02 Thread Daniel Thompson
On Fri, May 29, 2020 at 04:56:47PM +0530, Sumit Garg wrote:
> In kgdb context, calling console handlers aren't safe due to locks used
> in those handlers which could in turn lead to a deadlock. Although, using
> oops_in_progress increases the chance to bypass locks in most console
> handlers but it might not be sufficient enough in case a console uses
> more locks (VT/TTY is good example).
> 
> Currently when a driver provides both polling I/O and a console then kdb
> will output using the console. We can increase robustness by using the
> currently active polling I/O driver (which should be lockless) instead
> of the corresponding console. For several common cases (e.g. an
> embedded system with a single serial port that is used both for console
> output and debugger I/O) this will result in no console handler being
> used.
> 
> In order to achieve this we need to reverse the order of preference to
> use dbg_io_ops (uses polling I/O mode) over console APIs. So we just
> store "struct console" that represents debugger I/O in dbg_io_ops and
> while emitting kdb messages, skip console that matches dbg_io_ops
> console in order to avoid duplicate messages. After this change,
> "is_console" param becomes redundant and hence removed.
> 
> Suggested-by: Daniel Thompson 
> Signed-off-by: Sumit Garg 

Looking good, only one minor comment left on my side (including the
three patches prior).

> diff --git a/kernel/debug/kdb/kdb_io.c b/kernel/debug/kdb/kdb_io.c
> index 9e5a40d..5e00bc8 100644
> --- a/kernel/debug/kdb/kdb_io.c
> +++ b/kernel/debug/kdb/kdb_io.c
> @@ -560,12 +560,14 @@ static void kdb_msg_write(char *msg, int msg_len)
>   if (msg_len == 0)
>   return;
>  
> - if (dbg_io_ops && !dbg_io_ops->is_console)
> + if (dbg_io_ops)
>   kdb_io_write(msg, msg_len);

Since this now slots in so cleanly and there are not multiple calls
to kdb_io_write(), I think factoring this out into its own function
(in patch 1) is no longer necessary. The character write loop can go
directly into this function.


Daniel.


Re: [PATCH v5 13/13] perf record: introduce --ctl-fd[-ack] options

2020-06-02 Thread Adrian Hunter
On 2/06/20 12:12 pm, Alexey Budankov wrote:
> 
> On 02.06.2020 11:32, Alexey Budankov wrote:
>>
>> On 02.06.2020 2:37, Andi Kleen wrote:
> or a pathname, or including also the event default of "disabled".

 For my cases conversion of pathnames into open fds belongs to external
 controlling process e.g. like in the examples provided in the patch set.
 Not sure about "event default of 'disabled'"
>>>
>>> It would be nicer for manual use cases if perf supported the path names
>>> directly like in Adrian's example, not needing a complex wrapper script.
>>
>> fds interface is required for VTune integration since VTune wants control
>> over files creation aside of Perf tool process. The script demonstrates
>> just one possible use case.
>>
>> Control files could easily be implemented on top of fds making open 
>> operations
>> for paths and then initializing fds. Interface below is vague and with 
>> explicit
>> options like below it could be more explicit:
>> --ctl-file /tmp/my-perf.fifo --ctl-file-ack /tmp/my-perf-ack.fifo
> 
> Or even clearer:
> 
> --ctl-fifo /tmp/my-perf --ctl-fifo-ack /tmp/my-perf-ack

If people are OK with having so many options, then that is fine by me.

> 
>>
>> Make either fds and or files provided on the command line. Implement file
>> options handling callbacks that would open paths and setting fds. Close fds
>> if they were opened by Perf tool process.
>>
>> Adrian, please share your mind and use case.
>>
>> ~Alexey
>>
>>>
>>> -Andi

>
> e.g. add "--control" and support all of:
>
> --control
> --control 11
> --control 11,15
> --control 11,15,disabled
> --control 11,,disabled
> --control /tmp/my-perf.fifo
> --control /tmp/my-perf.fifo,/tmp/my-perf-ack.fifo
> --control /tmp/my-perf.fifo,/tmp/my-perf-ack.fifo,disabled
> --control /tmp/my-perf.fifo,,disabled
>
> Regards
> Adrian
>

 Regards,
 Alexey




Re: [PATCH] ARM: dts: AM33xx-l4: add gpio-ranges

2020-06-02 Thread Grygorii Strashko




On 02/06/2020 16:14, Drew Fustini wrote:

Add gpio-ranges properties to the gpio controller nodes.

These gpio-ranges were created based on "Table 9-10. CONTROL_MODULE
REGISTERS" in the  "AM335x Technical Reference Manual" [0] and "Table
4-2. Pin Attributes" in the "AM335x Sitara Processor datasheet" [1].
A csv file with this data is available for reference [2].


It will be good if you can explain not only "what" is changed, but
also "why" it's needed in commit message.



[0] https://www.ti.com/lit/ug/spruh73q/spruh73q.pdf
[1] http://www.ti.com/lit/ds/symlink/am3358.pdf
[2] https://gist.github.com/pdp7/6ffaddc8867973c1c3e8612cfaf72020

Signed-off-by: Drew Fustini 
---
Notes:
There was previous dicussion relevant to this patch:
https://lore.kernel.org/linux-gpio/20200525131731.GA948395@x1/

Here is the list I compiled which shows the mapping between a gpioline
and the pin number including the memory address for the pin control
register

gpiochip,gpio-line,pinctrl-PIN,pinctrl-address
0,0,82,44e10948
0,1,83,44e1094c
0,2,84,44e10950
0,3,85,44e10954
0,4,86,44e10958
0,5,87,44e1095c


...


On a BeagleBlack Black board (AM3358) with this patch:
cat /sys/kernel/debug/pinctrl/44e10800.pinmux-pinctrl-single/gpio-ranges

GPIO ranges handled:
0: gpio-0-31 GPIOS [0 - 7] PINS [82 - 89]
8: gpio-0-31 GPIOS [8 - 11] PINS [52 - 55]
12: gpio-0-31 GPIOS [12 - 15] PINS [94 - 97]
16: gpio-0-31 GPIOS [16 - 17] PINS [71 - 72]
18: gpio-0-31 GPIOS [18 - 18] PINS [135 - 135]
19: gpio-0-31 GPIOS [19 - 20] PINS [108 - 109]
21: gpio-0-31 GPIOS [21 - 21] PINS [73 - 73]
22: gpio-0-31 GPIOS [22 - 23] PINS [8 - 9]
26: gpio-0-31 GPIOS [26 - 27] PINS [10 - 11]
28: gpio-0-31 GPIOS [28 - 28] PINS [74 - 74]
29: gpio-0-31 GPIOS [29 - 29] PINS [81 - 81]
30: gpio-0-31 GPIOS [30 - 31] PINS [28 - 29]
0: gpio-32-63 GPIOS [32 - 39] PINS [0 - 7]
8: gpio-32-63 GPIOS [40 - 43] PINS [90 - 93]
12: gpio-32-63 GPIOS [44 - 59] PINS [12 - 27]
28: gpio-32-63 GPIOS [60 - 63] PINS [30 - 33]
0: gpio-64-95 GPIOS [64 - 81] PINS [34 - 51]
18: gpio-64-95 GPIOS [82 - 85] PINS [77 - 80]
22: gpio-64-95 GPIOS [86 - 95] PINS [56 - 65]
0: gpio-96-127 GPIOS [96 - 100] PINS [66 - 70]
5: gpio-96-127 GPIOS [101 - 102] PINS [98 - 99]
7: gpio-96-127 GPIOS [103 - 104] PINS [75 - 76]
13: gpio-96-127 GPIOS [109 - 109] PINS [141 - 141]
14: gpio-96-127 GPIOS [110 - 117] PINS [100 - 107]

  arch/arm/boot/dts/am33xx-l4.dtsi | 24 
  1 file changed, 24 insertions(+)

diff --git a/arch/arm/boot/dts/am33xx-l4.dtsi b/arch/arm/boot/dts/am33xx-l4.dtsi
index 5ed7f3c58c0f..340ea331e54d 100644
--- a/arch/arm/boot/dts/am33xx-l4.dtsi
+++ b/arch/arm/boot/dts/am33xx-l4.dtsi
@@ -151,6 +151,18 @@ SYSC_OMAP2_SOFTRESET |
  
  			gpio0: gpio@0 {

compatible = "ti,omap4-gpio";
+   gpio-ranges =   <&am33xx_pinmux  0  82 8>,
+   <&am33xx_pinmux  8  52 4>,
+   <&am33xx_pinmux 12  94 4>,
+   <&am33xx_pinmux 16  71 2>,
+   <&am33xx_pinmux 18 135 1>,
+   <&am33xx_pinmux 19 108 2>,
+   <&am33xx_pinmux 21  73 1>,
+   <&am33xx_pinmux 22   8 2>,
+   <&am33xx_pinmux 26  10 2>,
+   <&am33xx_pinmux 28  74 1>,
+   <&am33xx_pinmux 29  81 1>,
+   <&am33xx_pinmux 30  28 2>;
gpio-controller;
#gpio-cells = <2>;
interrupt-controller;
@@ -1298,6 +1310,10 @@ SYSC_OMAP2_SOFTRESET |
  
  			gpio1: gpio@0 {

compatible = "ti,omap4-gpio";
+   gpio-ranges =   <&am33xx_pinmux  0  0  8>,
+   <&am33xx_pinmux  8 90  4>,
+   <&am33xx_pinmux 12 12 16>,
+   <&am33xx_pinmux 28 30  4>;
gpio-controller;
#gpio-cells = <2>;
interrupt-controller;
@@ -1700,6 +1716,9 @@ SYSC_OMAP2_SOFTRESET |
  
  			gpio2: gpio@0 {

compatible = "ti,omap4-gpio";
+gpio-ranges =  <&am33xx_pinmux  0 34 18>,
+   <&am33xx_pinmux 18 77  4>,
+   <&am33xx_pinmux 22 56 10>;
gpio-controller;
#gpio-cells = <2>;
interrupt-controller;
@@ -1733,6 +1752,11 @@ SYSC_OMAP2_SOFTRESET |
  
  			gpio3: gpio@0 {

compatible = "ti,omap4-gpio";
+   gpio-ranges =   <&am33xx_pinmux  

Re: [PATCHv3] perf stat: Ensure group is defined on top of the same cpu mask

2020-06-02 Thread Arnaldo Carvalho de Melo
Em Tue, Jun 02, 2020 at 12:17:36PM +0200, Jiri Olsa escreveu:
> Jin Yao reported the issue (and posted first versions of this change)
> with groups being defined over events with different cpu mask.
 
> This causes assert aborts in get_group_fd, like:
 
>   # perf stat -M "C2_Pkg_Residency" -a -- sleep 1
>   perf: util/evsel.c:1464: get_group_fd: Assertion `!(fd == -1)' failed.
>   Aborted
 
> All the events in the group have to be defined over the same
> cpus so the group_fd can be found for every leader/member pair.
 
> Adding check to ensure this condition is met and removing the
> group (with warning) if we detect mixed cpus, like:
 
>   $ sudo perf stat -e 
> '{power/energy-cores/,cycles},{instructions,power/energy-cores/}'
>   WARNING: event cpu maps do not match, disabling group:
> anon group { power/energy-cores/, cycles }
> anon group { instructions, power/energy-cores/ }

So it doesn't disable the 'group', it disables the 'grouping' of those
events, right? I.e. reading the WARNING, I thought that it would count
nothing, since it lists both groups as being disabled, but when I tested
I noticed that:

  [root@seventh ~]# perf stat -e 
'{power/energy-cores/,cycles},{instructions,power/energy-cores/}'
  WARNING: grouped events cpus do not match, disabling group:
anon group { power/energy-cores/, cycles }
anon group { instructions, power/energy-cores/ }
  ^C
   Performance counter stats for 'system wide':
  
   12.62 Joules power/energy-cores/
 106,920,637cycles
  80,228,899instructions  #0.75  insn per cycle
   12.62 Joules power/energy-cores/
  
14.514476987 seconds time elapsed
  
  
  [root@seventh ~]#

  I.e. it counted the events, ungrouped, or am I missing something?

If I do:

  [root@seventh ~]# perf stat -e 
'{power/energy-cores/,power/energy-ram/},{instructions,cycles}' -a sleep 2
  
   Performance counter stats for 'system wide':
  
1.73 Joules power/energy-cores/ 

0.92 Joules power/energy-ram/   

  12,191,658instructions  #0.67  insn per cycle 

  18,275,233cycles  

  
 2.001272492 seconds time elapsed
  
  [root@seventh ~]# 

It works, grouped. One observation, shouldn't we somehow show in the
output that the first two were indeed grouped, ditto for the second two?

Also, this needs improvement:

  [root@seventh ~]# perf stat -e 
'{power/energy-cores/,power/energy-ram/},{instructions,cycles}' sleep 2
  Error:
  The sys_perf_event_open() syscall returned with 22 (Invalid argument) for 
event (power/energy-cores/).
  /bin/dmesg | grep -i perf may provide additional information.
  
  [root@seventh ~]#

Probably stating that the power/ events can only be done on a system
wide mode or per-cpu?

I'm applying the patch now, with the above observations as committer
notes, we can improve this in follow on patch,

- Arnaldo
 
> Ian asked also for cpu maps details, it's displayed in verbose mode:
> 
>   $ sudo perf stat -e '{cycles,power/energy-cores/}' -v
>   WARNING: group events cpu maps do not match, disabling group:
> anon group { power/energy-cores/, cycles }
>power/energy-cores/: 0
>cycles: 0-7
> anon group { instructions, power/energy-cores/ }
>instructions: 0-7
>power/energy-cores/: 0
> 
> Fixes: 6a4bb04caacc8 ("perf tools: Enable grouping logic for parsed events")
> Co-developed-by: Jin Yao 
> Signed-off-by: Jin Yao 
> Signed-off-by: Jiri Olsa 
> ---
>  tools/perf/builtin-stat.c | 55 +++
>  1 file changed, 55 insertions(+)
> 
>  v3 changes:
>- reword the warning with Ian's suggestion
> 
> diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
> index b2b79aa161dd..9be020e0098a 100644
> --- a/tools/perf/builtin-stat.c
> +++ b/tools/perf/builtin-stat.c
> @@ -190,6 +190,59 @@ static struct perf_stat_config stat_config = {
>   .big_num= true,
>  };
>  
> +static bool cpus_map_matched(struct evsel *a, struct evsel *b)
> +{
> + if (!a->core.cpus && !b->core.cpus)
> + return true;
> +
> + if (!a->core.cpus || !b->core.cpus)
> + return false;
> +
> + if (a->core.cpus->nr != b->core.cpus->nr)
> + return false;
> +
> + for (int i = 0; i < a->core.cpus->nr; i++) {
> + if (a->core.cpus->map[i] != b->core.cpus->map[i])
> + return false;
> + }
> +
> + return true;
> +}
> +
> +static void evlist__check_cpu_maps(struct evlist *evlist)
> +{
> + struct evsel *evsel, *pos, *leader;
> + char buf[1024];
> +
> + evlist__for_each_entry(evlist, evsel) {
> + leader = evsel->leader;
> +
> + /* Check that leader matches cpus with each member. */
> +  

Re: [PATCH 1/2] i2c: mux: pca9541: Change to correct bus control commands

2020-06-02 Thread Guenter Roeck
On 6/2/20 5:12 AM, Quentin Strydom wrote:
> Change current bus commands to match the pca9541a datasheet
> (see table 12 on page 14 of
> https://www.nxp.com/docs/en/data-sheet/PCA9541A.pdf). Also,
> where entries are marked as "no change", the current control
> command is repeated, since the current master reading the
> control register has control of the bus and the bus is on.
> 
> Signed-off-by: Quentin Strydom 

I am not going to reply to your other e-mail. After all, it is marked
confidential with a lot of legalese around it.

- Split long lines in your e-mails.
- Do not just send the patch again, reply to the original patch
  with a comment such as "ping"
- If you resend (for example because still no one replied after
  more time, or you missed some mailing lists/reviewers), mark the
  patch subject with "PATCH RESEND"
- Never under any circumstances send an e-mail to a public list that
  states that "This email and any files transmitted with it are
  confidential ...".

For my part I did not reply because I have no access to the hardware
anymore, and thus can not validate if the code still works after this
change (or why I didn't notice the problem when I wrote the code initially).

Guenter

> ---
>  drivers/i2c/muxes/i2c-mux-pca9541.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/i2c/muxes/i2c-mux-pca9541.c 
> b/drivers/i2c/muxes/i2c-mux-pca9541.c
> index 6a39ada..50808fa 100644
> --- a/drivers/i2c/muxes/i2c-mux-pca9541.c
> +++ b/drivers/i2c/muxes/i2c-mux-pca9541.c
> @@ -211,7 +211,7 @@ static void pca9541_release_bus(struct i2c_client *client)
>  
>  /* Control commands per PCA9541 datasheet */
>  static const u8 pca9541_control[16] = {
> - 4, 0, 1, 5, 4, 4, 5, 5, 0, 0, 1, 1, 0, 4, 5, 1
> + 4, 4, 5, 5, 4, 4, 5, 7, 8, 0, 1, 11, 0, 0, 1, 1
>  };
>  
>  /*
> 



RE: [PATCH v4 3/5] scsi: ufs: fix potential access NULL pointer while memcpy

2020-06-02 Thread Avri Altman
But this is just a suggestion.
Your way is fine too.

Thanks,
Avri

> 
> How about something like the untested attached?
> 
> Thanks,
> Avri
> 
> > -Original Message-
> > From: Bean Huo 
> > Sent: Tuesday, June 2, 2020 2:36 PM
> > To: Avri Altman ; alim.akh...@samsung.com;
> > asuto...@codeaurora.org; j...@linux.ibm.com;
> > martin.peter...@oracle.com; stanley@mediatek.com;
> > bean...@micron.com; bvanass...@acm.org; tomas.wink...@intel.com;
> > c...@codeaurora.org
> > Cc: linux-s...@vger.kernel.org; linux-kernel@vger.kernel.org
> > Subject: Re: [PATCH v4 3/5] scsi: ufs: fix potential access NULL pointer 
> > while
> > memcpy
> >
> > CAUTION: This email originated from outside of Western Digital. Do not click
> > on links or open attachments unless you recognize the sender and know
> that
> > the content is safe.
> >
> >
> > hi Avri
> > thanks review.
> >
> >
> > On Mon, 2020-06-01 at 06:25 +, Avri Altman wrote:
> > > Hi,
> > >
> > > > If param_offset is not 0, the memcpy length shouldn't be the
> > > > true descriptor length.
> > > >
> > > > Fixes: a4b0e8a4e92b ("scsi: ufs: Factor out
> > > > ufshcd_read_desc_param")
> > > > Signed-off-by: Bean Huo 
> > > > ---
> > > >  drivers/scsi/ufs/ufshcd.c | 2 +-
> > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > >
> > > > diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
> > > > index f7e8bfefe3d4..bc52a0e89cd3 100644
> > > > --- a/drivers/scsi/ufs/ufshcd.c
> > > > +++ b/drivers/scsi/ufs/ufshcd.c
> > > > @@ -3211,7 +3211,7 @@ int ufshcd_read_desc_param(struct ufs_hba
> > > > *hba,
> > > >
> > > > /* Check wherher we will not copy more data, than available
> > > > */
> > > > if (is_kmalloc && param_size > buff_len)
> > > > -   param_size = buff_len;
> > > > +   param_size = buff_len - param_offset;
> > >
> > > But Is_kmalloc is true if (param_offset != 0 || param_size <
> > > buff_len)
> > > So  if (is_kmalloc && param_size > buff_len) implies that
> > > param_offset is 0,
> > > Or did I get it wrong?
> >
> > If param_offset is 0, this won't go wrong; after this patch it
> > is the same, since the offset is 0. As mentioned in the commit
> > message, this patch is only for the case where param_offset is not 0.
> >
> > >
> > > Still, I think that there is a problem here because nowhere we are
> > > checking that
> > > param_offset + param_size < buff_len, which now can happen because of
> > > ufs-bsg.
> > > Maybe you can add it and get rid of that is_kmalloc which is an
> > > awkward way to test for valid values?
> >
> > let me explain further:
> > we have these conditions:
> >
> > 1) param_offset == 0, param_size >= buff_len;//no problem,
> > ufshcd_query_descriptor_retry() will read descriptor with true
> > descriptor length, and no memcpy() called.
> >
> >
> > 2) param_offset == 0, param_size < buff_len;// no problem,
> > ufshcd_query_descriptor_retry() will read descriptor with true
> > descriptor length buff_len, and memcpy() with param_size length.
> >
> >
> > 3) param_offset != 0, param_offset + param_size <= buff_len;// no
> > problem, ufshcd_query_descriptor_retry() will read descriptor with true
> > descriptor length, and memcpy() with param_size length.
> >
> >
> > 4) param_offset != 0, param_offset + param_size > buff_len;// NULL
> > pointer dereference problem, since ufshcd_query_descriptor_retry() will
> > read descriptor with true descriptor length, and memcpy() with buff_len
> > length. The correct memcpy length should be (buff_len - param_offset).
> >
> > The check param_offset + param_size < buff_len doesn't need to be
> > added, and is_kmalloc is very hard to remove based on the current flow.
> >
> > so, the correct fixup patch should be like this:
> >
> >
> > -if (is_kmalloc && param_size > buff_len)
> > -   param_size = buff_len
> > +if (is_kmalloc && (param_size + param_offset) > buff_len)
> > +   param_size = buff_len - param_offset;
> >
> >
> > how do you think about it? if no problem, I will update it in next
> > version patch.
> >
> > thanks,
> >
> > Bean



Re: [PATCH] serial: 8250_port: Fix imprecise external abort for mctrl if inactive

2020-06-02 Thread Tony Lindgren
* Andy Shevchenko  [200602 08:33]:
> On Tue, Jun 2, 2020 at 11:09 AM Johan Hovold  wrote:
> > On Mon, Jun 01, 2020 at 05:18:13PM -0700, Tony Lindgren wrote:
> 
> ...
> 
> > There's shouldn't be anything fundamental preventing you from adding the
> > missing resume calls to the mctrl paths even if it may require reworking
> > (and fixing) the whole RPM implementation (which would be a good thing
> > of course).
> 
> Yes, for serial core I have long standing patch series to implement
> RPM (more or less?) properly.

Yeah let's try after the merge window.

Not sure what else to do with the fix though. We currently have
8250_port.c not really aware of the hardware state for PM runtime, at
least for the hang-up path.

> However, OMAP is a beast which prevents us to go due to a big hack
> called pm_runtime_irq_safe().
> Tony is aware of this and I think the above is somehow related to removal of 
> it.

Now that we can detach and reattach the kernel serial console,
there should not be any need for pm_runtime_irq_safe() anymore :)

And the UART wake-up from deeper idle states can only happen with
help of external hardware like GPIO controller or pinctrl controller.

And for the always-on wake-up interrupt controllers we have the
Linux generic wakeirqs to wake-up serial device on events.

So I think the way to proceed with pm_runtime_irq_safe() removal
for serial drivers is to block serial PM runtime unless we have a
wakeirq configured for omaps in devicetree. In the worst case the
regression is that PM runtime for serial won't work unless properly
configured.

And the UART wakeup latency will be a bit longer compared to
pm_runtime_irq_safe() naturally.

> But I completely agree that the goal is to get better runtime PM
> implementation over all.

Yes agreed.

Regards,

Tony


RE: [PATCH 07/15] drm/amdgpu: use PCI_IRQ_MSI_TYPES where appropriate

2020-06-02 Thread Ruhl, Michael J
>-Original Message-
>From: dri-devel  On Behalf Of
>Piotr Stankiewicz
>Sent: Tuesday, June 2, 2020 5:21 AM
>To: Alex Deucher ; Christian König
>; David Zhou ; David
>Airlie ; Daniel Vetter 
>Cc: Stankiewicz, Piotr ; dri-
>de...@lists.freedesktop.org; amd-...@lists.freedesktop.org; linux-
>ker...@vger.kernel.org
>Subject: [PATCH 07/15] drm/amdgpu: use PCI_IRQ_MSI_TYPES where
>appropriate
>
>Seeing as there is shorthand available to use when asking for any type
>of interrupt, or any type of message signalled interrupt, leverage it.
>
>Signed-off-by: Piotr Stankiewicz 
>Reviewed-by: Andy Shevchenko 
>---
> drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c | 8 
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
>diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
>b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
>index 5ed4227f304b..6dbe173a9fd4 100644
>--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
>+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
>@@ -251,11 +251,11 @@ int amdgpu_irq_init(struct amdgpu_device *adev)
>   int nvec = pci_msix_vec_count(adev->pdev);
>   unsigned int flags;
>
>-  if (nvec <= 0) {
>+  if (nvec > 0)
>+  flags = PCI_IRQ_MSI_TYPES;
>+  else
>   flags = PCI_IRQ_MSI;
>-  } else {
>-  flags = PCI_IRQ_MSI | PCI_IRQ_MSIX;
>-  }

Minor nit:

Is it really necessary to do this check?  Can flags just
be set?

I.e.: 
flags = PCI_IRQ_MSI_TYPES;

pci_alloc_irq_vector() tries stuff in order.  If MSIX is not available,
it will try MSI.
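
Concretely, the suggestion amounts to something like this (untested
sketch):

    /* pci_alloc_irq_vectors() falls back from MSI-X to MSI on its
     * own, so the pci_msix_vec_count() branch can go away */
    flags = PCI_IRQ_MSI_TYPES;
    nvec = pci_alloc_irq_vectors(adev->pdev, 1, 1, flags);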

M

>+
>   /* we only need one vector */
>   nvec = pci_alloc_irq_vectors(adev->pdev, 1, 1, flags);
>   if (nvec > 0) {
>--
>2.17.2
>
>___
>dri-devel mailing list
>dri-de...@lists.freedesktop.org
>https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH] media: stm32-dcmi: Set minimum cpufreq requirement

2020-06-02 Thread Valentin Schneider


On 02/06/20 12:37, Benjamin GAIGNARD wrote:
> On 6/2/20 11:31 AM, Valentin Schneider wrote:
>>> @@ -99,6 +100,8 @@ enum state {
>>>
>>>   #define OVERRUN_ERROR_THRESHOLD   3
>>>
>>> +#define DCMI_MIN_FREQ  650000 /* in KHz */
>>> +
>> This assumes the handling part is guaranteed to always run on the same CPU
>> with the same performance profile (regardless of the platform). If that's
>> not guaranteed, it feels like you'd want this to be configurable in some
>> way.
> Yes I could add a st,stm32-dcmi-min-frequency (in KHz) parameter to
> the device tree node.
>

Something like that - I'm not sure how well this fits with the DT
landscape, as you could argue it isn't really a description of the
hardware, more of a description of the performance expectations of the
software. I won't really argue here.

>>
>>>   struct dcmi_graph_entity {
>>>struct v4l2_async_subdev asd;
>>>
>> [...]
>>> @@ -2020,6 +2042,8 @@ static int dcmi_probe(struct platform_device *pdev)
>>>goto err_cleanup;
>>>}
>>>
>>> +   dcmi->policy = cpufreq_cpu_get(0);
>>> +
>> Ideally you'd want to fetch the policy of the CPU your IRQ (and handling
>> thread) is affined to; The only compatible DTS I found describes a single
>> A7, which is somewhat limited in the affinity area...
> If I move this code just before start streaming and use get_cpu(), would
> it work?
>

AFAIA streaming_start() is not necessarily executing on the same CPU as the
one that will handle the interrupt. I was thinking you could use the IRQ's
effective affinity as a hint of which CPU(s) to boost, i.e. something like:

---
cpumask_var_t visited;
struct irq_data *d = irq_get_irq_data(irq);

err = alloc_cpumask_var(&visited, GFP_KERNEL);
/* ... */
for_each_cpu(cpu, irq_data_get_effective_affinity_mask(d)) {
/* check if not already spanned */
if (cpumask_test_cpu(cpu, visited))
continue;

policy = cpufreq_cpu_get(cpu);
cpumask_or(visited, visited, policy->cpus);
/* do the boost for that policy here */
/* ... */
cpufreq_cpu_put(policy);
}
---

That of course falls apart when hotplug gets involved, and the effective
affinity changes... There's irq_set_affinity_notifier() out there, but it
seems it's only about the affinity, not the effective_affinity, I'm not
sure how valid it would be to query the effective_affinity in that
notifier.

> Benjamin
>>
>>>dev_info(>dev, "Probe done\n");
>>>
>>>platform_set_drvdata(pdev, dcmi);
>>> @@ -2049,6 +2073,9 @@ static int dcmi_remove(struct platform_device *pdev)
>>>
>>>pm_runtime_disable(>dev);
>>>
>>> +   if (dcmi->policy)
>>> +   cpufreq_cpu_put(dcmi->policy);
>>> +
>>>v4l2_async_notifier_unregister(>notifier);
>>>v4l2_async_notifier_cleanup(>notifier);
>>>media_entity_cleanup(>vdev->entity);


Re: [RFC 06/16] KVM: Use GUP instead of copy_from/to_user() to access guest memory

2020-06-02 Thread Kirill A. Shutemov
On Mon, Jun 01, 2020 at 06:35:22PM +0200, Paolo Bonzini wrote:
> On 25/05/20 17:17, Kirill A. Shutemov wrote:
> >> Personally, I would've just added 'struct kvm' pointer to 'struct
> >> kvm_memory_slot' to be able to extract 'mem_protected' info when
> >> needed. This will make the patch much smaller.
> > Okay, can do.
> > 
> > Other thing I tried is to have per-slot flag to indicate that it's
> > protected. But Sean pointed that it's all-or-nothing feature and having
> > the flag in the slot would be misleading.
> > 
> 
> Perhaps it would be misleading, but it's an optimization.  Saving a
> pointer dereference can be worth it, also because there are some places
> where we just pass around a memslot and we don't have the struct kvm*.

Vitaly proposed to add a struct kvm pointer into the memslot. Do you
object to that?

-- 
 Kirill A. Shutemov


Re: [PATCH 4/6] vhost_vdpa: support doorbell mapping via mmap

2020-06-02 Thread Michael S. Tsirkin
On Tue, Jun 02, 2020 at 02:49:38PM +0800, Jason Wang wrote:
> 
> On 2020/6/2 12:56 PM, Michael S. Tsirkin wrote:
> > On Tue, Jun 02, 2020 at 03:22:49AM +0800, kbuild test robot wrote:
> > > Hi Jason,
> > > 
> > > I love your patch! Yet something to improve:
> > > 
> > > [auto build test ERROR on vhost/linux-next]
> > > [also build test ERROR on linus/master v5.7 next-20200529]
> > > [if your patch is applied to the wrong git tree, please drop us a note to 
> > > help
> > > improve the system. BTW, we also suggest to use '--base' option to 
> > > specify the
> > > base tree in git format-patch, please 
> > > see https://stackoverflow.com/a/37406982]
> > > 
> > > url: https://github.com/0day-ci/linux/commits/Jason-Wang/vDPA-doorbell-mapping/20200531-070834
> > > base: https://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git
> > > linux-next
> > > config: m68k-randconfig-r011-20200601 (attached as .config)
> > > compiler: m68k-linux-gcc (GCC) 9.3.0
> > > reproduce (this is a W=1 build):
> > >  
> > > wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross
> > >   -O ~/bin/make.cross
> > >  chmod +x ~/bin/make.cross
> > >  # save the attached .config to linux build tree
> > >  COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross 
> > > ARCH=m68k
> > > 
> > > If you fix the issue, kindly add following tag as appropriate
> > > Reported-by: kbuild test robot
> > > 
> > > All errors (new ones prefixed by >>, old ones prefixed by <<):
> > > 
> > > drivers/vhost/vdpa.c: In function 'vhost_vdpa_fault':
> > > > > drivers/vhost/vdpa.c:754:22: error: implicit declaration of function 
> > > > > 'pgprot_noncached' [-Werror=implicit-function-declaration]
> > > 754 |  vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
> > > |  ^~~~
> > > > > drivers/vhost/vdpa.c:754:22: error: incompatible types when assigning 
> > > > > to type 'pgprot_t' {aka 'struct '} from type 'int'
> > > cc1: some warnings being treated as errors
> > > 
> > > vim +/pgprot_noncached +754 drivers/vhost/vdpa.c
> > > 
> > > 742   
> > > 743   static vm_fault_t vhost_vdpa_fault(struct vm_fault *vmf)
> > > 744   {
> > > 745   struct vhost_vdpa *v = vmf->vma->vm_file->private_data;
> > > 746   struct vdpa_device *vdpa = v->vdpa;
> > > 747   const struct vdpa_config_ops *ops = vdpa->config;
> > > 748   struct vdpa_notification_area notify;
> > > 749   struct vm_area_struct *vma = vmf->vma;
> > > 750   u16 index = vma->vm_pgoff;
> > > 751   
> > > 752   notify = ops->get_vq_notification(vdpa, index);
> > > 753   
> > >   > 754   vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
> > > 755   if (remap_pfn_range(vma, vmf->address & PAGE_MASK,
> > > 756   notify.addr >> PAGE_SHIFT, 
> > > PAGE_SIZE,
> > > 757   vma->vm_page_prot))
> > > 758   return VM_FAULT_SIGBUS;
> > > 759   
> > > 760   return VM_FAULT_NOPAGE;
> > > 761   }
> > > 762   
> > Yes well, all this remapping clearly has no chance to work
> > on systems without CONFIG_MMU.
> 
> 
> It looks to me mmap can work according to Documentation/nommu-mmap.txt. But
> I'm not sure it's worth the bother.
> 
> Thanks


Well

int remap_pfn_range(struct vm_area_struct *vma, unsigned long addr,
unsigned long pfn, unsigned long size, pgprot_t prot)
{
if (addr != (pfn << PAGE_SHIFT))
return -EINVAL;

vma->vm_flags |= VM_IO | VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP;
return 0;
}
EXPORT_SYMBOL(remap_pfn_range);


So things aren't going to work if you have a fixed PFN
which is the case of the hardware device.


> 
> > 
> > 
> > 



Re: [v4,1/1] hwmon: (nct7904) Add watchdog function

2020-06-02 Thread Guenter Roeck
On 6/2/20 1:01 AM, Geert Uytterhoeven wrote:
> Hi Yuechao,
> 
> On Tue, Mar 31, 2020 at 7:30 AM  wrote:
>> From: Yuechao Zhao 
>>
>> implement watchdog functionality in "hwmon/nct7904.c"
>>
>> Signed-off-by: Yuechao Zhao 
> 
> Thanks for your patch, which is now commit 77849a552d142ef5 ("hwmon:
> (nct7904) Add watchdog function").
> 
>> --- a/drivers/hwmon/Kconfig
>> +++ b/drivers/hwmon/Kconfig
>> @@ -1340,10 +1340,12 @@ config SENSORS_NCT7802
>>
>>  config SENSORS_NCT7904
>> tristate "Nuvoton NCT7904"
>> -   depends on I2C
>> +   depends on I2C && WATCHDOG
>> +   select WATCHDOG_CORE
> 
> This makes the driver unselectable if WATCHDOG is not set.
> 
> Is there a use case for using this driver without watchdog functionality?
> If yes, it might make sense to make the watchdog support optional,
> protected by #ifdef CONFIG_WATCHDOG, and change the above to
> 

We use the same pattern in other hwmon drivers which also implement
watchdog functionality, so I am not particularly concerned about it.

Guenter

> depends on I2C
> select WATCHDOG_CORE if WATCHDOG
> 
> If no, please ignore my email.
> 
>> help
>>   If you say yes here you get support for the Nuvoton NCT7904
>> - hardware monitoring chip, including manual fan speed control.
>> + hardware monitoring chip, including manual fan speed control
>> + and support for the integrated watchdog.
>>
>>   This driver can also be built as a module. If so, the module
>>   will be called nct7904.
> 
> Gr{oetje,eeting}s,
> 
> Geert
> 
> --
> Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- 
> ge...@linux-m68k.org
> 
> In personal conversations with technical people, I call myself a hacker. But
> when I'm talking to journalists I just say "programmer" or something like 
> that.
> -- Linus Torvalds
> 
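
For reference, the driver-side half of Geert's suggestion would look
roughly like this (hypothetical, untested sketch; data->wdt is a
made-up field name):

    #ifdef CONFIG_WATCHDOG
            /* register the watchdog only when the framework is enabled */
            ret = devm_watchdog_register_device(dev, &data->wdt);
            if (ret)
                    return ret;
    #endif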



Re: [PATCH] x86_64: fix jiffies ODR violation

2020-06-02 Thread Josh Poimboeuf
On Fri, May 15, 2020 at 11:05:40AM -0700, Bob Haarman wrote:
> `jiffies`
[...]
> `jiffies_64`
[...]
> ```
> In LLD, symbol assignments in linker scripts override definitions in
> object files. GNU ld appears to have the same behavior. It would
> probably make sense for LLD to error "duplicate symbol" but GNU ld is
> unlikely to adopt for compatibility reasons.
> ```

Kernel commit logs shouldn't be in Markdown.

Symbol names can just be in single quotes (not back-quotes!) like
'jiffies'.

Quotes can be indented by a few spaces for visual separation, like

  In LLD, symbol assignments in linker scripts override definitions in
  object files. GNU ld appears to have the same behavior. It would
  probably make sense for LLD to error "duplicate symbol" but GNU ld is
  unlikely to adopt for compatibility reasons.

or can be formatting like an email quote:

> In LLD, symbol assignments in linker scripts override definitions in
> object files. GNU ld appears to have the same behavior. It would
> probably make sense for LLD to error "duplicate symbol" but GNU ld is
> unlikely to adopt for compatibility reasons.


With Markdown-isms removed from the patch description:

Reviewed-by: Josh Poimboeuf 

-- 
Josh



[PATCH] dt-bindings: clock: Convert imx7ulp clock to json-schema

2020-06-02 Thread Anson Huang
Convert the i.MX7ULP clock binding to DT schema format using json-schema.
The original binding doc actually covers two clock modules (SCG and PCC),
so split it into two binding docs, and the MPLL (MIPI PLL) is NOT supposed
to be in the clock module, so remove it from the binding doc as well.

Signed-off-by: Anson Huang 
---
 .../devicetree/bindings/clock/imx7ulp-clock.txt| 103 --
 .../bindings/clock/imx7ulp-pcc-clock.yaml  | 119 +
 .../bindings/clock/imx7ulp-scg-clock.yaml  |  97 +
 3 files changed, 216 insertions(+), 103 deletions(-)
 delete mode 100644 Documentation/devicetree/bindings/clock/imx7ulp-clock.txt
 create mode 100644 
Documentation/devicetree/bindings/clock/imx7ulp-pcc-clock.yaml
 create mode 100644 
Documentation/devicetree/bindings/clock/imx7ulp-scg-clock.yaml

diff --git a/Documentation/devicetree/bindings/clock/imx7ulp-clock.txt 
b/Documentation/devicetree/bindings/clock/imx7ulp-clock.txt
deleted file mode 100644
index 93d89ad..000
--- a/Documentation/devicetree/bindings/clock/imx7ulp-clock.txt
+++ /dev/null
@@ -1,103 +0,0 @@
-* Clock bindings for Freescale i.MX7ULP
-
-i.MX7ULP Clock functions are under joint control of the System
-Clock Generation (SCG) modules, Peripheral Clock Control (PCC)
-modules, and Core Mode Controller (CMC)1 blocks
-
-The clocking scheme provides clear separation between M4 domain
-and A7 domain. Except for a few clock sources shared between two
-domains, such as the System Oscillator clock, the Slow IRC (SIRC),
-and and the Fast IRC clock (FIRCLK), clock sources and clock
-management are separated and contained within each domain.
-
-M4 clock management consists of SCG0, PCC0, PCC1, and CMC0 modules.
-A7 clock management consists of SCG1, PCC2, PCC3, and CMC1 modules.
-
-Note: this binding doc is only for A7 clock domain.
-
-System Clock Generation (SCG) modules:
--
-The System Clock Generation (SCG) is responsible for clock generation
-and distribution across this device. Functions performed by the SCG
-include: clock reference selection, generation of clock used to derive
-processor, system, peripheral bus and external memory interface clocks,
-source selection for peripheral clocks and control of power saving
-clock gating mode.
-
-Required properties:
-
-- compatible:  Should be "fsl,imx7ulp-scg1".
-- reg :Should contain registers location and length.
-- #clock-cells:Should be <1>.
-- clocks:  Should contain the fixed input clocks.
-- clock-names:  Should contain the following clock names:
-   "rosc", "sosc", "sirc", "firc", "upll", "mpll".
-
-Peripheral Clock Control (PCC) modules:
--
-The Peripheral Clock Control (PCC) is responsible for clock selection,
-optional division and clock gating mode for peripherals in their
-respected power domain
-
-Required properties:
-- compatible:  Should be one of:
- "fsl,imx7ulp-pcc2",
- "fsl,imx7ulp-pcc3".
-- reg :Should contain registers location and length.
-- #clock-cells:Should be <1>.
-- clocks:  Should contain the fixed input clocks.
-- clock-names:  Should contain the following clock names:
-   "nic1_bus_clk", "nic1_clk", "ddr_clk", "apll_pfd2",
-   "apll_pfd1", "apll_pfd0", "upll", "sosc_bus_clk",
-   "mpll", "firc_bus_clk", "rosc", "spll_bus_clk";
-
-The clock consumer should specify the desired clock by having the clock
-ID in its "clocks" phandle cell.
-See include/dt-bindings/clock/imx7ulp-clock.h
-for the full list of i.MX7ULP clock IDs of each module.
-
-Examples:
-
-#include <dt-bindings/clock/imx7ulp-clock.h>
-
-scg1: scg1@403e0000 {
-   compatible = "fsl,imx7ulp-scg1";
-   reg = <0x403e0000 0x10000>;
-   clocks = <&rosc>, <&sosc>, <&sirc>,
-<&firc>, <&upll>, <&mpll>;
-   clock-names = "rosc", "sosc", "sirc",
- "firc", "upll", "mpll";
-   #clock-cells = <1>;
-};
-
-pcc2: pcc2@403f0000 {
-   compatible = "fsl,imx7ulp-pcc2";
-   reg = <0x403f0000 0x10000>;
-   #clock-cells = <1>;
-   clocks = <&scg1 IMX7ULP_CLK_NIC1_BUS_DIV>,
-<&scg1 IMX7ULP_CLK_NIC1_DIV>,
-<&scg1 IMX7ULP_CLK_DDR_DIV>,
-<&scg1 IMX7ULP_CLK_APLL_PFD2>,
-<&scg1 IMX7ULP_CLK_APLL_PFD1>,
-<&scg1 IMX7ULP_CLK_APLL_PFD0>,
-<&scg1 IMX7ULP_CLK_UPLL>,
-<&scg1 IMX7ULP_CLK_SOSC_BUS_CLK>,
-<&scg1 IMX7ULP_CLK_FIRC_BUS_CLK>,
-<&scg1 IMX7ULP_CLK_ROSC>,
-<&scg1 IMX7ULP_CLK_SPLL_BUS_CLK>;
-   clock-names = "nic1_bus_clk", "nic1_clk", "ddr_clk",
- "apll_pfd2", "apll_pfd1", "apll_pfd0",
- "upll", "sosc_bus_clk", "mpll",
- "firc_bus_clk", "rosc", "spll_bus_clk";
-};
-
-usdhc1: usdhc@40380000 {
-   compatible = "fsl,imx7ulp-usdhc";
-   reg = <0x40380000 0x10000>;
-

Re: [PATCHv2] perf stat: Ensure group is defined on top of the same cpu mask

2020-06-02 Thread Arnaldo Carvalho de Melo
Em Tue, Jun 02, 2020 at 02:10:17PM +0200, Jiri Olsa escreveu:
> On Tue, Jun 02, 2020 at 08:50:17PM +0900, Namhyung Kim wrote:
> > Hi Jiri,
> > 
> > On Tue, Jun 2, 2020 at 5:16 PM Jiri Olsa  wrote:
> > >
> > > On Tue, Jun 02, 2020 at 11:47:19AM +0900, Namhyung Kim wrote:
> > > > On Tue, Jun 2, 2020 at 1:21 AM Ian Rogers  wrote:
> > > > >
> > > > > On Mon, Jun 1, 2020 at 1:20 AM Jiri Olsa  wrote:
> > > > > >
> > > > > > Jin Yao reported the issue (and posted first versions of this 
> > > > > > change)
> > > > > > with groups being defined over events with different cpu mask.
> > > > > >
> > > > > > This causes assert aborts in get_group_fd, like:
> > > > > >
> > > > > >   # perf stat -M "C2_Pkg_Residency" -a -- sleep 1
> > > > > >   perf: util/evsel.c:1464: get_group_fd: Assertion `!(fd == -1)' 
> > > > > > failed.
> > > > > >   Aborted
> > > > > >
> > > > > > All the events in the group have to be defined over the same
> > > > > > cpus so the group_fd can be found for every leader/member pair.
> > > > > >
> > > > > > Adding check to ensure this condition is met and removing the
> > > > > > group (with warning) if we detect mixed cpus, like:
> > > > > >
> > > > > >   $ sudo perf stat -e 
> > > > > > '{power/energy-cores/,cycles},{instructions,power/energy-cores/}'
> > > > > >   WARNING: event cpu maps do not match, disabling group:
> > > > > > anon group { power/energy-cores/, cycles }
> > > > > > anon group { instructions, power/energy-cores/ }
> > > > > >
> > > > > > Ian asked also for cpu maps details, it's displayed in verbose mode:
> > > > > >
> > > > > >   $ sudo perf stat -e '{cycles,power/energy-cores/}' -v
> > > > > >   WARNING: group events cpu maps do not match, disabling group:
> > > > > > anon group { power/energy-cores/, cycles }
> > > > > >power/energy-cores/: 0
> > > > > >cycles: 0-7
> > > > > > anon group { instructions, power/energy-cores/ }
> > > > > >instructions: 0-7
> > > > > >power/energy-cores/: 0
> > > > >
> > > > > This is great! A nit, would 'grouped events cpus do not match' read
> > > > > better? I think the cpu map is more of an internal naming convention.
> > > > Allowed cpus?
> > >
> > > hum, what you mean?
> > 
> > I mean that we can use 'allowed cpus' rather then 'cpu map' in the message.
> > Something like this?
> > 
> >   allowed cpus for events in a group do not match, disabling group:
> 
> hm, I like more the one Ian suggested.. anyway, leaving this to Arnaldo,
> he can change that before committing ;-)

I think its ok as-is, Ian, can I have your acked-by?

- Arnaldo


Re: [Cocci] [PATCH 1/2] Coccinelle: extend memdup_user transformation with GFP_USER

2020-06-02 Thread Julia Lawall



On Sat, 30 May 2020, Denis Efremov wrote:

> Match GFP_USER allocations with memdup_user.cocci rule.
> Commit 6c2c97a24f09 ("memdup_user(): switch to GFP_USER") switched
> memdup_user() from GFP_KERNEL to GFP_USER. In most cases it is still
> a good idea to use memdup_user() for GFP_KERNEL allocations. The
> motivation behind altering memdup_user() to GFP_USER is here:
> https://lkml.org/lkml/2018/1/6/333

Thanks for the patch series.  I will test them and try to push them to
Linus shortly.

julia


>
> Signed-off-by: Denis Efremov 
> ---
>  scripts/coccinelle/api/memdup_user.cocci | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/scripts/coccinelle/api/memdup_user.cocci 
> b/scripts/coccinelle/api/memdup_user.cocci
> index c809ab10bbce..49f487e6a5c8 100644
> --- a/scripts/coccinelle/api/memdup_user.cocci
> +++ b/scripts/coccinelle/api/memdup_user.cocci
> @@ -20,7 +20,7 @@ expression from,to,size;
>  identifier l1,l2;
>  @@
>
> --  to = \(kmalloc\|kzalloc\)(size,GFP_KERNEL);
> +-  to = \(kmalloc\|kzalloc\)(size,\(GFP_KERNEL\|GFP_USER\));
>  +  to = memdup_user(from,size);
> if (
>  -  to==NULL
> @@ -43,7 +43,7 @@ position p;
>  statement S1,S2;
>  @@
>
> -*  to = \(kmalloc@p\|kzalloc@p\)(size,GFP_KERNEL);
> +*  to = \(kmalloc@p\|kzalloc@p\)(size,\(GFP_KERNEL\|GFP_USER\));
> if (to==NULL || ...) S1
> if (copy_from_user(to, from, size) != 0)
> S2
> --
> 2.26.2
>
> ___
> Cocci mailing list
> co...@systeme.lip6.fr
> https://systeme.lip6.fr/mailman/listinfo/cocci
>


Re: [PATCH v2 3/7] selftests/ftrace: Add "requires:" list support

2020-06-02 Thread Steven Rostedt
On Tue,  2 Jun 2020 18:08:31 +0900
Masami Hiramatsu  wrote:

> +++ b/tools/testing/selftests/ftrace/test.d/template
> @@ -1,6 +1,7 @@
>  #!/bin/sh
>  # SPDX-License-Identifier: GPL-2.0
>  # description: %HERE DESCRIBE WHAT THIS DOES%
> +# requires: %HERE LIST UP REQUIRED FILES%

Not sure what you mean by "LIST UP". Perhaps you mean "LIST OF"?

-- Steve


>  # you have to add ".tc" extension for your testcase file
>  # Note that all tests are run with "errexit" option.


KASAN: use-after-free Read in usb_udc_uevent

2020-06-02 Thread syzbot
Hello,

syzbot found the following crash on:

HEAD commit:    ffeb595d Merge tag 'powerpc-5.7-6' of git://git.kernel.org..
git tree:   upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=17f7f2b110
kernel config:  https://syzkaller.appspot.com/x/.config?x=9ac87c7c2ba15abf
dashboard link: https://syzkaller.appspot.com/bug?extid=b0de012ceb1e2a97891b
compiler:   clang version 10.0.0 (https://github.com/llvm/llvm-project/ 
c2443155a0fb245c8f17f2c1c72b6ea391e86e81)

Unfortunately, I don't have any reproducer for this crash yet.

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+b0de012ceb1e2a978...@syzkaller.appspotmail.com

==
BUG: KASAN: use-after-free in usb_udc_uevent+0xac/0x120 
drivers/usb/gadget/udc/core.c:1593
Read of size 8 at addr 888091a7c050 by task systemd-udevd/14223

CPU: 1 PID: 14223 Comm: systemd-udevd Not tainted 5.7.0-rc7-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 
01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x1e9/0x30e lib/dump_stack.c:118
 print_address_description+0x74/0x5c0 mm/kasan/report.c:382
 __kasan_report+0x103/0x1a0 mm/kasan/report.c:511
 kasan_report+0x4d/0x80 mm/kasan/common.c:625

Allocated by task 14178:
 save_stack mm/kasan/common.c:49 [inline]
 set_track mm/kasan/common.c:57 [inline]
 __kasan_kmalloc+0x114/0x160 mm/kasan/common.c:495
 kmem_cache_alloc_trace+0x234/0x300 mm/slab.c:3551
 kmalloc include/linux/slab.h:555 [inline]
 kzalloc include/linux/slab.h:669 [inline]
 dev_new drivers/usb/gadget/legacy/raw_gadget.c:182 [inline]
 raw_open+0x87/0x500 drivers/usb/gadget/legacy/raw_gadget.c:372
 misc_open+0x346/0x3c0 drivers/char/misc.c:141
 chrdev_open+0x498/0x580 fs/char_dev.c:414
 do_dentry_open+0x82e/0x10b0 fs/open.c:797
 do_open fs/namei.c:3229 [inline]
 path_openat+0x2790/0x38b0 fs/namei.c:3346
 do_filp_open+0x191/0x3a0 fs/namei.c:3373
 do_sys_openat2+0x463/0x770 fs/open.c:1148
 do_sys_open fs/open.c:1164 [inline]
 ksys_open include/linux/syscalls.h:1386 [inline]
 __do_sys_open fs/open.c:1170 [inline]
 __se_sys_open fs/open.c:1168 [inline]
 __x64_sys_open+0x1af/0x1e0 fs/open.c:1168
 do_syscall_64+0xf3/0x1b0 arch/x86/entry/common.c:295
 entry_SYSCALL_64_after_hwframe+0x49/0xb3

Freed by task 14174:
 save_stack mm/kasan/common.c:49 [inline]
 set_track mm/kasan/common.c:57 [inline]
 kasan_set_free_info mm/kasan/common.c:317 [inline]
 __kasan_slab_free+0x125/0x190 mm/kasan/common.c:456
 __cache_free mm/slab.c:3426 [inline]
 kfree+0x10a/0x220 mm/slab.c:3757
 raw_release+0x130/0x1e0 drivers/usb/gadget/legacy/raw_gadget.c:411
 __fput+0x2ed/0x750 fs/file_table.c:280
 task_work_run+0x147/0x1d0 kernel/task_work.c:123
 exit_task_work include/linux/task_work.h:22 [inline]
 do_exit+0x5ef/0x1f80 kernel/exit.c:796
 do_group_exit+0x15e/0x2c0 kernel/exit.c:894
 get_signal+0x13cf/0x1d60 kernel/signal.c:2739
 do_signal+0x33/0x610 arch/x86/kernel/signal.c:784
 exit_to_usermode_loop arch/x86/entry/common.c:161 [inline]
 prepare_exit_to_usermode+0x32a/0x600 arch/x86/entry/common.c:196
 entry_SYSCALL_64_after_hwframe+0x49/0xb3

The buggy address belongs to the object at 888091a7c000
 which belongs to the cache kmalloc-4k of size 4096
The buggy address is located 80 bytes inside of
 4096-byte region [888091a7c000, 888091a7d000)
The buggy address belongs to the page:
page:ea0002469f00 refcount:1 mapcount:0 mapping: index:0x0 
head:ea0002469f00 order:1 compound_mapcount:0
flags: 0xfffe010200(slab|head)
raw: 00fffe010200 ea0001043588 ea0001258b88 8880aa402000
raw:  888091a7c000 00010001 
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 888091a7bf00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 888091a7bf80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>888091a7c000: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ^
 888091a7c080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 888091a7c100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==


---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkal...@googlegroups.com.

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.


Re: [GIT PULL][PATCH v5 0/8] Add support for ZSTD-compressed kernel and initramfs

2020-06-02 Thread Metztli Information Technology
On Mon, Jun 1, 2020 at 2:59 PM Norbert Lange  wrote:
>
> The series seems to be stuck in limbo, and I got the hint to bring
> this to Andrew's attention [1].
> Hope this will finally end in upstream, been using these patches for ~2 years.
>
> Regards, Norbert
>
> [1] - https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=955469

Began using your patch for Debian Buster backport of 0.136 initramfs-tools, and 
Nick Terrell's kernel patch for 5.6 -- but modified for 5.5.17-19 in my reiser4 
builds -- and set both defaults to zstd.

Thus far, as long as there exists 'a priori' the Zstd package, the combination 
works in both local and my custom Google cloud instances installations.

Best Professional Regards.


-- 
Jose R R
http://metztli.it
-
Download Metztli Reiser4: Debian Buster w/ Linux 5.5.19 AMD64
-
feats ZSTD compression https://sf.net/projects/metztli-reiser4/
---
Official current Reiser4 resources: https://reiser4.wiki.kernel.org/


[PATCH v14 15/15] MAINTAINERS: Update for DAMON

2020-06-02 Thread SeongJae Park
From: SeongJae Park 

This commit updates the MAINTAINERS file for DAMON related files.

Signed-off-by: SeongJae Park 
---
 MAINTAINERS | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 50659d76976b..2396a9098715 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -4686,6 +4686,18 @@ F:   net/ax25/ax25_out.c
 F: net/ax25/ax25_timer.c
 F: net/ax25/sysctl_net_ax25.c
 
+DATA ACCESS MONITOR
+M: SeongJae Park 
+L: linux...@kvack.org
+S: Maintained
+F: Documentation/admin-guide/mm/damon/*
+F: include/linux/damon.h
+F: include/trace/events/damon.h
+F: mm/damon-test.h
+F: mm/damon.c
+F: tools/damon/*
+F: tools/testing/selftests/damon/*
+
 DAVICOM FAST ETHERNET (DMFE) NETWORK DRIVER
 L: net...@vger.kernel.org
 S: Orphan
-- 
2.17.1



[PATCH v14 14/15] mm/damon: Add user space selftests

2020-06-02 Thread SeongJae Park
From: SeongJae Park 

This commit adds simple user space tests for DAMON.  The tests use the
kselftest framework.
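
They can be run in the usual kselftest way, e.g. (standard kselftest
invocation; the target name follows from the new directory):

	$ sudo make -C tools/testing/selftests TARGETS=damon run_tests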

Signed-off-by: SeongJae Park 
---
 tools/testing/selftests/damon/Makefile|   7 +
 .../selftests/damon/_chk_dependency.sh|  28 
 tools/testing/selftests/damon/_chk_record.py  | 108 ++
 .../testing/selftests/damon/debugfs_attrs.sh  | 139 ++
 .../testing/selftests/damon/debugfs_record.sh |  50 +++
 5 files changed, 332 insertions(+)
 create mode 100644 tools/testing/selftests/damon/Makefile
 create mode 100644 tools/testing/selftests/damon/_chk_dependency.sh
 create mode 100644 tools/testing/selftests/damon/_chk_record.py
 create mode 100755 tools/testing/selftests/damon/debugfs_attrs.sh
 create mode 100755 tools/testing/selftests/damon/debugfs_record.sh

diff --git a/tools/testing/selftests/damon/Makefile 
b/tools/testing/selftests/damon/Makefile
new file mode 100644
index ..cfd5393a4639
--- /dev/null
+++ b/tools/testing/selftests/damon/Makefile
@@ -0,0 +1,7 @@
+# SPDX-License-Identifier: GPL-2.0
+# Makefile for damon selftests
+
+TEST_FILES = _chk_dependency.sh _chk_record.py
+TEST_PROGS = debugfs_attrs.sh debugfs_record.sh
+
+include ../lib.mk
diff --git a/tools/testing/selftests/damon/_chk_dependency.sh 
b/tools/testing/selftests/damon/_chk_dependency.sh
new file mode 100644
index ..814dcadd5e96
--- /dev/null
+++ b/tools/testing/selftests/damon/_chk_dependency.sh
@@ -0,0 +1,28 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+
+# Kselftest framework requirement - SKIP code is 4.
+ksft_skip=4
+
+DBGFS=/sys/kernel/debug/damon
+
+if [ $EUID -ne 0 ];
+then
+   echo "Run as root"
+   exit $ksft_skip
+fi
+
+if [ ! -d $DBGFS ]
+then
+   echo "$DBGFS not found"
+   exit $ksft_skip
+fi
+
+for f in attrs record pids monitor_on
+do
+   if [ ! -f "$DBGFS/$f" ]
+   then
+   echo "$f not found"
+   exit 1
+   fi
+done
diff --git a/tools/testing/selftests/damon/_chk_record.py 
b/tools/testing/selftests/damon/_chk_record.py
new file mode 100644
index ..5cfcf4161404
--- /dev/null
+++ b/tools/testing/selftests/damon/_chk_record.py
@@ -0,0 +1,108 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: GPL-2.0
+
+"Check whether the DAMON record file is valid"
+
+import argparse
+import struct
+import sys
+
+fmt_version = 0
+
+def set_fmt_version(f):
+    global fmt_version
+
+    mark = f.read(16)
+    if mark == b'damon_recfmt_ver':
+        fmt_version = struct.unpack('i', f.read(4))[0]
+    else:
+        fmt_version = 0
+        f.seek(0)
+    return fmt_version
+
+def read_pid(f):
+    if fmt_version == 0:
+        pid = struct.unpack('L', f.read(8))[0]
+    else:
+        pid = struct.unpack('i', f.read(4))[0]
+    return pid
+
+def err_percent(val, expected):
+    return abs(val - expected) / expected * 100
+
+def chk_task_info(f):
+    pid = read_pid(f)
+    nr_regions = struct.unpack('I', f.read(4))[0]
+
+    if nr_regions > max_nr_regions:
+        print('too many regions: %d > %d' % (nr_regions, max_nr_regions))
+        exit(1)
+
+    nr_gaps = 0
+    eaddr = 0
+    for r in range(nr_regions):
+        saddr = struct.unpack('L', f.read(8))[0]
+        if eaddr and saddr != eaddr:
+            nr_gaps += 1
+        eaddr = struct.unpack('L', f.read(8))[0]
+        nr_accesses = struct.unpack('I', f.read(4))[0]
+
+        if saddr >= eaddr:
+            print('wrong region [%d,%d)' % (saddr, eaddr))
+            exit(1)
+
+        max_nr_accesses = aint / sint
+        if nr_accesses > max_nr_accesses:
+            if err_percent(nr_accesses, max_nr_accesses) > 15:
+                print('too high nr_access: expected %d but %d' %
+                        (max_nr_accesses, nr_accesses))
+                exit(1)
+    if nr_gaps != 2:
+        print('number of gaps is not two but %d' % nr_gaps)
+        exit(1)
+
+def parse_time_us(bindat):
+    sec = struct.unpack('l', bindat[0:8])[0]
+    nsec = struct.unpack('l', bindat[8:16])[0]
+    return (sec * 1000000000 + nsec) / 1000
+
+def main():
+    global sint
+    global aint
+    global min_nr
+    global max_nr_regions
+
+    parser = argparse.ArgumentParser()
+    parser.add_argument('file', metavar='<file>',
+            help='path to the record file')
+    parser.add_argument('--attrs', metavar='<attrs>',
+            default='5000 100000 1000000 10 1000',
+            help='content of debugfs attrs file')
+    args = parser.parse_args()
+    file_path = args.file
+    attrs = [int(x) for x in args.attrs.split()]
+    sint, aint, rint, min_nr, max_nr_regions = attrs
+
+    with open(file_path, 'rb') as f:
+        set_fmt_version(f)
+        last_aggr_time = None
+        while True:
+            timebin = f.read(16)
+            if len(timebin) != 16:
+                break
+
+            now = parse_time_us(timebin)
+            if not last_aggr_time:
+                last_aggr_time = now
+            else:
+error 

[PATCH v14 12/15] mm/damon: Add kunit tests

2020-06-02 Thread SeongJae Park
From: SeongJae Park 

This commit adds kunit based unit tests for DAMON.
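
For reference, one way to run these with the KUnit wrapper would be
roughly (a sketch; the .kunitconfig location and flags vary a bit
across kernel versions):

	$ cat > .kunitconfig <<EOF
	CONFIG_KUNIT=y
	CONFIG_DAMON=y
	CONFIG_DAMON_KUNIT_TEST=y
	EOF
	$ ./tools/testing/kunit/kunit.py run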

Signed-off-by: SeongJae Park 
Reviewed-by: Brendan Higgins 
---
 mm/Kconfig  |  11 +
 mm/damon-test.h | 622 
 mm/damon.c  |   6 +
 3 files changed, 639 insertions(+)
 create mode 100644 mm/damon-test.h

diff --git a/mm/Kconfig b/mm/Kconfig
index ecea0889ea35..91473ed9e7c7 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -879,4 +879,15 @@ config DAMON
  and 2) sufficiently light-weight so that it can be applied online.
  If unsure, say N.
 
+config DAMON_KUNIT_TEST
+   bool "Test for damon"
+   depends on DAMON=y && KUNIT
+   help
+ This builds the DAMON Kunit test suite.
+
+ For more information on KUnit and unit tests in general, please refer
+ to the KUnit documentation.
+
+ If unsure, say N.
+
 endmenu
diff --git a/mm/damon-test.h b/mm/damon-test.h
new file mode 100644
index ..cf715529ff64
--- /dev/null
+++ b/mm/damon-test.h
@@ -0,0 +1,622 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Data Access Monitor Unit Tests
+ *
+ * Copyright 2019 Amazon.com, Inc. or its affiliates.  All rights reserved.
+ *
+ * Author: SeongJae Park 
+ */
+
+#ifdef CONFIG_DAMON_KUNIT_TEST
+
+#ifndef _DAMON_TEST_H
+#define _DAMON_TEST_H
+
+#include <kunit/test.h>
+
+static void damon_test_str_to_pids(struct kunit *test)
+{
+   char *question;
+   int *answers;
+   int expected[] = {12, 35, 46};
+   ssize_t nr_integers = 0, i;
+
+   question = "123";
+   answers = str_to_pids(question, strnlen(question, 128), &nr_integers);
+   KUNIT_EXPECT_EQ(test, (ssize_t)1, nr_integers);
+   KUNIT_EXPECT_EQ(test, 123, answers[0]);
+   kfree(answers);
+
+   question = "123abc";
+   answers = str_to_pids(question, strnlen(question, 128), &nr_integers);
+   KUNIT_EXPECT_EQ(test, (ssize_t)1, nr_integers);
+   KUNIT_EXPECT_EQ(test, 123, answers[0]);
+   kfree(answers);
+
+   question = "a123";
+   answers = str_to_pids(question, strnlen(question, 128), &nr_integers);
+   KUNIT_EXPECT_EQ(test, (ssize_t)0, nr_integers);
+   KUNIT_EXPECT_PTR_EQ(test, answers, (int *)NULL);
+
+   question = "12 35";
+   answers = str_to_pids(question, strnlen(question, 128), &nr_integers);
+   KUNIT_EXPECT_EQ(test, (ssize_t)2, nr_integers);
+   for (i = 0; i < nr_integers; i++)
+   KUNIT_EXPECT_EQ(test, expected[i], answers[i]);
+   kfree(answers);
+
+   question = "12 35 46";
+   answers = str_to_pids(question, strnlen(question, 128), &nr_integers);
+   KUNIT_EXPECT_EQ(test, (ssize_t)3, nr_integers);
+   for (i = 0; i < nr_integers; i++)
+   KUNIT_EXPECT_EQ(test, expected[i], answers[i]);
+   kfree(answers);
+
+   question = "12 35 abc 46";
+   answers = str_to_pids(question, strnlen(question, 128), &nr_integers);
+   KUNIT_EXPECT_EQ(test, (ssize_t)2, nr_integers);
+   for (i = 0; i < 2; i++)
+   KUNIT_EXPECT_EQ(test, expected[i], answers[i]);
+   kfree(answers);
+
+   question = "";
+   answers = str_to_pids(question, strnlen(question, 128), &nr_integers);
+   KUNIT_EXPECT_EQ(test, (ssize_t)0, nr_integers);
+   KUNIT_EXPECT_PTR_EQ(test, (int *)NULL, answers);
+   kfree(answers);
+
+   question = "\n";
+   answers = str_to_pids(question, strnlen(question, 128), &nr_integers);
+   KUNIT_EXPECT_EQ(test, (ssize_t)0, nr_integers);
+   KUNIT_EXPECT_PTR_EQ(test, (int *)NULL, answers);
+   kfree(answers);
+}
+
+static void damon_test_regions(struct kunit *test)
+{
+   struct damon_region *r;
+   struct damon_task *t;
+
+   r = damon_new_region(&damon_user_ctx, 1, 2);
+   KUNIT_EXPECT_EQ(test, 1ul, r->vm_start);
+   KUNIT_EXPECT_EQ(test, 2ul, r->vm_end);
+   KUNIT_EXPECT_EQ(test, 0u, r->nr_accesses);
+
+   t = damon_new_task(42);
+   KUNIT_EXPECT_EQ(test, 0u, nr_damon_regions(t));
+
+   damon_add_region(r, t);
+   KUNIT_EXPECT_EQ(test, 1u, nr_damon_regions(t));
+
+   damon_del_region(r);
+   KUNIT_EXPECT_EQ(test, 0u, nr_damon_regions(t));
+
+   damon_free_task(t);
+}
+
+static void damon_test_tasks(struct kunit *test)
+{
+   struct damon_ctx *c = &damon_user_ctx;
+   struct damon_task *t;
+
+   t = damon_new_task(42);
+   KUNIT_EXPECT_EQ(test, 42, t->pid);
+   KUNIT_EXPECT_EQ(test, 0u, nr_damon_tasks(c));
+
+   damon_add_task(&damon_user_ctx, t);
+   KUNIT_EXPECT_EQ(test, 1u, nr_damon_tasks(c));
+
+   damon_destroy_task(t);
+   KUNIT_EXPECT_EQ(test, 0u, nr_damon_tasks(c));
+}
+
+static void damon_test_set_pids(struct kunit *test)
+{
+   struct damon_ctx *ctx = &damon_user_ctx;
+   int pids[] = {1, 2, 3};
+   char buf[64];
+
+   damon_set_pids(ctx, pids, 3);
+   damon_sprint_pids(ctx, buf, 64);
+   KUNIT_EXPECT_STREQ(test, (char *)buf, "1 2 3\n");
+
+   damon_set_pids(ctx, NULL, 0);
+   damon_sprint_pids(ctx, buf, 

Re: [PATCH] video: fbdev: pxafb: Use correct return value for pxafb_probe()

2020-06-02 Thread Daniel Vetter
Hi Bart,

On Mon, Jun 01, 2020 at 03:25:25PM +0200, Bartlomiej Zolnierkiewicz wrote:
> 
> On 5/25/20 9:11 AM, Tiezhu Yang wrote:
> > When call function devm_platform_ioremap_resource(), we should use IS_ERR()
> > to check the return value and return PTR_ERR() if failed.
> > 
> > Signed-off-by: Tiezhu Yang 
> 
> Applied to drm-misc-next tree (patch should show up in v5.9), thanks.

Thanks for going through all the backlog of patches in the fbdev area
every once in a while! That kind of housekeeping work is often
underappreciated, but rather important to keep the ship going.

Cheers, Daniel

PS: Of course also holds for everyone else doing this in other areas.
fbdev simply stuck out just now catching up on mails.


> 
> Best regards,
> --
> Bartlomiej Zolnierkiewicz
> Samsung R Institute Poland
> Samsung Electronics
> 
> > ---
> >  drivers/video/fbdev/pxafb.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/drivers/video/fbdev/pxafb.c b/drivers/video/fbdev/pxafb.c
> > index 00b96a7..423331c 100644
> > --- a/drivers/video/fbdev/pxafb.c
> > +++ b/drivers/video/fbdev/pxafb.c
> > @@ -2305,7 +2305,7 @@ static int pxafb_probe(struct platform_device *dev)
> > fbi->mmio_base = devm_platform_ioremap_resource(dev, 0);
> > if (IS_ERR(fbi->mmio_base)) {
> > dev_err(>dev, "failed to get I/O memory\n");
> > -   ret = -EBUSY;
> > +   ret = PTR_ERR(fbi->mmio_base);
> > goto failed;
> > }
> >  
> > 
> ___
> dri-devel mailing list
> dri-de...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH 11/15] mmc: sdhci: use PCI_IRQ_MSI_TYPES where appropriate

2020-06-02 Thread Adrian Hunter
On 2/06/20 12:20 pm, Piotr Stankiewicz wrote:
> Seeing as there is shorthand available to use when asking for any type
> of interrupt, or any type of message signalled interrupt, leverage it.
> 
> Signed-off-by: Piotr Stankiewicz 
> Reviewed-by: Andy Shevchenko 

Acked-by: Adrian Hunter 
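
For readers of just this mail: the shorthand comes from an earlier patch
in this series (not quoted here), presumably along the lines of:

	/* assumed definition from the start of the series */
	#define PCI_IRQ_MSI_TYPES	(PCI_IRQ_MSI | PCI_IRQ_MSIX)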

> ---
>  drivers/mmc/host/sdhci-pci-gli.c | 3 +--
>  drivers/mmc/host/sdhci-pci-o2micro.c | 3 +--
>  2 files changed, 2 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/mmc/host/sdhci-pci-gli.c 
> b/drivers/mmc/host/sdhci-pci-gli.c
> index fd76aa672e02..d14997f9cbf9 100644
> --- a/drivers/mmc/host/sdhci-pci-gli.c
> +++ b/drivers/mmc/host/sdhci-pci-gli.c
> @@ -271,8 +271,7 @@ static void gli_pcie_enable_msi(struct sdhci_pci_slot 
> *slot)
>  {
>   int ret;
>  
> - ret = pci_alloc_irq_vectors(slot->chip->pdev, 1, 1,
> - PCI_IRQ_MSI | PCI_IRQ_MSIX);
> + ret = pci_alloc_irq_vectors(slot->chip->pdev, 1, 1, PCI_IRQ_MSI_TYPES);
>   if (ret < 0) {
>   pr_warn("%s: enable PCI MSI failed, error=%d\n",
>  mmc_hostname(slot->host->mmc), ret);
> diff --git a/drivers/mmc/host/sdhci-pci-o2micro.c 
> b/drivers/mmc/host/sdhci-pci-o2micro.c
> index fa8105087d68..498c51207bfb 100644
> --- a/drivers/mmc/host/sdhci-pci-o2micro.c
> +++ b/drivers/mmc/host/sdhci-pci-o2micro.c
> @@ -470,8 +470,7 @@ static void sdhci_pci_o2_enable_msi(struct sdhci_pci_chip 
> *chip,
>   return;
>   }
>  
> - ret = pci_alloc_irq_vectors(chip->pdev, 1, 1,
> - PCI_IRQ_MSI | PCI_IRQ_MSIX);
> + ret = pci_alloc_irq_vectors(chip->pdev, 1, 1, PCI_IRQ_MSI_TYPES);
>   if (ret < 0) {
>   pr_err("%s: enable PCI MSI failed, err=%d\n",
>  mmc_hostname(host->mmc), ret);
> 



[PATCH v14 13/15] mm/damon-test: Add a kunit test for recording setup

2020-06-02 Thread SeongJae Park
From: SeongJae Park 

This commit adds another unit test case for the recording setup.

Signed-off-by: SeongJae Park 
---
 mm/damon-test.h | 13 +
 1 file changed, 13 insertions(+)

diff --git a/mm/damon-test.h b/mm/damon-test.h
index cf715529ff64..5b18619efe72 100644
--- a/mm/damon-test.h
+++ b/mm/damon-test.h
@@ -137,6 +137,18 @@ static void damon_test_set_pids(struct kunit *test)
KUNIT_EXPECT_STREQ(test, (char *)buf, "\n");
 }
 
+static void damon_test_set_recording(struct kunit *test)
+{
+   struct damon_ctx *ctx = &damon_user_ctx;
+
+   damon_set_recording(ctx, 4242, "foo.bar");
+   KUNIT_EXPECT_EQ(test, ctx->rbuf_len, 4242u);
+   KUNIT_EXPECT_STREQ(test, ctx->rfile_path, "foo.bar");
+   damon_set_recording(ctx, 42, "foo");
+   KUNIT_EXPECT_EQ(test, ctx->rbuf_len, 42u);
+   KUNIT_EXPECT_STREQ(test, ctx->rfile_path, "foo");
+}
+
 /*
  * Test damon_three_regions_in_vmas() function
  *
@@ -596,6 +608,7 @@ static struct kunit_case damon_test_cases[] = {
KUNIT_CASE(damon_test_tasks),
KUNIT_CASE(damon_test_regions),
KUNIT_CASE(damon_test_set_pids),
+   KUNIT_CASE(damon_test_set_recording),
KUNIT_CASE(damon_test_three_regions_in_vmas),
KUNIT_CASE(damon_test_aggregate),
KUNIT_CASE(damon_test_write_rbuf),
-- 
2.17.1



Re: [PATCH] vimc: debayer: Add support for ARGB format

2020-06-02 Thread Dafna Hirschfeld




On 02.06.20 14:45, Laurent Pinchart wrote:

Hello,

On Tue, Jun 02, 2020 at 08:31:26AM -0300, Helen Koike wrote:

On 6/2/20 8:24 AM, Kieran Bingham wrote:

On 02/06/2020 11:55, Helen Koike wrote:

On 6/2/20 7:52 AM, Dafna Hirschfeld wrote:

On 01.06.20 14:16, Kaaira Gupta wrote:

On Fri, May 29, 2020 at 05:43:57PM +0200, Dafna Hirschfeld wrote:

Hi,
Thanks for the patch

I don't know how real devices handle ARGB formats,
I wonder if it should be the part of the debayer.


Hi! qcam tries to support BA24 as it is one of the formats that vimc
lists as its supported formats with --list-formats. Shouldn't BA24 be
possible to capture with vimc?


Hi,
Just to clarify, when listing the supported formats of a video node, the node 
lists
the formats that the video node as an independent media entity supports.
It does not mean that the 'camera' as a whole (that is, the media topology 
graph) supports
all the formats that the video node lists. When interacting with a video node or
a subdevice node, one interacts only with that specific entity.
In the case of vimc, the RGB video node as an independent entity supports BA24 
so the format
appears in the list of its supported formats. But since the Debayer does not
support it, the format can not be generated by the entire vimc topology.
This is not a bug.


Is here a valid configuration for the vimc pipeline that produces BA24 ?

I think there isn't


I agree that not all pipeline configurations need to support every
format, but we shouldn't report a format that can't be produced at all.

This being said, and as discussed before, the de-bayering subdev should
just produce MEDIA_BUS_FMT_RGB888_1X24, and the video node should then
implement the RGB pixel formats. BA24 should likely be one of the
supported formats (or maybe BX24 ?).

So you mean that the video node should support it, so when it receives RGB
format on the source pad it converts it to BA24 or BX24?
It makes sense. I guess both BA24 and BX24 can be added, I see in the
pixfmt-rgb.html doc that probably the control  V4L2_CID_ALPHA_COMPONENT
should then be added.




This is also my understanding.

You should have an -EPIPE error when start streaming though, it
shouldn't fail silently.


Yes, we had -EPIPE, and that is what I think we were trying to resolve.

How would userspace be expected to detect what formats to use ? Should
the available formats on the capture node depend on the current linking
of the media graph?


This is a good question, I don't recall v4l2 API defining this.


A recent extension to VIDIOC_ENUMFMT allows enumerating pixel formats
for a given media bus code, I think that's the way forward.


It would be a bit hard to implement in Vimc, especially when we have configfs
for custom topology, since the capture would need to query the whole pipeline.
But could be implemented.


Otherwise, to know what formats are supported - userspace must first
'get a list of formats' then try to 'set' the formats to know what is
possible?

Yes, there is a doc file that explains that it should be done in a "bottom-up"
way, that is, starting with configuring the sensor, then adjusting the debayer
to the sensor output, then adjusting the scaler to the debayer output and then
adjusting the video node output to the scaler output. One should also use the
'try' version of the setting at the stage of adjusting the final configuration.
The detailed explanation is in 
Documentation/output/userspace-api/media/v4l/dev-subdev.html
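
As a concrete sketch of that bottom-up flow on vimc (entity names from
the default topology; the formats here are just an example):

	media-ctl -d platform:vimc -V '"Sensor A":0 [fmt:SBGGR8_1X8/640x480]'
	media-ctl -d platform:vimc -V '"Debayer A":0 [fmt:SBGGR8_1X8/640x480]'
	v4l2-ctl -z platform:vimc -d "RGB/YUV Capture" -v width=640,height=480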

Thanks,
Dafna



At the moment yes.


Or should (given VIMC is quite specialist anyway) userspace 'just know'
what is capable all the same?

That's possibly fine, as we can simply remove support for the ARGB
formats from the libcamera pipeline handler if it is never expected to
be supported.


With the configfs feature, you could build a topology with sensor->capture,
and ARGB would be supported.


But then as a further question - what formats will we expect VIMC to
support? VIVID has a (very) wide range of formats.

Would we ever expect VIMC to be as configurable?
Or is the scope limited to what we have today?


I know it is very limited atm, but I would like to increase the range,
I'm just with a limited bandwidth to work on it.



If yes, which entity should support it, if not debayer? Should there be
a separate conversion entity, or should we keep the support in debayer
itself for efficiency issues?


On 28.05.20 20:57, Kaaira Gupta wrote:

Running qcam for pixelformat 0x34324142 showed that vimc debayer does
not support it. Hence, add the support for Alpha (255).


I would change the commit log to:

Add support for V4L2_PIX_FMT_RGB24 format in the debayer
and set the alpha channel to constant 255.


Signed-off-by: Kaaira Gupta 
---
    .../media/test-drivers/vimc/vimc-debayer.c    | 27 ---
    1 file changed, 18 insertions(+), 9 deletions(-)

diff --git a/drivers/media/test-drivers/vimc/vimc-debayer.c 
b/drivers/media/test-drivers/vimc/vimc-debayer.c
index 

[PATCH v14 10/15] tools: Add a minimal user-space tool for DAMON

2020-06-02 Thread SeongJae Park
From: SeongJae Park 

This commit adds a shallow wrapper python script, ``/tools/damon/damo``,
that provides a more convenient interface.  Note that it is only aimed
to be used as a minimal reference for DAMON's debugfs interfaces and for
debugging of DAMON itself.
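
For a rough idea of the intended workflow (subcommand names match the
files below; exact flags are best checked with damo's --help):

	# record the access pattern of a running target, then visualize it
	$ sudo ./damo record $(pidof <your_workload>)
	$ ./damo report heats --heatmap heatmap.png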

Signed-off-by: SeongJae Park 
---
 tools/damon/.gitignore|   1 +
 tools/damon/_dist.py  |  36 
 tools/damon/_recfile.py   |  23 +++
 tools/damon/bin2txt.py|  67 +++
 tools/damon/damo  |  37 
 tools/damon/heats.py  | 362 ++
 tools/damon/nr_regions.py |  91 ++
 tools/damon/record.py | 212 ++
 tools/damon/report.py |  45 +
 tools/damon/wss.py|  97 ++
 10 files changed, 971 insertions(+)
 create mode 100644 tools/damon/.gitignore
 create mode 100644 tools/damon/_dist.py
 create mode 100644 tools/damon/_recfile.py
 create mode 100644 tools/damon/bin2txt.py
 create mode 100755 tools/damon/damo
 create mode 100644 tools/damon/heats.py
 create mode 100644 tools/damon/nr_regions.py
 create mode 100644 tools/damon/record.py
 create mode 100644 tools/damon/report.py
 create mode 100644 tools/damon/wss.py

diff --git a/tools/damon/.gitignore b/tools/damon/.gitignore
new file mode 100644
index ..96403d36ff93
--- /dev/null
+++ b/tools/damon/.gitignore
@@ -0,0 +1 @@
+__pycache__/*
diff --git a/tools/damon/_dist.py b/tools/damon/_dist.py
new file mode 100644
index ..9851ec964e5c
--- /dev/null
+++ b/tools/damon/_dist.py
@@ -0,0 +1,36 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: GPL-2.0
+
+import os
+import struct
+import subprocess
+
+def access_patterns(f):
+    nr_regions = struct.unpack('I', f.read(4))[0]
+
+    patterns = []
+    for r in range(nr_regions):
+        saddr = struct.unpack('L', f.read(8))[0]
+        eaddr = struct.unpack('L', f.read(8))[0]
+        nr_accesses = struct.unpack('I', f.read(4))[0]
+        patterns.append([eaddr - saddr, nr_accesses])
+    return patterns
+
+def plot_dist(data_file, output_file, xlabel, ylabel):
+    terminal = output_file.split('.')[-1]
+    if not terminal in ['pdf', 'jpeg', 'png', 'svg']:
+        os.remove(data_file)
+        print("Unsupported plot output type.")
+        exit(-1)
+
+    gnuplot_cmd = """
+    set term %s;
+    set output '%s';
+    set key off;
+    set xlabel '%s';
+    set ylabel '%s';
+    plot '%s' with linespoints;""" % (terminal, output_file, xlabel, ylabel,
+            data_file)
+    subprocess.call(['gnuplot', '-e', gnuplot_cmd])
+    os.remove(data_file)
+
diff --git a/tools/damon/_recfile.py b/tools/damon/_recfile.py
new file mode 100644
index ..331b4d8165d8
--- /dev/null
+++ b/tools/damon/_recfile.py
@@ -0,0 +1,23 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: GPL-2.0
+
+import struct
+
+fmt_version = 0
+
+def set_fmt_version(f):
+    global fmt_version
+
+    mark = f.read(16)
+    if mark == b'damon_recfmt_ver':
+        fmt_version = struct.unpack('i', f.read(4))[0]
+    else:
+        fmt_version = 0
+        f.seek(0)
+    return fmt_version
+
+def pid(f):
+    if fmt_version == 0:
+        return struct.unpack('L', f.read(8))[0]
+    else:
+        return struct.unpack('i', f.read(4))[0]
diff --git a/tools/damon/bin2txt.py b/tools/damon/bin2txt.py
new file mode 100644
index ..8b9b57a0d727
--- /dev/null
+++ b/tools/damon/bin2txt.py
@@ -0,0 +1,67 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: GPL-2.0
+
+import argparse
+import os
+import struct
+import sys
+
+import _recfile
+
+def parse_time(bindat):
+    "bindat should be 16 bytes"
+    sec = struct.unpack('l', bindat[0:8])[0]
+    nsec = struct.unpack('l', bindat[8:16])[0]
+    return sec * 1000000000 + nsec;
+
+def pr_region(f):
+    saddr = struct.unpack('L', f.read(8))[0]
+    eaddr = struct.unpack('L', f.read(8))[0]
+    nr_accesses = struct.unpack('I', f.read(4))[0]
+    print("%012x-%012x(%10d):\t%d" %
+            (saddr, eaddr, eaddr - saddr, nr_accesses))
+
+def pr_task_info(f):
+    pid = _recfile.pid(f)
+    print("pid: ", pid)
+    nr_regions = struct.unpack('I', f.read(4))[0]
+    print("nr_regions: ", nr_regions)
+    for r in range(nr_regions):
+        pr_region(f)
+
+def set_argparser(parser):
+    parser.add_argument('--input', '-i', type=str, metavar='<file>',
+            default='damon.data', help='input file name')
+
+def main(args=None):
+    if not args:
+        parser = argparse.ArgumentParser()
+        set_argparser(parser)
+        args = parser.parse_args()
+
+    file_path = args.input
+
+    if not os.path.isfile(file_path):
+        print('input file (%s) does not exist' % file_path)
+        exit(1)
+
+    with open(file_path, 'rb') as f:
+        _recfile.set_fmt_version(f)
+        start_time = None
+        while True:
+            timebin = f.read(16)
+            if len(timebin) != 16:
+                break
+            time = parse_time(timebin)
+            if not start_time:
+   

Re: Question: livepatch failed for new fork() task stack unreliable

2020-06-02 Thread Josh Poimboeuf
On Tue, Jun 02, 2020 at 09:22:30AM +0800, Wangshaobo (bobo) wrote:
> so I think this question is related to the ORC unwinder; could I ask if you
> have a strategy or plan to avoid this problem?

I suspect something like this would fix it (untested):

diff --git a/arch/x86/kernel/stacktrace.c b/arch/x86/kernel/stacktrace.c
index 6ad43fc44556..8cf95ded1410 100644
--- a/arch/x86/kernel/stacktrace.c
+++ b/arch/x86/kernel/stacktrace.c
@@ -50,7 +50,7 @@ int arch_stack_walk_reliable(stack_trace_consume_fn 
consume_entry,
if (regs) {
/* Success path for user tasks */
if (user_mode(regs))
-   return 0;
+   break;
 
/*
 * Kernel mode registers on the stack indicate an
@@ -81,10 +81,6 @@ int arch_stack_walk_reliable(stack_trace_consume_fn 
consume_entry,
if (unwind_error())
return -EINVAL;
 
-   /* Success path for non-user tasks, i.e. kthreads and idle tasks */
-   if (!(task->flags & (PF_KTHREAD | PF_IDLE)))
-   return -EINVAL;
-
return 0;
 }
 
diff --git a/arch/x86/kernel/unwind_orc.c b/arch/x86/kernel/unwind_orc.c
index 7f969b2d240f..d7396431261a 100644
--- a/arch/x86/kernel/unwind_orc.c
+++ b/arch/x86/kernel/unwind_orc.c
@@ -540,7 +540,7 @@ bool unwind_next_frame(struct unwind_state *state)
state->sp = sp;
state->regs = NULL;
state->prev_regs = NULL;
-   state->signal = false;
+   state->signal = ((void *)state->ip == ret_from_fork);
break;
 
case ORC_TYPE_REGS:



[PATCH] ARM: dts: AM33xx-l4: add gpio-ranges

2020-06-02 Thread Drew Fustini
Add gpio-ranges properties to the gpio controller nodes.

These gpio-ranges were created based on "Table 9-10. CONTROL_MODULE
REGISTERS" in the  "AM335x Technical Reference Manual" [0] and "Table
4-2. Pin Attributes" in the "AM335x Sitara Processor datasheet" [1].
A csv file with this data is available for reference [2].

[0] https://www.ti.com/lit/ug/spruh73q/spruh73q.pdf
[1] http://www.ti.com/lit/ds/symlink/am3358.pdf
[2] https://gist.github.com/pdp7/6ffaddc8867973c1c3e8612cfaf72020

Signed-off-by: Drew Fustini 
---
Notes:
There was previous dicussion relevant to this patch:
https://lore.kernel.org/linux-gpio/20200525131731.GA948395@x1/
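
As a reading aid: each gpio-ranges entry is a
<phandle gpio-offset pin-offset count> tuple, so the first entry added
below,

	gpio-ranges = <&am33xx_pinmux 0 82 8>;

says that GPIO lines 0-7 of gpio0 map to pins 82-89 of the pinmux,
matching the first eight rows of the list below.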

Here is the list I compiled which shows the mapping between a gpio line
and the pin number, including the memory address of the pin control
register:

gpiochip,gpio-line,pinctrl-PIN,pinctrl-address
0,0,82,44e10948
0,1,83,44e1094c
0,2,84,44e10950
0,3,85,44e10954
0,4,86,44e10958
0,5,87,44e1095c
0,6,88,44e10960
0,7,89,44e10964
0,8,52,44e108d0
0,9,53,44e108d4
0,10,54,44e108d8
0,11,55,44e108dc
0,12,94,44e10978
0,13,95,44e1097c
0,14,96,44e10980
0,15,97,44e10984
0,16,71,44e1091c
0,17,72,44e10920
0,18,135,44e10a1c
0,19,108,44e109b0
0,20,109,44e109b4
0,21,73,44e10924
0,22,8,44e10820
0,23,9,44e10824
0,26,10,44e10828
0,27,11,44e1082c
0,28,74,44e10928
0,29,81,44e10944
0,30,28,44e10870
0,31,29,44e10874
1,0,0,44e10800
1,1,1,44e10804
1,2,2,44e10808
1,3,3,44e1080c
1,4,4,44e10810
1,5,5,44e10814
1,6,6,44e10818
1,7,7,44e1081c
1,8,90,44e10968
1,9,91,44e1096c
1,10,92,44e10970
1,11,93,44e10974
1,12,12,44e10830
1,13,13,44e10834
1,14,14,44e10838
1,15,15,44e1083c
1,16,16,44e10840
1,17,17,44e10844
1,18,18,44e10848
1,19,19,44e1084c
1,20,20,44e10850
1,21,21,44e10854
1,22,22,44e10858
1,23,23,44e1085c
1,24,24,44e10860
1,25,25,44e10864
1,26,26,44e10868
1,27,27,44e1086c
1,28,30,44e10878
1,29,31,44e1087c
1,30,32,44e10880
1,31,33,44e10884
2,0,34,44e10888
2,1,35,44e1088c
2,2,36,44e10890
2,3,37,44e10894
2,4,38,44e10898
2,5,39,44e1089c
2,6,40,44e108a0
2,7,41,44e108a4
2,8,42,44e108a8
2,9,43,44e108ac
2,10,44,44e108b0
2,11,45,44e108b4
2,12,46,44e108b8
2,13,47,44e108bc
2,14,48,44e108c0
2,15,49,44e108c4
2,16,50,44e108c8
2,17,51,44e108cc
2,18,77,44e10934
2,19,78,44e10938
2,20,79,44e1093c
2,21,80,44e10940
2,22,56,44e108e0
2,23,57,44e108e4
2,24,58,44e108e8
2,25,59,44e108ec
2,26,60,44e108f0
2,27,61,44e108f4
2,28,62,44e108f8
2,29,63,44e108fc
2,30,64,44e10900
2,31,65,44e10904
3,0,66,44e10908
3,1,67,44e1090c
3,2,68,44e10910
3,3,69,44e10914
3,4,70,44e10918
3,5,98,44e10988
3,6,99,44e1098c
3,9,75,44e1092c
3,10,76,44e10930
3,13,141,44e10a34
3,14,100,44e10990
3,15,101,44e10994
3,16,102,44e10998
3,17,103,44e1099c
3,18,104,44e109a0
3,19,105,44e109a4
3,20,106,44e109a8
3,21,107,44e109ac

On a BeagleBone Black board (AM3358) with this patch:
cat /sys/kernel/debug/pinctrl/44e10800.pinmux-pinctrl-single/gpio-ranges

GPIO ranges handled:
0: gpio-0-31 GPIOS [0 - 7] PINS [82 - 89]
8: gpio-0-31 GPIOS [8 - 11] PINS [52 - 55]
12: gpio-0-31 GPIOS [12 - 15] PINS [94 - 97]
16: gpio-0-31 GPIOS [16 - 17] PINS [71 - 72]
18: gpio-0-31 GPIOS [18 - 18] PINS [135 - 135]
19: gpio-0-31 GPIOS [19 - 20] PINS [108 - 109]
21: gpio-0-31 GPIOS [21 - 21] PINS [73 - 73]
22: gpio-0-31 GPIOS [22 - 23] PINS [8 - 9]
26: gpio-0-31 GPIOS [26 - 27] PINS [10 - 11]
28: gpio-0-31 GPIOS [28 - 28] PINS [74 - 74]
29: gpio-0-31 GPIOS [29 - 29] PINS [81 - 81]
30: gpio-0-31 GPIOS [30 - 31] PINS [28 - 29]
0: gpio-32-63 GPIOS [32 - 39] PINS [0 - 7]
8: gpio-32-63 GPIOS [40 - 43] PINS [90 - 93]
12: gpio-32-63 GPIOS [44 - 59] PINS [12 - 27]
28: gpio-32-63 GPIOS [60 - 63] PINS [30 - 33]
0: gpio-64-95 GPIOS [64 - 81] PINS [34 - 51]
18: gpio-64-95 GPIOS [82 - 85] PINS [77 - 80]
22: gpio-64-95 GPIOS [86 - 95] PINS [56 - 65]
0: gpio-96-127 GPIOS [96 - 100] PINS [66 - 70]
5: gpio-96-127 GPIOS [101 - 102] PINS [98 - 99]
7: gpio-96-127 GPIOS [103 - 104] PINS [75 - 76]
13: gpio-96-127 GPIOS [109 - 109] PINS [141 - 141]
14: gpio-96-127 GPIOS [110 - 117] PINS [100 - 107]

 arch/arm/boot/dts/am33xx-l4.dtsi | 24 
 1 file changed, 24 insertions(+)

diff --git a/arch/arm/boot/dts/am33xx-l4.dtsi b/arch/arm/boot/dts/am33xx-l4.dtsi
index 5ed7f3c58c0f..340ea331e54d 100644
--- a/arch/arm/boot/dts/am33xx-l4.dtsi
+++ b/arch/arm/boot/dts/am33xx-l4.dtsi
@@ -151,6 +151,18 @@ SYSC_OMAP2_SOFTRESET |
 
gpio0: gpio@0 {
compatible = "ti,omap4-gpio";
+   gpio-ranges =   <&am33xx_pinmux  0  82 8>,
+   <&am33xx_pinmux  8  52 4>,
+   <&am33xx_pinmux 12  94 4>,
+   <&am33xx_pinmux 16  71 2>,
+   <&am33xx_pinmux 18 135 1>,
+   <&am33xx_pinmux 19 108 2>,
+   <&am33xx_pinmux 21  73 1>,
+   <&am33xx_pinmux 22   8 2>,
+ 

Re: [PATCH] vimc: debayer: Add support for ARGB format

2020-06-02 Thread Helen Koike



On 6/2/20 9:45 AM, Laurent Pinchart wrote:
> Hello,
> 
> On Tue, Jun 02, 2020 at 08:31:26AM -0300, Helen Koike wrote:
>> On 6/2/20 8:24 AM, Kieran Bingham wrote:
>>> On 02/06/2020 11:55, Helen Koike wrote:
 On 6/2/20 7:52 AM, Dafna Hirschfeld wrote:
> On 01.06.20 14:16, Kaaira Gupta wrote:
>> On Fri, May 29, 2020 at 05:43:57PM +0200, Dafna Hirschfeld wrote:
>>> Hi,
>>> Thanks for the patch
>>>
>>> I don't know how real devices handle ARGB formats,
>>> I wonder if it should be the part of the debayer.
>>
>> Hi! qcam tries to support BA24 as it is one of the formats that vimc
>> lists as its supported formats with --list-formats. Shouldn't BA24 be
>> possible to capture with vimc?
>
> Hi,
> Just to clarify, when listing the supported formats of a video node, the 
> node lists
> the formats that the video node as an independent media entity supports.
> It does not mean that the 'camera' as a whole (that is, the media 
> topology graph) supports
> all the formats that the video node lists. When interacting with a video 
> node or
> a subdevice node, one interacts only with that specific entity.
> In the case of vimc, the RGB video node as an independent entity supports 
> BA24 so the format
> appears in the list of its supported formats. But since the Debayer 
> does not
> support it, the format can not be generated by the entire vimc topology.
> This is not a bug.
> 
> Is here a valid configuration for the vimc pipeline that produces BA24 ?

It should work when using the other capture nodes that doesn't contain the 
debayer
in the pipeline.

> I agree that not all pipeline configurations need to support every
> format, but we shouldn't report a format that can't be produced at all.

ok, this requires major changes in Vimc, unless we implement all the conversions
in the capture.

> 
> This being said, and as discussed before, the de-bayering subdev should
> just produce MEDIA_BUS_FMT_RGB888_1X24, and the video node should then
> implement the RGB pixel formats. BA24 should likely be one of the
> supported formats (or maybe BX24 ?).

We can implement the conversion in the capture node, we just need to
distinguish when the pipeline generates it and when it requires conversion,
but shouldn't be a problem.

> 
 This is also my understanding.

 You should have an -EPIPE error when start streaming though, it
 shouldn't fail silently.
>>>
>>> Yes, we had -EPIPE, and that is what I think we were trying to resolve.
>>>
>>> How would userspace be expected to detect what formats to use ? Should
>>> the available formats on the capture node depend on the current linking
>>> of the media graph?
>>
>> This is a good question, I don't recall v4l2 API defining this.
> 
> A recent extension to VIDIOC_ENUMFMT allows enumerating pixel formats
> for a given media bus code, I think that's the way forward.

Nice, I'm not familiar with this extension, I'll take a look, thanks for
the pointer.

Thanks
Helen

> 
>> It would be a bit hard to implement in Vimc, especially when we have configfs
>> for custom topology, since the capture would need to query the whole pipeline.
>> But could be implemented.
>>
>>> Otherwise, to know what formats are supported - userspace must first
>>> 'get a list of formats' then try to 'set' the formats to know what is
>>> possible?
>>
>> At the moment yes.
>>
>>> Or should (given VIMC is quite specialist anyway) userspace 'just know'
>>> what is capable all the same?
>>>
>>> That's possibly fine, as we can simply remove support for the ARGB
>>> formats from the libcamera pipeline handler if it is never expected to
>>> be supported.
>>
>> With the configfs feature, you could build a topology with sensor->capture,
>> and ARGB would be supported.
>>
>>> But then as a further question - what formats will we expect VIMC to
>>> support? VIVID has a (very) wide range of formats.
>>>
>>> Would we ever expect VIMC to be as configurable?
>>> Or is the scope limited to what we have today?
>>
>> I know it is very limited atm, but I would like to increase the range,
>> I'm just with a limited bandwidth to work on it.
>>
>>
>> If yes, which entity should support it, if not debayer? Should there be
>> a separate conversion entity, or should we keep the support in debayer
>> itself for efficiency issues?
>>
>>> On 28.05.20 20:57, Kaaira Gupta wrote:
 Running qcam for pixelformat 0x34324142 showed that vimc debayer does
 not support it. Hence, add the support for Alpha (255).
>>>
>>> I would change the commit log to:
>>>
>>> Add support for V4L2_PIX_FMT_RGB24 format in the debayer
>>> and set the alpha channel to constant 255.
>>>
 Signed-off-by: Kaaira Gupta 
 ---
    .../media/test-drivers/vimc/vimc-debayer.c    | 27 
 ---
    1 file changed, 18 insertions(+), 9 

[PATCH v14 09/15] mm/damon: Add tracepoints

2020-06-02 Thread SeongJae Park
From: SeongJae Park 

This commit adds a tracepoint for DAMON.  It traces the monitoring
results of each region for each aggregation interval.  Using this, DAMON
will be easily integrated with any tracepoint-supporting tools such as
perf.
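
For example, with this in place the aggregated results should be
capturable with a stock perf invocation like (the event name follows
from TRACE_SYSTEM and the event definition below):

	# perf record -e damon:damon_aggregated -a sleep 5
	# perf script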

Signed-off-by: SeongJae Park 
Reviewed-by: Leonard Foerster 
---
 include/trace/events/damon.h | 43 
 mm/damon.c   |  5 +
 2 files changed, 48 insertions(+)
 create mode 100644 include/trace/events/damon.h

diff --git a/include/trace/events/damon.h b/include/trace/events/damon.h
new file mode 100644
index ..22236642d366
--- /dev/null
+++ b/include/trace/events/damon.h
@@ -0,0 +1,43 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM damon
+
+#if !defined(_TRACE_DAMON_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_DAMON_H
+
+#include <linux/types.h>
+#include <linux/tracepoint.h>
+
+TRACE_EVENT(damon_aggregated,
+
+   TP_PROTO(int pid, unsigned int nr_regions,
+   unsigned long vm_start, unsigned long vm_end,
+   unsigned int nr_accesses),
+
+   TP_ARGS(pid, nr_regions, vm_start, vm_end, nr_accesses),
+
+   TP_STRUCT__entry(
+   __field(int, pid)
+   __field(unsigned int, nr_regions)
+   __field(unsigned long, vm_start)
+   __field(unsigned long, vm_end)
+   __field(unsigned int, nr_accesses)
+   ),
+
+   TP_fast_assign(
+   __entry->pid = pid;
+   __entry->nr_regions = nr_regions;
+   __entry->vm_start = vm_start;
+   __entry->vm_end = vm_end;
+   __entry->nr_accesses = nr_accesses;
+   ),
+
+   TP_printk("pid=%d nr_regions=%u %lu-%lu: %u", __entry->pid,
+   __entry->nr_regions, __entry->vm_start,
+   __entry->vm_end, __entry->nr_accesses)
+);
+
+#endif /* _TRACE_DAMON_H */
+
+/* This part must be outside protection */
+#include <trace/define_trace.h>
diff --git a/mm/damon.c b/mm/damon.c
index 6b0b8f21a6c6..af6f395fe06c 100644
--- a/mm/damon.c
+++ b/mm/damon.c
@@ -9,6 +9,8 @@
 
 #define pr_fmt(fmt) "damon: " fmt
 
+#define CREATE_TRACE_POINTS
+
 #include 
 #include 
 #include 
@@ -20,6 +22,7 @@
 #include 
 #include 
 #include 
+#include <trace/events/damon.h>
 
 /* Minimal region size.  Every damon_region is aligned by this. */
 #define MIN_REGION PAGE_SIZE
@@ -650,6 +653,8 @@ static void kdamond_reset_aggregated(struct damon_ctx *c)
damon_write_rbuf(c, &r->vm_end, sizeof(r->vm_end));
damon_write_rbuf(c, &r->nr_accesses,
sizeof(r->nr_accesses));
+   trace_damon_aggregated(t->pid, nr,
+   r->vm_start, r->vm_end, r->nr_accesses);
r->nr_accesses = 0;
}
}
-- 
2.17.1



[PATCH v14 08/15] mm/damon: Add debugfs interface

2020-06-02 Thread SeongJae Park
From: SeongJae Park 

This commit adds a debugfs interface for DAMON.

DAMON exports four files, ``attrs``, ``pids``, ``record``, and
``monitor_on`` under its debugfs directory, ``/damon/``.

Attributes
--

Users can read and write the ``sampling interval``, ``aggregation
interval``, ``regions update interval``, and min/max number of
monitoring target regions by reading from and writing to the ``attrs``
file.  For example, below commands set those values to 5 ms, 100 ms,
1,000 ms, 10, 1000 and check it again::

# cd /damon
# echo 5000 100000 1000000 10 1000 > attrs
# cat attrs
5000 100000 1000000 10 1000

Target PIDs
---

Users can read and write the pids of current monitoring target processes
by reading from and writing to the ``pids`` file.  For example, below
commands set processes having pids 42 and 4242 as the processes to be
monitored and check it again::

# cd /damon
# echo 42 4242 > pids
# cat pids
42 4242

Note that setting the pids doesn't start the monitoring.

Record
--

DAMON supports direct monitoring result record feature.  The recorded
results are first written to a buffer and flushed to a file in batch.
Users can set the size of the buffer and the path to the result file by
reading from and writing to the ``record`` file.  For example, below
commands set the buffer to be 4 KiB and the result to be saved in
'/damon.data'.

# cd /damon
# echo 4096 /damon.data > record
# cat record
4096 /damon.data

Turning On/Off
--

You can check current status, start and stop the monitoring by reading
from and writing to the ``monitor_on`` file.  Writing ``on`` to the file
starts DAMON to monitor the target processes with the attributes.
Writing ``off`` to the file stops DAMON.  DAMON also stops if every
target process is terminated.
and check status of DAMON::

# cd /damon
# echo on > monitor_on
# echo off > monitor_on
# cat monitor_on
off

Please note that you cannot write to the ``attrs`` and ``pids`` files
while the monitoring is turned on.  If you write to the files while
DAMON is running, ``-EINVAL`` will be returned.

Signed-off-by: SeongJae Park 
Reviewed-by: Leonard Foerster 
---
 mm/damon.c | 357 -
 1 file changed, 356 insertions(+), 1 deletion(-)

diff --git a/mm/damon.c b/mm/damon.c
index 768ffd08f7ab..6b0b8f21a6c6 100644
--- a/mm/damon.c
+++ b/mm/damon.c
@@ -10,6 +10,7 @@
 #define pr_fmt(fmt) "damon: " fmt
 
 #include <linux/damon.h>
+#include <linux/debugfs.h>
 #include <linux/delay.h>
 #include <linux/kthread.h>
 #include <linux/mm.h>
@@ -50,6 +51,15 @@
 /* Get a random number in [l, r) */
 #define damon_rand(l, r) (l + prandom_u32() % (r - l))
 
+/* A monitoring context for debugfs interface users. */
+static struct damon_ctx damon_user_ctx = {
+   .sample_interval = 5 * 1000,
+   .aggr_interval = 100 * 1000,
+   .regions_update_interval = 1000 * 1000,
+   .min_nr_regions = 10,
+   .max_nr_regions = 1000,
+};
+
 /*
  * Construct a damon_region struct
  *
@@ -1133,13 +1143,358 @@ int damon_set_attrs(struct damon_ctx *ctx, unsigned 
long sample_int,
return 0;
 }
 
-static int __init damon_init(void)
+static ssize_t debugfs_monitor_on_read(struct file *file,
+   char __user *buf, size_t count, loff_t *ppos)
+{
+   struct damon_ctx *ctx = &damon_user_ctx;
+   char monitor_on_buf[5];
+   bool monitor_on;
+   int len;
+
+   monitor_on = damon_kdamond_running(ctx);
+   len = snprintf(monitor_on_buf, 5, monitor_on ? "on\n" : "off\n");
+
+   return simple_read_from_buffer(buf, count, ppos, monitor_on_buf, len);
+}
+
+static ssize_t debugfs_monitor_on_write(struct file *file,
+   const char __user *buf, size_t count, loff_t *ppos)
+{
+   struct damon_ctx *ctx = &damon_user_ctx;
+   ssize_t ret;
+   char cmdbuf[5];
+   int err;
+
+   ret = simple_write_to_buffer(cmdbuf, 5, ppos, buf, count);
+   if (ret < 0)
+   return ret;
+
+   if (sscanf(cmdbuf, "%s", cmdbuf) != 1)
+   return -EINVAL;
+   if (!strncmp(cmdbuf, "on", 5))
+   err = damon_start(ctx);
+   else if (!strncmp(cmdbuf, "off", 5))
+   err = damon_stop(ctx);
+   else
+   return -EINVAL;
+
+   if (err)
+   ret = err;
+   return ret;
+}
+
+static ssize_t damon_sprint_pids(struct damon_ctx *ctx, char *buf, ssize_t len)
+{
+   struct damon_task *t;
+   int written = 0;
+   int rc;
+
+   damon_for_each_task(t, ctx) {
+   rc = snprintf(&buf[written], len - written, "%d ", t->pid);
+   if (!rc)
+   return -ENOMEM;
+   written += rc;
+   }
+   if (written)
+   written -= 1;
+   written += snprintf(&buf[written], len - written, "\n");
+   return written;
+}
+
+static ssize_t debugfs_pids_read(struct file *file,
+   char __user *buf, size_t count, loff_t 

[PATCH -next] PCI: uniphier: Fix Kconfig warning

2020-06-02 Thread YueHaibing
WARNING: unmet direct dependencies detected for PCIE_DW_EP
  Depends on [n]: PCI [=y] && PCI_ENDPOINT [=n]
  Selected by [y]:
  - PCIE_UNIPHIER_EP [=y] && PCI [=y] && (ARCH_UNIPHIER || COMPILE_TEST [=y]) 
&& OF [=y] && HAS_IOMEM [=y]

Add missing dependency to fix this.

Fixes: 006564dee825 ("PCI: uniphier: Add Socionext UniPhier Pro5 PCIe endpoint 
controller driver")
Signed-off-by: YueHaibing 
---
 drivers/pci/controller/dwc/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/pci/controller/dwc/Kconfig 
b/drivers/pci/controller/dwc/Kconfig
index 43a29f7a4501..044a3761c44f 100644
--- a/drivers/pci/controller/dwc/Kconfig
+++ b/drivers/pci/controller/dwc/Kconfig
@@ -293,6 +293,7 @@ config PCIE_UNIPHIER_EP
bool "Socionext UniPhier PCIe endpoint controllers"
depends on ARCH_UNIPHIER || COMPILE_TEST
depends on OF && HAS_IOMEM
+   depends on PCI_ENDPOINT
select PCIE_DW_EP
help
  Say Y here if you want PCIe endpoint controller support on
-- 
2.17.1




[PATCH 2/2] lib/Kconfig.debug: Fix typo in the help text of CONFIG_PANIC_TIMEOUT

2020-06-02 Thread Tiezhu Yang
There is a duplicated "the" in the help text of CONFIG_PANIC_TIMEOUT;
remove it.

Signed-off-by: Tiezhu Yang 
---
 lib/Kconfig.debug | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index b3b05ad..d33627a 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -824,7 +824,7 @@ config PANIC_TIMEOUT
int "panic timeout"
default 0
help
- Set the timeout value (in seconds) until a reboot occurs when the
+ Set the timeout value (in seconds) until a reboot occurs when
  the kernel panics. If n = 0, then we wait forever. A timeout
  value n > 0 will wait n seconds before rebooting, while a timeout
  value n < 0 will reboot immediately.
-- 
2.1.0



[PATCH 1/2] kernel/panic.c: Make oops_may_print() return bool

2020-06-02 Thread Tiezhu Yang
The return value of oops_may_print() is true or false, so change its type
to reflect that.

Signed-off-by: Tiezhu Yang 
---
 include/linux/kernel.h | 2 +-
 kernel/panic.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/linux/kernel.h b/include/linux/kernel.h
index 9b7a8d7..69c7fa4 100644
--- a/include/linux/kernel.h
+++ b/include/linux/kernel.h
@@ -322,7 +322,7 @@ void nmi_panic(struct pt_regs *regs, const char *msg);
 extern void oops_enter(void);
 extern void oops_exit(void);
 void print_oops_end_marker(void);
-extern int oops_may_print(void);
+extern bool oops_may_print(void);
 void do_exit(long error_code) __noreturn;
 void complete_and_exit(struct completion *, long) __noreturn;
 
diff --git a/kernel/panic.c b/kernel/panic.c
index b69ee9e..064d80f 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -490,7 +490,7 @@ static void do_oops_enter_exit(void)
  * Return true if the calling CPU is allowed to print oops-related info.
  * This is a bit racy..
  */
-int oops_may_print(void)
+bool oops_may_print(void)
 {
return pause_on_oops_flag == 0;
 }
-- 
2.1.0



[PATCH v2] soc: samsung: Add simple voltage coupler for Exynos5800

2020-06-02 Thread Marek Szyprowski
Add a simple custom voltage regulator coupler for Exynos5800 SoCs, which
require coupling between "vdd_arm" and "vdd_int" regulators. This coupler
ensures that the voltage balancing for the coupled regulators is done
only when the clients for each regulator apply their constraints, so the
voltage values don't go beyond the bootloader-selected operation point
during the boot process. This also ensures proper voltage balancing if
any of the client drivers is missing.
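
For context, the coupling this caters for is described in the device
tree roughly as below (an illustrative sketch, not part of this patch;
node and supply names are made up):

	buck2_reg: BUCK2 {
		regulator-name = "vdd_arm";
		regulator-coupled-with = <&buck3_reg>;
		regulator-coupled-max-spread = <300000>;
	};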

Signed-off-by: Marek Szyprowski 
---
v2:
- removed dependency on the regulator names as pointed by krzk, now it
  works with all coupled regulators. So far the coupling between the
  regulators on Exynos5800 based boards is defined only between
  "vdd_arm" and "vdd_int" supplies.
---
 arch/arm/mach-exynos/Kconfig  |  1 +
 drivers/soc/samsung/Kconfig   |  3 +
 drivers/soc/samsung/Makefile  |  1 +
 .../soc/samsung/exynos-regulator-coupler.c| 56 +++
 4 files changed, 61 insertions(+)
 create mode 100644 drivers/soc/samsung/exynos-regulator-coupler.c

diff --git a/arch/arm/mach-exynos/Kconfig b/arch/arm/mach-exynos/Kconfig
index 76838255b5fa..f185cd3d4c62 100644
--- a/arch/arm/mach-exynos/Kconfig
+++ b/arch/arm/mach-exynos/Kconfig
@@ -118,6 +118,7 @@ config SOC_EXYNOS5800
bool "Samsung EXYNOS5800"
default y
depends on SOC_EXYNOS5420
+   select EXYNOS_REGULATOR_COUPLER
 
 config EXYNOS_MCPM
bool
diff --git a/drivers/soc/samsung/Kconfig b/drivers/soc/samsung/Kconfig
index c7a2003687c7..264185664594 100644
--- a/drivers/soc/samsung/Kconfig
+++ b/drivers/soc/samsung/Kconfig
@@ -37,4 +37,7 @@ config EXYNOS_PM_DOMAINS
bool "Exynos PM domains" if COMPILE_TEST
depends on PM_GENERIC_DOMAINS || COMPILE_TEST
 
+config EXYNOS_REGULATOR_COUPLER
+   bool "Exynos SoC Regulator Coupler" if COMPILE_TEST
+   depends on ARCH_EXYNOS || COMPILE_TEST
 endif
diff --git a/drivers/soc/samsung/Makefile b/drivers/soc/samsung/Makefile
index edd1d6ea064d..ecc3a32f6406 100644
--- a/drivers/soc/samsung/Makefile
+++ b/drivers/soc/samsung/Makefile
@@ -9,3 +9,4 @@ obj-$(CONFIG_EXYNOS_PMU)+= exynos-pmu.o
 obj-$(CONFIG_EXYNOS_PMU_ARM_DRIVERS)   += exynos3250-pmu.o exynos4-pmu.o \
exynos5250-pmu.o exynos5420-pmu.o
 obj-$(CONFIG_EXYNOS_PM_DOMAINS) += pm_domains.o
+obj-$(CONFIG_EXYNOS_REGULATOR_COUPLER) += exynos-regulator-coupler.o
diff --git a/drivers/soc/samsung/exynos-regulator-coupler.c 
b/drivers/soc/samsung/exynos-regulator-coupler.c
new file mode 100644
index ..370a0ce4de3a
--- /dev/null
+++ b/drivers/soc/samsung/exynos-regulator-coupler.c
@@ -0,0 +1,56 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2020 Samsung Electronics Co., Ltd.
+ *   http://www.samsung.com/
+ * Author: Marek Szyprowski 
+ *
+ * Simple Samsung Exynos SoC voltage coupler. Ensures that all
+ * clients set their constraints before balancing the voltages.
+ */
+
+#include <linux/of.h>
+#include <linux/regulator/coupler.h>
+#include <linux/regulator/driver.h>
+#include <linux/regulator/machine.h>
+#include <linux/suspend.h>
+
+static int exynos_coupler_balance_voltage(struct regulator_coupler *coupler,
+ struct regulator_dev *rdev,
+ suspend_state_t state)
+{
+   struct coupling_desc *c_desc = &rdev->coupling_desc;
+   int ret, cons_uV = 0, cons_max_uV = INT_MAX;
+   bool skip_coupled = false;
+
+   /* get coupled regulator constraints */
+   ret = regulator_check_consumers(c_desc->coupled_rdevs[1], &cons_uV,
+   &cons_max_uV, state);
+   if (ret < 0)
+   return ret;
+
+   /* skip adjusting coupled regulator if it has no constraints set yet */
+   if (cons_uV == 0)
+   skip_coupled = true;
+
+   return regulator_do_balance_voltage(rdev, state, skip_coupled);
+}
+
+static int exynos_coupler_attach(struct regulator_coupler *coupler,
+struct regulator_dev *rdev)
+{
+   return 0;
+}
+
+static struct regulator_coupler exynos_coupler = {
+   .attach_regulator = exynos_coupler_attach,
+   .balance_voltage  = exynos_coupler_balance_voltage,
+};
+
+static int __init exynos_coupler_init(void)
+{
+   if (!of_machine_is_compatible("samsung,exynos5800"))
+   return 0;
+
+   return regulator_coupler_register(&exynos_coupler);
+}
+arch_initcall(exynos_coupler_init);
-- 
2.17.1



[PATCH v14 07/15] mm/damon: Implement access pattern recording

2020-06-02 Thread SeongJae Park
From: SeongJae Park 

This commit implements the recording feature of DAMON. If this feature
is enabled, DAMON writes the monitored access patterns in its binary
format into a file specified by the user. This could already be
implemented by each user using the callbacks.  However, as the
recording is expected to be widely used, this commit implements the
feature in DAMON itself, for more convenience and efficiency.

Signed-off-by: SeongJae Park 
Reviewed-by: Leonard Foerster 
---
 include/linux/damon.h |  15 +
 mm/damon.c| 131 +-
 2 files changed, 143 insertions(+), 3 deletions(-)

diff --git a/include/linux/damon.h b/include/linux/damon.h
index 12536d9d2f74..c4796a10cb1a 100644
--- a/include/linux/damon.h
+++ b/include/linux/damon.h
@@ -66,6 +66,14 @@ struct damon_task {
  * in case of virtual memory monitoring) and applies the changes for each
  * @regions_update_interval.  All time intervals are in micro-seconds.
  *
+ * @rbuf: In-memory buffer for monitoring result recording.
+ * @rbuf_len: The length of @rbuf.
+ * @rbuf_offset: The offset for next write to @rbuf.
+ * @rfile_path: Record file path.
+ *
+ * If @rbuf, @rbuf_len, and @rfile_path are set, the monitored results are
+ * automatically stored in @rfile_path file.
+ *
  * @kdamond:   Kernel thread who does the monitoring.
  * @kdamond_stop:  Notifies whether kdamond should stop.
  * @kdamond_lock:  Mutex for the synchronizations with @kdamond.
@@ -100,6 +108,11 @@ struct damon_ctx {
struct timespec64 last_aggregation;
struct timespec64 last_regions_update;
 
+   unsigned char *rbuf;
+   unsigned int rbuf_len;
+   unsigned int rbuf_offset;
+   char *rfile_path;
+
struct task_struct *kdamond;
bool kdamond_stop;
struct mutex kdamond_lock;
@@ -115,6 +128,8 @@ int damon_set_pids(struct damon_ctx *ctx, int *pids, 
ssize_t nr_pids);
 int damon_set_attrs(struct damon_ctx *ctx, unsigned long sample_int,
unsigned long aggr_int, unsigned long regions_update_int,
unsigned long min_nr_reg, unsigned long max_nr_reg);
+int damon_set_recording(struct damon_ctx *ctx,
+   unsigned int rbuf_len, char *rfile_path);
 int damon_start(struct damon_ctx *ctx);
 int damon_stop(struct damon_ctx *ctx);
 
diff --git a/mm/damon.c b/mm/damon.c
index 42ce9df47f8b..768ffd08f7ab 100644
--- a/mm/damon.c
+++ b/mm/damon.c
@@ -44,6 +44,9 @@
 #define damon_for_each_task_safe(t, next, ctx) \
list_for_each_entry_safe(t, next, &(ctx)->tasks_list, list)
 
+#define MAX_RECORD_BUFFER_LEN  (4 * 1024 * 1024)
+#define MAX_RFILE_PATH_LEN 256
+
 /* Get a random number in [l, r) */
 #define damon_rand(l, r) (l + prandom_u32() % (r - l))
 
@@ -565,16 +568,80 @@ static bool kdamond_aggregate_interval_passed(struct 
damon_ctx *ctx)
 }
 
 /*
- * Reset the aggregated monitoring results
+ * Flush the content in the result buffer to the result file
+ */
+static void damon_flush_rbuffer(struct damon_ctx *ctx)
+{
+   ssize_t sz;
+   loff_t pos = 0;
+   struct file *rfile;
+
+   rfile = filp_open(ctx->rfile_path, O_CREAT | O_RDWR | O_APPEND, 0644);
+   if (IS_ERR(rfile)) {
+   pr_err("Cannot open the result file %s\n",
+   ctx->rfile_path);
+   return;
+   }
+
+   while (ctx->rbuf_offset) {
sz = kernel_write(rfile, ctx->rbuf, ctx->rbuf_offset, &pos);
+   if (sz < 0)
+   break;
+   ctx->rbuf_offset -= sz;
+   }
+   filp_close(rfile, NULL);
+}
+
+/*
+ * Write a data into the result buffer
+ */
+static void damon_write_rbuf(struct damon_ctx *ctx, void *data, ssize_t size)
+{
+   if (!ctx->rbuf_len || !ctx->rbuf)
+   return;
+   if (ctx->rbuf_offset + size > ctx->rbuf_len)
+   damon_flush_rbuffer(ctx);
+
+   memcpy(&ctx->rbuf[ctx->rbuf_offset], data, size);
+   ctx->rbuf_offset += size;
+}
+
+/*
+ * Flush the aggregated monitoring results to the result buffer
+ *
+ * Stores current tracking results to the result buffer and resets
+ * 'nr_accesses' of each region.  The format for the result buffer is
+ * as below:
+ *
+ *   <time> <number of tasks>
+ *
+ *   task info: <pid> <number of regions>
+ *   region info: <start address> <end address> <nr_accesses>
  */
 static void kdamond_reset_aggregated(struct damon_ctx *c)
 {
struct damon_task *t;
-   struct damon_region *r;
+   struct timespec64 now;
+   unsigned int nr;
+
+   ktime_get_coarse_ts64(&now);
+
+   damon_write_rbuf(c, &now, sizeof(struct timespec64));
+   nr = nr_damon_tasks(c);
+   damon_write_rbuf(c, &nr, sizeof(nr));
 
damon_for_each_task(t, c) {
-   damon_for_each_region(r, t)
+   struct damon_region *r;
+
+   damon_write_rbuf(c, &t->pid, sizeof(t->pid));
+   nr = nr_damon_regions(t);
+   damon_write_rbuf(c, &nr, sizeof(nr));
+   damon_for_each_region(r, t) 
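
For reference, records written in this format could be parsed from user
space along these lines.  This is only a sketch: it assumes a 64-bit
kernel (16-byte timespec64, 4-byte pid and counters, 8-byte addresses)
and the per-region layout (start address, end address, nr_accesses)
described in the format comment above.

#include <stdint.h>
#include <stdio.h>

/* Read exactly n bytes or fail. */
static int rd(FILE *f, void *p, size_t n)
{
	return fread(p, 1, n, f) == n ? 0 : -1;
}

int main(void)
{
	FILE *f = fopen("damon.data", "rb");
	uint64_t sec, nsec, start, end;
	uint32_t nr_tasks, nr_regions, nr_accesses;
	int32_t pid;

	if (!f)
		return 1;
	/* One snapshot per aggregation interval: time, then tasks. */
	while (!rd(f, &sec, 8) && !rd(f, &nsec, 8) && !rd(f, &nr_tasks, 4)) {
		for (uint32_t t = 0; t < nr_tasks; t++) {
			if (rd(f, &pid, 4) || rd(f, &nr_regions, 4))
				goto out;
			for (uint32_t r = 0; r < nr_regions; r++) {
				if (rd(f, &start, 8) || rd(f, &end, 8) ||
				    rd(f, &nr_accesses, 4))
					goto out;
				printf("%lu.%09lu pid %d [%#lx, %#lx) %u\n",
				       (unsigned long)sec,
				       (unsigned long)nsec, (int)pid,
				       (unsigned long)start,
				       (unsigned long)end, nr_accesses);
			}
		}
	}
out:
	fclose(f);
	return 0;
}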

[PATCH] iommu/iova: Don't BUG on invalid PFNs

2020-06-02 Thread Robin Murphy
Unlike the other instances which represent a complete loss of
consistency within the rcache mechanism itself, or a fundamental
and obvious misconfiguration by an IOMMU driver, the BUG_ON() in
iova_magazine_free_pfns() can be provoked at more or less any time
in a "spooky action-at-a-distance" manner by any old device driver
passing nonsense to dma_unmap_*() which then propagates through to
queue_iova().

Not only is this well outside the IOVA layer's control, it's also
nowhere near fatal enough to justify panicking anyway - all that
really achieves is to make debugging the offending driver more
difficult. Let's simply WARN and otherwise ignore bogus PFNs.

Reported-by: Prakash Gupta 
Signed-off-by: Robin Murphy 
---
 drivers/iommu/iova.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
index 0e6a9536eca6..612cbf668adf 100644
--- a/drivers/iommu/iova.c
+++ b/drivers/iommu/iova.c
@@ -811,7 +811,9 @@ iova_magazine_free_pfns(struct iova_magazine *mag, struct 
iova_domain *iovad)
for (i = 0 ; i < mag->size; ++i) {
struct iova *iova = private_find_iova(iovad, mag->pfns[i]);
 
-   BUG_ON(!iova);
+   if (WARN_ON(!iova))
+   continue;
+
private_free_iova(iovad, iova);
}
 
-- 
2.23.0.dirty



[PATCH v14 05/15] mm/damon: Apply dynamic memory mapping changes

2020-06-02 Thread SeongJae Park
From: SeongJae Park 

Only a portion of the virtual address space of a process is mapped to
physical memory and accessed.  Thus, tracking the unmapped address
regions is just wasteful.  However, tracking every memory mapping change
might incur an overhead.  For that reason, DAMON applies the dynamic
memory mapping changes to the tracking regions only once per
user-specified time interval (``regions update interval``).

Signed-off-by: SeongJae Park 
Reviewed-by: Leonard Foerster 
---
 include/linux/damon.h |  13 --
 mm/damon.c| 101 +-
 2 files changed, 108 insertions(+), 6 deletions(-)

diff --git a/include/linux/damon.h b/include/linux/damon.h
index babdba6b5c47..5afcb2bb7f77 100644
--- a/include/linux/damon.h
+++ b/include/linux/damon.h
@@ -55,13 +55,16 @@ struct damon_task {
  *
  * @sample_interval:   The time between access samplings.
  * @aggr_interval: The time between monitor results aggregations.
+ * @regions_update_interval:   The time between monitor regions updates.
  * @min_nr_regions:The number of initial monitoring regions.
  * @max_nr_regions:The maximum number of monitoring regions.
  *
  * For each @sample_interval, DAMON checks whether each region is accessed or
  * not.  It aggregates and keeps the access information (number of accesses to
- * each region) for @aggr_interval time.  All time intervals are in
- * micro-seconds.
+ * each region) for @aggr_interval time.  DAMON also checks whether the target
+ * memory regions need update (e.g., by ``mmap()`` calls from the application,
+ * in case of virtual memory monitoring) and applies the changes for each
+ * @regions_update_interval.  All time intervals are in micro-seconds.
  *
  * @kdamond:   Kernel thread who does the monitoring.
  * @kdamond_stop:  Notifies whether kdamond should stop.
@@ -81,10 +84,12 @@ struct damon_task {
 struct damon_ctx {
unsigned long sample_interval;
unsigned long aggr_interval;
+   unsigned long regions_update_interval;
unsigned long min_nr_regions;
unsigned long max_nr_regions;
 
struct timespec64 last_aggregation;
+   struct timespec64 last_regions_update;
 
struct task_struct *kdamond;
bool kdamond_stop;
@@ -94,8 +99,8 @@ struct damon_ctx {
 };
 
 int damon_set_pids(struct damon_ctx *ctx, int *pids, ssize_t nr_pids);
-int damon_set_attrs(struct damon_ctx *ctx,
-   unsigned long sample_int, unsigned long aggr_int,
+int damon_set_attrs(struct damon_ctx *ctx, unsigned long sample_int,
+   unsigned long aggr_int, unsigned long regions_update_int,
unsigned long min_nr_reg, unsigned long max_nr_reg);
 int damon_start(struct damon_ctx *ctx);
 int damon_stop(struct damon_ctx *ctx);
diff --git a/mm/damon.c b/mm/damon.c
index 6b8f038331f5..de5ed38ac05c 100644
--- a/mm/damon.c
+++ b/mm/damon.c
@@ -713,6 +713,98 @@ static void kdamond_split_regions(struct damon_ctx *ctx)
last_nr_regions = nr_regions;
 }
 
+/*
+ * Check whether it is time to check and apply the dynamic mmap changes
+ *
+ * Returns true if it is.
+ */
+static bool kdamond_need_update_regions(struct damon_ctx *ctx)
+{
+   return damon_check_reset_time_interval(&ctx->last_regions_update,
+   ctx->regions_update_interval);
+}
+
+/*
+ * Check whether regions are intersecting
+ *
+ * Note that this function checks 'struct damon_region' and 'struct region'.
+ *
+ * Returns true if it is.
+ */
+static bool damon_intersect(struct damon_region *r, struct region *re)
+{
+   return !(r->vm_end <= re->start || re->end <= r->vm_start);
+}
+
+/*
+ * Update damon regions for the three big regions of the given task
+ *
+ * t          the given task
+ * bregions   the three big regions of the task
+ */
+static void damon_apply_three_regions(struct damon_ctx *ctx,
+   struct damon_task *t, struct region bregions[3])
+{
+   struct damon_region *r, *next;
+   unsigned int i = 0;
+
+   /* Remove regions which are not in the three big regions now */
+   damon_for_each_region_safe(r, next, t) {
+   for (i = 0; i < 3; i++) {
+   if (damon_intersect(r, &bregions[i]))
+   break;
+   }
+   if (i == 3)
+   damon_destroy_region(r);
+   }
+
+   /* Adjust intersecting regions to fit with the three big regions */
+   for (i = 0; i < 3; i++) {
+   struct damon_region *first = NULL, *last;
+   struct damon_region *newr;
+   struct region *br;
+
+   br = &bregions[i];
+   /* Get the first and last regions which intersects with br */
+   damon_for_each_region(r, t) {
+   if (damon_intersect(r, br)) {
+   if (!first)
+   first = r;
+ 
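
As an aside on the overlap test above: damon_intersect() is the standard
half-open interval check.  An equivalent form makes this explicit (a
sketch for illustration only, not part of the patch):

static bool intervals_overlap(unsigned long a_start, unsigned long a_end,
			      unsigned long b_start, unsigned long b_end)
{
	/* [0, 10) and [10, 20) do not overlap; [0, 10) and [9, 20) do */
	return a_start < b_end && b_start < a_end;
}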

[PATCH v14 06/15] mm/damon: Implement callbacks

2020-06-02 Thread SeongJae Park
From: SeongJae Park 

This commit implements callbacks for DAMON.  Using this, DAMON users can
install their callbacks for each step of the access monitoring so that
they can do something interesting with the monitored access patterns
online.  For example, callbacks can report the monitored patterns to
users or perform access-pattern-based memory management, such as
proactive reclamation or access-pattern-based THP promotion/demotion
decisions.

Signed-off-by: SeongJae Park 
Reviewed-by: Leonard Foerster 
---
 include/linux/damon.h | 13 +
 mm/damon.c|  4 
 2 files changed, 17 insertions(+)

diff --git a/include/linux/damon.h b/include/linux/damon.h
index 5afcb2bb7f77..12536d9d2f74 100644
--- a/include/linux/damon.h
+++ b/include/linux/damon.h
@@ -80,6 +80,15 @@ struct damon_task {
  * @kdamond_lock.  Accesses to other fields must be protected by themselves.
  *
  * @tasks_list:Head of monitoring target tasks (_task) 
list.
+ *
+ * @sample_cb: Called for each sampling interval.
+ * @aggregate_cb:  Called for each aggregation interval.
+ *
+ * @sample_cb and @aggregate_cb are called from @kdamond for each of the
+ * sampling intervals and aggregation intervals, respectively.  Therefore,
+ * users can safely access the monitoring results via @tasks_list without
+ * additional protection of @kdamond_lock.  For this reason, users are
+ * recommended to use these callbacks for accessing the results.
  */
 struct damon_ctx {
unsigned long sample_interval;
@@ -96,6 +105,10 @@ struct damon_ctx {
struct mutex kdamond_lock;
 
struct list_head tasks_list;/* 'damon_task' objects */
+
+   /* callbacks */
+   void (*sample_cb)(struct damon_ctx *context);
+   void (*aggregate_cb)(struct damon_ctx *context);
 };
 
 int damon_set_pids(struct damon_ctx *ctx, int *pids, ssize_t nr_pids);
diff --git a/mm/damon.c b/mm/damon.c
index de5ed38ac05c..42ce9df47f8b 100644
--- a/mm/damon.c
+++ b/mm/damon.c
@@ -850,6 +850,8 @@ static int kdamond_fn(void *data)
kdamond_init_regions(ctx);
while (!kdamond_need_stop(ctx)) {
kdamond_prepare_access_checks(ctx);
+   if (ctx->sample_cb)
+   ctx->sample_cb(ctx);
 
usleep_range(ctx->sample_interval, ctx->sample_interval + 1);
 
@@ -857,6 +859,8 @@ static int kdamond_fn(void *data)
 
if (kdamond_aggregate_interval_passed(ctx)) {
kdamond_merge_regions(ctx, max_nr_accesses / 10);
+   if (ctx->aggregate_cb)
+   ctx->aggregate_cb(ctx);
kdamond_reset_aggregated(ctx);
kdamond_split_regions(ctx);
}
-- 
2.17.1
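
To illustrate the intended use, an in-tree client could hook the
aggregation step roughly as below.  This is a sketch only: it assumes
the iteration helpers (damon_for_each_task/_region, currently defined in
mm/damon.c) and the region fields shown in the earlier patches are
visible to the caller.

#include <linux/damon.h>

/* Print each region's access count once per aggregation interval.
 * Called from kdamond itself, so the results can be read without
 * taking kdamond_lock. */
static void my_aggregate_cb(struct damon_ctx *ctx)
{
	struct damon_task *t;

	damon_for_each_task(t, ctx) {
		struct damon_region *r;

		damon_for_each_region(r, t)
			pr_info("pid %d: [%#lx, %#lx) nr_accesses %u\n",
				t->pid, r->vm_start, r->vm_end,
				r->nr_accesses);
	}
}

The client would then set ctx->aggregate_cb = my_aggregate_cb before
calling damon_start(ctx).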



[PATCH RFC 05/13] vhost/net: pass net specific struct pointer

2020-06-02 Thread Michael S. Tsirkin
In preparation for further cleanup, pass net specific pointer
to ubuf callbacks so we can move net specific fields
out to net structures.

Signed-off-by: Michael S. Tsirkin 
---
 drivers/vhost/net.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 2927f02cc7e1..749a9cf51a59 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -94,7 +94,7 @@ struct vhost_net_ubuf_ref {
 */
atomic_t refcount;
wait_queue_head_t wait;
-   struct vhost_virtqueue *vq;
+   struct vhost_net_virtqueue *nvq;
 };
 
 #define VHOST_NET_BATCH 64
@@ -231,7 +231,7 @@ static void vhost_net_enable_zcopy(int vq)
 }
 
 static struct vhost_net_ubuf_ref *
-vhost_net_ubuf_alloc(struct vhost_virtqueue *vq, bool zcopy)
+vhost_net_ubuf_alloc(struct vhost_net_virtqueue *nvq, bool zcopy)
 {
struct vhost_net_ubuf_ref *ubufs;
/* No zero copy backend? Nothing to count. */
@@ -242,7 +242,7 @@ vhost_net_ubuf_alloc(struct vhost_virtqueue *vq, bool zcopy)
return ERR_PTR(-ENOMEM);
atomic_set(&ubufs->refcount, 1);
init_waitqueue_head(&ubufs->wait);
-   ubufs->vq = vq;
+   ubufs->nvq = nvq;
return ubufs;
 }
 
@@ -384,13 +384,13 @@ static void vhost_zerocopy_signal_used(struct vhost_net 
*net,
 static void vhost_zerocopy_callback(struct ubuf_info *ubuf, bool success)
 {
struct vhost_net_ubuf_ref *ubufs = ubuf->ctx;
-   struct vhost_virtqueue *vq = ubufs->vq;
+   struct vhost_net_virtqueue *nvq = ubufs->nvq;
int cnt;
 
rcu_read_lock_bh();
 
/* set len to mark this desc buffers done DMA */
-   vq->heads[ubuf->desc].len = success ?
+   nvq->vq.heads[ubuf->desc].in_len = success ?
VHOST_DMA_DONE_LEN : VHOST_DMA_FAILED_LEN;
cnt = vhost_net_ubuf_put(ubufs);
 
@@ -402,7 +402,7 @@ static void vhost_zerocopy_callback(struct ubuf_info *ubuf, 
bool success)
 * less than 10% of times).
 */
if (cnt <= 1 || !(cnt % 16))
-   vhost_poll_queue(&vq->poll);
+   vhost_poll_queue(&nvq->vq.poll);
 
rcu_read_unlock_bh();
 }
@@ -1525,7 +1525,7 @@ static long vhost_net_set_backend(struct vhost_net *n, 
unsigned index, int fd)
/* start polling new socket */
oldsock = vhost_vq_get_backend(vq);
if (sock != oldsock) {
-   ubufs = vhost_net_ubuf_alloc(vq,
+   ubufs = vhost_net_ubuf_alloc(nvq,
 sock && vhost_sock_zcopy(sock));
if (IS_ERR(ubufs)) {
r = PTR_ERR(ubufs);
-- 
MST



[PATCH RFC 09/13] vhost/net: avoid iov length math

2020-06-02 Thread Michael S. Tsirkin
Now that the API exposes the buffer length, we no longer need to
scan the IOVs to figure it out.

Signed-off-by: Michael S. Tsirkin 
---
 drivers/vhost/net.c | 8 +++-
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 47af3d1ce3dd..36843058182b 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -607,11 +607,9 @@ static bool vhost_exceeds_maxpend(struct vhost_net *net)
 }
 
 static size_t init_iov_iter(struct vhost_virtqueue *vq, struct iov_iter *iter,
-   size_t hdr_size, int out)
+   size_t len, size_t hdr_size, int out)
 {
/* Skip header. TODO: support TSO. */
-   size_t len = iov_length(vq->iov, out);
-
iov_iter_init(iter, WRITE, vq->iov, out, len);
iov_iter_advance(iter, hdr_size);
 
@@ -640,7 +638,7 @@ static int get_tx_bufs(struct vhost_net *net,
}
 
/* Sanity check */
-   *len = init_iov_iter(vq, &msg->msg_iter, nvq->vhost_hlen, *out);
+   *len = init_iov_iter(vq, &msg->msg_iter, buf->out_len, nvq->vhost_hlen, 
*out);
if (*len == 0) {
vq_err(vq, "Unexpected header len for TX: %zd expected %zd\n",
*len, nvq->vhost_hlen);
@@ -1080,7 +1078,7 @@ static int get_rx_bufs(struct vhost_virtqueue *vq,
nlogs += *log_num;
log += *log_num;
}
-   len = iov_length(vq->iov + seg, in);
+   len = bufs[bufcount].in_len;
datalen -= len;
++bufcount;
seg += in;
-- 
MST
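
For reference, the iov_length() scan this patch removes is just a linear
walk over the descriptor's iovecs; a sketch of the equivalent (the real
helper lives in include/linux/uio.h):

static size_t iov_length_equiv(const struct iovec *iov, unsigned long nr_segs)
{
	size_t len = 0;
	unsigned long seg;

	/* Sum the lengths of all segments. */
	for (seg = 0; seg < nr_segs; seg++)
		len += iov[seg].iov_len;
	return len;
}

With vhost_buf carrying out_len/in_len, this per-buffer O(n) walk goes
away.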



[PATCH RFC 10/13] vhost/test: convert to the buf API

2020-06-02 Thread Michael S. Tsirkin
Signed-off-by: Michael S. Tsirkin 
---
 drivers/vhost/test.c | 20 +++-
 1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/drivers/vhost/test.c b/drivers/vhost/test.c
index 02806d6f84ef..251fd2bf74a3 100644
--- a/drivers/vhost/test.c
+++ b/drivers/vhost/test.c
@@ -44,9 +44,10 @@ static void handle_vq(struct vhost_test *n)
 {
struct vhost_virtqueue *vq = &n->vqs[VHOST_TEST_VQ];
unsigned out, in;
-   int head;
+   int ret;
size_t len, total_len = 0;
void *private;
+   struct vhost_buf buf;
 
mutex_lock(&vq->mutex);
private = vhost_vq_get_backend(vq);
@@ -58,15 +59,15 @@ static void handle_vq(struct vhost_test *n)
vhost_disable_notify(&n->dev, vq);
 
for (;;) {
-   head = vhost_get_vq_desc(vq, vq->iov,
-ARRAY_SIZE(vq->iov),
-&out, &in,
-NULL, NULL);
+   ret = vhost_get_avail_buf(vq, vq->iov, &buf,
+ ARRAY_SIZE(vq->iov),
+ &out, &in,
+ NULL, NULL);
/* On error, stop handling until the next kick. */
-   if (unlikely(head < 0))
+   if (unlikely(ret < 0))
break;
/* Nothing new?  Wait for eventfd to tell us they refilled. */
-   if (head == vq->num) {
+   if (!ret) {
if (unlikely(vhost_enable_notify(&n->dev, vq))) {
vhost_disable_notify(&n->dev, vq);
continue;
@@ -78,13 +79,14 @@ static void handle_vq(struct vhost_test *n)
   "out %d, int %d\n", out, in);
break;
}
-   len = iov_length(vq->iov, out);
+   len = buf.out_len;
/* Sanity check */
if (!len) {
vq_err(vq, "Unexpected 0 len for TX\n");
break;
}
-   vhost_add_used_and_signal(>dev, vq, head, 0);
+   vhost_put_used_buf(vq, &buf);
+   vhost_signal(&n->dev, vq);
total_len += len;
if (unlikely(vhost_exceeds_weight(vq, 0, total_len)))
break;
-- 
MST



[PATCH RFC 04/13] vhost: cleanup fetch_buf return code handling

2020-06-02 Thread Michael S. Tsirkin
The return code of fetch_buf is confusing, so callers resort to
tricks to get to sane values. Let's switch to something standard:
0 empty, >0 non-empty, <0 error.

Signed-off-by: Michael S. Tsirkin 
---
 drivers/vhost/vhost.c | 24 
 1 file changed, 16 insertions(+), 8 deletions(-)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index aca2a5b0d078..bd52b44b0d23 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -2146,6 +2146,8 @@ static int fetch_indirect_descs(struct vhost_virtqueue 
*vq,
return 0;
 }
 
+/* This function returns a value > 0 if a descriptor was found, or 0 if none 
were found.
+ * A negative code is returned on error. */
 static int fetch_buf(struct vhost_virtqueue *vq)
 {
unsigned int i, head, found = 0;
@@ -2162,7 +2164,7 @@ static int fetch_buf(struct vhost_virtqueue *vq)
if (unlikely(vq->avail_idx == vq->last_avail_idx)) {
/* If we already have work to do, don't bother re-checking. */
if (likely(vq->ndescs))
-   return vq->num;
+   return 0;
 
if (unlikely(vhost_get_avail_idx(vq, &avail_idx))) {
vq_err(vq, "Failed to access avail idx at %p\n",
&vq->avail->idx);
 * invalid.
 */
if (vq->avail_idx == last_avail_idx)
-   return vq->num;
+   return 0;
 
/* Only get avail ring entries after they have been
 * exposed by guest.
@@ -2251,12 +2253,14 @@ static int fetch_buf(struct vhost_virtqueue *vq)
/* On success, increment avail index. */
vq->last_avail_idx++;
 
-   return 0;
+   return 1;
 }
 
+/* This function returns a value > 0 if a descriptor was found, or 0 if none 
were found.
+ * A negative code is returned on error. */
 static int fetch_descs(struct vhost_virtqueue *vq)
 {
-   int ret = 0;
+   int ret;
 
if (unlikely(vq->first_desc >= vq->ndescs)) {
vq->first_desc = 0;
@@ -2266,10 +2270,14 @@ static int fetch_descs(struct vhost_virtqueue *vq)
if (vq->ndescs)
return 0;
 
-   while (!ret && vq->ndescs <= vhost_vq_num_batch_descs(vq))
-   ret = fetch_buf(vq);
+   for (ret = 1;
+ret > 0 && vq->ndescs <= vhost_vq_num_batch_descs(vq);
+ret = fetch_buf(vq))
+   ;
 
-   return vq->ndescs ? 0 : ret;
+   /* On success we expect some descs */
+   BUG_ON(ret > 0 && !vq->ndescs);
+   return ret;
 }
 
 /* This looks in the virtqueue and for the first available buffer, and converts
@@ -2288,7 +2296,7 @@ int vhost_get_vq_desc(struct vhost_virtqueue *vq,
int ret = fetch_descs(vq);
int i;
 
-   if (ret)
+   if (ret <= 0)
return ret;
 
/* Now convert to IOV */
-- 
MST



Re: [PATCH] iommu/dma: limit iova free size to unmmaped iova

2020-06-02 Thread Robin Murphy

On 2020-05-26 08:19, gup...@codeaurora.org wrote:

On 2020-05-22 14:54, Robin Murphy wrote:

On 2020-05-22 07:25, gup...@codeaurora.org wrote:

On 2020-05-22 01:46, Robin Murphy wrote:

On 2020-05-21 12:30, Prakash Gupta wrote:

I agree, we shouldn't be freeing the partial iova. Instead, just making
sure the unmap was successful before freeing the iova should be
sufficient. So the change can instead be something like this:

-    iommu_dma_free_iova(cookie, dma_addr, size);
+    if (unmapped)
+    iommu_dma_free_iova(cookie, dma_addr, size);


TBH my gut feeling here is that you're really just trying to treat a
symptom of another bug elsewhere, namely some driver calling
dma_unmap_* or dma_free_* with the wrong address or size in the first
place.

This condition would arise only if a driver calls dma_unmap/free_* with
a 0 iova_pfn. This will be flagged with a warning during unmap but will
trigger a panic later on while doing an unrelated dma_map/unmap_*. If
unmap has already failed for an invalid iova, there is no reason we
should consider it a valid iova and free it. This part should be fixed.
iova and free. This part should be fixed.


I disagree. In general, if drivers call the DMA API incorrectly it is
liable to lead to data loss, memory corruption, and various other
unpleasant misbehaviour - it is not the DMA layer's job to attempt to
paper over driver bugs.

There *is* an argument for downgrading the BUG_ON() in
iova_magazine_free_pfns() to a WARN_ON(), since frankly it isn't a
sufficiently serious condition to justify killing the whole machine
immediately, but NAK to bodging the iommu-dma mid-layer to "fix" that.
A serious bug already happened elsewhere, so trying to hide the
fallout really doesn't help anyone.


Sorry for the delayed response, it was a long weekend.
I agree that an invalid DMA API call can result in unexpected issues and
the client should fix it, but the present behavior makes it difficult to
catch cases where a driver is making wrong DMA API calls. When an
invalid iova pfn is passed it doesn't fail then and there, even though
the DMA layer is aware of the iova being invalid. It fails much later,
in the context of a valid map/unmap, with a BUG_ON().

Downgrading the BUG_ON() to a WARN_ON() in iova_magazine_free_pfns()
will not help much, as the invalid iova will cause a NULL pointer
dereference.


Obviously I didn't mean a literal s/BUG/WARN/ substitution - some 
additional control flow to actually handle the error case was implied.


I'll write up the patch myself, since it's easier than further debating.

I see no reason why the DMA layer wants to free an iova for which unmap
failed.

IMHO queuing an invalid iova (which already failed unmap) to the rcache,
which is eventually going to crash the system, looks like an iommu-dma
layer issue.


What if the unmap fails because the address range is already entirely 
unmapped? Freeing the IOVA (or at least attempting to) would be 
logically appropriate in that case. In fact some IOMMU drivers might not 
even consider that a failure, so the DMA layer may not even be aware 
that it's been handed a bogus unallocated address.


The point is that unmapping *doesn't* fail under normal and correct 
operation, so the DMA layer should not expect to have to handle it. Even 
if it does happen, that's a highly exceptional case that the DMA layer 
cannot recover from by itself; at best it can just push the problem 
elsewhere. It's pretty hard to justify doing extra work to simply move 
an exceptional problem around without really addressing it.


And in this particular case, personally I would *much* rather see 
warnings spewing from both the pagetable and IOVA code as early as 
possible to clearly indicate that the DMA layer itself has been thrown 
out of sync, than just have warnings that might represent some other 
source of pagetable corruption (or at worst, depending on the pagetable 
code, no warnings at all and only have dma_map_*() calls quietly start 
failing much, much later due to all the IOVA space having been leaked by 
bad unmaps).


Robin.


[PATCH RFC 06/13] vhost: reorder functions

2020-06-02 Thread Michael S. Tsirkin
Reorder functions in the file so they do not rely on forward
declarations, in preparation for making them static
down the road.

Signed-off-by: Michael S. Tsirkin 
---
 drivers/vhost/vhost.c | 40 
 1 file changed, 20 insertions(+), 20 deletions(-)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index bd52b44b0d23..b4a6e44d56a8 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -2256,6 +2256,13 @@ static int fetch_buf(struct vhost_virtqueue *vq)
return 1;
 }
 
+/* Reverse the effect of vhost_get_vq_desc. Useful for error handling. */
+void vhost_discard_vq_desc(struct vhost_virtqueue *vq, int n)
+{
+   vq->last_avail_idx -= n;
+}
+EXPORT_SYMBOL_GPL(vhost_discard_vq_desc);
+
 /* This function returns a value > 0 if a descriptor was found, or 0 if none 
were found.
  * A negative code is returned on error. */
 static int fetch_descs(struct vhost_virtqueue *vq)
@@ -2370,26 +2377,6 @@ int vhost_get_vq_desc(struct vhost_virtqueue *vq,
 }
 EXPORT_SYMBOL_GPL(vhost_get_vq_desc);
 
-/* Reverse the effect of vhost_get_vq_desc. Useful for error handling. */
-void vhost_discard_vq_desc(struct vhost_virtqueue *vq, int n)
-{
-   vq->last_avail_idx -= n;
-}
-EXPORT_SYMBOL_GPL(vhost_discard_vq_desc);
-
-/* After we've used one of their buffers, we tell them about it.  We'll then
- * want to notify the guest, using eventfd. */
-int vhost_add_used(struct vhost_virtqueue *vq, unsigned int head, int len)
-{
-   struct vring_used_elem heads = {
-   cpu_to_vhost32(vq, head),
-   cpu_to_vhost32(vq, len)
-   };
-
-   return vhost_add_used_n(vq, &heads, 1);
-}
-EXPORT_SYMBOL_GPL(vhost_add_used);
-
 static int __vhost_add_used_n(struct vhost_virtqueue *vq,
struct vring_used_elem *heads,
unsigned count)
@@ -2459,6 +2446,19 @@ int vhost_add_used_n(struct vhost_virtqueue *vq, struct 
vring_used_elem *heads,
 }
 EXPORT_SYMBOL_GPL(vhost_add_used_n);
 
+/* After we've used one of their buffers, we tell them about it.  We'll then
+ * want to notify the guest, using eventfd. */
+int vhost_add_used(struct vhost_virtqueue *vq, unsigned int head, int len)
+{
+   struct vring_used_elem heads = {
+   cpu_to_vhost32(vq, head),
+   cpu_to_vhost32(vq, len)
+   };
+
+   return vhost_add_used_n(vq, &heads, 1);
+}
+EXPORT_SYMBOL_GPL(vhost_add_used);
+
 static bool vhost_notify(struct vhost_dev *dev, struct vhost_virtqueue *vq)
 {
__u16 old, new;
-- 
MST



[PATCH RFC 02/13] vhost: use batched version by default

2020-06-02 Thread Michael S. Tsirkin
As testing shows no performance change, switch to the batched version by default now.

Signed-off-by: Michael S. Tsirkin 
Signed-off-by: Eugenio Pérez 
Link: https://lore.kernel.org/r/20200401183118.8334-3-epere...@redhat.com
Signed-off-by: Michael S. Tsirkin 
---
 drivers/vhost/vhost.c | 251 +-
 drivers/vhost/vhost.h |   4 -
 2 files changed, 2 insertions(+), 253 deletions(-)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 105fc97af2c8..8f9a07282625 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -2038,253 +2038,6 @@ static unsigned next_desc(struct vhost_virtqueue *vq, 
struct vring_desc *desc)
return next;
 }
 
-static int get_indirect(struct vhost_virtqueue *vq,
-   struct iovec iov[], unsigned int iov_size,
-   unsigned int *out_num, unsigned int *in_num,
-   struct vhost_log *log, unsigned int *log_num,
-   struct vring_desc *indirect)
-{
-   struct vring_desc desc;
-   unsigned int i = 0, count, found = 0;
-   u32 len = vhost32_to_cpu(vq, indirect->len);
-   struct iov_iter from;
-   int ret, access;
-
-   /* Sanity check */
-   if (unlikely(len % sizeof desc)) {
-   vq_err(vq, "Invalid length in indirect descriptor: "
-  "len 0x%llx not multiple of 0x%zx\n",
-  (unsigned long long)len,
-  sizeof desc);
-   return -EINVAL;
-   }
-
-   ret = translate_desc(vq, vhost64_to_cpu(vq, indirect->addr), len, 
vq->indirect,
-UIO_MAXIOV, VHOST_ACCESS_RO);
-   if (unlikely(ret < 0)) {
-   if (ret != -EAGAIN)
-   vq_err(vq, "Translation failure %d in indirect.\n", 
ret);
-   return ret;
-   }
-   iov_iter_init(&from, READ, vq->indirect, ret, len);
-
-   /* We will use the result as an address to read from, so most
-* architectures only need a compiler barrier here. */
-   read_barrier_depends();
-
-   count = len / sizeof desc;
-   /* Buffers are chained via a 16 bit next field, so
-* we can have at most 2^16 of these. */
-   if (unlikely(count > USHRT_MAX + 1)) {
-   vq_err(vq, "Indirect buffer length too big: %d\n",
-  indirect->len);
-   return -E2BIG;
-   }
-
-   do {
-   unsigned iov_count = *in_num + *out_num;
-   if (unlikely(++found > count)) {
-   vq_err(vq, "Loop detected: last one at %u "
-  "indirect size %u\n",
-  i, count);
-   return -EINVAL;
-   }
-   if (unlikely(!copy_from_iter_full(&desc, sizeof(desc), &from))) 
{
-   vq_err(vq, "Failed indirect descriptor: idx %d, %zx\n",
-  i, (size_t)vhost64_to_cpu(vq, indirect->addr) + 
i * sizeof desc);
-   return -EINVAL;
-   }
-   if (unlikely(desc.flags & cpu_to_vhost16(vq, 
VRING_DESC_F_INDIRECT))) {
-   vq_err(vq, "Nested indirect descriptor: idx %d, %zx\n",
-  i, (size_t)vhost64_to_cpu(vq, indirect->addr) + 
i * sizeof desc);
-   return -EINVAL;
-   }
-
-   if (desc.flags & cpu_to_vhost16(vq, VRING_DESC_F_WRITE))
-   access = VHOST_ACCESS_WO;
-   else
-   access = VHOST_ACCESS_RO;
-
-   ret = translate_desc(vq, vhost64_to_cpu(vq, desc.addr),
-vhost32_to_cpu(vq, desc.len), iov + 
iov_count,
-iov_size - iov_count, access);
-   if (unlikely(ret < 0)) {
-   if (ret != -EAGAIN)
-   vq_err(vq, "Translation failure %d indirect idx 
%d\n",
-   ret, i);
-   return ret;
-   }
-   /* If this is an input descriptor, increment that count. */
-   if (access == VHOST_ACCESS_WO) {
-   *in_num += ret;
-   if (unlikely(log && ret)) {
-   log[*log_num].addr = vhost64_to_cpu(vq, 
desc.addr);
-   log[*log_num].len = vhost32_to_cpu(vq, 
desc.len);
-   ++*log_num;
-   }
-   } else {
-   /* If it's an output descriptor, they're all supposed
-* to come before any input descriptors. */
-   if (unlikely(*in_num)) {
-   vq_err(vq, "Indirect descriptor "
-  "has out after in: idx %d\n", i);
-   return -EINVAL;
-   }
-   

[PATCH RFC 13/13] vhost: drop head based APIs

2020-06-02 Thread Michael S. Tsirkin
Everyone's using buf APIs, no need for head based ones anymore.

Signed-off-by: Michael S. Tsirkin 
---
 drivers/vhost/vhost.c | 36 
 drivers/vhost/vhost.h | 12 
 2 files changed, 8 insertions(+), 40 deletions(-)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index be822f0c9428..412923cc96df 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -2256,12 +2256,12 @@ static int fetch_buf(struct vhost_virtqueue *vq)
return 1;
 }
 
-/* Reverse the effect of vhost_get_vq_desc. Useful for error handling. */
+/* Revert the effect of fetch_buf. Useful for error handling. */
+static
 void vhost_discard_vq_desc(struct vhost_virtqueue *vq, int n)
 {
vq->last_avail_idx -= n;
 }
-EXPORT_SYMBOL_GPL(vhost_discard_vq_desc);
 
 /* This function returns a value > 0 if a descriptor was found, or 0 if none 
were found.
  * A negative code is returned on error. */
@@ -2421,8 +2421,7 @@ static int __vhost_add_used_n(struct vhost_virtqueue *vq,
return 0;
 }
 
-/* After we've used one of their buffers, we tell them about it.  We'll then
- * want to notify the guest, using eventfd. */
+static
 int vhost_add_used_n(struct vhost_virtqueue *vq, struct vring_used_elem *heads,
 unsigned count)
 {
@@ -2456,10 +2455,8 @@ int vhost_add_used_n(struct vhost_virtqueue *vq, struct 
vring_used_elem *heads,
}
return r;
 }
-EXPORT_SYMBOL_GPL(vhost_add_used_n);
 
-/* After we've used one of their buffers, we tell them about it.  We'll then
- * want to notify the guest, using eventfd. */
+static
 int vhost_add_used(struct vhost_virtqueue *vq, unsigned int head, int len)
 {
struct vring_used_elem heads = {
@@ -2469,14 +2466,17 @@ int vhost_add_used(struct vhost_virtqueue *vq, unsigned 
int head, int len)
 
return vhost_add_used_n(vq, &heads, 1);
 }
-EXPORT_SYMBOL_GPL(vhost_add_used);
 
+/* After we've used one of their buffers, we tell them about it.  We'll then
+ * want to notify the guest, using vhost_signal. */
 int vhost_put_used_buf(struct vhost_virtqueue *vq, struct vhost_buf *buf)
 {
return vhost_add_used(vq, buf->id, buf->in_len);
 }
 EXPORT_SYMBOL_GPL(vhost_put_used_buf);
 
+/* After we've used one of their buffers, we tell them about it.  We'll then
+ * want to notify the guest, using vhost_signal. */
 int vhost_put_used_n_bufs(struct vhost_virtqueue *vq,
  struct vhost_buf *bufs, unsigned count)
 {
@@ -2537,26 +2537,6 @@ void vhost_signal(struct vhost_dev *dev, struct 
vhost_virtqueue *vq)
 }
 EXPORT_SYMBOL_GPL(vhost_signal);
 
-/* And here's the combo meal deal.  Supersize me! */
-void vhost_add_used_and_signal(struct vhost_dev *dev,
-  struct vhost_virtqueue *vq,
-  unsigned int head, int len)
-{
-   vhost_add_used(vq, head, len);
-   vhost_signal(dev, vq);
-}
-EXPORT_SYMBOL_GPL(vhost_add_used_and_signal);
-
-/* multi-buffer version of vhost_add_used_and_signal */
-void vhost_add_used_and_signal_n(struct vhost_dev *dev,
-struct vhost_virtqueue *vq,
-struct vring_used_elem *heads, unsigned count)
-{
-   vhost_add_used_n(vq, heads, count);
-   vhost_signal(dev, vq);
-}
-EXPORT_SYMBOL_GPL(vhost_add_used_and_signal_n);
-
 /* return true if we're sure that avaiable ring is empty */
 bool vhost_vq_avail_empty(struct vhost_dev *dev, struct vhost_virtqueue *vq)
 {
diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
index 6c10e99ff334..4fcf59153fc7 100644
--- a/drivers/vhost/vhost.h
+++ b/drivers/vhost/vhost.h
@@ -195,11 +195,6 @@ long vhost_vring_ioctl(struct vhost_dev *d, unsigned int 
ioctl, void __user *arg
 bool vhost_vq_access_ok(struct vhost_virtqueue *vq);
 bool vhost_log_access_ok(struct vhost_dev *);
 
-int vhost_get_vq_desc(struct vhost_virtqueue *,
- struct iovec iov[], unsigned int iov_count,
- unsigned int *out_num, unsigned int *in_num,
- struct vhost_log *log, unsigned int *log_num);
-void vhost_discard_vq_desc(struct vhost_virtqueue *, int n);
 int vhost_get_avail_buf(struct vhost_virtqueue *, struct vhost_buf *buf,
struct iovec iov[], unsigned int iov_count,
unsigned int *out_num, unsigned int *in_num,
@@ -207,13 +202,6 @@ int vhost_get_avail_buf(struct vhost_virtqueue *, struct 
vhost_buf *buf,
 void vhost_discard_avail_bufs(struct vhost_virtqueue *,
  struct vhost_buf *, unsigned count);
 int vhost_vq_init_access(struct vhost_virtqueue *);
-int vhost_add_used(struct vhost_virtqueue *, unsigned int head, int len);
-int vhost_add_used_n(struct vhost_virtqueue *, struct vring_used_elem *heads,
-unsigned count);
-void vhost_add_used_and_signal(struct vhost_dev *, struct vhost_virtqueue *,
-  unsigned int id, int len);
-void 

[PATCH RFC 08/13] vhost/net: convert to new API: heads->bufs

2020-06-02 Thread Michael S. Tsirkin
Convert vhost net to use the new format-agnostic API.
In particular, don't poke at vq internals such as the
heads array.

Signed-off-by: Michael S. Tsirkin 
---
 drivers/vhost/net.c | 153 +++-
 1 file changed, 81 insertions(+), 72 deletions(-)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 749a9cf51a59..47af3d1ce3dd 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -59,13 +59,13 @@ MODULE_PARM_DESC(experimental_zcopytx, "Enable Zero Copy 
TX;"
  * status internally; used for zerocopy tx only.
  */
 /* Lower device DMA failed */
-#define VHOST_DMA_FAILED_LEN   ((__force __virtio32)3)
+#define VHOST_DMA_FAILED_LEN   (3)
 /* Lower device DMA done */
-#define VHOST_DMA_DONE_LEN ((__force __virtio32)2)
+#define VHOST_DMA_DONE_LEN (2)
 /* Lower device DMA in progress */
-#define VHOST_DMA_IN_PROGRESS  ((__force __virtio32)1)
+#define VHOST_DMA_IN_PROGRESS  (1)
 /* Buffer unused */
-#define VHOST_DMA_CLEAR_LEN((__force __virtio32)0)
+#define VHOST_DMA_CLEAR_LEN(0)
 
 #define VHOST_DMA_IS_DONE(len) ((__force u32)(len) >= (__force 
u32)VHOST_DMA_DONE_LEN)
 
@@ -112,9 +112,12 @@ struct vhost_net_virtqueue {
/* last used idx for outstanding DMA zerocopy buffers */
int upend_idx;
/* For TX, first used idx for DMA done zerocopy buffers
-* For RX, number of batched heads
+* For RX, number of batched bufs
 */
int done_idx;
+   /* Outstanding user bufs. UIO_MAXIOV in length. */
+   /* TODO: we can make this smaller for sure. */
+   struct vhost_buf *bufs;
/* Number of XDP frames batched */
int batched_xdp;
/* an array of userspace buffers info */
@@ -271,6 +274,8 @@ static void vhost_net_clear_ubuf_info(struct vhost_net *n)
int i;
 
for (i = 0; i < VHOST_NET_VQ_MAX; ++i) {
+   kfree(n->vqs[i].bufs);
+   n->vqs[i].bufs = NULL;
kfree(n->vqs[i].ubuf_info);
n->vqs[i].ubuf_info = NULL;
}
@@ -282,6 +287,12 @@ static int vhost_net_set_ubuf_info(struct vhost_net *n)
int i;
 
for (i = 0; i < VHOST_NET_VQ_MAX; ++i) {
+   n->vqs[i].bufs = kmalloc_array(UIO_MAXIOV,
+  sizeof(*n->vqs[i].bufs),
+  GFP_KERNEL);
+   if (!n->vqs[i].bufs)
+   goto err;
+
zcopy = vhost_net_zcopy_mask & (0x1 << i);
if (!zcopy)
continue;
@@ -364,18 +375,18 @@ static void vhost_zerocopy_signal_used(struct vhost_net 
*net,
int j = 0;
 
for (i = nvq->done_idx; i != nvq->upend_idx; i = (i + 1) % UIO_MAXIOV) {
-   if (vq->heads[i].len == VHOST_DMA_FAILED_LEN)
+   if (nvq->bufs[i].in_len == VHOST_DMA_FAILED_LEN)
vhost_net_tx_err(net);
-   if (VHOST_DMA_IS_DONE(vq->heads[i].len)) {
-   vq->heads[i].len = VHOST_DMA_CLEAR_LEN;
+   if (VHOST_DMA_IS_DONE(nvq->bufs[i].in_len)) {
+   nvq->bufs[i].in_len = VHOST_DMA_CLEAR_LEN;
++j;
} else
break;
}
while (j) {
add = min(UIO_MAXIOV - nvq->done_idx, j);
-   vhost_add_used_and_signal_n(vq->dev, vq,
-   &vq->heads[nvq->done_idx], add);
+   vhost_put_used_n_bufs(vq, &nvq->bufs[nvq->done_idx], add);
+   vhost_signal(vq->dev, vq);
nvq->done_idx = (nvq->done_idx + add) % UIO_MAXIOV;
j -= add;
}
@@ -390,7 +401,7 @@ static void vhost_zerocopy_callback(struct ubuf_info *ubuf, 
bool success)
rcu_read_lock_bh();
 
/* set len to mark this desc buffers done DMA */
-   nvq->vq.heads[ubuf->desc].in_len = success ?
+   nvq->bufs[ubuf->desc].in_len = success ?
VHOST_DMA_DONE_LEN : VHOST_DMA_FAILED_LEN;
cnt = vhost_net_ubuf_put(ubufs);
 
@@ -452,7 +463,8 @@ static void vhost_net_signal_used(struct 
vhost_net_virtqueue *nvq)
if (!nvq->done_idx)
return;
 
-   vhost_add_used_and_signal_n(dev, vq, vq->heads, nvq->done_idx);
+   vhost_put_used_n_bufs(vq, nvq->bufs, nvq->done_idx);
+   vhost_signal(dev, vq);
nvq->done_idx = 0;
 }
 
@@ -558,6 +570,7 @@ static void vhost_net_busy_poll(struct vhost_net *net,
 
 static int vhost_net_tx_get_vq_desc(struct vhost_net *net,
struct vhost_net_virtqueue *tnvq,
+   struct vhost_buf *buf,
unsigned int *out_num, unsigned int *in_num,
struct msghdr *msghdr, bool *busyloop_intr)
 {
@@ -565,10 +578,10 @@ static int vhost_net_tx_get_vq_desc(struct vhost_net *net,
struct vhost_virtqueue *rvq = &rnvq->vq;

[PATCH RFC 03/13] vhost: batching fetches

2020-06-02 Thread Michael S. Tsirkin
With this patch applied, new and old code perform identically.

Lots of extra optimizations are now possible, e.g.
we can fetch multiple heads with copy_from/to_user now.
We can get rid of maintaining the log array.  Etc etc.

Signed-off-by: Michael S. Tsirkin 
Signed-off-by: Eugenio Pérez 
Link: https://lore.kernel.org/r/20200401183118.8334-4-epere...@redhat.com
Signed-off-by: Michael S. Tsirkin 
---
 drivers/vhost/test.c  |  2 +-
 drivers/vhost/vhost.c | 47 ++-
 drivers/vhost/vhost.h |  5 -
 3 files changed, 47 insertions(+), 7 deletions(-)

diff --git a/drivers/vhost/test.c b/drivers/vhost/test.c
index 9a3a09005e03..02806d6f84ef 100644
--- a/drivers/vhost/test.c
+++ b/drivers/vhost/test.c
@@ -119,7 +119,7 @@ static int vhost_test_open(struct inode *inode, struct file 
*f)
dev = >dev;
vqs[VHOST_TEST_VQ] = &n->vqs[VHOST_TEST_VQ];
n->vqs[VHOST_TEST_VQ].handle_kick = handle_vq_kick;
-   vhost_dev_init(dev, vqs, VHOST_TEST_VQ_MAX, UIO_MAXIOV,
+   vhost_dev_init(dev, vqs, VHOST_TEST_VQ_MAX, UIO_MAXIOV + 64,
   VHOST_TEST_PKT_WEIGHT, VHOST_TEST_WEIGHT, NULL);
 
f->private_data = n;
diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 8f9a07282625..aca2a5b0d078 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -299,6 +299,7 @@ static void vhost_vq_reset(struct vhost_dev *dev,
 {
vq->num = 1;
vq->ndescs = 0;
+   vq->first_desc = 0;
vq->desc = NULL;
vq->avail = NULL;
vq->used = NULL;
@@ -367,6 +368,11 @@ static int vhost_worker(void *data)
return 0;
 }
 
+static int vhost_vq_num_batch_descs(struct vhost_virtqueue *vq)
+{
+   return vq->max_descs - UIO_MAXIOV;
+}
+
 static void vhost_vq_free_iovecs(struct vhost_virtqueue *vq)
 {
kfree(vq->descs);
@@ -389,6 +395,9 @@ static long vhost_dev_alloc_iovecs(struct vhost_dev *dev)
for (i = 0; i < dev->nvqs; ++i) {
vq = dev->vqs[i];
vq->max_descs = dev->iov_limit;
+   if (vhost_vq_num_batch_descs(vq) < 0) {
+   return -EINVAL;
+   }
vq->descs = kmalloc_array(vq->max_descs,
  sizeof(*vq->descs),
  GFP_KERNEL);
@@ -1570,6 +1579,7 @@ long vhost_vring_ioctl(struct vhost_dev *d, unsigned int 
ioctl, void __user *arg
vq->last_avail_idx = s.num;
/* Forget the cached index value. */
vq->avail_idx = vq->last_avail_idx;
+   vq->ndescs = vq->first_desc = 0;
break;
case VHOST_GET_VRING_BASE:
s.index = idx;
@@ -2136,7 +2146,7 @@ static int fetch_indirect_descs(struct vhost_virtqueue 
*vq,
return 0;
 }
 
-static int fetch_descs(struct vhost_virtqueue *vq)
+static int fetch_buf(struct vhost_virtqueue *vq)
 {
unsigned int i, head, found = 0;
struct vhost_desc *last;
@@ -2149,7 +2159,11 @@ static int fetch_descs(struct vhost_virtqueue *vq)
/* Check it isn't doing very strange things with descriptor numbers. */
last_avail_idx = vq->last_avail_idx;
 
-   if (vq->avail_idx == vq->last_avail_idx) {
+   if (unlikely(vq->avail_idx == vq->last_avail_idx)) {
+   /* If we already have work to do, don't bother re-checking. */
+   if (likely(vq->ndescs))
+   return vq->num;
+
if (unlikely(vhost_get_avail_idx(vq, &avail_idx))) {
vq_err(vq, "Failed to access avail idx at %p\n",
&vq->avail->idx);
@@ -2240,6 +2254,24 @@ static int fetch_descs(struct vhost_virtqueue *vq)
return 0;
 }
 
+static int fetch_descs(struct vhost_virtqueue *vq)
+{
+   int ret = 0;
+
+   if (unlikely(vq->first_desc >= vq->ndescs)) {
+   vq->first_desc = 0;
+   vq->ndescs = 0;
+   }
+
+   if (vq->ndescs)
+   return 0;
+
+   while (!ret && vq->ndescs <= vhost_vq_num_batch_descs(vq))
+   ret = fetch_buf(vq);
+
+   return vq->ndescs ? 0 : ret;
+}
+
 /* This looks in the virtqueue and for the first available buffer, and converts
  * it to an iovec for convenient access.  Since descriptors consist of some
  * number of output then some number of input descriptors, it's actually two
@@ -2265,7 +2297,7 @@ int vhost_get_vq_desc(struct vhost_virtqueue *vq,
if (unlikely(log))
*log_num = 0;
 
-   for (i = 0; i < vq->ndescs; ++i) {
+   for (i = vq->first_desc; i < vq->ndescs; ++i) {
unsigned iov_count = *in_num + *out_num;
struct vhost_desc *desc = &vq->descs[i];
int access;
@@ -2311,14 +2343,19 @@ int vhost_get_vq_desc(struct vhost_virtqueue *vq,
}
 
ret = desc->id;
+
+   if (!(desc->flags & VRING_DESC_F_NEXT))
+   

[PATCH RFC 07/13] vhost: format-independent API for used buffers

2020-06-02 Thread Michael S. Tsirkin
Add a new API that doesn't assume used ring, heads, etc.
For now, we keep the old APIs around to make it easier
to convert drivers.

Signed-off-by: Michael S. Tsirkin 
---
 drivers/vhost/vhost.c | 52 ++-
 drivers/vhost/vhost.h | 17 +-
 2 files changed, 58 insertions(+), 11 deletions(-)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index b4a6e44d56a8..be822f0c9428 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -2292,13 +2292,12 @@ static int fetch_descs(struct vhost_virtqueue *vq)
  * number of output then some number of input descriptors, it's actually two
  * iovecs, but we pack them into one and note how many of each there were.
  *
- * This function returns the descriptor number found, or vq->num (which is
- * never a valid descriptor number) if none was found.  A negative code is
- * returned on error. */
-int vhost_get_vq_desc(struct vhost_virtqueue *vq,
- struct iovec iov[], unsigned int iov_size,
- unsigned int *out_num, unsigned int *in_num,
- struct vhost_log *log, unsigned int *log_num)
+ * This function returns a value > 0 if a descriptor was found, or 0 if none 
were found.
+ * A negative code is returned on error. */
+int vhost_get_avail_buf(struct vhost_virtqueue *vq, struct vhost_buf *buf,
+   struct iovec iov[], unsigned int iov_size,
+   unsigned int *out_num, unsigned int *in_num,
+   struct vhost_log *log, unsigned int *log_num)
 {
int ret = fetch_descs(vq);
int i;
@@ -2311,6 +2310,8 @@ int vhost_get_vq_desc(struct vhost_virtqueue *vq,
*out_num = *in_num = 0;
if (unlikely(log))
*log_num = 0;
+   buf->in_len = buf->out_len = 0;
+   buf->descs = 0;
 
for (i = vq->first_desc; i < vq->ndescs; ++i) {
unsigned iov_count = *in_num + *out_num;
@@ -2340,6 +2341,7 @@ int vhost_get_vq_desc(struct vhost_virtqueue *vq,
/* If this is an input descriptor,
 * increment that count. */
*in_num += ret;
+   buf->in_len += desc->len;
if (unlikely(log && ret)) {
log[*log_num].addr = desc->addr;
log[*log_num].len = desc->len;
@@ -2355,9 +2357,11 @@ int vhost_get_vq_desc(struct vhost_virtqueue *vq,
goto err;
}
*out_num += ret;
+   buf->out_len += desc->len;
}
 
-   ret = desc->id;
+   buf->id = desc->id;
+   ++buf->descs;
 
if (!(desc->flags & VRING_DESC_F_NEXT))
break;
@@ -2365,7 +2369,7 @@ int vhost_get_vq_desc(struct vhost_virtqueue *vq,
 
vq->first_desc = i + 1;
 
-   return ret;
+   return 1;
 
 err:
for (i = vq->first_desc; i < vq->ndescs; ++i)
@@ -2375,7 +2379,15 @@ int vhost_get_vq_desc(struct vhost_virtqueue *vq,
 
return ret;
 }
-EXPORT_SYMBOL_GPL(vhost_get_vq_desc);
+EXPORT_SYMBOL_GPL(vhost_get_avail_buf);
+
+/* Reverse the effect of vhost_get_avail_buf. Useful for error handling. */
+void vhost_discard_avail_bufs(struct vhost_virtqueue *vq,
+ struct vhost_buf *buf, unsigned count)
+{
+   vhost_discard_vq_desc(vq, count);
+}
+EXPORT_SYMBOL_GPL(vhost_discard_avail_bufs);
 
 static int __vhost_add_used_n(struct vhost_virtqueue *vq,
struct vring_used_elem *heads,
@@ -2459,6 +2471,26 @@ int vhost_add_used(struct vhost_virtqueue *vq, unsigned 
int head, int len)
 }
 EXPORT_SYMBOL_GPL(vhost_add_used);
 
+int vhost_put_used_buf(struct vhost_virtqueue *vq, struct vhost_buf *buf)
+{
+   return vhost_add_used(vq, buf->id, buf->in_len);
+}
+EXPORT_SYMBOL_GPL(vhost_put_used_buf);
+
+int vhost_put_used_n_bufs(struct vhost_virtqueue *vq,
+ struct vhost_buf *bufs, unsigned count)
+{
+   unsigned i;
+
+   for (i = 0; i < count; ++i) {
+   vq->heads[i].id = cpu_to_vhost32(vq, bufs[i].id);
+   vq->heads[i].len = cpu_to_vhost32(vq, bufs[i].in_len);
+   }
+
+   return vhost_add_used_n(vq, vq->heads, count);
+}
+EXPORT_SYMBOL_GPL(vhost_put_used_n_bufs);
+
 static bool vhost_notify(struct vhost_dev *dev, struct vhost_virtqueue *vq)
 {
__u16 old, new;
diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
index a67bda9792ec..6c10e99ff334 100644
--- a/drivers/vhost/vhost.h
+++ b/drivers/vhost/vhost.h
@@ -67,6 +67,13 @@ struct vhost_desc {
u16 id;
 };
 
+struct vhost_buf {
+   u32 out_len;
+   u32 in_len;
+   u16 descs;
+   u16 id;
+};
+
 /* The virtqueue structure describes a queue attached to a device. */
 struct vhost_virtqueue {
struct vhost_dev *dev;
@@ -193,7 +200,12 @@ int 
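
Taken together, a driver conversion under this API follows the shape
below.  This is a condensed sketch of the pattern the driver conversions
elsewhere in this series apply; bytes_written is a placeholder for
whatever the device actually produced.

	struct vhost_buf buf;
	unsigned int out, in;
	int ret;

	ret = vhost_get_avail_buf(vq, &buf, vq->iov, ARRAY_SIZE(vq->iov),
				  &out, &in, NULL, NULL);
	if (ret < 0)
		return;		/* error */
	if (!ret)
		return;		/* ring empty: re-enable notification */

	/* consume vq->iov; buf.out_len / buf.in_len replace iov_length() */

	buf.in_len = bytes_written;
	vhost_put_used_buf(vq, &buf);
	vhost_signal(vq->dev, vq);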

[PATCH RFC 11/13] vhost/scsi: switch to buf APIs

2020-06-02 Thread Michael S. Tsirkin
Switch to buf APIs. Doing this exposes a spec violation in vhost scsi:
all used bufs are marked with length 0.
The fix is left for another day.

Signed-off-by: Michael S. Tsirkin 
---
 drivers/vhost/scsi.c | 73 ++--
 1 file changed, 44 insertions(+), 29 deletions(-)

diff --git a/drivers/vhost/scsi.c b/drivers/vhost/scsi.c
index c39952243fd3..c426c4e899c7 100644
--- a/drivers/vhost/scsi.c
+++ b/drivers/vhost/scsi.c
@@ -71,8 +71,8 @@ struct vhost_scsi_inflight {
 };
 
 struct vhost_scsi_cmd {
-   /* Descriptor from vhost_get_vq_desc() for virt_queue segment */
-   int tvc_vq_desc;
+   /* Descriptor from vhost_get_avail_buf() for virt_queue segment */
+   struct vhost_buf tvc_vq_desc;
/* virtio-scsi initiator task attribute */
int tvc_task_attr;
/* virtio-scsi response incoming iovecs */
@@ -213,7 +213,7 @@ struct vhost_scsi {
  * Context for processing request and control queue operations.
  */
 struct vhost_scsi_ctx {
-   int head;
+   struct vhost_buf buf;
unsigned int out, in;
size_t req_size, rsp_size;
size_t out_size, in_size;
@@ -443,6 +443,20 @@ static int vhost_scsi_check_stop_free(struct se_cmd 
*se_cmd)
return target_put_sess_cmd(se_cmd);
 }
 
+/* Signal to guest that request finished with no input buffer. */
+/* TODO calling this when writing into buffer and most likely a bug */
+static void vhost_scsi_signal_noinput(struct vhost_dev *vdev,
+ struct vhost_virtqueue *vq,
+ struct vhost_buf *bufp)
+{
+   struct vhost_buf buf = *bufp;
+
+   buf.in_len = 0;
+   vhost_put_used_buf(vq, &buf);
+   vhost_signal(vdev, vq);
+}
+
+
 static void
 vhost_scsi_do_evt_work(struct vhost_scsi *vs, struct vhost_scsi_evt *evt)
 {
@@ -450,7 +464,8 @@ vhost_scsi_do_evt_work(struct vhost_scsi *vs, struct 
vhost_scsi_evt *evt)
struct virtio_scsi_event *event = &evt->event;
struct virtio_scsi_event __user *eventp;
unsigned out, in;
-   int head, ret;
+   struct vhost_buf buf;
+   int ret;
 
if (!vhost_vq_get_backend(vq)) {
vs->vs_events_missed = true;
@@ -459,14 +474,14 @@ vhost_scsi_do_evt_work(struct vhost_scsi *vs, struct 
vhost_scsi_evt *evt)
 
 again:
vhost_disable_notify(&vs->dev, vq);
-   head = vhost_get_vq_desc(vq, vq->iov,
-   ARRAY_SIZE(vq->iov), &out, &in,
-   NULL, NULL);
-   if (head < 0) {
+   ret = vhost_get_avail_buf(vq, &buf,
+ vq->iov, ARRAY_SIZE(vq->iov), &out, &in,
+ NULL, NULL);
+   if (ret < 0) {
vs->vs_events_missed = true;
return;
}
-   if (head == vq->num) {
+   if (!ret) {
if (vhost_enable_notify(&vs->dev, vq))
goto again;
vs->vs_events_missed = true;
@@ -488,7 +503,7 @@ vhost_scsi_do_evt_work(struct vhost_scsi *vs, struct 
vhost_scsi_evt *evt)
eventp = vq->iov[out].iov_base;
ret = __copy_to_user(eventp, event, sizeof(*event));
if (!ret)
-   vhost_add_used_and_signal(&vs->dev, vq, head, 0);
+   vhost_scsi_signal_noinput(&vs->dev, vq, &buf);
else
vq_err(vq, "Faulted on vhost_scsi_send_event\n");
 }
@@ -549,7 +564,7 @@ static void vhost_scsi_complete_cmd_work(struct vhost_work 
*work)
ret = copy_to_iter(&v_rsp, sizeof(v_rsp), &iov_iter);
if (likely(ret == sizeof(v_rsp))) {
struct vhost_scsi_virtqueue *q;
-   vhost_add_used(cmd->tvc_vq, cmd->tvc_vq_desc, 0);
+   vhost_put_used_buf(cmd->tvc_vq, &cmd->tvc_vq_desc);
q = container_of(cmd->tvc_vq, struct 
vhost_scsi_virtqueue, vq);
vq = q - vs->vqs;
__set_bit(vq, signal);
@@ -793,7 +808,7 @@ static void vhost_scsi_submission_work(struct work_struct 
*work)
 static void
 vhost_scsi_send_bad_target(struct vhost_scsi *vs,
   struct vhost_virtqueue *vq,
-  int head, unsigned out)
+  struct vhost_buf *buf, unsigned out)
 {
struct virtio_scsi_cmd_resp __user *resp;
struct virtio_scsi_cmd_resp rsp;
@@ -804,7 +819,7 @@ vhost_scsi_send_bad_target(struct vhost_scsi *vs,
resp = vq->iov[out].iov_base;
ret = __copy_to_user(resp, &rsp, sizeof(rsp));
if (!ret)
-   vhost_add_used_and_signal(&vs->dev, vq, head, 0);
+   vhost_scsi_signal_noinput(&vs->dev, vq, buf);
else
pr_err("Faulted on virtio_scsi_cmd_resp\n");
 }
@@ -813,21 +828,21 @@ static int
 vhost_scsi_get_desc(struct vhost_scsi *vs, struct vhost_virtqueue *vq,
struct vhost_scsi_ctx *vc)
 {
-   int ret = -ENXIO;
+   int r, ret = -ENXIO;
 
-   vc->head = 

[PATCH RFC 12/13] vhost/vsock: switch to the buf API

2020-06-02 Thread Michael S. Tsirkin
A straightforward conversion.

Signed-off-by: Michael S. Tsirkin 
---
 drivers/vhost/vsock.c | 30 ++
 1 file changed, 18 insertions(+), 12 deletions(-)

diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c
index fb4e944c4d0d..07d1fb340fb4 100644
--- a/drivers/vhost/vsock.c
+++ b/drivers/vhost/vsock.c
@@ -103,7 +103,8 @@ vhost_transport_do_send_pkt(struct vhost_vsock *vsock,
unsigned out, in;
size_t nbytes;
size_t iov_len, payload_len;
-   int head;
+   struct vhost_buf buf;
+   int ret;
 
spin_lock_bh(&vsock->send_pkt_list_lock);
if (list_empty(&vsock->send_pkt_list)) {
@@ -117,16 +118,17 @@ vhost_transport_do_send_pkt(struct vhost_vsock *vsock,
list_del_init(&pkt->list);
spin_unlock_bh(&vsock->send_pkt_list_lock);
 
-   head = vhost_get_vq_desc(vq, vq->iov, ARRAY_SIZE(vq->iov),
-&out, &in, NULL, NULL);
-   if (head < 0) {
+   ret = vhost_get_avail_buf(vq, &buf,
+ vq->iov, ARRAY_SIZE(vq->iov),
+ &out, &in, NULL, NULL);
+   if (ret < 0) {
spin_lock_bh(&vsock->send_pkt_list_lock);
list_add(&pkt->list, &vsock->send_pkt_list);
spin_unlock_bh(&vsock->send_pkt_list_lock);
break;
}
 
-   if (head == vq->num) {
+   if (!ret) {
spin_lock_bh(&vsock->send_pkt_list_lock);
list_add(&pkt->list, &vsock->send_pkt_list);
spin_unlock_bh(&vsock->send_pkt_list_lock);
@@ -186,7 +188,8 @@ vhost_transport_do_send_pkt(struct vhost_vsock *vsock,
 */
virtio_transport_deliver_tap_pkt(pkt);
 
-   vhost_add_used(vq, head, sizeof(pkt->hdr) + payload_len);
+   buf.in_len = sizeof(pkt->hdr) + payload_len;
+   vhost_put_used_buf(vq, &buf);
added = true;
 
pkt->off += payload_len;
@@ -440,7 +443,8 @@ static void vhost_vsock_handle_tx_kick(struct vhost_work 
*work)
struct vhost_vsock *vsock = container_of(vq->dev, struct vhost_vsock,
 dev);
struct virtio_vsock_pkt *pkt;
-   int head, pkts = 0, total_len = 0;
+   int ret, pkts = 0, total_len = 0;
+   struct vhost_buf buf;
unsigned int out, in;
bool added = false;
 
@@ -461,12 +465,13 @@ static void vhost_vsock_handle_tx_kick(struct vhost_work 
*work)
goto no_more_replies;
}
 
-   head = vhost_get_vq_desc(vq, vq->iov, ARRAY_SIZE(vq->iov),
-&out, &in, NULL, NULL);
-   if (head < 0)
+   ret = vhost_get_avail_buf(vq, &buf,
+ vq->iov, ARRAY_SIZE(vq->iov),
+ &out, &in, NULL, NULL);
+   if (ret < 0)
break;
 
-   if (head == vq->num) {
+   if (!ret) {
if (unlikely(vhost_enable_notify(&vsock->dev, vq))) {
vhost_disable_notify(&vsock->dev, vq);
continue;
@@ -494,7 +499,8 @@ static void vhost_vsock_handle_tx_kick(struct vhost_work 
*work)
virtio_transport_free_pkt(pkt);
 
len += sizeof(pkt->hdr);
-   vhost_add_used(vq, head, len);
+   buf.in_len = len;
+   vhost_put_used_buf(vq, &buf);
total_len += len;
added = true;
} while(likely(!vhost_exceeds_weight(vq, ++pkts, total_len)));
-- 
MST



[PATCH RFC 01/13] vhost: option to fetch descriptors through an independent struct

2020-06-02 Thread Michael S. Tsirkin
The idea is to support multiple ring formats by converting
to a format-independent array of descriptors.

This costs extra cycles, but we gain in ability
to fetch a batch of descriptors in one go, which
is good for code cache locality.

When used, this causes a minor performance degradation;
it's been kept as simple as possible for ease of review.
A follow-up patch gets us back the performance by adding batching.

To simplify benchmarking, I kept the old code around so one can switch
back and forth between old and new code. This will go away in the final
submission.

Signed-off-by: Michael S. Tsirkin 
Signed-off-by: Eugenio Pérez 
Link: https://lore.kernel.org/r/20200401183118.8334-2-epere...@redhat.com
Signed-off-by: Michael S. Tsirkin 
---
 drivers/vhost/vhost.c | 297 +-
 drivers/vhost/vhost.h |  16 +++
 2 files changed, 312 insertions(+), 1 deletion(-)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 96d9871fa0cb..105fc97af2c8 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -298,6 +298,7 @@ static void vhost_vq_reset(struct vhost_dev *dev,
   struct vhost_virtqueue *vq)
 {
vq->num = 1;
+   vq->ndescs = 0;
vq->desc = NULL;
vq->avail = NULL;
vq->used = NULL;
@@ -368,6 +369,9 @@ static int vhost_worker(void *data)
 
 static void vhost_vq_free_iovecs(struct vhost_virtqueue *vq)
 {
+   kfree(vq->descs);
+   vq->descs = NULL;
+   vq->max_descs = 0;
kfree(vq->indirect);
vq->indirect = NULL;
kfree(vq->log);
@@ -384,6 +388,10 @@ static long vhost_dev_alloc_iovecs(struct vhost_dev *dev)
 
for (i = 0; i < dev->nvqs; ++i) {
vq = dev->vqs[i];
+   vq->max_descs = dev->iov_limit;
+   vq->descs = kmalloc_array(vq->max_descs,
+ sizeof(*vq->descs),
+ GFP_KERNEL);
vq->indirect = kmalloc_array(UIO_MAXIOV,
 sizeof(*vq->indirect),
 GFP_KERNEL);
@@ -391,7 +399,7 @@ static long vhost_dev_alloc_iovecs(struct vhost_dev *dev)
GFP_KERNEL);
vq->heads = kmalloc_array(dev->iov_limit, sizeof(*vq->heads),
  GFP_KERNEL);
-   if (!vq->indirect || !vq->log || !vq->heads)
+   if (!vq->indirect || !vq->log || !vq->heads || !vq->descs)
goto err_nomem;
}
return 0;
@@ -2277,6 +2285,293 @@ int vhost_get_vq_desc(struct vhost_virtqueue *vq,
 }
 EXPORT_SYMBOL_GPL(vhost_get_vq_desc);
 
+static struct vhost_desc *peek_split_desc(struct vhost_virtqueue *vq)
+{
+   BUG_ON(!vq->ndescs);
+   return &vq->descs[vq->ndescs - 1];
+}
+
+static void pop_split_desc(struct vhost_virtqueue *vq)
+{
+   BUG_ON(!vq->ndescs);
+   --vq->ndescs;
+}
+
+#define VHOST_DESC_FLAGS (VRING_DESC_F_INDIRECT | VRING_DESC_F_WRITE | \
+ VRING_DESC_F_NEXT)
+static int push_split_desc(struct vhost_virtqueue *vq, struct vring_desc 
*desc, u16 id)
+{
+   struct vhost_desc *h;
+
+   if (unlikely(vq->ndescs >= vq->max_descs))
+   return -EINVAL;
+   h = &vq->descs[vq->ndescs++];
+   h->addr = vhost64_to_cpu(vq, desc->addr);
+   h->len = vhost32_to_cpu(vq, desc->len);
+   h->flags = vhost16_to_cpu(vq, desc->flags) & VHOST_DESC_FLAGS;
+   h->id = id;
+
+   return 0;
+}
+
+static int fetch_indirect_descs(struct vhost_virtqueue *vq,
+   struct vhost_desc *indirect,
+   u16 head)
+{
+   struct vring_desc desc;
+   unsigned int i = 0, count, found = 0;
+   u32 len = indirect->len;
+   struct iov_iter from;
+   int ret;
+
+   /* Sanity check */
+   if (unlikely(len % sizeof desc)) {
+   vq_err(vq, "Invalid length in indirect descriptor: "
+  "len 0x%llx not multiple of 0x%zx\n",
+  (unsigned long long)len,
+  sizeof desc);
+   return -EINVAL;
+   }
+
+   ret = translate_desc(vq, indirect->addr, len, vq->indirect,
+UIO_MAXIOV, VHOST_ACCESS_RO);
+   if (unlikely(ret < 0)) {
+   if (ret != -EAGAIN)
+   vq_err(vq, "Translation failure %d in indirect.\n", 
ret);
+   return ret;
+   }
+   iov_iter_init(&from, READ, vq->indirect, ret, len);
+
+   /* We will use the result as an address to read from, so most
+* architectures only need a compiler barrier here. */
+   read_barrier_depends();
+
+   count = len / sizeof desc;
+   /* Buffers are chained via a 16 bit next field, so
+* we can have at most 2^16 of these. */
+   if (unlikely(count > USHRT_MAX + 1)) {
+   vq_err(vq, 

Re: [PATCH] tcp: fix TCP socks unreleased in BBR mode

2020-06-02 Thread Eric Dumazet
On Tue, Jun 2, 2020 at 1:05 AM  wrote:
>
> From: Jason Xing 
>
> TCP socks cannot be released because sock_hold() in
> tcp_internal_pacing() increases the sk_refcnt when an RTO happens.
> Therefore, this situation could increase the slab memory and then trigger
> the OOM if the machine has been running for a long time. This issue,
> however, can happen on some machines after running only a few days.
>
> We add one exception case to avoid unneeded use of sock_hold if the
> pacing_timer is enqueued.
>
> Reproduce procedure:
> 0) cat /proc/slabinfo | grep TCP
> 1) switch net.ipv4.tcp_congestion_control to bbr
> 2) use a tool like wrk to send packets
> 3) use tc to add delay on the dev to simulate the busy case (see the
> netem example below)
> 4) cat /proc/slabinfo | grep TCP
> 5) kill the wrk command and observe the number of objects and slabs in TCP.
> 6) at last, you could notice that the number would not decrease.
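>
> For step 3, e.g. (interface name and delay value are illustrative):
>
> tc qdisc add dev eth0 root netem delay 200ms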
>
> Signed-off-by: Jason Xing 
> Signed-off-by: liweishi 
> Signed-off-by: Shujin Li 
> ---
>  net/ipv4/tcp_output.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> index cc4ba42..5cf63d9 100644
> --- a/net/ipv4/tcp_output.c
> +++ b/net/ipv4/tcp_output.c
> @@ -969,7 +969,8 @@ static void tcp_internal_pacing(struct sock *sk, const 
> struct sk_buff *skb)
> u64 len_ns;
> u32 rate;
>
> -   if (!tcp_needs_internal_pacing(sk))
> +   if (!tcp_needs_internal_pacing(sk) ||
> +   hrtimer_is_queued(&tcp_sk(sk)->pacing_timer))
> return;
> rate = sk->sk_pacing_rate;
> if (!rate || rate == ~0U)
> --
> 1.8.3.1
>

Hi Jason.

Please do not send patches that do not apply to current upstream trees.

Instead, backport to your kernels the needed fixes.

I suspect that you are not using a pristine linux kernel, but some
heavily modified one and something went wrong in your backports.
Do not ask us to spend time finding what went wrong.

Thank you.


[PATCH RFC 00/13] vhost: format independence

2020-06-02 Thread Michael S. Tsirkin
We let the specifics of the ring format seep through to vhost API
callers - mostly because there was only one format so it was
hard to imagine what an independent API would look like.
Now that there's an alternative in form of the packed ring,
it's easier to see the issues, and fixing them is perhaps
the cleanest way to add support for more formats.

This patchset does this by introducing two new structures: vhost_buf to
represent a buffer and vhost_desc to represent a descriptor.
Descriptors aren't normally of interest to devices but do occasionally
get exposed, e.g. for logging.

Perhaps surprisingly, the higher level API actually makes things a bit
easier for callers, and allows more freedom for the vhost core.
The end result is basically unchanged performance (based on preliminary
testing) even though we are forced to go through a format conversion.

The conversion also exposed (more) bugs in vhost scsi - which isn't
really surprising; that driver needs a lot more love than it's getting.

Very lightly tested. Would appreciate feedback and testing.

Michael S. Tsirkin (13):
  vhost: option to fetch descriptors through an independent struct
  vhost: use batched version by default
  vhost: batching fetches
  vhost: cleanup fetch_buf return code handling
  vhost/net: pass net specific struct pointer
  vhost: reorder functions
  vhost: format-independent API for used buffers
  vhost/net: convert to new API: heads->bufs
  vhost/net: avoid iov length math
  vhost/test: convert to the buf API
  vhost/scsi: switch to buf APIs
  vhost/vsock: switch to the buf API
  vhost: drop head based APIs

 drivers/vhost/net.c   | 173 +--
 drivers/vhost/scsi.c  |  73 
 drivers/vhost/test.c  |  22 +--
 drivers/vhost/vhost.c | 375 +++---
 drivers/vhost/vhost.h |  46 --
 drivers/vhost/vsock.c |  30 ++--
 6 files changed, 436 insertions(+), 283 deletions(-)

-- 
MST



[PATCH v14 04/15] mm/damon: Adaptively adjust regions

2020-06-02 Thread SeongJae Park
From: SeongJae Park 

At the beginning of the monitoring, DAMON constructs the initial regions
by evenly splitting the memory mapped address space of the process into
the user-specified minimal number of regions.  In this initial state,
the assumption of the regions (pages in the same region have similar access
frequencies) is normally not kept and thus the monitoring quality could
be low.  To keep the assumption as much as possible, DAMON adaptively
merges and splits each region.

For each ``aggregation interval``, it compares the access frequencies of
adjacent regions and merges those if the frequency difference is small.
Then, after it reports and clears the aggregated access frequency of
each region, it splits each region into two regions if the total number
of regions is smaller than the half of the user-specified maximum number
of regions.

In this way, DAMON provides its best-effort quality and minimal overhead
while keeping the bounds users set for their trade-off.
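
In pseudo-code, one aggregation interval of the flow described above looks
roughly like this (a sketch with illustrative helper names; only
kdamond_check_accesses() and kdamond_reset_aggregated() appear in the
patch itself):

	/* sample and aggregate access frequencies */
	max_nr_accesses = kdamond_check_accesses(ctx);
	/* merge adjacent regions whose nr_accesses differ only slightly */
	kdamond_merge_regions(ctx, max_nr_accesses / 10);
	/* report, then reset the aggregated counters */
	kdamond_reset_aggregated(ctx);
	/* split each region in two while below half the maximum count */
	if (nr_regions(ctx) < ctx->max_nr_regions / 2)
		kdamond_split_regions(ctx);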

Signed-off-by: SeongJae Park 
---
 include/linux/damon.h |   7 +-
 mm/damon.c| 174 +++---
 2 files changed, 169 insertions(+), 12 deletions(-)

diff --git a/include/linux/damon.h b/include/linux/damon.h
index f0fe4520a4e9..babdba6b5c47 100644
--- a/include/linux/damon.h
+++ b/include/linux/damon.h
@@ -56,6 +56,7 @@ struct damon_task {
  * @sample_interval:   The time between access samplings.
  * @aggr_interval: The time between monitor results aggregations.
  * @min_nr_regions:The number of initial monitoring regions.
+ * @max_nr_regions:The maximum number of monitoring regions.
  *
  * For each @sample_interval, DAMON checks whether each region is accessed or
  * not.  It aggregates and keeps the access information (number of accesses to
@@ -81,6 +82,7 @@ struct damon_ctx {
unsigned long sample_interval;
unsigned long aggr_interval;
unsigned long min_nr_regions;
+   unsigned long max_nr_regions;
 
struct timespec64 last_aggregation;
 
@@ -92,8 +94,9 @@ struct damon_ctx {
 };
 
 int damon_set_pids(struct damon_ctx *ctx, int *pids, ssize_t nr_pids);
-int damon_set_attrs(struct damon_ctx *ctx, unsigned long sample_int,
-   unsigned long aggr_int, unsigned long min_nr_reg);
+int damon_set_attrs(struct damon_ctx *ctx,
+   unsigned long sample_int, unsigned long aggr_int,
+   unsigned long min_nr_reg, unsigned long max_nr_reg);
 int damon_start(struct damon_ctx *ctx);
 int damon_stop(struct damon_ctx *ctx);
 
diff --git a/mm/damon.c b/mm/damon.c
index b374544cdd78..6b8f038331f5 100644
--- a/mm/damon.c
+++ b/mm/damon.c
@@ -332,9 +332,12 @@ static int damon_three_regions_of(struct damon_task *t,
  * regions is wasteful.  That said, because we can deal with small noises,
  * tracking every mapping is not strictly required but could even incur a high
  * overhead if the mapping frequently changes or the number of mappings is
- * high.  Nonetheless, this may seems very weird.  DAMON's dynamic regions
- * adjustment mechanism, which will be implemented with following commit will
- * make this more sense.
+ * high.  The adaptive regions adjustment mechanism will further help to deal
+ * with the noise by simply identifying the unmapped areas as a region that
+ * has no access.  Moreover, applying the real mappings, which would contain
+ * many unmapped areas, would make the adaptive mechanism quite complex.  That
+ * said, very large unmapped areas inside the monitoring target should be
+ * removed so the adaptive mechanism does not waste time on them.
  *
  * For the reason, we convert the complex mappings to three distinct regions
  * that cover every mapped area of the address space.  Also the two gaps
@@ -508,20 +511,25 @@ static void damon_check_access(struct damon_ctx *ctx,
last_addr = r->sampling_addr;
 }
 
-static void kdamond_check_accesses(struct damon_ctx *ctx)
+static unsigned int kdamond_check_accesses(struct damon_ctx *ctx)
 {
struct damon_task *t;
struct mm_struct *mm;
struct damon_region *r;
+   unsigned int max_nr_accesses = 0;
 
damon_for_each_task(t, ctx) {
mm = damon_get_mm(t);
if (!mm)
continue;
-   damon_for_each_region(r, t)
+   damon_for_each_region(r, t) {
damon_check_access(ctx, mm, r);
+   max_nr_accesses = max(r->nr_accesses, max_nr_accesses);
+   }
+
mmput(mm);
}
+   return max_nr_accesses;
 }
 
 /*
@@ -570,6 +578,141 @@ static void kdamond_reset_aggregated(struct damon_ctx *c)
}
 }
 
+#define sz_damon_region(r) (r->vm_end - r->vm_start)
+
+/*
+ * Merge two adjacent regions into one region
+ */
+static void damon_merge_two_regions(struct damon_region *l,
+   struct damon_region *r)
+{
+   l->nr_accesses = (l->nr_accesses * 

[PATCH v14 03/15] mm/damon: Implement region based sampling

2020-06-02 Thread SeongJae Park
From: SeongJae Park 

This commit implements DAMON's basic access check and region based
sampling mechanisms.  On its own, this change may seem to make little
sense, mainly because it is only a part of DAMON's logic.  The following
two commits will complete the picture.

Basic Access Check
--

DAMON basically reports what pages are how frequently accessed.  Note
that the frequency is not an absolute number of accesses, but a relative
frequency among the pages of the target workloads.

Users can control the resolution of the reports by setting two time
intervals, ``sampling interval`` and ``aggregation interval``.  In
detail, DAMON checks access to each page per ``sampling interval``,
aggregates the results (counts the number of the accesses to each page),
and reports the aggregated results per ``aggregation interval``.  For
the access check of each page, DAMON uses the Accessed bits of PTEs.

This is thus similar to the common periodic access check based access
tracking mechanisms, whose overhead increases as the size of the
target process grows.

Region Based Sampling
-

To avoid the unbounded increase of the overhead, DAMON groups a number
of adjacent pages that are assumed to have the same access frequencies
into a region.  As long as the assumption (pages in a region have the
same access frequencies) is kept, only one page in the region needs to
be checked.  Thus, for each ``sampling interval``, DAMON randomly picks
one page in each region and clears its Accessed bit.  After one more
``sampling interval``, DAMON reads the Accessed bit of the page and
increases the access frequency of the region if the bit was set in the
meantime.  Therefore, the monitoring overhead is controllable by
setting the number of regions.
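
A sketch of what one such check over the regions looks like (illustrative;
page_accessed()/clear_accessed_bit() stand in for the PTE Accessed bit
handling done via page table walks in the real code, and damon_rand() is
the helper introduced in the previous patch):

	damon_for_each_region(r, t) {
		/* was the previously sampled page accessed meanwhile? */
		if (page_accessed(r->sampling_addr))
			r->nr_accesses++;
		/* pick a new random page and clear its Accessed bit */
		r->sampling_addr = damon_rand(r->vm_start, r->vm_end);
		clear_accessed_bit(r->sampling_addr);
	}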

Nonetheless, this scheme cannot preserve the quality of the output if
the assumption is not kept.  The following commit will introduce how the
assumption can be kept with best effort.

Signed-off-by: SeongJae Park 
Reviewed-by: Leonard Foerster 
---
 include/linux/damon.h |  48 +++-
 mm/damon.c| 597 ++
 2 files changed, 644 insertions(+), 1 deletion(-)

diff --git a/include/linux/damon.h b/include/linux/damon.h
index 13564929..f0fe4520a4e9 100644
--- a/include/linux/damon.h
+++ b/include/linux/damon.h
@@ -11,6 +11,8 @@
 #define _DAMON_H_
 
 #include 
+#include 
+#include 
 #include 
 
 /**
@@ -44,11 +46,55 @@ struct damon_task {
 };
 
 /**
- * struct damon_ctx - Represents a context for each monitoring.
+ * struct damon_ctx - Represents a context for each monitoring.  This is the
+ * main interface that allows users to set the attributes and get the results
+ * of the monitoring.
+ *
+ * For each monitoring request (damon_start()), a kernel thread for the
+ * monitoring is created.  The pointer to the thread is stored in @kdamond.
+ *
+ * @sample_interval:   The time between access samplings.
+ * @aggr_interval: The time between monitor results aggregations.
+ * @min_nr_regions:The number of initial monitoring regions.
+ *
+ * For each @sample_interval, DAMON checks whether each region is accessed or
+ * not.  It aggregates and keeps the access information (number of accesses to
+ * each region) for @aggr_interval time.  All time intervals are in
+ * micro-seconds.
+ *
+ * @kdamond:   Kernel thread who does the monitoring.
+ * @kdamond_stop:  Notifies whether kdamond should stop.
+ * @kdamond_lock:  Mutex for the synchronizations with @kdamond.
+ *
+ * The monitoring thread sets @kdamond to NULL when it terminates.  Therefore,
+ * users can know whether the monitoring is ongoing or terminated by reading
+ * @kdamond.  Also, users can ask @kdamond to be terminated by writing non-zero
+ * to @kdamond_stop.  Reads and writes to @kdamond and @kdamond_stop from
+ * outside of the monitoring thread must be protected by @kdamond_lock.
+ *
+ * Note that the monitoring thread protects only @kdamond and @kdamond_stop via
+ * @kdamond_lock.  Accesses to other fields must be protected by themselves.
+ *
 * @tasks_list:	Head of monitoring target tasks (&damon_task) list.
  */
 struct damon_ctx {
+   unsigned long sample_interval;
+   unsigned long aggr_interval;
+   unsigned long min_nr_regions;
+
+   struct timespec64 last_aggregation;
+
+   struct task_struct *kdamond;
+   bool kdamond_stop;
+   struct mutex kdamond_lock;
+
struct list_head tasks_list;/* 'damon_task' objects */
 };
 
+int damon_set_pids(struct damon_ctx *ctx, int *pids, ssize_t nr_pids);
+int damon_set_attrs(struct damon_ctx *ctx, unsigned long sample_int,
+   unsigned long aggr_int, unsigned long min_nr_reg);
+int damon_start(struct damon_ctx *ctx);
+int damon_stop(struct damon_ctx *ctx);
+
 #endif
diff --git a/mm/damon.c b/mm/damon.c
index a5a7820ef0ad..b374544cdd78 100644
--- a/mm/damon.c
+++ b/mm/damon.c
@@ -10,10 +10,19 @@
 #define pr_fmt(fmt) "damon: " fmt
 
 

[PATCH v14 02/15] mm: Introduce Data Access MONitor (DAMON)

2020-06-02 Thread SeongJae Park
From: SeongJae Park 

This commit introduces a kernel module named DAMON.  Note that this
commit is implementing only the stub for the module load/unload, basic
data structures, and simple manipulation functions of the structures to
keep the size of commit small.  The core mechanisms of DAMON will be
implemented one by one by following commits.

Signed-off-by: SeongJae Park 
Reviewed-by: Leonard Foerster 
---
 include/linux/damon.h |  54 +
 mm/Kconfig|  12 +++
 mm/Makefile   |   1 +
 mm/damon.c| 173 ++
 4 files changed, 240 insertions(+)
 create mode 100644 include/linux/damon.h
 create mode 100644 mm/damon.c

diff --git a/include/linux/damon.h b/include/linux/damon.h
new file mode 100644
index ..13564929
--- /dev/null
+++ b/include/linux/damon.h
@@ -0,0 +1,54 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * DAMON api
+ *
+ * Copyright 2019-2020 Amazon.com, Inc. or its affiliates.
+ *
+ * Author: SeongJae Park 
+ */
+
+#ifndef _DAMON_H_
+#define _DAMON_H_
+
+#include 
+#include 
+
+/**
+ * struct damon_region - Represents a monitoring target region of
+ * [@vm_start, @vm_end).
+ *
+ * @vm_start:  Start address of the region (inclusive).
+ * @vm_end:End address of the region (exclusive).
+ * @sampling_addr: Address of the sample for the next access check.
+ * @nr_accesses:   Access frequency of this region.
+ * @list:  List head for siblings.
+ */
+struct damon_region {
+   unsigned long vm_start;
+   unsigned long vm_end;
+   unsigned long sampling_addr;
+   unsigned int nr_accesses;
+   struct list_head list;
+};
+
+/**
+ * struct damon_task - Represents a monitoring target task.
+ * @pid:   Process id of the task.
+ * @regions_list:  Head of the monitoring target regions of this task.
+ * @list:  List head for siblings.
+ */
+struct damon_task {
+   int pid;
+   struct list_head regions_list;
+   struct list_head list;
+};
+
+/**
+ * struct damon_ctx - Represents a context for each monitoring.
+ * @tasks_list:	Head of monitoring target tasks (&damon_task) list.
+ */
+struct damon_ctx {
+   struct list_head tasks_list;/* 'damon_task' objects */
+};
+
+#endif
diff --git a/mm/Kconfig b/mm/Kconfig
index c1acc34c1c35..ecea0889ea35 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -867,4 +867,16 @@ config ARCH_HAS_HUGEPD
 config MAPPING_DIRTY_HELPERS
 bool
 
+config DAMON
+   tristate "Data Access Monitor"
+   depends on MMU
+   help
+ Provides data access monitoring.
+
+ DAMON is a kernel module that allows users to monitor the actual
+ memory access pattern of specific user-space processes.  It aims to
+ be 1) accurate enough to be useful for performance-centric domains,
+ and 2) sufficiently light-weight so that it can be applied online.
+ If unsure, say N.
+
 endmenu
diff --git a/mm/Makefile b/mm/Makefile
index fccd3756b25f..230e545b6e07 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -112,3 +112,4 @@ obj-$(CONFIG_MEMFD_CREATE) += memfd.o
 obj-$(CONFIG_MAPPING_DIRTY_HELPERS) += mapping_dirty_helpers.o
 obj-$(CONFIG_PTDUMP_CORE) += ptdump.o
 obj-$(CONFIG_PAGE_REPORTING) += page_reporting.o
+obj-$(CONFIG_DAMON) += damon.o
diff --git a/mm/damon.c b/mm/damon.c
new file mode 100644
index ..a5a7820ef0ad
--- /dev/null
+++ b/mm/damon.c
@@ -0,0 +1,173 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Data Access Monitor
+ *
+ * Copyright 2019-2020 Amazon.com, Inc. or its affiliates.
+ *
+ * Author: SeongJae Park 
+ */
+
+#define pr_fmt(fmt) "damon: " fmt
+
+#include 
+#include 
+#include 
+#include 
+
+#define damon_get_task_struct(t) \
+   (get_pid_task(find_vpid(t->pid), PIDTYPE_PID))
+
+#define damon_next_region(r) \
+   (container_of(r->list.next, struct damon_region, list))
+
+#define damon_prev_region(r) \
+   (container_of(r->list.prev, struct damon_region, list))
+
+#define damon_for_each_region(r, t) \
+   list_for_each_entry(r, &t->regions_list, list)
+
+#define damon_for_each_region_safe(r, next, t) \
+   list_for_each_entry_safe(r, next, &t->regions_list, list)
+
+#define damon_for_each_task(t, ctx) \
+   list_for_each_entry(t, &(ctx)->tasks_list, list)
+
+#define damon_for_each_task_safe(t, next, ctx) \
+   list_for_each_entry_safe(t, next, &(ctx)->tasks_list, list)
+
+/* Get a random number in [l, r) */
+#define damon_rand(l, r) (l + prandom_u32() % (r - l))
+
+/*
+ * Construct a damon_region struct
+ *
+ * Returns the pointer to the new struct if success, or NULL otherwise
+ */
+static struct damon_region *damon_new_region(struct damon_ctx *ctx,
+   unsigned long vm_start, unsigned long vm_end)
+{
+   struct damon_region *region;
+
+   region = kmalloc(sizeof(*region), GFP_KERNEL);
+   if (!region)
+   return NULL;
+
+   region->vm_start 

Re: Security Random Number Generator support

2020-06-02 Thread Marc Zyngier

On 2020-06-02 13:14, Ard Biesheuvel wrote:

On Tue, 2 Jun 2020 at 10:15, Neal Liu  wrote:


This patch series introduces a security random number generator
which provides a generic interface to get hardware random numbers from
the Secure state. The Secure state can be Arm Trusted Firmware (ATF), a
Trusted Execution Environment (TEE), or even an EL2 hypervisor.

Patches #1..2 add the sec-rng kernel driver for TrustZone based SoCs.
On security-aware ARMv8 SoCs with TrustZone enabled,
peripherals like entropy sources are not accessible from the normal world
(linux) but only from the secure world (HYP/ATF/TEE).
This driver aims to provide a generic interface to the Arm Trusted
Firmware or hypervisor rng service.


changes since v1:
- rename mt67xx-rng to mtk-sec-rng since all MediaTek ARMv8 SoCs can reuse
  this driver.
- refine coding style and unnecessary check.

changes since v2:
- remove unused comments.
- remove redundant variable.

changes since v3:
- add dt-bindings for MediaTek rng with TrustZone enabled.
- revise HWRNG SMC call fid.

changes since v4:
- move bindings to the arm/firmware directory.
- revise driver init flow to check more properties.

changes since v5:
- refactor to a more generic security rng driver which
  is not platform specific.


Neal Liu (2):
  dt-bindings: rng: add bindings for sec-rng
  hwrng: add sec-rng driver



There is no reason to model a SMC call as a driver, and represent it
via a DT node like this.


+1.


It would be much better if this SMC interface is made truly generic,
and wired into the arch_get_random() interface, which can be used much
earlier.
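
Something along these lines, perhaps (a sketch, not a worked-out
implementation; the SMC function ID is illustrative and this assumes
<linux/arm-smccc.h>):

static inline bool __must_check arch_get_random_seed_long(unsigned long *v)
{
	struct arm_smccc_res res;

	/* ask the secure world for an entropy seed; FID is illustrative */
	arm_smccc_1_1_smc(SEC_RNG_GET_FID, &res);
	if (res.a0 != 0)
		return false;	/* secure world returned an error */
	*v = res.a1;
	return true;
}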


Wasn't there a plan to standardize a SMC call to rule them all?

M.
--
Who you jivin' with that Cosmik Debris?


[PATCH v14 01/15] mm/page_ext: Export lookup_page_ext() to GPL modules

2020-06-02 Thread SeongJae Park
From: SeongJae Park 

This commit exports 'lookup_page_ext()' to GPL modules.  This will be
used by DAMON.

Signed-off-by: SeongJae Park 
Reviewed-by: Leonard Foerster 
---
 mm/page_ext.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/mm/page_ext.c b/mm/page_ext.c
index a3616f7a0e9e..9d802d01fcb5 100644
--- a/mm/page_ext.c
+++ b/mm/page_ext.c
@@ -131,6 +131,7 @@ struct page_ext *lookup_page_ext(const struct page *page)
MAX_ORDER_NR_PAGES);
return get_entry(base, index);
 }
+EXPORT_SYMBOL_GPL(lookup_page_ext);
 
 static int __init alloc_node_page_ext(int nid)
 {
-- 
2.17.1



[PATCH v14 00/15] Introduce Data Access MONitor (DAMON)

2020-06-02 Thread SeongJae Park
From: SeongJae Park 

Introduction


DAMON is a data access monitoring framework subsystem for the Linux kernel.
The core mechanisms of DAMON, called 'region based sampling' and 'adaptive
regions adjustment' (refer to :doc:`mechanisms` for the details), make it

 - accurate (The monitored information is useful for DRAM level memory
   management. It might not be appropriate for cache-level accuracy, though.),
 - light-weight (The monitoring overhead is low enough to be applied online
   while making no impact on the performance of the target workloads.), and
 - scalable (the upper-bound of the instrumentation overhead is controllable
   regardless of the size of target workloads.).

Using this framework, therefore, the kernel's core memory management mechanisms
including reclamation and THP can be optimized for better memory management.
Experimental memory management optimization works that were abandoned due to
their high instrumentation overhead will be able to have another try.  In
user space,
meanwhile, users who have some special workloads will be able to write
personalized tools or applications for deeper understanding and specialized
optimizations of their systems.

Evaluations
===

We evaluated DAMON's overhead, monitoring quality and usefulness using 25
realistic workloads on my QEMU/KVM based virtual machine running a kernel
with the v13 DAMON patchset applied.

DAMON is lightweight.  It changes system memory usage by only -0.39% and
consumes less than 1% CPU time in most cases.  It slows target workloads down
by only 0.63%.

DAMON is accurate and useful for memory management optimizations.  An
experimental DAMON-based operation scheme for THP, 'ethp', removes 69.43% of
THP memory overheads while preserving 37.11% of THP speedup.  Another
experimental DAMON-based 'proactive reclamation' implementation, 'prcl',
reduces 89.30% of resident sets and 22.40% of system memory footprint while
incurring only 1.98% runtime overhead in the best case (parsec3/freqmine).

NOTE that the experimental THP optimization and proactive reclamation are not
for production use, but only proofs of concept.

Please refer to the official document[1] or "Documentation/admin-guide/mm: Add
a document for DAMON" patch in this patchset for detailed evaluation setup and
results.

[1] https://damonitor.github.io/doc/html/latest-damon

More Information


We prepared a showcase web site[1] where you can get more information.  There
are

- the official documentations[2],
- the heatmap format dynamic access pattern of various realistic workloads for
  heap area[3], mmap()-ed area[4], and stack[5] area,
- the dynamic working set size distribution[6] and chronological working set
  size changes[7], and
- the latest performance test results[8].

[1] https://damonitor.github.io/_index
[2] https://damonitor.github.io/doc/html/latest-damon
[3] https://damonitor.github.io/test/result/visual/latest/heatmap.0.html
[4] https://damonitor.github.io/test/result/visual/latest/heatmap.1.html
[5] https://damonitor.github.io/test/result/visual/latest/heatmap.2.html
[6] https://damonitor.github.io/test/result/visual/latest/wss_sz.html
[7] https://damonitor.github.io/test/result/visual/latest/wss_time.html
[8] https://damonitor.github.io/test/result/perf/latest/html/index.html

Baseline and Complete Git Trees
===

The patches are based on the v5.7.  You can also clone the complete git
tree:

$ git clone git://github.com/sjp38/linux -b damon/patches/v14

The web is also available:
https://github.com/sjp38/linux/releases/tag/damon/patches/v14

There are a couple of trees for the entire DAMON patchset series, which
include future features.  The first one[1] contains the changes for the
latest release, while the other one[2] contains the changes for the next
release.

[1] https://github.com/sjp38/linux/tree/damon/master
[2] https://github.com/sjp38/linux/tree/damon/next

Sequence Of Patches
===

The 1st patch exports 'lookup_page_ext()' to GPL modules so that it can be used
by DAMON even though it is built as a loadable module.

The next four patches implement the core of DAMON and its programming
interface.  The 2nd patch introduces the DAMON module, its data structures,
and data structure related common functions.  The following three patches
(3rd to 5th) implement the core mechanisms of DAMON, namely region based
sampling (patch 3), adaptive regions adjustment (patch 4), and dynamic memory
mapping change adoption (patch 5).

The following four patches are for low level users of DAMON.  The 6th patch
implements callbacks for each of the monitoring steps so that users can do
whatever they want with the access patterns.  The 7th one implements recording
of access patterns in DAMON for better convenience and efficiency.  Each of
the next two patches (8th and 9th) respectively adds a debugfs interface for
privileged people and/or programs in user space, and a tracepoint for other
tracepoint-supporting tracers such as perf.

Re: [RESEND PATCH v1 6/6] staging: greybus: audio: Enable GB codec, audio module compilation.

2020-06-02 Thread Dan Carpenter
On Tue, Jun 02, 2020 at 10:51:15AM +0530, Vaibhav Agarwal wrote:
> Currently, GB codec and audio module is conditionally compiled based on
> GREYBUS_AUDIO_MSM8994. However, audio module is not dependent on MSM8994
> platform and can be used generically with any platform that follows
> GB Audio class specification.
> 
> Also, GB codec driver corresponds to dummy codec represented by I2S port
> available on Toshiba AP Bridge. Added config option for the same in
> kconfig file and accordingly updated Makefile.
> 

This commit message was a bit confusing to me.  Just say:

"Currently you can't enable the Grey Bus Audio Codec because there is no
entry for it in the Kconfig file.  Originally the config name was going
to be AUDIO_MSM8994 but that's not correct because other types of
hardware are supported now.  I have chosen the name AUDIO_APB_CODEC
instead.  Also I had to update the dependencies for GREYBUS_AUDIO to
make the compile work."
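
Presumably that ends up as a Kconfig entry along these lines (a sketch;
the exact prompt, dependencies and help text come from the patch itself):

	config GREYBUS_AUDIO_APB_CODEC
		tristate "Greybus APBridgeA codec driver"
		depends on GREYBUS_AUDIO && SND_SOC
		help
		  Select this option to enable the dummy codec driver for the
		  I2S port available on the Toshiba AP Bridge.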

Otherwise this looks fine.

regards,
dan carpenter



Re: [PATCH v2 09/12] iio: imu: inv_icm42600: add buffer support in iio devices

2020-06-02 Thread Jean-Baptiste Maneyrol
Hi Jonathan,

I agree that this multiplexed watermark computation is anything but simple
and clear to understand. I will add more documentation about it. It also
triggered a kbuild robot issue, because it uses a 64-bit modulo without the
math64 macros.

For buffer preenable/postenable/..., the sequence I am using currently is:
- preenable -> turn the chip on (pm_runtime_get)
- update_scan_mode -> set the FIFO en bits configuration (which sensor data
  goes into the FIFO)
- hwfifo_watermark -> compute and set the watermark value
- postenable -> turn the FIFO on (multiplexed with a FIFO-on counter since it
  is used by both accel & gyro)
- predisable -> turn the FIFO off (multiplexed with the counter)
- postdisable -> turn the chip off (pm_runtime_put)

This setup is working well. Good to note that if there is an error when
enabling the buffer, postdisable will always get called after preenable, so
the pm_runtime reference counter always stays balanced. The sketch below maps
this sequence onto the IIO callbacks.
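
In iio_buffer_setup_ops terms, that sequence maps roughly to the following
(a sketch; the callback slots are the IIO core's, the handler names and
comments reflect the sequence above):

static const struct iio_buffer_setup_ops inv_icm42600_buffer_ops = {
	.preenable = inv_icm42600_buffer_preenable,	/* pm_runtime_get */
	.postenable = inv_icm42600_buffer_postenable,	/* FIFO on, refcounted */
	.predisable = inv_icm42600_buffer_predisable,	/* FIFO off, refcounted */
	.postdisable = inv_icm42600_buffer_postdisable,	/* pm_runtime_put */
};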

Another way would be to only store configuration in internal state with 
update_scan_mode and hwfifo_watermark, and do everything in 
postenable/predisable. This is a possibility, but makes things a little more 
complex.

For hwfifo flush, this is an interesting feature when there is a need to get
data immediately, or when there is a need to do a clean change of
configuration. In Android systems, the framework mainly uses FIFO flush to
change the sensor configuration (ODR, watermark) in a clean way. For our case
with the interleaved FIFO this is not an issue. If there are samples from the
2 sensors, it means the 2 buffers are enabled. And if data comes into the iio
buffer sooner than expected, that should not be a problem. The limitation I
see when the 2 sensors are running is that we will return less data than
would have been possible: I limit FIFO reading to the provided n bytes, so we
could read less than n samples of 1 sensor.

Something I have in mind that would be really interesting: the ability to
set/change the watermark value while the buffer is enabled. Otherwise, we
always lose events by turning the sensor off when we want to change the
value. Is there any limitation to working this way, or should it be possible
to implement this feature in the future?

Thanks,
JB


From: Jonathan Cameron 
Sent: Sunday, May 31, 2020 14:56
To: Jean-Baptiste Maneyrol 
Cc: robh...@kernel.org ; r...@kernel.org ; 
mchehab+hua...@kernel.org ; da...@davemloft.net 
; gre...@linuxfoundation.org ; 
linux-...@vger.kernel.org ; 
devicet...@vger.kernel.org ; 
linux-kernel@vger.kernel.org 
Subject: Re: [PATCH v2 09/12] iio: imu: inv_icm42600: add buffer support in iio 
devices 
 

On Wed, 27 May 2020 20:57:08 +0200
Jean-Baptiste Maneyrol  wrote:

> Add all FIFO parsing and reading functions. Add accel and gyro
> kfifo buffer and FIFO data parsing. Use device interrupt for
> reading data FIFO and launching accel and gyro parsing.
> 
> Support hwfifo watermark by multiplexing gyro and accel settings.
> Support hwfifo flush.

Both of these are complex given the interactions of the two sensor
types and to be honest I couldn't figure out exactly what the intent was.
Needs more docs!

Thanks,

Jonathan

> 
> Signed-off-by: Jean-Baptiste Maneyrol 
> ---
>  drivers/iio/imu/inv_icm42600/Kconfig  |   1 +
>  drivers/iio/imu/inv_icm42600/Makefile |   1 +
>  drivers/iio/imu/inv_icm42600/inv_icm42600.h   |   8 +
>  .../iio/imu/inv_icm42600/inv_icm42600_accel.c | 160 -
>  .../imu/inv_icm42600/inv_icm42600_buffer.c    | 555 ++
>  .../imu/inv_icm42600/inv_icm42600_buffer.h    |  98 
>  .../iio/imu/inv_icm42600/inv_icm42600_core.c  |  30 +
>  .../iio/imu/inv_icm42600/inv_icm42600_gyro.c  | 160 -
>  8 files changed, 1011 insertions(+), 2 deletions(-)
>  create mode 100644 drivers/iio/imu/inv_icm42600/inv_icm42600_buffer.c
>  create mode 100644 drivers/iio/imu/inv_icm42600/inv_icm42600_buffer.h
> 
> diff --git a/drivers/iio/imu/inv_icm42600/Kconfig 
> b/drivers/iio/imu/inv_icm42600/Kconfig
> index 22390a72f0a3..50cbcfcb6cf1 100644
> --- a/drivers/iio/imu/inv_icm42600/Kconfig
> +++ b/drivers/iio/imu/inv_icm42600/Kconfig
> @@ -2,6 +2,7 @@
>  
>  config INV_ICM42600
>    tristate
> + select IIO_BUFFER
>  
>  config INV_ICM42600_I2C
>    tristate "InvenSense ICM-426xx I2C driver"
> diff --git a/drivers/iio/imu/inv_icm42600/Makefile 
> b/drivers/iio/imu/inv_icm42600/Makefile
> index 48965824f00c..0f49f6df3647 100644
> --- a/drivers/iio/imu/inv_icm42600/Makefile
> +++ b/drivers/iio/imu/inv_icm42600/Makefile
> @@ -5,6 +5,7 @@ inv-icm42600-y += inv_icm42600_core.o
>  inv-icm42600-y += inv_icm42600_gyro.o
>  inv-icm42600-y += inv_icm42600_accel.o
>  inv-icm42600-y += inv_icm42600_temp.o
> +inv-icm42600-y += 

Re: [Linux-stm32] [PATCH v8 08/10] drm: stm: dw-mipi-dsi: let the bridge handle the HW version check

2020-06-02 Thread Emil Velikov
Hi Adrian,

On Mon, 1 Jun 2020 at 10:14, Adrian Ratiu  wrote:
>
> On Fri, 29 May 2020, Philippe CORNU  wrote:
> > Hi Adrian, and thank you very much for the patchset.  Thank you
> > also for having tested it on STM32F769 and STM32MP1.  Sorry for
> > the late response, Yannick and I will review it as soon as
> > possible and we will keep you posted.  Note: Do not hesitate to
> > put us in copy for the next version  (philippe.co...@st.com,
> > yannick.fer...@st.com) Regards, Philippe :-)
>
> Hi Philippe,
>
> Thank you very much for your previous and future STM testing, I
> really appreciate it! I've CC'd Yannick until now but I'll also CC
> you for sure :)
>
> It's been over a month since I posted v8 and I was just gearing up
> to address all feedback, rebase & retest to prepare v9 but I'll
> wait a little longer, no problem, it's no rush.
>
Small idea, pardon for joining so late:

Might be a good idea to add an inline comment on why the clocks are disabled
so late; effectively a 2-line version of the commit summary.

Feel free to make that a separate/follow-up patch.

-Emil


[GIT PULL] MIPS changes for v5.8-rc1

2020-06-02 Thread Thomas Bogendoerfer
Hi Linus,

this is the pull request for MIPS. A test merge between your current
master (f359287765c0) and mips-next showed a conflict in

  arch/mips/include/asm/module.h

between commit:

  62d0fd591db1 ("arch: split MODULE_ARCH_VERMAGIC definitions out to
  <asm/vermagic.h>")

from Linus' tree and commits:

  ab7c01fdc3cf ("mips: Add MIPS Release 5 support")
  281e3aea35e5 ("mips: Add MIPS Warrior P5600 support")

from the mips tree.

Conflict resolution from Stephen Rothwell is:

From: Stephen Rothwell 
Date: Mon, 25 May 2020 10:03:07 +1000
Subject: [PATCH] mips: vermagic updates

Signed-off-by: Stephen Rothwell 
---
 arch/mips/include/asm/vermagic.h | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/arch/mips/include/asm/vermagic.h b/arch/mips/include/asm/vermagic.h
index 24dc3d35161c..4d2dae0c7c57 100644
--- a/arch/mips/include/asm/vermagic.h
+++ b/arch/mips/include/asm/vermagic.h
@@ -8,12 +8,16 @@
 #define MODULE_PROC_FAMILY "MIPS32_R1 "
 #elif defined CONFIG_CPU_MIPS32_R2
 #define MODULE_PROC_FAMILY "MIPS32_R2 "
+#elif defined CONFIG_CPU_MIPS32_R5
+#define MODULE_PROC_FAMILY "MIPS32_R5 "
 #elif defined CONFIG_CPU_MIPS32_R6
 #define MODULE_PROC_FAMILY "MIPS32_R6 "
 #elif defined CONFIG_CPU_MIPS64_R1
 #define MODULE_PROC_FAMILY "MIPS64_R1 "
 #elif defined CONFIG_CPU_MIPS64_R2
 #define MODULE_PROC_FAMILY "MIPS64_R2 "
+#elif defined CONFIG_CPU_MIPS64_R5
+#define MODULE_PROC_FAMILY "MIPS64_R5 "
 #elif defined CONFIG_CPU_MIPS64_R6
 #define MODULE_PROC_FAMILY "MIPS64_R6 "
 #elif defined CONFIG_CPU_R3000
@@ -46,6 +50,8 @@
 #define MODULE_PROC_FAMILY "LOONGSON64 "
 #elif defined CONFIG_CPU_CAVIUM_OCTEON
 #define MODULE_PROC_FAMILY "OCTEON "
+#elif defined CONFIG_CPU_P5600
+#define MODULE_PROC_FAMILY "P5600 "
 #elif defined CONFIG_CPU_XLR
 #define MODULE_PROC_FAMILY "XLR "
 #elif defined CONFIG_CPU_XLP



There were two more conflicts in linux-next:

Conflict in

  arch/mips/lasat/sysctl.c

between commit:

  10760dde9be3 ("MIPS: Remove support for LASAT")

from the mips tree and commit:

  32927393dc1c ("sysctl: pass kernel pointers to ->proc_handler")

from the vfs tree.

Here the resolution is to keep arch/mips/lasat/sysctl.c deleted.


Conflict in 

  mm/memory.c

between commit:

  7df676974359 ("mm/memory.c: Update local TLB if PTE entry exists")

from the mips tree and commit:

  e325f89fdd69 ("mm: memcontrol: convert anon and file-thp to new 
mem_cgroup_charge() API")

from the akpm-current tree.

Agreed resolution is:

"We decided that the update_mmu_tlb() call is no longer needed here, so
there is no need to re-add it when resolving this."

Thomas.



The following changes since commit 8f3d9f354286745c751374f5f1fcafee6b3f3136:

  Linux 5.7-rc1 (2020-04-12 12:35:55 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux tags/mips_5.8

for you to fetch changes up to 9bd0bd264578fe191bf5d2ff23f9887b91862536:

  MIPS: ralink: drop ralink_clk_init for mt7621 (2020-05-31 14:17:00 +0200)


MIPS updates for v5.8:

- added support for MIPSr5 and P5600 cores
- converted Loongson PCI driver into a PCI host driver using the generic
  PCI framework
- added emulation of the CPUCFG command for Loongson64 CPUs
- removal of LASAT, PMC MSP71xx and NEC MARKEINS/EMMA
- ioremap cleanup
- fix for a race between two threads faulting the same page
- various cleanups and fixes


Arnd Bergmann (1):
  mips: loongsoon2ef: remove private clk api

Ben Hutchings (1):
  MIPS: Fix exception handler memcpy()

Bibo Mao (4):
  MIPS: Do not flush tlb page when updating PTE entry
  mm/memory.c: Update local TLB if PTE entry exists
  mm/memory.c: Add memory read privilege on page fault handling
  MIPS: mm: add page valid judgement in function pte_modify

Bin Meng (1):
  mips: Drop CONFIG_MTD_M25P80 in various defconfig files

Christoph Hellwig (9):
  ASoC: txx9: don't work around too small resource_size_t
  MIPS: remove cpu_has_64bit_addresses
  MIPS: cleanup fixup_bigphys_addr handling
  MIPS: merge __ioremap_mode into ioremap_prot
  MIPS: split out the 64-bit ioremap implementation
  MIPS: move ioremap_prot und iounmap out of line
  MIPS: use ioremap_page_range
  ASoC: txx9: add back the hack for a too small resource_size_t
  MIPS: unexport __flush_icache_user_range

Christophe JAILLET (1):
  MIPS: ralink: bootrom: mark a function as __init to save some memory

Chuanhong Guo (1):
  MIPS: ralink: drop ralink_clk_init for mt7621

Daniel González Cabanelas (1):
  MIPS: BCM63XX: fix BCM6358 GPIO count

Geert Uytterhoeven (1):
  MIPS: ingenic: Replace  by 

Guoyun Sun (1):
  mips/mm: Add page soft dirty tracking

Gustavo A. R. Silva (1):
  MIPS: Replace zero-length array with flexible-array

H. Nikolaus Schaller (1):
  MIPS: DTS: CI20: make DM9000 Ethernet controller use NVMEM to find 

Re: [PATCH v3 015/105] drm/vc4: hvs: Boost the core clock during modeset

2020-06-02 Thread Maxime Ripard
Hi Eric,

On Wed, May 27, 2020 at 09:33:44AM -0700, Eric Anholt wrote:
> On Wed, May 27, 2020 at 8:49 AM Maxime Ripard  wrote:
> >
> > In order to prevent timeouts and stalls in the pipeline, the core clock
> > needs to be maxed at 500MHz during a modeset on the BCM2711.
> 
> Like, the whole system's core clock?

Yep, unfortunately...

> How is it reasonable for some device driver to crank the system's core
> clock up and back down to some fixed-in-the-driver frequency? Sounds
> like you need some sort of opp thing here.

That frequency is the minimum rate of that clock. However, since other
devices have similar requirements (unicam in particular) with different
minimum requirements, we will switch to setting a minimum rate instead
of enforcing a particular rate, so that patch would be essentially
s/clk_set_rate/clk_set_min_rate/.
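
Concretely, something like this (a sketch; the clock handle name is
illustrative):

-	clk_set_rate(hvs->core_clk, 500000000);
+	clk_set_min_rate(hvs->core_clk, 500000000);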

Would that work for you?

> 
> Patch 13,14 r-b.

Thanks!
Maxime



