Re: [PATCH 1/4] io_uring: fix open/close/statx with {SQ,IO}POLL
On 02/06/2020 15:34, Pavel Begunkov wrote:
> Trying to use them with IORING_SETUP_IOPOLL:
>
> RIP: 0010:io_iopoll_getevents+0x111/0x5a0
> Call Trace:
>  ? _raw_spin_unlock_irqrestore+0x24/0x40
>  ? do_send_sig_info+0x64/0x90
>  io_iopoll_reap_events.part.0+0x5e/0xa0
>  io_ring_ctx_wait_and_kill+0x132/0x1c0
>  io_uring_release+0x20/0x30
>  __fput+0xcd/0x230
>  fput+0xe/0x10
>  task_work_run+0x67/0xa0
>  do_exit+0x353/0xb10
>  ? handle_mm_fault+0xd4/0x200
>  ? syscall_trace_enter+0x18c/0x2c0
>  do_group_exit+0x43/0xa0
>  __x64_sys_exit_group+0x18/0x20
>  do_syscall_64+0x60/0x1e0
>  entry_SYSCALL_64_after_hwframe+0x44/0xa9

io_do_iopoll() {
	...
	ret = kiocb->ki_filp->f_op->iopoll(kiocb, spin);
}

Hmm, I'll double check later that only read*/write* can be done with
IOPOLL, and send a follow-up patch if necessary.

>
> Also SQPOLL thread can't know which file table to use with
> open/close. Disallow all these cases.
>
> Signed-off-by: Pavel Begunkov
> ---
>  fs/io_uring.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
>
> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index 732ec73ec3c0..7208f91e9e77 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -2990,6 +2990,8 @@ static int io_openat_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
>  	const char __user *fname;
>  	int ret;
>
> +	if (unlikely(req->ctx->flags & (IORING_SETUP_IOPOLL|IORING_SETUP_SQPOLL)))
> +		return -EINVAL;
>  	if (sqe->ioprio || sqe->buf_index)
>  		return -EINVAL;
>  	if (req->flags & REQ_F_FIXED_FILE)
> @@ -3023,6 +3025,8 @@ static int io_openat2_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
>  	size_t len;
>  	int ret;
>
> +	if (unlikely(req->ctx->flags & (IORING_SETUP_IOPOLL|IORING_SETUP_SQPOLL)))
> +		return -EINVAL;
>  	if (sqe->ioprio || sqe->buf_index)
>  		return -EINVAL;
>  	if (req->flags & REQ_F_FIXED_FILE)
> @@ -3373,6 +3377,8 @@ static int io_fadvise(struct io_kiocb *req, bool force_nonblock)
>
>  static int io_statx_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
>  {
> +	if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
> +		return -EINVAL;
>  	if (sqe->ioprio || sqe->buf_index)
>  		return -EINVAL;
>  	if (req->flags & REQ_F_FIXED_FILE)
> @@ -3417,6 +3423,8 @@ static int io_close_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
>  	 */
>  	req->work.flags |= IO_WQ_WORK_NO_CANCEL;
>
> +	if (unlikely(req->ctx->flags & (IORING_SETUP_IOPOLL|IORING_SETUP_SQPOLL)))
> +		return -EINVAL;
>  	if (sqe->ioprio || sqe->off || sqe->addr || sqe->len ||
>  	    sqe->rw_flags || sqe->buf_index)
>  		return -EINVAL;

-- 
Pavel Begunkov
Re: [PATCH v3 032/105] drm/vc4: crtc: Enable and disable the PV in atomic_enable / disable
Hi Eric

On Wed, May 27, 2020 at 09:54:44AM -0700, Eric Anholt wrote:
> On Wed, May 27, 2020 at 8:50 AM Maxime Ripard wrote:
> >
> > The VIDEN bit in the pixelvalve currently being used to enable or disable
> > the pixelvalve seems to not be enough in some situations, which will end
> > up with the pixelvalve stalling.
> >
> > In such a case, even re-enabling VIDEN doesn't bring it back and we need to
> > clear the FIFO. This can only be done if the pixelvalve is disabled though.
> >
> > In order to overcome this, we can configure the pixelvalve during
> > mode_set_no_fb, but only enable it in atomic_enable and flush the FIFO
> > there, and in atomic_disable disable the pixelvalve again.
>
> What displays has this been tested with? Getting this sequencing
> right is so painful, and things like DSI are tricky to get to light
> up.

That FIFO is between the HVS and the HDMI PVs, so this was obviously
tested against that. Dave also tested the DSI output IIRC, so we should
be covered here.

Maxime
Re: [PATCH] spi: spi-ti-qspi: call pm_runtime_put on pm_runtime_get failure
On Tue, Jun 02, 2020 at 12:02:11PM +0200, Markus Elfring wrote:
> > The original changelog is perfectly fine, please stop sending these.
>
> I find this commit message improvable also according to Linux software
> development documentation.

Causing people to send out new versions of things for tweaks to the
commit log consumes time for them and everyone they're sending changes
to. Pushing people to make trivial rewordings of their commit logs to
match your particular taste is not a good use of people's time.
[PATCH v2] gpiolib: split character device into gpiolib-cdev
Split the cdev specific functionality out of gpiolib.c and into
gpiolib-cdev.c. This improves the readability and maintainability of
both the cdev and core gpiolib code.

Suggested-by: Bartosz Golaszewski
Signed-off-by: Kent Gibson
---
While this patch is complete and ready for review, I don't expect it to
be applied as is. There are a few cdev patches pending merge into
gpio/devel that are sure to conflict, and it makes more sense to rebase
this on top of them than vice versa. But I thought it would be
worthwhile to get it out for review so it can be ready to be rebased
when the time is right.

Also, this is a naive split. That is, if applied as is, it will lose
the line history of the cdev code. This is not what I intend, and I
understand it can be avoided by encouraging git to remember the history
with a few moves, but I'm unsure how the maintainers would prefer that
to be done.

Bart,

As this was your idea, I've taken the liberty of adding the
Suggested-by. I hope that is ok.

Changes in v2:
 - rebased to latest gpio/devel and added base-commit to placate the
   build bot. The comments above still apply, as there are still a
   couple of commits in gpio/fixes that will conflict.

Kent.
drivers/gpio/Makefile |1 + drivers/gpio/gpiolib-cdev.c | 1148 +++ drivers/gpio/gpiolib-cdev.h | 11 + drivers/gpio/gpiolib.c | 1112 + 4 files changed, 1164 insertions(+), 1108 deletions(-) create mode 100644 drivers/gpio/gpiolib-cdev.c create mode 100644 drivers/gpio/gpiolib-cdev.h diff --git a/drivers/gpio/Makefile b/drivers/gpio/Makefile index 65bf3940e33c..b5b58b624f37 100644 --- a/drivers/gpio/Makefile +++ b/drivers/gpio/Makefile @@ -7,6 +7,7 @@ obj-$(CONFIG_GPIOLIB) += gpiolib.o obj-$(CONFIG_GPIOLIB) += gpiolib-devres.o obj-$(CONFIG_GPIOLIB) += gpiolib-legacy.o obj-$(CONFIG_GPIOLIB) += gpiolib-devprop.o +obj-$(CONFIG_GPIOLIB) += gpiolib-cdev.o obj-$(CONFIG_OF_GPIO) += gpiolib-of.o obj-$(CONFIG_GPIO_SYSFS) += gpiolib-sysfs.o obj-$(CONFIG_GPIO_ACPI)+= gpiolib-acpi.o diff --git a/drivers/gpio/gpiolib-cdev.c b/drivers/gpio/gpiolib-cdev.c new file mode 100644 index ..971470bdc9c9 --- /dev/null +++ b/drivers/gpio/gpiolib-cdev.c @@ -0,0 +1,1148 @@ +// SPDX-License-Identifier: GPL-2.0 + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + + +#include "gpiolib.h" + +/* Implementation infrastructure for GPIO interfaces. + * + * The GPIO programming interface allows for inlining speed-critical + * get/set operations for common cases, so that access to SOC-integrated + * GPIOs can sometimes cost only an instruction or two per bit. 
+ */ + +/* + * GPIO line handle management + */ + +/** + * struct linehandle_state - contains the state of a userspace handle + * @gdev: the GPIO device the handle pertains to + * @label: consumer label used to tag descriptors + * @descs: the GPIO descriptors held by this handle + * @numdescs: the number of descriptors held in the descs array + */ +struct linehandle_state { + struct gpio_device *gdev; + const char *label; + struct gpio_desc *descs[GPIOHANDLES_MAX]; + u32 numdescs; +}; + +#define GPIOHANDLE_REQUEST_VALID_FLAGS \ + (GPIOHANDLE_REQUEST_INPUT | \ + GPIOHANDLE_REQUEST_OUTPUT | \ + GPIOHANDLE_REQUEST_ACTIVE_LOW | \ + GPIOHANDLE_REQUEST_BIAS_PULL_UP | \ + GPIOHANDLE_REQUEST_BIAS_PULL_DOWN | \ + GPIOHANDLE_REQUEST_BIAS_DISABLE | \ + GPIOHANDLE_REQUEST_OPEN_DRAIN | \ + GPIOHANDLE_REQUEST_OPEN_SOURCE) + +static int linehandle_validate_flags(u32 flags) +{ + /* Return an error if an unknown flag is set */ + if (flags & ~GPIOHANDLE_REQUEST_VALID_FLAGS) + return -EINVAL; + + /* +* Do not allow both INPUT & OUTPUT flags to be set as they are +* contradictory. +*/ + if ((flags & GPIOHANDLE_REQUEST_INPUT) && + (flags & GPIOHANDLE_REQUEST_OUTPUT)) + return -EINVAL; + + /* +* Do not allow OPEN_SOURCE & OPEN_DRAIN flags in a single request. If +* the hardware actually supports enabling both at the same time the +* electrical result would be disastrous. +*/ + if ((flags & GPIOHANDLE_REQUEST_OPEN_DRAIN) && + (flags & GPIOHANDLE_REQUEST_OPEN_SOURCE)) + return -EINVAL; + + /* OPEN_DRAIN and OPEN_SOURCE flags only make sense for output mode. */ + if (!(flags & GPIOHANDLE_REQUEST_OUTPUT) && + ((flags & GPIOHANDLE_REQUEST_OPEN_DRAIN) || +(flags & GPIOHANDLE_REQUEST_OPEN_SOURCE))) + return -EINVAL; + + /* Bias flags only allowed for input or output mode. */ + if (!((flags &
Re: [PATCH 1/2] perf tools: check libasan and libubsan in Makefile.config
Em Tue, Jun 02, 2020 at 12:15:03PM +0800, Tiezhu Yang escreveu: > When build perf with ASan or UBSan, if libasan or libubsan can not find, > the feature-glibc is 0 and there exists the following error log which is > wrong, because we can find gnu/libc-version.h in /usr/include, glibc-devel > is also installed. I'll check this later, - Arnaldo > [yangtiezhu@linux perf]$ make DEBUG=1 EXTRA_CFLAGS='-fno-omit-frame-pointer > -fsanitize=address' > BUILD: Doing 'make -j4' parallel build > HOSTCC fixdep.o > HOSTLD fixdep-in.o > LINK fixdep > :1:0: warning: -fsanitize=address and -fsanitize=kernel-address are > not supported for this target > :1:0: warning: -fsanitize=address not supported for this target > > Auto-detecting system features: > ... dwarf: [ OFF ] > ...dwarf_getlocations: [ OFF ] > ... glibc: [ OFF ] > ... gtk2: [ OFF ] > ... libaudit: [ OFF ] > ...libbfd: [ OFF ] > ...libcap: [ OFF ] > ...libelf: [ OFF ] > ... libnuma: [ OFF ] > ...numa_num_possible_cpus: [ OFF ] > ... libperl: [ OFF ] > ... libpython: [ OFF ] > ... libcrypto: [ OFF ] > ... libunwind: [ OFF ] > ...libdw-dwarf-unwind: [ OFF ] > ... zlib: [ OFF ] > ... lzma: [ OFF ] > ... get_cpuid: [ OFF ] > ... bpf: [ OFF ] > ...libaio: [ OFF ] > ... libzstd: [ OFF ] > ...disassembler-four-args: [ OFF ] > > Makefile.config:393: *** No gnu/libc-version.h found, please install > glibc-dev[el]. Stop. > Makefile.perf:224: recipe for target 'sub-make' failed > make[1]: *** [sub-make] Error 2 > Makefile:69: recipe for target 'all' failed > make: *** [all] Error 2 > [yangtiezhu@linux perf]$ ls /usr/include/gnu/libc-version.h > /usr/include/gnu/libc-version.h > > After install libasan and libubsan, the feature-glibc is 1 and the build > process is success, so the cause is related with libasan or libubsan, we > should check them and print an error log to reflect the reality. 
> > Signed-off-by: Tiezhu Yang > --- > tools/perf/Makefile.config | 6 ++ > 1 file changed, 6 insertions(+) > > diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config > index 12a8204..b699d21 100644 > --- a/tools/perf/Makefile.config > +++ b/tools/perf/Makefile.config > @@ -387,6 +387,12 @@ else >NO_LIBBPF := 1 >NO_JVMTI := 1 > else > + ifneq ($(shell ldconfig -p | grep libasan >/dev/null 2>&1; echo $$?), > 0) > +msg := $(error No libasan found, please install libasan); > + endif > + ifneq ($(shell ldconfig -p | grep libubsan >/dev/null 2>&1; echo $$?), > 0) > +msg := $(error No libubsan found, please install libubsan); > + endif >ifneq ($(filter s% -static%,$(LDFLAGS),),) > msg := $(error No static glibc found, please install glibc-static); >else > -- > 2.1.0 > -- - Arnaldo
Re: [PATCH 2/2] perf tools: Remove some duplicated includes
Em Tue, Jun 02, 2020 at 12:15:04PM +0800, Tiezhu Yang escreveu: > There exists some duplicated includes in tools/perf, remove them. Applied, thanks. - Arnaldo > Signed-off-by: Tiezhu Yang > --- > tools/perf/builtin-report.c | 1 - > tools/perf/util/annotate.c | 1 - > tools/perf/util/auxtrace.c | 1 - > tools/perf/util/config.c| 1 - > tools/perf/util/session.c | 1 - > 5 files changed, 5 deletions(-) > > diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c > index ba63390..5425a2c 100644 > --- a/tools/perf/builtin-report.c > +++ b/tools/perf/builtin-report.c > @@ -47,7 +47,6 @@ > #include "util/time-utils.h" > #include "util/auxtrace.h" > #include "util/units.h" > -#include "util/branch.h" > #include "util/util.h" // perf_tip() > #include "ui/ui.h" > #include "ui/progress.h" > diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c > index d828c2d..76bfb4a 100644 > --- a/tools/perf/util/annotate.c > +++ b/tools/perf/util/annotate.c > @@ -41,7 +41,6 @@ > #include > #include > #include > -#include > #include > #include > > diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c > index 749487a..94a8f4f 100644 > --- a/tools/perf/util/auxtrace.c > +++ b/tools/perf/util/auxtrace.c > @@ -55,7 +55,6 @@ > #include "util/mmap.h" > > #include > -#include > #include "symbol/kallsyms.h" > #include > > diff --git a/tools/perf/util/config.c b/tools/perf/util/config.c > index ef38eba..64f14a5 100644 > --- a/tools/perf/util/config.c > +++ b/tools/perf/util/config.c > @@ -20,7 +20,6 @@ > #include "build-id.h" > #include "debug.h" > #include "config.h" > -#include "debug.h" > #include > #include > #include > diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c > index c11d89e..5550e26e 100644 > --- a/tools/perf/util/session.c > +++ b/tools/perf/util/session.c > @@ -33,7 +33,6 @@ > #include "../perf.h" > #include "arch/common.h" > #include > -#include > > #ifdef HAVE_ZSTD_SUPPORT > static int 
perf_session__process_compressed_event(struct perf_session > *session, > -- > 2.1.0 > -- - Arnaldo
Re: [PATCH] hwmon: bt1-pvt: Declare Temp- and Volt-to-N poly when alarms are enabled
On Tue, Jun 02, 2020 at 12:12:19PM +0300, Serge Semin wrote: > Clang-based kernel building with W=1 warns that some static const > variables are unused: > > drivers/hwmon/bt1-pvt.c:67:30: warning: unused variable 'poly_temp_to_N' > [-Wunused-const-variable] > static const struct pvt_poly poly_temp_to_N = { > ^ > drivers/hwmon/bt1-pvt.c:99:30: warning: unused variable 'poly_volt_to_N' > [-Wunused-const-variable] > static const struct pvt_poly poly_volt_to_N = { > ^ > > Indeed these polynomials are utilized only when the PVT sensor alarms are > enabled. In that case they are used to convert the temperature and > voltage alarm limits from normal quantities (Volts and degree Celsius) to > the sensor data representation N = [0, 1023]. Otherwise when alarms are > disabled the driver only does the detected data conversion to the human > readable form and doesn't need that polynomials defined. So let's declare > the Temp-to-N and Volt-to-N polynomials only if the PVT alarms are > switched on at compile-time. > > Note gcc with W=1 doesn't notice the problem. > > Fixes: 87976ce2825d ("hwmon: Add Baikal-T1 PVT sensor driver") > Reported-by: kbuild test robot > Signed-off-by: Serge Semin > Cc: Maxim Kaurkin > Cc: Alexey Malahov I don't really like the added #if. Can you use __maybe_unused instead ? Thanks, Guenter > --- > drivers/hwmon/bt1-pvt.c | 4 > 1 file changed, 4 insertions(+) > > diff --git a/drivers/hwmon/bt1-pvt.c b/drivers/hwmon/bt1-pvt.c > index 1a9772fb1f73..1a5212c04549 100644 > --- a/drivers/hwmon/bt1-pvt.c > +++ b/drivers/hwmon/bt1-pvt.c > @@ -64,6 +64,7 @@ static const struct pvt_sensor_info pvt_info[] = { > * 48380, > * where T = [-48380, 147438] mC and N = [0, 1023]. 
> */ > +#if defined(CONFIG_SENSORS_BT1_PVT_ALARMS) > static const struct pvt_poly poly_temp_to_N = { > .total_divider = 1, > .terms = { > @@ -74,6 +75,7 @@ static const struct pvt_poly poly_temp_to_N = { > {0, 1720400, 1, 1} > } > }; > +#endif /* CONFIG_SENSORS_BT1_PVT_ALARMS */ > > static const struct pvt_poly poly_N_to_temp = { > .total_divider = 1, > @@ -96,6 +98,7 @@ static const struct pvt_poly poly_N_to_temp = { > * N = (18658e-3*V - 11572) / 10, > * V = N * 10^5 / 18658 + 11572 * 10^4 / 18658. > */ > +#if defined(CONFIG_SENSORS_BT1_PVT_ALARMS) > static const struct pvt_poly poly_volt_to_N = { > .total_divider = 10, > .terms = { > @@ -103,6 +106,7 @@ static const struct pvt_poly poly_volt_to_N = { > {0, -11572, 1, 1} > } > }; > +#endif /* CONFIG_SENSORS_BT1_PVT_ALARMS */ > > static const struct pvt_poly poly_N_to_volt = { > .total_divider = 10,
Re: [PATCH] perf tools: Fix kernel maps for kcore and eBPF
Em Tue, Jun 02, 2020 at 02:25:05PM +0300, Adrian Hunter escreveu: > Adjust pgoff also when moving a map's start address. > > Example with v5.4.34 based kernel: > > Before: > > $ sudo tools/perf/perf record -a --kcore -e intel_pt//k sleep 1 > [ perf record: Woken up 1 times to write data ] > [ perf record: Captured and wrote 1.958 MB perf.data ] > $ sudo tools/perf/perf script --itrace=e >/dev/null > Warning: > 961 instruction trace errors > > After: > > $ sudo tools/perf/perf script --itrace=e >/dev/null > $ Thanks, tested on a kaby lake system (i5-7500) and applied, - Arnaldo > Signed-off-by: Adrian Hunter > Fixes: fb5a88d4131a ("perf tools: Preserve eBPF maps when loading kcore") > Cc: sta...@vger.kernel.org > --- > tools/perf/util/symbol.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c > index 57cbe7a29868..5ddf84dcbae7 100644 > --- a/tools/perf/util/symbol.c > +++ b/tools/perf/util/symbol.c > @@ -1224,6 +1224,7 @@ int maps__merge_in(struct maps *kmaps, struct map > *new_map) > > m->end = old_map->start; > list_add_tail(>node, ); > + new_map->pgoff += old_map->end - new_map->start; > new_map->start = old_map->end; > } > } else { > @@ -1244,6 +1245,7 @@ int maps__merge_in(struct maps *kmaps, struct map > *new_map) >* |new..| -> |new...| >* |old|-> |old| >*/ > + new_map->pgoff += old_map->end - new_map->start; > new_map->start = old_map->end; > } > } > -- > 2.17.1 > -- - Arnaldo
Re: [PATCHv3] perf stat: Ensure group is defined on top of the same cpu mask
On Tue, Jun 02, 2020 at 10:42:56AM -0300, Arnaldo Carvalho de Melo wrote: > Em Tue, Jun 02, 2020 at 12:17:36PM +0200, Jiri Olsa escreveu: > > Jin Yao reported the issue (and posted first versions of this change) > > with groups being defined over events with different cpu mask. > > > This causes assert aborts in get_group_fd, like: > > > # perf stat -M "C2_Pkg_Residency" -a -- sleep 1 > > perf: util/evsel.c:1464: get_group_fd: Assertion `!(fd == -1)' failed. > > Aborted > > > All the events in the group have to be defined over the same > > cpus so the group_fd can be found for every leader/member pair. > > > Adding check to ensure this condition is met and removing the > > group (with warning) if we detect mixed cpus, like: > > > $ sudo perf stat -e > > '{power/energy-cores/,cycles},{instructions,power/energy-cores/}' > > WARNING: event cpu maps do not match, disabling group: > > anon group { power/energy-cores/, cycles } > > anon group { instructions, power/energy-cores/ } > > So it doesn't disable the 'group', it disables the 'grouping' of those > events, right? I.e. reading the WARNING, I thought that it would count > nothing, since it lists both groups as being disabled, but when I tested > I noticed that: > > [root@seventh ~]# perf stat -e > '{power/energy-cores/,cycles},{instructions,power/energy-cores/}' > WARNING: grouped events cpus do not match, disabling group: > anon group { power/energy-cores/, cycles } > anon group { instructions, power/energy-cores/ } > ^C >Performance counter stats for 'system wide': > >12.62 Joules power/energy-cores/ > 106,920,637cycles > 80,228,899instructions #0.75 insn per > cycle >12.62 Joules power/energy-cores/ > > 14.514476987 seconds time elapsed > > > [root@seventh ~]# > > I.e. it counted the events, ungrouped, or am I missing something? 
right, it disables 'grouping', events are scheduled/counted individually

this way we will not hit the issue when looking for a group_fd FD and
there isn't one, because of different cpu maps

>
> If I do:
>
> [root@seventh ~]# perf stat -e '{power/energy-cores/,power/energy-ram/},{instructions,cycles}' -a sleep 2
>
>    Performance counter stats for 'system wide':
>
>               1.73 Joules power/energy-cores/
>               0.92 Joules power/energy-ram/
>         12,191,658      instructions              #    0.67  insn per cycle
>         18,275,233      cycles
>
>        2.001272492 seconds time elapsed
>
> [root@seventh ~]#
>
> It works, grouped. One observation, shouldn't we somehow show in the
> output that the first two were indeed grouped, ditto for the second two?

yea, we don't display groups in output.. also there's no number for the
group, it's still separate events numbers in output

grouping is only used when creating events

>
> Also, this needs improvement:
>
> [root@seventh ~]# perf stat -e '{power/energy-cores/,power/energy-ram/},{instructions,cycles}' sleep 2
> Error:
> The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (power/energy-cores/).
> /bin/dmesg | grep -i perf may provide additional information.

yes, power events don't work with events without cpu being defined,
which is what we do for 'workload' session.. we should either check for
that and display some sensible error for power events or perhaps check
if we could monitor like perf record does with creating events for task
and every cpu in the system

thanks,
jirka
Re: kobject_init_and_add is easy to misuse
On Tue, Jun 02, 2020 at 05:10:35AM -0700, Matthew Wilcox wrote: > On Tue, Jun 02, 2020 at 07:50:33PM +0800, Wang Hai wrote: > > syzkaller reports for memory leak when kobject_init_and_add() > > returns an error in the function sysfs_slab_add() [1] > > > > When this happened, the function kobject_put() is not called for the > > corresponding kobject, which potentially leads to memory leak. > > > > This patch fixes the issue by calling kobject_put() even if > > kobject_init_and_add() fails. > > I think this speaks to a deeper problem with kobject_init_and_add() > -- the need to call kobject_put() if it fails is not readily apparent > to most users. This same bug appears in the first three users of > kobject_init_and_add() that I checked -- > arch/ia64/kernel/topology.c > drivers/firmware/dmi-sysfs.c > drivers/firmware/efi/esrt.c > drivers/scsi/iscsi_boot_sysfs.c > > Some do get it right -- > arch/powerpc/kernel/cacheinfo.c > drivers/gpu/drm/ttm/ttm_bo.c > drivers/gpu/drm/ttm/ttm_memory.c > drivers/infiniband/hw/mlx4/sysfs.c Why are random individual drivers calling kobject* functions? That speaks to a larger problem here... Anyway, yes, it's a tricky function, but the issue usually is that the kobject is embedded in something else and if you call init_and_add() you want to tear things down _before_ the final put happens. The good thing is, that function is really hard to get to fail except if you abuse it with syzkaller :) > I'd argue that the current behaviour is wrong, that kobject_init_and_add() > should call kobject_put() if the add fails. This would need a tree-wide > audit. But somebody needs to do that anyway because based on my random > sampling, half of the users currently get it wrong. As said above, this is "tricky", and might break things. thanks, greg k-h
Re: [PATCH v4 4/4] kdb: Switch to use safer dbg_io_ops over console APIs
On Tue, 2 Jun 2020 at 19:16, Daniel Thompson wrote: > > On Fri, May 29, 2020 at 04:56:47PM +0530, Sumit Garg wrote: > > In kgdb context, calling console handlers aren't safe due to locks used > > in those handlers which could in turn lead to a deadlock. Although, using > > oops_in_progress increases the chance to bypass locks in most console > > handlers but it might not be sufficient enough in case a console uses > > more locks (VT/TTY is good example). > > > > Currently when a driver provides both polling I/O and a console then kdb > > will output using the console. We can increase robustness by using the > > currently active polling I/O driver (which should be lockless) instead > > of the corresponding console. For several common cases (e.g. an > > embedded system with a single serial port that is used both for console > > output and debugger I/O) this will result in no console handler being > > used. > > > > In order to achieve this we need to reverse the order of preference to > > use dbg_io_ops (uses polling I/O mode) over console APIs. So we just > > store "struct console" that represents debugger I/O in dbg_io_ops and > > while emitting kdb messages, skip console that matches dbg_io_ops > > console in order to avoid duplicate messages. After this change, > > "is_console" param becomes redundant and hence removed. > > > > Suggested-by: Daniel Thompson > > Signed-off-by: Sumit Garg > > Looking good, only one minor comment left on my side (including the > three patches prior). 
> > > diff --git a/kernel/debug/kdb/kdb_io.c b/kernel/debug/kdb/kdb_io.c > > index 9e5a40d..5e00bc8 100644 > > --- a/kernel/debug/kdb/kdb_io.c > > +++ b/kernel/debug/kdb/kdb_io.c > > @@ -560,12 +560,14 @@ static void kdb_msg_write(char *msg, int msg_len) > > if (msg_len == 0) > > return; > > > > - if (dbg_io_ops && !dbg_io_ops->is_console) > > + if (dbg_io_ops) > > kdb_io_write(msg, msg_len); > > Since this now slots on so cleanly and there are not multiple calls > to kdb_io_write() then I think perhaps factoring this out into its > own function (in patch 1) is no long necessary. The character write > loop can go directly into this function. > Okay, will update it in the next version. -Sumit > > Daniel.
[PATCH 4/4] serial: core: drop redundant sysrq checks
The sysrq timestamp will never be set unless port->has_sysrq so drop the redundant checks that where added by commit 1997e9dfdc84 ("serial_core: Un-ifdef sysrq SUPPORT_SYSRQ"). Signed-off-by: Johan Hovold --- include/linux/serial_core.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/include/linux/serial_core.h b/include/linux/serial_core.h index a8a213b71553..2f6c3cfe2ae7 100644 --- a/include/linux/serial_core.h +++ b/include/linux/serial_core.h @@ -468,7 +468,7 @@ bool uart_try_toggle_sysrq(struct uart_port *port, unsigned int ch); static inline int uart_handle_sysrq_char(struct uart_port *port, unsigned int ch) { - if (!port->has_sysrq || !port->sysrq) + if (!port->sysrq) return 0; if (ch && time_before(jiffies, port->sysrq)) { @@ -487,7 +487,7 @@ static inline int uart_handle_sysrq_char(struct uart_port *port, unsigned int ch static inline int uart_prepare_sysrq_char(struct uart_port *port, unsigned int ch) { - if (!port->has_sysrq || !port->sysrq) + if (!port->sysrq) return 0; if (ch && time_before(jiffies, port->sysrq)) { -- 2.26.2
[PATCH 2/4] serial: core: fix broken sysrq port unlock
Commit d6e1935819db ("serial: core: Allow processing sysrq at port unlock time") worked around a circular locking dependency by adding helpers used to defer sysrq processing to when the port lock was released. A later commit unfortunately converted these inline helpers to exported functions despite the fact that the unlock helper was restoring irq flags, something which needs to be done in the same function that saved them (e.g. on SPARC). Fixes: 8e20fc391711 ("serial_core: Move sysrq functions from header file") Cc: stable # 5.6 Cc: Dmitry Safonov <0x7f454...@gmail.com> Signed-off-by: Johan Hovold --- drivers/tty/serial/serial_core.c | 19 --- include/linux/serial_core.h | 21 +++-- 2 files changed, 19 insertions(+), 21 deletions(-) diff --git a/drivers/tty/serial/serial_core.c b/drivers/tty/serial/serial_core.c index edfb7bc14bbf..f6cf9cc4ce69 100644 --- a/drivers/tty/serial/serial_core.c +++ b/drivers/tty/serial/serial_core.c @@ -3239,25 +3239,6 @@ int uart_prepare_sysrq_char(struct uart_port *port, unsigned int ch) } EXPORT_SYMBOL_GPL(uart_prepare_sysrq_char); -void uart_unlock_and_check_sysrq(struct uart_port *port, unsigned long irqflags) -{ - int sysrq_ch; - - if (!port->has_sysrq) { - spin_unlock_irqrestore(>lock, irqflags); - return; - } - - sysrq_ch = port->sysrq_ch; - port->sysrq_ch = 0; - - spin_unlock_irqrestore(>lock, irqflags); - - if (sysrq_ch) - handle_sysrq(sysrq_ch); -} -EXPORT_SYMBOL_GPL(uart_unlock_and_check_sysrq); - /* * We do the SysRQ and SAK checking like this... 
*/ diff --git a/include/linux/serial_core.h b/include/linux/serial_core.h index 1f4443db5474..858c5dd926ad 100644 --- a/include/linux/serial_core.h +++ b/include/linux/serial_core.h @@ -462,8 +462,25 @@ extern void uart_insert_char(struct uart_port *port, unsigned int status, extern int uart_handle_sysrq_char(struct uart_port *port, unsigned int ch); extern int uart_prepare_sysrq_char(struct uart_port *port, unsigned int ch); -extern void uart_unlock_and_check_sysrq(struct uart_port *port, - unsigned long irqflags); + +static inline void uart_unlock_and_check_sysrq(struct uart_port *port, unsigned long irqflags) +{ + int sysrq_ch; + + if (!port->has_sysrq) { + spin_unlock_irqrestore(>lock, irqflags); + return; + } + + sysrq_ch = port->sysrq_ch; + port->sysrq_ch = 0; + + spin_unlock_irqrestore(>lock, irqflags); + + if (sysrq_ch) + handle_sysrq(sysrq_ch); +} + extern int uart_handle_break(struct uart_port *port); /* -- 2.26.2
[PATCH 0/4] serial: core: fix up sysrq regressions
This series fixes a few regressions introduced by the recent sysrq rework that went into 5.6. The port unlock fix is tagged for stable, and the fix for the unnecessary per-character overhead probably could be as well although it is a bit more intrusive. Johan Johan Hovold (4): Revert "serial: core: Refactor uart_unlock_and_check_sysrq()" serial: core: fix broken sysrq port unlock serial: core: fix sysrq overhead regression serial: core: drop redundant sysrq checks drivers/tty/serial/serial_core.c | 96 +--- include/linux/serial_core.h | 105 +-- 2 files changed, 103 insertions(+), 98 deletions(-) -- 2.26.2
[PATCH 1/4] Revert "serial: core: Refactor uart_unlock_and_check_sysrq()"
This reverts commit da9a5aa3402db0ff3b57216d8dbf2478e1046cae. In order to ease backporting a fix for broken port unlock, revert this rewrite which was since added on top. The other sysrq helpers bail out early when sysrq is not enabled; it's better to keep that pattern here as well. Note that the __releases() attribute won't be needed after the follow-on fix either. Signed-off-by: Johan Hovold --- drivers/tty/serial/serial_core.c | 23 +-- include/linux/serial_core.h | 3 ++- 2 files changed, 15 insertions(+), 11 deletions(-) diff --git a/drivers/tty/serial/serial_core.c b/drivers/tty/serial/serial_core.c index 66a5e2faf57e..edfb7bc14bbf 100644 --- a/drivers/tty/serial/serial_core.c +++ b/drivers/tty/serial/serial_core.c @@ -3239,19 +3239,22 @@ int uart_prepare_sysrq_char(struct uart_port *port, unsigned int ch) } EXPORT_SYMBOL_GPL(uart_prepare_sysrq_char); -void uart_unlock_and_check_sysrq(struct uart_port *port, unsigned long flags) -__releases(>lock) +void uart_unlock_and_check_sysrq(struct uart_port *port, unsigned long irqflags) { - if (port->has_sysrq) { - int sysrq_ch = port->sysrq_ch; + int sysrq_ch; - port->sysrq_ch = 0; - spin_unlock_irqrestore(>lock, flags); - if (sysrq_ch) - handle_sysrq(sysrq_ch); - } else { - spin_unlock_irqrestore(>lock, flags); + if (!port->has_sysrq) { + spin_unlock_irqrestore(>lock, irqflags); + return; } + + sysrq_ch = port->sysrq_ch; + port->sysrq_ch = 0; + + spin_unlock_irqrestore(>lock, irqflags); + + if (sysrq_ch) + handle_sysrq(sysrq_ch); } EXPORT_SYMBOL_GPL(uart_unlock_and_check_sysrq); diff --git a/include/linux/serial_core.h b/include/linux/serial_core.h index 92f5eba86052..1f4443db5474 100644 --- a/include/linux/serial_core.h +++ b/include/linux/serial_core.h @@ -462,7 +462,8 @@ extern void uart_insert_char(struct uart_port *port, unsigned int status, extern int uart_handle_sysrq_char(struct uart_port *port, unsigned int ch); extern int uart_prepare_sysrq_char(struct uart_port *port, unsigned int ch); -extern void 
uart_unlock_and_check_sysrq(struct uart_port *port, unsigned long flags); +extern void uart_unlock_and_check_sysrq(struct uart_port *port, + unsigned long irqflags); extern int uart_handle_break(struct uart_port *port); /* -- 2.26.2
[PATCH 3/4] serial: core: fix sysrq overhead regression
Commit 8e20fc391711 ("serial_core: Move sysrq functions from header file") converted the inline sysrq helpers to exported functions which are now called for every received character and break signal also on systems without CONFIG_MAGIC_SYSRQ_SERIAL instead of being optimised away by the compiler. Inlining these helpers again also avoids the function call overhead when CONFIG_MAGIC_SYSRQ_SERIAL is enabled. Fixes: 8e20fc391711 ("serial_core: Move sysrq functions from header file") Cc: Dmitry Safonov <0x7f454...@gmail.com> Signed-off-by: Johan Hovold --- drivers/tty/serial/serial_core.c | 80 +- include/linux/serial_core.h | 85 ++-- 2 files changed, 84 insertions(+), 81 deletions(-) diff --git a/drivers/tty/serial/serial_core.c b/drivers/tty/serial/serial_core.c index f6cf9cc4ce69..a714402dcf4e 100644 --- a/drivers/tty/serial/serial_core.c +++ b/drivers/tty/serial/serial_core.c @@ -41,8 +41,6 @@ static struct lock_class_key port_lock_key; #define HIGH_BITS_OFFSET ((sizeof(long)-sizeof(int))*8) -#define SYSRQ_TIMEOUT (HZ * 5) - static void uart_change_speed(struct tty_struct *tty, struct uart_state *state, struct ktermios *old_termios); static void uart_wait_until_sent(struct tty_struct *tty, int timeout); @@ -3163,7 +3161,7 @@ static DECLARE_WORK(sysrq_enable_work, uart_sysrq_on); * Returns false if @ch is out of enabling sequence and should be * handled some other way, true if @ch was consumed. 
*/ -static bool uart_try_toggle_sysrq(struct uart_port *port, unsigned int ch) +bool uart_try_toggle_sysrq(struct uart_port *port, unsigned int ch) { int sysrq_toggle_seq_len = strlen(sysrq_toggle_seq); @@ -3186,83 +3184,9 @@ static bool uart_try_toggle_sysrq(struct uart_port *port, unsigned int ch) port->sysrq = 0; return true; } -#else -static inline bool uart_try_toggle_sysrq(struct uart_port *port, unsigned int ch) -{ - return false; -} +EXPORT_SYMBOL_GPL(uart_try_toggle_sysrq); #endif -int uart_handle_sysrq_char(struct uart_port *port, unsigned int ch) -{ - if (!IS_ENABLED(CONFIG_MAGIC_SYSRQ_SERIAL)) - return 0; - - if (!port->has_sysrq || !port->sysrq) - return 0; - - if (ch && time_before(jiffies, port->sysrq)) { - if (sysrq_mask()) { - handle_sysrq(ch); - port->sysrq = 0; - return 1; - } - if (uart_try_toggle_sysrq(port, ch)) - return 1; - } - port->sysrq = 0; - - return 0; -} -EXPORT_SYMBOL_GPL(uart_handle_sysrq_char); - -int uart_prepare_sysrq_char(struct uart_port *port, unsigned int ch) -{ - if (!IS_ENABLED(CONFIG_MAGIC_SYSRQ_SERIAL)) - return 0; - - if (!port->has_sysrq || !port->sysrq) - return 0; - - if (ch && time_before(jiffies, port->sysrq)) { - if (sysrq_mask()) { - port->sysrq_ch = ch; - port->sysrq = 0; - return 1; - } - if (uart_try_toggle_sysrq(port, ch)) - return 1; - } - port->sysrq = 0; - - return 0; -} -EXPORT_SYMBOL_GPL(uart_prepare_sysrq_char); - -/* - * We do the SysRQ and SAK checking like this... 
- */ -int uart_handle_break(struct uart_port *port) -{ - struct uart_state *state = port->state; - - if (port->handle_break) - port->handle_break(port); - - if (port->has_sysrq && uart_console(port)) { - if (!port->sysrq) { - port->sysrq = jiffies + SYSRQ_TIMEOUT; - return 1; - } - port->sysrq = 0; - } - - if (port->flags & UPF_SAK) - do_SAK(state->port.tty); - return 0; -} -EXPORT_SYMBOL_GPL(uart_handle_break); - EXPORT_SYMBOL(uart_write_wakeup); EXPORT_SYMBOL(uart_register_driver); EXPORT_SYMBOL(uart_unregister_driver); diff --git a/include/linux/serial_core.h b/include/linux/serial_core.h index 858c5dd926ad..a8a213b71553 100644 --- a/include/linux/serial_core.h +++ b/include/linux/serial_core.h @@ -460,8 +460,49 @@ extern void uart_handle_cts_change(struct uart_port *uport, extern void uart_insert_char(struct uart_port *port, unsigned int status, unsigned int overrun, unsigned int ch, unsigned int flag); -extern int uart_handle_sysrq_char(struct uart_port *port, unsigned int ch); -extern int uart_prepare_sysrq_char(struct uart_port *port, unsigned int ch); +#ifdef CONFIG_MAGIC_SYSRQ_SERIAL + +#define SYSRQ_TIMEOUT (HZ * 5) + +bool uart_try_toggle_sysrq(struct uart_port *port, unsigned int ch); + +static inline int uart_handle_sysrq_char(struct uart_port *port, unsigned int ch) +{ + if (!port->has_sysrq || !port->sysrq) + return 0; + + if (ch
Re: [PATCH 07/15] drm/amdgpu: use PCI_IRQ_MSI_TYPES where appropriate
On Tue, Jun 2, 2020 at 4:38 PM Ruhl, Michael J wrote: > >-Original Message- > >From: dri-devel On Behalf Of > >Piotr Stankiewicz > >Sent: Tuesday, June 2, 2020 5:21 AM > >To: Alex Deucher ; Christian König > >; David Zhou ; David > >Airlie ; Daniel Vetter > >Cc: Stankiewicz, Piotr ; dri- > >de...@lists.freedesktop.org; amd-...@lists.freedesktop.org; linux- > >ker...@vger.kernel.org > >Subject: [PATCH 07/15] drm/amdgpu: use PCI_IRQ_MSI_TYPES where > >appropriate ... > > int nvec = pci_msix_vec_count(adev->pdev); > > unsigned int flags; > > > >- if (nvec <= 0) { > >+ if (nvec > 0) > >+ flags = PCI_IRQ_MSI_TYPES; > >+ else > > flags = PCI_IRQ_MSI; > >- } else { > >- flags = PCI_IRQ_MSI | PCI_IRQ_MSIX; > >- } > > Minor nit: > > Is it really necessary to do this check? Can flags just > be set? > > I.e.: > flags = PCI_IRQ_MSI_TYPES; > > pci_alloc_irq_vectors() tries stuff in order. If MSIX is not available, > it will try MSI. That's also what I proposed earlier. But I also suggested waiting for the AMD people to confirm that neither pci_msix_vec_count() nor the flags variable is needed, so that we can supply PCI_IRQ_MSI_TYPES directly to the call below. > > /* we only need one vector */ > > nvec = pci_alloc_irq_vectors(adev->pdev, 1, 1, flags); -- With Best Regards, Andy Shevchenko
Re: [PATCH] spi: sprd: call pm_runtime_put if pm_runtime_get_sync fails
Hi, On Tue, Jun 2, 2020 at 1:20 PM Navid Emamdoost wrote: > > Call to pm_runtime_get_sync increments counter even in case of > failure leading to incorrect ref count. > Call pm_runtime_put_noidle if pm_runtime_get_sync fails. > > Signed-off-by: Navid Emamdoost Looks good to me. Thanks Reviewed-by: Baolin Wang > --- > drivers/spi/spi-sprd.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/drivers/spi/spi-sprd.c b/drivers/spi/spi-sprd.c > index 6678f1cbc566..860032af4b98 100644 > --- a/drivers/spi/spi-sprd.c > +++ b/drivers/spi/spi-sprd.c > @@ -1018,6 +1018,7 @@ static int sprd_spi_remove(struct platform_device *pdev) > ret = pm_runtime_get_sync(ss->dev); > if (ret < 0) { > dev_err(ss->dev, "failed to resume SPI controller\n"); > + pm_runtime_put_noidle(&pdev->dev); > return ret; > } > > -- > 2.17.1 > -- Baolin Wang
[PATCH v4 1/6] arm64: Detect the ARMv8.4 TTL feature
From: Marc Zyngier In order to reduce the cost of TLB invalidation, the ARMv8.4 TTL feature allows TLBs to be issued with a level allowing for quicker invalidation. Let's detect the feature for now. Further patches will implement its actual usage. Signed-off-by: Marc Zyngier Signed-off-by: Zhenyu Ye Reviewed-by: Catalin Marinas --- arch/arm64/include/asm/cpucaps.h | 3 ++- arch/arm64/include/asm/sysreg.h | 1 + arch/arm64/kernel/cpufeature.c | 11 +++ 3 files changed, 14 insertions(+), 1 deletion(-) diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h index 8eb5a088ae65..cabb0c49a1d1 100644 --- a/arch/arm64/include/asm/cpucaps.h +++ b/arch/arm64/include/asm/cpucaps.h @@ -61,7 +61,8 @@ #define ARM64_HAS_AMU_EXTN 51 #define ARM64_HAS_ADDRESS_AUTH 52 #define ARM64_HAS_GENERIC_AUTH 53 +#define ARM64_HAS_ARMv8_4_TTL 54 -#define ARM64_NCAPS 54 +#define ARM64_NCAPS 55 #endif /* __ASM_CPUCAPS_H */ diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h index c4ac0ac25a00..477d84ba1056 100644 --- a/arch/arm64/include/asm/sysreg.h +++ b/arch/arm64/include/asm/sysreg.h @@ -725,6 +725,7 @@ /* id_aa64mmfr2 */ #define ID_AA64MMFR2_E0PD_SHIFT 60 +#define ID_AA64MMFR2_TTL_SHIFT 48 #define ID_AA64MMFR2_FWB_SHIFT 40 #define ID_AA64MMFR2_AT_SHIFT 32 #define ID_AA64MMFR2_LVA_SHIFT 16 diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c index 9fac745aa7bb..d993dc6dc7d5 100644 --- a/arch/arm64/kernel/cpufeature.c +++ b/arch/arm64/kernel/cpufeature.c @@ -244,6 +244,7 @@ static const struct arm64_ftr_bits ftr_id_aa64mmfr1[] = { static const struct arm64_ftr_bits ftr_id_aa64mmfr2[] = { ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64MMFR2_E0PD_SHIFT, 4, 0), + ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64MMFR2_TTL_SHIFT, 4, 0), ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64MMFR2_FWB_SHIFT, 4, 0), ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64MMFR2_AT_SHIFT,
4, 0), ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64MMFR2_LVA_SHIFT, 4, 0), @@ -1622,6 +1623,16 @@ static const struct arm64_cpu_capabilities arm64_features[] = { .matches = has_cpuid_feature, .cpu_enable = cpu_has_fwb, }, + { + .desc = "ARMv8.4 Translation Table Level", + .type = ARM64_CPUCAP_SYSTEM_FEATURE, + .capability = ARM64_HAS_ARMv8_4_TTL, + .sys_reg = SYS_ID_AA64MMFR2_EL1, + .sign = FTR_UNSIGNED, + .field_pos = ID_AA64MMFR2_TTL_SHIFT, + .min_field_value = 1, + .matches = has_cpuid_feature, + }, #ifdef CONFIG_ARM64_HW_AFDBM { /* -- 2.19.1
[PATCH v4 3/6] arm64: Add tlbi_user_level TLB invalidation helper
Add a level-hinted parameter to __tlbi_user, which only gets used if ARMv8.4-TTL gets detected. ARMv8.4-TTL provides the TTL field in tlbi instruction to indicate the level of translation table walk holding the leaf entry for the address that is being invalidated. This patch sets the default level value of flush_tlb_range() to 0, which will be updated in future patches. It also sets the ttl value of flush_tlb_page_nosync() to 3, because it is only called to flush a single pte page. Signed-off-by: Zhenyu Ye --- arch/arm64/include/asm/tlbflush.h | 19 +-- 1 file changed, 13 insertions(+), 6 deletions(-) diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h index 8adbd6fd8489..bfb58e62c127 100644 --- a/arch/arm64/include/asm/tlbflush.h +++ b/arch/arm64/include/asm/tlbflush.h @@ -88,6 +88,12 @@ __tlbi(op, arg); \ } while (0) +#define __tlbi_user_level(op, arg, level) do { \ + if (arm64_kernel_unmapped_at_el0()) \ + __tlbi_level(op, (arg | USER_ASID_FLAG), level);\ +} while (0) + + /* * TLB Invalidation * @@ -189,8 +195,9 @@ static inline void flush_tlb_page_nosync(struct vm_area_struct *vma, unsigned long addr = __TLBI_VADDR(uaddr, ASID(vma->vm_mm)); dsb(ishst); - __tlbi(vale1is, addr); - __tlbi_user(vale1is, addr); + /* This function is only called on a small page */ + __tlbi_level(vale1is, addr, 3); + __tlbi_user_level(vale1is, addr, 3); } static inline void flush_tlb_page(struct vm_area_struct *vma, @@ -230,11 +237,11 @@ static inline void __flush_tlb_range(struct vm_area_struct *vma, dsb(ishst); for (addr = start; addr < end; addr += stride) { if (last_level) { - __tlbi(vale1is, addr); - __tlbi_user(vale1is, addr); + __tlbi_level(vale1is, addr, 0); + __tlbi_user_level(vale1is, addr, 0); } else { - __tlbi(vae1is, addr); - __tlbi_user(vae1is, addr); + __tlbi_level(vae1is, addr, 0); + __tlbi_user_level(vae1is, addr, 0); } } dsb(ish); -- 2.19.1
[PATCH v4 5/6] arm64: tlb: Set the TTL field in flush_tlb_range
This patch uses the cleared_* in struct mmu_gather to set the TTL field in flush_tlb_range(). Signed-off-by: Zhenyu Ye Reviewed-by: Catalin Marinas --- arch/arm64/include/asm/tlb.h | 29 - arch/arm64/include/asm/tlbflush.h | 14 -- 2 files changed, 36 insertions(+), 7 deletions(-) diff --git a/arch/arm64/include/asm/tlb.h b/arch/arm64/include/asm/tlb.h index b76df828e6b7..61c97d3b58c7 100644 --- a/arch/arm64/include/asm/tlb.h +++ b/arch/arm64/include/asm/tlb.h @@ -21,11 +21,37 @@ static void tlb_flush(struct mmu_gather *tlb); #include <asm-generic/tlb.h> +/* + * get the tlbi levels in arm64. Default value is 0 if more than one + * of cleared_* is set or neither is set. + * Arm64 doesn't support p4ds now. + */ +static inline int tlb_get_level(struct mmu_gather *tlb) +{ + if (tlb->cleared_ptes && !(tlb->cleared_pmds || + tlb->cleared_puds || + tlb->cleared_p4ds)) + return 3; + + if (tlb->cleared_pmds && !(tlb->cleared_ptes || + tlb->cleared_puds || + tlb->cleared_p4ds)) + return 2; + + if (tlb->cleared_puds && !(tlb->cleared_ptes || + tlb->cleared_pmds || + tlb->cleared_p4ds)) + return 1; + + return 0; +} + static inline void tlb_flush(struct mmu_gather *tlb) { struct vm_area_struct vma = TLB_FLUSH_VMA(tlb->mm, 0); bool last_level = !tlb->freed_tables; unsigned long stride = tlb_get_unmap_size(tlb); + int tlb_level = tlb_get_level(tlb); /* * If we're tearing down the address space then we only care about @@ -38,7 +64,8 @@ static inline void tlb_flush(struct mmu_gather *tlb) return; } - __flush_tlb_range(&vma, tlb->start, tlb->end, stride, last_level); + __flush_tlb_range(&vma, tlb->start, tlb->end, stride, + last_level, tlb_level); } static inline void __pte_free_tlb(struct mmu_gather *tlb, pgtable_t pte, diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h index bfb58e62c127..84cb98b60b7b 100644 --- a/arch/arm64/include/asm/tlbflush.h +++ b/arch/arm64/include/asm/tlbflush.h @@ -215,7 +215,8 @@ static inline void flush_tlb_page(struct vm_area_struct *vma, static
inline void __flush_tlb_range(struct vm_area_struct *vma, unsigned long start, unsigned long end, -unsigned long stride, bool last_level) +unsigned long stride, bool last_level, +int tlb_level) { unsigned long asid = ASID(vma->vm_mm); unsigned long addr; @@ -237,11 +238,11 @@ static inline void __flush_tlb_range(struct vm_area_struct *vma, dsb(ishst); for (addr = start; addr < end; addr += stride) { if (last_level) { - __tlbi_level(vale1is, addr, 0); - __tlbi_user_level(vale1is, addr, 0); + __tlbi_level(vale1is, addr, tlb_level); + __tlbi_user_level(vale1is, addr, tlb_level); } else { - __tlbi_level(vae1is, addr, 0); - __tlbi_user_level(vae1is, addr, 0); + __tlbi_level(vae1is, addr, tlb_level); + __tlbi_user_level(vae1is, addr, tlb_level); } } dsb(ish); @@ -253,8 +254,9 @@ static inline void flush_tlb_range(struct vm_area_struct *vma, /* * We cannot use leaf-only invalidation here, since we may be invalidating * table entries as part of collapsing hugepages or moving page tables. +* Set the tlb_level to 0 because we can not get enough information here. */ - __flush_tlb_range(vma, start, end, PAGE_SIZE, false); + __flush_tlb_range(vma, start, end, PAGE_SIZE, false, 0); } static inline void flush_tlb_kernel_range(unsigned long start, unsigned long end) -- 2.19.1
[PATCH v4 4/6] tlb: mmu_gather: add tlb_flush_*_range APIs
From: "Peter Zijlstra (Intel)" tlb_flush_{pte|pmd|pud|p4d}_range() adjust the tlb->start and tlb->end, then set corresponding cleared_*. Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Zhenyu Ye Acked-by: Catalin Marinas --- include/asm-generic/tlb.h | 55 --- 1 file changed, 40 insertions(+), 15 deletions(-) diff --git a/include/asm-generic/tlb.h b/include/asm-generic/tlb.h index 3f1649a8cf55..ef75ec86f865 100644 --- a/include/asm-generic/tlb.h +++ b/include/asm-generic/tlb.h @@ -512,6 +512,38 @@ static inline void tlb_end_vma(struct mmu_gather *tlb, struct vm_area_struct *vm } #endif +/* + * tlb_flush_{pte|pmd|pud|p4d}_range() adjust the tlb->start and tlb->end, + * and set corresponding cleared_*. + */ +static inline void tlb_flush_pte_range(struct mmu_gather *tlb, +unsigned long address, unsigned long size) +{ + __tlb_adjust_range(tlb, address, size); + tlb->cleared_ptes = 1; +} + +static inline void tlb_flush_pmd_range(struct mmu_gather *tlb, +unsigned long address, unsigned long size) +{ + __tlb_adjust_range(tlb, address, size); + tlb->cleared_pmds = 1; +} + +static inline void tlb_flush_pud_range(struct mmu_gather *tlb, +unsigned long address, unsigned long size) +{ + __tlb_adjust_range(tlb, address, size); + tlb->cleared_puds = 1; +} + +static inline void tlb_flush_p4d_range(struct mmu_gather *tlb, +unsigned long address, unsigned long size) +{ + __tlb_adjust_range(tlb, address, size); + tlb->cleared_p4ds = 1; +} + #ifndef __tlb_remove_tlb_entry #define __tlb_remove_tlb_entry(tlb, ptep, address) do { } while (0) #endif @@ -525,19 +557,17 @@ static inline void tlb_end_vma(struct mmu_gather *tlb, struct vm_area_struct *vm */ #define tlb_remove_tlb_entry(tlb, ptep, address) \ do {\ - __tlb_adjust_range(tlb, address, PAGE_SIZE);\ - tlb->cleared_ptes = 1; \ + tlb_flush_pte_range(tlb, address, PAGE_SIZE); \ __tlb_remove_tlb_entry(tlb, ptep, address); \ } while (0) #define tlb_remove_huge_tlb_entry(h, tlb, ptep, address) \ do {\ unsigned long _sz = 
huge_page_size(h); \ - __tlb_adjust_range(tlb, address, _sz); \ if (_sz == PMD_SIZE)\ - tlb->cleared_pmds = 1; \ + tlb_flush_pmd_range(tlb, address, _sz); \ else if (_sz == PUD_SIZE) \ - tlb->cleared_puds = 1; \ + tlb_flush_pud_range(tlb, address, _sz); \ __tlb_remove_tlb_entry(tlb, ptep, address); \ } while (0) @@ -551,8 +581,7 @@ static inline void tlb_end_vma(struct mmu_gather *tlb, struct vm_area_struct *vm #define tlb_remove_pmd_tlb_entry(tlb, pmdp, address) \ do {\ - __tlb_adjust_range(tlb, address, HPAGE_PMD_SIZE); \ - tlb->cleared_pmds = 1; \ + tlb_flush_pmd_range(tlb, address, HPAGE_PMD_SIZE); \ __tlb_remove_pmd_tlb_entry(tlb, pmdp, address); \ } while (0) @@ -566,8 +595,7 @@ static inline void tlb_end_vma(struct mmu_gather *tlb, struct vm_area_struct *vm #define tlb_remove_pud_tlb_entry(tlb, pudp, address) \ do {\ - __tlb_adjust_range(tlb, address, HPAGE_PUD_SIZE); \ - tlb->cleared_puds = 1; \ + tlb_flush_pud_range(tlb, address, HPAGE_PUD_SIZE); \ __tlb_remove_pud_tlb_entry(tlb, pudp, address); \ } while (0) @@ -592,9 +620,8 @@ static inline void tlb_end_vma(struct mmu_gather *tlb, struct vm_area_struct *vm #ifndef pte_free_tlb #define pte_free_tlb(tlb, ptep, address) \ do {\ - __tlb_adjust_range(tlb, address, PAGE_SIZE);\ + tlb_flush_pmd_range(tlb, address, PAGE_SIZE); \ tlb->freed_tables = 1; \ - tlb->cleared_pmds = 1; \ __pte_free_tlb(tlb, ptep, address); \ } while (0) #endif @@ -602,9 +629,8 @@ static inline void tlb_end_vma(struct mmu_gather *tlb, struct vm_area_struct *vm #ifndef pmd_free_tlb #define pmd_free_tlb(tlb, pmdp, address)
[PATCH v4 6/6] arm64: tlb: Set the TTL field in flush_*_tlb_range
This patch implements flush_{pmd|pud}_tlb_range() in arm64 by calling __flush_tlb_range() with the corresponding stride and tlb_level values. Signed-off-by: Zhenyu Ye --- arch/arm64/include/asm/pgtable.h | 10 ++ 1 file changed, 10 insertions(+) diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h index 538c85e62f86..bc59814eda64 100644 --- a/arch/arm64/include/asm/pgtable.h +++ b/arch/arm64/include/asm/pgtable.h @@ -40,6 +40,16 @@ extern void __pmd_error(const char *file, int line, unsigned long val); extern void __pud_error(const char *file, int line, unsigned long val); extern void __pgd_error(const char *file, int line, unsigned long val); +#ifdef CONFIG_TRANSPARENT_HUGEPAGE +#define __HAVE_ARCH_FLUSH_PMD_TLB_RANGE + +/* Set stride and tlb_level in flush_*_tlb_range */ +#define flush_pmd_tlb_range(vma, addr, end)\ + __flush_tlb_range(vma, addr, end, PMD_SIZE, false, 2) +#define flush_pud_tlb_range(vma, addr, end)\ + __flush_tlb_range(vma, addr, end, PUD_SIZE, false, 1) +#endif /* CONFIG_TRANSPARENT_HUGEPAGE */ + /* * ZERO_PAGE is a global shared page that is always zero: used * for zero-mapped memory areas etc.. -- 2.19.1
[PATCH v4 2/6] arm64: Add level-hinted TLB invalidation helper
From: Marc Zyngier Add a level-hinted TLB invalidation helper that only gets used if ARMv8.4-TTL gets detected. Signed-off-by: Marc Zyngier Signed-off-by: Zhenyu Ye Reviewed-by: Catalin Marinas --- arch/arm64/include/asm/tlbflush.h | 29 + 1 file changed, 29 insertions(+) diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h index bc3949064725..8adbd6fd8489 100644 --- a/arch/arm64/include/asm/tlbflush.h +++ b/arch/arm64/include/asm/tlbflush.h @@ -10,6 +10,7 @@ #ifndef __ASSEMBLY__ +#include <linux/bitfield.h> #include <linux/mm_types.h> #include <linux/sched.h> #include <asm/cputype.h> @@ -59,6 +60,34 @@ __ta; \ }) +#define TLBI_TTL_MASK GENMASK_ULL(47, 44) + +#define __tlbi_level(op, addr, level) do { \ + u64 arg = addr; \ + \ + if (cpus_have_const_cap(ARM64_HAS_ARMv8_4_TTL) && \ + level) {\ + u64 ttl = level;\ + \ + switch (PAGE_SIZE) {\ + case SZ_4K: \ + ttl |= 1 << 2; \ + break; \ + case SZ_16K:\ + ttl |= 2 << 2; \ + break; \ + case SZ_64K:\ + ttl |= 3 << 2; \ + break; \ + } \ + \ + arg &= ~TLBI_TTL_MASK; \ + arg |= FIELD_PREP(TLBI_TTL_MASK, ttl); \ + } \ + \ + __tlbi(op, arg); \ +} while (0) + /* * TLB Invalidation * -- 2.19.1
Re: [PATCH 1/7] drm/rockchip: prepare common code for cdns and rk dpi/dp driver
Hi Sandor Yu On Mon, 1 Jun 2020 at 07:29, wrote: > > From: Sandor Yu > > - Extracted common fields from cdn_dp_device to a new cdns_mhdp_device > structure which will be used by two separate drivers later on. > - Moved some datatypes (audio_format, audio_info, vic_pxl_encoding_format, > video_info) from cdn-dp-core.c to cdn-dp-reg.h. > - Changed prefixes from cdn_dp to cdns_mhdp > cdn -> cdns to match the other Cadence's drivers > dp -> mhdp to distinguish it from a "just a DP" as the IP underneath > this registers map can be a HDMI (which is internally different, > but the interface for commands, events is pretty much the same). > - Modified cdn-dp-core.c to use the new driver structure and new function > names. > - writel and readl are replaced by cdns_mhdp_bus_write and > cdns_mhdp_bus_read. > The high-level idea is great - split, refactor and reuse the existing drivers. Although looking at the patches themselves - they seem to be doing multiple things at once. As indicated by the extensive list in the commit log. I would suggest splitting those up a bit, roughly in line with the itemisation as per the commit message. Here is one hand wavy way to chunk this patch: 1) use put_unaligned* 2) 'use local variable dev' style of changes (as seen in cdn_dp_clk_enable) 3) add writel/readl wrappers 4) hookup struct cdns_mhdp_device, keep dp->mhdp detail internal. The cdn-dp-reg.h function names/signatures will stay the same. 5) finalize the helpers - use mhdp directly, rename HTH Emil Examples: 4) static int cdn_dp_mailbox_read(struct cdn_dp_device *dp) { + struct cdns_mhdp_device *mhdp = dp->mhdp; int val, ret; - ret = readx_poll_timeout(readl, dp->regs + MAILBOX_EMPTY_ADDR, + ret = readx_poll_timeout(readl, mhdp->regs_base + MAILBOX_EMPTY_ADDR, ...
return fancy_readl(dp, MAILBOX0_RD_DATA) & 0xff; } 5) -static int cdn_dp_mailbox_read(struct cdn_dp_device *dp) +static int mhdp_mailbox_read(struct cdns_mhdp_device *mhdp) { - struct cdns_mhdp_device *mhdp = dp->mhdp; int val, ret; ... - return fancy_readl(dp, MAILBOX0_RD_DATA) & 0xff; + return cdns_mhdp_bus_read(mhdp, MAILBOX0_RD_DATA) & 0xff; }
[PATCH v4 0/6] arm64: tlb: add support for TTL feature
In order to reduce the cost of TLB invalidation, ARMv8.4 provides the TTL field in TLBI instruction. The TTL field indicates the level of page table walk holding the leaf entry for the address being invalidated. This series provides support for this feature. When ARMv8.4-TTL is implemented, the operand for TLBIs looks like below: * +--+---+--+ * | ASID | TTL |BADDR | * +--+---+--+ * |63 48|47 44|43 0| See patches for details, Thanks. -- ChangeList: v4: implement flush_*_tlb_range only on arm64. v3: minor changes: reduce the indentation levels of __tlbi_level(). v2: rebase series on Linux 5.7-rc1 and simplify the code implementation. v1: add support for TTL feature in arm64. Marc Zyngier (2): arm64: Detect the ARMv8.4 TTL feature arm64: Add level-hinted TLB invalidation helper Peter Zijlstra (Intel) (1): tlb: mmu_gather: add tlb_flush_*_range APIs Zhenyu Ye (3): arm64: Add tlbi_user_level TLB invalidation helper arm64: tlb: Set the TTL field in flush_tlb_range arm64: tlb: Set the TTL field in flush_*_tlb_range arch/arm64/include/asm/cpucaps.h | 3 +- arch/arm64/include/asm/pgtable.h | 10 ++ arch/arm64/include/asm/sysreg.h | 1 + arch/arm64/include/asm/tlb.h | 29 +++- arch/arm64/include/asm/tlbflush.h | 54 +- arch/arm64/kernel/cpufeature.c| 11 +++ include/asm-generic/tlb.h | 55 ++- 7 files changed, 138 insertions(+), 25 deletions(-) -- 2.19.1
Re: linux-next: manual merge of the hyperv tree with the kvm tree
On Tue, Jun 02, 2020 at 05:18:02PM +1000, Stephen Rothwell wrote: > Hi all, > > Today's linux-next merge of the hyperv tree got a conflict in: > > arch/x86/include/asm/hyperv-tlfs.h > > between commit: > > 22ad0026d097 ("x86/hyper-v: Add synthetic debugger definitions") Paolo, as far as I can tell you merged that series a few days ago. Do you plan to submit it to Linus in this merge window? How do you want to proceed to fix the conflict? Wei. > from the kvm tree and commit: > > c55a844f46f9 ("x86/hyperv: Split hyperv-tlfs.h into arch dependent and > independent files") > > from the hyperv tree. > > I fixed it up (I removed the conflicting bits from that file and added > the following patch) and can carry the fix as necessary. This is now fixed > as far as linux-next is concerned, but any non trivial conflicts should > be mentioned to your upstream maintainer when your tree is submitted for > merging. You may also want to consider cooperating with the maintainer > of the conflicting tree to minimise any particularly complex conflicts.
> > From: Stephen Rothwell > Date: Tue, 2 Jun 2020 17:15:49 +1000 > Subject: [PATCH] x86/hyperv: merge fix for hyperv-tlfs.h split > > Signed-off-by: Stephen Rothwell > --- > include/asm-generic/hyperv-tlfs.h | 4 ++++ > 1 file changed, 4 insertions(+) > > diff --git a/include/asm-generic/hyperv-tlfs.h > b/include/asm-generic/hyperv-tlfs.h > index 262fae9526b1..e73a11850055 100644 > --- a/include/asm-generic/hyperv-tlfs.h > +++ b/include/asm-generic/hyperv-tlfs.h > @@ -145,6 +145,9 @@ struct ms_hyperv_tsc_page { > #define HVCALL_SET_VP_REGISTERS 0x0051 > #define HVCALL_POST_MESSAGE 0x005c > #define HVCALL_SIGNAL_EVENT 0x005d > +#define HVCALL_POST_DEBUG_DATA 0x0069 > +#define HVCALL_RETRIEVE_DEBUG_DATA 0x006a > +#define HVCALL_RESET_DEBUG_SESSION 0x006b > #define HVCALL_RETARGET_INTERRUPT 0x007e > #define HVCALL_FLUSH_GUEST_PHYSICAL_ADDRESS_SPACE 0x00af > #define HVCALL_FLUSH_GUEST_PHYSICAL_ADDRESS_LIST 0x00b0 > @@ -177,6 +180,7 @@ enum HV_GENERIC_SET_FORMAT { > #define HV_STATUS_INVALID_HYPERCALL_INPUT 3 > #define HV_STATUS_INVALID_ALIGNMENT 4 > #define HV_STATUS_INVALID_PARAMETER 5 > +#define HV_STATUS_OPERATION_DENIED 8 > #define HV_STATUS_INSUFFICIENT_MEMORY 11 > #define HV_STATUS_INVALID_PORT_ID 17 > #define HV_STATUS_INVALID_CONNECTION_ID 18 > -- > 2.26.2 > > -- > Cheers, > Stephen Rothwell
[PATCH v2] crypto: hisilicon - allow smaller reads in debugfs
Originally this code rejected any read less than 256 bytes. There is no need for this artificial limit. We should just use the normal helper functions to read a string from the kernel. Signed-off-by: Dan Carpenter --- v2: Use simple_read_from_buffer(). The v1 was slightly half arsed because I left the original check for: if (*pos) return 0; So it could result in partial reads. The new code means that if you want to read the buffer one byte at a time, that's fine or if you want to read it in one 256 byte chunk that's also fine. Plus it deletes 21 lines of code and is a lot cleaner. drivers/crypto/hisilicon/qm.c | 33 ++--- 1 file changed, 6 insertions(+), 27 deletions(-) diff --git a/drivers/crypto/hisilicon/qm.c b/drivers/crypto/hisilicon/qm.c index 9bb263cec6c30..13ccb9e29a2e1 100644 --- a/drivers/crypto/hisilicon/qm.c +++ b/drivers/crypto/hisilicon/qm.c @@ -1064,19 +1064,10 @@ static ssize_t qm_cmd_read(struct file *filp, char __user *buffer, char buf[QM_DBG_READ_LEN]; int len; - if (*pos) - return 0; - - if (count < QM_DBG_READ_LEN) - return -ENOSPC; + len = scnprintf(buf, QM_DBG_READ_LEN, "%s\n", + "Please echo help to cmd to get help information"); - len = snprintf(buf, QM_DBG_READ_LEN, "%s\n", - "Please echo help to cmd to get help information"); - - if (copy_to_user(buffer, buf, len)) - return -EFAULT; - - return (*pos = len); + return simple_read_from_buffer(buffer, count, pos, buf, len); } static void *qm_ctx_alloc(struct hisi_qm *qm, size_t ctx_size, @@ -2691,24 +2682,12 @@ static ssize_t qm_status_read(struct file *filp, char __user *buffer, { struct hisi_qm *qm = filp->private_data; char buf[QM_DBG_READ_LEN]; - int val, cp_len, len; - - if (*pos) - return 0; - - if (count < QM_DBG_READ_LEN) - return -ENOSPC; + int val, len; val = atomic_read(&qm->status.flags); - len = snprintf(buf, QM_DBG_READ_LEN, "%s\n", qm_s[val]); - if (!len) - return -EFAULT; - - cp_len = copy_to_user(buffer, buf, len); - if (cp_len) - return -EFAULT; + len = scnprintf(buf,
QM_DBG_READ_LEN, "%s\n", qm_s[val]); - return (*pos = len); + return simple_read_from_buffer(buffer, count, pos, buf, len); } static const struct file_operations qm_status_fops = { -- 2.26.2
Re: [PATCH] ARM: dts: AM33xx-l4: add gpio-ranges
* Grygorii Strashko [200602 13:44]: > > > On 02/06/2020 16:14, Drew Fustini wrote: > > Add gpio-ranges properties to the gpio controller nodes. > > > > These gpio-ranges were created based on "Table 9-10. CONTROL_MODULE > > REGISTERS" in the "AM335x Technical Reference Manual" [0] and "Table > > 4-2. Pin Attributes" in the "AM335x Sitara Processor datasheet" [1]. > > A csv file with this data is available for reference [2]. > > It will be good if you can explain not only "what" is changed, but > also "why" it's needed in commit message. Also, please check (again) that this is the same for all the am3 variants. For omap3, we had different pad assignments even between SoC revisions. Different pad routings should be easy to deal with in the dts if needed though. Regards, Tony
[PATCH] workqueue: ensure all flush_work() completed when being destroyed
In the old days, worker threads were not shared among different workqueues, and destroy_workqueue() used kthread_stop() to destroy all workers before going on to destroy the workqueue structures. kthread_stop() could ensure that the scheduled (worker->scheduled) work items, and the linked work items queued by flush_work(), were completed. For a workqueue to be completed/unused by wq users means that all queued works have completed and all flush_work() calls have returned, so the workqueue is legitimate to be passed to destroy_workqueue(). But e22bee782b3b ("workqueue: implement concurrency managed dynamic worker pool") made worker pools and workers shared among different workqueues, and kthread_stop() is no longer used to sync the completion of the work items. destroy_workqueue() uses drain_workqueue() to drain user work items, but the internal work items queued by flush_work() are not drained because they don't have colors. So problems may occur when wq_barrier_func() does complete(&barr->done) and the woken-up wq-user code does destroy_workqueue(): destroy_workqueue() can be scheduled earlier than the process_one_work() that does the put_pwq(), so the sanity check in destroy_workqueue() sees the not-yet-put pwq->refcnt and complains with a false positive. The problem could be fixed by removing WORK_NO_COLOR and making the internal work item queued by flush_work() inherit the color of the work item being flushed, but that would revert the design and the benefits of WORK_NO_COLOR. Instead, the patch simply adds an atomic counter for in-flight flush_work() and a completion for destroy_workqueue() to wait on. Signed-off-by: Lai Jiangshan --- Changed from V1: Change from flush_no_color based mechanism to atomic+completion based as TJ suggested.
 kernel/workqueue.c | 21 ++++++++++++++++++++-
 1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 1921c982f920..71272beb8e01 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -253,6 +253,9 @@ struct workqueue_struct {
 	int			nr_drainers;	/* WQ: drain in progress */
 	int			saved_max_active; /* WQ: saved pwq max_active */
 
+	atomic_t		nr_flush_work;	/* flush work in progress */
+	struct completion	flush_work_done; /* sync flush_work() */
+
 	struct workqueue_attrs	*unbound_attrs;	/* PW: only for unbound wqs */
 	struct pool_workqueue	*dfl_pwq;	/* PW: only for unbound wqs */
 
@@ -1154,6 +1157,12 @@ static void pwq_activate_first_delayed(struct pool_workqueue *pwq)
 	pwq_activate_delayed_work(work);
 }
 
+static void dec_nr_in_flight_flush_work(struct workqueue_struct *wq)
+{
+	if (atomic_dec_and_test(&wq->nr_flush_work))
+		complete(&wq->flush_work_done);
+}
+
 /**
  * pwq_dec_nr_in_flight - decrement pwq's nr_in_flight
  * @pwq: pwq of interest
@@ -1168,8 +1177,10 @@ static void pwq_dec_nr_in_flight(struct pool_workqueue *pwq, int color)
 {
 	/* uncolored work items don't participate in flushing or nr_active */
-	if (color == WORK_NO_COLOR)
+	if (color == WORK_NO_COLOR) {
+		dec_nr_in_flight_flush_work(pwq->wq);
 		goto out_put;
+	}
 
 	pwq->nr_in_flight[color]--;
@@ -2682,6 +2693,7 @@ static void insert_wq_barrier(struct pool_workqueue *pwq,
 	}
 
 	debug_work_activate(&barr->work);
+	atomic_inc(&pwq->wq->nr_flush_work);
 	insert_work(pwq, &barr->work, head,
 		    work_color_to_flags(WORK_NO_COLOR) | linked);
 }
@@ -4278,6 +4290,9 @@ struct workqueue_struct *alloc_workqueue(const char *fmt,
 	wq_init_lockdep(wq);
 	INIT_LIST_HEAD(&wq->list);
 
+	atomic_set(&wq->nr_flush_work, 1);
+	init_completion(&wq->flush_work_done);
+
 	if (alloc_and_link_pwqs(wq) < 0)
 		goto err_unreg_lockdep;
@@ -4354,6 +4369,10 @@ void destroy_workqueue(struct workqueue_struct *wq)
 	/* drain it before proceeding with destruction */
 	drain_workqueue(wq);
 
+	/* flush all uncompleted internal work items queued by flush_work() */
+	dec_nr_in_flight_flush_work(wq);
+	wait_for_completion(&wq->flush_work_done);
+
 	/* kill rescuer, if sanity checks fail, leave it w/o rescuer */
 	if (wq->rescuer) {
 		struct worker *rescuer = wq->rescuer;
--
2.20.1
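The counter scheme the patch uses is the classic "biased counter + completion" idiom: the counter starts at 1, so the completion can only fire after destroy_workqueue() drops that initial reference, and an in-flight flush taken before destroy always holds the counter above zero. A minimal single-threaded userspace sketch of the idea (all names here are illustrative stand-ins, not the kernel API, and the bool stands in for the completion that destroy would block on):

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct fake_wq {
    atomic_int nr_flush_work;   /* stands in for wq->nr_flush_work */
    bool flush_work_done;       /* stands in for struct completion */
};

static void fake_wq_init(struct fake_wq *wq)
{
    /* bias of 1: held on behalf of destroy_workqueue() */
    atomic_init(&wq->nr_flush_work, 1);
    wq->flush_work_done = false;
}

/* insert_wq_barrier(): each flush_work() puts one barrier in flight */
static void fake_flush_begin(struct fake_wq *wq)
{
    atomic_fetch_add(&wq->nr_flush_work, 1);
}

/* pwq_dec_nr_in_flight() for WORK_NO_COLOR: one barrier completed */
static void fake_flush_end(struct fake_wq *wq)
{
    if (atomic_fetch_sub(&wq->nr_flush_work, 1) == 1)
        wq->flush_work_done = true;   /* complete(&wq->flush_work_done) */
}

/* destroy_workqueue(): drop the bias; the real code then waits for
 * the completion, which fires once all barriers have finished. */
static void fake_destroy(struct fake_wq *wq)
{
    fake_flush_end(wq);
}
```

Because the counter is biased, even a destroy racing with the last barrier completion cannot see the completion fire early.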
Re: kobject_init_and_add is easy to misuse
On 02/06/2020 15.10, Matthew Wilcox wrote:
> On Tue, Jun 02, 2020 at 07:50:33PM +0800, Wang Hai wrote:
>> syzkaller reports a memory leak when kobject_init_and_add() returns an
>> error in the function sysfs_slab_add() [1]. When this happens, the
>> function kobject_put() is not called for the corresponding kobject,
>> which potentially leads to a memory leak. This patch fixes the issue by
>> calling kobject_put() even if kobject_init_and_add() fails.
>
> I think this speaks to a deeper problem with kobject_init_and_add() --
> the need to call kobject_put() if it fails is not readily apparent to
> most users. This same bug appears in the first three users of
> kobject_init_and_add() that I checked --
>
> arch/ia64/kernel/topology.c
> drivers/firmware/dmi-sysfs.c
> drivers/firmware/efi/esrt.c
> drivers/scsi/iscsi_boot_sysfs.c
>
> Some do get it right --
>
> arch/powerpc/kernel/cacheinfo.c
> drivers/gpu/drm/ttm/ttm_bo.c
> drivers/gpu/drm/ttm/ttm_memory.c
> drivers/infiniband/hw/mlx4/sysfs.c
>
> I'd argue that the current behaviour is wrong, that
> kobject_init_and_add() should call kobject_put() if the add fails.
> This would need a tree-wide audit. But somebody needs to do that anyway
> because, based on my random sampling, half of the users currently get it
> wrong.

At this point the kobject doesn't own the kmem_cache structure itself yet. So calling kobject_put() will free the kmem_cache, which will then be freed a second time on the error path in create_cache(). I suppose the freeing in case of error should be pushed from the common create_cache() into the slab-specific __kmem_cache_create().
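The underlying contract is that kobject_init() hands out the initial reference, so even when the subsequent add step fails the caller still holds a reference that only kobject_put() can release. A toy userspace model of why the error path leaks without the put (all types and names are hypothetical stand-ins for the kobject machinery):

```c
#include <assert.h>

/* Toy model of a kobject-style refcounted object. */
struct toy_kobj {
    int refcount;
    int released;   /* set when the release() callback would run */
};

static void toy_init(struct toy_kobj *k)
{
    k->refcount = 1;   /* like kobject_init(): first reference is yours */
    k->released = 0;
}

static void toy_put(struct toy_kobj *k)
{
    if (--k->refcount == 0)
        k->released = 1;   /* release() would free the memory here */
}

/* Like kobject_init_and_add(): init always runs, add may fail.
 * Crucially, a failed add does NOT drop the init reference. */
static int toy_init_and_add(struct toy_kobj *k, int add_fails)
{
    toy_init(k);
    return add_fails ? -1 : 0;
}
```

If the caller treats the error return as "nothing happened" and skips toy_put(), the reference from init is never dropped and the object is never released: exactly the leak syzkaller reported.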
[PATCHv5 1/1] ext4: mballoc: Use raw_cpu_ptr instead of this_cpu_ptr
It doesn't really matter in ext4_mb_new_blocks() whether the code is rescheduled on another cpu due to preemption: we care about discard_pa_seq only when the block allocation fails, and even then we check the seq counters of all the cpus against the initially sampled one, to see whether anyone has freed any blocks while we were doing the allocation. So just use raw_cpu_ptr instead of this_cpu_ptr to avoid this BUG.

BUG: using smp_processor_id() in preemptible [] code: syz-fuzzer/6927
caller is ext4_mb_new_blocks+0xa4d/0x3b70 fs/ext4/mballoc.c:4711
CPU: 1 PID: 6927 Comm: syz-fuzzer Not tainted 5.7.0-next-20200602-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x18f/0x20d lib/dump_stack.c:118
 check_preemption_disabled+0x20d/0x220 lib/smp_processor_id.c:48
 ext4_mb_new_blocks+0xa4d/0x3b70 fs/ext4/mballoc.c:4711
 ext4_ext_map_blocks+0x201b/0x33e0 fs/ext4/extents.c:4244
 ext4_map_blocks+0x4cb/0x1640 fs/ext4/inode.c:626
 ext4_getblk+0xad/0x520 fs/ext4/inode.c:833
 ext4_bread+0x7c/0x380 fs/ext4/inode.c:883
 ext4_append+0x153/0x360 fs/ext4/namei.c:67
 ext4_init_new_dir fs/ext4/namei.c:2757 [inline]
 ext4_mkdir+0x5e0/0xdf0 fs/ext4/namei.c:2802
 vfs_mkdir+0x419/0x690 fs/namei.c:3632
 do_mkdirat+0x21e/0x280 fs/namei.c:3655
 do_syscall_64+0x60/0xe0 arch/x86/entry/common.c:359
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

Signed-off-by: Ritesh Harjani
Reported-by: syzbot+82f324bb69744c5f6...@syzkaller.appspotmail.com
---
 fs/ext4/mballoc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index a9083113a8c0..b79b32dbe3ea 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -4708,7 +4708,7 @@ ext4_fsblk_t ext4_mb_new_blocks(handle_t *handle,
 	}
 
 	ac->ac_op = EXT4_MB_HISTORY_PREALLOC;
-	seq = *this_cpu_ptr(&discard_pa_seq);
+	seq = *raw_cpu_ptr(&discard_pa_seq);
 	if (!ext4_mb_use_preallocated(ac)) {
 		ac->ac_op = EXT4_MB_HISTORY_ALLOC;
 		ext4_mb_normalize_request(ac, ar);
--
2.21.3
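The reason preemption is harmless here is that the sampled value is only a baseline: on allocation failure the code re-reads discard_pa_seq across all cpus, and since the counters only ever increase, any change means someone discarded preallocations in the meantime, no matter which cpu took the sample. A hedged, simplified sketch of that retry check (a fixed-size array stands in for the per-cpu variable, and the baseline is taken as a sum rather than a single cpu's counter, which is a simplification of the kernel logic):

```c
#include <assert.h>

#define FAKE_NR_CPUS 4

/* Stand-in for the per-cpu variable discard_pa_seq. Monotonic:
 * each cpu only ever increments its own slot. */
static unsigned long fake_pa_seq[FAKE_NR_CPUS];

static unsigned long fake_seq_sum(void)
{
    unsigned long sum = 0;
    for (int cpu = 0; cpu < FAKE_NR_CPUS; cpu++)
        sum += fake_pa_seq[cpu];
    return sum;
}

/* After a failed allocation: retry only if some cpu bumped its counter
 * (i.e. freed/discarded blocks) since the baseline was sampled. */
static int fake_should_retry(unsigned long baseline_sum)
{
    return fake_seq_sum() != baseline_sum;
}
```

Because the check compares against the whole-system sum, it does not matter if the task was migrated to another cpu between taking the baseline and failing the allocation, which is why raw_cpu_ptr (no preemption check) is sufficient.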
Re: [PATCH v4 4/4] kdb: Switch to use safer dbg_io_ops over console APIs
On Fri, May 29, 2020 at 04:56:47PM +0530, Sumit Garg wrote:
> In kgdb context, calling console handlers isn't safe due to locks used
> in those handlers, which could in turn lead to a deadlock. Although using
> oops_in_progress increases the chance of bypassing locks in most console
> handlers, it might not be sufficient in case a console uses
> more locks (VT/TTY is a good example).
>
> Currently when a driver provides both polling I/O and a console, kdb
> will output using the console. We can increase robustness by using the
> currently active polling I/O driver (which should be lockless) instead
> of the corresponding console. For several common cases (e.g. an
> embedded system with a single serial port that is used both for console
> output and debugger I/O) this will result in no console handler being
> used.
>
> In order to achieve this we need to reverse the order of preference to
> use dbg_io_ops (which uses polling I/O mode) over the console APIs. So we
> just store the "struct console" that represents debugger I/O in dbg_io_ops
> and, while emitting kdb messages, skip the console that matches the
> dbg_io_ops console in order to avoid duplicate messages. After this
> change, the "is_console" param becomes redundant and hence is removed.
>
> Suggested-by: Daniel Thompson
> Signed-off-by: Sumit Garg

Looking good, only one minor comment left on my side (including the three patches prior).

> diff --git a/kernel/debug/kdb/kdb_io.c b/kernel/debug/kdb/kdb_io.c
> index 9e5a40d..5e00bc8 100644
> --- a/kernel/debug/kdb/kdb_io.c
> +++ b/kernel/debug/kdb/kdb_io.c
> @@ -560,12 +560,14 @@ static void kdb_msg_write(char *msg, int msg_len)
> 	if (msg_len == 0)
> 		return;
>
> -	if (dbg_io_ops && !dbg_io_ops->is_console)
> +	if (dbg_io_ops)
> 		kdb_io_write(msg, msg_len);

Since this now slots in so cleanly and there are not multiple calls to kdb_io_write(), I think perhaps factoring this out into its own function (in patch 1) is no longer necessary.
The character write loop can go directly into this function. Daniel.
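The resulting preference order can be pictured as: if a polling I/O driver is registered, emit through it first, then walk the console list and skip the console that backs the same port, so the message is never printed twice. A simplified userspace sketch of that dispatch (types and names are stand-ins, not the kdb structures; a recording array replaces actual output so the ordering can be checked):

```c
#include <string.h>

#define MAX_SINKS 4

/* Record which sinks received output, for illustration only. */
static const char *written[MAX_SINKS];
static int nwritten;

struct fake_console {
    const char *name;
};

struct fake_dbg_io_ops {
    struct fake_console *cons;   /* console backing the same port, if any */
};

/* Simplified kdb_msg_write() after the patch: prefer the (lockless)
 * polled I/O path, then skip the console that would duplicate it. */
static void fake_kdb_msg_write(struct fake_dbg_io_ops *io,
                               struct fake_console **consoles, int ncons)
{
    if (io)
        written[nwritten++] = "dbg_io_ops";   /* kdb_io_write() path */

    for (int i = 0; i < ncons; i++) {
        if (io && consoles[i] == io->cons)
            continue;                         /* avoid duplicate message */
        written[nwritten++] = consoles[i]->name;
    }
}
```

With a serial console that also provides the debugger's polled I/O, only the unrelated consoles (e.g. VT) still go through their handlers, which is exactly the "no console handler being used" case the commit message describes for single-port embedded systems.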
Re: [PATCH v5 13/13] perf record: introduce --ctl-fd[-ack] options
On 2/06/20 12:12 pm, Alexey Budankov wrote: > > On 02.06.2020 11:32, Alexey Budankov wrote: >> >> On 02.06.2020 2:37, Andi Kleen wrote: > or a pathname, or including also the event default of "disabled". For my cases conversion of pathnames into open fds belongs to external controlling process e.g. like in the examples provided in the patch set. Not sure about "event default of 'disabled'" >>> >>> It would be nicer for manual use cases if perf supported the path names >>> directly like in Adrian's example, not needing a complex wrapper script. >> >> fds interface is required for VTune integration since VTune wants control >> over files creation aside of Perf tool process. The script demonstrates >> just one possible use case. >> >> Control files could easily be implemented on top of fds making open >> operations >> for paths and then initializing fds. Interface below is vague and with >> explicit >> options like below it could be more explicit: >> --ctl-file /tmp/my-perf.fifo --ctl-file-ack /tmp/my-perf-ack.fifo > > Or even clearer: > > --ctl-fifo /tmp/my-perf --ctl-fifo-ack /tmp/my-perf-ack If people are OK with having so many options, then that is fine by me. > >> >> Make either fds and or files provided on the command line. Implement file >> options handling callbacks that would open paths and setting fds. Close fds >> if they were opened by Perf tool process. >> >> Adrian, please share your mind and use case. >> >> ~Alexey >> >>> >>> -Andi > > e.g. add "--control" and support all of: > > --control > --control 11 > --control 11,15 > --control 11,15,disabled > --control 11,,disabled > --control /tmp/my-perf.fifo > --control /tmp/my-perf.fifo,/tmp/my-perf-ack.fifo > --control /tmp/my-perf.fifo,/tmp/my-perf-ack.fifo,disabled > --control /tmp/my-perf.fifo,,disabled > > Regards > Adrian > Regards, Alexey
Re: [PATCH] ARM: dts: AM33xx-l4: add gpio-ranges
On 02/06/2020 16:14, Drew Fustini wrote:
> Add gpio-ranges properties to the gpio controller nodes. These
> gpio-ranges were created based on "Table 9-10. CONTROL_MODULE REGISTERS"
> in the "AM335x Technical Reference Manual" [0] and "Table 4-2. Pin
> Attributes" in the "AM335x Sitara Processor datasheet" [1]. A csv file
> with this data is available for reference [2].

It would be good if you could explain not only "what" is changed, but also "why" it's needed, in the commit message.

> [0] https://www.ti.com/lit/ug/spruh73q/spruh73q.pdf
> [1] http://www.ti.com/lit/ds/symlink/am3358.pdf
> [2] https://gist.github.com/pdp7/6ffaddc8867973c1c3e8612cfaf72020
>
> Signed-off-by: Drew Fustini
> ---
> Notes:
> There was previous discussion relevant to this patch:
> https://lore.kernel.org/linux-gpio/20200525131731.GA948395@x1/
>
> Here is the list I compiled which shows the mapping between a gpio line
> and the pin number, including the memory address for the pin control
> register:
>
> gpiochip,gpio-line,pinctrl-PIN,pinctrl-address
> 0,0,82,44e10948
> 0,1,83,44e1094c
> 0,2,84,44e10950
> 0,3,85,44e10954
> 0,4,86,44e10958
> 0,5,87,44e1095c
> ...
On a BeagleBone Black board (AM3358) with this patch:

cat /sys/kernel/debug/pinctrl/44e10800.pinmux-pinctrl-single/gpio-ranges
GPIO ranges handled:
0: gpio-0-31 GPIOS [0 - 7] PINS [82 - 89]
8: gpio-0-31 GPIOS [8 - 11] PINS [52 - 55]
12: gpio-0-31 GPIOS [12 - 15] PINS [94 - 97]
16: gpio-0-31 GPIOS [16 - 17] PINS [71 - 72]
18: gpio-0-31 GPIOS [18 - 18] PINS [135 - 135]
19: gpio-0-31 GPIOS [19 - 20] PINS [108 - 109]
21: gpio-0-31 GPIOS [21 - 21] PINS [73 - 73]
22: gpio-0-31 GPIOS [22 - 23] PINS [8 - 9]
26: gpio-0-31 GPIOS [26 - 27] PINS [10 - 11]
28: gpio-0-31 GPIOS [28 - 28] PINS [74 - 74]
29: gpio-0-31 GPIOS [29 - 29] PINS [81 - 81]
30: gpio-0-31 GPIOS [30 - 31] PINS [28 - 29]
0: gpio-32-63 GPIOS [32 - 39] PINS [0 - 7]
8: gpio-32-63 GPIOS [40 - 43] PINS [90 - 93]
12: gpio-32-63 GPIOS [44 - 59] PINS [12 - 27]
28: gpio-32-63 GPIOS [60 - 63] PINS [30 - 33]
0: gpio-64-95 GPIOS [64 - 81] PINS [34 - 51]
18: gpio-64-95 GPIOS [82 - 85] PINS [77 - 80]
22: gpio-64-95 GPIOS [86 - 95] PINS [56 - 65]
0: gpio-96-127 GPIOS [96 - 100] PINS [66 - 70]
5: gpio-96-127 GPIOS [101 - 102] PINS [98 - 99]
7: gpio-96-127 GPIOS [103 - 104] PINS [75 - 76]
13: gpio-96-127 GPIOS [109 - 109] PINS [141 - 141]
14: gpio-96-127 GPIOS [110 - 117] PINS [100 - 107]

 arch/arm/boot/dts/am33xx-l4.dtsi | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/arch/arm/boot/dts/am33xx-l4.dtsi b/arch/arm/boot/dts/am33xx-l4.dtsi
index 5ed7f3c58c0f..340ea331e54d 100644
--- a/arch/arm/boot/dts/am33xx-l4.dtsi
+++ b/arch/arm/boot/dts/am33xx-l4.dtsi
@@ -151,6 +151,18 @@ SYSC_OMAP2_SOFTRESET |
 			gpio0: gpio@0 {
 				compatible = "ti,omap4-gpio";
+				gpio-ranges = <&am33xx_pinmux 0 82 8>,
+					      <&am33xx_pinmux 8 52 4>,
+					      <&am33xx_pinmux 12 94 4>,
+					      <&am33xx_pinmux 16 71 2>,
+					      <&am33xx_pinmux 18 135 1>,
+					      <&am33xx_pinmux 19 108 2>,
+					      <&am33xx_pinmux 21 73 1>,
+					      <&am33xx_pinmux 22 8 2>,
+					      <&am33xx_pinmux 26 10 2>,
+					      <&am33xx_pinmux 28 74 1>,
+					      <&am33xx_pinmux 29 81 1>,
+					      <&am33xx_pinmux 30 28 2>;
 				gpio-controller;
 				#gpio-cells = <2>;
 				interrupt-controller;
@@ -1298,6 +1310,10 @@ SYSC_OMAP2_SOFTRESET |
 			gpio1: gpio@0 {
 				compatible = "ti,omap4-gpio";
+				gpio-ranges = <&am33xx_pinmux 0 0 8>,
+					      <&am33xx_pinmux 8 90 4>,
+					      <&am33xx_pinmux 12 12 16>,
+					      <&am33xx_pinmux 28 30 4>;
 				gpio-controller;
 				#gpio-cells = <2>;
 				interrupt-controller;
@@ -1700,6 +1716,9 @@ SYSC_OMAP2_SOFTRESET |
 			gpio2: gpio@0 {
 				compatible = "ti,omap4-gpio";
+				gpio-ranges = <&am33xx_pinmux 0 34 18>,
+					      <&am33xx_pinmux 18 77 4>,
+					      <&am33xx_pinmux 22 56 10>;
 				gpio-controller;
 				#gpio-cells = <2>;
 				interrupt-controller;
@@ -1733,6 +1752,11 @@ SYSC_OMAP2_SOFTRESET |
 			gpio3: gpio@0 {
 				compatible = "ti,omap4-gpio";
+				gpio-ranges = <&am33xx_pinmux
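Each gpio-ranges triple reads <pinctrl-node gpio-offset pin-base count>: count GPIO lines starting at gpio-offset map to consecutive pinctrl pins starting at pin-base. The debugfs dump above can be reproduced from the table; a small sketch of the lookup, using the gpio0 ranges from the patch:

```c
struct gpio_range {
    int gpio_off;   /* first GPIO line in the range */
    int pin_base;   /* first pinctrl pin it maps to */
    int count;
};

/* gpio0 ranges as listed in the patch for am33xx-l4.dtsi */
static const struct gpio_range gpio0_ranges[] = {
    { 0, 82, 8 }, { 8, 52, 4 }, { 12, 94, 4 }, { 16, 71, 2 },
    { 18, 135, 1 }, { 19, 108, 2 }, { 21, 73, 1 }, { 22, 8, 2 },
    { 26, 10, 2 }, { 28, 74, 1 }, { 29, 81, 1 }, { 30, 28, 2 },
};

/* Return the pinctrl pin for a GPIO line, or -1 if the line has no
 * pinctrl backing (a gap between ranges). */
static int gpio_to_pin(const struct gpio_range *r, int nranges, int gpio)
{
    for (int i = 0; i < nranges; i++)
        if (gpio >= r[i].gpio_off && gpio < r[i].gpio_off + r[i].count)
            return r[i].pin_base + (gpio - r[i].gpio_off);
    return -1;
}
```

This matches the csv in the cover notes: gpio0 line 0 sits at pin 82 (register 44e10948), and the gaps (e.g. lines 24-25) are GPIO lines with no pin-control register behind them.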
Re: [PATCHv3] perf stat: Ensure group is defined on top of the same cpu mask
Em Tue, Jun 02, 2020 at 12:17:36PM +0200, Jiri Olsa escreveu:
> Jin Yao reported the issue (and posted first versions of this change)
> with groups being defined over events with different cpu masks.
> This causes assert aborts in get_group_fd, like:
>
>   # perf stat -M "C2_Pkg_Residency" -a -- sleep 1
>   perf: util/evsel.c:1464: get_group_fd: Assertion `!(fd == -1)' failed.
>   Aborted
>
> All the events in the group have to be defined over the same
> cpus so the group_fd can be found for every leader/member pair.
>
> Adding a check to ensure this condition is met and removing the
> group (with a warning) if we detect mixed cpus, like:
>
>   $ sudo perf stat -e '{power/energy-cores/,cycles},{instructions,power/energy-cores/}'
>   WARNING: event cpu maps do not match, disabling group:
>     anon group { power/energy-cores/, cycles }
>     anon group { instructions, power/energy-cores/ }

So it doesn't disable the 'group', it disables the 'grouping' of those events, right? I.e. reading the WARNING, I thought that it would count nothing, since it lists both groups as being disabled, but when I tested I noticed that:

[root@seventh ~]# perf stat -e '{power/energy-cores/,cycles},{instructions,power/energy-cores/}'
WARNING: grouped events cpus do not match, disabling group:
  anon group { power/energy-cores/, cycles }
  anon group { instructions, power/energy-cores/ }
^C
 Performance counter stats for 'system wide':

             12.62 Joules power/energy-cores/
       106,920,637        cycles
        80,228,899        instructions        #  0.75  insn per cycle
             12.62 Joules power/energy-cores/

      14.514476987 seconds time elapsed

[root@seventh ~]#

I.e. it counted the events, ungrouped, or am I missing something?

If I do:

[root@seventh ~]# perf stat -e '{power/energy-cores/,power/energy-ram/},{instructions,cycles}' -a sleep 2

 Performance counter stats for 'system wide':

              1.73 Joules power/energy-cores/
              0.92 Joules power/energy-ram/
        12,191,658        instructions        #  0.67  insn per cycle
        18,275,233        cycles

       2.001272492 seconds time elapsed

[root@seventh ~]#

It works, grouped.
One observation, shouldn't we somehow show in the output that the first two were indeed grouped, ditto for the second two? Also, this needs improvement: [root@seventh ~]# perf stat -e '{power/energy-cores/,power/energy-ram/},{instructions,cycles}' sleep 2 Error: The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (power/energy-cores/). /bin/dmesg | grep -i perf may provide additional information. [root@seventh ~]# Probably stating that the power/ events can only be done on a system wide mode or per-cpu? I'm applying the patch now, with the above observations as committer notes, we can improve this in follow on patch, - Arnaldo > Ian asked also for cpu maps details, it's displayed in verbose mode: > > $ sudo perf stat -e '{cycles,power/energy-cores/}' -v > WARNING: group events cpu maps do not match, disabling group: > anon group { power/energy-cores/, cycles } >power/energy-cores/: 0 >cycles: 0-7 > anon group { instructions, power/energy-cores/ } >instructions: 0-7 >power/energy-cores/: 0 > > Fixes: 6a4bb04caacc8 ("perf tools: Enable grouping logic for parsed events") > Co-developed-by: Jin Yao > Signed-off-by: Jin Yao > Signed-off-by: Jiri Olsa > --- > tools/perf/builtin-stat.c | 55 +++ > 1 file changed, 55 insertions(+) > > v3 changes: >- reword the warning with Ian's suggestion > > diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c > index b2b79aa161dd..9be020e0098a 100644 > --- a/tools/perf/builtin-stat.c > +++ b/tools/perf/builtin-stat.c > @@ -190,6 +190,59 @@ static struct perf_stat_config stat_config = { > .big_num= true, > }; > > +static bool cpus_map_matched(struct evsel *a, struct evsel *b) > +{ > + if (!a->core.cpus && !b->core.cpus) > + return true; > + > + if (!a->core.cpus || !b->core.cpus) > + return false; > + > + if (a->core.cpus->nr != b->core.cpus->nr) > + return false; > + > + for (int i = 0; i < a->core.cpus->nr; i++) { > + if (a->core.cpus->map[i] != b->core.cpus->map[i]) > + return false; > + } > + 
> + return true; > +} > + > +static void evlist__check_cpu_maps(struct evlist *evlist) > +{ > + struct evsel *evsel, *pos, *leader; > + char buf[1024]; > + > + evlist__for_each_entry(evlist, evsel) { > + leader = evsel->leader; > + > + /* Check that leader matches cpus with each member. */ > +
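The check itself is a plain element-wise comparison of the two cpu maps: the RAPL power/ events carry cpu map {0} while cycles carries {0-7}, which is exactly the mismatch the warning reports in the verbose output above. A userspace analog of cpus_map_matched() (plain int arrays stand in for struct perf_cpu_map):

```c
#include <stddef.h>

/* Userspace analog of the patch's cpus_map_matched(): two events may be
 * grouped only if their cpu maps are identical (or both absent). */
static int fake_cpus_map_matched(const int *a, size_t na,
                                 const int *b, size_t nb)
{
    if (na == 0 && nb == 0)
        return 1;           /* neither event has a cpu map */
    if (na == 0 || nb == 0)
        return 0;
    if (na != nb)
        return 0;
    for (size_t i = 0; i < na; i++)
        if (a[i] != b[i])
            return 0;
    return 1;
}
```

When any leader/member pair fails this comparison, the evlist check breaks up the group instead of letting get_group_fd() hit its assert.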
Re: [PATCH 1/2] i2c: mux: pca9541: Change to correct bus control commands
On 6/2/20 5:12 AM, Quentin Strydom wrote:
> Change current bus commands to match the pca9541a datasheet
> (see table 12 on page 14 of
> https://www.nxp.com/docs/en/data-sheet/PCA9541A.pdf). Also
> where entries are marked as no change the current control
> command is repeated as the current master reading the
> control register has control of the bus and bus is on.
>
> Signed-off-by: Quentin Strydom

I am not going to reply to your other e-mail. After all, it is marked confidential with a lot of legalese around it.

- Split long lines in your e-mails.
- Do not just send the patch again; reply to the original patch with a comment such as "ping".
- If you resend (for example because still no one replied after more time, or you missed some mailing lists/reviewers), mark the patch subject with "PATCH RESEND".
- Never, under any circumstances, send an e-mail to a public list that states that "This email and any files transmitted with it are confidential ...".

For my part, I did not reply because I have no access to the hardware anymore, and thus can not validate if the code still works after this change (or why I didn't notice the problem when I wrote the code initially).

Guenter

> ---
> drivers/i2c/muxes/i2c-mux-pca9541.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/i2c/muxes/i2c-mux-pca9541.c
> b/drivers/i2c/muxes/i2c-mux-pca9541.c
> index 6a39ada..50808fa 100644
> --- a/drivers/i2c/muxes/i2c-mux-pca9541.c
> +++ b/drivers/i2c/muxes/i2c-mux-pca9541.c
> @@ -211,7 +211,7 @@ static void pca9541_release_bus(struct i2c_client *client)
>
> /* Control commands per PCA9541 datasheet */
> static const u8 pca9541_control[16] = {
> -	4, 0, 1, 5, 4, 4, 5, 5, 0, 0, 1, 1, 0, 4, 5, 1
> +	4, 4, 5, 5, 4, 4, 5, 7, 8, 0, 1, 11, 0, 0, 1, 1
> };
>
> /*
>
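For context, the driver uses this table as a small state machine: it reads the chip's control register and writes back the table entry selected by the register's low bits to request bus ownership. The exact indexing detail below is recalled from the driver rather than taken from this thread, so treat it as an assumption; the patch only changes the table contents, not the lookup:

```c
/* Control commands per the PCA9541A datasheet, as proposed by the patch.
 * Assumed to be indexed by the low four bits of the control register. */
static const unsigned char pca9541_control[16] = {
    4, 4, 5, 5, 4, 4, 5, 7, 8, 0, 1, 11, 0, 0, 1, 1
};

/* Sketch of the arbitration step: given the current control register
 * value, pick the command to write back (indexing is an assumption). */
static unsigned char pca9541_next_command(unsigned char reg)
{
    return pca9541_control[reg & 0x0f];
}
```

Since each register value maps to exactly one command, verifying the patch amounts to checking each of the 16 entries against table 12 of the datasheet.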
RE: [PATCH v4 3/5] scsi: ufs: fix potential access NULL pointer while memcpy
But this is just a suggestion. Your way is fine too.

Thanks,
Avri

> How about something like the untested attached?
>
> Thanks,
> Avri
>
> > -----Original Message-----
> > From: Bean Huo
> > Sent: Tuesday, June 2, 2020 2:36 PM
> > To: Avri Altman; alim.akh...@samsung.com; asuto...@codeaurora.org;
> > j...@linux.ibm.com; martin.peter...@oracle.com; stanley@mediatek.com;
> > bean...@micron.com; bvanass...@acm.org; tomas.wink...@intel.com;
> > c...@codeaurora.org
> > Cc: linux-s...@vger.kernel.org; linux-kernel@vger.kernel.org
> > Subject: Re: [PATCH v4 3/5] scsi: ufs: fix potential access NULL pointer while memcpy
> >
> > hi Avri,
> > thanks for the review.
> >
> > On Mon, 2020-06-01 at 06:25 +, Avri Altman wrote:
> > > Hi,
> > >
> > > > If param_offset is not 0, the memcpy length shouldn't be the
> > > > true descriptor length.
> > > >
> > > > Fixes: a4b0e8a4e92b ("scsi: ufs: Factor out ufshcd_read_desc_param")
> > > > Signed-off-by: Bean Huo
> > > > ---
> > > > drivers/scsi/ufs/ufshcd.c | 2 +-
> > > > 1 file changed, 1 insertion(+), 1 deletion(-)
> > > >
> > > > diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
> > > > index f7e8bfefe3d4..bc52a0e89cd3 100644
> > > > --- a/drivers/scsi/ufs/ufshcd.c
> > > > +++ b/drivers/scsi/ufs/ufshcd.c
> > > > @@ -3211,7 +3211,7 @@ int ufshcd_read_desc_param(struct ufs_hba *hba,
> > > >
> > > > 	/* Check whether we will not copy more data, than available */
> > > > 	if (is_kmalloc && param_size > buff_len)
> > > > -		param_size = buff_len;
> > > > +		param_size = buff_len - param_offset;
> > >
> > > But is_kmalloc is true if (param_offset != 0 || param_size < buff_len).
> > > So if (is_kmalloc && param_size > buff_len) implies that param_offset
> > > is 0, or did I get it wrong?
> >
> > If param_offset is 0, this can't go wrong: after this patch it
> > is the same, since the offset is 0. As mentioned in the commit message,
> > this patch is only for the case where param_offset is not 0.
> >
> > > Still, I think that there is a problem here, because nowhere are we
> > > checking that param_offset + param_size < buff_len, which now can
> > > happen because of ufs-bsg.
> > > Maybe you can add it and get rid of that is_kmalloc, which is an
> > > awkward way to test for valid values?
> >
> > Let me explain further. We have these conditions:
> >
> > 1) param_offset == 0, param_size >= buff_len; // no problem,
> > ufshcd_query_descriptor_retry() will read the descriptor with the true
> > descriptor length, and no memcpy() is called.
> >
> > 2) param_offset == 0, param_size < buff_len; // no problem,
> > ufshcd_query_descriptor_retry() will read the descriptor with the true
> > descriptor length buff_len, and memcpy() with param_size length.
> >
> > 3) param_offset != 0, param_offset + param_size <= buff_len; // no
> > problem, ufshcd_query_descriptor_retry() will read the descriptor with
> > the true descriptor length, and memcpy() with param_size length.
> >
> > 4) param_offset != 0, param_offset + param_size > buff_len; // NULL
> > pointer reference problem, since ufshcd_query_descriptor_retry() will
> > read the descriptor with the true descriptor length, and memcpy() with
> > buff_len length. The correct memcpy length should be
> > (buff_len - param_offset).
> >
> > The param_offset + param_size < buff_len check doesn't need to be added,
> > and is_kmalloc is very hard to remove based on the current flow.
> >
> > So, the correct fixup patch should be like this:
> >
> > -	if (is_kmalloc && param_size > buff_len)
> > -		param_size = buff_len;
> > +	if (is_kmalloc && (param_size + param_offset) > buff_len)
> > +		param_size = buff_len - param_offset;
> >
> > What do you think about it? If there's no problem, I will update it in
> > the next version of the patch.
> >
> > thanks,
> >
> > Bean
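Bean's four cases reduce to one invariant: the memcpy must never read past buff_len, so the copy length has to be clamped against buff_len - param_offset, not against buff_len alone. A standalone sketch of the proposed clamp, with those cases as checks (the function name is illustrative, not the driver's):

```c
#include <stddef.h>

/* Proposed fix: how many bytes may safely be copied out of a descriptor
 * buffer of buff_len bytes, starting at param_offset. Assumes
 * param_offset < buff_len (validated earlier in the real driver). */
static size_t ufs_safe_copy_len(size_t param_offset, size_t param_size,
                                size_t buff_len)
{
    if (param_offset + param_size > buff_len)
        return buff_len - param_offset;   /* clamp to what's available */
    return param_size;
}
```

Case 4 is the one the original code got wrong: with offset 8 and size 64 into a 64-byte descriptor, the old clamp to buff_len would still copy 64 bytes starting 8 bytes in, reading 8 bytes past the buffer.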
Re: [PATCH] serial: 8250_port: Fix imprecise external abort for mctrl if inactive
* Andy Shevchenko [200602 08:33]:
> On Tue, Jun 2, 2020 at 11:09 AM Johan Hovold wrote:
>> On Mon, Jun 01, 2020 at 05:18:13PM -0700, Tony Lindgren wrote:

...

>> There shouldn't be anything fundamental preventing you from adding the
>> missing resume calls to the mctrl paths, even if it may require reworking
>> (and fixing) the whole RPM implementation (which would be a good thing
>> of course).
>
> Yes, for the serial core I have a long-standing patch series to implement
> RPM (more or less?) properly.

Yeah, let's try after the merge window. Not sure what else to do with the fix though. We currently have 8250_port.c not really aware of the hardware state for PM runtime, at least for the hang-up path.

> However, OMAP is a beast which prevents us from going there, due to a big
> hack called pm_runtime_irq_safe().
> Tony is aware of this and I think the above is somehow related to removal
> of it.

Now that we can detach and reattach the kernel serial console, there should not be any need for pm_runtime_irq_safe() anymore :) And the UART wake-up from deeper idle states can only happen with the help of external hardware like a GPIO controller or pinctrl controller. And for the always-on wake-up interrupt controllers we have the Linux generic wakeirqs to wake up the serial device on events.

So I think the way to proceed with pm_runtime_irq_safe() removal for serial drivers is to block serial PM runtime unless we have a wakeirq configured for omaps in devicetree. In the worst case the regression is that PM runtime for serial won't work unless properly configured. And the UART wakeup latency will be a bit longer compared to pm_runtime_irq_safe(), naturally.

> But I completely agree that the goal is to get a better runtime PM
> implementation overall.

Yes, agreed.

Regards,

Tony
RE: [PATCH 07/15] drm/amdgpu: use PCI_IRQ_MSI_TYPES where appropriate
> -----Original Message-----
> From: dri-devel On Behalf Of Piotr Stankiewicz
> Sent: Tuesday, June 2, 2020 5:21 AM
> To: Alex Deucher; Christian König; David Zhou; David Airlie; Daniel Vetter
> Cc: Stankiewicz, Piotr; dri-de...@lists.freedesktop.org;
> amd-...@lists.freedesktop.org; linux-ker...@vger.kernel.org
> Subject: [PATCH 07/15] drm/amdgpu: use PCI_IRQ_MSI_TYPES where appropriate
>
> Seeing as there is shorthand available to use when asking for any type
> of interrupt, or any type of message signalled interrupt, leverage it.
>
> Signed-off-by: Piotr Stankiewicz
> Reviewed-by: Andy Shevchenko
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c | 8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
> index 5ed4227f304b..6dbe173a9fd4 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
> @@ -251,11 +251,11 @@ int amdgpu_irq_init(struct amdgpu_device *adev)
> 	int nvec = pci_msix_vec_count(adev->pdev);
> 	unsigned int flags;
>
> -	if (nvec <= 0) {
> +	if (nvec > 0)
> +		flags = PCI_IRQ_MSI_TYPES;
> +	else
> 		flags = PCI_IRQ_MSI;
> -	} else {
> -		flags = PCI_IRQ_MSI | PCI_IRQ_MSIX;
> -	}

Minor nit: Is it really necessary to do this check? Can flags just be set? I.e.:

	flags = PCI_IRQ_MSI_TYPES;

pci_alloc_irq_vectors() tries things in order. If MSI-X is not available, it will try MSI.

M

> +
> 	/* we only need one vector */
> 	nvec = pci_alloc_irq_vectors(adev->pdev, 1, 1, flags);
> 	if (nvec > 0) {
> --
> 2.17.2
>
> ___
> dri-devel mailing list
> dri-de...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH] media: stm32-dcmi: Set minimum cpufreq requirement
On 02/06/20 12:37, Benjamin GAIGNARD wrote:
> On 6/2/20 11:31 AM, Valentin Schneider wrote:
>>> @@ -99,6 +100,8 @@ enum state {
>>>
>>> #define OVERRUN_ERROR_THRESHOLD 3
>>>
>>> +#define DCMI_MIN_FREQ 65 /* in KHz */
>>> +
>> This assumes the handling part is guaranteed to always run on the same CPU
>> with the same performance profile (regardless of the platform). If that's
>> not guaranteed, it feels like you'd want this to be configurable in some
>> way.
> Yes, I could add a st,stm32-dcmi-min-frequency (in KHz) parameter to the
> device tree node.
> Something like that -

I'm not sure how well this fits with the DT landscape, as you could argue it isn't really a description of the hardware, more of a description of the performance expectations of the software. I won't really argue here.

>>
>>> struct dcmi_graph_entity {
>>>	struct v4l2_async_subdev asd;
>>>
>> [...]
>>> @@ -2020,6 +2042,8 @@ static int dcmi_probe(struct platform_device *pdev)
>>>		goto err_cleanup;
>>>	}
>>>
>>> +	dcmi->policy = cpufreq_cpu_get(0);
>>> +
>> Ideally you'd want to fetch the policy of the CPU your IRQ (and handling
>> thread) is affined to; the only compatible DTS I found describes a single
>> A7, which is somewhat limited in the affinity area...
> If I move this code just before start streaming and use get_cpu(), would
> it work?

AFAIA streaming_start() is not necessarily executing on the same CPU as the one that will handle the interrupt. I was thinking you could use the IRQ's effective affinity as a hint of which CPU(s) to boost, i.e. something like:

---
cpumask_var_t visited;
struct irq_data *d = irq_get_irq_data(irq);

err = alloc_cpumask_var(&visited, GFP_KERNEL);
/* ... */

for_each_cpu(cpu, irq_data_get_effective_affinity_mask(d)) {
	/* check if not already spanned */
	if (cpumask_test_cpu(cpu, visited))
		continue;

	policy = cpufreq_cpu_get(cpu);
	cpumask_or(visited, visited, policy->cpus);

	/* do the boost for that policy here */
	/* ... */

	cpufreq_cpu_put(policy);
}
---

That of course falls apart when hotplug gets involved, and the effective affinity changes... There's irq_set_affinity_notifier() out there, but it seems it's only about the affinity, not the effective_affinity; I'm not sure how valid it would be to query the effective_affinity in that notifier.

> Benjamin
>>
>>>	dev_info(&pdev->dev, "Probe done\n");
>>>
>>>	platform_set_drvdata(pdev, dcmi);
>>> @@ -2049,6 +2073,9 @@ static int dcmi_remove(struct platform_device *pdev)
>>>
>>>	pm_runtime_disable(&pdev->dev);
>>>
>>> +	if (dcmi->policy)
>>> +		cpufreq_cpu_put(dcmi->policy);
>>> +
>>>	v4l2_async_notifier_unregister(&dcmi->notifier);
>>>	v4l2_async_notifier_cleanup(&dcmi->notifier);
>>>	media_entity_cleanup(&dcmi->vdev->entity);
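The dedup step in that snippet matters because several CPUs in an IRQ's effective affinity mask can share one cpufreq policy; OR-ing policy->cpus into the visited mask ensures each policy is boosted exactly once. A userspace sketch of the same walk, with plain bitmasks standing in for cpumasks and a hypothetical two-policy layout:

```c
/* Hypothetical policies: each covers a set of cpus, as a bitmask. */
struct fake_policy {
    unsigned cpus;   /* bit n set => cpu n belongs to this policy */
};

/* Boost each policy spanned by the irq's effective affinity once.
 * Returns how many distinct policies were boosted. */
static int boost_policies(const struct fake_policy *cpu_policy[],
                          int ncpus, unsigned effective_affinity)
{
    unsigned visited = 0;
    int boosts = 0;

    for (int cpu = 0; cpu < ncpus; cpu++) {
        if (!(effective_affinity & (1u << cpu)))
            continue;                       /* not in the affinity mask */
        if (visited & (1u << cpu))
            continue;                       /* policy already handled */
        visited |= cpu_policy[cpu]->cpus;   /* cpumask_or(visited, ...) */
        boosts++;                           /* "do the boost" here */
    }
    return boosts;
}
```

On the single-A7 stm32 DTS this degenerates to one policy and one boost, but the walk stays correct on asymmetric systems where the IRQ can land on CPUs from different frequency domains.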
Re: [RFC 06/16] KVM: Use GUP instead of copy_from/to_user() to access guest memory
On Mon, Jun 01, 2020 at 06:35:22PM +0200, Paolo Bonzini wrote: > On 25/05/20 17:17, Kirill A. Shutemov wrote: > >> Personally, I would've just added 'struct kvm' pointer to 'struct > >> kvm_memory_slot' to be able to extract 'mem_protected' info when > >> needed. This will make the patch much smaller. > > Okay, can do. > > > > Other thing I tried is to have per-slot flag to indicate that it's > > protected. But Sean pointed that it's all-or-nothing feature and having > > the flag in the slot would be misleading. > > > > Perhaps it would be misleading, but it's an optimization. Saving a > pointer dereference can be worth it, also because there are some places > where we just pass around a memslot and we don't have the struct kvm*. Vitaly proposed to add struct kvm pointer into memslot. Do you object against it? -- Kirill A. Shutemov
Re: [PATCH 4/6] vhost_vdpa: support doorbell mapping via mmap
On Tue, Jun 02, 2020 at 02:49:38PM +0800, Jason Wang wrote:
> On 2020/6/2 12:56 PM, Michael S. Tsirkin wrote:
>> On Tue, Jun 02, 2020 at 03:22:49AM +0800, kbuild test robot wrote:
>>> Hi Jason,
>>>
>>> I love your patch! Yet something to improve:
>>>
>>> [auto build test ERROR on vhost/linux-next]
>>> [also build test ERROR on linus/master v5.7 next-20200529]
>>> [if your patch is applied to the wrong git tree, please drop us a note to help
>>> improve the system. BTW, we also suggest to use '--base' option to specify the
>>> base tree in git format-patch, please see https://stackoverflow.com/a/37406982]
>>>
>>> url: https://github.com/0day-ci/linux/commits/Jason-Wang/vDPA-doorbell-mapping/20200531-070834
>>> base: https://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git linux-next
>>> config: m68k-randconfig-r011-20200601 (attached as .config)
>>> compiler: m68k-linux-gcc (GCC) 9.3.0
>>> reproduce (this is a W=1 build):
>>>
>>>	wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
>>>	chmod +x ~/bin/make.cross
>>>	# save the attached .config to linux build tree
>>>	COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=m68k
>>>
>>> If you fix the issue, kindly add the following tag as appropriate:
>>> Reported-by: kbuild test robot
>>>
>>> All errors (new ones prefixed by >>, old ones prefixed by <<):
>>>
>>> drivers/vhost/vdpa.c: In function 'vhost_vdpa_fault':
>>> >> drivers/vhost/vdpa.c:754:22: error: implicit declaration of function 'pgprot_noncached' [-Werror=implicit-function-declaration]
>>> 754 | vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
>>>     | ^~~~
>>> >> drivers/vhost/vdpa.c:754:22: error: incompatible types when assigning to type 'pgprot_t' {aka 'struct '} from type 'int'
>>> cc1: some warnings being treated as errors
>>>
>>> vim +/pgprot_noncached +754 drivers/vhost/vdpa.c
>>>
>>> 742
>>> 743 static vm_fault_t vhost_vdpa_fault(struct vm_fault *vmf)
>>> 744 {
>>> 745 	struct vhost_vdpa *v = vmf->vma->vm_file->private_data;
>>> 746 	struct vdpa_device *vdpa = v->vdpa;
>>> 747 	const struct vdpa_config_ops *ops = vdpa->config;
>>> 748 	struct vdpa_notification_area notify;
>>> 749 	struct vm_area_struct *vma = vmf->vma;
>>> 750 	u16 index = vma->vm_pgoff;
>>> 751
>>> 752 	notify = ops->get_vq_notification(vdpa, index);
>>> 753
>>> > 754 	vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
>>> 755 	if (remap_pfn_range(vma, vmf->address & PAGE_MASK,
>>> 756 			    notify.addr >> PAGE_SHIFT, PAGE_SIZE,
>>> 757 			    vma->vm_page_prot))
>>> 758 		return VM_FAULT_SIGBUS;
>>> 759
>>> 760 	return VM_FAULT_NOPAGE;
>>> 761 }
>>> 762
>>
>> Yes well, all this remapping clearly has no chance to work
>> on systems without CONFIG_MMU.
>
> It looks to me mmap can work according to Documentation/nommu-mmap.txt. But
> I'm not sure it's worth the bother.
>
> Thanks

Well,

int remap_pfn_range(struct vm_area_struct *vma, unsigned long addr,
		unsigned long pfn, unsigned long size, pgprot_t prot)
{
	if (addr != (pfn << PAGE_SHIFT))
		return -EINVAL;

	vma->vm_flags |= VM_IO | VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP;
	return 0;
}
EXPORT_SYMBOL(remap_pfn_range);

So things aren't going to work if you have a fixed PFN, which is the case of the hardware device.
Re: [v4,1/1] hwmon: (nct7904) Add watchdog function
On 6/2/20 1:01 AM, Geert Uytterhoeven wrote: > Hi Yuechao, > > On Tue, Mar 31, 2020 at 7:30 AM wrote: >> From: Yuechao Zhao >> >> implement watchdong functionality into the "hwmon/nct7904.c" >> >> Signed-off-by: Yuechao Zhao > > Thanks for your patch, which is now commit 77849a552d142ef5 ("hwmon: > (nct7904) Add watchdog function"). > >> --- a/drivers/hwmon/Kconfig >> +++ b/drivers/hwmon/Kconfig >> @@ -1340,10 +1340,12 @@ config SENSORS_NCT7802 >> >> config SENSORS_NCT7904 >> tristate "Nuvoton NCT7904" >> - depends on I2C >> + depends on I2C && WATCHDOG >> + select WATCHDOG_CORE > > This makes the driver unselectable if WATCHDOG is not set. > > Is there a use case for using this driver without watchdog functionality? > If yes, it might make sense to make the watchdog support optional, > protected by #ifdef CONFIG_WATCHDOG, and change the above to > We use the same pattern in other hwmon drivers which also implement watchdog functionality, so I am not particularly concerned about it. Guenter > depends on I2C > select WATCHDOG_CORE if WATCHDOG > > If no, please ignore my email. > >> help >> If you say yes here you get support for the Nuvoton NCT7904 >> - hardware monitoring chip, including manual fan speed control. >> + hardware monitoring chip, including manual fan speed control >> + and support for the integrated watchdog. >> >> This driver can also be built as a module. If so, the module >> will be called nct7904. > > Gr{oetje,eeting}s, > > Geert > > -- > Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- > ge...@linux-m68k.org > > In personal conversations with technical people, I call myself a hacker. But > when I'm talking to journalists I just say "programmer" or something like > that. > -- Linus Torvalds >
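For reference, Geert's suggestion — making the watchdog part optional instead of a hard dependency — would look roughly like this in Kconfig (a sketch only; as Guenter notes, the driver as merged keeps `depends on I2C && WATCHDOG`):

```
config SENSORS_NCT7904
	tristate "Nuvoton NCT7904"
	depends on I2C
	select WATCHDOG_CORE if WATCHDOG
```

The driver code would then have to wrap the watchdog registration in `#ifdef CONFIG_WATCHDOG` so the hwmon part still builds when WATCHDOG is not set.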
Re: [PATCH] x86_64: fix jiffies ODR violation
On Fri, May 15, 2020 at 11:05:40AM -0700, Bob Haarman wrote:
> `jiffies` [...]
> `jiffies_64` [...]
> ```
> In LLD, symbol assignments in linker scripts override definitions in
> object files. GNU ld appears to have the same behavior. It would
> probably make sense for LLD to error "duplicate symbol" but GNU ld is
> unlikely to adopt for compatibility reasons.
> ```

Kernel commit logs shouldn't be in Markdown. Symbol names can just be in
single quotes (not back-quotes!) like 'jiffies'. Quotes can be indented by
a few spaces for visual separation, like

  In LLD, symbol assignments in linker scripts override definitions in
  object files. GNU ld appears to have the same behavior. It would
  probably make sense for LLD to error "duplicate symbol" but GNU ld is
  unlikely to adopt for compatibility reasons.

or can be formatted like an email quote:

> In LLD, symbol assignments in linker scripts override definitions in
> object files. GNU ld appears to have the same behavior. It would
> probably make sense for LLD to error "duplicate symbol" but GNU ld is
> unlikely to adopt for compatibility reasons.

With Markdown-isms removed from the patch description:

Reviewed-by: Josh Poimboeuf

--
Josh
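For context, the alias at the heart of this thread lives in the linker script. As I understand the fix, the duplicate C definition of 'jiffies' on x86_64 is dropped and only the script-level alias remains, roughly (a sketch of the relevant vmlinux.lds.S line, not the exact committed hunk):

```
/* arch/x86/kernel/vmlinux.lds.S (sketch): define 'jiffies' as a linker
 * alias for the C-defined 'jiffies_64' instead of defining both symbols
 * and letting the script assignment silently override the object file. */
jiffies = jiffies_64;
```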
[PATCH] dt-bindings: clock: Convert imx7ulp clock to json-schema
Convert the i.MX7ULP clock binding to DT schema format using json-schema, the original binding doc is actually for two clock modules(SCG and PCC), so split it to two binding docs, and the MPLL(mipi PLL) is NOT supposed to be in clock module, so remove it from binding doc as well. Signed-off-by: Anson Huang --- .../devicetree/bindings/clock/imx7ulp-clock.txt| 103 -- .../bindings/clock/imx7ulp-pcc-clock.yaml | 119 + .../bindings/clock/imx7ulp-scg-clock.yaml | 97 + 3 files changed, 216 insertions(+), 103 deletions(-) delete mode 100644 Documentation/devicetree/bindings/clock/imx7ulp-clock.txt create mode 100644 Documentation/devicetree/bindings/clock/imx7ulp-pcc-clock.yaml create mode 100644 Documentation/devicetree/bindings/clock/imx7ulp-scg-clock.yaml diff --git a/Documentation/devicetree/bindings/clock/imx7ulp-clock.txt b/Documentation/devicetree/bindings/clock/imx7ulp-clock.txt deleted file mode 100644 index 93d89ad..000 --- a/Documentation/devicetree/bindings/clock/imx7ulp-clock.txt +++ /dev/null @@ -1,103 +0,0 @@ -* Clock bindings for Freescale i.MX7ULP - -i.MX7ULP Clock functions are under joint control of the System -Clock Generation (SCG) modules, Peripheral Clock Control (PCC) -modules, and Core Mode Controller (CMC)1 blocks - -The clocking scheme provides clear separation between M4 domain -and A7 domain. Except for a few clock sources shared between two -domains, such as the System Oscillator clock, the Slow IRC (SIRC), -and and the Fast IRC clock (FIRCLK), clock sources and clock -management are separated and contained within each domain. - -M4 clock management consists of SCG0, PCC0, PCC1, and CMC0 modules. -A7 clock management consists of SCG1, PCC2, PCC3, and CMC1 modules. - -Note: this binding doc is only for A7 clock domain. - -System Clock Generation (SCG) modules: -- -The System Clock Generation (SCG) is responsible for clock generation -and distribution across this device. 
Functions performed by the SCG -include: clock reference selection, generation of clock used to derive -processor, system, peripheral bus and external memory interface clocks, -source selection for peripheral clocks and control of power saving -clock gating mode. - -Required properties: - -- compatible: Should be "fsl,imx7ulp-scg1". -- reg :Should contain registers location and length. -- #clock-cells:Should be <1>. -- clocks: Should contain the fixed input clocks. -- clock-names: Should contain the following clock names: - "rosc", "sosc", "sirc", "firc", "upll", "mpll". - -Peripheral Clock Control (PCC) modules: -- -The Peripheral Clock Control (PCC) is responsible for clock selection, -optional division and clock gating mode for peripherals in their -respected power domain - -Required properties: -- compatible: Should be one of: - "fsl,imx7ulp-pcc2", - "fsl,imx7ulp-pcc3". -- reg :Should contain registers location and length. -- #clock-cells:Should be <1>. -- clocks: Should contain the fixed input clocks. -- clock-names: Should contain the following clock names: - "nic1_bus_clk", "nic1_clk", "ddr_clk", "apll_pfd2", - "apll_pfd1", "apll_pfd0", "upll", "sosc_bus_clk", - "mpll", "firc_bus_clk", "rosc", "spll_bus_clk"; - -The clock consumer should specify the desired clock by having the clock -ID in its "clocks" phandle cell. -See include/dt-bindings/clock/imx7ulp-clock.h -for the full list of i.MX7ULP clock IDs of each module. 
- -Examples: - -#include - -scg1: scg1@403e { - compatible = "fsl,imx7ulp-scg1; - reg = <0x403e 0x1>; - clocks = <>, <>, <>, -<>, <>, <>; - clock-names = "rosc", "sosc", "sirc", - "firc", "upll", "mpll"; - #clock-cells = <1>; -}; - -pcc2: pcc2@403f { - compatible = "fsl,imx7ulp-pcc2"; - reg = <0x403f 0x1>; - #clock-cells = <1>; - clocks = < IMX7ULP_CLK_NIC1_BUS_DIV>, -< IMX7ULP_CLK_NIC1_DIV>, -< IMX7ULP_CLK_DDR_DIV>, -< IMX7ULP_CLK_APLL_PFD2>, -< IMX7ULP_CLK_APLL_PFD1>, -< IMX7ULP_CLK_APLL_PFD0>, -< IMX7ULP_CLK_UPLL>, -< IMX7ULP_CLK_SOSC_BUS_CLK>, -< IMX7ULP_CLK_FIRC_BUS_CLK>, -< IMX7ULP_CLK_ROSC>, -< IMX7ULP_CLK_SPLL_BUS_CLK>; - clock-names = "nic1_bus_clk", "nic1_clk", "ddr_clk", - "apll_pfd2", "apll_pfd1", "apll_pfd0", - "upll", "sosc_bus_clk", "mpll", - "firc_bus_clk", "rosc", "spll_bus_clk"; -}; - -usdhc1: usdhc@4038 { - compatible = "fsl,imx7ulp-usdhc"; - reg = <0x4038 0x1>; -
Re: [PATCHv2] perf stat: Ensure group is defined on top of the same cpu mask
Em Tue, Jun 02, 2020 at 02:10:17PM +0200, Jiri Olsa escreveu: > On Tue, Jun 02, 2020 at 08:50:17PM +0900, Namhyung Kim wrote: > > Hi Jiri, > > > > On Tue, Jun 2, 2020 at 5:16 PM Jiri Olsa wrote: > > > > > > On Tue, Jun 02, 2020 at 11:47:19AM +0900, Namhyung Kim wrote: > > > > On Tue, Jun 2, 2020 at 1:21 AM Ian Rogers wrote: > > > > > > > > > > On Mon, Jun 1, 2020 at 1:20 AM Jiri Olsa wrote: > > > > > > > > > > > > Jin Yao reported the issue (and posted first versions of this > > > > > > change) > > > > > > with groups being defined over events with different cpu mask. > > > > > > > > > > > > This causes assert aborts in get_group_fd, like: > > > > > > > > > > > > # perf stat -M "C2_Pkg_Residency" -a -- sleep 1 > > > > > > perf: util/evsel.c:1464: get_group_fd: Assertion `!(fd == -1)' > > > > > > failed. > > > > > > Aborted > > > > > > > > > > > > All the events in the group have to be defined over the same > > > > > > cpus so the group_fd can be found for every leader/member pair. > > > > > > > > > > > > Adding check to ensure this condition is met and removing the > > > > > > group (with warning) if we detect mixed cpus, like: > > > > > > > > > > > > $ sudo perf stat -e > > > > > > '{power/energy-cores/,cycles},{instructions,power/energy-cores/}' > > > > > > WARNING: event cpu maps do not match, disabling group: > > > > > > anon group { power/energy-cores/, cycles } > > > > > > anon group { instructions, power/energy-cores/ } > > > > > > > > > > > > Ian asked also for cpu maps details, it's displayed in verbose mode: > > > > > > > > > > > > $ sudo perf stat -e '{cycles,power/energy-cores/}' -v > > > > > > WARNING: group events cpu maps do not match, disabling group: > > > > > > anon group { power/energy-cores/, cycles } > > > > > >power/energy-cores/: 0 > > > > > >cycles: 0-7 > > > > > > anon group { instructions, power/energy-cores/ } > > > > > >instructions: 0-7 > > > > > >power/energy-cores/: 0 > > > > > > > > > > This is great! 
A nit, would 'grouped events cpus do not match' read > > > > > better? I think the cpu map is more of an internal naming convention. > > > > Allowed cpus? > > > > > > hum, what you mean? > > > > I mean that we can use 'allowed cpus' rather then 'cpu map' in the message. > > Something like this? > > > > allowed cpus for events in a group do not match, disabling group: > > hm, I like more the one Ian suggested.. anyway, leaving this to Arnaldo, > he can change that before committing ;-) I think its ok as-is, Ian, can I have your acked-by? - Arnaldo
Re: [Cocci] [PATCH 1/2] Coccinelle: extend memdup_user transformation with GFP_USER
On Sat, 30 May 2020, Denis Efremov wrote: > Match GFP_USER allocations with memdup_user.cocci rule. > Commit 6c2c97a24f09 ("memdup_user(): switch to GFP_USER") switched > memdup_user() from GFP_KERNEL to GFP_USER. In most cases it is still > a good idea to use memdup_user() for GFP_KERNEL allocations. The > motivation behind altering memdup_user() to GFP_USER is here: > https://lkml.org/lkml/2018/1/6/333 Thanks for the patch series. I will test them and try to push them to Linus shortly. julia > > Signed-off-by: Denis Efremov > --- > scripts/coccinelle/api/memdup_user.cocci | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/scripts/coccinelle/api/memdup_user.cocci > b/scripts/coccinelle/api/memdup_user.cocci > index c809ab10bbce..49f487e6a5c8 100644 > --- a/scripts/coccinelle/api/memdup_user.cocci > +++ b/scripts/coccinelle/api/memdup_user.cocci > @@ -20,7 +20,7 @@ expression from,to,size; > identifier l1,l2; > @@ > > -- to = \(kmalloc\|kzalloc\)(size,GFP_KERNEL); > +- to = \(kmalloc\|kzalloc\)(size,\(GFP_KERNEL\|GFP_USER\)); > + to = memdup_user(from,size); > if ( > - to==NULL > @@ -43,7 +43,7 @@ position p; > statement S1,S2; > @@ > > -* to = \(kmalloc@p\|kzalloc@p\)(size,GFP_KERNEL); > +* to = \(kmalloc@p\|kzalloc@p\)(size,\(GFP_KERNEL\|GFP_USER\)); > if (to==NULL || ...) S1 > if (copy_from_user(to, from, size) != 0) > S2 > -- > 2.26.2 > > ___ > Cocci mailing list > co...@systeme.lip6.fr > https://systeme.lip6.fr/mailman/listinfo/cocci >
Re: [PATCH v2 3/7] selftests/ftrace: Add "requires:" list support
On Tue, 2 Jun 2020 18:08:31 +0900 Masami Hiramatsu wrote:

> +++ b/tools/testing/selftests/ftrace/test.d/template
> @@ -1,6 +1,7 @@
>  #!/bin/sh
>  # SPDX-License-Identifier: GPL-2.0
>  # description: %HERE DESCRIBE WHAT THIS DOES%
> +# requires: %HERE LIST UP REQUIRED FILES%

Not sure what you mean by "LIST UP". Perhaps you mean "LIST OF"?

-- Steve

>  # you have to add ".tc" extention for your testcase file
>  # Note that all tests are run with "errexit" option.
KASAN: use-after-free Read in usb_udc_uevent
Hello, syzbot found the following crash on: HEAD commit:ffeb595d Merge tag 'powerpc-5.7-6' of git://git.kernel.org.. git tree: upstream console output: https://syzkaller.appspot.com/x/log.txt?x=17f7f2b110 kernel config: https://syzkaller.appspot.com/x/.config?x=9ac87c7c2ba15abf dashboard link: https://syzkaller.appspot.com/bug?extid=b0de012ceb1e2a97891b compiler: clang version 10.0.0 (https://github.com/llvm/llvm-project/ c2443155a0fb245c8f17f2c1c72b6ea391e86e81) Unfortunately, I don't have any reproducer for this crash yet. IMPORTANT: if you fix the bug, please add the following tag to the commit: Reported-by: syzbot+b0de012ceb1e2a978...@syzkaller.appspotmail.com == BUG: KASAN: use-after-free in usb_udc_uevent+0xac/0x120 drivers/usb/gadget/udc/core.c:1593 Read of size 8 at addr 888091a7c050 by task systemd-udevd/14223 CPU: 1 PID: 14223 Comm: systemd-udevd Not tainted 5.7.0-rc7-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1e9/0x30e lib/dump_stack.c:118 print_address_description+0x74/0x5c0 mm/kasan/report.c:382 __kasan_report+0x103/0x1a0 mm/kasan/report.c:511 kasan_report+0x4d/0x80 mm/kasan/common.c:625 Allocated by task 14178: save_stack mm/kasan/common.c:49 [inline] set_track mm/kasan/common.c:57 [inline] __kasan_kmalloc+0x114/0x160 mm/kasan/common.c:495 kmem_cache_alloc_trace+0x234/0x300 mm/slab.c:3551 kmalloc include/linux/slab.h:555 [inline] kzalloc include/linux/slab.h:669 [inline] dev_new drivers/usb/gadget/legacy/raw_gadget.c:182 [inline] raw_open+0x87/0x500 drivers/usb/gadget/legacy/raw_gadget.c:372 misc_open+0x346/0x3c0 drivers/char/misc.c:141 chrdev_open+0x498/0x580 fs/char_dev.c:414 do_dentry_open+0x82e/0x10b0 fs/open.c:797 do_open fs/namei.c:3229 [inline] path_openat+0x2790/0x38b0 fs/namei.c:3346 do_filp_open+0x191/0x3a0 fs/namei.c:3373 do_sys_openat2+0x463/0x770 fs/open.c:1148 do_sys_open fs/open.c:1164 [inline] ksys_open 
include/linux/syscalls.h:1386 [inline] __do_sys_open fs/open.c:1170 [inline] __se_sys_open fs/open.c:1168 [inline] __x64_sys_open+0x1af/0x1e0 fs/open.c:1168 do_syscall_64+0xf3/0x1b0 arch/x86/entry/common.c:295 entry_SYSCALL_64_after_hwframe+0x49/0xb3 Freed by task 14174: save_stack mm/kasan/common.c:49 [inline] set_track mm/kasan/common.c:57 [inline] kasan_set_free_info mm/kasan/common.c:317 [inline] __kasan_slab_free+0x125/0x190 mm/kasan/common.c:456 __cache_free mm/slab.c:3426 [inline] kfree+0x10a/0x220 mm/slab.c:3757 raw_release+0x130/0x1e0 drivers/usb/gadget/legacy/raw_gadget.c:411 __fput+0x2ed/0x750 fs/file_table.c:280 task_work_run+0x147/0x1d0 kernel/task_work.c:123 exit_task_work include/linux/task_work.h:22 [inline] do_exit+0x5ef/0x1f80 kernel/exit.c:796 do_group_exit+0x15e/0x2c0 kernel/exit.c:894 get_signal+0x13cf/0x1d60 kernel/signal.c:2739 do_signal+0x33/0x610 arch/x86/kernel/signal.c:784 exit_to_usermode_loop arch/x86/entry/common.c:161 [inline] prepare_exit_to_usermode+0x32a/0x600 arch/x86/entry/common.c:196 entry_SYSCALL_64_after_hwframe+0x49/0xb3 The buggy address belongs to the object at 888091a7c000 which belongs to the cache kmalloc-4k of size 4096 The buggy address is located 80 bytes inside of 4096-byte region [888091a7c000, 888091a7d000) The buggy address belongs to the page: page:ea0002469f00 refcount:1 mapcount:0 mapping: index:0x0 head:ea0002469f00 order:1 compound_mapcount:0 flags: 0xfffe010200(slab|head) raw: 00fffe010200 ea0001043588 ea0001258b88 8880aa402000 raw: 888091a7c000 00010001 page dumped because: kasan: bad access detected Memory state around the buggy address: 888091a7bf00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc 888091a7bf80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc >888091a7c000: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ 888091a7c080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb 888091a7c100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb == --- This bug is generated by a bot. It may contain errors. 
See https://goo.gl/tpsmEJ for more information about syzbot. syzbot engineers can be reached at syzkal...@googlegroups.com. syzbot will keep track of this bug report. See: https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
Re: [GIT PULL][PATCH v5 0/8] Add support for ZSTD-compressed kernel and initramfs
On Mon, Jun 1, 2020 at 2:59 PM Norbert Lange wrote: > > The series seems to be stuck in limbo, and I got the hint to bring > this to Andrew's attention [1]. > Hope this will finally end in upstream, been using these patches for ~2 years. > > Regards, Norbert > > [1] - https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=955469 Began using your patch for Debian Buster backport of 0.136 initramfs-tools, and Nick Terrel's kernel patch for 5.6 -- but modified for 5.5.17-19 in my reiser4 builds -- and set both defaults to zstd. Thus far, as long as there exists 'a priori' the Zstd package, the combination works in both local and my custom Google cloud instances installations. Best Professional Regards. -- Jose R R http://metztli.it - Download Metztli Reiser4: Debian Buster w/ Linux 5.5.19 AMD64 - feats ZSTD compression https://sf.net/projects/metztli-reiser4/ --- Official current Reiser4 resources: https://reiser4.wiki.kernel.org/
[PATCH v14 15/15] MAINTAINERS: Update for DAMON
From: SeongJae Park This commit updates MAINTAINERS file for DAMON related files. Signed-off-by: SeongJae Park --- MAINTAINERS | 12 1 file changed, 12 insertions(+) diff --git a/MAINTAINERS b/MAINTAINERS index 50659d76976b..2396a9098715 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -4686,6 +4686,18 @@ F: net/ax25/ax25_out.c F: net/ax25/ax25_timer.c F: net/ax25/sysctl_net_ax25.c +DATA ACCESS MONITOR +M: SeongJae Park +L: linux...@kvack.org +S: Maintained +F: Documentation/admin-guide/mm/damon/* +F: include/linux/damon.h +F: include/trace/events/damon.h +F: mm/damon-test.h +F: mm/damon.c +F: tools/damon/* +F: tools/testing/selftests/damon/* + DAVICOM FAST ETHERNET (DMFE) NETWORK DRIVER L: net...@vger.kernel.org S: Orphan -- 2.17.1
[PATCH v14 14/15] mm/damon: Add user space selftests
From: SeongJae Park This commit adds a simple user space tests for DAMON. The tests are using kselftest framework. Signed-off-by: SeongJae Park --- tools/testing/selftests/damon/Makefile| 7 + .../selftests/damon/_chk_dependency.sh| 28 tools/testing/selftests/damon/_chk_record.py | 108 ++ .../testing/selftests/damon/debugfs_attrs.sh | 139 ++ .../testing/selftests/damon/debugfs_record.sh | 50 +++ 5 files changed, 332 insertions(+) create mode 100644 tools/testing/selftests/damon/Makefile create mode 100644 tools/testing/selftests/damon/_chk_dependency.sh create mode 100644 tools/testing/selftests/damon/_chk_record.py create mode 100755 tools/testing/selftests/damon/debugfs_attrs.sh create mode 100755 tools/testing/selftests/damon/debugfs_record.sh diff --git a/tools/testing/selftests/damon/Makefile b/tools/testing/selftests/damon/Makefile new file mode 100644 index ..cfd5393a4639 --- /dev/null +++ b/tools/testing/selftests/damon/Makefile @@ -0,0 +1,7 @@ +# SPDX-License-Identifier: GPL-2.0 +# Makefile for damon selftests + +TEST_FILES = _chk_dependency.sh _chk_record_file.py +TEST_PROGS = debugfs_attrs.sh debugfs_record.sh + +include ../lib.mk diff --git a/tools/testing/selftests/damon/_chk_dependency.sh b/tools/testing/selftests/damon/_chk_dependency.sh new file mode 100644 index ..814dcadd5e96 --- /dev/null +++ b/tools/testing/selftests/damon/_chk_dependency.sh @@ -0,0 +1,28 @@ +#!/bin/bash +# SPDX-License-Identifier: GPL-2.0 + +# Kselftest framework requirement - SKIP code is 4. +ksft_skip=4 + +DBGFS=/sys/kernel/debug/damon + +if [ $EUID -ne 0 ]; +then + echo "Run as root" + exit $ksft_skip +fi + +if [ ! -d $DBGFS ] +then + echo "$DBGFS not found" + exit $ksft_skip +fi + +for f in attrs record pids monitor_on +do + if [ ! 
-f "$DBGFS/$f" ] + then + echo "$f not found" + exit 1 + fi +done diff --git a/tools/testing/selftests/damon/_chk_record.py b/tools/testing/selftests/damon/_chk_record.py new file mode 100644 index ..5cfcf4161404 --- /dev/null +++ b/tools/testing/selftests/damon/_chk_record.py @@ -0,0 +1,108 @@ +#!/usr/bin/env python3 +# SPDX-License-Identifier: GPL-2.0 + +"Check whether the DAMON record file is valid" + +import argparse +import struct +import sys + +fmt_version = 0 + +def set_fmt_version(f): +global fmt_version + +mark = f.read(16) +if mark == b'damon_recfmt_ver': +fmt_version = struct.unpack('i', f.read(4))[0] +else: +fmt_version = 0 +f.seek(0) +return fmt_version + +def read_pid(f): +if fmt_version == 0: +pid = struct.unpack('L', f.read(8))[0] +else: +pid = struct.unpack('i', f.read(4))[0] +def err_percent(val, expected): +return abs(val - expected) / expected * 100 + +def chk_task_info(f): +pid = read_pid(f) +nr_regions = struct.unpack('I', f.read(4))[0] + +if nr_regions > max_nr_regions: +print('too many regions: %d > %d' % (nr_regions, max_nr_regions)) +exit(1) + +nr_gaps = 0 +eaddr = 0 +for r in range(nr_regions): +saddr = struct.unpack('L', f.read(8))[0] +if eaddr and saddr != eaddr: +nr_gaps += 1 +eaddr = struct.unpack('L', f.read(8))[0] +nr_accesses = struct.unpack('I', f.read(4))[0] + +if saddr >= eaddr: +print('wrong region [%d,%d)' % (saddr, eaddr)) +exit(1) + +max_nr_accesses = aint / sint +if nr_accesses > max_nr_accesses: +if err_percent(nr_accesses, max_nr_accesses) > 15: +print('too high nr_access: expected %d but %d' % +(max_nr_accesses, nr_accesses)) +exit(1) +if nr_gaps != 2: +print('number of gaps are not two but %d' % nr_gaps) +exit(1) + +def parse_time_us(bindat): +sec = struct.unpack('l', bindat[0:8])[0] +nsec = struct.unpack('l', bindat[8:16])[0] +return (sec * 10 + nsec) / 1000 + +def main(): +global sint +global aint +global min_nr +global max_nr_regions + +parser = argparse.ArgumentParser() +parser.add_argument('file', metavar='', 
+help='path to the record file') +parser.add_argument('--attrs', metavar='', +default='5000 10 100 10 1000', +help='content of debugfs attrs file') +args = parser.parse_args() +file_path = args.file +attrs = [int(x) for x in args.attrs.split()] +sint, aint, rint, min_nr, max_nr_regions = attrs + +with open(file_path, 'rb') as f: +set_fmt_version(f) +last_aggr_time = None +while True: +timebin = f.read(16) +if len(timebin) != 16: +break + +now = parse_time_us(timebin) +if not last_aggr_time: +last_aggr_time = now +else: +error
[PATCH v14 12/15] mm/damon: Add kunit tests
From: SeongJae Park This commit adds kunit based unit tests for DAMON. Signed-off-by: SeongJae Park Reviewed-by: Brendan Higgins --- mm/Kconfig | 11 + mm/damon-test.h | 622 mm/damon.c | 6 + 3 files changed, 639 insertions(+) create mode 100644 mm/damon-test.h diff --git a/mm/Kconfig b/mm/Kconfig index ecea0889ea35..91473ed9e7c7 100644 --- a/mm/Kconfig +++ b/mm/Kconfig @@ -879,4 +879,15 @@ config DAMON and 2) sufficiently light-weight so that it can be applied online. If unsure, say N. +config DAMON_KUNIT_TEST + bool "Test for damon" + depends on DAMON=y && KUNIT + help + This builds the DAMON Kunit test suite. + + For more information on KUnit and unit tests in general, please refer + to the KUnit documentation. + + If unsure, say N. + endmenu diff --git a/mm/damon-test.h b/mm/damon-test.h new file mode 100644 index ..cf715529ff64 --- /dev/null +++ b/mm/damon-test.h @@ -0,0 +1,622 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Data Access Monitor Unit Tests + * + * Copyright 2019 Amazon.com, Inc. or its affiliates. All rights reserved. 
+ * + * Author: SeongJae Park + */ + +#ifdef CONFIG_DAMON_KUNIT_TEST + +#ifndef _DAMON_TEST_H +#define _DAMON_TEST_H + +#include + +static void damon_test_str_to_pids(struct kunit *test) +{ + char *question; + int *answers; + int expected[] = {12, 35, 46}; + ssize_t nr_integers = 0, i; + + question = "123"; + answers = str_to_pids(question, strnlen(question, 128), _integers); + KUNIT_EXPECT_EQ(test, (ssize_t)1, nr_integers); + KUNIT_EXPECT_EQ(test, 123, answers[0]); + kfree(answers); + + question = "123abc"; + answers = str_to_pids(question, strnlen(question, 128), _integers); + KUNIT_EXPECT_EQ(test, (ssize_t)1, nr_integers); + KUNIT_EXPECT_EQ(test, 123, answers[0]); + kfree(answers); + + question = "a123"; + answers = str_to_pids(question, strnlen(question, 128), _integers); + KUNIT_EXPECT_EQ(test, (ssize_t)0, nr_integers); + KUNIT_EXPECT_PTR_EQ(test, answers, (int *)NULL); + + question = "12 35"; + answers = str_to_pids(question, strnlen(question, 128), _integers); + KUNIT_EXPECT_EQ(test, (ssize_t)2, nr_integers); + for (i = 0; i < nr_integers; i++) + KUNIT_EXPECT_EQ(test, expected[i], answers[i]); + kfree(answers); + + question = "12 35 46"; + answers = str_to_pids(question, strnlen(question, 128), _integers); + KUNIT_EXPECT_EQ(test, (ssize_t)3, nr_integers); + for (i = 0; i < nr_integers; i++) + KUNIT_EXPECT_EQ(test, expected[i], answers[i]); + kfree(answers); + + question = "12 35 abc 46"; + answers = str_to_pids(question, strnlen(question, 128), _integers); + KUNIT_EXPECT_EQ(test, (ssize_t)2, nr_integers); + for (i = 0; i < 2; i++) + KUNIT_EXPECT_EQ(test, expected[i], answers[i]); + kfree(answers); + + question = ""; + answers = str_to_pids(question, strnlen(question, 128), _integers); + KUNIT_EXPECT_EQ(test, (ssize_t)0, nr_integers); + KUNIT_EXPECT_PTR_EQ(test, (int *)NULL, answers); + kfree(answers); + + question = "\n"; + answers = str_to_pids(question, strnlen(question, 128), _integers); + KUNIT_EXPECT_EQ(test, (ssize_t)0, nr_integers); + 
KUNIT_EXPECT_PTR_EQ(test, (int *)NULL, answers); + kfree(answers); +} + +static void damon_test_regions(struct kunit *test) +{ + struct damon_region *r; + struct damon_task *t; + + r = damon_new_region(_user_ctx, 1, 2); + KUNIT_EXPECT_EQ(test, 1ul, r->vm_start); + KUNIT_EXPECT_EQ(test, 2ul, r->vm_end); + KUNIT_EXPECT_EQ(test, 0u, r->nr_accesses); + + t = damon_new_task(42); + KUNIT_EXPECT_EQ(test, 0u, nr_damon_regions(t)); + + damon_add_region(r, t); + KUNIT_EXPECT_EQ(test, 1u, nr_damon_regions(t)); + + damon_del_region(r); + KUNIT_EXPECT_EQ(test, 0u, nr_damon_regions(t)); + + damon_free_task(t); +} + +static void damon_test_tasks(struct kunit *test) +{ + struct damon_ctx *c = _user_ctx; + struct damon_task *t; + + t = damon_new_task(42); + KUNIT_EXPECT_EQ(test, 42, t->pid); + KUNIT_EXPECT_EQ(test, 0u, nr_damon_tasks(c)); + + damon_add_task(_user_ctx, t); + KUNIT_EXPECT_EQ(test, 1u, nr_damon_tasks(c)); + + damon_destroy_task(t); + KUNIT_EXPECT_EQ(test, 0u, nr_damon_tasks(c)); +} + +static void damon_test_set_pids(struct kunit *test) +{ + struct damon_ctx *ctx = _user_ctx; + int pids[] = {1, 2, 3}; + char buf[64]; + + damon_set_pids(ctx, pids, 3); + damon_sprint_pids(ctx, buf, 64); + KUNIT_EXPECT_STREQ(test, (char *)buf, "1 2 3\n"); + + damon_set_pids(ctx, NULL, 0); + damon_sprint_pids(ctx, buf,
Re: [PATCH] video: fbdev: pxafb: Use correct return value for pxafb_probe()
Hi Bart, On Mon, Jun 01, 2020 at 03:25:25PM +0200, Bartlomiej Zolnierkiewicz wrote: > > On 5/25/20 9:11 AM, Tiezhu Yang wrote: > > When call function devm_platform_ioremap_resource(), we should use IS_ERR() > > to check the return value and return PTR_ERR() if failed. > > > > Signed-off-by: Tiezhu Yang > > Applied to drm-misc-next tree (patch should show up in v5.9), thanks. Thanks for going through all the backlog of patches in the fbdev area every once in a while! That kind of housekeeping work is often underappreciated, but rather important to keep the ship going. Cheers, Daniel PS: Of course also holds for everyone else doing this in other areas. fbdev simply stuck out just now catching up on mails. > > Best regards, > -- > Bartlomiej Zolnierkiewicz > Samsung R Institute Poland > Samsung Electronics > > > --- > > drivers/video/fbdev/pxafb.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/drivers/video/fbdev/pxafb.c b/drivers/video/fbdev/pxafb.c > > index 00b96a7..423331c 100644 > > --- a/drivers/video/fbdev/pxafb.c > > +++ b/drivers/video/fbdev/pxafb.c > > @@ -2305,7 +2305,7 @@ static int pxafb_probe(struct platform_device *dev) > > fbi->mmio_base = devm_platform_ioremap_resource(dev, 0); > > if (IS_ERR(fbi->mmio_base)) { > > dev_err(>dev, "failed to get I/O memory\n"); > > - ret = -EBUSY; > > + ret = PTR_ERR(fbi->mmio_base); > > goto failed; > > } > > > > > ___ > dri-devel mailing list > dri-de...@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/dri-devel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
Re: [PATCH 11/15] mmc: sdhci: use PCI_IRQ_MSI_TYPES where appropriate
On 2/06/20 12:20 pm, Piotr Stankiewicz wrote: > Seeing as there is shorthand available to use when asking for any type > of interrupt, or any type of message signalled interrupt, leverage it. > > Signed-off-by: Piotr Stankiewicz > Reviewed-by: Andy Shevchenko Acked-by: Adrian Hunter > --- > drivers/mmc/host/sdhci-pci-gli.c | 3 +-- > drivers/mmc/host/sdhci-pci-o2micro.c | 3 +-- > 2 files changed, 2 insertions(+), 4 deletions(-) > > diff --git a/drivers/mmc/host/sdhci-pci-gli.c > b/drivers/mmc/host/sdhci-pci-gli.c > index fd76aa672e02..d14997f9cbf9 100644 > --- a/drivers/mmc/host/sdhci-pci-gli.c > +++ b/drivers/mmc/host/sdhci-pci-gli.c > @@ -271,8 +271,7 @@ static void gli_pcie_enable_msi(struct sdhci_pci_slot > *slot) > { > int ret; > > - ret = pci_alloc_irq_vectors(slot->chip->pdev, 1, 1, > - PCI_IRQ_MSI | PCI_IRQ_MSIX); > + ret = pci_alloc_irq_vectors(slot->chip->pdev, 1, 1, PCI_IRQ_MSI_TYPES); > if (ret < 0) { > pr_warn("%s: enable PCI MSI failed, error=%d\n", > mmc_hostname(slot->host->mmc), ret); > diff --git a/drivers/mmc/host/sdhci-pci-o2micro.c > b/drivers/mmc/host/sdhci-pci-o2micro.c > index fa8105087d68..498c51207bfb 100644 > --- a/drivers/mmc/host/sdhci-pci-o2micro.c > +++ b/drivers/mmc/host/sdhci-pci-o2micro.c > @@ -470,8 +470,7 @@ static void sdhci_pci_o2_enable_msi(struct sdhci_pci_chip > *chip, > return; > } > > - ret = pci_alloc_irq_vectors(chip->pdev, 1, 1, > - PCI_IRQ_MSI | PCI_IRQ_MSIX); > + ret = pci_alloc_irq_vectors(chip->pdev, 1, 1, PCI_IRQ_MSI_TYPES); > if (ret < 0) { > pr_err("%s: enable PCI MSI failed, err=%d\n", > mmc_hostname(host->mmc), ret); >
[PATCH v14 13/15] mm/damon-test: Add a kunit test for recording setup
From: SeongJae Park This commit adds another unit test case for the recording setup. Signed-off-by: SeongJae Park --- mm/damon-test.h | 13 + 1 file changed, 13 insertions(+) diff --git a/mm/damon-test.h b/mm/damon-test.h index cf715529ff64..5b18619efe72 100644 --- a/mm/damon-test.h +++ b/mm/damon-test.h @@ -137,6 +137,18 @@ static void damon_test_set_pids(struct kunit *test) KUNIT_EXPECT_STREQ(test, (char *)buf, "\n"); } +static void damon_test_set_recording(struct kunit *test) +{ + struct damon_ctx *ctx = _user_ctx; + + damon_set_recording(ctx, 4242, "foo.bar"); + KUNIT_EXPECT_EQ(test, ctx->rbuf_len, 4242u); + KUNIT_EXPECT_STREQ(test, ctx->rfile_path, "foo.bar"); + damon_set_recording(ctx, 42, "foo"); + KUNIT_EXPECT_EQ(test, ctx->rbuf_len, 42u); + KUNIT_EXPECT_STREQ(test, ctx->rfile_path, "foo"); +} + /* * Test damon_three_regions_in_vmas() function * @@ -596,6 +608,7 @@ static struct kunit_case damon_test_cases[] = { KUNIT_CASE(damon_test_tasks), KUNIT_CASE(damon_test_regions), KUNIT_CASE(damon_test_set_pids), + KUNIT_CASE(damon_test_set_recording), KUNIT_CASE(damon_test_three_regions_in_vmas), KUNIT_CASE(damon_test_aggregate), KUNIT_CASE(damon_test_write_rbuf), -- 2.17.1
Re: [PATCH] vimc: debayer: Add support for ARGB format
On 02.06.20 14:45, Laurent Pinchart wrote:
> Hello,
> 
> On Tue, Jun 02, 2020 at 08:31:26AM -0300, Helen Koike wrote:
>> On 6/2/20 8:24 AM, Kieran Bingham wrote:
>>> On 02/06/2020 11:55, Helen Koike wrote:
>>>> On 6/2/20 7:52 AM, Dafna Hirschfeld wrote:
>>>>> On 01.06.20 14:16, Kaaira Gupta wrote:
>>>>>> On Fri, May 29, 2020 at 05:43:57PM +0200, Dafna Hirschfeld wrote:
>>>>>>> Hi,
>>>>>>> Thanks for the patch
>>>>>>>
>>>>>>> I don't know how real devices handle ARGB formats,
>>>>>>> I wonder if it should be the part of the debayer.
>>>>>>
>>>>>> Hi! qcam tries to support BA24 as it is one of the formats that vimc
>>>>>> lists as its supported formats with --list-formats. Shouldn't BA24 be
>>>>>> possible to capture with vimc?
>>>>>
>>>>> Hi,
>>>>> Just to clarify, when listing the supported formats of a video node,
>>>>> the node lists the formats that the video node as an independent media
>>>>> entity supports. It does not mean that the 'camera' as a whole (that
>>>>> is, the media topology graph) supports all the formats that the video
>>>>> node lists. When interacting with a video node or a subdevice node,
>>>>> one interacts only with that specific entity.
>>>>> In the case of vimc, the RGB video node as an independent entity
>>>>> supports BA24 so the format appears in the list of its supported
>>>>> formats. But since the Debayer does not support it, the format can not
>>>>> be generated by the entire vimc topology. This is not a bug.
>>>>>
>>>>> Is there a valid configuration for the vimc pipeline that produces BA24 ?
>>>>
>>>> I think there isn't
> 
> I agree that not all pipeline configurations need to support every
> format, but we shouldn't report a format that can't be produced at all.
> 
> This being said, and as discussed before, the de-bayering subdev should
> just produce MEDIA_BUS_FMT_RGB888_1X24, and the video node should then
> implement the RGB pixel formats. BA24 should likely be one of the
> supported formats (or maybe BX24 ?).

So you mean that the video node should support it, so when it receives an
RGB format on the source pad it converts it to BA24 or BX24 ? It makes
sense. I guess both BA24 and BX24 can be added. I see in the
pixfmt-rgb.html doc that probably the control V4L2_CID_ALPHA_COMPONENT
should then be added.

> This is also my understanding.
> 
>>>> You should have an -EPIPE error when start streaming though, it
>>>> shouldn't fail silently.
>>>
>>> Yes, we had -EPIPE, and that is what I think we were trying to resolve.
>>>
>>> How would userspace be expected to detect what formats to use ? Should
>>> the available formats on the capture node depend on the current linking
>>> of the media graph?
>>
>> This is a good question, I don't recall v4l2 API defining this.
> 
> A recent extension to VIDIOC_ENUMFMT allows enumerating pixel formats
> for a given media bus code, I think that's the way forward.
> 
>> It would be a bit hard to implement in Vimc, specially when we have
>> configfs for custom topology, since the capture would need to query all
>> the pipeline. But could be implemented.
>>
>>> Otherwise, to know what formats are supported - userspace must first
>>> 'get a list of formats' then try to 'set' the formats to know what is
>>> possible?

Yes, there is a doc file that explains that it should be done in a
"bottom-up" way, that is, starting with configuring the sensor, then
adjusting the debayer to the sensor output, then adjusting the scaler to
the debayer output and then adjusting the video node output to the scaler
output. One should also use the 'try' version of the setting at the stage
of adjusting the final configuration. The detailed explanation is in
Documentation/output/userspace-api/media/v4l/dev-subdev.html

Thanks,
Dafna

>> At the moment yes.
>>
>>> Or should (given VIMC is quite specialist anyway) userspace 'just know'
>>> what is capable all the same?
>>>
>>> That's possibly fine, as we can simply remove support for the ARGB
>>> formats from the libcamera pipeline handler if it is never expected to
>>> be supported.
>>
>> With the configfs feature, you could build a topology with
>> sensor->capture, and ARGB would be supported.
>>
>>> But then as a further question - what formats will we expect VIMC to
>>> support? VIVID has a (very) wide range of formats.
>>>
>>> Would we ever expect VIMC to be as configurable?
>>> Or is the scope limited to what we have today?
>>
>> I know it is very limited atm, but I would like to increase the range,
>> I'm just with a limited bandwidth to work on it.
>>
>> If yes, which entity should support it, if not debayer? Should there be
>> a separate conversion entity, or should we keep the support in debayer
>> itself for efficiency issues?
>>
>>>>>>> On 28.05.20 20:57, Kaaira Gupta wrote:
>>>>>>>> Running qcam for pixelformat 0x34324142 showed that vimc debayer
>>>>>>>> does not support it. Hence, add the support for Alpha (255).
>>>>>>>
>>>>>>> I would change the commit log to:
>>>>>>>
>>>>>>> Add support for V4L2_PIX_FMT_RGB24 format in the debayer
>>>>>>> and set the alpha channel to constant 255.
>>>>>>>
>>>>>>>> Signed-off-by: Kaaira Gupta
>>>>>>>> ---
>>>>>>>>  .../media/test-drivers/vimc/vimc-debayer.c | 27 ---
>>>>>>>>  1 file changed, 18 insertions(+), 9 deletions(-)
>>>>>>>>
>>>>>>>> diff --git a/drivers/media/test-drivers/vimc/vimc-debayer.c b/drivers/media/test-drivers/vimc/vimc-debayer.c
>>>>>>>> index
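The "bottom-up" configuration order the thread describes — configure the sensor, then fit each downstream entity to its upstream output — can be modeled as each stage clamping the requested format to one it can actually produce. A toy user-space sketch (the format lists are assumptions for illustration, not the drivers' actual tables) of why the capture node can *list* BA24 yet never produce it end-to-end:

```python
def negotiate(pipeline, wanted):
    # Walk the pipeline source-to-sink; each stage keeps the requested
    # format if it supports it, else falls back to the first format it
    # can produce, so the sink sees whatever survives the chain.
    fmt = wanted
    for stage, supported in pipeline:
        if fmt not in supported:
            fmt = supported[0]
    return fmt

# Hypothetical per-entity format lists for the vimc chain discussed above.
pipeline = [('sensor',  ['SBGGR8']),
            ('debayer', ['RGB888']),           # debayer only emits RGB888
            ('capture', ['RGB888', 'BA24'])]   # node *lists* BA24 anyway
print(negotiate(pipeline, 'BA24'))  # RGB888 -- BA24 never survives the debayer
```

With a sensor->capture topology (as possible via configfs), the same walk would let BA24 through, which matches Helen's point above.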
[PATCH v14 10/15] tools: Add a minimal user-space tool for DAMON
From: SeongJae Park

This commit adds a shallow wrapper python script, ``/tools/damon/damo``,
that provides a more convenient interface. Note that it is only aimed to
be used as a minimal reference of the DAMON's debugfs interfaces and for
debugging of DAMON itself.

Signed-off-by: SeongJae Park
---
 tools/damon/.gitignore    |   1 +
 tools/damon/_dist.py      |  36
 tools/damon/_recfile.py   |  23 +++
 tools/damon/bin2txt.py    |  67 +++
 tools/damon/damo          |  37
 tools/damon/heats.py      | 362 ++
 tools/damon/nr_regions.py |  91 ++
 tools/damon/record.py     | 212 ++
 tools/damon/report.py     |  45 +
 tools/damon/wss.py        |  97 ++
 10 files changed, 971 insertions(+)
 create mode 100644 tools/damon/.gitignore
 create mode 100644 tools/damon/_dist.py
 create mode 100644 tools/damon/_recfile.py
 create mode 100644 tools/damon/bin2txt.py
 create mode 100755 tools/damon/damo
 create mode 100644 tools/damon/heats.py
 create mode 100644 tools/damon/nr_regions.py
 create mode 100644 tools/damon/record.py
 create mode 100644 tools/damon/report.py
 create mode 100644 tools/damon/wss.py

diff --git a/tools/damon/.gitignore b/tools/damon/.gitignore
new file mode 100644
index ..96403d36ff93
--- /dev/null
+++ b/tools/damon/.gitignore
@@ -0,0 +1 @@
+__pycache__/*

diff --git a/tools/damon/_dist.py b/tools/damon/_dist.py
new file mode 100644
index ..9851ec964e5c
--- /dev/null
+++ b/tools/damon/_dist.py
@@ -0,0 +1,36 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: GPL-2.0
+
+import os
+import struct
+import subprocess
+
+def access_patterns(f):
+    nr_regions = struct.unpack('I', f.read(4))[0]
+
+    patterns = []
+    for r in range(nr_regions):
+        saddr = struct.unpack('L', f.read(8))[0]
+        eaddr = struct.unpack('L', f.read(8))[0]
+        nr_accesses = struct.unpack('I', f.read(4))[0]
+        patterns.append([eaddr - saddr, nr_accesses])
+    return patterns
+
+def plot_dist(data_file, output_file, xlabel, ylabel):
+    terminal = output_file.split('.')[-1]
+    if not terminal in ['pdf', 'jpeg', 'png', 'svg']:
+        os.remove(data_file)
+        print("Unsupported plot output type.")
+        exit(-1)
+
+    gnuplot_cmd = """
+    set term %s;
+    set output '%s';
+    set key off;
+    set xlabel '%s';
+    set ylabel '%s';
+    plot '%s' with linespoints;""" % (terminal, output_file, xlabel, ylabel,
+            data_file)
+    subprocess.call(['gnuplot', '-e', gnuplot_cmd])
+    os.remove(data_file)
+

diff --git a/tools/damon/_recfile.py b/tools/damon/_recfile.py
new file mode 100644
index ..331b4d8165d8
--- /dev/null
+++ b/tools/damon/_recfile.py
@@ -0,0 +1,23 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: GPL-2.0
+
+import struct
+
+fmt_version = 0
+
+def set_fmt_version(f):
+    global fmt_version
+
+    mark = f.read(16)
+    if mark == b'damon_recfmt_ver':
+        fmt_version = struct.unpack('i', f.read(4))[0]
+    else:
+        fmt_version = 0
+        f.seek(0)
+    return fmt_version
+
+def pid(f):
+    if fmt_version == 0:
+        return struct.unpack('L', f.read(8))[0]
+    else:
+        return struct.unpack('i', f.read(4))[0]

diff --git a/tools/damon/bin2txt.py b/tools/damon/bin2txt.py
new file mode 100644
index ..8b9b57a0d727
--- /dev/null
+++ b/tools/damon/bin2txt.py
@@ -0,0 +1,67 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: GPL-2.0
+
+import argparse
+import os
+import struct
+import sys
+
+import _recfile
+
+def parse_time(bindat):
+    "bindat should be 16 bytes"
+    sec = struct.unpack('l', bindat[0:8])[0]
+    nsec = struct.unpack('l', bindat[8:16])[0]
+    return sec * 1000000000 + nsec;
+
+def pr_region(f):
+    saddr = struct.unpack('L', f.read(8))[0]
+    eaddr = struct.unpack('L', f.read(8))[0]
+    nr_accesses = struct.unpack('I', f.read(4))[0]
+    print("%012x-%012x(%10d):\t%d" %
+            (saddr, eaddr, eaddr - saddr, nr_accesses))
+
+def pr_task_info(f):
+    pid = _recfile.pid(f)
+    print("pid: ", pid)
+    nr_regions = struct.unpack('I', f.read(4))[0]
+    print("nr_regions: ", nr_regions)
+    for r in range(nr_regions):
+        pr_region(f)
+
+def set_argparser(parser):
+    parser.add_argument('--input', '-i', type=str, metavar='',
+            default='damon.data', help='input file name')
+
+def main(args=None):
+    if not args:
+        parser = argparse.ArgumentParser()
+        set_argparser(parser)
+        args = parser.parse_args()
+
+    file_path = args.input
+
+    if not os.path.isfile(file_path):
+        print('input file (%s) is not exist' % file_path)
+        exit(1)
+
+    with open(file_path, 'rb') as f:
+        _recfile.set_fmt_version(f)
+        start_time = None
+        while True:
+            timebin = f.read(16)
+            if len(timebin) != 16:
+                break
+            time = parse_time(timebin)
+            if not start_time:
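Independent of the listing above, the record-file header handling in `_recfile.py` can be exercised stand-alone: a record file either starts with a 16-byte `damon_recfmt_ver` marker followed by a 4-byte version, or has no header at all (treated as format version 0). A minimal sketch mirroring `set_fmt_version()`, using synthetic in-memory buffers made up for the example:

```python
import io
import struct

def parse_fmt_version(f):
    # Mirrors _recfile.set_fmt_version(): check for the 16-byte magic
    # marker; if absent, rewind and treat the file as format version 0.
    mark = f.read(16)
    if mark == b'damon_recfmt_ver':
        return struct.unpack('i', f.read(4))[0]
    f.seek(0)
    return 0

# A synthetic versioned header, and an old-style file with no header.
versioned = io.BytesIO(b'damon_recfmt_ver' + struct.pack('i', 1))
legacy = io.BytesIO(struct.pack('l', 123))  # starts right with record data
print(parse_fmt_version(versioned))  # 1
print(parse_fmt_version(legacy))     # 0
print(legacy.tell())                 # 0 (rewound, data still readable)
```

The rewind on the legacy path matters: version-0 files begin directly with record data, so the caller must re-read from offset 0.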
Re: Question: livepatch failed for new fork() task stack unreliable
On Tue, Jun 02, 2020 at 09:22:30AM +0800, Wangshaobo (bobo) wrote:
> so i think this question is related to ORC unwinder, could i ask if you
> have strategy or plan to avoid this problem ?

I suspect something like this would fix it (untested):

diff --git a/arch/x86/kernel/stacktrace.c b/arch/x86/kernel/stacktrace.c
index 6ad43fc44556..8cf95ded1410 100644
--- a/arch/x86/kernel/stacktrace.c
+++ b/arch/x86/kernel/stacktrace.c
@@ -50,7 +50,7 @@ int arch_stack_walk_reliable(stack_trace_consume_fn consume_entry,
 		if (regs) {
 			/* Success path for user tasks */
 			if (user_mode(regs))
-				return 0;
+				break;
 
 			/*
 			 * Kernel mode registers on the stack indicate an
@@ -81,10 +81,6 @@ int arch_stack_walk_reliable(stack_trace_consume_fn consume_entry,
 	if (unwind_error(&state))
 		return -EINVAL;
 
-	/* Success path for non-user tasks, i.e. kthreads and idle tasks */
-	if (!(task->flags & (PF_KTHREAD | PF_IDLE)))
-		return -EINVAL;
-
 	return 0;
 }
 
diff --git a/arch/x86/kernel/unwind_orc.c b/arch/x86/kernel/unwind_orc.c
index 7f969b2d240f..d7396431261a 100644
--- a/arch/x86/kernel/unwind_orc.c
+++ b/arch/x86/kernel/unwind_orc.c
@@ -540,7 +540,7 @@ bool unwind_next_frame(struct unwind_state *state)
 		state->sp = sp;
 		state->regs = NULL;
 		state->prev_regs = NULL;
-		state->signal = false;
+		state->signal = ((void *)state->ip == ret_from_fork);
 		break;
 
 	case ORC_TYPE_REGS:
[PATCH] ARM: dts: AM33xx-l4: add gpio-ranges
Add gpio-ranges properties to the gpio controller nodes. These
gpio-ranges were created based on "Table 9-10. CONTROL_MODULE REGISTERS"
in the "AM335x Technical Reference Manual" [0] and "Table 4-2. Pin
Attributes" in the "AM335x Sitara Processor datasheet" [1]. A csv file
with this data is available for reference [2].

[0] https://www.ti.com/lit/ug/spruh73q/spruh73q.pdf
[1] http://www.ti.com/lit/ds/symlink/am3358.pdf
[2] https://gist.github.com/pdp7/6ffaddc8867973c1c3e8612cfaf72020

Signed-off-by: Drew Fustini
---
Notes:
    There was previous discussion relevant to this patch:
    https://lore.kernel.org/linux-gpio/20200525131731.GA948395@x1/

    Here is the list I compiled which shows the mapping between a gpio
    line and the pin number, including the memory address for the pin
    control register:

    gpiochip,gpio-line,pinctrl-PIN,pinctrl-address
    0,0,82,44e10948
    0,1,83,44e1094c
    0,2,84,44e10950
    0,3,85,44e10954
    0,4,86,44e10958
    0,5,87,44e1095c
    0,6,88,44e10960
    0,7,89,44e10964
    0,8,52,44e108d0
    0,9,53,44e108d4
    0,10,54,44e108d8
    0,11,55,44e108dc
    0,12,94,44e10978
    0,13,95,44e1097c
    0,14,96,44e10980
    0,15,97,44e10984
    0,16,71,44e1091c
    0,17,72,44e10920
    0,18,135,44e10a1c
    0,19,108,44e109b0
    0,20,109,44e109b4
    0,21,73,44e10924
    0,22,8,44e10820
    0,23,9,44e10824
    0,26,10,44e10828
    0,27,11,44e1082c
    0,28,74,44e10928
    0,29,81,44e10944
    0,30,28,44e10870
    0,31,29,44e10874
    1,0,0,44e10800
    1,1,1,44e10804
    1,2,2,44e10808
    1,3,3,44e1080c
    1,4,4,44e10810
    1,5,5,44e10814
    1,6,6,44e10818
    1,7,7,44e1081c
    1,8,90,44e10968
    1,9,91,44e1096c
    1,10,92,44e10970
    1,11,93,44e10974
    1,12,12,44e10830
    1,13,13,44e10834
    1,14,14,44e10838
    1,15,15,44e1083c
    1,16,16,44e10840
    1,17,17,44e10844
    1,18,18,44e10848
    1,19,19,44e1084c
    1,20,20,44e10850
    1,21,21,44e10854
    1,22,22,44e10858
    1,23,23,44e1085c
    1,24,24,44e10860
    1,25,25,44e10864
    1,26,26,44e10868
    1,27,27,44e1086c
    1,28,30,44e10878
    1,29,31,44e1087c
    1,30,32,44e10880
    1,31,33,44e10884
    2,0,34,44e10888
    2,1,35,44e1088c
    2,2,36,44e10890
    2,3,37,44e10894
    2,4,38,44e10898
    2,5,39,44e1089c
    2,6,40,44e108a0
    2,7,41,44e108a4
    2,8,42,44e108a8
    2,9,43,44e108ac
    2,10,44,44e108b0
    2,11,45,44e108b4
    2,12,46,44e108b8
    2,13,47,44e108bc
    2,14,48,44e108c0
    2,15,49,44e108c4
    2,16,50,44e108c8
    2,17,51,44e108cc
    2,18,77,44e10934
    2,19,78,44e10938
    2,20,79,44e1093c
    2,21,80,44e10940
    2,22,56,44e108e0
    2,23,57,44e108e4
    2,24,58,44e108e8
    2,25,59,44e108ec
    2,26,60,44e108f0
    2,27,61,44e108f4
    2,28,62,44e108f8
    2,29,63,44e108fc
    2,30,64,44e10900
    2,31,65,44e10904
    3,0,66,44e10908
    3,1,67,44e1090c
    3,2,68,44e10910
    3,3,69,44e10914
    3,4,70,44e10918
    3,5,98,44e10988
    3,6,99,44e1098c
    3,9,75,44e1092c
    3,10,76,44e10930
    3,13,141,44e10a34
    3,14,100,44e10990
    3,15,101,44e10994
    3,16,102,44e10998
    3,17,103,44e1099c
    3,18,104,44e109a0
    3,19,105,44e109a4
    3,20,106,44e109a8
    3,21,107,44e109ac

    On a BeagleBone Black board (AM3358) with this patch:

    # cat /sys/kernel/debug/pinctrl/44e10800.pinmux-pinctrl-single/gpio-ranges
    GPIO ranges handled:
    0: gpio-0-31 GPIOS [0 - 7] PINS [82 - 89]
    8: gpio-0-31 GPIOS [8 - 11] PINS [52 - 55]
    12: gpio-0-31 GPIOS [12 - 15] PINS [94 - 97]
    16: gpio-0-31 GPIOS [16 - 17] PINS [71 - 72]
    18: gpio-0-31 GPIOS [18 - 18] PINS [135 - 135]
    19: gpio-0-31 GPIOS [19 - 20] PINS [108 - 109]
    21: gpio-0-31 GPIOS [21 - 21] PINS [73 - 73]
    22: gpio-0-31 GPIOS [22 - 23] PINS [8 - 9]
    26: gpio-0-31 GPIOS [26 - 27] PINS [10 - 11]
    28: gpio-0-31 GPIOS [28 - 28] PINS [74 - 74]
    29: gpio-0-31 GPIOS [29 - 29] PINS [81 - 81]
    30: gpio-0-31 GPIOS [30 - 31] PINS [28 - 29]
    0: gpio-32-63 GPIOS [32 - 39] PINS [0 - 7]
    8: gpio-32-63 GPIOS [40 - 43] PINS [90 - 93]
    12: gpio-32-63 GPIOS [44 - 59] PINS [12 - 27]
    28: gpio-32-63 GPIOS [60 - 63] PINS [30 - 33]
    0: gpio-64-95 GPIOS [64 - 81] PINS [34 - 51]
    18: gpio-64-95 GPIOS [82 - 85] PINS [77 - 80]
    22: gpio-64-95 GPIOS [86 - 95] PINS [56 - 65]
    0: gpio-96-127 GPIOS [96 - 100] PINS [66 - 70]
    5: gpio-96-127 GPIOS [101 - 102] PINS [98 - 99]
    7: gpio-96-127 GPIOS [103 - 104] PINS [75 - 76]
    13: gpio-96-127 GPIOS [109 - 109] PINS [141 - 141]
    14: gpio-96-127 GPIOS [110 - 117] PINS [100 - 107]

 arch/arm/boot/dts/am33xx-l4.dtsi | 24
 1 file changed, 24 insertions(+)

diff --git a/arch/arm/boot/dts/am33xx-l4.dtsi b/arch/arm/boot/dts/am33xx-l4.dtsi
index 5ed7f3c58c0f..340ea331e54d 100644
--- a/arch/arm/boot/dts/am33xx-l4.dtsi
+++ b/arch/arm/boot/dts/am33xx-l4.dtsi
@@ -151,6 +151,18 @@ SYSC_OMAP2_SOFTRESET |
 			gpio0: gpio@0 {
 				compatible = "ti,omap4-gpio";
+				gpio-ranges = <&am33xx_pinmux 0 82 8>,
+					      <&am33xx_pinmux 8 52 4>,
+					      <&am33xx_pinmux 12 94 4>,
+					      <&am33xx_pinmux 16 71 2>,
+					      <&am33xx_pinmux 18 135 1>,
+					      <&am33xx_pinmux 19 108 2>,
+					      <&am33xx_pinmux 21 73 1>,
+					      <&am33xx_pinmux 22 8 2>,
+
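Each gpio-ranges cell above is just a maximal run of consecutive GPIO lines mapping to consecutive pins, so the property can be derived mechanically from the gpiochip/line/pin table. A sketch of that derivation, fed a few rows of the table above:

```python
def gpio_ranges(pairs):
    """Collapse (gpio_line, pin) pairs into (first_line, first_pin, count)
    runs, i.e. the cells of a gpio-ranges device-tree property."""
    ranges = []
    for line, pin in sorted(pairs):
        if ranges and line == ranges[-1][0] + ranges[-1][2] \
                  and pin == ranges[-1][1] + ranges[-1][2]:
            # Both the line and the pin continue the current run.
            ranges[-1] = (ranges[-1][0], ranges[-1][1], ranges[-1][2] + 1)
        else:
            ranges.append((line, pin, 1))
    return ranges

# First ten rows of the gpiochip 0 table above: lines 0-7 -> pins 82-89,
# then lines 8-9 -> pins 52-53.
sample = [(0, 82), (1, 83), (2, 84), (3, 85), (4, 86),
          (5, 87), (6, 88), (7, 89), (8, 52), (9, 53)]
print(gpio_ranges(sample))  # [(0, 82, 8), (8, 52, 2)]
```

The first run matches the `<... 0 82 8>` cell in the dts hunk; holes in the line numbering (e.g. lines 24/25 missing on gpiochip 0) simply start a new run.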
Re: [PATCH] vimc: debayer: Add support for ARGB format
On 6/2/20 9:45 AM, Laurent Pinchart wrote:
> Hello,
> 
> On Tue, Jun 02, 2020 at 08:31:26AM -0300, Helen Koike wrote:
>> On 6/2/20 8:24 AM, Kieran Bingham wrote:
>>> On 02/06/2020 11:55, Helen Koike wrote:
>>>> On 6/2/20 7:52 AM, Dafna Hirschfeld wrote:
>>>>> On 01.06.20 14:16, Kaaira Gupta wrote:
>>>>>> On Fri, May 29, 2020 at 05:43:57PM +0200, Dafna Hirschfeld wrote:
>>>>>>> Hi,
>>>>>>> Thanks for the patch
>>>>>>>
>>>>>>> I don't know how real devices handle ARGB formats,
>>>>>>> I wonder if it should be the part of the debayer.
>>>>>>
>>>>>> Hi! qcam tries to support BA24 as it is one of the formats that vimc
>>>>>> lists as its supported formats with --list-formats. Shouldn't BA24 be
>>>>>> possible to capture with vimc?
>>>>>
>>>>> Hi,
>>>>> Just to clarify, when listing the supported formats of a video node,
>>>>> the node lists the formats that the video node as an independent media
>>>>> entity supports. It does not mean that the 'camera' as a whole (that
>>>>> is, the media topology graph) supports all the formats that the video
>>>>> node lists. When interacting with a video node or a subdevice node,
>>>>> one interacts only with that specific entity.
>>>>> In the case of vimc, the RGB video node as an independent entity
>>>>> supports BA24 so the format appears in the list of its supported
>>>>> formats. But since the Debayer does not support it, the format can not
>>>>> be generated by the entire vimc topology. This is not a bug.
>>>>>
>>>>> Is there a valid configuration for the vimc pipeline that produces BA24 ?

It should work when using the other capture nodes that doesn't contain
the debayer in the pipeline.

> I agree that not all pipeline configurations need to support every
> format, but we shouldn't report a format that can't be produced at all.

ok, this requires major changes in Vimc, unless we implement all the
conversions in the capture.

> This being said, and as discussed before, the de-bayering subdev should
> just produce MEDIA_BUS_FMT_RGB888_1X24, and the video node should then
> implement the RGB pixel formats. BA24 should likely be one of the
> supported formats (or maybe BX24 ?).

We can implement the conversion in the capture node, we just need to
distinguish when the pipeline generates it and when it requires
conversion, but shouldn't be a problem.

> This is also my understanding.
> 
>>>> You should have an -EPIPE error when start streaming though, it
>>>> shouldn't fail silently.
>>>
>>> Yes, we had -EPIPE, and that is what I think we were trying to resolve.
>>>
>>> How would userspace be expected to detect what formats to use ? Should
>>> the available formats on the capture node depend on the current linking
>>> of the media graph?
>>
>> This is a good question, I don't recall v4l2 API defining this.
> 
> A recent extension to VIDIOC_ENUMFMT allows enumerating pixel formats
> for a given media bus code, I think that's the way forward.

Nice, I'm not familiar with this extension, I'll take a look, thanks for
the pointer.

Thanks
Helen

>> It would be a bit hard to implement in Vimc, specially when we have
>> configfs for custom topology, since the capture would need to query all
>> the pipeline. But could be implemented.
>>
>>> Otherwise, to know what formats are supported - userspace must first
>>> 'get a list of formats' then try to 'set' the formats to know what is
>>> possible?
>>
>> At the moment yes.
>>
>>> Or should (given VIMC is quite specialist anyway) userspace 'just know'
>>> what is capable all the same?
>>>
>>> That's possibly fine, as we can simply remove support for the ARGB
>>> formats from the libcamera pipeline handler if it is never expected to
>>> be supported.
>>
>> With the configfs feature, you could build a topology with
>> sensor->capture, and ARGB would be supported.
>>
>>> But then as a further question - what formats will we expect VIMC to
>>> support? VIVID has a (very) wide range of formats.
>>>
>>> Would we ever expect VIMC to be as configurable?
>>> Or is the scope limited to what we have today?
>>
>> I know it is very limited atm, but I would like to increase the range,
>> I'm just with a limited bandwidth to work on it.
>>
>> If yes, which entity should support it, if not debayer? Should there be
>> a separate conversion entity, or should we keep the support in debayer
>> itself for efficiency issues?
>>
>>>>>>> On 28.05.20 20:57, Kaaira Gupta wrote:
>>>>>>>> Running qcam for pixelformat 0x34324142 showed that vimc debayer
>>>>>>>> does not support it. Hence, add the support for Alpha (255).
>>>>>>>
>>>>>>> I would change the commit log to:
>>>>>>>
>>>>>>> Add support for V4L2_PIX_FMT_RGB24 format in the debayer
>>>>>>> and set the alpha channel to constant 255.
>>>>>>>
>>>>>>>> Signed-off-by: Kaaira Gupta
>>>>>>>> ---
>>>>>>>>  .../media/test-drivers/vimc/vimc-debayer.c | 27 ---
>>>>>>>>  1 file changed, 18 insertions(+), 9
[PATCH v14 09/15] mm/damon: Add tracepoints
From: SeongJae Park

This commit adds a tracepoint for DAMON. It traces the monitoring
results of each region for each aggregation interval. Using this, DAMON
can easily be integrated with any tracepoint-supporting tools such as
perf.

Signed-off-by: SeongJae Park
Reviewed-by: Leonard Foerster
---
 include/trace/events/damon.h | 43
 mm/damon.c                   |  5 +
 2 files changed, 48 insertions(+)
 create mode 100644 include/trace/events/damon.h

diff --git a/include/trace/events/damon.h b/include/trace/events/damon.h
new file mode 100644
index ..22236642d366
--- /dev/null
+++ b/include/trace/events/damon.h
@@ -0,0 +1,43 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM damon
+
+#if !defined(_TRACE_DAMON_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_DAMON_H
+
+#include
+#include
+
+TRACE_EVENT(damon_aggregated,
+
+	TP_PROTO(int pid, unsigned int nr_regions,
+		unsigned long vm_start, unsigned long vm_end,
+		unsigned int nr_accesses),
+
+	TP_ARGS(pid, nr_regions, vm_start, vm_end, nr_accesses),
+
+	TP_STRUCT__entry(
+		__field(int, pid)
+		__field(unsigned int, nr_regions)
+		__field(unsigned long, vm_start)
+		__field(unsigned long, vm_end)
+		__field(unsigned int, nr_accesses)
+	),
+
+	TP_fast_assign(
+		__entry->pid = pid;
+		__entry->nr_regions = nr_regions;
+		__entry->vm_start = vm_start;
+		__entry->vm_end = vm_end;
+		__entry->nr_accesses = nr_accesses;
+	),
+
+	TP_printk("pid=%d nr_regions=%u %lu-%lu: %u", __entry->pid,
+			__entry->nr_regions, __entry->vm_start,
+			__entry->vm_end, __entry->nr_accesses)
+);
+
+#endif /* _TRACE_DAMON_H */
+
+/* This part must be outside protection */
+#include <trace/define_trace.h>

diff --git a/mm/damon.c b/mm/damon.c
index 6b0b8f21a6c6..af6f395fe06c 100644
--- a/mm/damon.c
+++ b/mm/damon.c
@@ -9,6 +9,8 @@
 
 #define pr_fmt(fmt) "damon: " fmt
 
+#define CREATE_TRACE_POINTS
+
 #include
 #include
 #include
@@ -20,6 +22,7 @@
 #include
 #include
 #include
+#include
 
 /* Minimal region size. Every damon_region is aligned by this. */
 #define MIN_REGION PAGE_SIZE
@@ -650,6 +653,8 @@ static void kdamond_reset_aggregated(struct damon_ctx *c)
 		damon_write_rbuf(c, &r->vm_end, sizeof(r->vm_end));
 		damon_write_rbuf(c, &r->nr_accesses, sizeof(r->nr_accesses));
+		trace_damon_aggregated(t->pid, nr,
+				r->vm_start, r->vm_end, r->nr_accesses);
 		r->nr_accesses = 0;
 	}
 }
-- 
2.17.1
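The `TP_printk` format above emits lines like `pid=1234 nr_regions=3 4096-8192: 7`, so a tracepoint consumer can pull the fields back out with a regex matching that exact format string. A sketch (the sample line is made up for the example):

```python
import re

# Field layout follows TP_printk("pid=%d nr_regions=%u %lu-%lu: %u", ...).
PATTERN = re.compile(
    r'pid=(?P<pid>-?\d+) nr_regions=(?P<nr_regions>\d+) '
    r'(?P<vm_start>\d+)-(?P<vm_end>\d+): (?P<nr_accesses>\d+)')

def parse_aggregated(line):
    """Return the damon_aggregated fields as ints, or None on no match."""
    m = PATTERN.search(line)
    return {k: int(v) for k, v in m.groupdict().items()} if m else None

sample = "pid=1234 nr_regions=3 4096-8192: 7"
print(parse_aggregated(sample))
# {'pid': 1234, 'nr_regions': 3, 'vm_start': 4096, 'vm_end': 8192, 'nr_accesses': 7}
```

In practice the raw trace line also carries the standard header fields (task, cpu, timestamp) prepended by the tracer; `search()` rather than `match()` tolerates that prefix.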
[PATCH v14 08/15] mm/damon: Add debugfs interface
From: SeongJae Park

This commit adds a debugfs interface for DAMON. DAMON exports four
files, ``attrs``, ``pids``, ``record``, and ``monitor_on``, under its
debugfs directory, ``/damon/``.

Attributes
----------

Users can read and write the ``sampling interval``, ``aggregation
interval``, ``regions update interval``, and min/max number of
monitoring target regions by reading from and writing to the ``attrs``
file. For example, below commands set those values to 5 ms, 100 ms,
1,000 ms, 10, and 1000, and check the result::

    # cd /damon
    # echo 5000 100000 1000000 10 1000 > attrs
    # cat attrs
    5000 100000 1000000 10 1000

Target PIDs
-----------

Users can read and write the pids of current monitoring target
processes by reading from and writing to the ``pids`` file. For
example, below commands set processes having pids 42 and 4242 as the
processes to be monitored and check the result::

    # cd /damon
    # echo 42 4242 > pids
    # cat pids
    42 4242

Note that setting the pids doesn't start the monitoring.

Record
------

DAMON supports a direct monitoring result record feature. The recorded
results are first written to a buffer and flushed to a file in batch.
Users can set the size of the buffer and the path to the result file by
reading from and writing to the ``record`` file. For example, below
commands set the buffer to be 4 KiB and the result to be saved in
``/damon.data``::

    # cd /damon
    # echo 4096 /damon.data > record
    # cat record
    4096 /damon.data

Turning On/Off
--------------

You can check the current status, and start or stop the monitoring, by
reading from and writing to the ``monitor_on`` file. Writing ``on`` to
the file starts DAMON to monitor the target processes with the
attributes. Writing ``off`` to the file stops DAMON. DAMON also stops
if every target process is terminated. Below example commands turn on,
off, and check the status of DAMON::

    # cd /damon
    # echo on > monitor_on
    # echo off > monitor_on
    # cat monitor_on
    off

Please note that you cannot write to the ``attrs`` and ``pids`` files
while the monitoring is turned on. If you write to the files while
DAMON is running, ``-EINVAL`` will be returned.

Signed-off-by: SeongJae Park
Reviewed-by: Leonard Foerster
---
 mm/damon.c | 357 -
 1 file changed, 356 insertions(+), 1 deletion(-)

diff --git a/mm/damon.c b/mm/damon.c
index 768ffd08f7ab..6b0b8f21a6c6 100644
--- a/mm/damon.c
+++ b/mm/damon.c
@@ -10,6 +10,7 @@
 #define pr_fmt(fmt) "damon: " fmt
 
 #include
+#include
 #include
 #include
 #include
@@ -50,6 +51,15 @@
 /* Get a random number in [l, r) */
 #define damon_rand(l, r) (l + prandom_u32() % (r - l))
 
+/* A monitoring context for debugfs interface users. */
+static struct damon_ctx damon_user_ctx = {
+	.sample_interval = 5 * 1000,
+	.aggr_interval = 100 * 1000,
+	.regions_update_interval = 1000 * 1000,
+	.min_nr_regions = 10,
+	.max_nr_regions = 1000,
+};
+
 /*
  * Construct a damon_region struct
  */
@@ -1133,13 +1143,358 @@ int damon_set_attrs(struct damon_ctx *ctx, unsigned long sample_int,
 	return 0;
 }
 
-static int __init damon_init(void)
+static ssize_t debugfs_monitor_on_read(struct file *file,
+		char __user *buf, size_t count, loff_t *ppos)
+{
+	struct damon_ctx *ctx = &damon_user_ctx;
+	char monitor_on_buf[5];
+	bool monitor_on;
+	int len;
+
+	monitor_on = damon_kdamond_running(ctx);
+	len = snprintf(monitor_on_buf, 5, monitor_on ? "on\n" : "off\n");
+
+	return simple_read_from_buffer(buf, count, ppos, monitor_on_buf, len);
+}
+
+static ssize_t debugfs_monitor_on_write(struct file *file,
+		const char __user *buf, size_t count, loff_t *ppos)
+{
+	struct damon_ctx *ctx = &damon_user_ctx;
+	ssize_t ret;
+	char cmdbuf[5];
+	int err;
+
+	ret = simple_write_to_buffer(cmdbuf, 5, ppos, buf, count);
+	if (ret < 0)
+		return ret;
+
+	if (sscanf(cmdbuf, "%s", cmdbuf) != 1)
+		return -EINVAL;
+	if (!strncmp(cmdbuf, "on", 5))
+		err = damon_start(ctx);
+	else if (!strncmp(cmdbuf, "off", 5))
+		err = damon_stop(ctx);
+	else
+		return -EINVAL;
+
+	if (err)
+		ret = err;
+	return ret;
+}
+
+static ssize_t damon_sprint_pids(struct damon_ctx *ctx, char *buf, ssize_t len)
+{
+	struct damon_task *t;
+	int written = 0;
+	int rc;
+
+	damon_for_each_task(t, ctx) {
+		rc = snprintf(&buf[written], len - written, "%d ", t->pid);
+		if (!rc)
+			return -ENOMEM;
+		written += rc;
+	}
+	if (written)
+		written -= 1;
+	written += snprintf(&buf[written], len - written, "\n");
+	return written;
+}
+
+static ssize_t debugfs_pids_read(struct file *file,
+		char __user *buf, size_t count, loff_t
[PATCH -next] PCI: uniphier: Fix Kconfig warning
WARNING: unmet direct dependencies detected for PCIE_DW_EP
  Depends on [n]: PCI [=y] && PCI_ENDPOINT [=n]
  Selected by [y]:
  - PCIE_UNIPHIER_EP [=y] && PCI [=y] && (ARCH_UNIPHIER || COMPILE_TEST [=y]) && OF [=y] && HAS_IOMEM [=y]

Add missing dependency to fix this.

Fixes: 006564dee825 ("PCI: uniphier: Add Socionext UniPhier Pro5 PCIe endpoint controller driver")
Signed-off-by: YueHaibing
---
 drivers/pci/controller/dwc/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/pci/controller/dwc/Kconfig b/drivers/pci/controller/dwc/Kconfig
index 43a29f7a4501..044a3761c44f 100644
--- a/drivers/pci/controller/dwc/Kconfig
+++ b/drivers/pci/controller/dwc/Kconfig
@@ -293,6 +293,7 @@ config PCIE_UNIPHIER_EP
 	bool "Socionext UniPhier PCIe endpoint controllers"
 	depends on ARCH_UNIPHIER || COMPILE_TEST
 	depends on OF && HAS_IOMEM
+	depends on PCI_ENDPOINT
 	select PCIE_DW_EP
 	help
 	  Say Y here if you want PCIe endpoint controller support on
-- 
2.17.1
[PATCH 2/2] lib/Kconfig.debug: Fix typo in the help text of CONFIG_PANIC_TIMEOUT
There exists duplicated "the" in the help text of CONFIG_PANIC_TIMEOUT, remove it.

Signed-off-by: Tiezhu Yang
---
 lib/Kconfig.debug | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index b3b05ad..d33627a 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -824,7 +824,7 @@ config PANIC_TIMEOUT
 	int "panic timeout"
 	default 0
 	help
-	  Set the timeout value (in seconds) until a reboot occurs when the
+	  Set the timeout value (in seconds) until a reboot occurs when
 	  the kernel panics. If n = 0, then we wait forever. A timeout
 	  value n > 0 will wait n seconds before rebooting, while a timeout
 	  value n < 0 will reboot immediately.
-- 
2.1.0
[PATCH 1/2] kernel/panic.c: Make oops_may_print() return bool
The return value of oops_may_print() is true or false, so change its type to reflect that.

Signed-off-by: Tiezhu Yang
---
 include/linux/kernel.h | 2 +-
 kernel/panic.c         | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/linux/kernel.h b/include/linux/kernel.h
index 9b7a8d7..69c7fa4 100644
--- a/include/linux/kernel.h
+++ b/include/linux/kernel.h
@@ -322,7 +322,7 @@ void nmi_panic(struct pt_regs *regs, const char *msg);
 extern void oops_enter(void);
 extern void oops_exit(void);
 void print_oops_end_marker(void);
-extern int oops_may_print(void);
+extern bool oops_may_print(void);
 void do_exit(long error_code) __noreturn;
 void complete_and_exit(struct completion *, long) __noreturn;

diff --git a/kernel/panic.c b/kernel/panic.c
index b69ee9e..064d80f 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -490,7 +490,7 @@ static void do_oops_enter_exit(void)
  * Return true if the calling CPU is allowed to print oops-related info.
  * This is a bit racy..
  */
-int oops_may_print(void)
+bool oops_may_print(void)
 {
 	return pause_on_oops_flag == 0;
 }
-- 
2.1.0
[PATCH v2] soc: samsung: Add simple voltage coupler for Exynos5800
Add a simple custom voltage regulator coupler for Exynos5800 SoCs, which
require coupling between the "vdd_arm" and "vdd_int" regulators. This
coupler ensures that the voltage balancing for the coupled regulators is
done only when clients for each regulator apply their constraints, so
the voltage values don't go beyond the bootloader-selected operation
point during the boot process. This also ensures proper voltage
balancing if any of the client drivers is missing.

Signed-off-by: Marek Szyprowski
---
v2:
- removed dependency on the regulator names as pointed by krzk, now it
  works with all coupled regulators. So far the coupling between the
  regulators on Exynos5800 based boards is defined only between
  "vdd_arm" and "vdd_int" supplies.
---
 arch/arm/mach-exynos/Kconfig                  |  1 +
 drivers/soc/samsung/Kconfig                   |  3 +
 drivers/soc/samsung/Makefile                  |  1 +
 .../soc/samsung/exynos-regulator-coupler.c    | 56 +++
 4 files changed, 61 insertions(+)
 create mode 100644 drivers/soc/samsung/exynos-regulator-coupler.c

diff --git a/arch/arm/mach-exynos/Kconfig b/arch/arm/mach-exynos/Kconfig
index 76838255b5fa..f185cd3d4c62 100644
--- a/arch/arm/mach-exynos/Kconfig
+++ b/arch/arm/mach-exynos/Kconfig
@@ -118,6 +118,7 @@ config SOC_EXYNOS5800
 	bool "Samsung EXYNOS5800"
 	default y
 	depends on SOC_EXYNOS5420
+	select EXYNOS_REGULATOR_COUPLER
 
 config EXYNOS_MCPM
 	bool

diff --git a/drivers/soc/samsung/Kconfig b/drivers/soc/samsung/Kconfig
index c7a2003687c7..264185664594 100644
--- a/drivers/soc/samsung/Kconfig
+++ b/drivers/soc/samsung/Kconfig
@@ -37,4 +37,7 @@ config EXYNOS_PM_DOMAINS
 	bool "Exynos PM domains" if COMPILE_TEST
 	depends on PM_GENERIC_DOMAINS || COMPILE_TEST
 
+config EXYNOS_REGULATOR_COUPLER
+	bool "Exynos SoC Regulator Coupler" if COMPILE_TEST
+	depends on ARCH_EXYNOS || COMPILE_TEST
 endif

diff --git a/drivers/soc/samsung/Makefile b/drivers/soc/samsung/Makefile
index edd1d6ea064d..ecc3a32f6406 100644
--- a/drivers/soc/samsung/Makefile
+++ b/drivers/soc/samsung/Makefile
@@ -9,3 +9,4 @@ obj-$(CONFIG_EXYNOS_PMU)		+= exynos-pmu.o
 obj-$(CONFIG_EXYNOS_PMU_ARM_DRIVERS)	+= exynos3250-pmu.o exynos4-pmu.o \
 					exynos5250-pmu.o exynos5420-pmu.o
 obj-$(CONFIG_EXYNOS_PM_DOMAINS)		+= pm_domains.o
+obj-$(CONFIG_EXYNOS_REGULATOR_COUPLER)	+= exynos-regulator-coupler.o

diff --git a/drivers/soc/samsung/exynos-regulator-coupler.c b/drivers/soc/samsung/exynos-regulator-coupler.c
new file mode 100644
index ..370a0ce4de3a
--- /dev/null
+++ b/drivers/soc/samsung/exynos-regulator-coupler.c
@@ -0,0 +1,56 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2020 Samsung Electronics Co., Ltd.
+ *		http://www.samsung.com/
+ * Author: Marek Szyprowski
+ *
+ * Simple Samsung Exynos SoC voltage coupler. Ensures that all
+ * clients set their constraints before balancing the voltages.
+ */
+
+#include
+#include
+#include
+#include
+#include
+
+static int exynos_coupler_balance_voltage(struct regulator_coupler *coupler,
+					  struct regulator_dev *rdev,
+					  suspend_state_t state)
+{
+	struct coupling_desc *c_desc = &rdev->coupling_desc;
+	int ret, cons_uV = 0, cons_max_uV = INT_MAX;
+	bool skip_coupled = false;
+
+	/* get coupled regulator constraints */
+	ret = regulator_check_consumers(c_desc->coupled_rdevs[1], &cons_uV,
+					&cons_max_uV, state);
+	if (ret < 0)
+		return ret;
+
+	/* skip adjusting coupled regulator if it has no constraints set yet */
+	if (cons_uV == 0)
+		skip_coupled = true;
+
+	return regulator_do_balance_voltage(rdev, state, skip_coupled);
+}
+
+static int exynos_coupler_attach(struct regulator_coupler *coupler,
+				 struct regulator_dev *rdev)
+{
+	return 0;
+}
+
+static struct regulator_coupler exynos_coupler = {
+	.attach_regulator = exynos_coupler_attach,
+	.balance_voltage  = exynos_coupler_balance_voltage,
+};
+
+static int __init exynos_coupler_init(void)
+{
+	if (!of_machine_is_compatible("samsung,exynos5800"))
+		return 0;
+
+	return regulator_coupler_register(&exynos_coupler);
+}
+arch_initcall(exynos_coupler_init);
-- 
2.17.1
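The core decision in `exynos_coupler_balance_voltage()` — leave a coupled regulator alone until at least one of its consumers has voted — reduces to a small predicate. An illustrative user-space model (the regulator names and voltages are made up; this is not regulator-framework code):

```python
def coupled_balance_targets(regulators):
    """Given coupled regulators as {'name': ..., 'consumer_uV': [...]},
    return the names that may be re-balanced now. A partner whose
    consumers have not applied any constraint yet keeps its
    bootloader-chosen operating point (the skip_coupled case above)."""
    return [r['name'] for r in regulators if r['consumer_uV']]

pair = [
    {'name': 'vdd_arm', 'consumer_uV': [900000]},  # cpufreq set a constraint
    {'name': 'vdd_int', 'consumer_uV': []},        # devfreq not probed yet
]
print(coupled_balance_targets(pair))  # ['vdd_arm']
```

Once the second client driver probes and files its constraint, both names are returned and normal coupled balancing proceeds — which is exactly why the coupler also behaves correctly when a client driver is missing altogether.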
[PATCH v14 07/15] mm/damon: Implement access pattern recording
From: SeongJae Park This commit implements the recording feature of DAMON. If this feature is enabled, DAMON writes the monitored access patterns in its binary format into a file specified by the user. This could already be implemented by each user using the callbacks. However, as the recording is expected to be widely used, this commit implements the feature in DAMON itself, for more convenience and efficiency. Signed-off-by: SeongJae Park Reviewed-by: Leonard Foerster --- include/linux/damon.h | 15 + mm/damon.c | 131 +- 2 files changed, 143 insertions(+), 3 deletions(-) diff --git a/include/linux/damon.h b/include/linux/damon.h index 12536d9d2f74..c4796a10cb1a 100644 --- a/include/linux/damon.h +++ b/include/linux/damon.h @@ -66,6 +66,14 @@ struct damon_task { * in case of virtual memory monitoring) and applies the changes for each * @regions_update_interval. All time intervals are in micro-seconds. * + * @rbuf: In-memory buffer for monitoring result recording. + * @rbuf_len: The length of @rbuf. + * @rbuf_offset: The offset for next write to @rbuf. + * @rfile_path: Record file path. + * + * If @rbuf, @rbuf_len, and @rfile_path are set, the monitored results are + * automatically stored in @rfile_path file. + * * @kdamond: Kernel thread who does the monitoring. * @kdamond_stop: Notifies whether kdamond should stop. * @kdamond_lock: Mutex for the synchronizations with @kdamond.
@@ -100,6 +108,11 @@ struct damon_ctx { struct timespec64 last_aggregation; struct timespec64 last_regions_update; + unsigned char *rbuf; + unsigned int rbuf_len; + unsigned int rbuf_offset; + char *rfile_path; + struct task_struct *kdamond; bool kdamond_stop; struct mutex kdamond_lock; @@ -115,6 +128,8 @@ int damon_set_pids(struct damon_ctx *ctx, int *pids, ssize_t nr_pids); int damon_set_attrs(struct damon_ctx *ctx, unsigned long sample_int, unsigned long aggr_int, unsigned long regions_update_int, unsigned long min_nr_reg, unsigned long max_nr_reg); +int damon_set_recording(struct damon_ctx *ctx, + unsigned int rbuf_len, char *rfile_path); int damon_start(struct damon_ctx *ctx); int damon_stop(struct damon_ctx *ctx); diff --git a/mm/damon.c b/mm/damon.c index 42ce9df47f8b..768ffd08f7ab 100644 --- a/mm/damon.c +++ b/mm/damon.c @@ -44,6 +44,9 @@ #define damon_for_each_task_safe(t, next, ctx) \ list_for_each_entry_safe(t, next, &(ctx)->tasks_list, list) +#define MAX_RECORD_BUFFER_LEN (4 * 1024 * 1024) +#define MAX_RFILE_PATH_LEN 256 + /* Get a random number in [l, r) */ #define damon_rand(l, r) (l + prandom_u32() % (r - l)) @@ -565,16 +568,80 @@ static bool kdamond_aggregate_interval_passed(struct damon_ctx *ctx) } /* - * Reset the aggregated monitoring results + * Flush the content in the result buffer to the result file + */ +static void damon_flush_rbuffer(struct damon_ctx *ctx) +{ + ssize_t sz; + loff_t pos = 0; + struct file *rfile; + + rfile = filp_open(ctx->rfile_path, O_CREAT | O_RDWR | O_APPEND, 0644); + if (IS_ERR(rfile)) { + pr_err("Cannot open the result file %s\n", + ctx->rfile_path); + return; + } + + while (ctx->rbuf_offset) { + sz = kernel_write(rfile, ctx->rbuf, ctx->rbuf_offset, &pos); + if (sz < 0) + break; + ctx->rbuf_offset -= sz; + } + filp_close(rfile, NULL); +} + +/* + * Write a data into the result buffer + */ +static void damon_write_rbuf(struct damon_ctx *ctx, void *data, ssize_t size) +{ + if (!ctx->rbuf_len || !ctx->rbuf) + return; + if (ctx->rbuf_offset + size > ctx->rbuf_len) + damon_flush_rbuffer(ctx); + + memcpy(&ctx->rbuf[ctx->rbuf_offset], data, size); + ctx->rbuf_offset += size; +} + +/* + * Flush the aggregated monitoring results to the result buffer + * + * Stores current tracking results to the result buffer and reset 'nr_accesses' + * of each region. The format for the result buffer is as below: + * + * <time> <number of tasks> + * task info: <pid> <number of regions> + * region info: <start address> <end address> <nr_accesses> */ static void kdamond_reset_aggregated(struct damon_ctx *c) { struct damon_task *t; - struct damon_region *r; + struct timespec64 now; + unsigned int nr; + + ktime_get_coarse_ts64(&now); + + damon_write_rbuf(c, &now, sizeof(struct timespec64)); + nr = nr_damon_tasks(c); + damon_write_rbuf(c, &nr, sizeof(nr)); damon_for_each_task(t, c) { - damon_for_each_region(r, t) + struct damon_region *r; + + damon_write_rbuf(c, &t->pid, sizeof(t->pid)); + nr = nr_damon_regions(t); + damon_write_rbuf(c, &nr, sizeof(nr)); + damon_for_each_region(r, t)
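The flush-before-overflow logic of damon_write_rbuf()/damon_flush_rbuffer() can be modeled in a few lines of standalone C. This is a stand-in sketch, not the kernel code: the flush target here is just a counter, where the patch writes the buffered bytes to the rfile_path file:

```c
#include <stddef.h>
#include <string.h>

/* Stand-in model of the record buffer: writes accumulate in memory and
 * the buffer is flushed before a write that would overflow it. */
#define RBUF_LEN 16

struct rbuf {
	unsigned char buf[RBUF_LEN];
	unsigned int offset;
	unsigned int flushes;	/* stands in for kernel_write() calls */
};

static void rbuf_flush(struct rbuf *r)
{
	r->flushes++;		/* the kernel_write() loop drains the buffer */
	r->offset = 0;
}

/* Mirrors damon_write_rbuf(): flush first if this write would not fit. */
void rbuf_write(struct rbuf *r, const void *data, size_t size)
{
	if (r->offset + size > RBUF_LEN)
		rbuf_flush(r);
	memcpy(&r->buf[r->offset], data, size);
	r->offset += (unsigned int)size;
}
```

Five 4-byte writes into a 16-byte buffer trigger exactly one flush, leaving 4 bytes buffered, which is the amortization the in-memory buffer buys over writing each record directly to the file.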
[PATCH] iommu/iova: Don't BUG on invalid PFNs
Unlike the other instances which represent a complete loss of consistency within the rcache mechanism itself, or a fundamental and obvious misconfiguration by an IOMMU driver, the BUG_ON() in iova_magazine_free_pfns() can be provoked at more or less any time in a "spooky action-at-a-distance" manner by any old device driver passing nonsense to dma_unmap_*() which then propagates through to queue_iova(). Not only is this well outside the IOVA layer's control, it's also nowhere near fatal enough to justify panicking anyway - all that really achieves is to make debugging the offending driver more difficult. Let's simply WARN and otherwise ignore bogus PFNs. Reported-by: Prakash Gupta Signed-off-by: Robin Murphy --- drivers/iommu/iova.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c index 0e6a9536eca6..612cbf668adf 100644 --- a/drivers/iommu/iova.c +++ b/drivers/iommu/iova.c @@ -811,7 +811,9 @@ iova_magazine_free_pfns(struct iova_magazine *mag, struct iova_domain *iovad) for (i = 0 ; i < mag->size; ++i) { struct iova *iova = private_find_iova(iovad, mag->pfns[i]); - BUG_ON(!iova); + if (WARN_ON(!iova)) + continue; + private_free_iova(iovad, iova); } -- 2.23.0.dirty
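The shape of the fix, warn on a failed lookup and keep freeing the remaining magazine entries instead of panicking, can be modeled in standalone C. All names below are illustrative stand-ins; lookup_ok() fakes private_find_iova() by declaring odd PFNs "bogus":

```c
#include <stdio.h>

/* Fake lookup: even PFNs "exist", odd ones model nonsense a buggy
 * driver handed to dma_unmap_*(). */
static int lookup_ok(unsigned long pfn)
{
	return (pfn & 1) == 0;
}

/* Returns the number of entries actually freed. */
int magazine_free_pfns(const unsigned long *pfns, int n)
{
	int freed = 0;

	for (int i = 0; i < n; i++) {
		if (!lookup_ok(pfns[i])) {
			/* was BUG_ON(!iova); now warn and move on */
			fprintf(stderr, "WARN: bogus PFN %lu\n", pfns[i]);
			continue;
		}
		freed++;	/* private_free_iova() would run here */
	}
	return freed;
}
```

The point of the patch is exactly this control-flow change: the offending driver still gets flagged, but the machine survives to be debugged.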
[PATCH v14 05/15] mm/damon: Apply dynamic memory mapping changes
From: SeongJae Park Only parts of the virtual address space of the processes are mapped to physical memory and accessed. Thus, tracking the unmapped address regions is just wasteful. However, tracking every memory mapping change might incur an overhead. For this reason, DAMON applies the dynamic memory mapping changes to the tracking regions only once per user-specified time interval (``regions update interval``). Signed-off-by: SeongJae Park Reviewed-by: Leonard Foerster --- include/linux/damon.h | 13 -- mm/damon.c | 101 +- 2 files changed, 108 insertions(+), 6 deletions(-) diff --git a/include/linux/damon.h b/include/linux/damon.h index babdba6b5c47..5afcb2bb7f77 100644 --- a/include/linux/damon.h +++ b/include/linux/damon.h @@ -55,13 +55,16 @@ struct damon_task { * * @sample_interval: The time between access samplings. * @aggr_interval: The time between monitor results aggregations. + * @regions_update_interval: The time between monitor regions updates. * @min_nr_regions: The number of initial monitoring regions. * @max_nr_regions: The maximum number of monitoring regions. * * For each @sample_interval, DAMON checks whether each region is accessed or * not. It aggregates and keeps the access information (number of accesses to - * each region) for @aggr_interval time. All time intervals are in - * micro-seconds. + * each region) for @aggr_interval time. DAMON also checks whether the target + * memory regions need update (e.g., by ``mmap()`` calls from the application, + * in case of virtual memory monitoring) and applies the changes for each + * @regions_update_interval. All time intervals are in micro-seconds. * * @kdamond: Kernel thread who does the monitoring. * @kdamond_stop: Notifies whether kdamond should stop.
@@ -81,10 +84,12 @@ struct damon_task { struct damon_ctx { unsigned long sample_interval; unsigned long aggr_interval; + unsigned long regions_update_interval; unsigned long min_nr_regions; unsigned long max_nr_regions; struct timespec64 last_aggregation; + struct timespec64 last_regions_update; struct task_struct *kdamond; bool kdamond_stop; @@ -94,8 +99,8 @@ struct damon_ctx { }; int damon_set_pids(struct damon_ctx *ctx, int *pids, ssize_t nr_pids); -int damon_set_attrs(struct damon_ctx *ctx, - unsigned long sample_int, unsigned long aggr_int, +int damon_set_attrs(struct damon_ctx *ctx, unsigned long sample_int, + unsigned long aggr_int, unsigned long regions_update_int, unsigned long min_nr_reg, unsigned long max_nr_reg); int damon_start(struct damon_ctx *ctx); int damon_stop(struct damon_ctx *ctx); diff --git a/mm/damon.c b/mm/damon.c index 6b8f038331f5..de5ed38ac05c 100644 --- a/mm/damon.c +++ b/mm/damon.c @@ -713,6 +713,98 @@ static void kdamond_split_regions(struct damon_ctx *ctx) last_nr_regions = nr_regions; } +/* + * Check whether it is time to check and apply the dynamic mmap changes + * + * Returns true if it is. + */ +static bool kdamond_need_update_regions(struct damon_ctx *ctx) +{ + return damon_check_reset_time_interval(&ctx->last_regions_update, + ctx->regions_update_interval); +} + +/* + * Check whether regions are intersecting + * + * Note that this function checks 'struct damon_region' and 'struct region'. + * + * Returns true if it is.
+ */ +static bool damon_intersect(struct damon_region *r, struct region *re) +{ + return !(r->vm_end <= re->start || re->end <= r->vm_start); +} + +/* + * Update damon regions for the three big regions of the given task + * + * t the given task + * bregions the three big regions of the task + */ +static void damon_apply_three_regions(struct damon_ctx *ctx, + struct damon_task *t, struct region bregions[3]) +{ + struct damon_region *r, *next; + unsigned int i = 0; + + /* Remove regions which are not in the three big regions now */ + damon_for_each_region_safe(r, next, t) { + for (i = 0; i < 3; i++) { + if (damon_intersect(r, &bregions[i])) + break; + } + if (i == 3) + damon_destroy_region(r); + } + + /* Adjust intersecting regions to fit with the three big regions */ + for (i = 0; i < 3; i++) { + struct damon_region *first = NULL, *last; + struct damon_region *newr; + struct region *br; + + br = &bregions[i]; + /* Get the first and last regions which intersects with br */ + damon_for_each_region(r, t) { + if (damon_intersect(r, br)) { + if (!first) + first = r; +
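The overlap test at the heart of damon_intersect() is the standard half-open interval check, and it is worth seeing in isolation. A minimal standalone sketch (plain unsigned longs instead of the kernel's region structs):

```c
/* Two half-open regions [a_start, a_end) and [b_start, b_end)
 * intersect unless one ends at or before the point where the other
 * begins - the same expression damon_intersect() uses. */
int regions_intersect(unsigned long a_start, unsigned long a_end,
		      unsigned long b_start, unsigned long b_end)
{
	return !(a_end <= b_start || b_end <= a_start);
}
```

Note the `<=`: regions that merely touch (one ending exactly where the other starts) do not intersect, which is what lets the removal loop above discard tracking regions that fall entirely between the three big mapped areas.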
[PATCH v14 06/15] mm/damon: Implement callbacks
From: SeongJae Park This commit implements callbacks for DAMON. Using this, DAMON users can install their callbacks for each step of the access monitoring so that they can do something interesting with the monitored access patterns online. For example, callbacks can report the monitored patterns to users or do some access pattern based memory management such as proactive reclamations or access pattern based THP promotion/demotion decision making. Signed-off-by: SeongJae Park Reviewed-by: Leonard Foerster --- include/linux/damon.h | 13 + mm/damon.c | 4 2 files changed, 17 insertions(+) diff --git a/include/linux/damon.h b/include/linux/damon.h index 5afcb2bb7f77..12536d9d2f74 100644 --- a/include/linux/damon.h +++ b/include/linux/damon.h @@ -80,6 +80,15 @@ struct damon_task { * @kdamond_lock. Accesses to other fields must be protected by themselves. * * @tasks_list:Head of monitoring target tasks (&damon_task) list. + * + * @sample_cb: Called for each sampling interval. + * @aggregate_cb: Called for each aggregation interval. + * + * @sample_cb and @aggregate_cb are called from @kdamond for each of the + * sampling intervals and aggregation intervals, respectively. Therefore, + * users can safely access to the monitoring results via @tasks_list without + * additional protection of @kdamond_lock. For the reason, users are + * recommended to use these callback for the accesses to the results.
*/ struct damon_ctx { unsigned long sample_interval; @@ -96,6 +105,10 @@ struct damon_ctx { struct mutex kdamond_lock; struct list_head tasks_list;/* 'damon_task' objects */ + + /* callbacks */ + void (*sample_cb)(struct damon_ctx *context); + void (*aggregate_cb)(struct damon_ctx *context); }; int damon_set_pids(struct damon_ctx *ctx, int *pids, ssize_t nr_pids); diff --git a/mm/damon.c b/mm/damon.c index de5ed38ac05c..42ce9df47f8b 100644 --- a/mm/damon.c +++ b/mm/damon.c @@ -850,6 +850,8 @@ static int kdamond_fn(void *data) kdamond_init_regions(ctx); while (!kdamond_need_stop(ctx)) { kdamond_prepare_access_checks(ctx); + if (ctx->sample_cb) + ctx->sample_cb(ctx); usleep_range(ctx->sample_interval, ctx->sample_interval + 1); @@ -857,6 +859,8 @@ static int kdamond_fn(void *data) if (kdamond_aggregate_interval_passed(ctx)) { kdamond_merge_regions(ctx, max_nr_accesses / 10); + if (ctx->aggregate_cb) + ctx->aggregate_cb(ctx); kdamond_reset_aggregated(ctx); kdamond_split_regions(ctx); } -- 2.17.1
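The hook pattern the patch adds, optional function pointers on the context that the worker loop invokes when set, is easy to model in standalone C. Names below mirror the patch but everything is an illustrative stub, not DAMON code:

```c
#include <stddef.h>

struct ctx {
	int samples;
	int aggregates;
	void (*sample_cb)(struct ctx *c);
	void (*aggregate_cb)(struct ctx *c);
};

/* One turn of the kdamond-style loop: hooks fire only if installed. */
void run_iteration(struct ctx *c, int aggregate_now)
{
	c->samples++;
	if (c->sample_cb)		/* NULL means "no callback" */
		c->sample_cb(c);
	if (aggregate_now) {
		c->aggregates++;
		if (c->aggregate_cb)
			c->aggregate_cb(c);
	}
}

static int sample_calls;
static void count_sample(struct ctx *c) { (void)c; sample_calls++; }

/* Two iterations: sample_cb fires twice, the NULL aggregate_cb is
 * safely skipped on the aggregating iteration. */
int demo_callbacks(void)
{
	struct ctx c = { 0, 0, count_sample, NULL };
	run_iteration(&c, 0);
	run_iteration(&c, 1);
	return sample_calls;
}
```

Because the hooks run from the worker itself, a real callback can walk the result lists without extra locking, which is exactly the guarantee the kernel-doc above promises.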
[PATCH RFC 05/13] vhost/net: pass net specific struct pointer
In preparation for further cleanup, pass net specific pointer to ubuf callbacks so we can move net specific fields out to net structures. Signed-off-by: Michael S. Tsirkin --- drivers/vhost/net.c | 14 +++--- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c index 2927f02cc7e1..749a9cf51a59 100644 --- a/drivers/vhost/net.c +++ b/drivers/vhost/net.c @@ -94,7 +94,7 @@ struct vhost_net_ubuf_ref { */ atomic_t refcount; wait_queue_head_t wait; - struct vhost_virtqueue *vq; + struct vhost_net_virtqueue *nvq; }; #define VHOST_NET_BATCH 64 @@ -231,7 +231,7 @@ static void vhost_net_enable_zcopy(int vq) } static struct vhost_net_ubuf_ref * -vhost_net_ubuf_alloc(struct vhost_virtqueue *vq, bool zcopy) +vhost_net_ubuf_alloc(struct vhost_net_virtqueue *nvq, bool zcopy) { struct vhost_net_ubuf_ref *ubufs; /* No zero copy backend? Nothing to count. */ @@ -242,7 +242,7 @@ vhost_net_ubuf_alloc(struct vhost_virtqueue *vq, bool zcopy) return ERR_PTR(-ENOMEM); atomic_set(&ubufs->refcount, 1); init_waitqueue_head(&ubufs->wait); - ubufs->vq = vq; + ubufs->nvq = nvq; return ubufs; } @@ -384,13 +384,13 @@ static void vhost_zerocopy_signal_used(struct vhost_net *net, static void vhost_zerocopy_callback(struct ubuf_info *ubuf, bool success) { struct vhost_net_ubuf_ref *ubufs = ubuf->ctx; - struct vhost_virtqueue *vq = ubufs->vq; + struct vhost_net_virtqueue *nvq = ubufs->nvq; int cnt; rcu_read_lock_bh(); /* set len to mark this desc buffers done DMA */ - vq->heads[ubuf->desc].len = success ? + nvq->vq.heads[ubuf->desc].in_len = success ? VHOST_DMA_DONE_LEN : VHOST_DMA_FAILED_LEN; cnt = vhost_net_ubuf_put(ubufs); @@ -402,7 +402,7 @@ static void vhost_zerocopy_callback(struct ubuf_info *ubuf, bool success) * less than 10% of times).
*/ if (cnt <= 1 || !(cnt % 16)) - vhost_poll_queue(&vq->poll); + vhost_poll_queue(&nvq->vq.poll); rcu_read_unlock_bh(); } @@ -1525,7 +1525,7 @@ static long vhost_net_set_backend(struct vhost_net *n, unsigned index, int fd) /* start polling new socket */ oldsock = vhost_vq_get_backend(vq); if (sock != oldsock) { - ubufs = vhost_net_ubuf_alloc(vq, + ubufs = vhost_net_ubuf_alloc(nvq, sock && vhost_sock_zcopy(sock)); if (IS_ERR(ubufs)) { r = PTR_ERR(ubufs); -- MST
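The motivation for storing the outer net-specific struct rather than the embedded generic one is worth making concrete: with only the inner pointer, the outer struct must be recovered via a container_of() computation. A standalone sketch with mock types (none of these are the real vhost definitions):

```c
#include <stddef.h>

struct mock_virtqueue {
	int busyloop_timeout;
};

struct mock_net_virtqueue {
	struct mock_virtqueue vq;	/* generic part, embedded */
	int done_idx;			/* net-specific state */
};

/* The kernel's container_of idiom: step back from a member pointer to
 * the struct that embeds it. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* What a callback holding only the inner pointer would have to do to
 * reach the net-specific fields: */
struct mock_net_virtqueue *to_nvq(struct mock_virtqueue *vq)
{
	return container_of(vq, struct mock_net_virtqueue, vq);
}
```

By storing the nvq pointer directly in the ubuf ref, the callback reaches net-specific fields with a plain member access, and those fields can later migrate out of the generic struct, which is the cleanup the commit message sets up.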
[PATCH RFC 09/13] vhost/net: avoid iov length math
Now that the API exposes buffer length, we no longer need to scan IOVs to figure it out. Signed-off-by: Michael S. Tsirkin --- drivers/vhost/net.c | 8 +++- 1 file changed, 3 insertions(+), 5 deletions(-) diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c index 47af3d1ce3dd..36843058182b 100644 --- a/drivers/vhost/net.c +++ b/drivers/vhost/net.c @@ -607,11 +607,9 @@ static bool vhost_exceeds_maxpend(struct vhost_net *net) } static size_t init_iov_iter(struct vhost_virtqueue *vq, struct iov_iter *iter, - size_t hdr_size, int out) + size_t len, size_t hdr_size, int out) { /* Skip header. TODO: support TSO. */ - size_t len = iov_length(vq->iov, out); - iov_iter_init(iter, WRITE, vq->iov, out, len); iov_iter_advance(iter, hdr_size); @@ -640,7 +638,7 @@ static int get_tx_bufs(struct vhost_net *net, } /* Sanity check */ - *len = init_iov_iter(vq, &msg->msg_iter, nvq->vhost_hlen, *out); + *len = init_iov_iter(vq, &msg->msg_iter, buf->out_len, nvq->vhost_hlen, *out); if (*len == 0) { vq_err(vq, "Unexpected header len for TX: %zd expected %zd\n", *len, nvq->vhost_hlen); @@ -1080,7 +1078,7 @@ static int get_rx_bufs(struct vhost_virtqueue *vq, nlogs += *log_num; log += *log_num; } - len = iov_length(vq->iov + seg, in); + len = bufs[bufcount].in_len; datalen -= len; ++bufcount; seg += in; -- MST
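What the patch deletes is a linear scan. A stand-in sketch of iov_length() over a mock scatter list shows the cost being avoided (the struct is a local mock-up, not the kernel's struct iovec):

```c
#include <stddef.h>

struct iovec_s {
	void *iov_base;
	size_t iov_len;
};

/* What iov_length() does: walk every segment and sum the lengths. The
 * buf API already recorded out_len/in_len while parsing descriptors,
 * so this O(n) pass becomes a single field read. */
size_t iov_length_scan(const struct iovec_s *iov, unsigned int n)
{
	size_t total = 0;

	for (unsigned int i = 0; i < n; i++)
		total += iov[i].iov_len;
	return total;
}
```

The design point: once descriptor parsing has to touch every segment anyway, recording the total there is free, and every later consumer of the length gets it in O(1).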
[PATCH RFC 10/13] vhost/test: convert to the buf API
Signed-off-by: Michael S. Tsirkin --- drivers/vhost/test.c | 20 +++- 1 file changed, 11 insertions(+), 9 deletions(-) diff --git a/drivers/vhost/test.c b/drivers/vhost/test.c index 02806d6f84ef..251fd2bf74a3 100644 --- a/drivers/vhost/test.c +++ b/drivers/vhost/test.c @@ -44,9 +44,10 @@ static void handle_vq(struct vhost_test *n) { struct vhost_virtqueue *vq = &n->vqs[VHOST_TEST_VQ]; unsigned out, in; - int head; + int ret; size_t len, total_len = 0; void *private; + struct vhost_buf buf; mutex_lock(&vq->mutex); private = vhost_vq_get_backend(vq); @@ -58,15 +59,15 @@ static void handle_vq(struct vhost_test *n) vhost_disable_notify(&n->dev, vq); for (;;) { - head = vhost_get_vq_desc(vq, vq->iov, -ARRAY_SIZE(vq->iov), -&out, &in, -NULL, NULL); + ret = vhost_get_avail_buf(vq, vq->iov, &buf, + ARRAY_SIZE(vq->iov), + &out, &in, + NULL, NULL); /* On error, stop handling until the next kick. */ - if (unlikely(head < 0)) + if (unlikely(ret < 0)) break; /* Nothing new? Wait for eventfd to tell us they refilled. */ - if (head == vq->num) { + if (!ret) { if (unlikely(vhost_enable_notify(&n->dev, vq))) { vhost_disable_notify(&n->dev, vq); continue; } break; } if (in) { vq_err(vq, "out %d, int %d\n", out, in); break; } - len = iov_length(vq->iov, out); + len = buf.out_len; /* Sanity check */ if (!len) { vq_err(vq, "Unexpected 0 len for TX\n"); break; } - vhost_add_used_and_signal(&n->dev, vq, head, 0); + vhost_put_used_buf(vq, &buf); + vhost_signal(&n->dev, vq); total_len += len; if (unlikely(vhost_exceeds_weight(vq, 0, total_len))) break; -- MST
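The contract handle_vq() now relies on is worth spelling out: the call fills a buf record and returns >0 on success, 0 when the ring is empty, <0 on error, instead of returning a head index that must be compared against vq->num. A standalone sketch under illustrative assumptions (types, ring layout, and the fixed out_len are all stand-ins, not the vhost API):

```c
struct vbuf {
	int id;
	int out_len;
};

#define RING_SIZE 4

/* Stand-in for the new-style fetch: consume one available entry. */
int get_avail_buf(const int *avail, int *last, int avail_idx,
		  struct vbuf *b)
{
	if (*last == avail_idx)
		return 0;		/* empty: wait for the next kick */
	b->id = avail[*last % RING_SIZE];
	b->out_len = 64;		/* vhost derives this from descriptors */
	(*last)++;
	return 1;
}
```

The caller loop then reads naturally: break on <0, re-enable notifications on 0, process on >0, with no ring-size sentinel leaking into driver code.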
[PATCH RFC 04/13] vhost: cleanup fetch_buf return code handling
Return code of fetch_buf is confusing, so callers resort to tricks to get to sane values. Let's switch to something standard: 0 empty, >0 non-empty, <0 error. Signed-off-by: Michael S. Tsirkin --- drivers/vhost/vhost.c | 24 1 file changed, 16 insertions(+), 8 deletions(-) diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c index aca2a5b0d078..bd52b44b0d23 100644 --- a/drivers/vhost/vhost.c +++ b/drivers/vhost/vhost.c @@ -2146,6 +2146,8 @@ static int fetch_indirect_descs(struct vhost_virtqueue *vq, return 0; } +/* This function returns a value > 0 if a descriptor was found, or 0 if none were found. + * A negative code is returned on error. */ static int fetch_buf(struct vhost_virtqueue *vq) { unsigned int i, head, found = 0; @@ -2162,7 +2164,7 @@ static int fetch_buf(struct vhost_virtqueue *vq) if (unlikely(vq->avail_idx == vq->last_avail_idx)) { /* If we already have work to do, don't bother re-checking. */ if (likely(vq->ndescs)) - return vq->num; + return 0; if (unlikely(vhost_get_avail_idx(vq, &avail_idx))) { vq_err(vq, "Failed to access avail idx at %p\n", @@ -2181,7 +2183,7 @@ * invalid. */ if (vq->avail_idx == last_avail_idx) - return vq->num; + return 0; /* Only get avail ring entries after they have been * exposed by guest. @@ -2251,12 +2253,14 @@ /* On success, increment avail index. */ vq->last_avail_idx++; - return 0; + return 1; } +/* This function returns a value > 0 if a descriptor was found, or 0 if none were found. + * A negative code is returned on error.
*/ static int fetch_descs(struct vhost_virtqueue *vq) { - int ret = 0; + int ret; if (unlikely(vq->first_desc >= vq->ndescs)) { vq->first_desc = 0; @@ -2266,10 +2270,14 @@ static int fetch_descs(struct vhost_virtqueue *vq) if (vq->ndescs) return 0; - while (!ret && vq->ndescs <= vhost_vq_num_batch_descs(vq)) - ret = fetch_buf(vq); + for (ret = 1; +ret > 0 && vq->ndescs <= vhost_vq_num_batch_descs(vq); +ret = fetch_buf(vq)) + ; - return vq->ndescs ? 0 : ret; + /* On success we expect some descs */ + BUG_ON(ret > 0 && !vq->ndescs); + return ret; } /* This looks in the virtqueue and for the first available buffer, and converts @@ -2288,7 +2296,7 @@ int vhost_get_vq_desc(struct vhost_virtqueue *vq, int ret = fetch_descs(vq); int i; - if (ret) + if (ret <= 0) return ret; /* Now convert to IOV */ -- MST
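The tri-state convention and the batching loop built on top of it can be shown in a few lines of standalone C. This is a behavioral sketch only, the "ring" is reduced to head/tail indices and none of this is the vhost implementation:

```c
/* The standardized convention: <0 error, 0 ring empty, >0 fetched. */
int fetch_one(int *head, int tail)
{
	if (*head < 0)
		return -1;	/* error */
	if (*head == tail)
		return 0;	/* nothing available */
	(*head)++;
	return 1;		/* fetched one descriptor */
}

/* Batch like fetch_descs(): keep fetching while successful, up to max,
 * and propagate errors. */
int fetch_batch(int *head, int tail, int max)
{
	int ret = 1, n = 0;

	while (ret > 0 && n < max) {
		ret = fetch_one(head, tail);
		if (ret > 0)
			n++;
	}
	return ret < 0 ? ret : n;
}
```

With a sane convention, the caller's test collapses to `if (ret <= 0) return ret;`, which is exactly the simplification visible in the vhost_get_vq_desc() hunk above.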
Re: [PATCH] iommu/dma: limit iova free size to unmmaped iova
On 2020-05-26 08:19, gup...@codeaurora.org wrote: On 2020-05-22 14:54, Robin Murphy wrote: On 2020-05-22 07:25, gup...@codeaurora.org wrote: On 2020-05-22 01:46, Robin Murphy wrote: On 2020-05-21 12:30, Prakash Gupta wrote: I agree, we shouldn't be freeing the partial iova. Instead just making sure if unmap was successful should be sufficient before freeing iova. So change can instead be something like this: - iommu_dma_free_iova(cookie, dma_addr, size); + if (unmapped) + iommu_dma_free_iova(cookie, dma_addr, size); TBH my gut feeling here is that you're really just trying to treat a symptom of another bug elsewhere, namely some driver calling dma_unmap_* or dma_free_* with the wrong address or size in the first place. This condition would arise only if driver calling dma_unmap/free_* with 0 iova_pfn. This will be flagged with a warning during unmap but will trigger panic later on while doing unrelated dma_map/unmap_*. If unmapped has already failed for invalid iova, there is no reason we should consider this as valid iova and free. This part should be fixed. I disagree. In general, if drivers call the DMA API incorrectly it is liable to lead to data loss, memory corruption, and various other unpleasant misbehaviour - it is not the DMA layer's job to attempt to paper over driver bugs. There *is* an argument for downgrading the BUG_ON() in iova_magazine_free_pfns() to a WARN_ON(), since frankly it isn't a sufficiently serious condition to justify killing the whole machine immediately, but NAK to bodging the iommu-dma mid-layer to "fix" that. A serious bug already happened elsewhere, so trying to hide the fallout really doesn't help anyone. Sorry for delayed response, it was a long weekend. I agree that invalid DMA API call can result in unexpected issues and client should fix it, but then the present behavior makes it difficult to catch cases when driver is making wrong DMA API calls. 
When invalid iova pfn is passed it doesn't fail then and there, though DMA layer is aware of iova being invalid. It fails much after that in the context of an valid map/unmap, with BUG_ON(). Downgrading BUG_ON() to WARN_ON() in iova_magazine_free_pfns() will not help much as invalid iova will cause NULL pointer dereference. Obviously I didn't mean a literal s/BUG/WARN/ substitution - some additional control flow to actually handle the error case was implied. I'll write up the patch myself, since it's easier than further debating. I see no reason why DMA layer wants to free an iova for which unmapped failed. IMHO queuing an invalid iova (which already failed unmap) to rcache which eventually going to crash the system looks like iommu-dma layer issue. What if the unmap fails because the address range is already entirely unmapped? Freeing the IOVA (or at least attempting to) would be logically appropriate in that case. In fact some IOMMU drivers might not even consider that a failure, so the DMA layer may not even be aware that it's been handed a bogus unallocated address. The point is that unmapping *doesn't* fail under normal and correct operation, so the DMA layer should not expect to have to handle it. Even if it does happen, that's a highly exceptional case that the DMA layer cannot recover from by itself; at best it can just push the problem elsewhere. It's pretty hard to justify doing extra work to simply move an exceptional problem around without really addressing it. And in this particular case, personally I would *much* rather see warnings spewing from both the pagetable and IOVA code as early as possible to clearly indicate that the DMA layer itself has been thrown out of sync, than just have warnings that might represent some other source of pagetable corruption (or at worst, depending on the pagetable code, no warnings at all and only have dma_map_*() calls quietly start failing much, much later due to all the IOVA space having been leaked by bad unmaps). 
Robin.
[PATCH RFC 06/13] vhost: reorder functions
Reorder functions in the file to not rely on forward declarations, in preparation for making them static down the road. Signed-off-by: Michael S. Tsirkin --- drivers/vhost/vhost.c | 40 1 file changed, 20 insertions(+), 20 deletions(-) diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c index bd52b44b0d23..b4a6e44d56a8 100644 --- a/drivers/vhost/vhost.c +++ b/drivers/vhost/vhost.c @@ -2256,6 +2256,13 @@ static int fetch_buf(struct vhost_virtqueue *vq) return 1; } +/* Reverse the effect of vhost_get_vq_desc. Useful for error handling. */ +void vhost_discard_vq_desc(struct vhost_virtqueue *vq, int n) +{ + vq->last_avail_idx -= n; +} +EXPORT_SYMBOL_GPL(vhost_discard_vq_desc); + /* This function returns a value > 0 if a descriptor was found, or 0 if none were found. * A negative code is returned on error. */ static int fetch_descs(struct vhost_virtqueue *vq) @@ -2370,26 +2377,6 @@ int vhost_get_vq_desc(struct vhost_virtqueue *vq, } EXPORT_SYMBOL_GPL(vhost_get_vq_desc); -/* Reverse the effect of vhost_get_vq_desc. Useful for error handling. */ -void vhost_discard_vq_desc(struct vhost_virtqueue *vq, int n) -{ - vq->last_avail_idx -= n; -} -EXPORT_SYMBOL_GPL(vhost_discard_vq_desc); - -/* After we've used one of their buffers, we tell them about it. We'll then - * want to notify the guest, using eventfd. */ -int vhost_add_used(struct vhost_virtqueue *vq, unsigned int head, int len) -{ - struct vring_used_elem heads = { - cpu_to_vhost32(vq, head), - cpu_to_vhost32(vq, len) - }; - - return vhost_add_used_n(vq, &heads, 1); -} -EXPORT_SYMBOL_GPL(vhost_add_used); - static int __vhost_add_used_n(struct vhost_virtqueue *vq, struct vring_used_elem *heads, unsigned count) @@ -2459,6 +2446,19 @@ int vhost_add_used_n(struct vhost_virtqueue *vq, struct vring_used_elem *heads, } EXPORT_SYMBOL_GPL(vhost_add_used_n); +/* After we've used one of their buffers, we tell them about it. We'll then + * want to notify the guest, using eventfd.
*/ +int vhost_add_used(struct vhost_virtqueue *vq, unsigned int head, int len) +{ + struct vring_used_elem heads = { + cpu_to_vhost32(vq, head), + cpu_to_vhost32(vq, len) + }; + + return vhost_add_used_n(vq, &heads, 1); +} +EXPORT_SYMBOL_GPL(vhost_add_used); + static bool vhost_notify(struct vhost_dev *dev, struct vhost_virtqueue *vq) { __u16 old, new; -- MST
[PATCH RFC 02/13] vhost: use batched version by default
As testing shows no performance change, switch to that now. Signed-off-by: Michael S. Tsirkin Signed-off-by: Eugenio Pérez Link: https://lore.kernel.org/r/20200401183118.8334-3-epere...@redhat.com Signed-off-by: Michael S. Tsirkin --- drivers/vhost/vhost.c | 251 +- drivers/vhost/vhost.h | 4 - 2 files changed, 2 insertions(+), 253 deletions(-) diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c index 105fc97af2c8..8f9a07282625 100644 --- a/drivers/vhost/vhost.c +++ b/drivers/vhost/vhost.c @@ -2038,253 +2038,6 @@ static unsigned next_desc(struct vhost_virtqueue *vq, struct vring_desc *desc) return next; } -static int get_indirect(struct vhost_virtqueue *vq, - struct iovec iov[], unsigned int iov_size, - unsigned int *out_num, unsigned int *in_num, - struct vhost_log *log, unsigned int *log_num, - struct vring_desc *indirect) -{ - struct vring_desc desc; - unsigned int i = 0, count, found = 0; - u32 len = vhost32_to_cpu(vq, indirect->len); - struct iov_iter from; - int ret, access; - - /* Sanity check */ - if (unlikely(len % sizeof desc)) { - vq_err(vq, "Invalid length in indirect descriptor: " - "len 0x%llx not multiple of 0x%zx\n", - (unsigned long long)len, - sizeof desc); - return -EINVAL; - } - - ret = translate_desc(vq, vhost64_to_cpu(vq, indirect->addr), len, vq->indirect, -UIO_MAXIOV, VHOST_ACCESS_RO); - if (unlikely(ret < 0)) { - if (ret != -EAGAIN) - vq_err(vq, "Translation failure %d in indirect.\n", ret); - return ret; - } - iov_iter_init(&from, READ, vq->indirect, ret, len); - - /* We will use the result as an address to read from, so most -* architectures only need a compiler barrier here. */ - read_barrier_depends(); - - count = len / sizeof desc; - /* Buffers are chained via a 16 bit next field, so -* we can have at most 2^16 of these.
*/ - if (unlikely(count > USHRT_MAX + 1)) { - vq_err(vq, "Indirect buffer length too big: %d\n", - indirect->len); - return -E2BIG; - } - - do { - unsigned iov_count = *in_num + *out_num; - if (unlikely(++found > count)) { - vq_err(vq, "Loop detected: last one at %u " - "indirect size %u\n", - i, count); - return -EINVAL; - } - if (unlikely(!copy_from_iter_full(&desc, sizeof(desc), &from))) { - vq_err(vq, "Failed indirect descriptor: idx %d, %zx\n", - i, (size_t)vhost64_to_cpu(vq, indirect->addr) + i * sizeof desc); - return -EINVAL; - } - if (unlikely(desc.flags & cpu_to_vhost16(vq, VRING_DESC_F_INDIRECT))) { - vq_err(vq, "Nested indirect descriptor: idx %d, %zx\n", - i, (size_t)vhost64_to_cpu(vq, indirect->addr) + i * sizeof desc); - return -EINVAL; - } - - if (desc.flags & cpu_to_vhost16(vq, VRING_DESC_F_WRITE)) - access = VHOST_ACCESS_WO; - else - access = VHOST_ACCESS_RO; - - ret = translate_desc(vq, vhost64_to_cpu(vq, desc.addr), -vhost32_to_cpu(vq, desc.len), iov + iov_count, -iov_size - iov_count, access); - if (unlikely(ret < 0)) { - if (ret != -EAGAIN) - vq_err(vq, "Translation failure %d indirect idx %d\n", - ret, i); - return ret; - } - /* If this is an input descriptor, increment that count. */ - if (access == VHOST_ACCESS_WO) { - *in_num += ret; - if (unlikely(log && ret)) { - log[*log_num].addr = vhost64_to_cpu(vq, desc.addr); - log[*log_num].len = vhost32_to_cpu(vq, desc.len); - ++*log_num; - } - } else { - /* If it's an output descriptor, they're all supposed -* to come before any input descriptors. */ - if (unlikely(*in_num)) { - vq_err(vq, "Indirect descriptor " - "has out after in: idx %d\n", i); - return -EINVAL; - } -
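One guard worth noting in the get_indirect() code being removed is the bounded chain walk: visiting more descriptors than the table can hold proves a (possibly guest-crafted) cycle. The same idea in standalone C, with the chain reduced to a next[] table and -1 as the terminator (all names illustrative):

```c
/* Walk a descriptor-style chain; return its length, or -1 if more
 * nodes are visited than 'count' allows, i.e. a loop was detected. */
int walk_chain(const int *next, int count, int head)
{
	int found = 0, i = head;

	while (i != -1) {
		if (++found > count)
			return -1;	/* loop detected, abort */
		i = next[i];
	}
	return found;
}
```

The counter-based bound is cheaper than cycle detection proper (no extra state beyond one integer) and is sufficient here because a legitimate chain can never exceed the table size.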
[PATCH RFC 13/13] vhost: drop head based APIs
Everyone's using buf APIs, no need for head based ones anymore. Signed-off-by: Michael S. Tsirkin --- drivers/vhost/vhost.c | 36 drivers/vhost/vhost.h | 12 2 files changed, 8 insertions(+), 40 deletions(-) diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c index be822f0c9428..412923cc96df 100644 --- a/drivers/vhost/vhost.c +++ b/drivers/vhost/vhost.c @@ -2256,12 +2256,12 @@ static int fetch_buf(struct vhost_virtqueue *vq) return 1; } -/* Reverse the effect of vhost_get_vq_desc. Useful for error handling. */ +/* Revert the effect of fetch_buf. Useful for error handling. */ +static void vhost_discard_vq_desc(struct vhost_virtqueue *vq, int n) { vq->last_avail_idx -= n; } -EXPORT_SYMBOL_GPL(vhost_discard_vq_desc); /* This function returns a value > 0 if a descriptor was found, or 0 if none were found. * A negative code is returned on error. */ @@ -2421,8 +2421,7 @@ static int __vhost_add_used_n(struct vhost_virtqueue *vq, return 0; } -/* After we've used one of their buffers, we tell them about it. We'll then - * want to notify the guest, using eventfd. */ +static int vhost_add_used_n(struct vhost_virtqueue *vq, struct vring_used_elem *heads, unsigned count) { @@ -2456,10 +2455,8 @@ int vhost_add_used_n(struct vhost_virtqueue *vq, struct vring_used_elem *heads, } return r; } -EXPORT_SYMBOL_GPL(vhost_add_used_n); -/* After we've used one of their buffers, we tell them about it. We'll then - * want to notify the guest, using eventfd. */ +static int vhost_add_used(struct vhost_virtqueue *vq, unsigned int head, int len) { struct vring_used_elem heads = { @@ -2469,14 +2466,17 @@ int vhost_add_used(struct vhost_virtqueue *vq, unsigned int head, int len) return vhost_add_used_n(vq, &heads, 1); } -EXPORT_SYMBOL_GPL(vhost_add_used); +/* After we've used one of their buffers, we tell them about it. We'll then + * want to notify the guest, using vhost_signal.
*/ int vhost_put_used_buf(struct vhost_virtqueue *vq, struct vhost_buf *buf) { return vhost_add_used(vq, buf->id, buf->in_len); } EXPORT_SYMBOL_GPL(vhost_put_used_buf); +/* After we've used one of their buffers, we tell them about it. We'll then + * want to notify the guest, using vhost_signal. */ int vhost_put_used_n_bufs(struct vhost_virtqueue *vq, struct vhost_buf *bufs, unsigned count) { @@ -2537,26 +2537,6 @@ void vhost_signal(struct vhost_dev *dev, struct vhost_virtqueue *vq) } EXPORT_SYMBOL_GPL(vhost_signal); -/* And here's the combo meal deal. Supersize me! */ -void vhost_add_used_and_signal(struct vhost_dev *dev, - struct vhost_virtqueue *vq, - unsigned int head, int len) -{ - vhost_add_used(vq, head, len); - vhost_signal(dev, vq); -} -EXPORT_SYMBOL_GPL(vhost_add_used_and_signal); - -/* multi-buffer version of vhost_add_used_and_signal */ -void vhost_add_used_and_signal_n(struct vhost_dev *dev, -struct vhost_virtqueue *vq, -struct vring_used_elem *heads, unsigned count) -{ - vhost_add_used_n(vq, heads, count); - vhost_signal(dev, vq); -} -EXPORT_SYMBOL_GPL(vhost_add_used_and_signal_n); - /* return true if we're sure that avaiable ring is empty */ bool vhost_vq_avail_empty(struct vhost_dev *dev, struct vhost_virtqueue *vq) { diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h index 6c10e99ff334..4fcf59153fc7 100644 --- a/drivers/vhost/vhost.h +++ b/drivers/vhost/vhost.h @@ -195,11 +195,6 @@ long vhost_vring_ioctl(struct vhost_dev *d, unsigned int ioctl, void __user *arg bool vhost_vq_access_ok(struct vhost_virtqueue *vq); bool vhost_log_access_ok(struct vhost_dev *); -int vhost_get_vq_desc(struct vhost_virtqueue *, - struct iovec iov[], unsigned int iov_count, - unsigned int *out_num, unsigned int *in_num, - struct vhost_log *log, unsigned int *log_num); -void vhost_discard_vq_desc(struct vhost_virtqueue *, int n); int vhost_get_avail_buf(struct vhost_virtqueue *, struct vhost_buf *buf, struct iovec iov[], unsigned int iov_count, unsigned int 
*out_num, unsigned int *in_num, @@ -207,13 +202,6 @@ int vhost_get_avail_buf(struct vhost_virtqueue *, struct vhost_buf *buf, void vhost_discard_avail_bufs(struct vhost_virtqueue *, struct vhost_buf *, unsigned count); int vhost_vq_init_access(struct vhost_virtqueue *); -int vhost_add_used(struct vhost_virtqueue *, unsigned int head, int len); -int vhost_add_used_n(struct vhost_virtqueue *, struct vring_used_elem *heads, -unsigned count); -void vhost_add_used_and_signal(struct vhost_dev *, struct vhost_virtqueue *, - unsigned int id, int len); -void
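The `vhost_discard_vq_desc()` kept (and made static) by this patch reverts a fetch simply by rewinding `last_avail_idx`; because the index is a free-running 16-bit counter, wraparound needs no special handling. A tiny sketch of that rewind (demo function, not the kernel code):

```c
#include <assert.h>
#include <stdint.h>

/* "Un-fetch" n buffers by rewinding the free-running avail index.
 * The guest-visible ring is untouched, so the next fetch simply
 * returns the same buffers again. u16 arithmetic makes wraparound
 * (e.g. rewinding past 0) automatically correct. */
uint16_t discard_bufs(uint16_t last_avail_idx, int n)
{
    return (uint16_t)(last_avail_idx - n);
}
```

This is why discard is safe for error handling: no state other than the private cursor has to be rolled back.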
[PATCH RFC 08/13] vhost/net: convert to new API: heads->bufs
Convert vhost net to use the new format-agnostic API. In particular, don't poke at vq internals such as the heads array. Signed-off-by: Michael S. Tsirkin --- drivers/vhost/net.c | 153 +++- 1 file changed, 81 insertions(+), 72 deletions(-) diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c index 749a9cf51a59..47af3d1ce3dd 100644 --- a/drivers/vhost/net.c +++ b/drivers/vhost/net.c @@ -59,13 +59,13 @@ MODULE_PARM_DESC(experimental_zcopytx, "Enable Zero Copy TX;" * status internally; used for zerocopy tx only. */ /* Lower device DMA failed */ -#define VHOST_DMA_FAILED_LEN ((__force __virtio32)3) +#define VHOST_DMA_FAILED_LEN (3) /* Lower device DMA done */ -#define VHOST_DMA_DONE_LEN ((__force __virtio32)2) +#define VHOST_DMA_DONE_LEN (2) /* Lower device DMA in progress */ -#define VHOST_DMA_IN_PROGRESS ((__force __virtio32)1) +#define VHOST_DMA_IN_PROGRESS (1) /* Buffer unused */ -#define VHOST_DMA_CLEAR_LEN((__force __virtio32)0) +#define VHOST_DMA_CLEAR_LEN(0) #define VHOST_DMA_IS_DONE(len) ((__force u32)(len) >= (__force u32)VHOST_DMA_DONE_LEN) @@ -112,9 +112,12 @@ struct vhost_net_virtqueue { /* last used idx for outstanding DMA zerocopy buffers */ int upend_idx; /* For TX, first used idx for DMA done zerocopy buffers -* For RX, number of batched heads +* For RX, number of batched bufs */ int done_idx; + /* Outstanding user bufs. UIO_MAXIOV in length. */ + /* TODO: we can make this smaller for sure. 
*/ + struct vhost_buf *bufs; /* Number of XDP frames batched */ int batched_xdp; /* an array of userspace buffers info */ @@ -271,6 +274,8 @@ static void vhost_net_clear_ubuf_info(struct vhost_net *n) int i; for (i = 0; i < VHOST_NET_VQ_MAX; ++i) { + kfree(n->vqs[i].bufs); + n->vqs[i].bufs = NULL; kfree(n->vqs[i].ubuf_info); n->vqs[i].ubuf_info = NULL; } @@ -282,6 +287,12 @@ static int vhost_net_set_ubuf_info(struct vhost_net *n) int i; for (i = 0; i < VHOST_NET_VQ_MAX; ++i) { + n->vqs[i].bufs = kmalloc_array(UIO_MAXIOV, + sizeof(*n->vqs[i].bufs), + GFP_KERNEL); + if (!n->vqs[i].bufs) + goto err; + zcopy = vhost_net_zcopy_mask & (0x1 << i); if (!zcopy) continue; @@ -364,18 +375,18 @@ static void vhost_zerocopy_signal_used(struct vhost_net *net, int j = 0; for (i = nvq->done_idx; i != nvq->upend_idx; i = (i + 1) % UIO_MAXIOV) { - if (vq->heads[i].len == VHOST_DMA_FAILED_LEN) + if (nvq->bufs[i].in_len == VHOST_DMA_FAILED_LEN) vhost_net_tx_err(net); - if (VHOST_DMA_IS_DONE(vq->heads[i].len)) { - vq->heads[i].len = VHOST_DMA_CLEAR_LEN; + if (VHOST_DMA_IS_DONE(nvq->bufs[i].in_len)) { + nvq->bufs[i].in_len = VHOST_DMA_CLEAR_LEN; ++j; } else break; } while (j) { add = min(UIO_MAXIOV - nvq->done_idx, j); - vhost_add_used_and_signal_n(vq->dev, vq, - >heads[nvq->done_idx], add); + vhost_put_used_n_bufs(vq, >bufs[nvq->done_idx], add); + vhost_signal(vq->dev, vq); nvq->done_idx = (nvq->done_idx + add) % UIO_MAXIOV; j -= add; } @@ -390,7 +401,7 @@ static void vhost_zerocopy_callback(struct ubuf_info *ubuf, bool success) rcu_read_lock_bh(); /* set len to mark this desc buffers done DMA */ - nvq->vq.heads[ubuf->desc].in_len = success ? + nvq->bufs[ubuf->desc].in_len = success ? 
VHOST_DMA_DONE_LEN : VHOST_DMA_FAILED_LEN; cnt = vhost_net_ubuf_put(ubufs); @@ -452,7 +463,8 @@ static void vhost_net_signal_used(struct vhost_net_virtqueue *nvq) if (!nvq->done_idx) return; - vhost_add_used_and_signal_n(dev, vq, vq->heads, nvq->done_idx); + vhost_put_used_n_bufs(vq, nvq->bufs, nvq->done_idx); + vhost_signal(dev, vq); nvq->done_idx = 0; } @@ -558,6 +570,7 @@ static void vhost_net_busy_poll(struct vhost_net *net, static int vhost_net_tx_get_vq_desc(struct vhost_net *net, struct vhost_net_virtqueue *tnvq, + struct vhost_buf *buf, unsigned int *out_num, unsigned int *in_num, struct msghdr *msghdr, bool *busyloop_intr) { @@ -565,10 +578,10 @@ static int vhost_net_tx_get_vq_desc(struct vhost_net *net, struct vhost_virtqueue *rvq = >vq;
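The `vhost_zerocopy_signal_used()` loop above first counts how many entries at `done_idx` have completed DMA, then flushes them in at most two contiguous chunks, because the `bufs` array is a ring of `UIO_MAXIOV` entries that may wrap. A sketch of that `min()`/modulo arithmetic (demo only; the real flush call is `vhost_put_used_n_bufs()`):

```c
#include <assert.h>

/* Flush `j` completed entries from a ring of `size` slots starting at
 * *done_idx. Each iteration handles one contiguous run, so a wrapped
 * range takes exactly two flush calls. Returns the number of calls. */
int flush_ring(unsigned *done_idx, unsigned size, unsigned j)
{
    int calls = 0;

    while (j) {
        unsigned add = size - *done_idx;   /* room until the ring edge */
        if (add > j)
            add = j;
        /* real code: vhost_put_used_n_bufs(vq, &bufs[*done_idx], add);
         *            vhost_signal(vq->dev, vq); */
        *done_idx = (*done_idx + add) % size;
        j -= add;
        ++calls;
    }
    return calls;
}

/* 5 entries in a ring of 8 starting at slot 6: a chunk of 2 up to the
 * edge, then a chunk of 3 from slot 0, leaving done_idx at 3.
 * Encodes both results as calls * 100 + final index. */
unsigned demo_wrap(void)
{
    unsigned done = 6;
    return (unsigned)flush_ring(&done, 8, 5) * 100 + done;
}
```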
[PATCH RFC 03/13] vhost: batching fetches
With this patch applied, new and old code perform identically. Lots of extra optimizations are now possible, e.g. we can fetch multiple heads with copy_from/to_user now. We can get rid of maintaining the log array. Etc etc. Signed-off-by: Michael S. Tsirkin Signed-off-by: Eugenio Pérez Link: https://lore.kernel.org/r/20200401183118.8334-4-epere...@redhat.com Signed-off-by: Michael S. Tsirkin --- drivers/vhost/test.c | 2 +- drivers/vhost/vhost.c | 47 ++- drivers/vhost/vhost.h | 5 - 3 files changed, 47 insertions(+), 7 deletions(-) diff --git a/drivers/vhost/test.c b/drivers/vhost/test.c index 9a3a09005e03..02806d6f84ef 100644 --- a/drivers/vhost/test.c +++ b/drivers/vhost/test.c @@ -119,7 +119,7 @@ static int vhost_test_open(struct inode *inode, struct file *f) dev = >dev; vqs[VHOST_TEST_VQ] = >vqs[VHOST_TEST_VQ]; n->vqs[VHOST_TEST_VQ].handle_kick = handle_vq_kick; - vhost_dev_init(dev, vqs, VHOST_TEST_VQ_MAX, UIO_MAXIOV, + vhost_dev_init(dev, vqs, VHOST_TEST_VQ_MAX, UIO_MAXIOV + 64, VHOST_TEST_PKT_WEIGHT, VHOST_TEST_WEIGHT, NULL); f->private_data = n; diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c index 8f9a07282625..aca2a5b0d078 100644 --- a/drivers/vhost/vhost.c +++ b/drivers/vhost/vhost.c @@ -299,6 +299,7 @@ static void vhost_vq_reset(struct vhost_dev *dev, { vq->num = 1; vq->ndescs = 0; + vq->first_desc = 0; vq->desc = NULL; vq->avail = NULL; vq->used = NULL; @@ -367,6 +368,11 @@ static int vhost_worker(void *data) return 0; } +static int vhost_vq_num_batch_descs(struct vhost_virtqueue *vq) +{ + return vq->max_descs - UIO_MAXIOV; +} + static void vhost_vq_free_iovecs(struct vhost_virtqueue *vq) { kfree(vq->descs); @@ -389,6 +395,9 @@ static long vhost_dev_alloc_iovecs(struct vhost_dev *dev) for (i = 0; i < dev->nvqs; ++i) { vq = dev->vqs[i]; vq->max_descs = dev->iov_limit; + if (vhost_vq_num_batch_descs(vq) < 0) { + return -EINVAL; + } vq->descs = kmalloc_array(vq->max_descs, sizeof(*vq->descs), GFP_KERNEL); @@ -1570,6 +1579,7 @@ long 
vhost_vring_ioctl(struct vhost_dev *d, unsigned int ioctl, void __user *arg vq->last_avail_idx = s.num; /* Forget the cached index value. */ vq->avail_idx = vq->last_avail_idx; + vq->ndescs = vq->first_desc = 0; break; case VHOST_GET_VRING_BASE: s.index = idx; @@ -2136,7 +2146,7 @@ static int fetch_indirect_descs(struct vhost_virtqueue *vq, return 0; } -static int fetch_descs(struct vhost_virtqueue *vq) +static int fetch_buf(struct vhost_virtqueue *vq) { unsigned int i, head, found = 0; struct vhost_desc *last; @@ -2149,7 +2159,11 @@ static int fetch_descs(struct vhost_virtqueue *vq) /* Check it isn't doing very strange things with descriptor numbers. */ last_avail_idx = vq->last_avail_idx; - if (vq->avail_idx == vq->last_avail_idx) { + if (unlikely(vq->avail_idx == vq->last_avail_idx)) { + /* If we already have work to do, don't bother re-checking. */ + if (likely(vq->ndescs)) + return vq->num; + if (unlikely(vhost_get_avail_idx(vq, _idx))) { vq_err(vq, "Failed to access avail idx at %p\n", >avail->idx); @@ -2240,6 +2254,24 @@ static int fetch_descs(struct vhost_virtqueue *vq) return 0; } +static int fetch_descs(struct vhost_virtqueue *vq) +{ + int ret = 0; + + if (unlikely(vq->first_desc >= vq->ndescs)) { + vq->first_desc = 0; + vq->ndescs = 0; + } + + if (vq->ndescs) + return 0; + + while (!ret && vq->ndescs <= vhost_vq_num_batch_descs(vq)) + ret = fetch_buf(vq); + + return vq->ndescs ? 0 : ret; +} + /* This looks in the virtqueue and for the first available buffer, and converts * it to an iovec for convenient access. 
Since descriptors consist of some * number of output then some number of input descriptors, it's actually two @@ -2265,7 +2297,7 @@ int vhost_get_vq_desc(struct vhost_virtqueue *vq, if (unlikely(log)) *log_num = 0; - for (i = 0; i < vq->ndescs; ++i) { + for (i = vq->first_desc; i < vq->ndescs; ++i) { unsigned iov_count = *in_num + *out_num; struct vhost_desc *desc = >descs[i]; int access; @@ -2311,14 +2343,19 @@ int vhost_get_vq_desc(struct vhost_virtqueue *vq, } ret = desc->id; + + if (!(desc->flags & VRING_DESC_F_NEXT)) +
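The batching in this patch hinges on two pieces of state: `first_desc` (consumption cursor) and `ndescs` (cache fill level), with `vhost_vq_num_batch_descs() = max_descs - UIO_MAXIOV` as the prefetch headroom — which is why test.c now passes `UIO_MAXIOV + 64`. A userspace sketch of the consume/refill logic in `fetch_descs()` (demo struct, not the kernel code):

```c
#include <assert.h>

/* Cursor/cache state mirroring vq->first_desc / vq->ndescs. */
struct demo_vq {
    int first_desc;   /* next cached descriptor to hand out */
    int ndescs;       /* number of descriptors currently cached */
    int refills;      /* how often we went back to the ring */
};

/* Mirror of fetch_descs(): reset the cache once the cursor has run
 * past it, and refetch (a stand-in for the fetch_buf() loop) only
 * when the cache is empty. */
void demo_fetch(struct demo_vq *vq, int batch)
{
    if (vq->first_desc >= vq->ndescs) {
        vq->first_desc = 0;
        vq->ndescs = 0;
    }
    if (vq->ndescs)
        return;           /* cached work left: don't touch the ring */
    vq->ndescs = batch;
    vq->refills++;
}

/* Consume 6 descriptors one at a time with a batch size of 4:
 * only two trips to the ring are needed instead of six. */
int demo_consume_six(void)
{
    struct demo_vq vq = { 0, 0, 0 };

    for (int i = 0; i < 6; i++) {
        demo_fetch(&vq, 4);
        vq.first_desc++;   /* vhost_get_vq_desc() advancing the cursor */
    }
    return vq.refills;
}
```

This is the whole point of the batching commit: the per-descriptor fast path becomes a cursor increment, and ring accesses amortize over the batch.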
[PATCH RFC 07/13] vhost: format-independent API for used buffers
Add a new API that doesn't assume used ring, heads, etc. For now, we keep the old APIs around to make it easier to convert drivers. Signed-off-by: Michael S. Tsirkin --- drivers/vhost/vhost.c | 52 ++- drivers/vhost/vhost.h | 17 +- 2 files changed, 58 insertions(+), 11 deletions(-) diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c index b4a6e44d56a8..be822f0c9428 100644 --- a/drivers/vhost/vhost.c +++ b/drivers/vhost/vhost.c @@ -2292,13 +2292,12 @@ static int fetch_descs(struct vhost_virtqueue *vq) * number of output then some number of input descriptors, it's actually two * iovecs, but we pack them into one and note how many of each there were. * - * This function returns the descriptor number found, or vq->num (which is - * never a valid descriptor number) if none was found. A negative code is - * returned on error. */ -int vhost_get_vq_desc(struct vhost_virtqueue *vq, - struct iovec iov[], unsigned int iov_size, - unsigned int *out_num, unsigned int *in_num, - struct vhost_log *log, unsigned int *log_num) + * This function returns a value > 0 if a descriptor was found, or 0 if none were found. + * A negative code is returned on error. */ +int vhost_get_avail_buf(struct vhost_virtqueue *vq, struct vhost_buf *buf, + struct iovec iov[], unsigned int iov_size, + unsigned int *out_num, unsigned int *in_num, + struct vhost_log *log, unsigned int *log_num) { int ret = fetch_descs(vq); int i; @@ -2311,6 +2310,8 @@ int vhost_get_vq_desc(struct vhost_virtqueue *vq, *out_num = *in_num = 0; if (unlikely(log)) *log_num = 0; + buf->in_len = buf->out_len = 0; + buf->descs = 0; for (i = vq->first_desc; i < vq->ndescs; ++i) { unsigned iov_count = *in_num + *out_num; @@ -2340,6 +2341,7 @@ int vhost_get_vq_desc(struct vhost_virtqueue *vq, /* If this is an input descriptor, * increment that count. 
*/ *in_num += ret; + buf->in_len += desc->len; if (unlikely(log && ret)) { log[*log_num].addr = desc->addr; log[*log_num].len = desc->len; @@ -2355,9 +2357,11 @@ int vhost_get_vq_desc(struct vhost_virtqueue *vq, goto err; } *out_num += ret; + buf->out_len += desc->len; } - ret = desc->id; + buf->id = desc->id; + ++buf->descs; if (!(desc->flags & VRING_DESC_F_NEXT)) break; @@ -2365,7 +2369,7 @@ int vhost_get_vq_desc(struct vhost_virtqueue *vq, vq->first_desc = i + 1; - return ret; + return 1; err: for (i = vq->first_desc; i < vq->ndescs; ++i) @@ -2375,7 +2379,15 @@ int vhost_get_vq_desc(struct vhost_virtqueue *vq, return ret; } -EXPORT_SYMBOL_GPL(vhost_get_vq_desc); +EXPORT_SYMBOL_GPL(vhost_get_avail_buf); + +/* Reverse the effect of vhost_get_avail_buf. Useful for error handling. */ +void vhost_discard_avail_bufs(struct vhost_virtqueue *vq, + struct vhost_buf *buf, unsigned count) +{ + vhost_discard_vq_desc(vq, count); +} +EXPORT_SYMBOL_GPL(vhost_discard_avail_bufs); static int __vhost_add_used_n(struct vhost_virtqueue *vq, struct vring_used_elem *heads, @@ -2459,6 +2471,26 @@ int vhost_add_used(struct vhost_virtqueue *vq, unsigned int head, int len) } EXPORT_SYMBOL_GPL(vhost_add_used); +int vhost_put_used_buf(struct vhost_virtqueue *vq, struct vhost_buf *buf) +{ + return vhost_add_used(vq, buf->id, buf->in_len); +} +EXPORT_SYMBOL_GPL(vhost_put_used_buf); + +int vhost_put_used_n_bufs(struct vhost_virtqueue *vq, + struct vhost_buf *bufs, unsigned count) +{ + unsigned i; + + for (i = 0; i < count; ++i) { + vq->heads[i].id = cpu_to_vhost32(vq, bufs[i].id); + vq->heads[i].len = cpu_to_vhost32(vq, bufs[i].in_len); + } + + return vhost_add_used_n(vq, vq->heads, count); +} +EXPORT_SYMBOL_GPL(vhost_put_used_n_bufs); + static bool vhost_notify(struct vhost_dev *dev, struct vhost_virtqueue *vq) { __u16 old, new; diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h index a67bda9792ec..6c10e99ff334 100644 --- a/drivers/vhost/vhost.h +++ b/drivers/vhost/vhost.h @@ -67,6 
+67,13 @@ struct vhost_desc { u16 id; }; +struct vhost_buf { + u32 out_len; + u32 in_len; + u16 descs; + u16 id; +}; + /* The virtqueue structure describes a queue attached to a device. */ struct vhost_virtqueue { struct vhost_dev *dev; @@ -193,7 +200,12 @@ int
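As the rewritten comment states, `vhost_get_avail_buf()` replaces the old `head == vq->num` sentinel with a plain three-way return: > 0 buffer found (its id, lengths, and descriptor count now live in `struct vhost_buf`), 0 ring empty, < 0 error. The mapping between the two conventions can be written down directly (demo constant for `vq->num`, not the kernel code):

```c
#include <assert.h>

enum { DEMO_VQ_NUM = 256 };   /* stand-in for vq->num */

/* Translate the old vhost_get_vq_desc() return convention into the
 * new vhost_get_avail_buf() one. */
int old_to_new(int head)
{
    if (head < 0)
        return head;          /* errors pass through unchanged */
    if (head == DEMO_VQ_NUM)
        return 0;             /* ring empty: old in-band sentinel */
    return 1;                 /* found; the id moves into vhost_buf */
}
```

Removing the in-band sentinel is what lets the core stop leaking `vq->num` (a split-ring detail) to every caller.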
[PATCH RFC 11/13] vhost/scsi: switch to buf APIs
Switch to buf APIs. Doing this exposes a spec violation in vhost scsi: all used bufs are marked with length 0. Fix that is left for another day. Signed-off-by: Michael S. Tsirkin --- drivers/vhost/scsi.c | 73 ++-- 1 file changed, 44 insertions(+), 29 deletions(-) diff --git a/drivers/vhost/scsi.c b/drivers/vhost/scsi.c index c39952243fd3..c426c4e899c7 100644 --- a/drivers/vhost/scsi.c +++ b/drivers/vhost/scsi.c @@ -71,8 +71,8 @@ struct vhost_scsi_inflight { }; struct vhost_scsi_cmd { - /* Descriptor from vhost_get_vq_desc() for virt_queue segment */ - int tvc_vq_desc; + /* Descriptor from vhost_get_avail_buf() for virt_queue segment */ + struct vhost_buf tvc_vq_desc; /* virtio-scsi initiator task attribute */ int tvc_task_attr; /* virtio-scsi response incoming iovecs */ @@ -213,7 +213,7 @@ struct vhost_scsi { * Context for processing request and control queue operations. */ struct vhost_scsi_ctx { - int head; + struct vhost_buf buf; unsigned int out, in; size_t req_size, rsp_size; size_t out_size, in_size; @@ -443,6 +443,20 @@ static int vhost_scsi_check_stop_free(struct se_cmd *se_cmd) return target_put_sess_cmd(se_cmd); } +/* Signal to guest that request finished with no input buffer. 
*/ +/* TODO calling this when writing into buffer and most likely a bug */ +static void vhost_scsi_signal_noinput(struct vhost_dev *vdev, + struct vhost_virtqueue *vq, + struct vhost_buf *bufp) +{ + struct vhost_buf buf = *bufp; + + buf.in_len = 0; + vhost_put_used_buf(vq, ); + vhost_signal(vdev, vq); +} + + static void vhost_scsi_do_evt_work(struct vhost_scsi *vs, struct vhost_scsi_evt *evt) { @@ -450,7 +464,8 @@ vhost_scsi_do_evt_work(struct vhost_scsi *vs, struct vhost_scsi_evt *evt) struct virtio_scsi_event *event = >event; struct virtio_scsi_event __user *eventp; unsigned out, in; - int head, ret; + struct vhost_buf buf; + int ret; if (!vhost_vq_get_backend(vq)) { vs->vs_events_missed = true; @@ -459,14 +474,14 @@ vhost_scsi_do_evt_work(struct vhost_scsi *vs, struct vhost_scsi_evt *evt) again: vhost_disable_notify(>dev, vq); - head = vhost_get_vq_desc(vq, vq->iov, - ARRAY_SIZE(vq->iov), , , - NULL, NULL); - if (head < 0) { + ret = vhost_get_avail_buf(vq, , + vq->iov, ARRAY_SIZE(vq->iov), , , + NULL, NULL); + if (ret < 0) { vs->vs_events_missed = true; return; } - if (head == vq->num) { + if (!ret) { if (vhost_enable_notify(>dev, vq)) goto again; vs->vs_events_missed = true; @@ -488,7 +503,7 @@ vhost_scsi_do_evt_work(struct vhost_scsi *vs, struct vhost_scsi_evt *evt) eventp = vq->iov[out].iov_base; ret = __copy_to_user(eventp, event, sizeof(*event)); if (!ret) - vhost_add_used_and_signal(>dev, vq, head, 0); + vhost_scsi_signal_noinput(>dev, vq, ); else vq_err(vq, "Faulted on vhost_scsi_send_event\n"); } @@ -549,7 +564,7 @@ static void vhost_scsi_complete_cmd_work(struct vhost_work *work) ret = copy_to_iter(_rsp, sizeof(v_rsp), _iter); if (likely(ret == sizeof(v_rsp))) { struct vhost_scsi_virtqueue *q; - vhost_add_used(cmd->tvc_vq, cmd->tvc_vq_desc, 0); + vhost_put_used_buf(cmd->tvc_vq, >tvc_vq_desc); q = container_of(cmd->tvc_vq, struct vhost_scsi_virtqueue, vq); vq = q - vs->vqs; __set_bit(vq, signal); @@ -793,7 +808,7 @@ static void 
vhost_scsi_submission_work(struct work_struct *work) static void vhost_scsi_send_bad_target(struct vhost_scsi *vs, struct vhost_virtqueue *vq, - int head, unsigned out) + struct vhost_buf *buf, unsigned out) { struct virtio_scsi_cmd_resp __user *resp; struct virtio_scsi_cmd_resp rsp; @@ -804,7 +819,7 @@ vhost_scsi_send_bad_target(struct vhost_scsi *vs, resp = vq->iov[out].iov_base; ret = __copy_to_user(resp, , sizeof(rsp)); if (!ret) - vhost_add_used_and_signal(>dev, vq, head, 0); + vhost_scsi_signal_noinput(>dev, vq, buf); else pr_err("Faulted on virtio_scsi_cmd_resp\n"); } @@ -813,21 +828,21 @@ static int vhost_scsi_get_desc(struct vhost_scsi *vs, struct vhost_virtqueue *vq, struct vhost_scsi_ctx *vc) { - int ret = -ENXIO; + int r, ret = -ENXIO; - vc->head =
[PATCH RFC 12/13] vhost/vsock: switch to the buf API
A straight-forward conversion. Signed-off-by: Michael S. Tsirkin --- drivers/vhost/vsock.c | 30 ++ 1 file changed, 18 insertions(+), 12 deletions(-) diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c index fb4e944c4d0d..07d1fb340fb4 100644 --- a/drivers/vhost/vsock.c +++ b/drivers/vhost/vsock.c @@ -103,7 +103,8 @@ vhost_transport_do_send_pkt(struct vhost_vsock *vsock, unsigned out, in; size_t nbytes; size_t iov_len, payload_len; - int head; + struct vhost_buf buf; + int ret; spin_lock_bh(>send_pkt_list_lock); if (list_empty(>send_pkt_list)) { @@ -117,16 +118,17 @@ vhost_transport_do_send_pkt(struct vhost_vsock *vsock, list_del_init(>list); spin_unlock_bh(>send_pkt_list_lock); - head = vhost_get_vq_desc(vq, vq->iov, ARRAY_SIZE(vq->iov), -, , NULL, NULL); - if (head < 0) { + ret = vhost_get_avail_buf(vq, , + vq->iov, ARRAY_SIZE(vq->iov), + , , NULL, NULL); + if (ret < 0) { spin_lock_bh(>send_pkt_list_lock); list_add(>list, >send_pkt_list); spin_unlock_bh(>send_pkt_list_lock); break; } - if (head == vq->num) { + if (!ret) { spin_lock_bh(>send_pkt_list_lock); list_add(>list, >send_pkt_list); spin_unlock_bh(>send_pkt_list_lock); @@ -186,7 +188,8 @@ vhost_transport_do_send_pkt(struct vhost_vsock *vsock, */ virtio_transport_deliver_tap_pkt(pkt); - vhost_add_used(vq, head, sizeof(pkt->hdr) + payload_len); + buf.in_len = sizeof(pkt->hdr) + payload_len; + vhost_put_used_buf(vq, ); added = true; pkt->off += payload_len; @@ -440,7 +443,8 @@ static void vhost_vsock_handle_tx_kick(struct vhost_work *work) struct vhost_vsock *vsock = container_of(vq->dev, struct vhost_vsock, dev); struct virtio_vsock_pkt *pkt; - int head, pkts = 0, total_len = 0; + int ret, pkts = 0, total_len = 0; + struct vhost_buf buf; unsigned int out, in; bool added = false; @@ -461,12 +465,13 @@ static void vhost_vsock_handle_tx_kick(struct vhost_work *work) goto no_more_replies; } - head = vhost_get_vq_desc(vq, vq->iov, ARRAY_SIZE(vq->iov), -, , NULL, NULL); - if (head < 0) + ret = 
vhost_get_avail_buf(vq, , + vq->iov, ARRAY_SIZE(vq->iov), + , , NULL, NULL); + if (ret < 0) break; - if (head == vq->num) { + if (!ret) { if (unlikely(vhost_enable_notify(>dev, vq))) { vhost_disable_notify(>dev, vq); continue; @@ -494,7 +499,8 @@ static void vhost_vsock_handle_tx_kick(struct vhost_work *work) virtio_transport_free_pkt(pkt); len += sizeof(pkt->hdr); - vhost_add_used(vq, head, len); + buf.in_len = len; + vhost_put_used_buf(vq, ); total_len += len; added = true; } while(likely(!vhost_exceeds_weight(vq, ++pkts, total_len))); -- MST
[PATCH RFC 01/13] vhost: option to fetch descriptors through an independent struct
The idea is to support multiple ring formats by converting to a format-independent array of descriptors. This costs extra cycles, but we gain the ability to fetch a batch of descriptors in one go, which is good for code cache locality. When used, this causes a minor performance degradation; it's been kept as simple as possible for ease of review. A follow-up patch gets us back the performance by adding batching. To simplify benchmarking, I kept the old code around so one can switch back and forth between old and new code. This will go away in the final submission. Signed-off-by: Michael S. Tsirkin Signed-off-by: Eugenio Pérez Link: https://lore.kernel.org/r/20200401183118.8334-2-epere...@redhat.com Signed-off-by: Michael S. Tsirkin --- drivers/vhost/vhost.c | 297 +- drivers/vhost/vhost.h | 16 +++ 2 files changed, 312 insertions(+), 1 deletion(-) diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c index 96d9871fa0cb..105fc97af2c8 100644 --- a/drivers/vhost/vhost.c +++ b/drivers/vhost/vhost.c @@ -298,6 +298,7 @@ static void vhost_vq_reset(struct vhost_dev *dev, struct vhost_virtqueue *vq) { vq->num = 1; + vq->ndescs = 0; vq->desc = NULL; vq->avail = NULL; vq->used = NULL; @@ -368,6 +369,9 @@ static int vhost_worker(void *data) static void vhost_vq_free_iovecs(struct vhost_virtqueue *vq) { + kfree(vq->descs); + vq->descs = NULL; + vq->max_descs = 0; kfree(vq->indirect); vq->indirect = NULL; kfree(vq->log); @@ -384,6 +388,10 @@ static long vhost_dev_alloc_iovecs(struct vhost_dev *dev) for (i = 0; i < dev->nvqs; ++i) { vq = dev->vqs[i]; + vq->max_descs = dev->iov_limit; + vq->descs = kmalloc_array(vq->max_descs, + sizeof(*vq->descs), + GFP_KERNEL); vq->indirect = kmalloc_array(UIO_MAXIOV, sizeof(*vq->indirect), GFP_KERNEL); @@ -391,7 +399,7 @@ static long vhost_dev_alloc_iovecs(struct vhost_dev *dev) GFP_KERNEL); vq->heads = kmalloc_array(dev->iov_limit, sizeof(*vq->heads), GFP_KERNEL); - if (!vq->indirect || !vq->log 
|| !vq->heads || !vq->descs) goto err_nomem; } return 0; @@ -2277,6 +2285,293 @@ int vhost_get_vq_desc(struct vhost_virtqueue *vq, } EXPORT_SYMBOL_GPL(vhost_get_vq_desc); +static struct vhost_desc *peek_split_desc(struct vhost_virtqueue *vq) +{ + BUG_ON(!vq->ndescs); + return >descs[vq->ndescs - 1]; +} + +static void pop_split_desc(struct vhost_virtqueue *vq) +{ + BUG_ON(!vq->ndescs); + --vq->ndescs; +} + +#define VHOST_DESC_FLAGS (VRING_DESC_F_INDIRECT | VRING_DESC_F_WRITE | \ + VRING_DESC_F_NEXT) +static int push_split_desc(struct vhost_virtqueue *vq, struct vring_desc *desc, u16 id) +{ + struct vhost_desc *h; + + if (unlikely(vq->ndescs >= vq->max_descs)) + return -EINVAL; + h = >descs[vq->ndescs++]; + h->addr = vhost64_to_cpu(vq, desc->addr); + h->len = vhost32_to_cpu(vq, desc->len); + h->flags = vhost16_to_cpu(vq, desc->flags) & VHOST_DESC_FLAGS; + h->id = id; + + return 0; +} + +static int fetch_indirect_descs(struct vhost_virtqueue *vq, + struct vhost_desc *indirect, + u16 head) +{ + struct vring_desc desc; + unsigned int i = 0, count, found = 0; + u32 len = indirect->len; + struct iov_iter from; + int ret; + + /* Sanity check */ + if (unlikely(len % sizeof desc)) { + vq_err(vq, "Invalid length in indirect descriptor: " + "len 0x%llx not multiple of 0x%zx\n", + (unsigned long long)len, + sizeof desc); + return -EINVAL; + } + + ret = translate_desc(vq, indirect->addr, len, vq->indirect, +UIO_MAXIOV, VHOST_ACCESS_RO); + if (unlikely(ret < 0)) { + if (ret != -EAGAIN) + vq_err(vq, "Translation failure %d in indirect.\n", ret); + return ret; + } + iov_iter_init(, READ, vq->indirect, ret, len); + + /* We will use the result as an address to read from, so most +* architectures only need a compiler barrier here. */ + read_barrier_depends(); + + count = len / sizeof desc; + /* Buffers are chained via a 16 bit next field, so +* we can have at most 2^16 of these. */ + if (unlikely(count > USHRT_MAX + 1)) { + vq_err(vq,
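`push_split_desc()`/`pop_split_desc()` above treat the per-vq `descs` array as a bounded stack: pushes are rejected once `max_descs` is reached, and only the three split-ring flag bits in `VHOST_DESC_FLAGS` survive the conversion, so stray bits from a buggy or malicious guest are dropped at the boundary. A userspace sketch of both properties (demo types, not the kernel code):

```c
#include <assert.h>
#include <stdint.h>

enum {
    F_NEXT     = 0x1,
    F_WRITE    = 0x2,
    F_INDIRECT = 0x4,
    /* Only these bits mean anything for a split-ring descriptor. */
    DESC_FLAGS = F_INDIRECT | F_WRITE | F_NEXT,
};

struct demo_stack {
    uint16_t flags[8];
    int n, max;
};

/* Bounded push that sanitizes flags, as push_split_desc() does. */
int push_desc(struct demo_stack *s, uint16_t raw_flags)
{
    if (s->n >= s->max)
        return -1;                        /* -EINVAL in the kernel */
    s->flags[s->n++] = raw_flags & DESC_FLAGS;
    return 0;
}

/* LIFO pop of the most recently fetched descriptor's flags. */
uint16_t pop_desc(struct demo_stack *s)
{
    return s->flags[--s->n];
}

/* Stray high bits set by the guest are masked off on push. */
uint16_t demo_mask(void)
{
    struct demo_stack s = { {0}, 0, 8 };

    push_desc(&s, 0xFF00 | F_WRITE | F_NEXT);
    return pop_desc(&s);
}

/* Pushing past max is rejected instead of overflowing the array. */
int demo_bound(void)
{
    struct demo_stack s = { {0}, 0, 1 };

    push_desc(&s, F_NEXT);
    return push_desc(&s, F_NEXT);
}
```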
Re: [PATCH] tcp: fix TCP socks unreleased in BBR mode
On Tue, Jun 2, 2020 at 1:05 AM wrote: > > From: Jason Xing > > TCP socks cannot be released because of the sock_hold() increasing the > sk_refcnt in the manner of tcp_internal_pacing() when RTO happens. > Therefore, this situation could increase the slab memory and then trigger > the OOM if the machine has been running for a long time. This issue, > however, can happen on some machines running for only a few days. > > We add one exception case to avoid unneeded use of sock_hold if the > pacing_timer is enqueued. > > Reproduce procedure: > 0) cat /proc/slabinfo | grep TCP > 1) switch net.ipv4.tcp_congestion_control to bbr > 2) use a tool such as wrk to send packets > 3) use tc to increase the delay in the dev to simulate the busy case. > 4) cat /proc/slabinfo | grep TCP > 5) kill the wrk command and observe the number of objects and slabs in TCP. > 6) at last, you could notice that the number would not decrease. > > Signed-off-by: Jason Xing > Signed-off-by: liweishi > Signed-off-by: Shujin Li > --- > net/ipv4/tcp_output.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c > index cc4ba42..5cf63d9 100644 > --- a/net/ipv4/tcp_output.c > +++ b/net/ipv4/tcp_output.c > @@ -969,7 +969,8 @@ static void tcp_internal_pacing(struct sock *sk, const > struct sk_buff *skb) > u64 len_ns; > u32 rate; > > - if (!tcp_needs_internal_pacing(sk)) > + if (!tcp_needs_internal_pacing(sk) || > + hrtimer_is_queued(&tcp_sk(sk)->pacing_timer)) > return; > rate = sk->sk_pacing_rate; > if (!rate || rate == ~0U) > -- > 1.8.3.1 > Hi Jason. Please do not send patches that do not apply to current upstream trees. Instead, backport to your kernels the needed fixes. I suspect that you are not using a pristine linux kernel, but some heavily modified one and something went wrong in your backports. Do not ask us to spend time finding what went wrong. Thank you.
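The leak pattern being fixed is generic: every timer arm takes a reference (`sock_hold()`) that only the timer callback releases (`sock_put()`), so re-arming an already-queued timer takes a second reference the single callback can never drop. A userspace sketch of the reference arithmetic, with and without the `hrtimer_is_queued()`-style guard (demo struct, not the kernel code):

```c
#include <assert.h>

struct demo_sock {
    int refcnt;
    int timer_queued;
};

/* Arm the pacing timer, taking a reference like sock_hold(). With
 * `guarded`, skip re-arming while a timer is already pending. */
void arm_pacing(struct demo_sock *sk, int guarded)
{
    if (guarded && sk->timer_queued)
        return;               /* already pending: don't re-arm */
    sk->refcnt++;             /* sock_hold() */
    sk->timer_queued = 1;
}

/* The timer callback drops exactly one reference (sock_put()). */
void timer_fire(struct demo_sock *sk)
{
    sk->timer_queued = 0;
    sk->refcnt--;
}

/* An RTO re-arms the timer before it fired. Balanced bookkeeping
 * ends back at the owner's single reference (1); the unguarded
 * version ends at 2, i.e. one reference leaked per spurious arm. */
int demo_refcnt(int guarded)
{
    struct demo_sock sk = { 1, 0 };   /* one reference held by owner */

    arm_pacing(&sk, guarded);
    arm_pacing(&sk, guarded);         /* re-arm before the timer fired */
    timer_fire(&sk);
    return sk.refcnt;
}
```

Leaked references keep the sock (and its slab objects) alive forever, which matches the growing `/proc/slabinfo` TCP counts in the report.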
[PATCH RFC 00/13] vhost: format independence
We let the specifics of the ring format seep through to vhost API callers - mostly because there was only one format, so it was hard to imagine what an independent API would look like. Now that there's an alternative in the form of the packed ring, it's easier to see the issues, and fixing them is perhaps the cleanest way to add support for more formats. This patchset does this by introducing two new structures: vhost_buf to represent a buffer and vhost_desc to represent a descriptor. Descriptors aren't normally of interest to devices but do occasionally get exposed e.g. for logging. Perhaps surprisingly, the higher level API actually makes things a bit easier for callers, as well as allows more freedom for the vhost core. The end result is basically unchanged performance (based on preliminary testing) even though we are forced to go through a format conversion. The conversion also exposed (more) bugs in vhost scsi - which isn't really surprising, that driver needs a lot more love than it's getting. Very lightly tested. Would appreciate feedback and testing. Michael S. Tsirkin (13): vhost: option to fetch descriptors through an independent struct vhost: use batched version by default vhost: batching fetches vhost: cleanup fetch_buf return code handling vhost/net: pass net specific struct pointer vhost: reorder functions vhost: format-independent API for used buffers vhost/net: convert to new API: heads->bufs vhost/net: avoid iov length math vhost/test: convert to the buf API vhost/scsi: switch to buf APIs vhost/vsock: switch to the buf API vhost: drop head based APIs drivers/vhost/net.c | 173 +-- drivers/vhost/scsi.c | 73 drivers/vhost/test.c | 22 +-- drivers/vhost/vhost.c | 375 +++--- drivers/vhost/vhost.h | 46 -- drivers/vhost/vsock.c | 30 ++-- 6 files changed, 436 insertions(+), 283 deletions(-) -- MST
[PATCH v14 04/15] mm/damon: Adaptively adjust regions
From: SeongJae Park At the beginning of the monitoring, DAMON constructs the initial regions by evenly splitting the memory mapped address space of the process into the user-specified minimal number of regions. In this initial state, the assumption of the regions (pages in same region have similar access frequencies) is normally not kept and thus the monitoring quality could be low. To keep the assumption as much as possible, DAMON adaptively merges and splits each region. For each ``aggregation interval``, it compares the access frequencies of adjacent regions and merges those if the frequency difference is small. Then, after it reports and clears the aggregated access frequency of each region, it splits each region into two regions if the total number of regions is smaller than the half of the user-specified maximum number of regions. In this way, DAMON provides its best-effort quality and minimal overhead while keeping the bounds users set for their trade-off. Signed-off-by: SeongJae Park --- include/linux/damon.h | 7 +- mm/damon.c| 174 +++--- 2 files changed, 169 insertions(+), 12 deletions(-) diff --git a/include/linux/damon.h b/include/linux/damon.h index f0fe4520a4e9..babdba6b5c47 100644 --- a/include/linux/damon.h +++ b/include/linux/damon.h @@ -56,6 +56,7 @@ struct damon_task { * @sample_interval: The time between access samplings. * @aggr_interval: The time between monitor results aggregations. * @min_nr_regions:The number of initial monitoring regions. + * @max_nr_regions:The maximum number of monitoring regions. * * For each @sample_interval, DAMON checks whether each region is accessed or * not. 
It aggregates and keeps the access information (number of accesses to @@ -81,6 +82,7 @@ struct damon_ctx { unsigned long sample_interval; unsigned long aggr_interval; unsigned long min_nr_regions; + unsigned long max_nr_regions; struct timespec64 last_aggregation; @@ -92,8 +94,9 @@ struct damon_ctx { }; int damon_set_pids(struct damon_ctx *ctx, int *pids, ssize_t nr_pids); -int damon_set_attrs(struct damon_ctx *ctx, unsigned long sample_int, - unsigned long aggr_int, unsigned long min_nr_reg); +int damon_set_attrs(struct damon_ctx *ctx, + unsigned long sample_int, unsigned long aggr_int, + unsigned long min_nr_reg, unsigned long max_nr_reg); int damon_start(struct damon_ctx *ctx); int damon_stop(struct damon_ctx *ctx); diff --git a/mm/damon.c b/mm/damon.c index b374544cdd78..6b8f038331f5 100644 --- a/mm/damon.c +++ b/mm/damon.c @@ -332,9 +332,12 @@ static int damon_three_regions_of(struct damon_task *t, * regions is wasteful. That said, because we can deal with small noises, * tracking every mapping is not strictly required but could even incur a high * overhead if the mapping frequently changes or the number of mappings is - * high. Nonetheless, this may seems very weird. DAMON's dynamic regions - * adjustment mechanism, which will be implemented with following commit will - * make this more sense. + * high. The adaptive regions adjustment mechanism will further help to deal + * with the noise by simply identifying the unmapped areas as a region that + * has no access. Moreover, applying the real mappings that would have many + * unmapped areas inside will make the adaptive mechanism quite complex. That + * said, too huge unmapped areas inside the monitoring target should be removed + * to not take the time for the adaptive mechanism. * * For the reason, we convert the complex mappings to three distinct regions * that cover every mapped area of the address space. 
Also the two gaps @@ -508,20 +511,25 @@ static void damon_check_access(struct damon_ctx *ctx, last_addr = r->sampling_addr; } -static void kdamond_check_accesses(struct damon_ctx *ctx) +static unsigned int kdamond_check_accesses(struct damon_ctx *ctx) { struct damon_task *t; struct mm_struct *mm; struct damon_region *r; + unsigned int max_nr_accesses = 0; damon_for_each_task(t, ctx) { mm = damon_get_mm(t); if (!mm) continue; - damon_for_each_region(r, t) + damon_for_each_region(r, t) { damon_check_access(ctx, mm, r); + max_nr_accesses = max(r->nr_accesses, max_nr_accesses); + } + mmput(mm); } + return max_nr_accesses; } /* @@ -570,6 +578,141 @@ static void kdamond_reset_aggregated(struct damon_ctx *c) } } +#define sz_damon_region(r) (r->vm_end - r->vm_start) + +/* + * Merge two adjacent regions into one region + */ +static void damon_merge_two_regions(struct damon_region *l, + struct damon_region *r) +{ + l->nr_accesses = (l->nr_accesses *
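The merge helper in the diff above is cut off mid-expression by the archive. As a hedged illustration of the idea the commit message describes (merging two adjacent regions into one, with a size-weighted average of their access frequencies), here is a small userspace model; the struct layout, helper names, and the threshold check are simplified assumptions for illustration, not the kernel code:

```c
#include <assert.h>

/*
 * Userspace model of the adaptive merge step: two adjacent regions are
 * merged into one, and the merged access frequency is the size-weighted
 * average of the two.  Simplified assumption, not the kernel structs.
 */
struct region {
	unsigned long vm_start;
	unsigned long vm_end;
	unsigned int nr_accesses;
};

unsigned long sz_region(const struct region *r)
{
	return r->vm_end - r->vm_start;
}

/* Merge r into l; l covers both address ranges afterwards. */
void merge_two_regions(struct region *l, const struct region *r)
{
	/* size-weighted average keeps the aggregate count sensible */
	l->nr_accesses = (l->nr_accesses * sz_region(l) +
			  r->nr_accesses * sz_region(r)) /
			 (sz_region(l) + sz_region(r));
	l->vm_end = r->vm_end;
}

/* Merge decision per aggregation interval: small difference => merge. */
int should_merge(const struct region *l, const struct region *r,
		 unsigned int thres)
{
	unsigned int diff = l->nr_accesses > r->nr_accesses ?
			l->nr_accesses - r->nr_accesses :
			r->nr_accesses - l->nr_accesses;

	return l->vm_end == r->vm_start && diff <= thres;
}
```

Splitting is the symmetric operation: when the region count is below half of the user-specified maximum, each region is cut in two so that merging never collapses the monitoring resolution permanently.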
[PATCH v14 03/15] mm/damon: Implement region based sampling
From: SeongJae Park

This commit implements DAMON's basic access check and region based sampling mechanisms. This change alone might seem to make little sense, mainly because it is only a part of DAMON's logic. The following two commits will make more sense.

Basic Access Check --

DAMON basically reports which pages are how frequently accessed. Note that the frequency is not an absolute number of accesses, but a relative frequency among the pages of the target workloads. Users can control the resolution of the reports by setting two time intervals, ``sampling interval`` and ``aggregation interval``. In detail, DAMON checks access to each page per ``sampling interval``, aggregates the results (counts the number of accesses to each page), and reports the aggregated results per ``aggregation interval``. For the access check of each page, DAMON uses the Accessed bits of PTEs. It is thus similar to common periodic access check based access tracking mechanisms, whose overhead increases with the size of the target process.

Region Based Sampling -

To avoid the unbounded increase of the overhead, DAMON groups adjacent pages that are assumed to have the same access frequency into a region. As long as the assumption (pages in a region have the same access frequency) holds, only one page in the region needs to be checked. Thus, for each ``sampling interval``, DAMON randomly picks one page in each region and clears its Accessed bit. After one more ``sampling interval``, DAMON reads the Accessed bit of the page and increases the access frequency of the region if the bit has been set meanwhile. Therefore, the monitoring overhead is controllable by setting the number of regions. Nonetheless, this scheme cannot preserve the quality of the output if the assumption does not hold. The following commit will introduce how DAMON keeps it with best effort.
Signed-off-by: SeongJae Park Reviewed-by: Leonard Foerster --- include/linux/damon.h | 48 +++- mm/damon.c| 597 ++ 2 files changed, 644 insertions(+), 1 deletion(-) diff --git a/include/linux/damon.h b/include/linux/damon.h index 13564929..f0fe4520a4e9 100644 --- a/include/linux/damon.h +++ b/include/linux/damon.h @@ -11,6 +11,8 @@ #define _DAMON_H_ #include +#include +#include #include /** @@ -44,11 +46,55 @@ struct damon_task { }; /** - * struct damon_ctx - Represents a context for each monitoring. + * struct damon_ctx - Represents a context for each monitoring. This is the + * main interface that allows users to set the attributes and get the results + * of the monitoring. + * + * For each monitoring request (damon_start()), a kernel thread for the + * monitoring is created. The pointer to the thread is stored in @kdamond. + * + * @sample_interval: The time between access samplings. + * @aggr_interval: The time between monitor results aggregations. + * @min_nr_regions:The number of initial monitoring regions. + * + * For each @sample_interval, DAMON checks whether each region is accessed or + * not. It aggregates and keeps the access information (number of accesses to + * each region) for @aggr_interval time. All time intervals are in + * micro-seconds. + * + * @kdamond: Kernel thread who does the monitoring. + * @kdamond_stop: Notifies whether kdamond should stop. + * @kdamond_lock: Mutex for the synchronizations with @kdamond. + * + * The monitoring thread sets @kdamond to NULL when it terminates. Therefore, + * users can know whether the monitoring is ongoing or terminated by reading + * @kdamond. Also, users can ask @kdamond to be terminated by writing non-zero + * to @kdamond_stop. Reads and writes to @kdamond and @kdamond_stop from + * outside of the monitoring thread must be protected by @kdamond_lock. + * + * Note that the monitoring thread protects only @kdamond and @kdamond_stop via + * @kdamond_lock. 
Accesses to other fields must be protected by themselves. + * * @tasks_list:Head of monitoring target tasks (_task) list. */ struct damon_ctx { + unsigned long sample_interval; + unsigned long aggr_interval; + unsigned long min_nr_regions; + + struct timespec64 last_aggregation; + + struct task_struct *kdamond; + bool kdamond_stop; + struct mutex kdamond_lock; + struct list_head tasks_list;/* 'damon_task' objects */ }; +int damon_set_pids(struct damon_ctx *ctx, int *pids, ssize_t nr_pids); +int damon_set_attrs(struct damon_ctx *ctx, unsigned long sample_int, + unsigned long aggr_int, unsigned long min_nr_reg); +int damon_start(struct damon_ctx *ctx); +int damon_stop(struct damon_ctx *ctx); + #endif diff --git a/mm/damon.c b/mm/damon.c index a5a7820ef0ad..b374544cdd78 100644 --- a/mm/damon.c +++ b/mm/damon.c @@ -10,10 +10,19 @@ #define pr_fmt(fmt) "damon: " fmt
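The sampling scheme described in the commit message (pick one random page per region, clear its Accessed bit, and credit the region if the bit is set again one interval later) can be modeled in plain userspace C. Everything below is an illustrative assumption with a fake "accessed" bitmap, not the PTE-walking kernel code:

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Userspace model of region based sampling: each "page" has an accessed
 * flag; per sampling interval one random page per region is picked and
 * its flag cleared, and on the next interval the region's counter is
 * bumped if the flag was set again meanwhile.
 */
#define NR_PAGES 64

struct toy_region {
	int first_page, last_page;	/* [first, last] inclusive */
	int sampling_page;		/* page checked this interval */
	unsigned int nr_accesses;
};

int rand_in(int l, int r)		/* random number in [l, r) */
{
	return l + rand() % (r - l);
}

/* Start of a sampling interval: pick a page, clear its accessed flag. */
void prepare_sample(struct toy_region *reg, int accessed[])
{
	reg->sampling_page = rand_in(reg->first_page, reg->last_page + 1);
	accessed[reg->sampling_page] = 0;
}

/* End of the interval: credit the region if the flag was set again. */
void check_sample(struct toy_region *reg, const int accessed[])
{
	if (accessed[reg->sampling_page])
		reg->nr_accesses++;
}
```

Because only one page per region is touched per interval, the check cost scales with the number of regions, not with the size of the target process, which is exactly the trade-off the commit message describes.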
[PATCH v14 02/15] mm: Introduce Data Access MONitor (DAMON)
From: SeongJae Park This commit introduces a kernel module named DAMON. Note that this commit is implementing only the stub for the module load/unload, basic data structures, and simple manipulation functions of the structures to keep the size of commit small. The core mechanisms of DAMON will be implemented one by one by following commits. Signed-off-by: SeongJae Park Reviewed-by: Leonard Foerster --- include/linux/damon.h | 54 + mm/Kconfig| 12 +++ mm/Makefile | 1 + mm/damon.c| 173 ++ 4 files changed, 240 insertions(+) create mode 100644 include/linux/damon.h create mode 100644 mm/damon.c diff --git a/include/linux/damon.h b/include/linux/damon.h new file mode 100644 index ..13564929 --- /dev/null +++ b/include/linux/damon.h @@ -0,0 +1,54 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * DAMON api + * + * Copyright 2019-2020 Amazon.com, Inc. or its affiliates. + * + * Author: SeongJae Park + */ + +#ifndef _DAMON_H_ +#define _DAMON_H_ + +#include +#include + +/** + * struct damon_region - Represents a monitoring target region of + * [@vm_start, @vm_end). + * + * @vm_start: Start address of the region (inclusive). + * @vm_end:End address of the region (exclusive). + * @sampling_addr: Address of the sample for the next access check. + * @nr_accesses: Access frequency of this region. + * @list: List head for siblings. + */ +struct damon_region { + unsigned long vm_start; + unsigned long vm_end; + unsigned long sampling_addr; + unsigned int nr_accesses; + struct list_head list; +}; + +/** + * struct damon_task - Represents a monitoring target task. + * @pid: Process id of the task. + * @regions_list: Head of the monitoring target regions of this task. + * @list: List head for siblings. + */ +struct damon_task { + int pid; + struct list_head regions_list; + struct list_head list; +}; + +/** + * struct damon_ctx - Represents a context for each monitoring. + * @tasks_list:Head of monitoring target tasks (_task) list. 
+ */ +struct damon_ctx { + struct list_head tasks_list;/* 'damon_task' objects */ +}; + +#endif diff --git a/mm/Kconfig b/mm/Kconfig index c1acc34c1c35..ecea0889ea35 100644 --- a/mm/Kconfig +++ b/mm/Kconfig @@ -867,4 +867,16 @@ config ARCH_HAS_HUGEPD config MAPPING_DIRTY_HELPERS bool +config DAMON + tristate "Data Access Monitor" + depends on MMU + help + Provides data access monitoring. + + DAMON is a kernel module that allows users to monitor the actual + memory access pattern of specific user-space processes. It aims to + be 1) accurate enough to be useful for performance-centric domains, + and 2) sufficiently light-weight so that it can be applied online. + If unsure, say N. + endmenu diff --git a/mm/Makefile b/mm/Makefile index fccd3756b25f..230e545b6e07 100644 --- a/mm/Makefile +++ b/mm/Makefile @@ -112,3 +112,4 @@ obj-$(CONFIG_MEMFD_CREATE) += memfd.o obj-$(CONFIG_MAPPING_DIRTY_HELPERS) += mapping_dirty_helpers.o obj-$(CONFIG_PTDUMP_CORE) += ptdump.o obj-$(CONFIG_PAGE_REPORTING) += page_reporting.o +obj-$(CONFIG_DAMON) += damon.o diff --git a/mm/damon.c b/mm/damon.c new file mode 100644 index ..a5a7820ef0ad --- /dev/null +++ b/mm/damon.c @@ -0,0 +1,173 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Data Access Monitor + * + * Copyright 2019-2020 Amazon.com, Inc. or its affiliates. 
+ * + * Author: SeongJae Park + */ + +#define pr_fmt(fmt) "damon: " fmt + +#include +#include +#include +#include + +#define damon_get_task_struct(t) \ + (get_pid_task(find_vpid(t->pid), PIDTYPE_PID)) + +#define damon_next_region(r) \ + (container_of(r->list.next, struct damon_region, list)) + +#define damon_prev_region(r) \ + (container_of(r->list.prev, struct damon_region, list)) + +#define damon_for_each_region(r, t) \ + list_for_each_entry(r, >regions_list, list) + +#define damon_for_each_region_safe(r, next, t) \ + list_for_each_entry_safe(r, next, >regions_list, list) + +#define damon_for_each_task(t, ctx) \ + list_for_each_entry(t, &(ctx)->tasks_list, list) + +#define damon_for_each_task_safe(t, next, ctx) \ + list_for_each_entry_safe(t, next, &(ctx)->tasks_list, list) + +/* Get a random number in [l, r) */ +#define damon_rand(l, r) (l + prandom_u32() % (r - l)) + +/* + * Construct a damon_region struct + * + * Returns the pointer to the new struct if success, or NULL otherwise + */ +static struct damon_region *damon_new_region(struct damon_ctx *ctx, + unsigned long vm_start, unsigned long vm_end) +{ + struct damon_region *region; + + region = kmalloc(sizeof(*region), GFP_KERNEL); + if (!region) + return NULL; + + region->vm_start
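The macros in the patch above are thin wrappers around the kernel's circular doubly-linked lists and container_of(). A self-contained userspace model of the same bookkeeping (a tiny list implementation standing in for <linux/list.h>, plus a damon_new_region()-style constructor; all names here are illustrative assumptions) looks like this:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Minimal circular doubly-linked list, modeled after <linux/list.h>. */
struct list_head { struct list_head *prev, *next; };

void list_init(struct list_head *h)
{
	h->prev = h->next = h;
}

void list_add_tail(struct list_head *n, struct list_head *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

/* Recover the containing struct from an embedded list_head. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct toy_damon_region {
	unsigned long vm_start, vm_end;
	struct list_head list;
};

/* Counterpart of damon_new_region(): allocate and initialize a region. */
struct toy_damon_region *new_region(unsigned long s, unsigned long e)
{
	struct toy_damon_region *r = malloc(sizeof(*r));

	if (!r)
		return NULL;
	r->vm_start = s;
	r->vm_end = e;
	return r;
}
```

damon_next_region()/damon_prev_region() in the patch are then just container_of() on `list.next`/`list.prev`, and the for_each macros iterate the embedded list heads the same way.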
Re: Security Random Number Generator support
On 2020-06-02 13:14, Ard Biesheuvel wrote: On Tue, 2 Jun 2020 at 10:15, Neal Liu wrote: These patch series introduce a security random number generator which provides a generic interface to get hardware rnd from Secure state. The Secure state can be Arm Trusted Firmware(ATF), Trusted Execution Environment(TEE), or even EL2 hypervisor. Patch #1..2 adds sec-rng kernel driver for Trustzone based SoCs. For security awareness SoCs on ARMv8 with TrustZone enabled, peripherals like entropy sources is not accessible from normal world (linux) and rather accessible from secure world (HYP/ATF/TEE) only. This driver aims to provide a generic interface to Arm Trusted Firmware or Hypervisor rng service. changes since v1: - rename mt67xx-rng to mtk-sec-rng since all MediaTek ARMv8 SoCs can reuse this driver. - refine coding style and unnecessary check. changes since v2: - remove unused comments. - remove redundant variable. changes since v3: - add dt-bindings for MediaTek rng with TrustZone enabled. - revise HWRNG SMC call fid. changes since v4: - move bindings to the arm/firmware directory. - revise driver init flow to check more property. changes since v5: - refactor to more generic security rng driver which is not platform specific. *** BLURB HERE *** Neal Liu (2): dt-bindings: rng: add bindings for sec-rng hwrng: add sec-rng driver There is no reason to model a SMC call as a driver, and represent it via a DT node like this. +1. It would be much better if this SMC interface is made truly generic, and wired into the arch_get_random() interface, which can be used much earlier. Wasn't there a plan to standardize a SMC call to rule them all? M. -- Who you jivin' with that Cosmik Debris?
[PATCH v14 01/15] mm/page_ext: Export lookup_page_ext() to GPL modules
From: SeongJae Park This commit exports 'lookup_page_ext()' to GPL modules. This will be used by DAMON. Signed-off-by: SeongJae Park Reviewed-by: Leonard Foerster --- mm/page_ext.c | 1 + 1 file changed, 1 insertion(+) diff --git a/mm/page_ext.c b/mm/page_ext.c index a3616f7a0e9e..9d802d01fcb5 100644 --- a/mm/page_ext.c +++ b/mm/page_ext.c @@ -131,6 +131,7 @@ struct page_ext *lookup_page_ext(const struct page *page) MAX_ORDER_NR_PAGES); return get_entry(base, index); } +EXPORT_SYMBOL_GPL(lookup_page_ext); static int __init alloc_node_page_ext(int nid) { -- 2.17.1
[PATCH v14 00/15] Introduce Data Access MONitor (DAMON)
From: SeongJae Park

Introduction

DAMON is a data access monitoring framework subsystem for the Linux kernel. The core mechanisms of DAMON, called 'region based sampling' and 'adaptive regions adjustment' (refer to :doc:`mechanisms` for the details), make it

- accurate (The monitored information is useful for DRAM level memory management. It might not be appropriate for cache-level accuracy, though.),
- light-weight (The monitoring overhead is low enough to be applied online while making no impact on the performance of the target workloads.), and
- scalable (The upper-bound of the instrumentation overhead is controllable regardless of the size of target workloads.).

Using this framework, therefore, the kernel's core memory management mechanisms including reclamation and THP can be optimized for better memory management. Experimental memory management optimization works that incur high instrumentation overhead will be able to have another try. In user space, meanwhile, users who have some special workloads will be able to write personalized tools or applications for deeper understanding and specialized optimizations of their systems.

Evaluations ===

We evaluated DAMON's overhead, monitoring quality and usefulness using 25 realistic workloads on a QEMU/KVM based virtual machine running a kernel to which the v13 DAMON patchset is applied.

DAMON is lightweight. It increases system memory usage by only -0.39% and consumes less than 1% CPU time in most cases. It slows target workloads down by only 0.63%.

DAMON is accurate and useful for memory management optimizations. An experimental DAMON-based operation scheme for THP, 'ethp', removes 69.43% of THP memory overheads while preserving 37.11% of THP speedup. Another experimental DAMON-based 'proactive reclamation' implementation, 'prcl', reduces 89.30% of resident sets and 22.40% of system memory footprint while incurring only 1.98% runtime overhead in the best case (parsec3/freqmine).
NOTE that the experimental THP optimization and proactive reclamation are not for production, only proofs of concept.

Please refer to the official document[1] or the "Documentation/admin-guide/mm: Add a document for DAMON" patch in this patchset for detailed evaluation setup and results.

[1] https://damonitor.github.io/doc/html/latest-damon

More Information

We prepared a showcase web site[1] where you can get more information. There are

- the official documentation[2],
- the heatmap format dynamic access pattern of various realistic workloads for heap area[3], mmap()-ed area[4], and stack[5] area,
- the dynamic working set size distribution[6] and chronological working set size changes[7], and
- the latest performance test results[8].

[1] https://damonitor.github.io/_index [2] https://damonitor.github.io/doc/html/latest-damon [3] https://damonitor.github.io/test/result/visual/latest/heatmap.0.html [4] https://damonitor.github.io/test/result/visual/latest/heatmap.1.html [5] https://damonitor.github.io/test/result/visual/latest/heatmap.2.html [6] https://damonitor.github.io/test/result/visual/latest/wss_sz.html [7] https://damonitor.github.io/test/result/visual/latest/wss_time.html [8] https://damonitor.github.io/test/result/perf/latest/html/index.html

Baseline and Complete Git Trees ===

The patches are based on v5.7. You can also clone the complete git tree:

$ git clone git://github.com/sjp38/linux -b damon/patches/v14

The web is also available: https://github.com/sjp38/linux/releases/tag/damon/patches/v14

There are a couple of trees for the entire DAMON patchset series, including future features. The first one[1] contains the changes for the latest release, while the other one[2] contains the changes for the next release.
[1] https://github.com/sjp38/linux/tree/damon/master [2] https://github.com/sjp38/linux/tree/damon/next

Sequence Of Patches ===

The 1st patch exports 'lookup_page_ext()' to GPL modules so that it can be used by DAMON even though it is built as a loadable module.

The next four patches implement the core of DAMON and its programming interface. The 2nd patch introduces the DAMON module, its data structures, and data structure related common functions. The following three patches (3rd to 5th) implement the core mechanisms of DAMON, namely regions based sampling (patch 3), adaptive regions adjustment (patch 4), and dynamic memory mapping change adoption (patch 5).

The following four patches are for low level users of DAMON. The 6th patch implements callbacks for each of the monitoring steps so that users can do whatever they want with the access patterns. The 7th one implements recording of access patterns in DAMON for better convenience and efficiency. Each of the next two patches (8th and 9th) respectively adds a debugfs interface for privileged people and/or programs in user space, and a tracepoint for tracepoint supporting tracers such as perf.
Re: [RESEND PATCH v1 6/6] staging: greybus: audio: Enable GB codec, audio module compilation.
On Tue, Jun 02, 2020 at 10:51:15AM +0530, Vaibhav Agarwal wrote: > Currently, GB codec and audio module is conditionally compiled based on > GREYBUS_AUDIO_MSM8994. However, audio module is not dependent on MSM8994 > platform and can be used generically with any platform that follows > GB Audio class specification. > > Also, GB codec driver corresponds to dummy codec represented by I2S port > available on Toshiba AP Bridge. Added config option for the same in > kconfig file and accordingly updated Makefile. > This commit message was a bit confusing to me. Just say: "Currently you can't enable the Grey Bus Audio Codec because there is no entry for it in the Kconfig file. Originally the config name was going to be AUDIO_MSM8994 but that's not correct because other types of hardware are supported now. I have chosen the name AUDIO_APB_CODEC instead. Also I had to update the dependencies for GREYBUS_AUDIO to make the compile work." Otherwise this looks fine. regards, dan carpenter
Re: [PATCH v2 09/12] iio: imu: inv_icm42600: add buffer support in iio devices
Hi Jonathan,

I agree that this multiplexed watermark computation is anything but simple and clear to understand. I will add more documentation about it. It also triggered a kbuild robot issue, because it is using a 64-bit modulo without using the math64 macros.

For buffer preenable/postenable/..., the sequence I am using currently is:
- preenable -> turn chip on (pm_runtime_get)
- update_scan_mode -> set FIFO en bits configuration (which sensor data is going into the fifo)
- hwfifo_watermark -> compute and set watermark value
- postenable -> turn FIFO on (multiplexed with a FIFO-on counter since it is used by accel & gyro)
- predisable -> turn FIFO off (multiplexed with counter)
- postdisable -> turn chip off (pm_runtime_put)

This setting is working well. It is good to note that if there is an error when enabling the buffer, postdisable will always get called after preenable, so the pm_runtime reference counter is always kept balanced.

Another way would be to only store the configuration in internal state with update_scan_mode and hwfifo_watermark, and do everything in postenable/predisable. This is a possibility, but it makes things a little more complex.

For hwfifo flush, this is an interesting feature when there is a need to have data immediately, or when there is a need to do a clean change of configuration. In Android systems, the framework mainly uses FIFO flush to change the sensor configuration (ODR, watermark) in a clean way. For our case with the FIFO interleaved this is not an issue. If there are samples from the 2 sensors, it means the 2 buffers are enabled. And if data comes to the iio buffer sooner than expected, that should not be a problem. The limitation I see when the 2 sensors are running is that we will return less data than would have been possible. I limit FIFO reading to the provided n bytes, so we could read less than n samples of 1 sensor.
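The guarantee stated above, that postdisable always runs after preenable so the runtime-PM reference count stays balanced even on error, can be sketched as a toy refcount model. The names and the failing step are purely illustrative assumptions, not the driver's actual code:

```c
#include <assert.h>

/*
 * Toy model of the buffer enable/disable sequencing: preenable takes a
 * runtime-PM style reference, postdisable drops it, and the teardown
 * half runs even when a step in between fails.
 */
struct toy_dev {
	int pm_refcount;
	int fifo_on;
};

int preenable(struct toy_dev *d)   { d->pm_refcount++; return 0; }
int postenable(struct toy_dev *d)  { d->fifo_on = 1; return 0; }
void predisable(struct toy_dev *d)  { d->fifo_on = 0; }
void postdisable(struct toy_dev *d) { d->pm_refcount--; }

/* Enable path where a middle step may fail; teardown always balances. */
int enable_buffer(struct toy_dev *d, int middle_step_fails)
{
	int ret = preenable(d);

	if (!ret && middle_step_fails)
		ret = -1;		/* e.g. watermark setup failed */
	if (!ret)
		ret = postenable(d);
	if (ret) {
		predisable(d);
		postdisable(d);		/* always called after preenable */
	}
	return ret;
}
```

Whatever happens between preenable and postenable, every successful preenable is paired with exactly one postdisable, which is the property that keeps the pm_runtime counter correct.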
Something I have in mind: it would be really interesting to be able to set/change the watermark value while the buffer is enabled. Otherwise, we always lose events by turning the sensor off when we want to change the value. Is there any limitation to working this way, or should it be possible to implement this feature in the future?

Thanks, JB

From: Jonathan Cameron Sent: Sunday, May 31, 2020 14:56 To: Jean-Baptiste Maneyrol Cc: robh...@kernel.org ; r...@kernel.org ; mchehab+hua...@kernel.org ; da...@davemloft.net ; gre...@linuxfoundation.org ; linux-...@vger.kernel.org ; devicet...@vger.kernel.org ; linux-kernel@vger.kernel.org Subject: Re: [PATCH v2 09/12] iio: imu: inv_icm42600: add buffer support in iio devices

On Wed, 27 May 2020 20:57:08 +0200 Jean-Baptiste Maneyrol wrote: > Add all FIFO parsing and reading functions. Add accel and gyro > kfifo buffer and FIFO data parsing. Use device interrupt for > reading data FIFO and launching accel and gyro parsing. > > Support hwfifo watermark by multiplexing gyro and accel settings. > Support hwfifo flush. Both of these are complex given the interactions of the two sensor types and to be honest I couldn't figure out exactly what the intent was. Needs more docs!
Thanks, Jonathan > > Signed-off-by: Jean-Baptiste Maneyrol > --- > drivers/iio/imu/inv_icm42600/Kconfig | 1 + > drivers/iio/imu/inv_icm42600/Makefile | 1 + > drivers/iio/imu/inv_icm42600/inv_icm42600.h | 8 + > .../iio/imu/inv_icm42600/inv_icm42600_accel.c | 160 - > .../imu/inv_icm42600/inv_icm42600_buffer.c | 555 ++ > .../imu/inv_icm42600/inv_icm42600_buffer.h | 98 > .../iio/imu/inv_icm42600/inv_icm42600_core.c | 30 + > .../iio/imu/inv_icm42600/inv_icm42600_gyro.c | 160 - > 8 files changed, 1011 insertions(+), 2 deletions(-) > create mode 100644 drivers/iio/imu/inv_icm42600/inv_icm42600_buffer.c > create mode 100644 drivers/iio/imu/inv_icm42600/inv_icm42600_buffer.h > > diff --git a/drivers/iio/imu/inv_icm42600/Kconfig > b/drivers/iio/imu/inv_icm42600/Kconfig > index 22390a72f0a3..50cbcfcb6cf1 100644 > --- a/drivers/iio/imu/inv_icm42600/Kconfig > +++ b/drivers/iio/imu/inv_icm42600/Kconfig > @@ -2,6 +2,7 @@ > > config INV_ICM42600 > tristate > + select IIO_BUFFER > > config INV_ICM42600_I2C > tristate "InvenSense ICM-426xx I2C driver" > diff --git a/drivers/iio/imu/inv_icm42600/Makefile > b/drivers/iio/imu/inv_icm42600/Makefile > index 48965824f00c..0f49f6df3647 100644 > --- a/drivers/iio/imu/inv_icm42600/Makefile > +++ b/drivers/iio/imu/inv_icm42600/Makefile > @@ -5,6 +5,7 @@ inv-icm42600-y += inv_icm42600_core.o > inv-icm42600-y += inv_icm42600_gyro.o > inv-icm42600-y += inv_icm42600_accel.o > inv-icm42600-y += inv_icm42600_temp.o > +inv-icm42600-y +=
Re: [Linux-stm32] [PATCH v8 08/10] drm: stm: dw-mipi-dsi: let the bridge handle the HW version check
Hi Adrian, On Mon, 1 Jun 2020 at 10:14, Adrian Ratiu wrote: > > On Fri, 29 May 2020, Philippe CORNU wrote: > > Hi Adrian, and thank you very much for the patchset. Thank you > > also for having tested it on STM32F769 and STM32MP1. Sorry for > > the late response, Yannick and I will review it as soon as > > possible and we will keep you posted. Note: Do not hesitate to > > put us in copy for the next version (philippe.co...@st.com, > > yannick.fer...@st.com) Regards, Philippe :-) > > Hi Philippe, > > Thank you very much for your previous and future STM testing, > really appreciate it! I've CC'd Yannick until now but I'll also CC > you for sure :) > > It's been over a month since I posted v8 and I was just gearing up > to address all feedback, rebase & retest to prepare v9, but I'll > wait a little longer, no problem, it's no rush. > Small idea, pardon for joining so late: it might be a good idea to add an inline comment explaining why the clocks are disabled so late, effectively a two-line version of the commit summary. Feel free to make that a separate/follow-up patch. -Emil
[GIT PULL] MIPS changes for v5.8-rc1
Hi Linus, this is the pull request for MIPS. A test merge between your current master (f359287765c0) and mips-next showed a conflict in arch/mips/include/asm/module.h between commit: 62d0fd591db1 ("arch: split MODULE_ARCH_VERMAGIC definitions out to ") from Linus' tree and commits: ab7c01fdc3cf ("mips: Add MIPS Release 5 support") 281e3aea35e5 ("mips: Add MIPS Warrior P5600 support") from the mips tree. Conflict resolution from Stephen Rothwell is: From: Stephen Rothwell Date: Mon, 25 May 2020 10:03:07 +1000 Subject: [PATCH] mips: vermagic updates Signed-off-by: Stephen Rothwell --- arch/mips/include/asm/vermagic.h | 6 ++ 1 file changed, 6 insertions(+) diff --git a/arch/mips/include/asm/vermagic.h b/arch/mips/include/asm/vermagic.h index 24dc3d35161c..4d2dae0c7c57 100644 --- a/arch/mips/include/asm/vermagic.h +++ b/arch/mips/include/asm/vermagic.h @@ -8,12 +8,16 @@ #define MODULE_PROC_FAMILY "MIPS32_R1 " #elif defined CONFIG_CPU_MIPS32_R2 #define MODULE_PROC_FAMILY "MIPS32_R2 " +#elif defined CONFIG_CPU_MIPS32_R5 +#define MODULE_PROC_FAMILY "MIPS32_R5 " #elif defined CONFIG_CPU_MIPS32_R6 #define MODULE_PROC_FAMILY "MIPS32_R6 " #elif defined CONFIG_CPU_MIPS64_R1 #define MODULE_PROC_FAMILY "MIPS64_R1 " #elif defined CONFIG_CPU_MIPS64_R2 #define MODULE_PROC_FAMILY "MIPS64_R2 " +#elif defined CONFIG_CPU_MIPS64_R5 +#define MODULE_PROC_FAMILY "MIPS64_R5 " #elif defined CONFIG_CPU_MIPS64_R6 #define MODULE_PROC_FAMILY "MIPS64_R6 " #elif defined CONFIG_CPU_R3000 @@ -46,6 +50,8 @@ #define MODULE_PROC_FAMILY "LOONGSON64 " #elif defined CONFIG_CPU_CAVIUM_OCTEON #define MODULE_PROC_FAMILY "OCTEON " +#elif defined CONFIG_CPU_P5600 +#define MODULE_PROC_FAMILY "P5600 " #elif defined CONFIG_CPU_XLR #define MODULE_PROC_FAMILY "XLR " #elif defined CONFIG_CPU_XLP There were two more conflicts in linux-next: Conflict in arch/mips/lasat/sysctl.c between commit: 10760dde9be3 ("MIPS: Remove support for LASAT") from the mips tree and commit: 32927393dc1c ("sysctl: pass kernel pointers
to ->proc_handler") from the vfs tree. Here the resolution is to keep arch/mips/lasat/sysctl.c deleted. Conflict in mm/memory.c between commit: 7df676974359 ("mm/memory.c: Update local TLB if PTE entry exists") from the mips tree and commit: e325f89fdd69 ("mm: memcontrol: convert anon and file-thp to new mem_cgroup_charge() API") from the akpm-current tree. Agreed resolution is: "We decided that the update_mmu_tlb() call is no longer needed here, so there is no need to re-add it when resolving this." Thomas. The following changes since commit 8f3d9f354286745c751374f5f1fcafee6b3f3136: Linux 5.7-rc1 (2020-04-12 12:35:55 -0700) are available in the Git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux tags/mips_5.8 for you to fetch changes up to 9bd0bd264578fe191bf5d2ff23f9887b91862536: MIPS: ralink: drop ralink_clk_init for mt7621 (2020-05-31 14:17:00 +0200) MIPS updates for v5.8: - added support for MIPSr5 and P5600 cores - converted Loongson PCI driver into a PCI host driver using the generic PCI framework - added emulation of CPUCFG command for Loogonson64 cpus - removed of LASAT, PMC MSP71xx and NEC MARKEINS/EMMA - ioremap cleanup - fix for a race between two threads faulting the same page - various cleanups and fixes Arnd Bergmann (1): mips: loongsoon2ef: remove private clk api Ben Hutchings (1): MIPS: Fix exception handler memcpy() Bibo Mao (4): MIPS: Do not flush tlb page when updating PTE entry mm/memory.c: Update local TLB if PTE entry exists mm/memory.c: Add memory read privilege on page fault handling MIPS: mm: add page valid judgement in function pte_modify Bin Meng (1): mips: Drop CONFIG_MTD_M25P80 in various defconfig files Christoph Hellwig (9): ASoC: txx9: don't work around too small resource_size_t MIPS: remove cpu_has_64bit_addresses MIPS: cleanup fixup_bigphys_addr handling MIPS: merge __ioremap_mode into ioremap_prot MIPS: split out the 64-bit ioremap implementation MIPS: move ioremap_prot und iounmap out of line MIPS: use 
ioremap_page_range ASoC: txx9: add back the hack for a too small resource_size_t MIPS: unexport __flush_icache_user_range Christophe JAILLET (1): MIPS: ralink: bootrom: mark a function as __init to save some memory Chuanhong Guo (1): MIPS: ralink: drop ralink_clk_init for mt7621 Daniel González Cabanelas (1): MIPS: BCM63XX: fix BCM6358 GPIO count Geert Uytterhoeven (1): MIPS: ingenic: Replace by Guoyun Sun (1): mips/mm: Add page soft dirty tracking Gustavo A. R. Silva (1): MIPS: Replace zero-length array with flexible-array H. Nikolaus Schaller (1): MIPS: DTS: CI20: make DM9000 Ethernet controller use NVMEM to find
Re: [PATCH v3 015/105] drm/vc4: hvs: Boost the core clock during modeset
Hi Eric, On Wed, May 27, 2020 at 09:33:44AM -0700, Eric Anholt wrote: > On Wed, May 27, 2020 at 8:49 AM Maxime Ripard wrote: > > > > In order to prevent timeouts and stalls in the pipeline, the core clock > > needs to be maxed at 500MHz during a modeset on the BCM2711. > > Like, the whole system's core clock? Yep, unfortunately... > How is it reasonable for some device driver to crank the system's core > clock up and back down to some fixed-in-the-driver frequency? Sounds > like you need some sort of opp thing here. That frequency is the minimum rate of that clock. However, since other devices have similar requirements (unicam in particular) with different minimum requirements, we will switch to setting a minimum rate instead of enforcing a particular rate, so that patch would be essentially s/clk_set_rate/clk_set_min_rate/. Would that work for you? > > Patch 13,14 r-b. Thanks! Maxime
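The suggested s/clk_set_rate/clk_set_min_rate/ boils down to aggregating per-consumer floors instead of forcing one rate: each consumer states its own minimum, and the effective floor is the maximum of the requested minimums. A userspace sketch of that aggregation (an illustration of the composition rule only, not the clk framework):

```c
#include <assert.h>

/*
 * Model of why per-consumer minimum-rate requests compose better than a
 * fixed clk_set_rate(): consumers (e.g. the HVS during a modeset, or
 * another block like unicam) each state a floor, and the clock must run
 * at least as fast as the largest one.  Names are illustrative.
 */
#define MAX_CONSUMERS 8

struct toy_clk {
	unsigned long min_req[MAX_CONSUMERS];
	int nr;
};

/* Record one consumer's minimum-rate requirement. */
void request_min_rate(struct toy_clk *c, unsigned long rate)
{
	if (c->nr < MAX_CONSUMERS)
		c->min_req[c->nr++] = rate;
}

/* The clock must satisfy every consumer: take the largest floor. */
unsigned long effective_floor(const struct toy_clk *c)
{
	unsigned long floor = 0;
	int i;

	for (i = 0; i < c->nr; i++)
		if (c->min_req[i] > floor)
			floor = c->min_req[i];
	return floor;
}
```

With this composition, no driver has to know the other consumers' requirements; dropping one request simply lets the floor fall back to the next-largest one.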