Re: [PATCH v3 3/3] powerpc/module64: Use symbolic instructions names.
Christophe Leroy writes:
> On 08/07/2019 at 02:56, Michael Ellerman wrote:
>> Christophe Leroy writes:
>>> To increase readability/maintainability, replace hard coded
>>> instructions values by symbolic names.
>>>
>>> Signed-off-by: Christophe Leroy
>>> ---
>>> v3: fixed warning by adding () in an 'if' around X | Y (unlike said in v2
>>> history, this change was forgotten in v2)
>>> v2: rearranged comments
>>>
>>> arch/powerpc/kernel/module_64.c | 53
>>> +++--
>>> 1 file changed, 35 insertions(+), 18 deletions(-)
>>>
>>> diff --git a/arch/powerpc/kernel/module_64.c
>>> b/arch/powerpc/kernel/module_64.c
>>> index c2e1b06253b8..b33a5d5e2d35 100644
>>> --- a/arch/powerpc/kernel/module_64.c
>>> +++ b/arch/powerpc/kernel/module_64.c
>>> @@ -704,18 +711,21 @@ int apply_relocate_add(Elf64_Shdr *sechdrs,
>> ...
>>> 	/*
>>> 	 * If found, replace it with:
>>> 	 * addis r2, r12, (.TOC.-func)@ha
>>> 	 * addi r2, r12, (.TOC.-func)@l
>>> 	 */
>>> -	((uint32_t *)location)[0] = 0x3c4c + PPC_HA(value);
>>> -	((uint32_t *)location)[1] = 0x3842 + PPC_LO(value);
>>> +	((uint32_t *)location)[0] = PPC_INST_ADDIS |
>>> __PPC_RT(R2) |
>>> +		__PPC_RA(R12) |
>>> PPC_HA(value);
>>> +	((uint32_t *)location)[1] = PPC_INST_ADDI |
>>> __PPC_RT(R2) |
>>> +		__PPC_RA(R12) |
>>> PPC_LO(value);
>>> 	break;
>>
>> This was crashing and it's amazing how long you can stare at a
>> disassembly and not see the difference between `r2` and `r12` :)
>
> Argh, yes. I was misled by the comment I guess. Sorry for that and
> thanks for fixing.

No worries, yes the comment was the problem. I fixed that as well.

cheers
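For readers following the r2/r12 mix-up, here is a minimal sketch of how the two instruction words are put together. It assumes the standard Power ISA D-form layout (primary opcode in the top 6 bits, RT shifted left by 21, RA shifted left by 16, 16-bit immediate at the bottom). The names INST_ADDI, INST_ADDIS, field_rt, field_ra and encode_dform are made up for this example and are not the kernel's macros. Read as upper halfwords, the removed literals 0x3c4c and 0x3842 decode to "addis r2, r12" and "addi r2, r2", which would explain why writing RA = r12 in the second instruction broke things.

```c
#include <assert.h>
#include <stdint.h>

/* D-form primary opcodes (assumed from the Power ISA): addi = 14, addis = 15. */
#define INST_ADDI   (14u << 26)   /* 0x38000000 */
#define INST_ADDIS  (15u << 26)   /* 0x3c000000 */

/* Place a 5-bit register number in the RT field (hypothetical helper). */
static uint32_t field_rt(unsigned int reg)
{
	assert(reg < 32);
	return (uint32_t)reg << 21;
}

/* Place a 5-bit register number in the RA field (hypothetical helper). */
static uint32_t field_ra(unsigned int reg)
{
	assert(reg < 32);
	return (uint32_t)reg << 16;
}

/* Build "op rt, ra, imm" with a 16-bit immediate in the low halfword. */
static uint32_t encode_dform(uint32_t op, unsigned int rt, unsigned int ra,
			     uint16_t imm)
{
	return op | field_rt(rt) | field_ra(ra) | imm;
}
```

With these helpers, encode_dform(INST_ADDI, 2, 2, 0) and encode_dform(INST_ADDI, 2, 12, 0) differ only in the RA field, which is easy to miss in a disassembly.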
Re: [PATCH v2 0/5] mm: Enable CONFIG_NODES_SPAN_OTHER_NODES by default for NUMA
On Thu 11-07-19 23:25:44, Hoan Tran OS wrote: > In NUMA layout which nodes have memory ranges that span across other nodes, > the mm driver can detect the memory node id incorrectly. > > For example, with layout below > Node 0 address: > Node 1 address: > > Note: > - Memory from low to high > - 0/1: Node id > - x: Invalid memory of a node > > When mm probes the memory map, without CONFIG_NODES_SPAN_OTHER_NODES > config, mm only checks the memory validity but not the node id. > Because of that, Node 1 also detects the memory from node 0 as below > when it scans from the start address to the end address of node 1. > > Node 0 address: > Node 1 address: > > This layout could occur on any architecture. This patch enables > CONFIG_NODES_SPAN_OTHER_NODES by default for NUMA to fix this issue. Yes it can occur on any arch but most sane platforms simply do not overlap physical ranges. So I do not really see any reason to unconditionally enable the config for everybody. What is an advantage? -- Michal Hocko SUSE Labs
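The node diagrams from the quoted description did not survive in this archive, so here is a small userspace sketch of the situation it describes. Everything in it is hypothetical (struct mem_block, example_layout, pages_seen_by_node); it is not the mm code. It models two nodes whose memory ranges interleave and shows that a scan over node 1's span counts node 0's memory unless the owner of each range is also checked.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical memory block: pages [start, end) owned by node nid. */
struct mem_block {
	unsigned long start, end;
	int nid;
};

/* Interleaved layout: node 0 and node 1 alternate, so their spans overlap. */
static const struct mem_block example_layout[] = {
	{   0, 100, 0 },
	{ 100, 200, 1 },
	{ 200, 300, 0 },
	{ 300, 400, 1 },
};
#define EXAMPLE_BLOCKS (sizeof(example_layout) / sizeof(example_layout[0]))

/*
 * Count the valid pages found while scanning node nid's span, from its
 * lowest to its highest address. With check_nid == 0 only validity is
 * checked; with check_nid != 0 the owning node must match as well.
 */
static unsigned long pages_seen_by_node(const struct mem_block *blk, size_t n,
					int nid, int check_nid)
{
	unsigned long lo = ~0UL, hi = 0, count = 0;
	size_t i;

	assert(blk != NULL);

	/* Work out the span covered by this node. */
	for (i = 0; i < n; i++) {
		if (blk[i].nid != nid)
			continue;
		if (blk[i].start < lo)
			lo = blk[i].start;
		if (blk[i].end > hi)
			hi = blk[i].end;
	}

	/* Clip every block to the span and count what the scan would accept. */
	for (i = 0; i < n; i++) {
		unsigned long s = blk[i].start > lo ? blk[i].start : lo;
		unsigned long e = blk[i].end < hi ? blk[i].end : hi;

		if (s >= e)
			continue;
		if (check_nid && blk[i].nid != nid)
			continue;
		count += e - s;
	}
	return count;
}
```

In this layout node 1 owns 200 pages, but a validity-only scan of its span accepts 300, because the node 0 block at [200, 300) sits inside it.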
Re: [PATCH] vfio: platform: reset: add support for XHCI reset hook
Hi Gregory,

On 7/11/19 4:31 PM, Gregory CLEMENT wrote:
> The VFIO reset hook is called every time a platform device is passed
> to a guest or removed from a guest.
>
> When the XHCI device is unbound from the host, the host driver
> disables the XHCI clocks/phys/regulators so when the device is passed
> to the guest it becomes dysfunctional.
>
> This initial implementation uses the VFIO reset hook to enable the
> XHCI clocks/phys on behalf of the guest.

The platform reset module must also make sure there are no more DMA
requests and interrupts that can be sent by the device anymore.

>
> Ported from Marvell LSP code originally written by Yehuda Yitschak
>
> Signed-off-by: Gregory CLEMENT
> ---
> drivers/vfio/platform/reset/Kconfig | 8 +++
> drivers/vfio/platform/reset/Makefile | 2 +
> .../vfio/platform/reset/vfio_platform_xhci.c | 60 +++
> 3 files changed, 70 insertions(+)
> create mode 100644 drivers/vfio/platform/reset/vfio_platform_xhci.c
>
> diff --git a/drivers/vfio/platform/reset/Kconfig
> b/drivers/vfio/platform/reset/Kconfig
> index 392e3c09def0..14f620fd250d 100644
> --- a/drivers/vfio/platform/reset/Kconfig
> +++ b/drivers/vfio/platform/reset/Kconfig
> @@ -22,3 +22,11 @@ config VFIO_PLATFORM_BCMFLEXRM_RESET
> 	  Enables the VFIO platform driver to handle reset for Broadcom FlexRM
>
> 	  If you don't know what to do here, say N.
> +
> +config VFIO_PLATFORM_XHCI_RESET
> +	tristate "VFIO support for USB XHCI reset"
> +	depends on VFIO_PLATFORM
> +	help
> +	  Enables the VFIO platform driver to handle reset for USB XHCI
> +
> +	  If you don't know what to do here, say N.
> diff --git a/drivers/vfio/platform/reset/Makefile
> b/drivers/vfio/platform/reset/Makefile
> index 7294c5ea122e..d84c4d3dc041 100644
> --- a/drivers/vfio/platform/reset/Makefile
> +++ b/drivers/vfio/platform/reset/Makefile
> @@ -1,7 +1,9 @@
> # SPDX-License-Identifier: GPL-2.0
> vfio-platform-calxedaxgmac-y := vfio_platform_calxedaxgmac.o
> vfio-platform-amdxgbe-y := vfio_platform_amdxgbe.o
> +vfio-platform-xhci-y := vfio_platform_xhci.o
>
> obj-$(CONFIG_VFIO_PLATFORM_CALXEDAXGMAC_RESET) +=
> vfio-platform-calxedaxgmac.o
> obj-$(CONFIG_VFIO_PLATFORM_AMDXGBE_RESET) += vfio-platform-amdxgbe.o
> obj-$(CONFIG_VFIO_PLATFORM_BCMFLEXRM_RESET) += vfio_platform_bcmflexrm.o
> +obj-$(CONFIG_VFIO_PLATFORM_XHCI_RESET) += vfio-platform-xhci.o
> diff --git a/drivers/vfio/platform/reset/vfio_platform_xhci.c
> b/drivers/vfio/platform/reset/vfio_platform_xhci.c
> new file mode 100644
> index ..7b75a04402ee
> --- /dev/null
> +++ b/drivers/vfio/platform/reset/vfio_platform_xhci.c
> @@ -0,0 +1,60 @@
> +// SPDX-License-Identifier: GPL-2.0-or-later
> +/*
> + * VFIO platform driver specialized for XHCI reset
> + *
> + * Copyright 2016 Marvell Semiconductors, Inc.
> + *
> + */
> +
> +#include
> +#include
> +#include
> +#include
> +#include

io, init, kernel should be removable (noticed init and kernel.h also are
in other reset modules though)

> +#include
> +#include
> +#include
> +#include
> +
> +#include "../vfio_platform_private.h"
> +
> +#define MAX_XHCI_CLOCKS 4

Where does this number come from?
From Documentation/devicetree/bindings/usb/usb-xhci.txt I understand
there are max 2 clocks, "core" and "reg" (I don't have any specific
knowledge on the device though).

> +#define MAX_XHCI_PHYS 2

not used

> +
> +int vfio_platform_xhci_reset(struct vfio_platform_device *vdev)
> +{
> +	struct device *dev = vdev->device;
> +	struct device_node *np = dev->of_node;
> +	struct usb_phy *usb_phy;
> +	struct clk *clk;
> +	int ret, i;
> +
> +	/*
> +	 * Compared to the native driver, no need to handle the
> +	 * deferred case, because the resources are already
> +	 * there
> +	 */
> +	for (i = 0; i < MAX_XHCI_CLOCKS; i++) {
> +		clk = of_clk_get(np, i);
> +		if (!IS_ERR(clk)) {
> +			ret = clk_prepare_enable(clk);
> +			if (ret)
> +				return -ENODEV;

return ret?

> +		}
> +	}
> +
> +	usb_phy = devm_usb_get_phy_by_phandle(dev, "usb-phy", 0);
> +	if (!IS_ERR(usb_phy)) {
> +		ret = usb_phy_init(usb_phy);
> +		if (ret)
> +			return -ENODEV;

return ret?

> +	}
> +
> +	return 0;
> +}
> +
> +module_vfio_reset_handler("generic-xhci", vfio_platform_xhci_reset);
> +
> +MODULE_AUTHOR("Yehuda Yitschak");
> +MODULE_DESCRIPTION("Reset support for XHCI vfio platform device");
> +MODULE_LICENSE("GPL");
>

Thanks

Eric
[PATCH v4 0/3] Forced-wakeup for stop states on Powernv
Currently, the cpuidle governors determine what idle state an idling CPU
should enter into based on heuristics that depend on the idle history on
that CPU. Given that no predictive heuristic is perfect, there are cases
where the governor predicts a shallow idle state, hoping that the CPU will
be busy soon. However, if no new workload is scheduled on that CPU in the
near future, the CPU will end up in the shallow state.

Motivation
----------
On POWER, this is problematic when the predicted state in the
aforementioned scenario is a shallow stop state on a tickless system, as we
might get stuck in shallow states even for hours in the absence of ticks or
interrupts.

To address this, we forcefully wake up the CPU by setting the decrementer.
The decrementer is set to a value that corresponds to the residency of the
next available state, thus firing a timer that will forcefully wake up the
CPU. A few such iterations will essentially train the governor to select a
deeper state for that CPU, as the timer here corresponds to the next
available cpuidle state residency. Thus, the CPU will eventually end up in
the deepest possible state and we won't get stuck in a shallow state for a
long duration.

Experiment
----------
For earlier versions, when this feature was meant to be only for shallow
lite states, I performed experiments for three scenarios to collect some
data.

case 1 :
Without this patch and without tick retained, i.e. in an upstream kernel,
it would spend more than even a second to get out of stop0_lite.

case 2 :
With tick retained in an upstream kernel -

Generally, we have a sched tick at 4ms (CONF_HZ = 250). Ideally I expected
it to take 8 sched ticks to get out of stop0_lite. Experimentally, the
observation was

=========================================
sample    min     max     99percentile
20        4ms     12ms    4ms
=========================================

It would take at least one sched tick to get out of stop0_lite.

case 3 :
With this patch (not stopping tick, but explicitly queuing a timer)

sample    min     max     99percentile
20        144us   192us   144us

Description of current implementation
-------------------------------------
We calculate the timeout for the current idle state as the residency value
of the next available idle state. If the decrementer is set to be greater
than this timeout, we update the decrementer value with the residency of
the next available idle state. Thus, we essentially train the governor to
select the next available deeper state until we reach the deepest state.
Hence, we won't get stuck unnecessarily in shallow states for a longer
duration.

v1 of auto-promotion : https://lkml.org/lkml/2019/3/22/58
This patch was implemented only for shallow lite state in generic cpuidle
driver.

v2 : Removed timeout_needed and rebased to current upstream kernel

Then,
v1 of forced-wakeup : Moved the code to cpuidle powernv driver and started
as forced wakeup instead of auto-promotion

v2 : Extended the forced wakeup logic for all states. Setting the
decrementer instead of queuing up a hrtimer to implement the logic.

v3 : 1) Cleanly handle setting the decrementer after exiting out of stop
        states.
     2) Added a disable_callback feature to compute timeout whenever a
        state is enabled or disabled instead of computing every time in
        the fast idle path.
     3) Use disable callback to recompute timeout whenever state usage is
        changed for a state. Also, cleaned up the get_snooze_timeout
        function.

v4 : Changed the type and name of set/reset decrementer function.
     Handled irq work pending in try_set_dec_before_idle.
     No change in patch 2 and 3.

Abhishek Goel (3):
  cpuidle-powernv : forced wakeup for stop states
  cpuidle : Add callback whenever a state usage is enabled/disabled
  cpuidle-powernv : Recompute the idle-state timeouts when state usage
    is enabled/disabled

 arch/powerpc/include/asm/time.h   |  2 ++
 arch/powerpc/kernel/time.c        | 43
 drivers/cpuidle/cpuidle-powernv.c | 55 +++
 drivers/cpuidle/sysfs.c           | 15 -
 include/linux/cpuidle.h           |  5 +++
 5 files changed, 106 insertions(+), 14 deletions(-)

--
2.17.1
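The timeout logic described in the cover letter can be sketched in userspace roughly as follows. This is a simplified model under stated assumptions, not the patch itself: struct idle_state, example_states, ticks_per_usec and should_set_dec are hypothetical names, the timeout uses only the next state's residency as the cover letter words it, and treating a zero timeout as "no forced wakeup" is an assumption made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified, hypothetical stand-in for a cpuidle state. */
struct idle_state {
	uint64_t target_residency_us;
	bool disabled;
};

/* Example table: the third state is disabled and must be skipped. */
static const struct idle_state example_states[] = {
	{    1, false },
	{   10, false },
	{  100, true  },
	{ 1000, false },
};

/*
 * Timeout for state `index`: the residency of the next enabled deeper
 * state, converted to timebase ticks. Returns 0 if there is none.
 */
static uint64_t forced_wakeup_timeout(const struct idle_state *st, int count,
				      int index, uint64_t ticks_per_usec)
{
	assert(st != 0);

	for (int i = index + 1; i < count; i++) {
		if (st[i].disabled)
			continue;
		return st[i].target_residency_us * ticks_per_usec;
	}
	return 0;
}

/*
 * Reprogram the decrementer only when the forced wakeup would fire no
 * later than the timer event already scheduled at `next_event`.
 */
static bool should_set_dec(uint64_t now, uint64_t timeout, uint64_t next_event)
{
	return timeout != 0 && now + timeout <= next_event;
}
```

For the deepest state the timeout is 0, so no forced wakeup is armed and the CPU stays there until a real event arrives, which matches the stated goal of ending up in the deepest possible state.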
[RFC v4 2/3] cpuidle : Add callback whenever a state usage is enabled/disabled
To force wakeup a CPU, we need to compute the timeout in the fast idle
path, as a state may be enabled or disabled, but there was no feedback to
the driver when a state is enabled or disabled.
This patch adds a callback that is invoked whenever a state_usage records
a store for the disable attribute.

Signed-off-by: Abhishek Goel
---
 drivers/cpuidle/sysfs.c | 15 ++-
 include/linux/cpuidle.h |  4
 2 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/drivers/cpuidle/sysfs.c b/drivers/cpuidle/sysfs.c
index eb20adb5de23..141671a53967 100644
--- a/drivers/cpuidle/sysfs.c
+++ b/drivers/cpuidle/sysfs.c
@@ -415,8 +415,21 @@ static ssize_t cpuidle_state_store(struct kobject *kobj, struct attribute *attr,
 	struct cpuidle_state_usage *state_usage = kobj_to_state_usage(kobj);
 	struct cpuidle_state_attr *cattr = attr_to_stateattr(attr);

-	if (cattr->store)
+	if (cattr->store) {
 		ret = cattr->store(state, state_usage, buf, size);
+		if (ret == size &&
+			strncmp(cattr->attr.name, "disable",
+				strlen("disable"))) {
+			struct kobject *cpuidle_kobj = kobj->parent;
+			struct cpuidle_device *dev =
+					to_cpuidle_device(cpuidle_kobj);
+			struct cpuidle_driver *drv =
+					cpuidle_get_cpu_driver(dev);
+
+			if (drv->disable_callback)
+				drv->disable_callback(dev, drv);
+		}
+	}

 	return ret;
 }
diff --git a/include/linux/cpuidle.h b/include/linux/cpuidle.h
index bb9a0db89f1a..8a0e54bd0d5d 100644
--- a/include/linux/cpuidle.h
+++ b/include/linux/cpuidle.h
@@ -119,6 +119,10 @@ struct cpuidle_driver {

 	/* the driver handles the cpus in cpumask */
 	struct cpumask *cpumask;
+
+	void (*disable_callback)(struct cpuidle_device *dev,
+				 struct cpuidle_driver *drv);
+
 };

 #ifdef CONFIG_CPU_IDLE
--
2.17.1
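One detail in the sysfs hunk may be worth a second look: strncmp() returns 0 when the compared strings are equal, so the condition as posted is true for every attribute whose name does not start with "disable". If the intent is to run the callback only after a store to the disable attribute, a check along the following lines may be what was meant. This is a reading of the posted diff, not a confirmed fix, and is_disable_attr is a name invented for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* True when the attribute name starts with "disable". */
static bool is_disable_attr(const char *name)
{
	assert(name != NULL);

	/* strncmp() returns 0 on a match, so compare against 0 explicitly. */
	return strncmp(name, "disable", strlen("disable")) == 0;
}
```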
[PATCH v4 1/3] cpuidle-powernv : forced wakeup for stop states
Currently, the cpuidle governors determine what idle state an idling CPU
should enter into based on heuristics that depend on the idle history on
that CPU. Given that no predictive heuristic is perfect, there are cases
where the governor predicts a shallow idle state, hoping that the CPU will
be busy soon. However, if no new workload is scheduled on that CPU in the
near future, the CPU may end up in the shallow state.

This is problematic when the predicted state in the aforementioned
scenario is a shallow stop state on a tickless system, as we might get
stuck in shallow states for hours in the absence of ticks or interrupts.

To address this, we forcefully wake up the CPU by setting the
decrementer. The decrementer is set to a value that corresponds to the
residency of the next available state, thus firing a timer that will
forcefully wake up the CPU. A few such iterations will essentially train
the governor to select a deeper state for that CPU, as the timer here
corresponds to the next available cpuidle state residency. Thus, the CPU
will eventually end up in the deepest possible state.

Signed-off-by: Abhishek Goel
---

Auto-promotion
 v1 : started as auto promotion logic for cpuidle states in generic
      driver
 v2 : Removed timeout_needed and rebased the code to upstream kernel
Forced-wakeup
 v1 : New patch with name of forced wakeup started
 v2 : Extending the forced wakeup logic for all states. Setting the
      decrementer instead of queuing up a hrtimer to implement the logic.
 v3 : Cleanly handle setting/resetting of decrementer so as to not break
      irq work
 v4 : Changed type and name of set/reset decrementer function
      Handled irq_work_pending in try_set_dec_before_idle

 arch/powerpc/include/asm/time.h   |  2 ++
 arch/powerpc/kernel/time.c        | 43 +++
 drivers/cpuidle/cpuidle-powernv.c | 40
 3 files changed, 85 insertions(+)

diff --git a/arch/powerpc/include/asm/time.h b/arch/powerpc/include/asm/time.h
index 54f4ec1f9fab..294a472ce161 100644
--- a/arch/powerpc/include/asm/time.h
+++ b/arch/powerpc/include/asm/time.h
@@ -188,6 +188,8 @@ static inline unsigned long tb_ticks_since(unsigned long tstamp)
 extern u64 mulhdu(u64, u64);
 #endif

+extern bool try_set_dec_before_idle(u64 timeout);
+extern void try_reset_dec_after_idle(void);
 extern void div128_by_32(u64 dividend_high, u64 dividend_low,
 			 unsigned divisor, struct div_result *dr);

diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
index 694522308cd5..d004c0d8e099 100644
--- a/arch/powerpc/kernel/time.c
+++ b/arch/powerpc/kernel/time.c
@@ -576,6 +576,49 @@ void arch_irq_work_raise(void)

 #endif /* CONFIG_IRQ_WORK */

+/*
+ * This function tries setting decrementer before entering into idle.
+ * Returns true if we have reprogrammed the decrementer for idle.
+ * Returns false if the decrementer is unchanged.
+ */
+bool try_set_dec_before_idle(u64 timeout)
+{
+	u64 *next_tb = this_cpu_ptr(&decrementers_next_tb);
+	u64 now = get_tb_or_rtc();
+
+	if (now + timeout > *next_tb)
+		return false;
+
+	set_dec(timeout);
+	if (test_irq_work_pending())
+		set_dec(1);
+
+	return true;
+}
+
+/*
+ * This function gets called if we have set decrementer before
+ * entering into idle. It tries to reset/restore the decrementer
+ * to its original value.
+ */
+void try_reset_dec_after_idle(void)
+{
+	u64 now;
+	u64 *next_tb;
+
+	if (test_irq_work_pending())
+		return;
+
+	now = get_tb_or_rtc();
+	next_tb = this_cpu_ptr(&decrementers_next_tb);
+	if (now >= *next_tb)
+		return;
+
+	set_dec(*next_tb - now);
+	if (test_irq_work_pending())
+		set_dec(1);
+}
+
 /*
  * timer_interrupt - gets called when the decrementer overflows,
  * with interrupts disabled.
diff --git a/drivers/cpuidle/cpuidle-powernv.c b/drivers/cpuidle/cpuidle-powernv.c
index 84b1ebe212b3..17e20e408ffe 100644
--- a/drivers/cpuidle/cpuidle-powernv.c
+++ b/drivers/cpuidle/cpuidle-powernv.c
@@ -21,6 +21,7 @@
 #include
 #include
 #include
+#include

 /*
  * Expose only those Hardware idle states via the cpuidle framework
@@ -46,6 +47,26 @@ static struct stop_psscr_table stop_psscr_table[CPUIDLE_STATE_MAX] __read_mostly
 static u64 default_snooze_timeout __read_mostly;
 static bool snooze_timeout_en __read_mostly;

+static u64 forced_wakeup_timeout(struct cpuidle_device *dev,
+				 struct cpuidle_driver *drv,
+				 int index)
+{
+	int i;
+
+	for (i = index + 1; i < drv->state_count; i++) {
+		struct cpuidle_state *s = &drv->states[i];
+		struct cpuidle_state_usage *su = &dev->states_usage[i];
+
+		if (s->disabled || su->disable)
+			continue;
+
+		return (s->target_residency + 2 * s->exit_latency) *
+
[RFC v4 3/3] cpuidle-powernv : Recompute the idle-state timeouts when state usage is enabled/disabled
The disable callback can be used to compute the timeout for other states
whenever a state is enabled or disabled. We store the computed timeout
in "timeout", defined in the cpuidle state structure. So, we compute the
timeout only when some state is enabled or disabled and not every time in
the fast idle path.
We also use the computed timeout to get the timeout for snooze, thus
getting rid of get_snooze_timeout for the snooze loop.

Signed-off-by: Abhishek Goel
---
 drivers/cpuidle/cpuidle-powernv.c | 35 +++
 include/linux/cpuidle.h           |  1 +
 2 files changed, 13 insertions(+), 23 deletions(-)

diff --git a/drivers/cpuidle/cpuidle-powernv.c b/drivers/cpuidle/cpuidle-powernv.c
index 17e20e408ffe..29add322d0c4 100644
--- a/drivers/cpuidle/cpuidle-powernv.c
+++ b/drivers/cpuidle/cpuidle-powernv.c
@@ -45,7 +45,6 @@ struct stop_psscr_table {
 static struct stop_psscr_table stop_psscr_table[CPUIDLE_STATE_MAX] __read_mostly;

 static u64 default_snooze_timeout __read_mostly;
-static bool snooze_timeout_en __read_mostly;

 static u64 forced_wakeup_timeout(struct cpuidle_device *dev,
 				 struct cpuidle_driver *drv,
@@ -67,26 +66,13 @@ static u64 forced_wakeup_timeout(struct cpuidle_device *dev,
 	return 0;
 }

-static u64 get_snooze_timeout(struct cpuidle_device *dev,
-			      struct cpuidle_driver *drv,
-			      int index)
+static void pnv_disable_callback(struct cpuidle_device *dev,
+				 struct cpuidle_driver *drv)
 {
 	int i;

-	if (unlikely(!snooze_timeout_en))
-		return default_snooze_timeout;
-
-	for (i = index + 1; i < drv->state_count; i++) {
-		struct cpuidle_state *s = &drv->states[i];
-		struct cpuidle_state_usage *su = &dev->states_usage[i];
-
-		if (s->disabled || su->disable)
-			continue;
-
-		return s->target_residency * tb_ticks_per_usec;
-	}
-
-	return default_snooze_timeout;
+	for (i = 0; i < drv->state_count; i++)
+		drv->states[i].timeout = forced_wakeup_timeout(dev, drv, i);
 }

 static int snooze_loop(struct cpuidle_device *dev,
@@ -94,16 +80,20 @@ static int snooze_loop(struct cpuidle_device *dev,
 			int index)
 {
 	u64 snooze_exit_time;
+	u64 snooze_timeout = drv->states[index].timeout;
+
+	if (!snooze_timeout)
+		snooze_timeout = default_snooze_timeout;

 	set_thread_flag(TIF_POLLING_NRFLAG);

 	local_irq_enable();

-	snooze_exit_time = get_tb() + get_snooze_timeout(dev, drv, index);
+	snooze_exit_time = get_tb() + snooze_timeout;
 	ppc64_runlatch_off();
 	HMT_very_low();
 	while (!need_resched()) {
-		if (likely(snooze_timeout_en) && get_tb() > snooze_exit_time) {
+		if (get_tb() > snooze_exit_time) {
 			/*
 			 * Task has not woken up but we are exiting the polling
 			 * loop anyway. Require a barrier after polling is
@@ -168,7 +158,7 @@ static int stop_loop(struct cpuidle_device *dev,
 	u64 timeout_tb;
 	bool forced_wakeup = false;

-	timeout_tb = forced_wakeup_timeout(dev, drv, index);
+	timeout_tb = drv->states[index].timeout;

 	/* Ensure that the timeout is at least one microsecond
 	 * greater than current decrement value. Else, we will
@@ -263,6 +253,7 @@ static int powernv_cpuidle_driver_init(void)
 	 */

 	drv->cpumask = (struct cpumask *)cpu_present_mask;
+	drv->disable_callback = pnv_disable_callback;

 	return 0;
 }
@@ -422,8 +413,6 @@ static int powernv_idle_probe(void)
 		/* Device tree can indicate more idle states */
 		max_idle_state = powernv_add_idle_states();
 		default_snooze_timeout = TICK_USEC * tb_ticks_per_usec;
-		if (max_idle_state > 1)
-			snooze_timeout_en = true;
 	} else
 		return -ENODEV;

diff --git a/include/linux/cpuidle.h b/include/linux/cpuidle.h
index 8a0e54bd0d5d..31662b657b9c 100644
--- a/include/linux/cpuidle.h
+++ b/include/linux/cpuidle.h
@@ -50,6 +50,7 @@ struct cpuidle_state {
 	int		power_usage; /* in mW */
 	unsigned int	target_residency; /* in US */
 	bool		disabled; /* disabled on all CPUs */
+	unsigned long long timeout; /* timeout for exiting out of a state */

 	int (*enter)	(struct cpuidle_device *dev,
 			struct cpuidle_driver *drv,
--
2.17.1
Re: [PATCH] scatterlist: Allocate a contiguous array instead of chaining
On Fri, 12 Jul 2019, Ming Lei wrote: > On Thu, Jul 11, 2019 at 11:36:56PM -0700, Sultan Alsawaf wrote: > > From: Sultan Alsawaf > > > > Typically, drivers allocate sg lists of sizes up to a few MiB in size. > > The current algorithm deals with large sg lists by splitting them into > > several smaller arrays and chaining them together. But if the sg list > > allocation is large, and we know the size ahead of time, sg chaining is > > both inefficient and unnecessary. > > > > Rather than calling kmalloc hundreds of times in a loop for chaining > > tiny arrays, we can simply do it all at once with kvmalloc, which has > > the proper tradeoff on when to stop using kmalloc and instead use > > vmalloc. > > vmalloc() may sleep, so it is impossible to be called in atomic context. Allocations from atomic context should be avoided wherever possible and you really have to have a very convincing argument why an atomic allocation is absolutely necessary. I cleaned up quite some GFP_ATOMIC users over the last couple of years and all of them were doing it for the very wrong reasons and mostly just to silence the warning which is triggered with GFP_KERNEL when called from a non-sleepable context. So I suggest to audit all call sites first and figure out whether they really must use GFP_ATOMIC and if possible clean them up, remove the GFP argument and then do the vmalloc thing on top. Thanks, tglx
[PATCH] KVM: Boosting vCPUs that are delivering interrupts
From: Wanpeng Li

Inspired by commit 9cac38dd5d (KVM/s390: Set preempted flag during vcpu
wakeup and interrupt delivery), in addition to the lock holder, we also
want to boost vCPUs that are delivering interrupts. Most
smp_call_function_many calls are synchronous IPI calls, so the IPI target
vCPUs are also good yield candidates. This patch sets the preempted flag
during wakeup and interrupt delivery time.

Testing on 80 HT 2 socket Xeon Skylake server, with 80 vCPUs VM 80GB RAM:
ebizzy -M

            vanilla     boosting    improved
1VM          23000       21232        -9%
2VM           2800        8000       180%
3VM           1800        3100        72%

Testing on my Haswell desktop 8 HT, with 8 vCPUs VM 8GB RAM, two VMs,
one running ebizzy -M, the other running 'stress --cpu 2':

w/ boosting + w/o pv sched yield(vanilla)

            vanilla     boosting    improved
              1570        4000        55%

w/ boosting + w/ pv sched yield(vanilla)

            vanilla     boosting    improved
              1844        5157        79%

w/o boosting, perf top in VM:

 72.33%  [kernel]       [k] smp_call_function_many
  4.22%  [kernel]       [k] call_function_i
  3.71%  [kernel]       [k] async_page_fault

w/ boosting, perf top in VM:

 38.43%  [kernel]       [k] smp_call_function_many
  6.31%  [kernel]       [k] async_page_fault
  6.13%  libc-2.23.so   [.] __memcpy_avx_unaligned
  4.88%  [kernel]       [k] call_function_interrupt

Cc: Paolo Bonzini
Cc: Radim Krčmář
Cc: Christian Borntraeger
Signed-off-by: Wanpeng Li
---
 virt/kvm/kvm_main.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index b4ab59d..2c46705 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2404,8 +2404,10 @@ void kvm_vcpu_kick(struct kvm_vcpu *vcpu)
 	int me;
 	int cpu = vcpu->cpu;

-	if (kvm_vcpu_wake_up(vcpu))
+	if (kvm_vcpu_wake_up(vcpu)) {
+		vcpu->preempted = true;
 		return;
+	}

 	me = get_cpu();
 	if (cpu != me && (unsigned)cpu < nr_cpu_ids && cpu_online(cpu))
--
2.7.4
Re: [PATCH] mm: vmscan: scan anonymous pages on file refaults
On Fri 05-07-19 20:45:05, Kuo-Hsin Yang wrote:
> With 4 processes accessing non-overlapping parts of a large file, 30316
> pages swapped out with this patch, 5152 pages swapped out without this
> patch. The swapout number is small comparing to pgpgin.

Which is 5 times more swapout. This may be seen to be a lot for workloads
that prefer no swapping (e.g. large in-memory databases) with an occasional
heavy IO (e.g. backup). And I am worried those would regress.

I do agree that the current behavior is far from optimal because the
thrashing is real. I believe that we really need a different approach.
Johannes brought this up a few years back (sorry, I do not have a link
handy), but it was essentially about implementing refault logic for
anonymous memory and swapping out based on the refault price. If there is
effectively no swapin then it simply makes more sense to swap out rather
than refault a page cache.

That being said, I am not nacking the patch. Let's see whether something
regresses, as there is no clear cut for the proper behavior. But I am
bringing that up because we really need a better and more robust plan for
the future.
--
Michal Hocko
SUSE Labs
Re: [PATCH 2/3] DMA mapping: Move SME handling to x86-specific files
Honestly I think this code should go away without any replacement. There is no reason why we should have a special debug printk just for one specific reason why there is a requirement for a large DMA mask.
[PATCH RESEND] KVM: Boosting vCPUs that are delivering interrupts
From: Wanpeng Li

Inspired by commit 9cac38dd5d (KVM/s390: Set preempted flag during vcpu
wakeup and interrupt delivery), in addition to the lock holder, we also
want to boost vCPUs that are delivering interrupts. Most
smp_call_function_many calls are synchronous IPI calls, so the IPI target
vCPUs are also good yield candidates. This patch sets the preempted flag
during wakeup and interrupt delivery time.

Testing on 80 HT 2 socket Xeon Skylake server, with 80 vCPUs VM 80GB RAM:
ebizzy -M

            vanilla     boosting    improved
1VM          23000       21232        -9%
2VM           2800        8000       180%
3VM           1800        3100        72%

Testing on my Haswell desktop 8 HT, with 8 vCPUs VM 8GB RAM, two VMs,
one running ebizzy -M, the other running 'stress --cpu 2':

w/ boosting + w/o pv sched yield(vanilla)

            vanilla     boosting    improved
              1570        4000        55%

w/ boosting + w/ pv sched yield(vanilla)

            vanilla     boosting    improved
              1844        5157        79%

w/o boosting, perf top in VM:

 72.33%  [kernel]       [k] smp_call_function_many
  4.22%  [kernel]       [k] call_function_i
  3.71%  [kernel]       [k] async_page_fault

w/ boosting, perf top in VM:

 38.43%  [kernel]       [k] smp_call_function_many
  6.31%  [kernel]       [k] async_page_fault
  6.13%  libc-2.23.so   [.] __memcpy_avx_unaligned
  4.88%  [kernel]       [k] call_function_interrupt

Cc: Paolo Bonzini
Cc: Radim Krčmář
Cc: Christian Borntraeger
Signed-off-by: Wanpeng Li
---
 virt/kvm/kvm_main.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index b4ab59d..2c46705 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2404,8 +2404,10 @@ void kvm_vcpu_kick(struct kvm_vcpu *vcpu)
 	int me;
 	int cpu = vcpu->cpu;

-	if (kvm_vcpu_wake_up(vcpu))
+	if (kvm_vcpu_wake_up(vcpu)) {
+		vcpu->preempted = true;
 		return;
+	}

 	me = get_cpu();
 	if (cpu != me && (unsigned)cpu < nr_cpu_ids && cpu_online(cpu))
--
2.7.4
Re: [PATCH v1] Bluetooth: hci_qca: Send VS pre shutdown command.
Hi Harish, > WCN399x chips are coex chips, it needs a VS pre shutdown > command while turning off the BT. So that chip can inform > BT is OFF to other active clients. > > Signed-off-by: Harish Bandi > --- > drivers/bluetooth/btqca.c | 21 + > drivers/bluetooth/btqca.h | 7 +++ > drivers/bluetooth/hci_qca.c | 3 +++ > 3 files changed, 31 insertions(+) patch has been applied to bluetooth-next tree. Regards Marcel
[PATCHv3] clk: add imx8 clk defines
From: Anson Huang

added header defines for imx8qm clock

Signed-off-by: Anson Huang
Signed-off-by: Oliver Graute
Reviewed-by: Rob Herring
---
- fixed authorship

 include/dt-bindings/clock/imx8qm-clock.h | 851 +++
 1 file changed, 851 insertions(+)
 create mode 100644 include/dt-bindings/clock/imx8qm-clock.h

diff --git a/include/dt-bindings/clock/imx8qm-clock.h b/include/dt-bindings/clock/imx8qm-clock.h
new file mode 100644
index ..47217e4eaa6b
--- /dev/null
+++ b/include/dt-bindings/clock/imx8qm-clock.h
@@ -0,0 +1,851 @@
+/* SPDX-License-Identifier: (GPL-2.0+ OR MIT) */
+/*
+ * Copyright (C) 2016 Freescale Semiconductor, Inc.
+ * Copyright 2017 NXP
+*/
+
+#ifndef __DT_BINDINGS_CLOCK_IMX8QM_H
+#define __DT_BINDINGS_CLOCK_IMX8QM_H
+
+#define IMX8QM_CLK_DUMMY			0
+
+#define IMX8QM_A53_DIV				1
+#define IMX8QM_A53_CLK				2
+#define IMX8QM_A72_DIV				3
+#define IMX8QM_A72_CLK				4
+
+/* SC Clocks. */
+#define IMX8QM_SC_I2C_DIV			5
+#define IMX8QM_SC_I2C_CLK			6
+#define IMX8QM_SC_PID0_DIV			7
+#define IMX8QM_SC_PID0_CLK			8
+#define IMX8QM_SC_PIT_DIV			9
+#define IMX8QM_SC_PIT_CLK			10
+#define IMX8QM_SC_TPM_DIV			11
+#define IMX8QM_SC_TPM_CLK			12
+#define IMX8QM_SC_UART_DIV			13
+#define IMX8QM_SC_UART_CLK			14
+
+/* LSIO */
+#define IMX8QM_PWM0_DIV				15
+#define IMX8QM_PWM0_CLK				16
+#define IMX8QM_PWM1_DIV				17
+#define IMX8QM_PWM1_CLK				18
+#define IMX8QM_PWM2_DIV				19
+#define IMX8QM_PWM2_CLK				20
+#define IMX8QM_PWM3_DIV				21
+#define IMX8QM_PWM3_CLK				22
+#define IMX8QM_PWM4_DIV				23
+#define IMX8QM_PWM4_CLK				24
+#define IMX8QM_PWM5_DIV				26
+#define IMX8QM_PWM5_CLK				27
+#define IMX8QM_PWM6_DIV				28
+#define IMX8QM_PWM6_CLK				29
+#define IMX8QM_PWM7_DIV				30
+#define IMX8QM_PWM7_CLK				31
+#define IMX8QM_FSPI0_DIV			32
+#define IMX8QM_FSPI0_CLK			33
+#define IMX8QM_FSPI1_DIV			34
+#define IMX8QM_FSPI1_CLK			35
+#define IMX8QM_GPT0_DIV				36
+//#define IMX8QM_GPT0_CLK			37
+#define IMX8QM_GPT1_DIV				38
+//#define IMX8QM_GPT1_CLK			39
+#define IMX8QM_GPT2_DIV				40
+#define IMX8QM_GPT2_CLK				41
+#define IMX8QM_GPT3_DIV				42
+#define IMX8QM_GPT3_CLK				43
+#define IMX8QM_GPT4_DIV				44
+#define IMX8QM_GPT4_CLK				45
+
+/* Connectivity */
+#define IMX8QM_APBHDMA_CLK			46
+#define IMX8QM_GPMI_APB_CLK			47
+#define IMX8QM_GPMI_APB_BCH_CLK			48
+#define IMX8QM_GPMI_BCH_IO_DIV			49
+#define IMX8QM_GPMI_BCH_IO_CLK			50
+#define IMX8QM_GPMI_BCH_DIV			51
+#define IMX8QM_GPMI_BCH_CLK			52
+#define IMX8QM_SDHC0_IPG_CLK			53
+#define IMX8QM_SDHC0_DIV			54
+#define IMX8QM_SDHC0_CLK			55
+#define IMX8QM_SDHC1_IPG_CLK			56
+#define IMX8QM_SDHC1_DIV			57
+#define IMX8QM_SDHC1_CLK			58
+#define IMX8QM_SDHC2_IPG_CLK			59
+#define IMX8QM_SDHC2_DIV			60
+#define IMX8QM_SDHC2_CLK			61
+#define IMX8QM_USB2_OH_AHB_CLK			62
+#define IMX8QM_USB2_OH_IPG_S_CL			63
+#define IMX8QM_USB2_OH_IPG_S_PL301_CLK		64
+#define IMX8QM_USB2_PHY_IPG_CLK			65
+#define IMX8QM_USB3_IPG_CLK			66
+#define IMX8QM_USB3_CORE_PCLK			67
Re: [PATCH] scatterlist: Allocate a contiguous array instead of chaining
On Fri, Jul 12, 2019 at 09:06:40AM +0200, Thomas Gleixner wrote: > On Fri, 12 Jul 2019, Ming Lei wrote: > > vmalloc() may sleep, so it is impossible to be called in atomic context. > > Allocations from atomic context should be avoided wherever possible and you > really have to have a very convincing argument why an atomic allocation is > absolutely necessary. I cleaned up quite some GFP_ATOMIC users over the > last couple of years and all of them were doing it for the very wrong > reasons and mostly just to silence the warning which is triggered with > GFP_KERNEL when called from a non-sleepable context. > > So I suggest to audit all call sites first and figure out whether they > really must use GFP_ATOMIC and if possible clean them up, remove the GFP > argument and then do the vmalloc thing on top. Hello Thomas and Ming, It looks like the following call sites are atomic: drivers/crypto/qce/ablkcipher.c:92: ret = sg_alloc_table(&rctx->dst_tbl, rctx->dst_nents, gfp); drivers/crypto/ccp/ccp-crypto-aes-cmac.c:110: ret = sg_alloc_table(&rctx->data_sg, sg_count, gfp); drivers/crypto/ccp/ccp-crypto-sha.c:103:ret = sg_alloc_table(&rctx->data_sg, sg_count, gfp); drivers/spi/spi-pl022.c:1035: ret = sg_alloc_table(&pl022->sgt_rx, pages, GFP_ATOMIC); drivers/spi/spi-pl022.c:1039: ret = sg_alloc_table(&pl022->sgt_tx, pages, GFP_ATOMIC); The crypto ones are conditionally made atomic depending on the presence of CRYPTO_TFM_REQ_MAY_SLEEP. Additionally, the following allocation could be problematic with kvmalloc: net/ceph/crypto.c:180: ret = sg_alloc_table(sgt, chunk_cnt, GFP_NOFS); This is a snippet from kvmalloc: /* * vmalloc uses GFP_KERNEL for some internal allocations (e.g page tables) * so the given set of flags has to be compatible. 
*/ if ((flags & GFP_KERNEL) != GFP_KERNEL) return kmalloc_node(size, flags, node); Use of GFP_NOFS in net/ceph/crypto.c would cause kvmalloc to fall back to kmalloc_node, which could cause problems if the allocation size is too large for kmalloc_node to reasonably accommodate. Also, it looks like the vmalloc family doesn't have kvmalloc's GFP_KERNEL check. Is this intentional, or does vmalloc really not require GFP_KERNEL context? Thanks, Sultan
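To make the GFP_NOFS fallback concrete, here is a minimal userspace sketch of the check quoted from kvmalloc. The bit values and the MODEL_* / model_* names are made up for illustration and are not the kernel's real definitions; only the shape of the test, (flags & GFP_KERNEL) != GFP_KERNEL, is taken from the snippet above.

```c
#include <stdbool.h>

/* Illustrative bit values only; NOT the kernel's real gfp bits. */
#define MODEL_GFP_RECLAIM 0x1u
#define MODEL_GFP_IO      0x2u
#define MODEL_GFP_FS      0x4u

/* In this model GFP_NOFS is GFP_KERNEL with the FS bit left out. */
#define MODEL_GFP_KERNEL (MODEL_GFP_RECLAIM | MODEL_GFP_IO | MODEL_GFP_FS)
#define MODEL_GFP_NOFS   (MODEL_GFP_RECLAIM | MODEL_GFP_IO)

/*
 * Mirrors the quoted check: the vmalloc path is only eligible when every
 * bit of GFP_KERNEL is present in the caller's flags. Anything weaker
 * takes the kmalloc_node() path instead.
 */
static bool model_may_use_vmalloc(unsigned int flags)
{
	return (flags & MODEL_GFP_KERNEL) == MODEL_GFP_KERNEL;
}
```

Under these assumptions, model_may_use_vmalloc(MODEL_GFP_NOFS) is false, which is the case described for net/ceph/crypto.c: a large table requested with GFP_NOFS would have to be satisfied by kmalloc_node alone.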
Re: [PATCH v4 4/4] mm: introduce MADV_PAGEOUT
On Fri 12-07-19 14:18:28, Minchan Kim wrote: [...] > >From 41592f23e876ec21e49dc3c76dc89538e2bb16be Mon Sep 17 00:00:00 2001 > From: Minchan Kim > Date: Fri, 12 Jul 2019 14:05:36 +0900 > Subject: [PATCH] mm: factor out common parts between MADV_COLD and > MADV_PAGEOUT > > There are many common parts between MADV_COLD and MADV_PAGEOUT. > This patch factor them out to save code duplication. This looks better indeed. I still hope that this can get improved even further but let's do that in a follow up patch. > Signed-off-by: Minchan Kim Acked-by: Michal Hocko > --- > mm/madvise.c | 201 +-- > 1 file changed, 52 insertions(+), 149 deletions(-) > > diff --git a/mm/madvise.c b/mm/madvise.c > index bc2f0138982e..3d3d14517cc8 100644 > --- a/mm/madvise.c > +++ b/mm/madvise.c > @@ -30,6 +30,11 @@ > > #include "internal.h" > > +struct madvise_walk_private { > + struct mmu_gather *tlb; > + bool pageout; > +}; > + > /* > * Any behaviour which results in changes to the vma->vm_flags needs to > * take mmap_sem for writing. 
Others, which simply traverse vmas, need > @@ -310,16 +315,23 @@ static long madvise_willneed(struct vm_area_struct *vma, > return 0; > } > > -static int madvise_cold_pte_range(pmd_t *pmd, unsigned long addr, > - unsigned long end, struct mm_walk *walk) > +static int madvise_cold_or_pageout_pte_range(pmd_t *pmd, > + unsigned long addr, unsigned long end, > + struct mm_walk *walk) > { > - struct mmu_gather *tlb = walk->private; > + struct madvise_walk_private *private = walk->private; > + struct mmu_gather *tlb = private->tlb; > + bool pageout = private->pageout; > struct mm_struct *mm = tlb->mm; > struct vm_area_struct *vma = walk->vma; > pte_t *orig_pte, *pte, ptent; > spinlock_t *ptl; > - struct page *page; > unsigned long next; > + struct page *page = NULL; > + LIST_HEAD(page_list); > + > + if (fatal_signal_pending(current)) > + return -EINTR; > > next = pmd_addr_end(addr, end); > if (pmd_trans_huge(*pmd)) { > @@ -358,6 +370,12 @@ static int madvise_cold_pte_range(pmd_t *pmd, unsigned > long addr, > return 0; > } > > + if (pageout) { > + if (isolate_lru_page(page)) > + goto huge_unlock; > + list_add(&page->lru, &page_list); > + } > + > if (pmd_young(orig_pmd)) { > pmdp_invalidate(vma, addr, pmd); > orig_pmd = pmd_mkold(orig_pmd); > @@ -366,10 +384,14 @@ static int madvise_cold_pte_range(pmd_t *pmd, unsigned > long addr, > tlb_remove_pmd_tlb_entry(tlb, pmd, addr); > } > > + ClearPageReferenced(page); > test_and_clear_page_young(page); > - deactivate_page(page); > huge_unlock: > spin_unlock(ptl); > + if (pageout) > + reclaim_pages(&page_list); > + else > + deactivate_page(page); > return 0; > } > > @@ -423,6 +445,12 @@ static int madvise_cold_pte_range(pmd_t *pmd, unsigned > long addr, > > VM_BUG_ON_PAGE(PageTransCompound(page), page); > > + if (pageout) { > + if (isolate_lru_page(page)) > + continue; > + list_add(&page->lru, &page_list); > + } > + > if (pte_young(ptent)) { > ptent = ptep_get_and_clear_full(mm, addr, pte, > tlb->fullmm); > @@ -437,12 +465,16 @@ 
static int madvise_cold_pte_range(pmd_t *pmd, unsigned > long addr, >* As a side effect, it makes confuse idle-page tracking >* because they will miss recent referenced history. >*/ > + ClearPageReferenced(page); > test_and_clear_page_young(page); > - deactivate_page(page); > + if (!pageout) > + deactivate_page(page); > } > > arch_enter_lazy_mmu_mode(); > pte_unmap_unlock(orig_pte, ptl); > + if (pageout) > + reclaim_pages(&page_list); > cond_resched(); > > return 0; > @@ -452,10 +484,15 @@ static void madvise_cold_page_range(struct mmu_gather > *tlb, >struct vm_area_struct *vma, >unsigned long addr, unsigned long end) > { > + struct madvise_walk_private walk_private = { > + .tlb = tlb, > + .pageout = false, > + }; > + > struct mm_walk cold_walk = { > - .pmd_entry = madvise_cold_pte_range, > + .pmd_entry = madvise_cold_or_pageout_pte_range, >
Re: [PATCH] phy: Change the configuration interface param to void* to make it more general
On Fri, Jul 12, 2019 at 05:26:04PM +0800, Zeng Tao wrote: > The phy framework now allows runtime configurations, but only limited > to mipi now, and it's not reasonable to introduce user specified > configurations into the union phy_configure_opts structure. A simple > way is to replace with a void *. > > We have already got some phy drivers which introduce private phy API > for runtime configurations, and with this patch, they can switch to > the phy_configure as a replacement. > > Signed-off-by: Zeng Tao I still don't believe this is the right approach, for the reasons given in my first review of that patch. Maxime -- Maxime Ripard, Bootlin Embedded Linux and Kernel engineering https://bootlin.com
Re: [PATCH] [media] media: mtk-mdp: fix reference count on old device tree
On Mon, 2019-07-08 at 17:06 +0800, Matthias Brugger wrote: > > On 21/06/2019 13:32, Matthias Brugger wrote: > > of_get_next_child() increments the reference count of the returning > > device_node. Decrement it in the check if we are using the old or the > > new DTB. > > > > Fixes: ba1f1f70c2c0 ("[media] media: mtk-mdp: Fix mdp device tree") > > Signed-off-by: Matthias Brugger > > Any comments on that? > Hi Matthias, Thanks for fixing the bug. Sorry to reply late~ Acked-by: Houlong Wei > > --- > > drivers/media/platform/mtk-mdp/mtk_mdp_core.c | 4 +++- > > 1 file changed, 3 insertions(+), 1 deletion(-) > > > > diff --git a/drivers/media/platform/mtk-mdp/mtk_mdp_core.c > > b/drivers/media/platform/mtk-mdp/mtk_mdp_core.c > > index bbb24fb95b95..bafe53c5d54a 100644 > > --- a/drivers/media/platform/mtk-mdp/mtk_mdp_core.c > > +++ b/drivers/media/platform/mtk-mdp/mtk_mdp_core.c > > @@ -118,7 +118,9 @@ static int mtk_mdp_probe(struct platform_device *pdev) > > mutex_init(&mdp->vpulock); > > > > /* Old dts had the components as child nodes */ > > - if (of_get_next_child(dev->of_node, NULL)) { > > + parent = of_get_next_child(dev->of_node, NULL); > > + if (parent) { > > + of_node_put(parent); > > parent = dev->of_node; > > dev_warn(dev, "device tree is out of date\n"); > > } else { > >
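The reference-count imbalance fixed above can be sketched in plain userspace C. Everything here (struct model_node and the model_* helpers) is a hypothetical stand-in, not the real device tree API; it only models "the lookup takes a reference, so the caller must drop it".

```c
#include <stddef.h>

/* Hypothetical stand-in for struct device_node; only the refcount matters. */
struct model_node {
	int refcount;
	struct model_node *child;
};

/* In the spirit of of_get_next_child(): returns the child with a reference taken. */
static struct model_node *model_get_first_child(struct model_node *parent)
{
	struct model_node *child = parent->child;

	if (child)
		child->refcount++;
	return child;
}

/* In the spirit of of_node_put(): drops one reference; NULL is ignored here. */
static void model_node_put(struct model_node *node)
{
	if (node)
		node->refcount--;
}

/* Pre-patch shape: the pointer is only tested, so the reference leaks. */
static int model_has_child_leaky(struct model_node *parent)
{
	return model_get_first_child(parent) != NULL;
}

/* Post-patch shape: keep the pointer long enough to drop the reference. */
static int model_has_child_balanced(struct model_node *parent)
{
	struct model_node *child = model_get_first_child(parent);

	if (!child)
		return 0;
	model_node_put(child);
	return 1;
}
```

In this model, calling model_has_child_leaky() leaves the child's count one higher than before, while model_has_child_balanced() leaves it unchanged.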
Re: [PATCH] input: API for Setting a Timestamp from a Driver
On Fri, Jul 12, 2019 at 8:41 AM Dmitry Torokhov wrote: > > Hi Atif, > > On Wed, Jul 10, 2019 at 04:04:10PM -0700, Atif Niyaz wrote: > > Currently, evdev stamps time with timestamps acquired in > > evdev_events. However, this timestamping may not be accurate in terms of > > measuring when the actual event happened. This API allows any 3rd party > > driver to be able to call input_set_timestamp, and provide a timestamp > > that can be utilized in order to provide a more accurate sense of time > > for the event > > > > Signed-off-by: Atif Niyaz > > This looks OK to me. Benjamin, Peter, any concerns here? > No red flags from me (though Peter is the one using all of this). Just curious, which drivers do you think will be using this new API? I can see that we might want to use hid-multitouch for it, with the Scan Time forwarded by the device, but what do you have in mind? Cheers, Benjamin > > > --- > > drivers/input/evdev.c | 42 -- > > drivers/input/input.c | 17 + > > include/linux/input.h | 38 ++ > > 3 files changed, 71 insertions(+), 26 deletions(-) > > > > diff --git a/drivers/input/evdev.c b/drivers/input/evdev.c > > index 867c2cfd0038..a331efa0a3f6 100644 > > --- a/drivers/input/evdev.c > > +++ b/drivers/input/evdev.c > > @@ -25,13 +25,6 @@ > > #include > > #include "input-compat.h" > > > > -enum evdev_clock_type { > > - EV_CLK_REAL = 0, > > - EV_CLK_MONO, > > - EV_CLK_BOOT, > > - EV_CLK_MAX > > -}; > > - > > struct evdev { > > int open; > > struct input_handle handle; > > @@ -53,7 +46,7 @@ struct evdev_client { > > struct fasync_struct *fasync; > > struct evdev *evdev; > > struct list_head node; > > - unsigned int clk_type; > > + input_clk_t clk_type; > > bool revoked; > > unsigned long *evmasks[EV_CNT]; > > unsigned int bufsize; > > @@ -150,16 +143,18 @@ static void __evdev_flush_queue(struct evdev_client > > *client, unsigned int type) > > static void __evdev_queue_syn_dropped(struct evdev_client *client) > > { > > struct input_event ev; > > - ktime_t time; > > 
struct timespec64 ts; > > + ktime_t *time = input_get_timestamp(client->evdev->handle.dev); > > > > - time = client->clk_type == EV_CLK_REAL ? > > - ktime_get_real() : > > - client->clk_type == EV_CLK_MONO ? > > - ktime_get() : > > - ktime_get_boottime(); > > + switch (client->clk_type) { > > + case INPUT_CLK_REAL: > > + case INPUT_CLK_MONO: > > + ts = ktime_to_timespec64(time[client->clk_type]); > > + break; > > + default: > > + ts = ktime_to_timespec64(time[INPUT_CLK_BOOT]); > > Add "break" here please. > > > + } > > > > - ts = ktime_to_timespec64(time); > > ev.input_event_sec = ts.tv_sec; > > ev.input_event_usec = ts.tv_nsec / NSEC_PER_USEC; > > ev.type = EV_SYN; > > @@ -185,21 +180,21 @@ static void evdev_queue_syn_dropped(struct > > evdev_client *client) > > spin_unlock_irqrestore(&client->buffer_lock, flags); > > } > > > > -static int evdev_set_clk_type(struct evdev_client *client, unsigned int > > clkid) > > +static int evdev_set_clk_type(struct evdev_client *client, clockid_t clkid) > > { > > unsigned long flags; > > - unsigned int clk_type; > > + input_clk_t clk_type; > > > > switch (clkid) { > > > > case CLOCK_REALTIME: > > - clk_type = EV_CLK_REAL; > > + clk_type = INPUT_CLK_REAL; > > break; > > case CLOCK_MONOTONIC: > > - clk_type = EV_CLK_MONO; > > + clk_type = INPUT_CLK_MONO; > > break; > > case CLOCK_BOOTTIME: > > - clk_type = EV_CLK_BOOT; > > + clk_type = INPUT_CLK_BOOT; > > break; > > default: > > return -EINVAL; > > @@ -307,12 +302,7 @@ static void evdev_events(struct input_handle *handle, > > { > > struct evdev *evdev = handle->private; > > struct evdev_client *client; > > - ktime_t ev_time[EV_CLK_MAX]; > > - > > - ev_time[EV_CLK_MONO] = ktime_get(); > > - ev_time[EV_CLK_REAL] = ktime_mono_to_real(ev_time[EV_CLK_MONO]); > > - ev_time[EV_CLK_BOOT] = ktime_mono_to_any(ev_time[EV_CLK_MONO], > > - TK_OFFS_BOOT); > > + ktime_t *ev_time = input_get_timestamp(handle->dev); > > > > rcu_read_lock(); > > > > diff --git a/drivers/input/input.c 
b/drivers/input/input.c > > index 7f3c5fcb9ed6..ae8b0ee58120 100644 > > --- a/drivers/input/input.c > > +++ b/drivers/input/input.c > > @@ -1894,6 +1894,23 @@ void input_free_device(struct input_dev *dev) > > } > > EXPORT_SYMBOL(input_free_device); > > > > +/** > > + * input_get_timestamp - get timestamp for input events > > + * @dev: input device to get timestamp f
Re: [PATCH v2] rtl8xxxu: Fix wifi low signal strength issue of RTL8723BU
On Fri, Jul 5, 2019 at 10:27 AM Chris Chiu wrote: > Per the code before REG_S0S1_PATH_SWITCH setting, the driver has told > the co-processor the antenna is inverse. > memset(&h2c, 0, sizeof(struct h2c_cmd)); > h2c.ant_sel_rsv.cmd = H2C_8723B_ANT_SEL_RSV; > h2c.ant_sel_rsv.ant_inverse = 1; > h2c.ant_sel_rsv.int_switch_type = 0; > rtl8xxxu_gen2_h2c_cmd(priv, &h2c, sizeof(h2c.ant_sel_rsv)); > > At least the current modification is consistent with the antenna > inverse setting. > I'll verify on vendor driver about when/how the inverse be determined. I checked this out. The codepath hit hardcodes it to the AUX port, i.e. "inverted" setup: EXhalbtc8723b1ant_PowerOnSetting(): if(pBtCoexist->chipInterface == BTC_INTF_USB) { // fixed at S0 for USB interface pBtCoexist->fBtcWrite4Byte(pBtCoexist, 0x948, 0x0); u1Tmp |= 0x1;// antenna inverse pBtCoexist->fBtcWriteLocalReg1Byte(pBtCoexist, 0xfe08, u1Tmp); pBoardInfo->btdmAntPos = BTC_ANTENNA_AT_AUX_PORT; } So I'm further convinced that these performance-enhancing changes are increasing consistency with the vendor driver. Daniel
Re: [PATCH RESEND] KVM: Boosting vCPUs that are delivering interrupts
On Fri, 12 Jul 2019 at 15:15, Wanpeng Li wrote: > > From: Wanpeng Li > > Inspired by commit 9cac38dd5d (KVM/s390: Set preempted flag during vcpu wakeup > and interrupt delivery), except the lock holder, we want to also boost vCPUs > that are delivering interrupts. Actually most smp_call_function_many calls are > synchronous ipi calls, the ipi target vCPUs are also good yield candidates. > This patch sets preempted flag during wakeup and interrupt delivery time. > I forgot to mention that I disabled pv tlb shootdown during testing. Function call interrupts are not easy to trigger directly from userspace workloads. In addition, distro guest kernels without pv tlb shootdown support can also benefit in both the tlb shootdown and function call interrupt scenarios. > Testing on 80 HT 2 socket Xeon Skylake server, with 80 vCPUs VM 80GB RAM: > ebizzy -M > > vanilla boosting improved > 1VM 23000 21232 -9% > 2VM 2800 8000 180% > 3VM 1800 3100 72% > > Testing on my Haswell desktop 8 HT, with 8 vCPUs VM 8GB RAM, two VMs, > one running ebizzy -M, the other running 'stress --cpu 2': > > w/ boosting + w/o pv sched yield(vanilla) > > vanilla boosting improved > 1570 4000 55% > > w/ boosting + w/ pv sched yield(vanilla) > > vanilla boosting improved > 1844 5157 79% > > w/o boosting, perf top in VM: > > 72.33% [kernel] [k] smp_call_function_many > 4.22% [kernel] [k] call_function_i > 3.71% [kernel] [k] async_page_fault > > w/ boosting, perf top in VM: > > 38.43% [kernel] [k] smp_call_function_many > 6.31% [kernel] [k] async_page_fault > 6.13% libc-2.23.so [.]
__memcpy_avx_unaligned > 4.88% [kernel] [k] call_function_interrupt > > Cc: Paolo Bonzini > Cc: Radim Krčmář > Cc: Christian Borntraeger > Signed-off-by: Wanpeng Li > --- > virt/kvm/kvm_main.c | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c > index b4ab59d..2c46705 100644 > --- a/virt/kvm/kvm_main.c > +++ b/virt/kvm/kvm_main.c > @@ -2404,8 +2404,10 @@ void kvm_vcpu_kick(struct kvm_vcpu *vcpu) > int me; > int cpu = vcpu->cpu; > > - if (kvm_vcpu_wake_up(vcpu)) > + if (kvm_vcpu_wake_up(vcpu)) { > + vcpu->preempted = true; > return; > + } > > me = get_cpu(); > if (cpu != me && (unsigned)cpu < nr_cpu_ids && cpu_online(cpu)) > -- > 2.7.4 >
[PATCH 1/2] usb: dwc3: Use devres to get clocks
Use devres to get clocks and drop explicit clock freeing. No functional change intended. Signed-off-by: Andrey Smirnov Cc: Felipe Balbi Cc: Chris Healy Cc: Greg Kroah-Hartman Cc: linux-...@vger.kernel.org Cc: linux-kernel@vger.kernel.org --- drivers/usb/dwc3/core.c | 7 ++- 1 file changed, 2 insertions(+), 5 deletions(-) diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c index c9bb93a2c81e..768023a2553c 100644 --- a/drivers/usb/dwc3/core.c +++ b/drivers/usb/dwc3/core.c @@ -1436,7 +1436,7 @@ static int dwc3_probe(struct platform_device *pdev) if (dev->of_node) { dwc->num_clks = ARRAY_SIZE(dwc3_core_clks); - ret = clk_bulk_get(dev, dwc->num_clks, dwc->clks); + ret = devm_clk_bulk_get(dev, dwc->num_clks, dwc->clks); if (ret == -EPROBE_DEFER) return ret; /* @@ -1449,7 +1449,7 @@ static int dwc3_probe(struct platform_device *pdev) ret = reset_control_deassert(dwc->reset); if (ret) - goto put_clks; + return ret; ret = clk_bulk_prepare(dwc->num_clks, dwc->clks); if (ret) @@ -1536,8 +1536,6 @@ static int dwc3_probe(struct platform_device *pdev) clk_bulk_unprepare(dwc->num_clks, dwc->clks); assert_reset: reset_control_assert(dwc->reset); -put_clks: - clk_bulk_put(dwc->num_clks, dwc->clks); return ret; } @@ -1560,7 +1558,6 @@ static int dwc3_remove(struct platform_device *pdev) dwc3_free_event_buffers(dwc); dwc3_free_scratch_buffers(dwc); - clk_bulk_put(dwc->num_clks, dwc->clks); return 0; } -- 2.21.0
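For readers unfamiliar with devres, the idea behind dropping the explicit clk_bulk_put() calls can be sketched as a tiny userspace model. The names (struct model_device, model_devm_add, model_devm_release_all, model_clk_put) are hypothetical and this is not how the driver core is implemented; it only illustrates that a managed resource is tied to the device and released for the caller.

```c
#include <stdlib.h>

/* Hypothetical, heavily simplified record of one managed resource. */
struct model_devres {
	void (*release)(void *res);
	void *res;
	struct model_devres *next;
};

/* Hypothetical device that owns a list of managed resources. */
struct model_device {
	struct model_devres *managed;	/* most recently added entry first */
};

/* Registers a release callback; returns 0 on success, -1 if allocation fails. */
static int model_devm_add(struct model_device *dev,
			  void (*release)(void *res), void *res)
{
	struct model_devres *dr = malloc(sizeof(*dr));

	if (!dr)
		return -1;
	dr->release = release;
	dr->res = res;
	dr->next = dev->managed;
	dev->managed = dr;
	return 0;
}

/* Models probe failure or remove: releases everything, newest first. */
static void model_devm_release_all(struct model_device *dev)
{
	while (dev->managed) {
		struct model_devres *dr = dev->managed;

		dev->managed = dr->next;
		dr->release(dr->res);
		free(dr);
	}
}

/* Example release callback: marks a "clock held" flag as put. */
static void model_clk_put(void *res)
{
	*(int *)res = 0;
}
```

With this shape, an error path can simply return the error code, as the patch does with "return ret", because the release step no longer has to be spelled out at every exit.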
Re: [PATCH] waitqueue: fix clang -Wuninitialized warnings
On Tue, Jul 09, 2019 at 09:27:17PM +0200, Arnd Bergmann wrote: > On Wed, Jul 3, 2019 at 7:58 PM Nathan Chancellor > wrote: > > On Wed, Jul 03, 2019 at 10:10:55AM +0200, Arnd Bergmann wrote: > > > When CONFIG_LOCKDEP is set, every use of DECLARE_WAIT_QUEUE_HEAD_ONSTACK() > > > produces an annoying warning from clang, which is particularly annoying > > > for allmodconfig builds: > > > > > > fs/namei.c:1646:34: error: variable 'wq' is uninitialized when used > > > within its own initialization [-Werror,-Wuninitialized] > > > DECLARE_WAIT_QUEUE_HEAD_ONSTACK(wq); > > > ^~ > > > include/linux/wait.h:74:63: note: expanded from macro > > > 'DECLARE_WAIT_QUEUE_HEAD_ONSTACK' > > > struct wait_queue_head name = __WAIT_QUEUE_HEAD_INIT_ONSTACK(name) > > > ^~~~ > > > include/linux/wait.h:72:33: note: expanded from macro > > > '__WAIT_QUEUE_HEAD_INIT_ONSTACK' > > > ({ init_waitqueue_head(&name); name; }) > > >^~~~ > > > > > > After playing with it for a while, I have found a way to rephrase the > > > macro in a way that should work well with both gcc and clang and not > > > produce this warning. The open-coded __WAIT_QUEUE_HEAD_INIT_ONSTACK > > > is a little more verbose than the original version by Peter Zijlstra, > > > but avoids the gcc-ism that suppresses warnings when assigning a > > > variable to itself. > > > > > > Cc: Peter Zijlstra > > > Signed-off-by: Arnd Bergmann > > > > Reviewed-by: Nathan Chancellor > > Tested-by: Nathan Chancellor > > Who would be the right person to pick this patch up for mainline? That would be me; but like Andrew, I'm not a fan of this patch.
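The patch under discussion is not quoted in full here, so the following is only a rough illustration of the general idea and not the actual change: an on-stack initializer that uses the address of the variable being defined, never its value, does not read uninitialized storage. The struct and macro names are hypothetical.

```c
/* Hypothetical stand-in for a list head embedded in a wait queue head. */
struct model_list_head {
	struct model_list_head *next;
	struct model_list_head *prev;
};

/*
 * Uses only the ADDRESS of the variable being defined, never its value,
 * so nothing uninitialized is read while the initializer is evaluated.
 */
#define MODEL_LIST_HEAD_INIT(name) { &(name), &(name) }
#define MODEL_LIST_HEAD(name) \
	struct model_list_head name = MODEL_LIST_HEAD_INIT(name)

/* Returns 1 when the head points at itself, i.e. the list is empty. */
static int model_list_empty(const struct model_list_head *head)
{
	return head->next == head && head->prev == head;
}
```

By contrast, the quoted macro evaluates "name" by value inside its own initializer, which is the construct clang reports in the warning text above.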
[GIT PULL] Driver core patches for 5.3-rc1
The following changes since commit f2c7c76c5d0a443053e94adb9f0918fa2fb85c3a: Linux 5.2-rc3 (2019-06-02 13:55:33 -0700) are available in the Git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core.git tags/driver-core-5.3-rc1 for you to fetch changes up to c33d442328f556460b79aba6058adb37bb555389: debugfs: make error message a bit more verbose (2019-07-08 10:44:57 +0200) Driver Core and debugfs changes for 5.3-rc1 Here is the "big" driver core and debugfs changes for 5.3-rc1 It's a lot of different patches, all across the tree due to some api changes and lots of debugfs cleanups. Because of this, there is going to be some merge issues with your tree at the moment, I'll follow up with the expected resolutions to make it easier for you. Other than the debugfs cleanups, in this set of changes we have: - bus iteration function cleanups (will cause build warnings with s390 and coresight drivers in your tree) - scripts/get_abi.pl tool to display and parse Documentation/ABI entries in a simple way - cleanups to Documentation/ABI/ entries to make them parse easier due to typos and other minor things - default_attrs use for some ktype users - driver model documentation file conversions to .rst - compressed firmware file loading - deferred probe fixes All of these have been in linux-next for a while, with a bunch of merge issues that Stephen has been patient with me for.
Other than the merge issues, functionality is working properly in linux-next :) Signed-off-by: Greg Kroah-Hartman Anders Roxell (1): mm/zsmalloc.c: remove unused variable Arnd Bergmann (1): ARM: omap1: remove unused variable Colin Ian King (1): lkdtm: remove redundant initialization of ret Geert Uytterhoeven (2): tools/firmware: Add missing newline at end of file arch_topology: Remove error messages on out-of-memory conditions Greg Kroah-Hartman (53): zswap: ignore debugfs_create_dir() return value trace: no need to check return value of debugfs_create functions blktrace: no need to check return value of debugfs_create functions zsmalloc: no need to check return value of debugfs_create functions mm: kmemleak: no need to check return value of debugfs_create functions hwpoison-inject: no need to check return value of debugfs_create functions sh: no need to check return value of debugfs_create functions fail_function: no need to check return value of debugfs_create functions kprobes: no need to check return value of debugfs_create functions mm: cleancache: no need to check return value of debugfs_create functions backing-dev: no need to check return value of debugfs_create functions x86: xen: no need to check return value of debugfs_create functions arm: omap1: no need to check return value of debugfs_create functions arm: omap2: no need to check return value of debugfs_create functions arm: dump: no need to check return value of debugfs_create functions x86: mm: no need to check return value of debugfs_create functions x86: platform: no need to check return value of debugfs_create functions x86: kdebugfs: no need to check return value of debugfs_create functions gcov: no need to check return value of debugfs_create functions mailbox: no need to check return value of debugfs_create functions btrfs: no need to check return value of debugfs_create functions debugfs: make debugfs_create_u32_array() return void vmw_balloon: no need to check return value of debugfs_create 
functions lkdtm: no need to check return value of debugfs_create functions ti-st: no need to check return value of debugfs_create functions thermal: intel: no need to check return value of debugfs_create functions thermal: intel_powerclamp: no need to check return value of debugfs_create functions thermal: tegra: no need to check return value of debugfs_create functions cxl: no need to check return value of debugfs_create functions lib: dynamic_debug: no need to check return value of debugfs_create functions fault-inject: clean up debugfs file creation logic mic: no need to check return value of debugfs_create functions genwq: no need to check return value of debugfs_create functions mei: no need to check return value of debugfs_create functions coresight: cpu-debug: no need to check return value of debugfs_create functions watchdog: mei_wdt: no need to check return value of debugfs_create functions watchdog: bcm_kona_wdt: no need to check return value of debugfs_create functions 6lowpan: no need to check return value of debugfs_create functions power: avs: sma
Re: [GIT PULL] Driver core patches for 5.3-rc1
On Fri, Jul 12, 2019 at 09:36:23AM +0200, Greg KH wrote: > The following changes since commit f2c7c76c5d0a443053e94adb9f0918fa2fb85c3a: > > Linux 5.2-rc3 (2019-06-02 13:55:33 -0700) > > are available in the Git repository at: > > git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core.git > tags/driver-core-5.3-rc1 > > for you to fetch changes up to c33d442328f556460b79aba6058adb37bb555389: > > debugfs: make error message a bit more verbose (2019-07-08 10:44:57 +0200) > > > Driver Core and debugfs changes for 5.3-rc1 > > Here is the "big" driver core and debugfs changes for 5.3-rc1 > > It's a lot of different patches, all across the tree due to some api > changes and lots of debugfs cleanups. Because of this, there is going > to be some merge issues with your tree at the moment, I'll follow up > with the expected resolutions to make it easier for you. Here's the merge resolution patch that worked for me: diff --cc drivers/acpi/sleep.c index fcf4386ecc78,f0fe7c15d657.. --- a/drivers/acpi/sleep.c +++ b/drivers/acpi/sleep.c diff --cc drivers/misc/mei/debugfs.c index df6bf8b81936,47cfd5005e1b.. --- a/drivers/misc/mei/debugfs.c +++ b/drivers/misc/mei/debugfs.c @@@ -233,22 -154,46 +154,21 @@@ void mei_dbgfs_deregister(struct mei_de * * @dev: the mei device structure * @name: the mei device name - * - * Return: 0 on success, <0 on failure. 
*/ -int mei_dbgfs_register(struct mei_device *dev, const char *name) +void mei_dbgfs_register(struct mei_device *dev, const char *name) { - struct dentry *dir, *f; + struct dentry *dir; dir = debugfs_create_dir(name, NULL); - if (!dir) - return -ENOMEM; - dev->dbgfs_dir = dir; - f = debugfs_create_file("meclients", S_IRUSR, dir, - dev, &mei_dbgfs_meclients_fops); - if (!f) { - dev_err(dev->dev, "meclients: registration failed\n"); - goto err; - } - f = debugfs_create_file("active", S_IRUSR, dir, - dev, &mei_dbgfs_active_fops); - if (!f) { - dev_err(dev->dev, "active: registration failed\n"); - goto err; - } - f = debugfs_create_file("devstate", S_IRUSR, dir, - dev, &mei_dbgfs_devstate_fops); - if (!f) { - dev_err(dev->dev, "devstate: registration failed\n"); - goto err; - } - f = debugfs_create_file("allow_fixed_address", S_IRUSR | S_IWUSR, dir, - &dev->allow_fixed_address, - &mei_dbgfs_allow_fa_fops); - if (!f) { - dev_err(dev->dev, "allow_fixed_address: registration failed\n"); - goto err; - } - return 0; -err: - mei_dbgfs_deregister(dev); - return -ENODEV; + debugfs_create_file("meclients", S_IRUSR, dir, dev, - &mei_dbgfs_fops_meclients); ++ &mei_dbgfs_meclients_fops); + debugfs_create_file("active", S_IRUSR, dir, dev, - &mei_dbgfs_fops_active); ++ &mei_dbgfs_active_fops); + debugfs_create_file("devstate", S_IRUSR, dir, dev, - &mei_dbgfs_fops_devstate); ++ &mei_dbgfs_devstate_fops); + debugfs_create_file("allow_fixed_address", S_IRUSR | S_IWUSR, dir, + &dev->allow_fixed_address, - &mei_dbgfs_fops_allow_fa); ++ &mei_dbgfs_allow_fa_fops); } - diff --cc drivers/misc/vmw_balloon.c index fdf5ad757226,043eed845246.. 
--- a/drivers/misc/vmw_balloon.c +++ b/drivers/misc/vmw_balloon.c @@@ -1553,15 -1942,26 +1932,24 @@@ static int __init vmballoon_init(void if (x86_hyper_type != X86_HYPER_VMWARE) return -ENODEV; - for (page_size = VMW_BALLOON_4K_PAGE; -page_size <= VMW_BALLOON_LAST_SIZE; page_size++) - INIT_LIST_HEAD(&balloon.page_sizes[page_size].pages); - - INIT_DELAYED_WORK(&balloon.dwork, vmballoon_work); + error = vmballoon_register_shrinker(&balloon); + if (error) + goto fail; + - error = vmballoon_debugfs_init(&balloon); - if (error) - goto fail; + vmballoon_debugfs_init(&balloon); + /* +* Initialization of compaction must be done after the call to +* balloon_devinfo_init() . +*/ + balloon_devinfo_init(&balloon.b_dev_info); + error = vmballoon_compaction_init(&balloon); + if (error) + goto fail; + + INIT_LIST_HEAD(&balloon.huge_pages); spin_lock_init(&balloon.comm_lock); init_rwsem(&balloon.conf_sem); balloon.vmci_doorbell = VMCI_INVALID_HANDLE; * Unmerged pat
[PATCH RESEND 1/2] KVM: LAPIC: Add pv ipi tracepoint
From: Wanpeng Li Add pv ipi tracepoint. Cc: Paolo Bonzini Cc: Radim Krčmář Signed-off-by: Wanpeng Li --- arch/x86/kvm/lapic.c | 2 ++ arch/x86/kvm/trace.h | 25 + 2 files changed, 27 insertions(+) diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c index 42da7eb..403ae3f 100644 --- a/arch/x86/kvm/lapic.c +++ b/arch/x86/kvm/lapic.c @@ -562,6 +562,8 @@ int kvm_pv_send_ipi(struct kvm *kvm, unsigned long ipi_bitmap_low, irq.level = (icr & APIC_INT_ASSERT) != 0; irq.trig_mode = icr & APIC_INT_LEVELTRIG; + trace_kvm_pv_send_ipi(irq.vector, min, ipi_bitmap_low, ipi_bitmap_high); + if (icr & APIC_DEST_MASK) return -KVM_EINVAL; if (icr & APIC_SHORT_MASK) diff --git a/arch/x86/kvm/trace.h b/arch/x86/kvm/trace.h index b5c831e..ce6ee34 100644 --- a/arch/x86/kvm/trace.h +++ b/arch/x86/kvm/trace.h @@ -1462,6 +1462,31 @@ TRACE_EVENT(kvm_hv_send_ipi_ex, __entry->vector, __entry->format, __entry->valid_bank_mask) ); + +/* + * Tracepoints for kvm_pv_send_ipi. + */ +TRACE_EVENT(kvm_pv_send_ipi, + TP_PROTO(u32 vector, u32 min, unsigned long ipi_bitmap_low, unsigned long ipi_bitmap_high), + TP_ARGS(vector, min, ipi_bitmap_low, ipi_bitmap_high), + + TP_STRUCT__entry( + __field(u32, vector) + __field(u32, min) + __field(unsigned long, ipi_bitmap_low) + __field(unsigned long, ipi_bitmap_high) + ), + + TP_fast_assign( + __entry->vector = vector; + __entry->min = min; + __entry->ipi_bitmap_low = ipi_bitmap_low; + __entry->ipi_bitmap_high = ipi_bitmap_high; + ), + + TP_printk("vector %d min 0x%x ipi_bitmap_low 0x%lx ipi_bitmap_high 0x%lx", + __entry->vector, __entry->min, __entry->ipi_bitmap_low, __entry->ipi_bitmap_high) +); #endif /* _TRACE_KVM_H */ #undef TRACE_INCLUDE_PATH -- 2.7.4
[PATCH RESEND 2/2] KVM: X86: Add pv tlb shootdown tracepoint
From: Wanpeng Li Add pv tlb shootdown tracepoint. Cc: Paolo Bonzini Cc: Radim Krčmář Signed-off-by: Wanpeng Li --- arch/x86/kvm/trace.h | 19 +++ arch/x86/kvm/x86.c | 2 ++ 2 files changed, 21 insertions(+) diff --git a/arch/x86/kvm/trace.h b/arch/x86/kvm/trace.h index ce6ee34..84f32d3 100644 --- a/arch/x86/kvm/trace.h +++ b/arch/x86/kvm/trace.h @@ -1487,6 +1487,25 @@ TRACE_EVENT(kvm_pv_send_ipi, TP_printk("vector %d min 0x%x ipi_bitmap_low 0x%lx ipi_bitmap_high 0x%lx", __entry->vector, __entry->min, __entry->ipi_bitmap_low, __entry->ipi_bitmap_high) ); + +TRACE_EVENT(kvm_pv_tlb_flush, + TP_PROTO(unsigned int vcpu_id, bool need_flush_tlb), + TP_ARGS(vcpu_id, need_flush_tlb), + + TP_STRUCT__entry( + __field(unsigned int, vcpu_id ) + __field(bool, need_flush_tlb ) + ), + + TP_fast_assign( + __entry->vcpu_id= vcpu_id; + __entry->need_flush_tlb = need_flush_tlb; + ), + + TP_printk("vcpu %u need_flush_tlb %s", __entry->vcpu_id, + __entry->need_flush_tlb ? "true" : "false") +); + #endif /* _TRACE_KVM_H */ #undef TRACE_INCLUDE_PATH diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 2c32311..f487c9a 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -2458,6 +2458,8 @@ static void record_steal_time(struct kvm_vcpu *vcpu) * Doing a TLB flush here, on the guest's behalf, can avoid * expensive IPIs. */ + trace_kvm_pv_tlb_flush(vcpu->vcpu_id, + vcpu->arch.st.steal.preempted & KVM_VCPU_FLUSH_TLB); if (xchg(&vcpu->arch.st.steal.preempted, 0) & KVM_VCPU_FLUSH_TLB) kvm_vcpu_flush_tlb(vcpu, false); -- 2.7.4
Re: [GIT PULL] Driver core patches for 5.3-rc1
On Fri, Jul 12, 2019 at 09:36:23AM +0200, Greg KH wrote: > The following changes since commit f2c7c76c5d0a443053e94adb9f0918fa2fb85c3a: > > Linux 5.2-rc3 (2019-06-02 13:55:33 -0700) > > are available in the Git repository at: > > git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core.git > tags/driver-core-5.3-rc1 > > for you to fetch changes up to c33d442328f556460b79aba6058adb37bb555389: > > debugfs: make error message a bit more verbose (2019-07-08 10:44:57 +0200) > > > Driver Core and debugfs changes for 5.3-rc1 > > Here is the "big" driver core and debugfs changes for 5.3-rc1 > > It's a lot of different patches, all across the tree due to some api > changes and lots of debugfs cleanups. Because of this, there is going > to be some merge issues with your tree at the moment, I'll follow up > with the expected resolutions to make it easier for you. > > Other than the debugfs cleanups, in this set of changes we have: > - bus iteration function cleanups (will cause build warnings > with s390 and coresight drivers in your tree) Here's the s390 patch that was sent previously to resolve this issue. From: Christian Borntraeger commit 92ce7e83b4e5 ("driver_find_device: Unify the match function with class_find_device()") changed the prototype of driver_find_device to use a const void pointer. Change match_apqn accordingly. 
Fixes: ec89b55e3bce ("s390: ap: implement PAPQ AQIC interception in kernel") Signed-off-by: Christian Borntraeger Signed-off-by: Vasily Gorbik --- drivers/s390/crypto/vfio_ap_ops.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c index 2c9fb1423a39..7e85ba7c6ef0 100644 --- a/drivers/s390/crypto/vfio_ap_ops.c +++ b/drivers/s390/crypto/vfio_ap_ops.c @@ -26,7 +26,7 @@ static int vfio_ap_mdev_reset_queues(struct mdev_device *mdev); -static int match_apqn(struct device *dev, void *data) +static int match_apqn(struct device *dev, const void *data) { struct vfio_ap_queue *q = dev_get_drvdata(dev); -- 2.21.0
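The prototype change above can be illustrated with a small userspace sketch. The types and functions here (struct model_queue_dev, model_match_fn, model_find_device, model_match_apqn) are hypothetical stand-ins rather than the driver core API; the point is only that once the finder's callback type takes const void *, every match function passed to it has to use the same signature.

```c
#include <stddef.h>

/* Hypothetical stand-in; not the kernel's struct device. */
struct model_queue_dev {
	int apqn;
};

/* Callback type: the match data is read-only, hence const void *. */
typedef int (*model_match_fn)(struct model_queue_dev *dev, const void *data);

/* Returns the first device for which match() is non-zero, or NULL. */
static struct model_queue_dev *model_find_device(struct model_queue_dev *devs,
						 size_t count, const void *data,
						 model_match_fn match)
{
	size_t i;

	for (i = 0; i < count; i++)
		if (match(&devs[i], data))
			return &devs[i];
	return NULL;
}

/* Matches on the queue number; data is only read, never written. */
static int model_match_apqn(struct model_queue_dev *dev, const void *data)
{
	return dev->apqn == *(const int *)data;
}
```

A match function still declared with a plain void *data parameter would have a different function pointer type than model_match_fn, which is the same kind of mismatch the clang error in the coresight follow-up reports.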
Re: [GIT PULL] Driver core patches for 5.3-rc1
On Fri, Jul 12, 2019 at 09:36:23AM +0200, Greg KH wrote: > The following changes since commit f2c7c76c5d0a443053e94adb9f0918fa2fb85c3a: > > Linux 5.2-rc3 (2019-06-02 13:55:33 -0700) > > are available in the Git repository at: > > git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core.git > tags/driver-core-5.3-rc1 > > for you to fetch changes up to c33d442328f556460b79aba6058adb37bb555389: > > debugfs: make error message a bit more verbose (2019-07-08 10:44:57 +0200) > > > Driver Core and debugfs changes for 5.3-rc1 > > Here is the "big" driver core and debugfs changes for 5.3-rc1 > > It's a lot of different patches, all across the tree due to some api > changes and lots of debugfs cleanups. Because of this, there is going > to be some merge issues with your tree at the moment, I'll follow up > with the expected resolutions to make it easier for you. > > Other than the debugfs cleanups, in this set of changes we have: > - bus iteration function cleanups (will cause build warnings > with s390 and coresight drivers in your tree) And here is the patch that should resolve the coresight build issue. From: Nathan Chancellor Date: Mon, 1 Jul 2019 11:28:08 -0700 Subject: [PATCH] coresight: Make the coresight_device_fwnode_match declaration's fwnode parameter const drivers/hwtracing/coresight/coresight.c:1051:11: error: incompatible pointer types passing 'int (struct device *, void *)' to parameter of type 'int (*)(struct device *, const void *)' [-Werror,-Wincompatible-pointer-types] coresight_device_fwnode_match); ^ include/linux/device.h:173:17: note: passing argument to parameter 'match' here int (*match)(struct device *dev, const void *data)); ^ 1 error generated. 
Signed-off-by: Nathan Chancellor --- drivers/hwtracing/coresight/coresight-priv.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/hwtracing/coresight/coresight-priv.h b/drivers/hwtracing/coresight/coresight-priv.h index 8b07fe55395a..7d401790dd7e 100644 --- a/drivers/hwtracing/coresight/coresight-priv.h +++ b/drivers/hwtracing/coresight/coresight-priv.h @@ -202,6 +202,6 @@ static inline void *coresight_get_uci_data(const struct amba_id *id) void coresight_release_platform_data(struct coresight_platform_data *pdata); -int coresight_device_fwnode_match(struct device *dev, void *fwnode); +int coresight_device_fwnode_match(struct device *dev, const void *fwnode); #endif -- 2.22.0 -- Cheers, Stephen Rothwell
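The build error quoted above comes down to a callback signature: the iteration helper now expects the match data as `const void *`. A minimal user-space sketch of that pattern follows. The names `toy_device`, `toy_find_device` and `match_by_id` are hypothetical and only illustrate the idea; they are not kernel APIs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct device; not the kernel type. */
struct toy_device {
	int id;
};

/* Match callbacks take const data: they may inspect it, never modify it. */
typedef int (*toy_match_fn)(struct toy_device *dev, const void *data);

/* Toy analogue of a bus iteration helper: return the first matching device. */
static struct toy_device *toy_find_device(struct toy_device *devs, size_t n,
					   const void *data, toy_match_fn match)
{
	assert(match != NULL);

	for (size_t i = 0; i < n; i++)
		if (match(&devs[i], data))
			return &devs[i];
	return NULL;
}

/*
 * A callback declared with a plain "void *data" parameter would have an
 * incompatible function pointer type, which is the kind of error the
 * compiler reports in the message above.
 */
static int match_by_id(struct toy_device *dev, const void *data)
{
	const int *wanted = data;

	return dev->id == *wanted;
}
```

Only the declaration changes in such a fix; a callback body that merely reads the data needs no other modification.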
Re: [PATCH] xen/pv: Fix a boot up hang triggered by int3 self test
Sorry for the noise, it looks like the description is wrong. This is not a double pop, but xen pv taking the path with create_gap=0. I'll send a v2.

Zhenzhong

On 2019/7/11 12:47, Zhenzhong Duan wrote: Commit 7457c0da024b ("x86/alternatives: Add int3_emulate_call() selftest") reveals a bug in XEN PV int3 assembly code. There is a double pop of register R11 and RCX corrupting the exception frame, one in xen_int3 and the other in xen_xenint3. We see below hang at bootup: general protection fault: [#1] SMP NOPTI CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.2.0+ #6 RIP: e030:int3_magic+0x0/0x7 Call Trace: alternative_instructions+0x3d/0x12e check_bugs+0x7c9/0x887 ?__get_locked_pte+0x178/0x1f0 start_kernel+0x4ff/0x535 ?set_init_arg+0x55/0x55 xen_start_kernel+0x571/0x57a Fix it by removing xen_xenint3. Signed-off-by: Zhenzhong Duan Cc: Boris Ostrovsky Cc: Juergen Gross Cc: Stefano Stabellini Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: Ingo Molnar Cc: Borislav Petkov --- arch/x86/include/asm/traps.h | 2 +- arch/x86/xen/enlighten_pv.c | 2 +- arch/x86/xen/xen-asm_64.S| 1 - 3 files changed, 2 insertions(+), 3 deletions(-) diff --git a/arch/x86/include/asm/traps.h b/arch/x86/include/asm/traps.h index 7d6f3f3..f2bd284 100644 --- a/arch/x86/include/asm/traps.h +++ b/arch/x86/include/asm/traps.h @@ -40,7 +40,7 @@ asmlinkage void xen_divide_error(void); asmlinkage void xen_xennmi(void); asmlinkage void xen_xendebug(void); -asmlinkage void xen_xenint3(void); +asmlinkage void xen_int3(void); asmlinkage void xen_overflow(void); asmlinkage void xen_bounds(void); asmlinkage void xen_invalid_op(void); diff --git a/arch/x86/xen/enlighten_pv.c b/arch/x86/xen/enlighten_pv.c index 4722ba2..2138d69 100644 --- a/arch/x86/xen/enlighten_pv.c +++ b/arch/x86/xen/enlighten_pv.c @@ -596,7 +596,7 @@ struct trap_array_entry { static struct trap_array_entry trap_array[] = { { debug, xen_xendebug,true }, - { int3,xen_xenint3, true }, + { int3,xen_int3,true }, { double_fault,xen_double_fault,true }, #ifdef 
CONFIG_X86_MCE { machine_check, xen_machine_check, true }, diff --git a/arch/x86/xen/xen-asm_64.S b/arch/x86/xen/xen-asm_64.S index 1e9ef0b..ebf610b 100644 --- a/arch/x86/xen/xen-asm_64.S +++ b/arch/x86/xen/xen-asm_64.S @@ -32,7 +32,6 @@ xen_pv_trap divide_error xen_pv_trap debug xen_pv_trap xendebug xen_pv_trap int3 -xen_pv_trap xenint3 xen_pv_trap xennmi xen_pv_trap overflow xen_pv_trap bounds
Re: [RFC v2 01/26] mm/x86: Introduce kernel address space isolation
On 7/11/19 11:33 PM, Thomas Gleixner wrote: On Thu, 11 Jul 2019, Alexandre Chartre wrote: +/* + * When isolation is active, the address space doesn't necessarily map + * the percpu offset value (this_cpu_off) which is used to get pointers + * to percpu variables. So functions which can be invoked while isolation + * is active shouldn't be getting pointers to percpu variables (i.e. with + * get_cpu_var() or this_cpu_ptr()). Instead percpu variable should be + * directly read or written to (i.e. with this_cpu_read() or + * this_cpu_write()). + */ + +int asi_enter(struct asi *asi) +{ + enum asi_session_state state; + struct asi *current_asi; + struct asi_session *asi_session; + + state = this_cpu_read(cpu_asi_session.state); + /* +* We can re-enter isolation, but only with the same ASI (we don't +* support nesting isolation). Also, if isolation is still active, +* then we should be re-entering with the same task. +*/ + if (state == ASI_SESSION_STATE_ACTIVE) { + current_asi = this_cpu_read(cpu_asi_session.asi); + if (current_asi != asi) { + WARN_ON(1); + return -EBUSY; + } + WARN_ON(this_cpu_read(cpu_asi_session.task) != current); + return 0; + } + + /* isolation is not active so we can safely access the percpu pointer */ + asi_session = &get_cpu_var(cpu_asi_session); get_cpu_var()?? Where is the matching put_cpu_var() ? get_cpu_var() contains a preempt_disable ... What's wrong with a simple this_cpu_ptr() here? Oups, my mistake, I should be using this_cpu_ptr(). I will replace all get_cpu_var() with this_cpu_ptr(). +void asi_exit(struct asi *asi) +{ + struct asi_session *asi_session; + enum asi_session_state asi_state; + unsigned long original_cr3; + + asi_state = this_cpu_read(cpu_asi_session.state); + if (asi_state == ASI_SESSION_STATE_INACTIVE) + return; + + /* TODO: Kick sibling hyperthread before switching to kernel cr3 */ + original_cr3 = this_cpu_read(cpu_asi_session.original_cr3); + if (original_cr3) Why would this be 0 if the session is active? 
Correct, original_cr3 won't be 0. I think this is a remnant from a previous version where original_cr3 was handled differently. + write_cr3(original_cr3); + + /* page-table was switched, we can now access the percpu pointer */ + asi_session = &get_cpu_var(cpu_asi_session); See above. Will fix that. Thanks, alex. + WARN_ON(asi_session->task != current); + asi_session->state = ASI_SESSION_STATE_INACTIVE; + asi_session->asi = NULL; + asi_session->task = NULL; + asi_session->original_cr3 = 0; +} Thanks, tglx
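The review point about get_cpu_var() can be illustrated with a small user-space model. This is only a sketch under simplifying assumptions: a single "CPU", a plain counter standing in for the preempt count, and hypothetical `toy_*` helpers that mimic the pairing rules of get_cpu_var()/put_cpu_var() and this_cpu_ptr() without being the kernel macros.

```c
#include <assert.h>

/* Toy model: one "CPU", a preempt counter and one per-CPU variable. */
static int toy_preempt_count;
static struct toy_session { int state; } toy_cpu_session;

/* Models get_cpu_var(): disables preemption, then yields the variable. */
static struct toy_session *toy_get_cpu_var(void)
{
	toy_preempt_count++;
	return &toy_cpu_session;
}

/* Models put_cpu_var(): re-enables preemption. */
static void toy_put_cpu_var(void)
{
	assert(toy_preempt_count > 0);
	toy_preempt_count--;
}

/* Models this_cpu_ptr(): no preempt_disable(), so nothing to pair. */
static struct toy_session *toy_this_cpu_ptr(void)
{
	return &toy_cpu_session;
}

/* Pattern questioned in the review: get without put leaks one disable. */
static void toy_exit_unbalanced(void)
{
	struct toy_session *s = toy_get_cpu_var();

	s->state = 0;
}

/* Pattern agreed on in the reply: plain pointer access. */
static void toy_exit_fixed(void)
{
	struct toy_session *s = toy_this_cpu_ptr();

	s->state = 0;
}
```

In this model the unbalanced variant leaves the counter at 1 after every call, which is the imbalance the question "Where is the matching put_cpu_var()?" points at.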
Re: [PATCH] waitqueue: fix clang -Wuninitialized warnings
On Fri, Jul 12, 2019 at 2:49 AM Andrew Morton wrote: > On Wed, 3 Jul 2019 10:10:55 +0200 Arnd Bergmann wrote: > > > Surely clang is being extraordinarily dumb here? > > DECLARE_WAIT_QUEUE_HEAD_ONSTACK() is effectively doing > > struct wait_queue_head name = ({ __init_waitqueue_head(&name) ; name; > }) > > which is perfectly legitimate! clang has no business assuming that > __init_waitqueue_head() will do any reads from the pointer which it was > passed, nor can clang assume that __init_waitqueue_head() leaves any of > *name uninitialized. > > Does it also warn if code does this? > > struct wait_queue_head name; > __init_waitqueue_head(&name); > name = name; > > which is equivalent, isn't it? No, it does not warn for this. I've tried a few more variants here: https://godbolt.org/z/ykSX0r What I think is going on here is a result of clang and gcc fundamentally treating -Wuninitialized warnings differently. gcc tries to make the warnings as helpful as possible, but given the NP-complete nature of this problem it won't always get it right, and it traditionally allowed this syntax as a workaround. int f(void) { int i = i; // tell gcc not to warn return i; } clang apparently implements the warnings in a way that is as completely predictable (and won't warn in cases that it doesn't completely understand), but decided as a result that the gcc 'int i = i' syntax is bogus and it always warns about a variable used in its own declaration that is later referenced, without looking at whether the declaration does initialize it or not. > The proposed solution is, effectively, to open-code > __init_waitqueue_head() at each DECLARE_WAIT_QUEUE_HEAD_ONSTACK() > callsite. That's pretty unpleasant and calls for an explanatory > comment at the __WAIT_QUEUE_HEAD_INIT_ONSTACK() definition site as well > as a cautionary comment at the __init_waitqueue_head() definition so we > can keep the two versions in sync as code evolves. Yes, makes sense. 
> Hopefully clang will soon be hit with the cluebat (yes?) and this > change becomes obsolete in the quite short term. Surely 6-12 months > from now nobody will be using the uncluebatted version of clang on > contemporary kernel sources so we get to remove this nastiness again. > Which makes me wonder whether we should merge it at all. Would it make you feel better to keep the current code but have an alternative version guarded with e.g. "#if defined(__clang__) && (__clang_major__ <= 9)"? While it is probably a good idea to fix clang here, this is one of the last issues that causes a significant difference between gcc and clang in build testing with kernelci: https://kernelci.org/build/next/branch/master/kernel/next-20190709/ I'm trying to get all the warnings fixed there so we can spot build-time regressions more easily. Arnd
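The "open-coded" alternative discussed above can be sketched in user space. The types and macros below (`toy_wq_head`, `TOY_DECLARE_WQ_HEAD_ONSTACK`, and so on) are hypothetical stand-ins, assuming the fix amounts to an initializer that only takes the address of the variable being declared and never reads its value.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature versions of a list head and a wait queue head. */
struct toy_list_head {
	struct toy_list_head *next, *prev;
};

struct toy_wq_head {
	int lock;
	struct toy_list_head head;
};

/* Open-coded initializer: uses only the address of name, never its value. */
#define TOY_WQ_HEAD_INIT_ONSTACK(name) \
	{ .lock = 0, .head = { &(name).head, &(name).head } }

#define TOY_DECLARE_WQ_HEAD_ONSTACK(name) \
	struct toy_wq_head name = TOY_WQ_HEAD_INIT_ONSTACK(name)

/* An empty queue has a list head that points back at itself. */
static int toy_wq_empty(const struct toy_wq_head *wq)
{
	assert(wq != NULL);
	return wq->head.next == &wq->head && wq->head.prev == &wq->head;
}
```

The cost mentioned in the thread is visible even here: the macro duplicates what an init function would do, so the two have to be kept in sync by hand.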
Re: linux-next: manual merge of the char-misc tree with the driver-core tree
On Tue, Jul 09, 2019 at 09:20:03AM +1000, Stephen Rothwell wrote: > Hi all, > > On Thu, 13 Jun 2019 15:53:44 +1000 Stephen Rothwell > wrote: > > > > Today's linux-next merge of the char-misc tree got a conflict in: > > > > drivers/misc/vmw_balloon.c > > > > between commit: > > > > 225afca60b8a ("vmw_balloon: no need to check return value of > > debugfs_create functions") > > > > from the driver-core tree and commits: > > > > 83a8afa72e9c ("vmw_balloon: Compaction support") > > 5d1a86ecf328 ("vmw_balloon: Add memory shrinker") > > > > from the char-misc tree. > > > > I fixed it up (see below) and can carry the fix as necessary. This > > is now fixed as far as linux-next is concerned, but any non trivial > > conflicts should be mentioned to your upstream maintainer when your tree > > is submitted for merging. You may also want to consider cooperating > > with the maintainer of the conflicting tree to minimise any particularly > > complex conflicts. > > > > -- > > Cheers, > > Stephen Rothwell > > > > diff --cc drivers/misc/vmw_balloon.c > > index fdf5ad757226,043eed845246.. > > --- a/drivers/misc/vmw_balloon.c > > +++ b/drivers/misc/vmw_balloon.c > > @@@ -1553,15 -1942,26 +1932,24 @@@ static int __init vmballoon_init(void > > if (x86_hyper_type != X86_HYPER_VMWARE) > > return -ENODEV; > > > > - for (page_size = VMW_BALLOON_4K_PAGE; > > -page_size <= VMW_BALLOON_LAST_SIZE; page_size++) > > - INIT_LIST_HEAD(&balloon.page_sizes[page_size].pages); > > - > > - > > INIT_DELAYED_WORK(&balloon.dwork, vmballoon_work); > > > > + error = vmballoon_register_shrinker(&balloon); > > + if (error) > > + goto fail; > > + > > - error = vmballoon_debugfs_init(&balloon); > > - if (error) > > - goto fail; > > + vmballoon_debugfs_init(&balloon); > > > > + /* > > +* Initialization of compaction must be done after the call to > > +* balloon_devinfo_init() . 
> > +*/ > > + balloon_devinfo_init(&balloon.b_dev_info); > > + error = vmballoon_compaction_init(&balloon); > > + if (error) > > + goto fail; > > + > > + INIT_LIST_HEAD(&balloon.huge_pages); > > spin_lock_init(&balloon.comm_lock); > > init_rwsem(&balloon.conf_sem); > > balloon.vmci_doorbell = VMCI_INVALID_HANDLE; > > I am still getting this conflict (the commit ids may have changed). > Just a reminder in case you think Linus may need to know. Ok, I sent off the pull request for the driver core tree now. I had all of my other trees merged "first" so that all of the conflicts would happen just once here. Hopefully I've pointed out all of the potential and real problems with this merge. Ugh, this was a messy one, sorry about all of this, full-tree api changes and cleanups are a pain at times. thanks, greg k-h
Re: [RFC v2 02/26] mm/asi: Abort isolation on interrupt, exception and context switch
On 7/12/19 2:05 AM, Andy Lutomirski wrote:

On Jul 11, 2019, at 8:25 AM, Alexandre Chartre wrote:

Address space isolation should be aborted if there is an interrupt, an exception or a context switch. Interrupt/exception handlers and context switch code need to run with the full kernel address space. Address space isolation is aborted by restoring the original CR3 value used before entering address space isolation.

NAK to the entry changes. That code you’re changing is already known to be a bit buggy, and it’s spaghetti. PeterZ and I are gradually working on fixing some bugs and C-ifying it. ASI can go on top.

Agree this is spaghetti and I will be happy to move ASI on top. I will keep an eye out for your changes, and I will change the ASI code accordingly.

Thanks,

alex.
Re: objtool crashes on clang output (drivers/hwmon/pmbus/adm1275.o)
On Thu, Jul 11, 2019 at 11:29 PM Arnd Bergmann wrote: > > On Thu, Jul 11, 2019 at 11:05 PM 'Jann Horn' via Clang Built Linux > wrote: > > I was playing around with building the kernel with LLVM a few months > > ago and used this local patch, but didn't get around to submitting > > upstream because I couldn't reproduce the problem for some reason. I > > think the warnings you're getting sound like what I saw back then: > > https://gist.github.com/thejh/0434662728afb95d72455bf30ece5817 > > > > Quoting the commit message from that patch: > > > > > > With clang from git master, code can be generated where a function contains > > two indirect jump instructions that use the same switch table. To deal with > > this case and similar ones properly, convert the switch table parsing to > > use two passes: > > > > > > Does that sound like what you're seeing? > > Yes, that is exactly right, and your patch seems to address the problem > for the cases I tried so far (will know more after a night of randconfig > testing). I no longer see any of the "can't find switch jump table" in last night's randconfig builds. 
I do see one other rare warning, see attached object file:

fs/reiserfs/do_balan.o: warning: objtool: replace_key()+0x158: stack state mismatch: cfa1=7+40 cfa2=7+56
fs/reiserfs/do_balan.o: warning: objtool: balance_leaf()+0x2791: stack state mismatch: cfa1=7+176 cfa2=7+192
fs/reiserfs/ibalance.o: warning: objtool: balance_internal()+0xe8f: stack state mismatch: cfa1=7+240 cfa2=7+248
fs/reiserfs/ibalance.o: warning: objtool: internal_move_pointers_items()+0x36f: stack state mismatch: cfa1=7+152 cfa2=7+144
fs/reiserfs/lbalance.o: warning: objtool: leaf_cut_from_buffer()+0x58b: stack state mismatch: cfa1=7+128 cfa2=7+112
fs/reiserfs/lbalance.o: warning: objtool: leaf_copy_boundary_item()+0x7a9: stack state mismatch: cfa1=7+104 cfa2=7+96
fs/reiserfs/lbalance.o: warning: objtool: leaf_copy_items_entirely()+0x3d2: stack state mismatch: cfa1=7+120 cfa2=7+128

I suspect this comes from the calls to the __reiserfs_panic() noreturn function, but have not actually looked at the object file.

Arnd

lbalance.o Description: application/object
Re: [PATCH 4/4] numa: introduce numa cling feature
On Fri, Jul 12, 2019 at 11:10:08AM +0800, 王贇 wrote: > On 2019/7/11 下午10:27, Peter Zijlstra wrote: > >> Thus we introduce the numa cling, which try to prevent tasks leaving > >> the preferred node on wakeup fast path. > > > > > >> @@ -6195,6 +6447,13 @@ static int select_idle_sibling(struct task_struct > >> *p, int prev, int target) > >>if ((unsigned)i < nr_cpumask_bits) > >>return i; > >> > >> + /* > >> + * Failed to find an idle cpu, wake affine may want to pull but > >> + * try stay on prev-cpu when the task cling to it. > >> + */ > >> + if (task_numa_cling(p, cpu_to_node(prev), cpu_to_node(target))) > >> + return prev; > >> + > >>return target; > >> } > > > > Select idle sibling should never cross node boundaries and is thus the > > entirely wrong place to fix anything. > > Hmm.. in our early testing the printk show both select_task_rq_fair() and > task_numa_find_cpu() will call select_idle_sibling with prev and target on > different node, thus we pick this point to save few lines. But it will never return @prev if it is not in the same cache domain as @target. See how everything is gated by: && cpus_share_cache(x, target) > But if the semantics of select_idle_sibling() is to return cpu on the same > node of target, what about move the logical after select_idle_sibling() for > the two callers? No, that's insane. You don't do select_idle_sibling() to then ignore the result. You have to change @target before calling select_idle_sibling().
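The argument above (every candidate is gated on sharing a cache with @target, so any NUMA preference has to be applied by changing @target first) can be sketched with a toy model. Everything here is an assumption made for illustration: a four-CPU topology, a heavily simplified `toy_select_idle_sibling()`, and a hypothetical `toy_pick_cpu()` wrapper. None of this is the scheduler's real code.

```c
#include <assert.h>

#define TOY_NR_CPUS 4

/* Toy topology: CPUs 0-1 share one cache (node 0), CPUs 2-3 another (node 1). */
static const int toy_llc_id[TOY_NR_CPUS] = { 0, 0, 1, 1 };
static int toy_cpu_idle[TOY_NR_CPUS];

static int toy_cpus_share_cache(int a, int b)
{
	return toy_llc_id[a] == toy_llc_id[b];
}

/* Simplified: every candidate is gated on sharing a cache with target. */
static int toy_select_idle_sibling(int prev, int target)
{
	assert(prev >= 0 && prev < TOY_NR_CPUS);
	assert(target >= 0 && target < TOY_NR_CPUS);

	if (toy_cpu_idle[target])
		return target;
	if (prev != target && toy_cpus_share_cache(prev, target) &&
	    toy_cpu_idle[prev])
		return prev;
	for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++)
		if (toy_cpus_share_cache(cpu, target) && toy_cpu_idle[cpu])
			return cpu;
	return target;
}

/* The review's point: express the preference by choosing target first. */
static int toy_pick_cpu(int prev, int target, int cling_to_prev)
{
	if (cling_to_prev)
		target = prev;
	return toy_select_idle_sibling(prev, target);
}
```

In this model the result always stays in the cache domain of whatever target was passed in, so overriding the return value afterwards would throw away the idle search, while adjusting the target beforehand keeps it.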
[PATCH] perf diff: Report noisy for cycles diff
This patch prints the stddev and hist for the cycles diff of program block. It can help us to understand if the cycles diff is noisy or not.

This patch is inspired by Andi Kleen's patch https://lwn.net/Articles/600471/

We create new option '-n or --noisy'.

Example:

perf record -b ./div
perf record -b ./div
perf diff -c cycles

# Event 'cycles'
#
# Baseline  [Program Block Range]               Cycles Diff  Shared Object  Symbol
# ........  ..................................  ...........  .............  ................
#
    46.42%  [div.c:40 -> div.c:40]                        0  div            [.] main
    46.42%  [div.c:42 -> div.c:44]                        0  div            [.] main
    46.42%  [div.c:42 -> div.c:39]                        0  div            [.] main
    20.72%  [random_r.c:357 -> random_r.c:394]           -2  libc-2.27.so   [.] __random_r
    20.72%  [random_r.c:357 -> random_r.c:380]           -1  libc-2.27.so   [.] __random_r
    20.72%  [random_r.c:388 -> random_r.c:388]            0  libc-2.27.so   [.] __random_r
    20.72%  [random_r.c:388 -> random_r.c:391]            0  libc-2.27.so   [.] __random_r
    17.58%  [random.c:288 -> random.c:291]                0  libc-2.27.so   [.] __random
    17.58%  [random.c:291 -> random.c:291]                0  libc-2.27.so   [.] __random
    17.58%  [random.c:293 -> random.c:293]                0  libc-2.27.so   [.] __random
    17.58%  [random.c:295 -> random.c:295]                0  libc-2.27.so   [.] __random
    17.58%  [random.c:295 -> random.c:295]                0  libc-2.27.so   [.] __random
    17.58%  [random.c:298 -> random.c:298]                0  libc-2.27.so   [.] __random
     8.33%  [div.c:22 -> div.c:25]                        0  div            [.] compute_flag
     8.33%  [div.c:27 -> div.c:28]                        0  div            [.] compute_flag
     4.80%  [rand.c:26 -> rand.c:27]                      0  libc-2.27.so   [.] rand
     4.80%  [rand.c:28 -> rand.c:28]                      0  libc-2.27.so   [.] rand
     2.14%  [rand@plt+0 -> rand@plt+0]                    0  div            [.] rand@plt

When we enable the option '-n', the output is

perf diff -c cycles -n

# Event 'cycles'
#
# Baseline  [Program Block Range]/Cycles Diff/stddev/Hist            Shared Object  Symbol
# ........  .......................................................  .............  ................
#
    46.42%  [div.c:40 -> div.c:40]               0  ± 40.2%  ▂███▁▂▁▁  div            [.] main
    46.42%  [div.c:42 -> div.c:44]               0  ±100.0%  █▁▁▁      div            [.] main
    46.42%  [div.c:42 -> div.c:39]               0  ± 15.3%  ▃▃▂▆▃▂█▁  div            [.] main
    20.72%  [random_r.c:357 -> random_r.c:394]  -2  ± 20.1%  ▁▄▄▅▂▅█▁  libc-2.27.so   [.] __random_r
    20.72%  [random_r.c:357 -> random_r.c:380]  -1  ± 20.9%  ▁▆▇▁█▅▇█  libc-2.27.so   [.] __random_r
    20.72%  [random_r.c:388 -> random_r.c:388]   0  ±  0.0%            libc-2.27.so   [.] __random_r
    20.72%  [random_r.c:388 -> random_r.c:391]   0  ± 88.0%  ▁▁▁█      libc-2.27.so   [.] __random_r
    17.58%  [random.c:288 -> random.c:291]       0  ± 29.3%  ▁▁█▁      libc-2.27.so   [.] __random
    17.58%  [random.c:291 -> random.c:291]       0  ± 29.3%  ▁▁▁█      libc-2.27.so   [.] __random
    17.58%  [random.c:293 -> random.c:293]       0  ± 29.3%  ▁▁▁█      libc-2.27.so   [.] __random
    17.58%  [random.c:295 -> random.c:295]       0  ±  0.0%            libc-2.27.so   [.] __random
    17.58%  [random.c:295 -> random.c:295]       0  ±  0.0%            libc-2.27.so   [.] __random
    17.58%  [random.c:298 -> random.c:298]       0  ±  0.0%            libc-2.27.so   [.] __random
     8.33%  [div.c:22 -> div.c:25]               0  ± 29.3%  ▁▁█▁      div            [.] compute_flag
     8.33%  [div.c:27 -> div.c:28]               0  ± 48.8%  ▁██▁▁▁█▁  div
Re: [PATCH] waitqueue: fix clang -Wuninitialized warnings
On Fri, Jul 12, 2019 at 09:45:06AM +0200, Arnd Bergmann wrote: > On Fri, Jul 12, 2019 at 2:49 AM Andrew Morton > wrote: > > On Wed, 3 Jul 2019 10:10:55 +0200 Arnd Bergmann wrote: > > > > > > > Surely clang is being extraordinarily dumb here? > > > > DECLARE_WAIT_QUEUE_HEAD_ONSTACK() is effectively doing > > > > struct wait_queue_head name = ({ __init_waitqueue_head(&name) ; > > name; }) > > > > which is perfectly legitimate! clang has no business assuming that > > __init_waitqueue_head() will do any reads from the pointer which it was > > passed, nor can clang assume that __init_waitqueue_head() leaves any of > > *name uninitialized. > > > > Does it also warn if code does this? > > > > struct wait_queue_head name; > > __init_waitqueue_head(&name); > > name = name; > > > > which is equivalent, isn't it? > > No, it does not warn for this. > > I've tried a few more variants here: https://godbolt.org/z/ykSX0r > > What I think is going on here is a result of clang and gcc fundamentally > treating -Wuninitialized warnings differently. gcc tries to make the warnings > as helpful as possible, but given the NP-complete nature of this problem > it won't always get it right, and it traditionally allowed this syntax as a > workaround. > > int f(void) > { > int i = i; // tell gcc not to warn > return i; > } > > clang apparently implements the warnings in a way that is as > completely predictable (and won't warn in cases that it > doesn't completely understand), but decided as a result that the > gcc 'int i = i' syntax is bogus and it always warns about a variable > used in its own declaration that is later referenced, without looking > at whether the declaration does initialize it or not. > > > The proposed solution is, effectively, to open-code > > __init_waitqueue_head() at each DECLARE_WAIT_QUEUE_HEAD_ONSTACK() > > callsite. 
That's pretty unpleasant and calls for an explanatory > > comment at the __WAIT_QUEUE_HEAD_INIT_ONSTACK() definition site as well > > as a cautionary comment at the __init_waitqueue_head() definition so we > > can keep the two versions in sync as code evolves. > > Yes, makes sense. > > > Hopefully clang will soon be hit with the cluebat (yes?) and this > > change becomes obsolete in the quite short term. Surely 6-12 months > > from now nobody will be using the uncluebatted version of clang on > > contemporary kernel sources so we get to remove this nastiness again. > > Which makes me wonder whether we should merge it at all. > > Would it make you feel better to keep the current code but have an alternative > version guarded with e.g. "#if defined(__clang__ && (__clang_major__ <= 9)"? > > While it is probably a good idea to fix clang here, this is one of the last > issues that causes a significant difference between gcc and clang in build > testing with kernelci: > https://kernelci.org/build/next/branch/master/kernel/next-20190709/ > I'm trying to get all the warnings fixed there so we can spot build-time > regressions more easily. > > Arnd I'm just spitballing here since I am about to go to sleep but could we do something like you did for bee20031772a ("disable -Wattribute-alias warning for SYSCALL_DEFINEx()") and disable the warning in DECLARE_WAIT_QUEUE_HEAD_ONSTACK only since we know it is not going to be a problem? That way, if/when Clang is fixed, we can just have the warning be disabled for older versions? Cheers, Nathan
Re: [PATCH 1/4] numa: introduce per-cgroup numa balancing locality, statistic
On Fri, Jul 12, 2019 at 11:43:17AM +0800, 王贇 wrote: > > > On 2019/7/11 下午9:47, Peter Zijlstra wrote: > [snip] > >> + rcu_read_lock(); > >> + memcg = mem_cgroup_from_task(p); > >> + if (idx != -1) > >> + this_cpu_inc(memcg->stat_numa->locality[idx]); > > > > I thought cgroups were supposed to be hierarchical. That is, if we have: > > > > R > > / \ > > A > > /\ > > B > > \ > >t1 > > > > Then our task t1 should be accounted to B (as you do), but also to A and > > R. > > I get the point but not quite sure about this... > > Not like pages there are no hierarchical limitation on locality, also tasks You can use cpusets to affect that. > running in a particular group have no influence to others, not to mention the > extra overhead, does it really meaningful to account the stuff hierarchically? AFAIU it's a requirement of cgroups to be hierarchical. All our other cgroup accounting is like that.
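The hierarchical accounting described above (a task in B is also counted in A and R) can be sketched as a walk up parent pointers. The structure and function names are hypothetical and the single counter is a stand-in for the per-cgroup statistic in the patch; this only shows the shape of the idea, not the memcg implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cgroup node; only what the sketch needs. */
struct toy_cgroup {
	struct toy_cgroup *parent;   /* NULL for the root */
	unsigned long locality_hits; /* stand-in for one statistics bucket */
};

/* Charge the event to the task's group and to every ancestor up to the root. */
static void toy_account_hierarchical(struct toy_cgroup *cg)
{
	assert(cg != NULL);

	for (; cg; cg = cg->parent)
		cg->locality_hits++;
}
```

With the R, A, B tree from the mail, one event in B increments all three counters. The extra cost is one increment per level, which is the overhead the patch author asks about.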
[GIT PULL] 9p updates for 5.3
Hi Linus,

Here is a 9p update for 5.3, just a couple of fixes that have been sitting here for too long as I missed the 5.2 merge window.

I have two more patches that I didn't have time to test early enough for this but that are also plain detail fixes, please let me know if you would prefer having me send a pull request for -rc2 after a week in -next or if I should just wait until the next window. There's little risk but I'm usually rather conservative on this.

The following changes since commit 5908e6b738e3357af42c10e1183753c70a0117a9:

Linux 5.0-rc8 (2019-02-24 16:46:45 -0800)

are available in the git repository at:

git://github.com/martinetd/linux tags/9p-for-5.3

for you to fetch changes up to 80a316ff16276b36d0392a8f8b2f63259857ae98:

9p/xen: Add cleanup path in p9_trans_xen_init (2019-05-15 13:00:07 +)

9p pull request for inclusion in 5.13

Two small fixes to properly clean up the 9p transports list if virtio/xen module initialization fails. 9p might otherwise try to access memory from a module that failed to register and got freed.

YueHaibing (2):
9p/virtio: Add cleanup path in p9_virtio_init
9p/xen: Add cleanup path in p9_trans_xen_init

net/9p/trans_virtio.c |8 +++-
net/9p/trans_xen.c|8 +++-
2 files changed, 14 insertions(+), 2 deletions(-)
Re: [GIT PULL] 9p updates for 5.3
Dominique Martinet wrote on Fri, Jul 12, 2019: > 9p pull request for inclusion in 5.13 Just noticed this typo in version number here, should I make a new tag with the correct text? Sorry, -- Dominique
Re: [PATCH v2] printk: Do not lose last line in kmsg buffer dump
On Thu 2019-07-11 16:29:37, Vincent Whitchurch wrote: > kmsg_dump_get_buffer() is supposed to select all the youngest log > messages which fit into the provided buffer. It determines the correct > start index by using msg_print_text() with a NULL buffer to calculate > the size of each entry. However, when performing the actual writes, > msg_print_text() only writes the entry to the buffer if the written len > is lesser than the size of the buffer. So if the lengths of the > selected youngest log messages happen to precisely fill up the provided > buffer, the last log message is not included. > > We don't want to modify msg_print_text() to fill up the buffer and start > returning a length which is equal to the size of the buffer, since > callers of its other users, such as kmsg_dump_get_line(), depend upon > the current behaviour. > > Instead, fix kmsg_dump_get_buffer() to compensate for this. > > For example, with the following two final prints: > > [6.427502] A > [6.427769] 12345 > > A dump of a 64-byte buffer filled by kmsg_dump_get_buffer(), before this > patch: > > : 3c 30 3e 5b 20 20 20 20 36 2e 35 32 32 31 39 37 <0>[6.522197 > 0010: 5d 20 41 41 41 41 41 41 41 41 41 41 41 41 41 0a ] A. > 0020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 0030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > After this patch: > > : 3c 30 3e 5b 20 20 20 20 36 2e 34 35 36 36 37 38 <0>[6.456678 > 0010: 5d 20 42 42 42 42 42 42 42 42 31 32 33 34 35 0a ] 12345. 
> 0020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 0030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > Signed-off-by: Vincent Whitchurch > --- > v2: Move fix to kmsg_dump_get_buffer() > > kernel/printk/printk.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c > index 1888f6a3b694..424abf802f02 100644 > --- a/kernel/printk/printk.c > +++ b/kernel/printk/printk.c > @@ -3274,7 +3274,7 @@ bool kmsg_dump_get_buffer(struct kmsg_dumper *dumper, > bool syslog, > /* move first record forward until length fits into the buffer */ > seq = dumper->cur_seq; > idx = dumper->cur_idx; > - while (l > size && seq < dumper->next_seq) { > + while (l >= size && seq < dumper->next_seq) {

This cycle searches how many messages would fit into the buffer. The patch looks like a hack that relies on a hole: the next cycle no longer checks the number of characters that were really stored.

What would happen when msg_print_text() starts adding the trailing '\0' as suggested by https://lkml.kernel.org/r/20190710121049.rwhk7fknfzn3c...@pathway.suse.cz

I would much rather we made the code more secure instead of stretching its weakness to the limits.

BTW: What is the motivation for this fix? Is it a bug report or just some research into possible buffer overflows?

The commit message pretends that the problem is bigger than it really is. It is about one byte and not one line.

Best Regards,
Petr
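The boundary case under discussion can be reproduced with a toy model. This is a sketch under stated assumptions, not the printk code: records are reduced to their lengths, and the writer stores a record only if it is strictly smaller than the remaining space, which is the behaviour the patch description attributes to msg_print_text(). The function name `toy_newest_record_kept` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Toy model: len[] holds record lengths, oldest first. Returns 1 if the
 * newest record ends up in a buffer of `size` bytes when the start index
 * is chosen with "l > size" (strict_gt != 0) or "l >= size" (strict_gt == 0).
 */
static int toy_newest_record_kept(const size_t *len, size_t n, size_t size,
				  int strict_gt)
{
	size_t l = 0, seq = 0, used = 0, i;

	assert(len != NULL);

	/* First pass: total length of all records. */
	for (i = 0; i < n; i++)
		l += len[i];

	/* Move the first record forward until the length "fits". */
	while ((strict_gt ? l > size : l >= size) && seq < n)
		l -= len[seq++];

	/* Writer: stores a record only if it is smaller than the space left. */
	for (i = seq; i < n; i++) {
		if (len[i] >= size - used)
			break;
		used += len[i];
	}

	/* The newest record was stored only if the loop ran to the end. */
	return i == n;
}
```

With two 16-byte records and a 32-byte buffer, the strict comparison starts at the older record and the writer then refuses the newer one, while the non-strict comparison skips the older record so the newer one is stored. The model also shows why the reply calls this a one-byte boundary: any total below the buffer size behaves the same with both comparisons.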
Re: [RFC v2 00/27] Kernel Address Space Isolation
On 7/12/19 12:38 AM, Dave Hansen wrote: On 7/11/19 7:25 AM, Alexandre Chartre wrote: - Kernel code mapped to the ASI page-table has been reduced to: . the entire kernel (I still need to test with only the kernel text) . the cpu entry area (because we need the GDT to be mapped) . the cpu ASI session (for managing ASI) . the current stack - Optionally, an ASI can request the following kernel mapping to be added: . the stack canary . the cpu offsets (this_cpu_off) . the current task . RCU data (rcu_data) . CPU HW events (cpu_hw_events). I don't see the per-cpu areas in here. But, the ASI macros in entry_64.S (and asi_start_abort()) use per-cpu data. We don't map all per-cpu areas, but only the per-cpu variables we need. ASI code uses the per-cpu cpu_asi_session variable which is mapped when an ASI is created (see patch 15/26): + /* +* Map the percpu ASI sessions. This is used by interrupt handlers +* to figure out if we have entered isolation and switch back to +* the kernel address space. +*/ + err = ASI_MAP_CPUVAR(asi, cpu_asi_session); + if (err) + return err; Also, this stuff seems to do naughty stuff (calling C code, touching per-cpu data) before the PTI CR3 writes have been done. But, I don't see anything excluding PTI and this code from coexisting. My understanding is that PTI CR3 writes only happens when switching to/from userland. While ASI enter/exit/abort happens while we are already in the kernel, so asi_start_abort() is not called when coming from userland and so not interacting with PTI. For example, if ASI in used during a syscall (e.g. with KVM), we have: -> syscall - PTI CR3 write (kernel CR3) - syscall handler: ... asi_enter()-> write ASI CR3 .. code run with ASI .. asi_exit() or asi abort -> restore original CR3 ... - PTI CR3 write (userland CR3) <- syscall Thanks, alex.
[PATCH v4 0/5] hv: Remove dependencies on guest page size
The Linux guest page size and hypervisor page size concepts are different, even though they happen to be the same value on x86. Hyper-V code mixes up the two, so this patchset begins to address that by creating and using a set of Hyper-V specific page definitions. A major benefit of those new definitions is that they support non-x86 architectures, such as ARM64, that use different page sizes. On ARM64, the guest page size may not be 4096, and Hyper-V always runs with a page size of 4096. In this patchset, the first two patches lay the foundation for the others, creating definitions and preparing for allocation of memory with the size and alignment that Hyper-V expects as a page. Patch 3 applies the page size definition where the guest VM and Hyper-V communicate, and where the code intends to use the Hyper-V page size. The last two patches set the ring buffer size to a fixed value, removing the dependency on the guest page size. This is the initial set of changes to the Hyper-V code, and future patches will make additional changes using the same foundation, for example, replace __vmalloc() and related functions when Hyper-V pages are intended. Changes in v4 (all apply to patch 2 only): - Remove file name from the subject. - Include prototypes of two new functions. - Add another Link tag. Changes in v3: - Simplify expression for BUILD_BUG_ON() in patch 2. - Add Link and Reviewed-by tags. Change in v2: - Replace patch 2 with a new one. 
Maya Nakamura (5): x86: hv: hyperv-tlfs.h: Create and use Hyper-V page definitions x86: hv: Add functions to allocate/deallocate page for Hyper-V hv: vmbus: Replace page definition with Hyper-V specific one HID: hv: Remove dependencies on PAGE_SIZE for ring buffer Input: hv: Remove dependencies on PAGE_SIZE for ring buffer arch/x86/hyperv/hv_init.c | 14 ++ arch/x86/include/asm/hyperv-tlfs.h| 12 +++- arch/x86/include/asm/mshyperv.h | 5 - drivers/hid/hid-hyperv.c | 4 ++-- drivers/hv/hyperv_vmbus.h | 8 drivers/input/serio/hyperv-keyboard.c | 4 ++-- 6 files changed, 37 insertions(+), 10 deletions(-) -- 2.17.1
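The distinction the cover letter draws between guest page size and hypervisor page size can be sketched with a few definitions. The `TOY_HV_HYP_PAGE_*` names below are assumptions modelled on the patch titles, and the helper functions are hypothetical; the only fact taken from the text is that Hyper-V always uses 4096-byte pages while the guest page size may differ.

```c
#include <assert.h>

/* Hyper-V page geometry: fixed at 4096 bytes regardless of the guest. */
#define TOY_HV_HYP_PAGE_SHIFT 12
#define TOY_HV_HYP_PAGE_SIZE  (1UL << TOY_HV_HYP_PAGE_SHIFT)
#define TOY_HV_HYP_PAGE_MASK  (~(TOY_HV_HYP_PAGE_SIZE - 1))

/* How many hypervisor pages are needed to cover `bytes` bytes. */
static unsigned long toy_hv_pages_for(unsigned long bytes)
{
	return (bytes + TOY_HV_HYP_PAGE_SIZE - 1) >> TOY_HV_HYP_PAGE_SHIFT;
}

/* How many hypervisor pages fit in one guest page of the given size. */
static unsigned long toy_hv_pages_per_guest_page(unsigned long guest_page_size)
{
	/* A guest page must be a whole multiple of the hypervisor page. */
	assert(guest_page_size >= TOY_HV_HYP_PAGE_SIZE);
	assert((guest_page_size & ~TOY_HV_HYP_PAGE_MASK) == 0);

	return guest_page_size >> TOY_HV_HYP_PAGE_SHIFT;
}
```

On x86 the ratio is 1, which is why code mixing the two sizes happens to work there; with a 64 KiB guest page the ratio would be 16, so a ring buffer sized in guest pages would be sixteen times larger than one sized in hypervisor pages.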
Re: [PATCH v2 0/5] Add NUMA-awareness to qspinlock
On 2019/7/3 19:58, Jan Glauber wrote:
> Hi Alex,
> I've tried this series on arm64 (ThunderX2 with up to SMT=4 and 224 CPUs)
> with the borderline testcase of accessing a single file from all
> threads. With that
> testcase the qspinlock slowpath is the top spot in the kernel.
>
> The results look really promising:
>
> CPUs    normal    numa-qspinlocks
> ---------------------------------
> 56      149.41    73.90
> 224     576.95    290.31
>
> Also frontend-stalls are reduced to 50% and interconnect traffic is
> greatly reduced.
> Tested-by: Jan Glauber

Tested this patchset on Kunpeng920 ARM64 server (96 cores, 4 NUMA nodes), and with the same test case from Jan, I can see 150%+ boost! (Need to add a patch below [1].)

For the real workload such as Nginx I can see about 10% performance improvement as well.

Tested-by: Hanjun Guo

Please cc me for new versions and I'm willing to test it.

Thanks
Hanjun

[1]
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index 657bbc5..72c1346 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -792,6 +792,20 @@ config NODES_SHIFT Specify the maximum number of NUMA Nodes available on the target system. Increases memory reserved to accommodate various tables. +config NUMA_AWARE_SPINLOCKS + bool "Numa-aware spinlocks" + depends on NUMA + default y + help + Introduce NUMA (Non Uniform Memory Access) awareness into + the slow path of spinlocks. + + The kernel will try to keep the lock on the same node, + thus reducing the number of remote cache misses, while + trading some of the short term fairness for better performance. + + Say N if you want absolute first come first serve fairness. + config USE_PERCPU_NUMA_NODE_ID def_bool y depends on NUMA diff --git a/kernel/locking/qspinlock_cna.h b/kernel/locking/qspinlock_cna.h index 2994167..be5dd44 100644 --- a/kernel/locking/qspinlock_cna.h +++ b/kernel/locking/qspinlock_cna.h @@ -4,7 +4,7 @@ #endif #include - +#include /* * Implement a NUMA-aware version of MCS (aka CNA, or compact NUMA-aware lock). 
* @@ -170,7 +170,7 @@ static __always_inline void cna_init_node(struct mcs_spinlock *node, int cpuid, u32 tail) { if (decode_numa_node(node->node_and_count) == -1) - store_numa_node(node, numa_cpu_node(cpuid)); + store_numa_node(node, cpu_to_node(cpuid)); node->encoded_tail = tail; }
[PATCH v2] xen/pv: Fix a boot up hang revealed by int3 self test
Commit 7457c0da024b ("x86/alternatives: Add int3_emulate_call() selftest") checks that the int3 exception entry leaves a gap on the exception stack that can be used to insert a call return address. The Xen PV int3 exception entry path does not set up this gap, so the selftest triggers the panic below:

[0.772876] general protection fault: [#1] SMP NOPTI
[0.772886] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.2.0+ #11
[0.772893] RIP: e030:int3_magic+0x0/0x7
[0.772905] RSP: 3507:82203e98 EFLAGS: 0246
[0.773334] Call Trace:
[0.773334] alternative_instructions+0x3d/0x12e
[0.773334] check_bugs+0x7c9/0x887
[0.773334] ? __get_locked_pte+0x178/0x1f0
[0.773334] start_kernel+0x4ff/0x535
[0.773334] ? set_init_arg+0x55/0x55
[0.773334] xen_start_kernel+0x571/0x57a

The xenint3 and int3 entry code is identical except that xenint3 does not create the gap, so fix this by using int3 and dropping the now-useless xenint3.

Signed-off-by: Zhenzhong Duan
Cc: Boris Ostrovsky
Cc: Juergen Gross
Cc: Stefano Stabellini
Cc: Andy Lutomirski
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: Ingo Molnar
Cc: Borislav Petkov
---
v2: fix up description.
--- arch/x86/entry/entry_64.S| 1 - arch/x86/include/asm/traps.h | 2 +- arch/x86/xen/enlighten_pv.c | 2 +- arch/x86/xen/xen-asm_64.S| 1 - 4 files changed, 2 insertions(+), 4 deletions(-) diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S index 0ea4831..35a66fc 100644 --- a/arch/x86/entry/entry_64.S +++ b/arch/x86/entry/entry_64.S @@ -1176,7 +1176,6 @@ idtentry stack_segmentdo_stack_segment has_error_code=1 #ifdef CONFIG_XEN_PV idtentry xennmido_nmi has_error_code=0 idtentry xendebug do_debughas_error_code=0 -idtentry xenint3 do_int3 has_error_code=0 #endif idtentry general_protectiondo_general_protection has_error_code=1 diff --git a/arch/x86/include/asm/traps.h b/arch/x86/include/asm/traps.h index 7d6f3f3..f2bd284 100644 --- a/arch/x86/include/asm/traps.h +++ b/arch/x86/include/asm/traps.h @@ -40,7 +40,7 @@ asmlinkage void xen_divide_error(void); asmlinkage void xen_xennmi(void); asmlinkage void xen_xendebug(void); -asmlinkage void xen_xenint3(void); +asmlinkage void xen_int3(void); asmlinkage void xen_overflow(void); asmlinkage void xen_bounds(void); asmlinkage void xen_invalid_op(void); diff --git a/arch/x86/xen/enlighten_pv.c b/arch/x86/xen/enlighten_pv.c index 4722ba2..2138d69 100644 --- a/arch/x86/xen/enlighten_pv.c +++ b/arch/x86/xen/enlighten_pv.c @@ -596,7 +596,7 @@ struct trap_array_entry { static struct trap_array_entry trap_array[] = { { debug, xen_xendebug,true }, - { int3,xen_xenint3, true }, + { int3,xen_int3,true }, { double_fault,xen_double_fault,true }, #ifdef CONFIG_X86_MCE { machine_check, xen_machine_check, true }, diff --git a/arch/x86/xen/xen-asm_64.S b/arch/x86/xen/xen-asm_64.S index 1e9ef0b..ebf610b 100644 --- a/arch/x86/xen/xen-asm_64.S +++ b/arch/x86/xen/xen-asm_64.S @@ -32,7 +32,6 @@ xen_pv_trap divide_error xen_pv_trap debug xen_pv_trap xendebug xen_pv_trap int3 -xen_pv_trap xenint3 xen_pv_trap xennmi xen_pv_trap overflow xen_pv_trap bounds -- 1.8.3.1
[PATCH v4 1/5] x86: hv: hyperv-tlfs.h: Create and use Hyper-V page definitions
Define HV_HYP_PAGE_SHIFT, HV_HYP_PAGE_SIZE, and HV_HYP_PAGE_MASK because the Linux guest page size and hypervisor page size concepts are different, even though they happen to be the same value on x86. Also, replace PAGE_SIZE with HV_HYP_PAGE_SIZE. Signed-off-by: Maya Nakamura Reviewed-by: Michael Kelley Reviewed-by: Vitaly Kuznetsov --- arch/x86/include/asm/hyperv-tlfs.h | 12 +++- 1 file changed, 11 insertions(+), 1 deletion(-) diff --git a/arch/x86/include/asm/hyperv-tlfs.h b/arch/x86/include/asm/hyperv-tlfs.h index af78cd72b8f3..7a2705694f5b 100644 --- a/arch/x86/include/asm/hyperv-tlfs.h +++ b/arch/x86/include/asm/hyperv-tlfs.h @@ -12,6 +12,16 @@ #include #include +/* + * While not explicitly listed in the TLFS, Hyper-V always runs with a page size + * of 4096. These definitions are used when communicating with Hyper-V using + * guest physical pages and guest physical page addresses, since the guest page + * size may not be 4096 on all architectures. + */ +#define HV_HYP_PAGE_SHIFT 12 +#define HV_HYP_PAGE_SIZE BIT(HV_HYP_PAGE_SHIFT) +#define HV_HYP_PAGE_MASK (~(HV_HYP_PAGE_SIZE - 1)) + /* * The below CPUID leaves are present if VersionAndFeatures.HypervisorPresent * is set by CPUID(HvCpuIdFunctionVersionAndFeatures). @@ -847,7 +857,7 @@ union hv_gpa_page_range { * count is equal with how many entries of union hv_gpa_page_range can * be populated into the input parameter page. */ -#define HV_MAX_FLUSH_REP_COUNT ((PAGE_SIZE - 2 * sizeof(u64)) /\ +#define HV_MAX_FLUSH_REP_COUNT ((HV_HYP_PAGE_SIZE - 2 * sizeof(u64)) / \ sizeof(union hv_gpa_page_range)) struct hv_guest_mapping_flush_list { -- 2.17.1
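To make the arithmetic behind these definitions concrete, here is a minimal user-space sketch. It is not kernel code: BIT() is redefined locally, union toy_gpa_page_range is an assumed 8-byte stand-in for union hv_gpa_page_range (whose real layout is not shown in this patch), and hv_page_base() / hv_page_offset() are hypothetical helpers introduced only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in for the kernel's BIT() macro (64-bit wide on purpose). */
#define BIT(n) (1ULL << (n))

/* Same three definitions as in the patch. */
#define HV_HYP_PAGE_SHIFT 12
#define HV_HYP_PAGE_SIZE  BIT(HV_HYP_PAGE_SHIFT)
#define HV_HYP_PAGE_MASK  (~(HV_HYP_PAGE_SIZE - 1))

/* The hypervisor page size is fixed, whatever the guest PAGE_SIZE is. */
static_assert(HV_HYP_PAGE_SIZE == 4096, "Hyper-V page size is 4 KiB");

/* Assumption: an 8-byte stand-in for union hv_gpa_page_range. */
union toy_gpa_page_range {
	uint64_t address_space;
};

/* Same shape as HV_MAX_FLUSH_REP_COUNT, using the stand-in union. */
#define TOY_MAX_FLUSH_REP_COUNT \
	((HV_HYP_PAGE_SIZE - 2 * sizeof(uint64_t)) / \
	 sizeof(union toy_gpa_page_range))

/* Hypothetical helper: start of the Hyper-V page that contains gpa. */
static inline uint64_t hv_page_base(uint64_t gpa)
{
	return gpa & HV_HYP_PAGE_MASK;
}

/* Hypothetical helper: byte offset of gpa inside its Hyper-V page. */
static inline uint64_t hv_page_offset(uint64_t gpa)
{
	return gpa & ~HV_HYP_PAGE_MASK;
}
```

With these definitions the mask splits an address into a 4 KiB-aligned base and an offset below 4096, independent of the guest page size. Under the 8-byte assumption, the repeat count is (4096 - 16) / 8.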
Re: [PATCH v3 0/9] i2c: add support for filters
On Tue, Jul 09, 2019 at 03:19:26PM +0200, Eugen Hristev - M18282 wrote: > From: Eugen Hristev > > Hello, > > This series adds support for analog and digital filters for i2c controllers > > This series is based on the series: > [PATCH v2 0/9] i2c: at91: filters support for at91 SoCs > and enhanced to add the bindings for all controllers plus an extra binding > for the width of the spikes in nanoseconds. > > First, bindings are created for > 'i2c-ana-filter' > 'i2c-dig-filter' > 'i2c-filter-width-ns' > > The support is added in the i2c core to retrieve filter width and add it > to the timings structure. > Next, the at91 driver is enhanced for supporting digital filter, advanced > digital filter (with selectable spike width) and the analog filter. > > Finally the device tree for two boards are modified to make use of the > new properties. > > This series is the result of the comments on the ML in the direction > requested: to make the bindings globally available for i2c drivers. > > Changes in v3: > - made bindings global for i2c controllers and modified accordingly > - gave up PADFCDF bit because it's a lack in datasheet > - the computation on the width of the spike is based on periph clock as it > is done for hold time. 
> > Changes in v2: > - added device tree bindings and support for enable-ana-filt and > enable-dig-filt > - added the new properties to the DT for sama5d4_xplained/sama5d2_xplained > > Eugen Hristev (9): > dt-bindings: i2c: at91: add new compatible > dt-bindings: i2c: add bindings for i2c analog and digital filter > i2c: add support for filter-width-ns optional property > i2c: at91: add new platform support for sam9x60 > i2c: at91: add support for digital filtering > i2c: at91: add support for advanced digital filtering > i2c: at91: add support for analog filtering > ARM: dts: at91: sama5d2_xplained: add analog and digital filter for > i2c > ARM: dts: at91: sama5d4_xplained: add analog filter for i2c > > Documentation/devicetree/bindings/i2c/i2c-at91.txt | 3 +- > Documentation/devicetree/bindings/i2c/i2c.txt | 11 + > arch/arm/boot/dts/at91-sama5d2_xplained.dts| 6 +++ > arch/arm/boot/dts/at91-sama5d4_xplained.dts| 1 + > drivers/i2c/busses/i2c-at91-core.c | 38 + > drivers/i2c/busses/i2c-at91-master.c | 49 > -- > drivers/i2c/busses/i2c-at91.h | 13 ++ > drivers/i2c/i2c-core-base.c| 2 + > include/linux/i2c.h| 2 + > 9 files changed, 121 insertions(+), 4 deletions(-) Hi, I don't know whether the bindings will fit other vendors' needs, but for Microchip they sound good. Acked-by: Ludovic Desroches for the whole series. Regards Ludovic
[PATCH v4 2/5] x86: hv: Add functions to allocate/deallocate page for Hyper-V
Introduce two new functions, hv_alloc_hyperv_page() and hv_free_hyperv_page(), to allocate/deallocate memory with the size and alignment that Hyper-V expects as a page. Although currently they are not used, they are ready to be used to allocate/deallocate memory on x86 when their ARM64 counterparts are implemented, keeping symmetry between architectures with potentially different guest page sizes. Link: https://lore.kernel.org/lkml/alpine.deb.2.21.1906272334560.32...@nanos.tec.linutronix.de/ Link: https://lore.kernel.org/lkml/87muindr9c@vitty.brq.redhat.com/ Signed-off-by: Maya Nakamura Reviewed-by: Michael Kelley Reviewed-by: Vitaly Kuznetsov --- arch/x86/hyperv/hv_init.c | 14 ++ arch/x86/include/asm/mshyperv.h | 5 - 2 files changed, 18 insertions(+), 1 deletion(-) diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c index 0e033ef11a9f..e8960a83add7 100644 --- a/arch/x86/hyperv/hv_init.c +++ b/arch/x86/hyperv/hv_init.c @@ -37,6 +37,20 @@ EXPORT_SYMBOL_GPL(hyperv_pcpu_input_arg); u32 hv_max_vp_index; EXPORT_SYMBOL_GPL(hv_max_vp_index); +void *hv_alloc_hyperv_page(void) +{ + BUILD_BUG_ON(PAGE_SIZE != HV_HYP_PAGE_SIZE); + + return (void *)__get_free_page(GFP_KERNEL); +} +EXPORT_SYMBOL_GPL(hv_alloc_hyperv_page); + +void hv_free_hyperv_page(unsigned long addr) +{ + free_page(addr); +} +EXPORT_SYMBOL_GPL(hv_free_hyperv_page); + static int hv_cpu_init(unsigned int cpu) { u64 msr_vp_index; diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h index 2a793bf6ebb0..32ec9df39a99 100644 --- a/arch/x86/include/asm/mshyperv.h +++ b/arch/x86/include/asm/mshyperv.h @@ -218,7 +218,8 @@ static inline struct hv_vp_assist_page *hv_get_vp_assist_page(unsigned int cpu) void __init hyperv_init(void); void hyperv_setup_mmu_ops(void); - +void *hv_alloc_hyperv_page(void); +void hv_free_hyperv_page(unsigned long addr); void hyperv_reenlightenment_intr(struct pt_regs *regs); void set_hv_tscchange_cb(void (*cb)(void)); void clear_hv_tscchange_cb(void); @@ 
-241,6 +242,8 @@ static inline void hv_apic_init(void) {} #else /* CONFIG_HYPERV */ static inline void hyperv_init(void) {} static inline void hyperv_setup_mmu_ops(void) {} +static inline void *hv_alloc_hyperv_page(void) { return NULL; } +static inline void hv_free_hyperv_page(unsigned long addr) {} static inline void set_hv_tscchange_cb(void (*cb)(void)) {} static inline void clear_hv_tscchange_cb(void) {} static inline void hyperv_stop_tsc_emulation(void) {}; -- 2.17.1
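As a rough user-space analogue of the two helpers in this patch (an illustration only, not how the kernel allocates), the "size and alignment that Hyper-V expects" can be modelled with C11 aligned_alloc(). The toy_* names are hypothetical, and the static_assert plays a role loosely similar to the BUILD_BUG_ON() in the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define HV_HYP_PAGE_SIZE 4096u

/* An alignment must be a power of two; checked at compile time. */
static_assert((HV_HYP_PAGE_SIZE & (HV_HYP_PAGE_SIZE - 1)) == 0,
	      "Hyper-V page size must be a power of two");

/* Hypothetical user-space analogue of hv_alloc_hyperv_page(). */
static void *toy_alloc_hyperv_page(void)
{
	/* aligned_alloc() wants size to be a multiple of the alignment. */
	return aligned_alloc(HV_HYP_PAGE_SIZE, HV_HYP_PAGE_SIZE);
}

/* Hypothetical user-space analogue of hv_free_hyperv_page(). */
static void toy_free_hyperv_page(void *page)
{
	free(page);
}

/* Returns 1 when p sits on a Hyper-V page boundary, 0 otherwise. */
static int toy_is_hv_page_aligned(const void *p)
{
	return ((uintptr_t)p & (HV_HYP_PAGE_SIZE - 1)) == 0;
}
```

The point of wrapping the allocation is the same as in the patch: callers ask for "one Hyper-V page" and never mention the guest page size, so an architecture with a larger PAGE_SIZE can change the implementation without touching callers.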
[PATCH v4 3/5] hv: vmbus: Replace page definition with Hyper-V specific one
Replace PAGE_SIZE with HV_HYP_PAGE_SIZE because the guest page size may not be 4096 on all architectures and Hyper-V always runs with a page size of 4096. Signed-off-by: Maya Nakamura Reviewed-by: Michael Kelley Reviewed-by: Vitaly Kuznetsov --- drivers/hv/hyperv_vmbus.h | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/hv/hyperv_vmbus.h b/drivers/hv/hyperv_vmbus.h index 362e70e9d145..019469c3cbca 100644 --- a/drivers/hv/hyperv_vmbus.h +++ b/drivers/hv/hyperv_vmbus.h @@ -192,11 +192,11 @@ int hv_ringbuffer_read(struct vmbus_channel *channel, u64 *requestid, bool raw); /* - * Maximum channels is determined by the size of the interrupt page - * which is PAGE_SIZE. 1/2 of PAGE_SIZE is for send endpoint interrupt - * and the other is receive endpoint interrupt + * Maximum channels, 16348, is determined by the size of the interrupt page, + * which is HV_HYP_PAGE_SIZE. 1/2 of HV_HYP_PAGE_SIZE is to send endpoint + * interrupt, and the other is to receive endpoint interrupt. */ -#define MAX_NUM_CHANNELS ((PAGE_SIZE >> 1) << 3) /* 16348 channels */ +#define MAX_NUM_CHANNELS ((HV_HYP_PAGE_SIZE >> 1) << 3) /* The value here must be in multiple of 32 */ /* TODO: Need to make this configurable */ -- 2.17.1
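A quick check of the MAX_NUM_CHANNELS arithmetic, as a stand-alone sketch with HV_HYP_PAGE_SIZE redefined locally: half of the 4096-byte interrupt page is 2048 bytes, and the shift by 3 multiplies by 8 (presumably one bit per channel), which gives 16384. The "16348" in the comment therefore looks like a transposed digit. toy_max_num_channels() is a hypothetical helper added only to show how the value scales with the page size.

```c
#include <assert.h>

#define HV_HYP_PAGE_SHIFT 12
#define HV_HYP_PAGE_SIZE  (1UL << HV_HYP_PAGE_SHIFT)

/* Same expression as in the patch. */
#define MAX_NUM_CHANNELS ((HV_HYP_PAGE_SIZE >> 1) << 3)

/* (4096 / 2) * 8 = 16384 */
static_assert(MAX_NUM_CHANNELS == 16384, "half a 4 KiB page, times 8");

/* Hypothetical helper: the same formula for an arbitrary page size. */
static unsigned long toy_max_num_channels(unsigned long page_size)
{
	return (page_size >> 1) << 3;
}
```

With the old PAGE_SIZE-based definition, a guest built with 64 KiB pages would have computed a sixteen times larger value, which is the mismatch the patch removes.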
Re: [PATCH -mm] autonuma: Fix scan period updating
On Thu, Jul 04, 2019 at 08:32:06AM +0800, Huang, Ying wrote: > Mel Gorman writes: > > > On Tue, Jun 25, 2019 at 09:23:22PM +0800, huang ying wrote: > >> On Mon, Jun 24, 2019 at 10:25 PM Mel Gorman wrote: > >> > > >> > On Mon, Jun 24, 2019 at 10:56:04AM +0800, Huang Ying wrote: > >> > > The autonuma scan period should be increased (scanning is slowed down) > >> > > if the majority of the page accesses are shared with other processes. > >> > > But in current code, the scan period will be decreased (scanning is > >> > > speeded up) in that situation. > >> > > > >> > > This patch fixes the code. And this has been tested via tracing the > >> > > scan period changing and /proc/vmstat numa_pte_updates counter when > >> > > running a multi-threaded memory accessing program (most memory > >> > > areas are accessed by multiple threads). > >> > > > >> > > >> > The patch somewhat flips the logic on whether shared or private is > >> > considered and it's not immediately obvious why that was required. That > >> > aside, other than the impact on numa_pte_updates, what actual > >> > performance difference was measured and on on what workloads? > >> > >> The original scanning period updating logic doesn't match the original > >> patch description and comments. I think the original patch > >> description and comments make more sense. So I fix the code logic to > >> make it match the original patch description and comments. > >> > >> If my understanding to the original code logic and the original patch > >> description and comments were correct, do you think the original patch > >> description and comments are wrong so we need to fix the comments > >> instead? Or you think we should prove whether the original patch > >> description and comments are correct? > >> > > > > I'm about to get knocked offline so cannot answer properly. The code may > > indeed be wrong and I have observed higher than expected NUMA scanning > > behaviour than expected although not enough to cause problems. 
A comment > > fix is fine but if you're changing the scanning behaviour, it should be > > backed up with data justifying that the change both reduces the observed > > scanning and that it has no adverse performance implications. > > Got it! Thanks for comments! As for performance testing, do you have > some candidate workloads? > Ordinarily I would hope that the patch was motivated by observed behaviour so you have a metric for goodness. However, for NUMA balancing I would typically run basic workloads first -- dbench, tbench, netperf, hackbench and pipetest. The objective would be to measure the degree to which automatic NUMA balancing interferes with a basic workload, to see if the patch reduces the number of minor faults incurred even though there is no NUMA balancing to be worried about. This measures the general overhead of a patch. If your reasoning is correct, you'd expect lower overhead. For balancing itself, I usually look at Andrea's original autonuma benchmark, NAS Parallel Benchmark (D class usually although C class for much older or smaller machines) and spec JBB 2005 and 2015. Of the JBB benchmarks, 2005 is usually more reasonable for evaluating NUMA balancing than 2015 is (which can be unstable for a variety of reasons). In this case, I would be looking at whether the overhead is reduced, whether the ratio of local hits is the same or improved and the primary metric of each (time to completion for Andrea's and NAS, throughput for JBB). Even if there is no change to locality and the primary metric but there is less scanning and overhead overall, it would still be an improvement. If you have trouble doing such an evaluation, I'll queue tests if they are based on a patch that addresses the specific point of concern (scan period not updated) as it's still not obvious why flipping the logic of whether shared or private is considered was necessary. -- Mel Gorman SUSE Labs
[PATCH v4 4/5] HID: hv: Remove dependencies on PAGE_SIZE for ring buffer
Define the ring buffer size as a constant expression because it should not depend on the guest page size. Signed-off-by: Maya Nakamura Reviewed-by: Michael Kelley --- drivers/hid/hid-hyperv.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/hid/hid-hyperv.c b/drivers/hid/hid-hyperv.c index 7795831d37c2..cc5b09b87ab0 100644 --- a/drivers/hid/hid-hyperv.c +++ b/drivers/hid/hid-hyperv.c @@ -104,8 +104,8 @@ struct synthhid_input_report { #pragma pack(pop) -#define INPUTVSC_SEND_RING_BUFFER_SIZE (10*PAGE_SIZE) -#define INPUTVSC_RECV_RING_BUFFER_SIZE (10*PAGE_SIZE) +#define INPUTVSC_SEND_RING_BUFFER_SIZE (40 * 1024) +#define INPUTVSC_RECV_RING_BUFFER_SIZE (40 * 1024) enum pipe_prot_msg_type { -- 2.17.1
[PATCH v4 5/5] Input: hv: Remove dependencies on PAGE_SIZE for ring buffer
Define the ring buffer size as a constant expression because it should not depend on the guest page size. Signed-off-by: Maya Nakamura Reviewed-by: Michael Kelley --- drivers/input/serio/hyperv-keyboard.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/input/serio/hyperv-keyboard.c b/drivers/input/serio/hyperv-keyboard.c index 8e457e50f837..88ae7c2ac3c8 100644 --- a/drivers/input/serio/hyperv-keyboard.c +++ b/drivers/input/serio/hyperv-keyboard.c @@ -75,8 +75,8 @@ struct synth_kbd_keystroke { #define HK_MAXIMUM_MESSAGE_SIZE 256 -#define KBD_VSC_SEND_RING_BUFFER_SIZE (10 * PAGE_SIZE) -#define KBD_VSC_RECV_RING_BUFFER_SIZE (10 * PAGE_SIZE) +#define KBD_VSC_SEND_RING_BUFFER_SIZE (40 * 1024) +#define KBD_VSC_RECV_RING_BUFFER_SIZE (40 * 1024) #define XTKBD_EMUL0 0xe0 #define XTKBD_EMUL1 0xe1 -- 2.17.1
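The two ring buffer patches (HID and keyboard) rely on the same equivalence: with 4 KiB pages, 10 * PAGE_SIZE and 40 * 1024 are the same number, so nothing changes on x86. The sketch below is user-space C with locally redefined constants; toy_old_ring_size() is a hypothetical helper that models the old definition for a guest page size chosen by the caller.

```c
#include <assert.h>

#define HV_HYP_PAGE_SIZE 4096UL

/* New definition from the patch: a constant, independent of PAGE_SIZE. */
#define KBD_VSC_SEND_RING_BUFFER_SIZE (40 * 1024)

/* Unchanged on a 4 KiB guest: ten Hyper-V pages. */
static_assert(KBD_VSC_SEND_RING_BUFFER_SIZE == 10 * HV_HYP_PAGE_SIZE,
	      "40 KiB equals ten 4 KiB pages");

/* Hypothetical model of the old definition, 10 * PAGE_SIZE. */
static unsigned long toy_old_ring_size(unsigned long guest_page_size)
{
	return 10 * guest_page_size;
}
```

For a guest with 64 KiB pages (a page size some arm64 configurations use), the old definition would have produced 640 KiB instead of 40 KiB, which is the dependency on the guest page size that these patches remove.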
Re: [PATCH v1] drm/modes: Skip invalid cmdline mode
12.07.2019 11:10, Maxime Ripard wrote: > On Thu, Jul 11, 2019 at 06:55:03PM +0300, Dmitry Osipenko wrote: >> 11.07.2019 12:03, Maxime Ripard wrote: >>> On Wed, Jul 10, 2019 at 06:05:18PM +0300, Dmitry Osipenko wrote: 10.07.2019 17:05, Maxime Ripard wrote: > On Wed, Jul 10, 2019 at 04:29:19PM +0300, Dmitry Osipenko wrote: >> This works: >> >> diff --git a/drivers/gpu/drm/drm_client_modeset.c >> b/drivers/gpu/drm/drm_client_modeset.c >> index 56d36779d213..e5a2f9c8f404 100644 >> --- a/drivers/gpu/drm/drm_client_modeset.c >> +++ b/drivers/gpu/drm/drm_client_modeset.c >> @@ -182,6 +182,8 @@ drm_connector_pick_cmdline_mode(struct drm_connector >> *connector) >> mode = drm_mode_create_from_cmdline_mode(connector->dev, >> cmdline_mode); >> if (mode) >> list_add(&mode->head, &connector->modes); >> + else >> + cmdline_mode->specified = false; > > Hmmm, it's not clear to me why that wouldn't be the case. > > If we come back to the beginning of that function, we retrieve the > cmdline_mode buffer from the connector pointer, that will probably > have been parsed a first time using drm_mode_create_from_cmdline_mode > in drm_helper_probe_add_cmdline_mode. > > Now, I'm guessing that the issue is that in > drm_mode_parse_command_line_for_connector, if we have a named mode, we > just copy the mode over and set mode->specified. > > And we then move over to do other checks, and that's probably what > fails and returns, but our drm_cmdline_mode will have been modified. > > I'm not entirely sure how to deal with that though. > > I guess we could allocate a drm_cmdline_mode structure on the stack, > fill that, and if successful copy over its content to the one in > drm_connector. That would allow us to only change the content on > success, which is what I would expect from such a function? > > How does that sound? 
I now see that there is DRM_MODE_TYPE_USERDEF flag that is assigned only for the "cmdline" mode and drm_client_rotation() is the only place in DRM code that cares about whether mode is from cmdline, hence looks like it will be more correct to do the following: >>> >>> I'm still under the impression that we're dealing with workarounds of >>> a more central issue, which is that we shouldn't return a partially >>> modified drm_cmdline_mode. >>> >>> You said it yourself, the breakage is in the commit changing the >>> command line parsing logic, while you're fixing here some code that >>> was introduced later on. >> >> The problem stems from assumption that *any* named mode is valid. It >> looks to me that the ultimate solution would be to move the mode's name >> comparison into the [1], if that's possible. >> >> [1] drm_mode_parse_command_line_for_connector() > > Well, one could argue that video=tegrafb is invalid and should be > rejected as well, but we haven't cleared that up. The video=tegrafb is an invalid mode; there is nothing to argue here. And the problem is that invalid modes are not rejected from the very beginning. >>> Can you try the following patch? >>> http://code.bulix.org/8cwk4c-794565?raw >> >> This doesn't help because the problem with the rotation_reflection is >> that it's 0 if "rotation" not present in the cmdline and then ilog2(0) >> returns -1. So the patch "drm/modes: Don't apply cmdline's rotation if >> it wasn't specified" should be correct in any case. > > So we would have the same issue with rotate=0 then? No, we won't. Rotation mode is parsed into the DRM_MODE bitmask and rotate=0 corresponds to DRM_MODE_ROTATE_0, which is BIT(0) as you may notice. Hence rotation_reflection=0 is always an invalid value, meaning that the "rotate" option is not present in the cmdline. Please consult the code, in particular see drm_mode_parse_cmdline_options() which was written by yourself ;)
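The rotation argument above can be reproduced in a few lines of stand-alone C. This is a sketch under stated assumptions, not the DRM code: toy_ilog2() is a simplified stand-in for the kernel's ilog2() that matches the behaviour described in the thread (-1 for an input of 0), TOY_MODE_ROTATE_0 mirrors the statement that rotate=0 maps to BIT(0), and toy_rotation_index() is a hypothetical guard.

```c
#include <assert.h>

#define BIT(n) (1U << (n))

/* As stated in the thread: rotate=0 is stored as BIT(0), not as 0. */
#define TOY_MODE_ROTATE_0 BIT(0)

/* Simplified stand-in for ilog2(): index of the highest set bit,
 * or -1 when no bit is set. */
static int toy_ilog2(unsigned int v)
{
	int log = -1;

	while (v) {
		v >>= 1;
		log++;
	}
	return log;
}

/* Hypothetical guard: writes the rotation index and returns 1 only
 * when a rotation was actually specified; otherwise leaves *index
 * untouched and returns 0. */
static int toy_rotation_index(unsigned int rotation_reflection, int *index)
{
	assert(index);

	if (rotation_reflection == 0)
		return 0;	/* no "rotate" option on the command line */

	*index = toy_ilog2(rotation_reflection);
	return 1;
}
```

Because a specified rotation always has at least one bit set, a value of 0 can only mean "not specified", and checking for it before taking the logarithm avoids the -1 result.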
[PATCH] fdt: Properly handle "no-map" field in the memory region
Mark the memory region with NOMAP flag instead of completely removing it from the memory blocks. That makes the FDT handling consistent with the EFI memory map handling. Cc: Rob Herring Cc: Frank Rowand Cc: devicet...@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: KarimAllah Ahmed --- drivers/of/fdt.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c index de893c9..77982ae 100644 --- a/drivers/of/fdt.c +++ b/drivers/of/fdt.c @@ -1175,7 +1175,7 @@ int __init __weak early_init_dt_reserve_memory_arch(phys_addr_t base, phys_addr_t size, bool nomap) { if (nomap) - return memblock_remove(base, size); + return memblock_mark_nomap(base, size); return memblock_reserve(base, size); } -- 2.7.4
[PATCH v7 1/3] KVM: x86: add support for user wait instructions
UMONITOR, UMWAIT and TPAUSE are a set of user wait instructions. This patch adds support for user wait instructions in KVM. Availability of the user wait instructions is indicated by the presence of the CPUID feature flag WAITPKG CPUID.0x07.0x0:ECX[5]. User wait instructions may be executed at any privilege level, and use IA32_UMWAIT_CONTROL MSR to set the maximum time.

The behavior of user wait instructions in VMX non-root operation is determined first by the setting of the "enable user wait and pause" secondary processor-based VM-execution control bit 26. If the VM-execution control is 0, UMONITOR/UMWAIT/TPAUSE cause an invalid-opcode exception (#UD). If the VM-execution control is 1, treatment is based on the setting of the “RDTSC exiting” VM-execution control.

Because KVM never enables RDTSC exiting, if the instruction causes a delay, the amount of time delayed is called here the physical delay. The physical delay is first computed by determining the virtual delay. If IA32_UMWAIT_CONTROL[31:2] is zero, the virtual delay is the value in EDX:EAX minus the value that RDTSC would return; if IA32_UMWAIT_CONTROL[31:2] is not zero, the virtual delay is the minimum of that difference and AND(IA32_UMWAIT_CONTROL,FFFCH).

Because UMWAIT and TPAUSE can put a (physical) CPU into a power saving state, by default we don't expose them to KVM and enable them only when the guest CPUID has WAITPKG.

Detailed information about user wait instructions can be found in the latest Intel 64 and IA-32 Architectures Software Developer's Manual.
Co-developed-by: Jingqi Liu Signed-off-by: Jingqi Liu Signed-off-by: Tao Xu --- Changes in v7: - Add nested support for user wait instructions (Paolo) --- arch/x86/include/asm/vmx.h | 1 + arch/x86/kvm/cpuid.c | 2 +- arch/x86/kvm/vmx/nested.c | 1 + arch/x86/kvm/vmx/vmx.c | 20 4 files changed, 23 insertions(+), 1 deletion(-) diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h index a39136b0d509..8f00882664d3 100644 --- a/arch/x86/include/asm/vmx.h +++ b/arch/x86/include/asm/vmx.h @@ -69,6 +69,7 @@ #define SECONDARY_EXEC_PT_USE_GPA 0x0100 #define SECONDARY_EXEC_MODE_BASED_EPT_EXEC 0x0040 #define SECONDARY_EXEC_TSC_SCALING 0x0200 +#define SECONDARY_EXEC_ENABLE_USR_WAIT_PAUSE 0x0400 #define PIN_BASED_EXT_INTR_MASK 0x0001 #define PIN_BASED_NMI_EXITING 0x0008 diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c index 4992e7c99588..7d2cd4066f64 100644 --- a/arch/x86/kvm/cpuid.c +++ b/arch/x86/kvm/cpuid.c @@ -402,7 +402,7 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function, F(AVX512VBMI) | F(LA57) | F(PKU) | 0 /*OSPKE*/ | F(AVX512_VPOPCNTDQ) | F(UMIP) | F(AVX512_VBMI2) | F(GFNI) | F(VAES) | F(VPCLMULQDQ) | F(AVX512_VNNI) | F(AVX512_BITALG) | - F(CLDEMOTE) | F(MOVDIRI) | F(MOVDIR64B); + F(CLDEMOTE) | F(MOVDIRI) | F(MOVDIR64B) | 0 /*WAITPKG*/; /* cpuid 7.0.edx*/ const u32 kvm_cpuid_7_0_edx_x86_features = diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c index 46af3a5e9209..a4d5da34b306 100644 --- a/arch/x86/kvm/vmx/nested.c +++ b/arch/x86/kvm/vmx/nested.c @@ -2048,6 +2048,7 @@ static void prepare_vmcs02_early(struct vcpu_vmx *vmx, struct vmcs12 *vmcs12) SECONDARY_EXEC_ENABLE_INVPCID | SECONDARY_EXEC_RDTSCP | SECONDARY_EXEC_XSAVES | + SECONDARY_EXEC_ENABLE_USR_WAIT_PAUSE | SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY | SECONDARY_EXEC_APIC_REGISTER_VIRT | SECONDARY_EXEC_ENABLE_VMFUNC); diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index d98eac371c0a..f411c9ae5589 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ 
b/arch/x86/kvm/vmx/vmx.c @@ -2247,6 +2247,7 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf, SECONDARY_EXEC_RDRAND_EXITING | SECONDARY_EXEC_ENABLE_PML | SECONDARY_EXEC_TSC_SCALING | + SECONDARY_EXEC_ENABLE_USR_WAIT_PAUSE | SECONDARY_EXEC_PT_USE_GPA | SECONDARY_EXEC_PT_CONCEAL_VMX | SECONDARY_EXEC_ENABLE_VMFUNC | @@ -3984,6 +3985,25 @@ static void vmx_compute_secondary_exec_control(struct vcpu_vmx *vmx) } } + if (vmcs_config.cpu_based_2nd_exec_ctrl & + SECONDARY_EXEC_ENABLE_USR_WAIT_PAUSE) { + /* Exposing WAITPKG only when WAITPKG is exposed */ + bool waitpkg_enabled = + guest_cpuid_has(vcpu, X86_FEATURE_WAITPKG); + + if (!waitpkg_enabled) +
[PATCH v7 3/3] KVM: vmx: handle vm-exit for UMWAIT and TPAUSE
According to the latest Intel 64 and IA-32 Architectures Software Developer's Manual, the UMWAIT and TPAUSE instructions cause a VM exit if the "RDTSC exiting" and "enable user wait and pause" VM-execution controls are both 1. Because KVM never enables RDTSC exiting, this VM exit should never happen; this patch adds handlers for the UMWAIT and TPAUSE exit reasons that warn and skip the instruction. Co-developed-by: Jingqi Liu Signed-off-by: Jingqi Liu Signed-off-by: Tao Xu --- Changes in v7: - Add nested exit reason for UMWAIT and TPAUSE (Paolo) --- arch/x86/include/uapi/asm/vmx.h | 6 +- arch/x86/kvm/vmx/nested.c | 3 +++ arch/x86/kvm/vmx/vmx.c | 16 3 files changed, 24 insertions(+), 1 deletion(-) diff --git a/arch/x86/include/uapi/asm/vmx.h b/arch/x86/include/uapi/asm/vmx.h index d213ec5c3766..d88d7a68849b 100644 --- a/arch/x86/include/uapi/asm/vmx.h +++ b/arch/x86/include/uapi/asm/vmx.h @@ -85,6 +85,8 @@ #define EXIT_REASON_PML_FULL 62 #define EXIT_REASON_XSAVES 63 #define EXIT_REASON_XRSTORS 64 +#define EXIT_REASON_UMWAIT 67 +#define EXIT_REASON_TPAUSE 68 #define VMX_EXIT_REASONS \ { EXIT_REASON_EXCEPTION_NMI, "EXCEPTION_NMI" }, \ @@ -142,7 +144,9 @@ { EXIT_REASON_RDSEED, "RDSEED" }, \ { EXIT_REASON_PML_FULL, "PML_FULL" }, \ { EXIT_REASON_XSAVES, "XSAVES" }, \ - { EXIT_REASON_XRSTORS, "XRSTORS" } + { EXIT_REASON_XRSTORS, "XRSTORS" }, \ + { EXIT_REASON_UMWAIT, "UMWAIT" }, \ + { EXIT_REASON_TPAUSE, "TPAUSE" } #define VMX_ABORT_SAVE_GUEST_MSR_FAIL 1 #define VMX_ABORT_LOAD_HOST_PDPTE_FAIL 2 diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c index a4d5da34b306..9f91f834ec43 100644 --- a/arch/x86/kvm/vmx/nested.c +++ b/arch/x86/kvm/vmx/nested.c @@ -5213,6 +5213,9 @@ bool nested_vmx_exit_reflected(struct kvm_vcpu *vcpu, u32 exit_reason) case EXIT_REASON_ENCLS: /* SGX is never exposed to L1 */ return false; + case EXIT_REASON_UMWAIT: case EXIT_REASON_TPAUSE: + return nested_cpu_has2(vmcs12, + SECONDARY_EXEC_ENABLE_USR_WAIT_PAUSE); default: return true; } diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 0787f140d155..e026b1313dc3 100644 --- 
a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -5349,6 +5349,20 @@ static int handle_monitor(struct kvm_vcpu *vcpu) return handle_nop(vcpu); } +static int handle_umwait(struct kvm_vcpu *vcpu) +{ + kvm_skip_emulated_instruction(vcpu); + WARN(1, "this should never happen\n"); + return 1; +} + +static int handle_tpause(struct kvm_vcpu *vcpu) +{ + kvm_skip_emulated_instruction(vcpu); + WARN(1, "this should never happen\n"); + return 1; +} + static int handle_invpcid(struct kvm_vcpu *vcpu) { u32 vmx_instruction_info; @@ -5559,6 +5573,8 @@ static int (*kvm_vmx_exit_handlers[])(struct kvm_vcpu *vcpu) = { [EXIT_REASON_VMFUNC] = handle_vmx_instruction, [EXIT_REASON_PREEMPTION_TIMER]= handle_preemption_timer, [EXIT_REASON_ENCLS] = handle_encls, + [EXIT_REASON_UMWAIT] = handle_umwait, + [EXIT_REASON_TPAUSE] = handle_tpause, }; static const int kvm_vmx_max_exit_handlers = -- 2.20.1
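When RDTSC exiting is off, which patch 1/3 of this series says is always the case in KVM, the instructions run without exiting and the delay follows the rule quoted in that patch. The rule can be written out as a small stand-alone C sketch. Everything named toy_* is hypothetical. The 0xFFFFFFFC mask is derived here from the "[31:2]" bit range in the text rather than copied from the series, the reserved-bit check mirrors the IA32_UMWAIT_CONTROL[1] comment in patch 2/3, and the sketch assumes the deadline is not already in the past.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: bits [31:2] of IA32_UMWAIT_CONTROL hold the maximum time. */
#define TOY_UMWAIT_MAX_TIME_MASK 0xFFFFFFFCu
/* Bit 1 is reserved according to the comment in patch 2/3. */
#define TOY_UMWAIT_RESERVED_BIT  (1u << 1)

/* Hypothetical helper: virtual delay in TSC quanta.
 * deadline is the EDX:EAX value, tsc_now is what RDTSC would return. */
static uint64_t toy_virtual_delay(uint64_t deadline, uint64_t tsc_now,
				  uint32_t umwait_control)
{
	uint32_t max_time = umwait_control & TOY_UMWAIT_MAX_TIME_MASK;
	uint64_t diff;

	/* Assumption of this sketch: the deadline is not in the past. */
	assert(deadline >= tsc_now);
	diff = deadline - tsc_now;

	/* A zero maximum time means the difference is used unchanged. */
	if (max_time == 0)
		return diff;

	return diff < max_time ? diff : max_time;
}

/* Hypothetical helper: 1 if the value has the reserved bit clear. */
static int toy_umwait_control_valid(uint64_t data)
{
	return (data & TOY_UMWAIT_RESERVED_BIT) == 0;
}
```

For example, with a deadline 600 quanta ahead, a control value of 0 leaves the delay at 600, while a control value whose bits [31:2] encode 400 caps it at 400.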
[PATCH v7 2/3] KVM: vmx: Emulate MSR IA32_UMWAIT_CONTROL
The UMWAIT and TPAUSE instructions use IA32_UMWAIT_CONTROL, at MSR index E1H, to determine the maximum time in TSC-quanta that the processor can reside in either C0.1 or C0.2. This patch emulates the IA32_UMWAIT_CONTROL MSR in the guest and keeps separate IA32_UMWAIT_CONTROL values for host and guest. The variable umwait_control_cached in arch/x86/kernel/cpu/umwait.c caches the host MSR value, so this patch uses it to avoid frequent rdmsr of IA32_UMWAIT_CONTROL. Co-developed-by: Jingqi Liu Signed-off-by: Jingqi Liu Signed-off-by: Tao Xu --- Changes in v7: - Use the test on vmx->secondary_exec_control to replace guest_cpuid_has (Paolo) --- arch/x86/kernel/cpu/umwait.c | 3 ++- arch/x86/kvm/vmx/vmx.c | 33 + arch/x86/kvm/vmx/vmx.h | 9 + arch/x86/kvm/x86.c | 1 + 4 files changed, 45 insertions(+), 1 deletion(-) diff --git a/arch/x86/kernel/cpu/umwait.c b/arch/x86/kernel/cpu/umwait.c index 6a204e7336c1..631152a67c6e 100644 --- a/arch/x86/kernel/cpu/umwait.c +++ b/arch/x86/kernel/cpu/umwait.c @@ -15,7 +15,8 @@ * Cache IA32_UMWAIT_CONTROL MSR. This is a systemwide control. 
By default, * umwait max time is 10 in TSC-quanta and C0.2 is enabled */ -static u32 umwait_control_cached = UMWAIT_CTRL_VAL(10, UMWAIT_C02_ENABLE); +u32 umwait_control_cached = UMWAIT_CTRL_VAL(10, UMWAIT_C02_ENABLE); +EXPORT_SYMBOL_GPL(umwait_control_cached); /* * Serialize access to umwait_control_cached and IA32_UMWAIT_CONTROL MSR in diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index f411c9ae5589..0787f140d155 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -1676,6 +1676,12 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info) #endif case MSR_EFER: return kvm_get_msr_common(vcpu, msr_info); + case MSR_IA32_UMWAIT_CONTROL: + if (!msr_info->host_initiated && !vmx_has_waitpkg(vmx)) + return 1; + + msr_info->data = vmx->msr_ia32_umwait_control; + break; case MSR_IA32_SPEC_CTRL: if (!msr_info->host_initiated && !guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL)) @@ -1838,6 +1844,16 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info) return 1; vmcs_write64(GUEST_BNDCFGS, data); break; + case MSR_IA32_UMWAIT_CONTROL: + if (!msr_info->host_initiated && !vmx_has_waitpkg(vmx)) + return 1; + + /* The reserved bit IA32_UMWAIT_CONTROL[1] should be zero */ + if (data & BIT_ULL(1)) + return 1; + + vmx->msr_ia32_umwait_control = data; + break; case MSR_IA32_SPEC_CTRL: if (!msr_info->host_initiated && !guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL)) @@ -4139,6 +4155,8 @@ static void vmx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event) vmx->rmode.vm86_active = 0; vmx->spec_ctrl = 0; + vmx->msr_ia32_umwait_control = 0; + vcpu->arch.microcode_version = 0x1ULL; vmx->vcpu.arch.regs[VCPU_REGS_RDX] = get_rdx_init_val(); kvm_set_cr8(vcpu, 0); @@ -6352,6 +6370,19 @@ static void atomic_switch_perf_msrs(struct vcpu_vmx *vmx) msrs[i].host, false); } +static void atomic_switch_umwait_control_msr(struct vcpu_vmx *vmx) +{ + if (!vmx_has_waitpkg(vmx)) + return; + + if (vmx->msr_ia32_umwait_control != 
umwait_control_cached) + add_atomic_switch_msr(vmx, MSR_IA32_UMWAIT_CONTROL, + vmx->msr_ia32_umwait_control, + umwait_control_cached, false); + else + clear_atomic_switch_msr(vmx, MSR_IA32_UMWAIT_CONTROL); +} + static void vmx_arm_hv_timer(struct vcpu_vmx *vmx, u32 val) { vmcs_write32(VMX_PREEMPTION_TIMER_VALUE, val); @@ -6460,6 +6491,8 @@ static void vmx_vcpu_run(struct kvm_vcpu *vcpu) atomic_switch_perf_msrs(vmx); + atomic_switch_umwait_control_msr(vmx); + vmx_update_hv_timer(vcpu); /* diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h index 61128b48c503..b4ca34f7a2da 100644 --- a/arch/x86/kvm/vmx/vmx.h +++ b/arch/x86/kvm/vmx/vmx.h @@ -14,6 +14,8 @@ extern const u32 vmx_msr_index[]; extern u64 host_efer; +extern u32 umwait_control_cached; + #define MSR_TYPE_R 1 #define MSR_TYPE_W 2 #define MSR_TYPE_RW3 @@ -194,6 +196,7 @@ struct vcpu_vmx { #endif u64 spec_ctrl; + u64 msr_ia32_umwait_control; u32 vm_entry_controls_shadow; u32 vm_exit_controls_shadow; @@ -523,6 +526,12 @@ static inline void decache_tsc_multiplier(struct vcpu_vmx *vmx) vmcs_write64(TSC_MULTIPLIER, vmx->current_tsc_ratio); } +static inline bool vmx
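The MSR handling in the diff above can be modelled outside the kernel. The following is a minimal userspace sketch, not KVM code: struct vcpu_model and both function names are hypothetical, and only the two rules visible in the patch are modelled (a write is rejected when reserved bit 1 is set or when a non-host access lacks WAITPKG, and the MSR is switched on entry/exit only when the guest value differs from the cached host value).

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 1 of IA32_UMWAIT_CONTROL is reserved, as the patch comment states. */
#define UMWAIT_CTRL_RESERVED_BIT (UINT64_C(1) << 1)

/* Hypothetical model of the per-vCPU state; not the kernel's struct vcpu_vmx. */
struct vcpu_model {
	bool has_waitpkg;
	uint64_t guest_umwait_control;
};

/* Returns 0 on success and 1 when the write is rejected, like the patch. */
static int model_set_umwait_control(struct vcpu_model *v, bool host_initiated,
				    uint64_t data)
{
	if (!host_initiated && !v->has_waitpkg)
		return 1;
	if (data & UMWAIT_CTRL_RESERVED_BIT)
		return 1;
	v->guest_umwait_control = data;
	return 0;
}

/* True when guest and host values differ, so the MSR has to be switched. */
static bool model_needs_msr_switch(const struct vcpu_model *v,
				   uint32_t host_cached)
{
	return v->has_waitpkg && v->guest_umwait_control != host_cached;
}
```

Skipping the switch when both values match is what lets the common case avoid touching the MSR at all.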
[PATCH v7 0/3] KVM: x86: Enable user wait instructions
UMONITOR, UMWAIT and TPAUSE are a set of user wait instructions. UMONITOR arms address monitoring hardware using an address. A store to an address within the specified address range triggers the monitoring hardware to wake up the processor waiting in UMWAIT. UMWAIT instructs the processor to enter an implementation-dependent optimized state while monitoring a range of addresses. The optimized state may be either a light-weight power/performance optimized state (C0.1 state) or an improved power/performance optimized state (C0.2 state). TPAUSE instructs the processor to enter an implementation-dependent optimized state (C0.1 or C0.2) and wake up when the time-stamp counter reaches the specified timeout. Availability of the user wait instructions is indicated by the presence of the CPUID feature flag WAITPKG, CPUID.0x07.0x0:ECX[5]. The patches enable the UMONITOR, UMWAIT and TPAUSE features in KVM. Because UMWAIT and TPAUSE can put a (physical) CPU into a power saving state, by default we don't expose them to KVM and enable them only when the guest CPUID has the feature. If the instruction causes a delay, the amount of time delayed is called here the physical delay. The physical delay is first computed by determining the virtual delay (the time to delay relative to the VM’s timestamp counter).
The relevant documentation is at the link below: Intel 64 and IA-32 Architectures Software Developer's Manual, https://software.intel.com/sites/default/files/managed/39/c5/325462-sdm-vol-1-2abcd-3abcd.pdf Changelog: v7: Add nested support for user wait instructions (Paolo) Use the test on vmx->secondary_exec_control to replace guest_cpuid_has (Paolo) v6: add check msr_info->host_initiated in get/set msr(Xiaoyao) restore the atomic_switch_umwait_control_msr()(Xiaoyao) Tao Xu (3): KVM: x86: add support for user wait instructions KVM: vmx: Emulate MSR IA32_UMWAIT_CONTROL KVM: vmx: handle vm-exit for UMWAIT and TPAUSE arch/x86/include/asm/vmx.h | 1 + arch/x86/include/uapi/asm/vmx.h | 6 ++- arch/x86/kernel/cpu/umwait.c| 3 +- arch/x86/kvm/cpuid.c| 2 +- arch/x86/kvm/vmx/nested.c | 4 ++ arch/x86/kvm/vmx/vmx.c | 69 + arch/x86/kvm/vmx/vmx.h | 9 + arch/x86/kvm/x86.c | 1 + 8 files changed, 92 insertions(+), 3 deletions(-) -- 2.20.1
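The cover letter identifies the feature flag as CPUID.0x07.0x0:ECX[5]. As a small illustration, the bit test can be written as below. This is a sketch only: the macro and function names are hypothetical, and the function takes an ECX value that the caller has already read from CPUID leaf 7, subleaf 0.

```c
#include <stdbool.h>
#include <stdint.h>

/* WAITPKG is reported in bit 5 of ECX for CPUID leaf 7, subleaf 0. */
#define WAITPKG_ECX_BIT 5

/* Hypothetical helper: tests the WAITPKG bit in an already-read ECX value. */
static bool ecx_reports_waitpkg(uint32_t ecx)
{
	return (ecx >> WAITPKG_ECX_BIT) & 1u;
}
```

On x86 with GCC or Clang, the ECX value could be obtained in user space with `__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx)` from `<cpuid.h>`. Whether a guest sees the bit depends on what the hypervisor exposes, which is what this series controls.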
Re: linux-next: build failure after merge of the char-misc tree
On Fri, Jul 12, 2019 at 10:44:30AM +1000, Stephen Rothwell wrote: > Hi all, > > On Mon, 8 Jul 2019 19:23:45 +1000 Stephen Rothwell > wrote: > > > > After merging the char-misc tree, today's linux-next build (x86_64 > > allmodconfig) failed like this: > > > > drivers/misc/vmw_balloon.c: In function 'vmballoon_mount': > > drivers/misc/vmw_balloon.c:1736:14: error: 'simple_dname' undeclared (first > > use in this function); did you mean 'simple_rename'? > >.d_dname = simple_dname, > > ^~~~ > > simple_rename > > drivers/misc/vmw_balloon.c:1736:14: note: each undeclared identifier is > > reported only once for each function it appears in > > drivers/misc/vmw_balloon.c:1739:9: error: implicit declaration of function > > 'mount_pseudo'; did you mean 'mount_bdev'? > > [-Werror=implicit-function-declaration] > > return mount_pseudo(fs_type, "balloon-vmware:", NULL, &ops, > > ^~~~ > > mount_bdev > > drivers/misc/vmw_balloon.c:1739:9: warning: returning 'int' from a function > > with return type 'struct dentry *' makes pointer from integer without a > > cast [-Wint-conversion] > > return mount_pseudo(fs_type, "balloon-vmware:", NULL, &ops, > > ^~~~ > > BALLOON_VMW_MAGIC); > > ~~ > > > > Caused by commit > > > > 83a8afa72e9c ("vmw_balloon: Compaction support") > > > > interacting with commits > > > > 7e5f7bb08b8c ("unexport simple_dname()") > > 8d9e46d80777 ("fold mount_pseudo_xattr() into pseudo_fs_get_tree()") > > > > from the vfs tree. 
> > > > I applied the following merge fix patch: > > > > From: Stephen Rothwell > > Date: Mon, 8 Jul 2019 19:17:56 +1000 > > Subject: [PATCH] convert vmwballoon to use the new mount API > > > > Signed-off-by: Stephen Rothwell > > --- > > drivers/misc/vmw_balloon.c | 14 -- > > 1 file changed, 4 insertions(+), 10 deletions(-) > > > > diff --git a/drivers/misc/vmw_balloon.c b/drivers/misc/vmw_balloon.c > > index 91fa43051535..e8c0f7525f13 100644 > > --- a/drivers/misc/vmw_balloon.c > > +++ b/drivers/misc/vmw_balloon.c > > @@ -29,6 +29,7 @@ > > #include > > #include > > #include > > +#include > > #include > > #include > > #include > > @@ -1728,21 +1729,14 @@ static inline void vmballoon_debugfs_exit(struct > > vmballoon *b) > > > > #ifdef CONFIG_BALLOON_COMPACTION > > > > -static struct dentry *vmballoon_mount(struct file_system_type *fs_type, > > - int flags, const char *dev_name, > > - void *data) > > +static int vmballoon_init_fs_context(struct fs_context *fc) > > { > > - static const struct dentry_operations ops = { > > - .d_dname = simple_dname, > > - }; > > - > > - return mount_pseudo(fs_type, "balloon-vmware:", NULL, &ops, > > - BALLOON_VMW_MAGIC); > > + return init_pseudo(fc, BALLOON_VMW_MAGIC) ? 0 : -ENOMEM; > > } > > > > static struct file_system_type vmballoon_fs = { > > .name = "balloon-vmware", > > - .mount = vmballoon_mount, > > + .init_fs_context = vmballoon_init_fs_context, > > .kill_sb= kill_anon_super, > > }; > > > > This is now a conflict between the vfs tree and Linus' tree. Looks good to me, I'll watch out for this when Al's tree is merged. thanks, greg k-h
Re: Staging status of speakup
On Sun, Jul 07, 2019 at 08:57:10AM +0200, Greg Kroah-Hartman wrote: > On Sat, Jul 06, 2019 at 08:08:57PM +0100, Okash Khawaja wrote: > > On Fri, 15 Mar 2019 20:18:31 -0700 > > Greg Kroah-Hartman wrote: > > > > > On Fri, Mar 15, 2019 at 01:01:27PM +, Okash Khawaja wrote: > > > > Hi, > > > > > > > > We have made progress on the items in TODO file of speakup driver in > > > > staging directory and wanted to get some clarity on the remaining > > > > items. Below is a summary of status of each item along with the > > > > quotes from TODO file. > > > > > > > > 1. "The first issue has to do with the way speakup communicates > > > > with serial ports. Currently, we communicate directly with the > > > > hardware ports. This however conflicts with the standard serial > > > > port drivers, which poses various problems. This is also not > > > > working for modern hardware such as PCI-based serial ports. Also, > > > > there is not a way we can communicate with USB devices. The > > > > current serial port handling code is in serialio.c in this > > > > directory." > > > > > > > > Drivers for all external synths now use TTY to communcate with the > > > > devices. Only ones still using direct communication with hardware > > > > ports are internal synths: acntpc, decpc, dtlk and keypc. These are > > > > typically ISA cards and generally hardware which is difficult to > > > > make work. We can leave these in staging. > > > > > > Ok, that's fine. > > > > > > > 2. "Some places are currently using in_atomic() because speakup > > > > functions are called in various contexts, and a couple of things > > > > can't happen in these cases. Pushing work to some worker thread > > > > would probably help, as was already done for the serial port > > > > driving part." > > > > > > > > There aren't any uses of in_atomic anymore. Commit d7500135802c > > > > "Staging: speakup: Move pasting into a work item" was the last one > > > > that removed such uses. 
> > > > > > Great, let's remove that todo item then. > > > > > > > 3. "There is a duplication of the selection functions in > > > > selections.c. These functions should get exported from > > > > drivers/char/selection.c (clear_selection notably) and used from > > > > there instead." > > > > > > > > This is yet to be done. I guess drivers/char/selection.c is now > > > > under drivers/tty/vt/selection.c. > > > > > > Yes, someone should update the todo item :) > > > > > > > 4. "The kobjects may have to move to a more proper place in /sys.The > > > > discussion on lkml resulted to putting speech synthesizers in the > > > > "speech" class, and the speakup screen reader itself > > > > into /sys/class/vtconsole/vtcon0/speakup, the nasty path being > > > > handled by userland tools." > > > > > > > > Although this makes logical sense, the change will mean changing > > > > interface with userspace and hence the user space tools. I tried to > > > > search the lkml discussion but couldn't find it. It will be good to > > > > know your thoughts on this. > > > > > > I don't remember, sorry. I can review the kobject/sysfs usage if you > > > think it is "good enough" now and see if I find anything > > > objectionable. > > > > > > > Finally there is an issue where text in output buffer sometimes gets > > > > garbled on SMP systems, but we can continue working on it after the > > > > driver is moved out of staging, if that's okay. Basically we need a > > > > reproducer of this issue. > > > > > > > > In addition to above, there are likely code style issues which will > > > > need to be fixed. > > > > > > > > We are very keen to get speakup out of staging both, for settling > > > > the driver but also for getting included in distros which build > > > > only the mainline drivers. > > > > > > That's great, I am glad to see this happen. How about work on the > > > selection thing and then I can review the kobject stuff in a few > > > weeks, and then we can start moving things for 5.2? 
> > > > Hi Greg, > > > > Apologies for the delay. I de-duplicated selection code in speakup to > > use code that's already in kernel (commit ids 496124e5e16e and > > 41f13084506a). Following items are what remain now: > > > > 1. moving kobjects location > > 2. fixing garbled text > > > > I couldn't replicate garbled text but Simon (also in CC list) is > > looking into it. > > > > Can you please advise on the way forward? > > I don't think the "garbled text" is an issue to get this out of staging > if others do not see this. It can be fixed like any other bug at a > later point if it is figured out. > > The kobject stuff does need to be looked at. Let me carve out some time > next week to do that and I will let you know what I see/recommend. At first glance, this might all be just fine. But, I can't quite figure out what some files are doing. No matter what, you will need Documentation/ABI/ entries for the speakup code for these sysfs files. Can you make up a patch to create a drivers/
Re: [PATCH v3] media: si2168: Refactor command setup code
Hello, On Thu, Jul 04, 2019 at 12:33:22PM +0200, Marc Gonzalez wrote: > Refactor the command setup code, and let the compiler determine > the size of each command. > > Reviewed-by: Jonathan Neuschäfer > Signed-off-by: Marc Gonzalez > --- > Changes from v1: > - Use a real function to populate struct si2168_cmd *cmd, and a trivial > macro wrapping it (macro because sizeof). > Changes from v2: > - Fix header mess > - Add Jonathan's tag > --- > drivers/media/dvb-frontends/si2168.c | 146 +-- > 1 file changed, 45 insertions(+), 101 deletions(-) > > diff --git a/drivers/media/dvb-frontends/si2168.c > b/drivers/media/dvb-frontends/si2168.c > index c64b360ce6b5..5e81e076369c 100644 > --- a/drivers/media/dvb-frontends/si2168.c > +++ b/drivers/media/dvb-frontends/si2168.c > @@ -12,6 +12,16 @@ > > static const struct dvb_frontend_ops si2168_ops; > > +static void cmd_setup(struct si2168_cmd *cmd, char *args, int wlen, int rlen) I'd add an "inline" here. And you could add a const for *args. > +{ > + memcpy(cmd->args, args, wlen); > + cmd->wlen = wlen; > + cmd->rlen = rlen; > +} > + > +#define CMD_SETUP(cmd, args, rlen) \ > + cmd_setup(cmd, args, sizeof(args) - 1, rlen) Here is the chance to add some static checking. Also it is a good habit to put parens around macro arguments. Something like: #define CMD_SETUP(cmd, args, rlen) ({ \ BUILD_BUG_ON(sizeof((args)) - 1 > SI2168_ARGLEN); \ cmd_setup((cmd), (args), __must_be_array((args)) + sizeof((args)) - 1, (rlen)); \ }) Maybe let this macro live in drivers/media/dvb-frontends/si2168_priv.h where struct si2168_cmd is defined? I looked over the transformations in the rest of the patch and they look good. Best regards Uwe
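The static check suggested in the review can be tried in user space. The sketch below is an analog, not the driver code: struct cmd_model and ARGLEN are hypothetical stand-ins for struct si2168_cmd and SI2168_ARGLEN, C11 _Static_assert replaces BUILD_BUG_ON, and the kernel's __must_be_array() guard is not modelled, so the macro is only meaningful when args is a string literal or a char array.

```c
#include <string.h>

#define ARGLEN 30 /* hypothetical stand-in for SI2168_ARGLEN */

/* Hypothetical stand-in for struct si2168_cmd. */
struct cmd_model {
	unsigned char args[ARGLEN];
	unsigned int wlen;
	unsigned int rlen;
};

static inline void cmd_setup(struct cmd_model *cmd, const char *args,
			     unsigned int wlen, unsigned int rlen)
{
	memcpy(cmd->args, args, wlen);
	cmd->wlen = wlen;
	cmd->rlen = rlen;
}

/*
 * sizeof(args) counts the trailing NUL of a string literal, hence the -1.
 * The static assertion fails the build if the command does not fit.
 */
#define CMD_SETUP(cmd, args, rlen) do {					\
	_Static_assert(sizeof(args) - 1 <= ARGLEN, "command too long");	\
	cmd_setup((cmd), (args), sizeof(args) - 1, (rlen));		\
} while (0)
```

A call such as `CMD_SETUP(&cmd, "\x12\x00", 4)` sets wlen to 2 without the caller having to count bytes by hand.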
Re: [PATCH v2] printk: Do not lose last line in kmsg buffer dump
On Fri, Jul 12, 2019 at 10:09:04AM +0200, Petr Mladek wrote: > The patch looks like a hack using a hole that the next cycle > does not longer check the number of really stored characters. > > What would happen when msg_print_text() starts adding > the trailing '\0' as suggested by > https://lkml.kernel.org/r/20190710121049.rwhk7fknfzn3c...@pathway.suse.cz I did have a look at that possibility, but I didn't see how that could work without potentially affecting userspace users of the syslog ABI. AFAICS the suggested change in msg_print_text() can be done in one of three ways: (1) msg_print_text() adds the '\0' and includes this length both when it estimates the size (NULL buffer) and when it actually prints: If we do this: - kmsg_dump_get_line_nolock() would have to subtract 1 from the len since its callers expected that len is always smaller than the size of the buffer. - The buffers given to use via the syslog interface will now include a '\0', potentially affecting userspace applications which use this ABI. (2) msg_print_text() adds the '\0', and includes this in the length only when estimating the size, and not when it actually prints. If we do this: - SYSLOG_ACTION_SIZE_UNREAD tries uses the size estimate to give userspace a count of how many characters are present in the buffer, and now this count will start differing from the actual count that can be read, potentially affecting userspace applications. (3) msg_print_text() adds the '\0', and does not include this length in the result at all. If we do this: - The original kmsg dump issue is not solved, since the last line is still lost. > BTW: What is the motivation for this fix? Is a bug report > or just some research of possible buffer overflows? The fix is not attempting to fix a buffer overflow, theoretical or otherwise. 
It's a fix for a functional bug which has been observed on our systems: we use pstore to save the kernel log when the kernel crashes, and sometimes the log in the pstore misses the last line. Since the last line usually says why we're panicking, it's rather important not to miss it. > The commit message pretends that the problem is bigger than > it really is. It is about one byte and not one line. I'm not quite sure I follow. The current code does fail to include the *entire* last line. The memcpy on line #1294 is never executed for the last line because the check on line #1289 stops the loop: 1270 static size_t msg_print_text(const struct printk_log *msg, bool syslog, char *buf, size_t size) 1271 { 1272 const char *text = log_text(msg); 1273 size_t text_size = msg->text_len; 1274 size_t len = 0; 1275 1276 do { 1277 const char *next = memchr(text, '\n', text_size); 1278 size_t text_len; 1279 1280 if (next) { 1281 text_len = next - text; 1282 next++; 1283 text_size -= next - text; 1284 } else { 1285 text_len = text_size; 1286 } 1287 1288 if (buf) { 1289 if (print_prefix(msg, syslog, NULL) + 1290 text_len + 1 > size - len) 1291 break; 1292 1293 len += print_prefix(msg, syslog, buf + len); 1294 memcpy(buf + len, text, text_len); 1295 len += text_len; 1296 buf[len++] = '\n'; 1297 } else { 1298 /* SYSLOG_ACTION_* buffer size only calculation */ 1299 len += print_prefix(msg, syslog, NULL); 1300 len += text_len; 1301 len++; 1302 }
Re: linux-next: Fixes tag needs some work in the block tree
On 19-07-11 16:03:22, Jens Axboe wrote: > On 7/11/19 3:35 PM, Stephen Rothwell wrote: > > Hi all, > > > > In commit > > > >8f3858763d33 ("nvme: fix NULL deref for fabrics options") > > > > Fixes tag > > > >Fixes: 958f2a0f8 ("nvme-tcp: set the STABLE_WRITES flag when data digests > > > > has these problem(s): > > > >- SHA1 should be at least 12 digits long > > Can be fixed by setting core.abbrev to 12 (or more) or (for git v2.11 > > or later) just making sure it is not set (or set to "auto"). > >- Subject has leading but no trailing parentheses > >- Subject has leading but no trailing quotes > > > > Please do not split Fixes tags over more than one line. Also do not > > include blank lines among the tags. I'm sorry for the noise here. I will keep that in mind. Thanks Stephen, > > I should have caught that. Since it's top-of-tree and recent, I'll > amend it. Jens, I will get it right from next time. Thanks for amending it.
[PATCH] staging: android: ion: Remove unused rbtree for ion_buffer
ion_buffer_add() insert ion_buffer into rbtree every time creating an ion_buffer but never use it after ION reworking. Also, buffer_lock protects only rbtree operation, remove it together. Signed-off-by: Lecopzer Chen Cc: YJ Chiang Cc: Lecopzer Chen --- drivers/staging/android/ion/ion.c | 36 --- drivers/staging/android/ion/ion.h | 10 + 2 files changed, 1 insertion(+), 45 deletions(-) diff --git a/drivers/staging/android/ion/ion.c b/drivers/staging/android/ion/ion.c index 92c2914239e3..e6b1ca141b93 100644 --- a/drivers/staging/android/ion/ion.c +++ b/drivers/staging/android/ion/ion.c @@ -29,32 +29,6 @@ static struct ion_device *internal_dev; static int heap_id; -/* this function should only be called while dev->lock is held */ -static void ion_buffer_add(struct ion_device *dev, - struct ion_buffer *buffer) -{ - struct rb_node **p = &dev->buffers.rb_node; - struct rb_node *parent = NULL; - struct ion_buffer *entry; - - while (*p) { - parent = *p; - entry = rb_entry(parent, struct ion_buffer, node); - - if (buffer < entry) { - p = &(*p)->rb_left; - } else if (buffer > entry) { - p = &(*p)->rb_right; - } else { - pr_err("%s: buffer already found.", __func__); - BUG(); - } - } - - rb_link_node(&buffer->node, parent, p); - rb_insert_color(&buffer->node, &dev->buffers); -} - /* this function should only be called while dev->lock is held */ static struct ion_buffer *ion_buffer_create(struct ion_heap *heap, struct ion_device *dev, @@ -100,9 +74,6 @@ static struct ion_buffer *ion_buffer_create(struct ion_heap *heap, INIT_LIST_HEAD(&buffer->attachments); mutex_init(&buffer->lock); - mutex_lock(&dev->buffer_lock); - ion_buffer_add(dev, buffer); - mutex_unlock(&dev->buffer_lock); return buffer; err1: @@ -131,11 +102,6 @@ void ion_buffer_destroy(struct ion_buffer *buffer) static void _ion_buffer_destroy(struct ion_buffer *buffer) { struct ion_heap *heap = buffer->heap; - struct ion_device *dev = buffer->dev; - - mutex_lock(&dev->buffer_lock); - rb_erase(&buffer->node, 
&dev->buffers); - mutex_unlock(&dev->buffer_lock); if (heap->flags & ION_HEAP_FLAG_DEFER_FREE) ion_heap_freelist_add(heap, buffer); @@ -694,8 +660,6 @@ static int ion_device_create(void) } idev->debug_root = debugfs_create_dir("ion", NULL); - idev->buffers = RB_ROOT; - mutex_init(&idev->buffer_lock); init_rwsem(&idev->lock); plist_head_init(&idev->heaps); internal_dev = idev; diff --git a/drivers/staging/android/ion/ion.h b/drivers/staging/android/ion/ion.h index e291299fd35f..74914a266e25 100644 --- a/drivers/staging/android/ion/ion.h +++ b/drivers/staging/android/ion/ion.h @@ -23,7 +23,6 @@ /** * struct ion_buffer - metadata for a particular buffer - * @node: node in the ion_device buffers tree * @list: element in list of deferred freeable buffers * @dev: back pointer to the ion_device * @heap: back pointer to the heap the buffer came from @@ -39,10 +38,7 @@ * @attachments: list of devices attached to this buffer */ struct ion_buffer { - union { - struct rb_node node; - struct list_head list; - }; + struct list_head list; struct ion_device *dev; struct ion_heap *heap; unsigned long flags; @@ -61,14 +57,10 @@ void ion_buffer_destroy(struct ion_buffer *buffer); /** * struct ion_device - the metadata of the ion device node * @dev: the actual misc device - * @buffers: an rb tree of all the existing buffers - * @buffer_lock: lock protecting the tree of buffers * @lock: rwsem protecting the tree of heaps and clients */ struct ion_device { struct miscdevice dev; - struct rb_root buffers; - struct mutex buffer_lock; struct rw_semaphore lock; struct plist_head heaps; struct dentry *debug_root; -- 2.17.1
[PATCH] mm: sparse: Skip no-map regions in memblocks_present
Do not mark regions that are flagged as nomap as present; otherwise these memblock regions cause unnecessary allocation of metadata. Cc: Andrew Morton Cc: Pavel Tatashin Cc: Oscar Salvador Cc: Michal Hocko Cc: Mike Rapoport Cc: Baoquan He Cc: Qian Cai Cc: Wei Yang Cc: Logan Gunthorpe Cc: linux...@kvack.org Cc: linux-kernel@vger.kernel.org Signed-off-by: KarimAllah Ahmed --- mm/sparse.c | 4 1 file changed, 4 insertions(+) diff --git a/mm/sparse.c b/mm/sparse.c index fd13166..33810b6 100644 --- a/mm/sparse.c +++ b/mm/sparse.c @@ -256,6 +256,10 @@ void __init memblocks_present(void) struct memblock_region *reg; for_each_memblock(memory, reg) { + + if (memblock_is_nomap(reg)) + continue; + memory_present(memblock_get_region_node(reg), memblock_region_memory_base_pfn(reg), memblock_region_memory_end_pfn(reg)); -- 2.7.4
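The effect of the added check can be illustrated with a small standalone model. This is a sketch under assumptions: struct region_model and REGION_NOMAP are hypothetical, and counting page frames stands in for the per-section metadata that memory_present() would set up in the kernel.

```c
#include <stddef.h>

#define REGION_NOMAP 0x1u /* hypothetical flag modelling a no-map region */

/* Hypothetical stand-in for a memblock region, described by PFN range. */
struct region_model {
	unsigned long base_pfn;
	unsigned long end_pfn;
	unsigned int flags;
};

/* Counts the page frames that would be marked present, skipping no-map regions. */
static size_t count_present_pfns(const struct region_model *regs, size_t n)
{
	size_t total = 0;

	for (size_t i = 0; i < n; i++) {
		if (regs[i].flags & REGION_NOMAP)
			continue; /* same early skip as in the patch */
		total += regs[i].end_pfn - regs[i].base_pfn;
	}
	return total;
}
```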
[PATCH] rdma/siw: avoid smp_store_mb() on a u64
The new siw driver fails to build on i386 with drivers/infiniband/sw/siw/siw_qp.c:1025:3: error: invalid output size for constraint '+q' smp_store_mb(*cq->notify, SIW_NOTIFY_NOT); ^ include/asm-generic/barrier.h:141:35: note: expanded from macro 'smp_store_mb' #define smp_store_mb(var, value) __smp_store_mb(var, value) ^ arch/x86/include/asm/barrier.h:65:47: note: expanded from macro '__smp_store_mb' #define __smp_store_mb(var, value) do { (void)xchg(&var, value); } while (0) ^ include/asm-generic/atomic-instrumented.h:1648:2: note: expanded from macro 'xchg' arch_xchg(__ai_ptr, __VA_ARGS__); \ ^ arch/x86/include/asm/cmpxchg.h:78:27: note: expanded from macro 'arch_xchg' #define arch_xchg(ptr, v) __xchg_op((ptr), (v), xchg, "") ^ arch/x86/include/asm/cmpxchg.h:48:19: note: expanded from macro '__xchg_op' : "+q" (__ret), "+m" (*(ptr)) \ ^ drivers/infiniband/sw/siw/siw_qp.o: In function `siw_sqe_complete': siw_qp.c:(.text+0x1450): undefined reference to `__xchg_wrong_size' drivers/infiniband/sw/siw/siw_qp.o: In function `siw_rqe_complete': siw_qp.c:(.text+0x15b0): undefined reference to `__xchg_wrong_size' drivers/infiniband/sw/siw/siw_verbs.o: In function `siw_req_notify_cq': siw_verbs.c:(.text+0x18ff): undefined reference to `__xchg_wrong_size' Since smp_store_mb() has to be an atomic store, but the architecture can only do this on 32-bit quantities or smaller, but 'cq->notify' is a 64-bit word. Apparently the smp_store_mb() is paired with a READ_ONCE() here, which seems like an odd choice because there is only a barrier on the writer side and not the reader, and READ_ONCE() is already not atomic on quantities larger than a CPU register. I suspect it is sufficient to use the (possibly nonatomic) WRITE_ONCE() and an SMP memory barrier here. If it does need to be atomic as well as 64-bit quantities, using an atomic64_set_release()/atomic64_read_acquire() may be a better choice. 
Fixes: 303ae1cdfdf7 ("rdma/siw: application interface") Fixes: f29dd55b0236 ("rdma/siw: queue pair methods") Cc: Peter Zijlstra Signed-off-by: Arnd Bergmann --- drivers/infiniband/sw/siw/siw_qp.c| 4 +++- drivers/infiniband/sw/siw/siw_verbs.c | 5 +++-- 2 files changed, 6 insertions(+), 3 deletions(-) diff --git a/drivers/infiniband/sw/siw/siw_qp.c b/drivers/infiniband/sw/siw/siw_qp.c index 11383d9f95ef..a2c08f17f13d 100644 --- a/drivers/infiniband/sw/siw/siw_qp.c +++ b/drivers/infiniband/sw/siw/siw_qp.c @@ -1016,13 +1016,15 @@ static bool siw_cq_notify_now(struct siw_cq *cq, u32 flags) if (!cq->base_cq.comp_handler) return false; + smp_rmb(); cq_notify = READ_ONCE(*cq->notify); if ((cq_notify & SIW_NOTIFY_NEXT_COMPLETION) || ((cq_notify & SIW_NOTIFY_SOLICITED) && (flags & SIW_WQE_SOLICITED))) { /* dis-arm CQ */ - smp_store_mb(*cq->notify, SIW_NOTIFY_NOT); + WRITE_ONCE(*cq->notify, SIW_NOTIFY_NOT); + smp_wmb(); return true; } diff --git a/drivers/infiniband/sw/siw/siw_verbs.c b/drivers/infiniband/sw/siw/siw_verbs.c index 32dc79d0e898..41c5ab293fe1 100644 --- a/drivers/infiniband/sw/siw/siw_verbs.c +++ b/drivers/infiniband/sw/siw/siw_verbs.c @@ -1142,10 +1142,11 @@ int siw_req_notify_cq(struct ib_cq *base_cq, enum ib_cq_notify_flags flags) if ((flags & IB_CQ_SOLICITED_MASK) == IB_CQ_SOLICITED) /* CQ event for next solicited completion */ - smp_store_mb(*cq->notify, SIW_NOTIFY_SOLICITED); + WRITE_ONCE(*cq->notify, SIW_NOTIFY_SOLICITED); else /* CQ event for any signalled completion */ - smp_store_mb(*cq->notify, SIW_NOTIFY_ALL); + WRITE_ONCE(*cq->notify, SIW_NOTIFY_ALL); + smp_wmb(); if (flags & IB_CQ_REPORT_MISSED_EVENTS) return cq->cq_put - cq->cq_get; -- 2.20.0
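The commit message mentions atomic64_set_release()/atomic64_read_acquire() as a possible alternative. The sketch below shows the same release/acquire pairing in user space with C11 <stdatomic.h>. It is an analog, not the siw code: struct cq_model, the NOTIFY_* values, and the function names are hypothetical, and the load followed by a store is not a single atomic read-modify-write.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag values; the driver's SIW_NOTIFY_* values are not shown here. */
#define NOTIFY_NOT             UINT64_C(0)
#define NOTIFY_SOLICITED       UINT64_C(1)
#define NOTIFY_NEXT_COMPLETION UINT64_C(2)

/* Hypothetical stand-in for the completion queue's 64-bit notify word. */
struct cq_model {
	_Atomic uint64_t notify;
};

/* Writer side: publish the requested notification with release ordering. */
static void cq_arm(struct cq_model *cq, uint64_t flags)
{
	atomic_store_explicit(&cq->notify, flags, memory_order_release);
}

/* Reader side: acquire-load the flags and disarm if a notification is due. */
static bool cq_notify_now(struct cq_model *cq, bool solicited)
{
	uint64_t n = atomic_load_explicit(&cq->notify, memory_order_acquire);

	if ((n & NOTIFY_NEXT_COMPLETION) ||
	    ((n & NOTIFY_SOLICITED) && solicited)) {
		atomic_store_explicit(&cq->notify, NOTIFY_NOT,
				      memory_order_release);
		return true;
	}
	return false;
}
```

Pairing a release store with an acquire load orders both sides, which is the asymmetry the commit message points out in the original smp_store_mb()/READ_ONCE() combination.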
[PATCH] rdma/siw: select CONFIG_DMA_VIRT_OPS
Without this symbol we get a link failure: ERROR: "dma_virt_ops" [drivers/infiniband/sw/siw/siw.ko] undefined! Fixes: bdcf26bf9b3a ("rdma/siw: network and RDMA core interface") Signed-off-by: Arnd Bergmann --- drivers/infiniband/sw/siw/Kconfig | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/infiniband/sw/siw/Kconfig b/drivers/infiniband/sw/siw/Kconfig index 94f684174ce3..ea282789f466 100644 --- a/drivers/infiniband/sw/siw/Kconfig +++ b/drivers/infiniband/sw/siw/Kconfig @@ -1,6 +1,7 @@ config RDMA_SIW tristate "Software RDMA over TCP/IP (iWARP) driver" depends on INET && INFINIBAND && CRYPTO_CRC32 + select DMA_VIRT_OPS help This driver implements the iWARP RDMA transport over the Linux TCP/IP network stack. It enables a system with a -- 2.20.0
[PATCH] rdma/siw: fix enum type mismatch warnings
The values in map_cqe_status[] don't match the type: drivers/infiniband/sw/siw/siw_cq.c:31:4: error: implicit conversion from enumeration type 'enum siw_wc_status' to different enumeration type 'enum siw_opcode' [-Werror,-Wenum-conversion] { SIW_WC_SUCCESS, IB_WC_SUCCESS }, ~ ^~ drivers/infiniband/sw/siw/siw_cq.c:32:4: error: implicit conversion from enumeration type 'enum siw_wc_status' to different enumeration type 'enum siw_opcode' [-Werror,-Wenum-conversion] { SIW_WC_LOC_LEN_ERR, IB_WC_LOC_LEN_ERR }, ~ ^~ Change the struct definition to make them match and stop the warning. Fixes: b0fff7317bb4 ("rdma/siw: completion queue methods") Signed-off-by: Arnd Bergmann --- drivers/infiniband/sw/siw/siw_cq.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/infiniband/sw/siw/siw_cq.c b/drivers/infiniband/sw/siw/siw_cq.c index e2a0ee40d5b5..e381ae9b7d62 100644 --- a/drivers/infiniband/sw/siw/siw_cq.c +++ b/drivers/infiniband/sw/siw/siw_cq.c @@ -25,7 +25,7 @@ static int map_wc_opcode[SIW_NUM_OPCODES] = { }; static struct { - enum siw_opcode siw; + enum siw_wc_status siw; enum ib_wc_status ib; } map_cqe_status[SIW_NUM_WC_STATUS] = { { SIW_WC_SUCCESS, IB_WC_SUCCESS }, -- 2.20.0
[PATCH] platform/x86: pcengines-apu2 needs gpiolib
I ran into another build issue in randconfig testing for this driver, when CONFIG_GPIOLIB is not set: WARNING: unmet direct dependencies detected for GPIO_AMD_FCH Depends on [n]: GPIOLIB [=n] && HAS_IOMEM [=y] Selected by [y]: - PCENGINES_APU2 [=y] && X86 [=y] && X86_PLATFORM_DEVICES [=y] && INPUT [=y] && INPUT_KEYBOARD [=y] && LEDS_CLASS [=y] WARNING: unmet direct dependencies detected for KEYBOARD_GPIO_POLLED Depends on [n]: !UML && INPUT [=y] && INPUT_KEYBOARD [=y] && GPIOLIB [=n] Selected by [y]: - PCENGINES_APU2 [=y] && X86 [=y] && X86_PLATFORM_DEVICES [=y] && INPUT [=y] && INPUT_KEYBOARD [=y] && LEDS_CLASS [=y] Make the 'select' statements conditional on that so we don't have to introduce another 'select'. Fixes: f8eb0235f659 ("x86: pcengines apuv2 gpio/leds/keys platform driver") Fixes: a422bf11bdb4 ("platform/x86: fix PCENGINES_APU2 Kconfig warning") Signed-off-by: Arnd Bergmann --- drivers/platform/x86/Kconfig | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/platform/x86/Kconfig b/drivers/platform/x86/Kconfig index e869a5c760b6..cf48b9068843 100644 --- a/drivers/platform/x86/Kconfig +++ b/drivers/platform/x86/Kconfig @@ -1324,8 +1324,8 @@ config PCENGINES_APU2 tristate "PC Engines APUv2/3 front button and LEDs driver" depends on INPUT && INPUT_KEYBOARD depends on LEDS_CLASS - select GPIO_AMD_FCH - select KEYBOARD_GPIO_POLLED + select GPIO_AMD_FCH if GPIOLIB + select KEYBOARD_GPIO_POLLED if GPIOLIB select LEDS_GPIO help This driver provides support for the front button and LEDs on -- 2.20.0
Re: BUG: MAX_STACK_TRACE_ENTRIES too low! (2)
On Thu, Jul 11, 2019 at 11:53:12AM -0700, Bart Van Assche wrote: > On 7/10/19 3:09 PM, Peter Zijlstra wrote: > > One thing I mentioned when Thomas did the unwinder API changes was > > trying to move lockdep over to something like stackdepot. > > > > We can't directly use stackdepot as is, because it uses locks and memory > > allocation, but we could maybe add a lower level API to it and use that > > under the graph_lock() on static storage or something. > > > > Otherwise we'll have to (re)implement something like it. > > > > I've not looked at it in detail. > > Hi Peter, > > Is something like the untested patch below perhaps what you had in mind? Most excellent, yes! Now I suppose the $64000 question is if it actually reduces the amount of storage we use for stack traces.. Seems to boot just fine.. :-)
[PATCH] ASoC: audio-graph-card: fix type mismatch warning
The new temporary variable lacks a 'const' annotation: sound/soc/generic/audio-graph-card.c:87:7: error: assigning to 'u32 *' (aka 'unsigned int *') from 'const void *' discards qualifiers [-Werror,-Wincompatible-pointer-types-discards-qualifiers] Fixes: c152f8491a8d ("ASoC: audio-graph-card: fix an use-after-free in graph_get_dai_id()") Signed-off-by: Arnd Bergmann --- sound/soc/generic/audio-graph-card.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/sound/soc/generic/audio-graph-card.c b/sound/soc/generic/audio-graph-card.c index c8abb86afefa..288df245b2f0 100644 --- a/sound/soc/generic/audio-graph-card.c +++ b/sound/soc/generic/audio-graph-card.c @@ -63,7 +63,7 @@ static int graph_get_dai_id(struct device_node *ep) struct device_node *endpoint; struct of_endpoint info; int i, id; - u32 *reg; + const u32 *reg; int ret; /* use driver specified DAI ID if exist */ -- 2.20.0
[PATCH 1/2] f2fs: introduce {page,io}_is_mergeable() for readability
Wrap merge condition into function for readability, no logic change. Signed-off-by: Chao Yu --- v2: remove bio validation check in page_is_mergeable(). fs/f2fs/data.c | 40 +--- 1 file changed, 33 insertions(+), 7 deletions(-) diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c index 6a8db4abdf5f..f1e401f9fc13 100644 --- a/fs/f2fs/data.c +++ b/fs/f2fs/data.c @@ -482,6 +482,33 @@ int f2fs_submit_page_bio(struct f2fs_io_info *fio) return 0; } +static bool page_is_mergeable(struct f2fs_sb_info *sbi, struct bio *bio, + block_t last_blkaddr, block_t cur_blkaddr) +{ + if (last_blkaddr != cur_blkaddr) + return false; + return __same_bdev(sbi, cur_blkaddr, bio); +} + +static bool io_type_is_mergeable(struct f2fs_bio_info *io, + struct f2fs_io_info *fio) +{ + if (io->fio.op != fio->op) + return false; + return io->fio.op_flags == fio->op_flags; +} + +static bool io_is_mergeable(struct f2fs_sb_info *sbi, struct bio *bio, + struct f2fs_bio_info *io, + struct f2fs_io_info *fio, + block_t last_blkaddr, + block_t cur_blkaddr) +{ + if (!page_is_mergeable(sbi, bio, last_blkaddr, cur_blkaddr)) + return false; + return io_type_is_mergeable(io, fio); +} + int f2fs_merge_page_bio(struct f2fs_io_info *fio) { struct bio *bio = *fio->bio; @@ -495,8 +522,8 @@ int f2fs_merge_page_bio(struct f2fs_io_info *fio) trace_f2fs_submit_page_bio(page, fio); f2fs_trace_ios(fio, 0); - if (bio && (*fio->last_block + 1 != fio->new_blkaddr || - !__same_bdev(fio->sbi, fio->new_blkaddr, bio))) { + if (bio && !page_is_mergeable(fio->sbi, bio, *fio->last_block, + fio->new_blkaddr)) { __submit_bio(fio->sbi, bio, fio->type); bio = NULL; } @@ -569,9 +596,8 @@ void f2fs_submit_page_write(struct f2fs_io_info *fio) inc_page_count(sbi, WB_DATA_TYPE(bio_page)); - if (io->bio && (io->last_block_in_bio != fio->new_blkaddr - 1 || - (io->fio.op != fio->op || io->fio.op_flags != fio->op_flags) || - !__same_bdev(sbi, fio->new_blkaddr, io->bio))) + if (io->bio && !io_is_mergeable(sbi, io->bio, io, fio, + io->last_block_in_bio, 
fio->new_blkaddr)) __submit_merged_bio(io); alloc_new: if (io->bio == NULL) { @@ -1643,8 +1669,8 @@ static int f2fs_read_single_page(struct inode *inode, struct page *page, * This page will go to BIO. Do we need to send this * BIO off first? */ - if (bio && (*last_block_in_bio != block_nr - 1 || - !__same_bdev(F2FS_I_SB(inode), block_nr, bio))) { + if (bio && !page_is_mergeable(F2FS_I_SB(inode), bio, + *last_block_in_bio, block_nr - 1)) { submit_and_realloc: __submit_bio(F2FS_I_SB(inode), bio, DATA); bio = NULL; -- 2.18.0.rc1
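A minimal user-space sketch of the predicates this patch factors out, for illustration only. The names `blk_t`, `struct io_desc` and the `same_dev` flag are stand-ins invented here; the kernel code uses `block_t`, `struct f2fs_io_info` and `__same_bdev()`. The sketch assumes the adjacency rule expressed by the removed lines, i.e. a page is mergeable when the current block immediately follows the last one (`last + 1 == cur`). Note that the helper as posted compares its two arguments for plain equality, so a caller has to pass `cur - 1` (as the f2fs_read_single_page() hunk does) to keep that rule.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t blk_t;            /* stand-in for the kernel's block_t */

struct io_desc {                   /* hypothetical: only the compared fields */
	int op;
	int op_flags;
};

/* A page may join the bio only if it is adjacent and on the same device. */
static bool page_is_mergeable(bool same_dev, blk_t last_blkaddr,
			      blk_t cur_blkaddr)
{
	if (last_blkaddr + 1 != cur_blkaddr)
		return false;
	return same_dev;
}

/* Both requests must use the same operation and the same flags. */
static bool io_type_is_mergeable(const struct io_desc *io,
				 const struct io_desc *fio)
{
	if (io->op != fio->op)
		return false;
	return io->op_flags == fio->op_flags;
}

/* Write path: adjacency, device and I/O type all have to match. */
static bool io_is_mergeable(bool same_dev, const struct io_desc *io,
			    const struct io_desc *fio,
			    blk_t last_blkaddr, blk_t cur_blkaddr)
{
	if (!page_is_mergeable(same_dev, last_blkaddr, cur_blkaddr))
		return false;
	return io_type_is_mergeable(io, fio);
}
```

Splitting the condition this way keeps each early return tied to one reason for not merging, which is the readability gain the commit message refers to.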
Re: [PATCH] arm: Extend the check for RAM in /dev/mem
On Fri, Jul 12, 2019 at 02:58:18AM +, Raslan, KarimAllah wrote: > On Fri, 2019-07-12 at 08:06 +0530, Anshuman Khandual wrote: > > > > On 07/12/2019 03:51 AM, KarimAllah Ahmed wrote: > > > > > > Some valid RAM can live outside kernel control (e.g. using mem= kernel > > > command-line). For these regions, pfn_valid would return "false" causing > > > system RAM to be mapped as uncached. Use memblock instead to identify RAM. > > > > Once the remaining memory is outside of the kernel (as the admin would have > > intended with mem= command line) what is the particular concern regarding > > the way those get mapped (cached or not) ? It is not to be used any way. > > They can be used by user-space which might lead to them being used by the > kernel. One use-case would be using them as guest memory for KVM as I > detailed > here: > > https://lwn.net/Articles/778240/ From the 32-bit ARM point of view... What if someone's already doing something similar with a non-coherent DSP and is relying on the current behaviour? This change is a user visible behavioural change that could end up breaking userspace. In other words, it isn't something we should rush into. -- RMK's Patch system: https://www.armlinux.org.uk/developer/patches/ FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up According to speedtest.net: 11.9Mbps down 500kbps up
[PATCH] [net-next, netfilter] mlx5: avoid unused variable warning
Without CONFIG_MLX5_ESWITCH we get a harmless warning: drivers/net/ethernet/mellanox/mlx5/core/en_main.c:3467:21: error: unused variable 'priv' [-Werror,-Wunused-variable] struct mlx5e_priv *priv = netdev_priv(dev); Hide the declaration in the same #ifdef as its usage. Fixes: 4e95bc268b91 ("net: flow_offload: add flow_block_cb_setup_simple()") Signed-off-by: Arnd Bergmann --- drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c index 6d0ae87c8ded..b562ba904ea1 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c @@ -3464,7 +3464,9 @@ static LIST_HEAD(mlx5e_block_cb_list); static int mlx5e_setup_tc(struct net_device *dev, enum tc_setup_type type, void *type_data) { +#ifdef CONFIG_MLX5_ESWITCH struct mlx5e_priv *priv = netdev_priv(dev); +#endif switch (type) { #ifdef CONFIG_MLX5_ESWITCH -- 2.20.0
Re: [PATCH 4/4] numa: introduce numa cling feature
On 2019/7/12 下午3:53, Peter Zijlstra wrote: [snip] return target; } >>> >>> Select idle sibling should never cross node boundaries and is thus the >>> entirely wrong place to fix anything. >> >> Hmm.. in our early testing the printk show both select_task_rq_fair() and >> task_numa_find_cpu() will call select_idle_sibling with prev and target on >> different node, thus we pick this point to save few lines. > > But it will never return @prev if it is not in the same cache domain as > @target. See how everything is gated by: > > && cpus_share_cache(x, target) Yeah, that's right. > >> But if the semantics of select_idle_sibling() is to return cpu on the same >> node of target, what about move the logical after select_idle_sibling() for >> the two callers? > > No, that's insane. You don't do select_idle_sibling() to then ignore the > result. You have to change @target before calling select_idle_sibling(). > I see, we should not override the decision of select_idle_sibling(). Actually the original design we try to achieve is: let wake affine select the target try find idle sibling of target if got one pick it else if task cling to prev pick prev That is to consider wake affine superior to numa cling. But after rethinking maybe this is not necessary, since numa cling is also some kind of strong wake affine hint, actually maybe even a better one to filter out the bad cases. I'll try change @target instead and give a retest then. Regards, Michael Wang
[PATCH] xen/trace: avoid clang warning on function pointers
clang-9 does not like the way that the is_signed_type() macro compares function pointers deep inside of the trace event macros: In file included from arch/x86/xen/trace.c:21: In file included from include/trace/events/xen.h:475: In file included from include/trace/define_trace.h:102: In file included from include/trace/trace_events.h:467: include/trace/events/xen.h:69:7: error: ordered comparison of function pointers ('xen_mc_callback_fn_t' (aka 'void (*)(void *)') and 'xen_mc_callback_fn_t') [-Werror,-Wordered-compare-function-pointers] __field(xen_mc_callback_fn_t, fn) ^ include/trace/trace_events.h:415:29: note: expanded from macro '__field' #define __field(type, item) __field_ext(type, item, FILTER_OTHER) ^ include/trace/trace_events.h:401:6: note: expanded from macro '__field_ext' is_signed_type(type), filter_type);\ ^ include/linux/trace_events.h:540:44: note: expanded from macro 'is_signed_type' #define is_signed_type(type)(((type)(-1)) < (type)1) ^ note: (skipping 1 expansions in backtrace; use -fmacro-backtrace-limit=0 to see all) include/trace/trace_events.h:77:16: note: expanded from macro 'TRACE_EVENT' PARAMS(tstruct), \ ~~~^~~~ include/linux/tracepoint.h:95:25: note: expanded from macro 'PARAMS' #define PARAMS(args...) args ^ include/trace/trace_events.h:455:2: note: expanded from macro 'DECLARE_EVENT_CLASS' tstruct;\ ^~~ I guess the warning is reasonable in principle, though this seems to be the only instance we get in the entire kernel today. Shut up the warning by making it a void pointer in the exported structure.
Fixes: c796f213a693 ("xen/trace: add multicall tracing") Signed-off-by: Arnd Bergmann --- include/trace/events/xen.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/include/trace/events/xen.h b/include/trace/events/xen.h index 9a0e8af21310..f75b77414ac1 100644 --- a/include/trace/events/xen.h +++ b/include/trace/events/xen.h @@ -66,7 +66,7 @@ TRACE_EVENT(xen_mc_callback, TP_PROTO(xen_mc_callback_fn_t fn, void *data), TP_ARGS(fn, data), TP_STRUCT__entry( - __field(xen_mc_callback_fn_t, fn) + __field(void *, fn) __field(void *, data) ), TP_fast_assign( -- 2.20.0
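To illustrate what the macro does: is_signed_type() casts -1 and 1 to the given type and compares the two with `<`. For an integer type that answers whether the type is signed; for a function pointer type the same expression becomes an ordered comparison of function pointers, which is what clang flags. A small stand-alone sketch follows, with the macro body copied from the clang note above; the two helper functions are made up for this example.

```c
#include <stdbool.h>

/* Same definition as quoted in the note for include/linux/trace_events.h. */
#define is_signed_type(type) (((type)(-1)) < (type)1)

/* Hypothetical helpers that only expose the macro result. */
static bool int_is_signed(void)
{
	return is_signed_type(int);            /* -1 < 1 */
}

static bool unsigned_is_signed(void)
{
	return is_signed_type(unsigned int);   /* UINT_MAX < 1 is false */
}
```

Declaring the field as `void *`, as the patch does, turns the comparison into one between object pointers, so the function pointer diagnostic no longer applies.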
Re: [PATCH] arm: Extend the check for RAM in /dev/mem
On Fri, 2019-07-12 at 09:56 +0100, Russell King - ARM Linux admin wrote: > On Fri, Jul 12, 2019 at 02:58:18AM +, Raslan, KarimAllah wrote: > > > > On Fri, 2019-07-12 at 08:06 +0530, Anshuman Khandual wrote: > > > > > > > > > On 07/12/2019 03:51 AM, KarimAllah Ahmed wrote: > > > > > > > > > > > > Some valid RAM can live outside kernel control (e.g. using mem= kernel > > > > command-line). For these regions, pfn_valid would return "false" causing > > > > system RAM to be mapped as uncached. Use memblock instead to identify > > > > RAM. > > > > > > Once the remaining memory is outside of the kernel (as the admin would > > > have > > > intended with mem= command line) what is the particular concern regarding > > > the way those get mapped (cached or not) ? It is not to be used any way. > > > > They can be used by user-space which might lead to them being used by the > > kernel. One use-case would be using them as guest memory for KVM as I > > detailed > > here: > > > > https://lwn.net/Articles/778240/ > > From the 32-bit ARM point of view... > > What if someone's already doing something similar with a non-coherent > DSP and is relying on the current behaviour? This change is a user > visible behavioural change that could end up breaking userspace. > > In other words, it isn't something we should rush into. Yes, that makes sense. How about adding a command-line option for this new behavior instead? Would this be more reasonable? Amazon Development Center Germany GmbH Krausenstr. 38 10117 Berlin Geschaeftsfuehrung: Christian Schlaeger, Ralf Herbrich Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B Sitz: Berlin Ust-ID: DE 289 237 879
[PATCH] acpi: fix false-positive -Wuninitialized warning
clang gets confused by an uninitialized variable in what looks to it like a never executed code path: arch/x86/kernel/acpi/boot.c:618:13: error: variable 'polarity' is uninitialized when used here [-Werror,-Wuninitialized] polarity = polarity ? ACPI_ACTIVE_LOW : ACPI_ACTIVE_HIGH; ^~~~ arch/x86/kernel/acpi/boot.c:606:32: note: initialize the variable 'polarity' to silence this warning int rc, irq, trigger, polarity; ^ = 0 arch/x86/kernel/acpi/boot.c:617:12: error: variable 'trigger' is uninitialized when used here [-Werror,-Wuninitialized] trigger = trigger ? ACPI_LEVEL_SENSITIVE : ACPI_EDGE_SENSITIVE; ^~~ arch/x86/kernel/acpi/boot.c:606:22: note: initialize the variable 'trigger' to silence this warning int rc, irq, trigger, polarity; ^ = 0 This is unfortunately a design decision in clang and won't be fixed. Changing the acpi_get_override_irq() macro to an inline function reliably avoids the issue. Signed-off-by: Arnd Bergmann --- include/linux/acpi.h | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/include/linux/acpi.h b/include/linux/acpi.h index a95cce5e82e7..9426b9aaed86 100644 --- a/include/linux/acpi.h +++ b/include/linux/acpi.h @@ -324,7 +324,10 @@ struct irq_domain *acpi_irq_create_hierarchy(unsigned int flags, #ifdef CONFIG_X86_IO_APIC extern int acpi_get_override_irq(u32 gsi, int *trigger, int *polarity); #else -#define acpi_get_override_irq(gsi, trigger, polarity) (-1) +static inline int acpi_get_override_irq(u32 gsi, int *trigger, int *polarity) +{ + return -1; +} #endif /* * This function undoes the effect of one call to acpi_register_gsi(). -- 2.20.0
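The difference between the two stub forms can be sketched in isolation. With the macro, the arguments are discarded by the preprocessor, so the compiler never sees the addresses of the two locals being passed anywhere; with the inline function they are passed to a real call. The names `override_irq_macro`, `override_irq_inline` and `caller` are invented for this sketch, and `caller` only mirrors the general shape suggested by the warning text, not the actual code in arch/x86/kernel/acpi/boot.c.

```c
/* Macro stub: the arguments vanish during preprocessing. */
#define override_irq_macro(gsi, trigger, polarity) (-1)

/* Inline stub in the style of the patch: the arguments are real pointers. */
static inline int override_irq_inline(unsigned int gsi, int *trigger,
				      int *polarity)
{
	(void)gsi;
	(void)trigger;
	(void)polarity;
	return -1;
}

/* Hypothetical caller: the locals are only read when the lookup succeeds. */
static int caller(unsigned int gsi)
{
	int trigger, polarity;

	if (override_irq_inline(gsi, &trigger, &polarity) == 0)
		return trigger + polarity;
	return -1;
}
```

Both stubs return -1, so the branch that reads the locals is never taken in either form; only the inline version makes that visible to the compiler as an ordinary call that could have written through the pointers.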
[PATCH] slab: work around clang bug #42570
Clang gets rather confused about two variables in the same special section when one of them is not initialized, leading to an assembler warning later: /tmp/slab_common-18f869.s: Assembler messages: /tmp/slab_common-18f869.s:7526: Warning: ignoring changed section attributes for .data..ro_after_init Adding an initialization to kmalloc_caches is rather silly here but does avoid the issue. Link: https://bugs.llvm.org/show_bug.cgi?id=42570 Signed-off-by: Arnd Bergmann --- We might decide to wait until this is fixed in clang, but so far all versions targeting x86 seem to be affected. --- mm/slab_common.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/mm/slab_common.c b/mm/slab_common.c index 6c49dbb3769e..807490fe217a 100644 --- a/mm/slab_common.c +++ b/mm/slab_common.c @@ -1028,7 +1028,8 @@ struct kmem_cache *__init create_kmalloc_cache(const char *name, } struct kmem_cache * -kmalloc_caches[NR_KMALLOC_TYPES][KMALLOC_SHIFT_HIGH + 1] __ro_after_init; +kmalloc_caches[NR_KMALLOC_TYPES][KMALLOC_SHIFT_HIGH + 1] __ro_after_init = +{ /* initialization for https://bugs.llvm.org/show_bug.cgi?id=42570 */ }; EXPORT_SYMBOL(kmalloc_caches); /* -- 2.20.0
[PATCH] [net-next] cxgb4: reduce kernel stack usage in cudbg_collect_mem_region()
The cudbg_collect_mem_region() and cudbg_read_fw_mem() both use several hundred bytes of kernel stack space. One gets inlined into the other, which causes the stack usage to be combined beyond the warning limit when building with clang: drivers/net/ethernet/chelsio/cxgb4/cudbg_lib.c:1057:12: error: stack frame size of 1244 bytes in function 'cudbg_collect_mem_region' [-Werror,-Wframe-larger-than=] Restructuring cudbg_collect_mem_region() lets clang do the same optimization that gcc does and reuse the stack slots as it can see that the large variables are never used together. A better fix might be to avoid using cudbg_meminfo on the stack altogether, but that requires a larger rewrite. Fixes: a1c69520f785 ("cxgb4: collect MC memory dump") Signed-off-by: Arnd Bergmann --- .../net/ethernet/chelsio/cxgb4/cudbg_lib.c| 19 +-- 1 file changed, 13 insertions(+), 6 deletions(-) diff --git a/drivers/net/ethernet/chelsio/cxgb4/cudbg_lib.c b/drivers/net/ethernet/chelsio/cxgb4/cudbg_lib.c index a76529a7662d..c2e92786608b 100644 --- a/drivers/net/ethernet/chelsio/cxgb4/cudbg_lib.c +++ b/drivers/net/ethernet/chelsio/cxgb4/cudbg_lib.c @@ -1054,14 +1054,12 @@ static void cudbg_t4_fwcache(struct cudbg_init *pdbg_init, } } -static int cudbg_collect_mem_region(struct cudbg_init *pdbg_init, - struct cudbg_buffer *dbg_buff, - struct cudbg_error *cudbg_err, - u8 mem_type) +static unsigned long cudbg_mem_region_size(struct cudbg_init *pdbg_init, + struct cudbg_error *cudbg_err, + u8 mem_type) { struct adapter *padap = pdbg_init->adap; struct cudbg_meminfo mem_info; - unsigned long size; u8 mc_idx; int rc; @@ -1075,7 +1073,16 @@ static int cudbg_collect_mem_region(struct cudbg_init *pdbg_init, if (rc) return rc; - size = mem_info.avail[mc_idx].limit - mem_info.avail[mc_idx].base; + return mem_info.avail[mc_idx].limit - mem_info.avail[mc_idx].base; +} + +static int cudbg_collect_mem_region(struct cudbg_init *pdbg_init, + struct cudbg_buffer *dbg_buff, + struct cudbg_error *cudbg_err, +
u8 mem_type) +{ + unsigned long size = cudbg_mem_region_size(pdbg_init, cudbg_err, mem_type); + return cudbg_read_fw_mem(pdbg_init, dbg_buff, mem_type, size, cudbg_err); } -- 2.20.0
[PATCH] lib/mpi: fix building with 32-bit x86
The mpi library contains some rather old inline assembly statements that produce a lot of warnings for 32-bit x86, such as: lib/mpi/mpih-div.c:76:16: error: invalid use of a cast in a inline asm context requiring an l-value: remove the cast or build with -fheinous-gnu-extensions udiv_qrnnd(qp[i], n1, n1, np[i], d); ~~~^~~~ lib/mpi/longlong.h:423:20: note: expanded from macro 'udiv_qrnnd' : "=a" ((USItype)(q)), \ ~~^~ There is no point in doing a type cast for the output of an inline assembler statement, so just remove the cast here, as we have done for other architectures in the past. See-also: dea632cadd12 ("lib/mpi: fix build with clang") Signed-off-by: Arnd Bergmann --- lib/mpi/longlong.h | 16 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/lib/mpi/longlong.h b/lib/mpi/longlong.h index 08c60d10747f..3bb6260d8f42 100644 --- a/lib/mpi/longlong.h +++ b/lib/mpi/longlong.h @@ -397,8 +397,8 @@ do { \ #define add_ss(sh, sl, ah, al, bh, bl) \ __asm__ ("addl %5,%1\n" \ "adcl %3,%0" \ - : "=r" ((USItype)(sh)), \ -"=&r" ((USItype)(sl)) \ + : "=r" (sh), \ +"=&r" (sl) \ : "%0" ((USItype)(ah)), \ "g" ((USItype)(bh)), \ "%1" ((USItype)(al)), \ @@ -406,22 +406,22 @@ do { \ #define sub_ddmmss(sh, sl, ah, al, bh, bl) \ __asm__ ("subl %5,%1\n" \ "sbbl %3,%0" \ - : "=r" ((USItype)(sh)), \ -"=&r" ((USItype)(sl)) \ + : "=r" (sh), \ +"=&r" (sl) \ : "0" ((USItype)(ah)), \ "g" ((USItype)(bh)), \ "1" ((USItype)(al)), \ "g" ((USItype)(bl))) #define umul_ppmm(w1, w0, u, v) \ __asm__ ("mull %3" \ - : "=a" ((USItype)(w0)), \ -"=d" ((USItype)(w1)) \ + : "=a" (w0), \ +"=d" (w1) \ : "%0" ((USItype)(u)), \ "rm" ((USItype)(v))) #define udiv_qrnnd(q, r, n1, n0, d) \ __asm__ ("divl %4" \ - : "=a" ((USItype)(q)), \ -"=d" ((USItype)(r)) \ + : "=a" (q), \ +"=d" (r) \ : "0" ((USItype)(n0)), \ "1" ((USItype)(n1)), \ "rm" ((USItype)(d))) -- 2.20.0
[PATCH] x86: math-emu: hide clang warnings for 16-bit overflow
clang warns about a few parts of the math-emu implementation where a 16-bit integer becomes negative during assignment: arch/x86/math-emu/poly_tan.c:88:35: error: implicit conversion from 'int' to 'short' changes value from 49216 to -16320 [-Werror,-Wconstant-conversion] (0x41 + EXTENDED_Ebias) | SIGN_Negative); ^~~~ arch/x86/math-emu/fpu_emu.h:180:58: note: expanded from macro 'setexponent16' #define setexponent16(x,y) { (*(short *)&((x)->exp)) = (y); } ~ ^ arch/x86/math-emu/reg_constant.c:37:32: error: implicit conversion from 'int' to 'short' changes value from 49085 to -16451 [-Werror,-Wconstant-conversion] FPU_REG const CONST_PI2extra = MAKE_REG(NEG, -66, ^~ arch/x86/math-emu/reg_constant.c:21:25: note: expanded from macro 'MAKE_REG' ((EXTENDED_Ebias+(e)) | ((SIGN_##s != 0)*0x8000)) } ~^~ arch/x86/math-emu/reg_constant.c:48:28: error: implicit conversion from 'int' to 'short' changes value from 65535 to -1 [-Werror,-Wconstant-conversion] FPU_REG const CONST_QNaN = MAKE_REG(NEG, EXP_OVER, 0x, 0xC000); ^~~ arch/x86/math-emu/reg_constant.c:21:25: note: expanded from macro 'MAKE_REG' ((EXTENDED_Ebias+(e)) | ((SIGN_##s != 0)*0x8000)) } ~^~ The code seems correct to me, so add a typecast to shut up the warnings. 
Signed-off-by: Arnd Bergmann --- arch/x86/math-emu/fpu_emu.h | 2 +- arch/x86/math-emu/reg_constant.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/x86/math-emu/fpu_emu.h b/arch/x86/math-emu/fpu_emu.h index a5a41ec58072..0c16ca56 100644 --- a/arch/x86/math-emu/fpu_emu.h +++ b/arch/x86/math-emu/fpu_emu.h @@ -177,7 +177,7 @@ static inline void reg_copy(FPU_REG const *x, FPU_REG *y) #define setexponentpos(x,y) { (*(short *)&((x)->exp)) = \ ((y) + EXTENDED_Ebias) & 0x7fff; } #define exponent16(x) (*(short *)&((x)->exp)) -#define setexponent16(x,y) { (*(short *)&((x)->exp)) = (y); } +#define setexponent16(x,y) { (*(short *)&((x)->exp)) = (u16)(y); } #define addexponent(x,y){ (*(short *)&((x)->exp)) += (y); } #define stdexp(x) { (*(short *)&((x)->exp)) += EXTENDED_Ebias; } diff --git a/arch/x86/math-emu/reg_constant.c b/arch/x86/math-emu/reg_constant.c index 8dc9095bab22..742619e94bdf 100644 --- a/arch/x86/math-emu/reg_constant.c +++ b/arch/x86/math-emu/reg_constant.c @@ -18,7 +18,7 @@ #include "control_w.h" #define MAKE_REG(s, e, l, h) { l, h, \ - ((EXTENDED_Ebias+(e)) | ((SIGN_##s != 0)*0x8000)) } + (u16)((EXTENDED_Ebias+(e)) | ((SIGN_##s != 0)*0x8000)) } FPU_REG const CONST_1 = MAKE_REG(POS, 0, 0x, 0x8000); #if 0 -- 2.20.0
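The numbers in the warnings can be reproduced outside the kernel. This sketch assumes `EXTENDED_Ebias` is 0x3fff and that the sign bit is 0x8000; both values are inferred from the arithmetic in the warnings (0x41 + 0x3fff, ORed with 0x8000, gives 49216) and are not quoted from the headers. It also assumes a target where narrowing to 16 bits wraps around, which is how gcc and clang behave.

```c
#include <stdint.h>

#define EXTENDED_EBIAS 0x3fff   /* assumed value, see the text above */
#define SIGN_NEGATIVE  0x8000   /* assumed value, see the text above */

/* Store y in a signed 16-bit slot through an explicit 16-bit unsigned cast. */
static int16_t store_exponent16(int y)
{
	return (int16_t)(uint16_t)y;
}

/* The expression from poly_tan.c, evaluated as a plain int. */
static int tan_exponent(void)
{
	return (0x41 + EXTENDED_EBIAS) | SIGN_NEGATIVE;
}
```

The stored bit pattern is the same with or without the cast; the cast only states that the truncation to 16 bits is intended.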
[GIT PULL] Pin control bulk changes for v5.3
Hi Linus, here is the bulk of pin control changes for the v5.3 kernel cycle. This is pretty linear development in pin control, nothing really stands out. We had a bit of SPDX fuzz with tglx fixing up tags with scripts at the same time as maintainers were fixing up the same tags, but I regard that as a one-off and not a good time for an exercise in "what can be done differently". Let's resolve the conflicts and move on (I don't know if there will be any, don't think so.) Please pull it in! Technical details in the signed tag. Yours, Linus Walleij The following changes since commit a188339ca5a396acc588e5851ed7e19f66b0ebd9: Linux 5.2-rc1 (2019-05-19 15:47:09 -0700) are available in the Git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl.git tags/pinctrl-v5.3-1 for you to fetch changes up to 4c105769bf6de29856bf80a4045e6725301c58ce: pinctrl: aspeed: Strip moved macros and structs from private header (2019-07-10 11:19:20 +0200) This is the bulk of pin control changes for the v5.3 kernel cycle: Core changes: - Device links can optionally be added between a pin control producer and its consumers. This will affect how the system power management is handled: a pin controller will not suspend before all of its consumers have been suspended. This was necessary for the ST Microelectronics STMFX expander and needs to be tested on other systems as well: it makes sense to make this default in the long run. Right now it is opt-in per driver. - Drive strength can be specified in microamps. With decreases in silicon technology, milliamps isn't granular enough, let's make it possible to select drive strengths in microamps. Right now the Meson (Amlogic) driver needs this. New drivers: - New subdriver for the Tegra 194 SoC. - New subdriver for the Qualcomm SDM845. - New subdriver for the Qualcomm SM8150. - New subdriver for the Freescale i.MX8MN (Freescale is now a product line of NXP). - New subdriver for Marvell MV98DX1135.
Driver improvements: - The Bitmain BM1880 driver now supports pin config in addition to muxing. - The Qualcomm drivers can now reserve some GPIOs as taken aside and not usable for users. This is used in ACPI systems to take out some GPIO lines used by the BIOS so that no one else (neither kernel nor userspace) will play with them by mistake and crash the machine. - A slew of refurbishing around the Aspeed drivers (board management controllers for servers) in preparation for the new Aspeed AST2600 SoC. - A slew of improvements over the SH PFC drivers as usual. - Misc cleanups and fixes. Alexandre Torgue (4): pinctrl: stm32: add suspend/resume management pinctrl: stm32: Enable suspend/resume for stm32mp157c SoC pinctrl: stm32: add lock mechanism for irqmux selection dt-bindings: pinctrl: Convert stm32 pinctrl bindings to json-schema Andrew Jeffery (9): dt-bindings: pinctrl: aspeed: Split bindings document in two dt-bindings: pinctrl: aspeed: Convert AST2400 bindings to json-schema dt-bindings: pinctrl: aspeed: Convert AST2500 bindings to json-schema MAINTAINERS: Add entry for ASPEED pinctrl drivers pinctrl: aspeed: Correct comment that is no longer true pinctrl: aspeed: Clarify comment about strapping W1C pinctrl: aspeed: Split out pinmux from general pinctrl pinctrl: aspeed: Add implementation-related documentation pinctrl: aspeed: Strip moved macros and structs from private header Andy Shevchenko (3): pinctrl: baytrail: Use defined macro instead of magic in byt_get_gpio_mux() pinctrl: baytrail: Re-use data structures from pinctrl-intel.h pinctrl: baytrail: Use GENMASK() consistently Anson Huang (3): dt-bindings: imx: Correct pinfunc head file path for i.MX8MM dt-bindings: imx: Add pinctrl binding doc for i.MX8MN pinctrl: freescale: Add i.MX8MN pinctrl driver support Benjamin Gaignard (2): pinctrl: Enable device link creation for pin control pinctrl: stmfx: enable links creations Bjorn Andersson (1): pinctrl: qcom: sdm845: Expose ufs_reset as gpio Charles Keepax (1):
pinctrl: madera: Fixup SPDX headers Chris Packham (2): dt-bindings: pinctrl: mvebu: Document bindings for 98DX1135 pinctrl: mvebu: Add support for MV98DX1135 Colin Ian King (1): dt-bindings: pinctrl: fix spelling mistakes in pinctl documentation Doug Berger (1): pinctrl: bcm: Allow PINCTRL_BCM2835 for ARCH_BRCMSTB Enrico Weigelt (1): gpio: Fix build warnings on undefined struct pinctrl_dev Florian Fainelli (1): dt-bindings: pinctrl: bcm2835-gpio: Document BCM7211 compatible Geert Uytterhoeven (26): pinctrl: sh-pfc: Correct printk level of group reference warning pinctrl: sh-pfc: Mark run-time debug code __init pinctrl
Re: [PATCH 1/4] numa: introduce per-cgroup numa balancing locality, statistic
On 2019/7/12 下午3:58, Peter Zijlstra wrote: [snip] >>> >>> Then our task t1 should be accounted to B (as you do), but also to A and >>> R. >> >> I get the point but not quite sure about this... >> >> Not like pages there are no hierarchical limitation on locality, also tasks > > You can use cpusets to affect that. Could you please give more detail on this? > >> running in a particular group have no influence to others, not to mention the >> extra overhead, does it really meaningful to account the stuff >> hierarchically? > > AFAIU it's a requirement of cgroups to be hierarchical. All our other > cgroup accounting is like that. Ok, should respect the convention :-) Regards, Michael Wang >
[PATCH] thp: fix unused shmem_parse_huge() function warning
When CONFIG_SYSFS is disabled but CONFIG_TMPFS is enabled, we get a warning about shmem_parse_huge() never being called: mm/shmem.c:417:12: error: unused function 'shmem_parse_huge' [-Werror,-Wunused-function] static int shmem_parse_huge(const char *str) Change the #ifdef so we no longer build this function in that configuration. Fixes: 144df3b288c4 ("vfs: Convert ramfs, shmem, tmpfs, devtmpfs, rootfs to use the new mount API") Signed-off-by: Arnd Bergmann --- mm/shmem.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/mm/shmem.c b/mm/shmem.c index ba40fac908c5..32aa9d46b87c 100644 --- a/mm/shmem.c +++ b/mm/shmem.c @@ -413,7 +413,7 @@ static bool shmem_confirm_swap(struct address_space *mapping, static int shmem_huge __read_mostly; -#if defined(CONFIG_SYSFS) || defined(CONFIG_TMPFS) +#if defined(CONFIG_SYSFS) static int shmem_parse_huge(const char *str) { if (!strcmp(str, "never")) @@ -430,7 +430,9 @@ static int shmem_parse_huge(const char *str) return SHMEM_HUGE_FORCE; return -EINVAL; } +#endif +#if defined(CONFIG_SYSFS) || defined(CONFIG_TMPFS) static const char *shmem_format_huge(int huge) { switch (huge) { -- 2.20.0
Re: [PATCH v2] printk: Do not lose last line in kmsg buffer dump
On Thu 2019-07-11 16:29:37, Vincent Whitchurch wrote: > kmsg_dump_get_buffer() is supposed to select all the youngest log > messages which fit into the provided buffer. It determines the correct > start index by using msg_print_text() with a NULL buffer to calculate > the size of each entry. However, when performing the actual writes, > msg_print_text() only writes the entry to the buffer if the written len > is lesser than the size of the buffer. So if the lengths of the > selected youngest log messages happen to precisely fill up the provided > buffer, the last log message is not included. > > We don't want to modify msg_print_text() to fill up the buffer and start > returning a length which is equal to the size of the buffer, since > callers of its other users, such as kmsg_dump_get_line(), depend upon > the current behaviour. > > Instead, fix kmsg_dump_get_buffer() to compensate for this. > > For example, with the following two final prints: > > [6.427502] A > [6.427769] 12345 > > A dump of a 64-byte buffer filled by kmsg_dump_get_buffer(), before this > patch: > > : 3c 30 3e 5b 20 20 20 20 36 2e 35 32 32 31 39 37 <0>[6.522197 > 0010: 5d 20 41 41 41 41 41 41 41 41 41 41 41 41 41 0a ] A. > 0020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 0030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > After this patch: > > : 3c 30 3e 5b 20 20 20 20 36 2e 34 35 36 36 37 38 <0>[6.456678 > 0010: 5d 20 42 42 42 42 42 42 42 42 31 32 33 34 35 0a ] 12345. > 0020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 0030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > Signed-off-by: Vincent Whitchurch I think that I need vacation. I have got lost in all the checks and got it wrongly in the morning. This patch fixes the calculation of messages that might fit into the buffer. It makes sure that the function that writes the messages will really allow to write them. It seems to be the correct fix. Reviewed-by: Petr Mladek Best Regards, Petr
[PATCH 1/2] x86: kvm: avoid -Wsometimes-uninitialized warning
clang points out that running a 64-bit guest on a 32-bit host would lead to uninitialized variables: arch/x86/kvm/hyperv.c:1610:6: error: variable 'ingpa' is used uninitialized whenever 'if' condition is false [-Werror,-Wsometimes-uninitialized] if (!longmode) { ^ arch/x86/kvm/hyperv.c:1632:55: note: uninitialized use occurs here trace_kvm_hv_hypercall(code, fast, rep_cnt, rep_idx, ingpa, outgpa); ^ arch/x86/kvm/hyperv.c:1610:2: note: remove the 'if' if its condition is always true if (!longmode) { ^~~ arch/x86/kvm/hyperv.c:1595:18: note: initialize the variable 'ingpa' to silence this warning u64 param, ingpa, outgpa, ret = HV_STATUS_SUCCESS; ^ = 0 arch/x86/kvm/hyperv.c:1610:6: error: variable 'outgpa' is used uninitialized whenever 'if' condition is false [-Werror,-Wsometimes-uninitialized] arch/x86/kvm/hyperv.c:1610:6: error: variable 'param' is used uninitialized whenever 'if' condition is false [-Werror,-Wsometimes-uninitialized] Since that combination is not supported anyway, change the condition to tell the compiler how the code is actually executed. Signed-off-by: Arnd Bergmann --- arch/x86/kvm/hyperv.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c index a39e38f13029..950436c502ba 100644 --- a/arch/x86/kvm/hyperv.c +++ b/arch/x86/kvm/hyperv.c @@ -1607,7 +1607,7 @@ int kvm_hv_hypercall(struct kvm_vcpu *vcpu) longmode = is_64_bit_mode(vcpu); - if (!longmode) { + if (!IS_ENABLED(CONFIG_X86_64) || !longmode) { param = ((u64)kvm_rdx_read(vcpu) << 32) | (kvm_rax_read(vcpu) & 0x); ingpa = ((u64)kvm_rbx_read(vcpu) << 32) | -- 2.20.0
[PATCH 2/2] x86: kvm: avoid constant-conversion warning
clang finds a construct suspicious that converts an unsigned character to a signed integer and back, causing an overflow: arch/x86/kvm/mmu.c:4605:39: error: implicit conversion from 'int' to 'u8' (aka 'unsigned char') changes value from -205 to 51 [-Werror,-Wconstant-conversion] u8 wf = (pfec & PFERR_WRITE_MASK) ? ~w : 0; ~~ ^~ arch/x86/kvm/mmu.c:4607:38: error: implicit conversion from 'int' to 'u8' (aka 'unsigned char') changes value from -241 to 15 [-Werror,-Wconstant-conversion] u8 uf = (pfec & PFERR_USER_MASK) ? ~u : 0; ~~ ^~ arch/x86/kvm/mmu.c:4609:39: error: implicit conversion from 'int' to 'u8' (aka 'unsigned char') changes value from -171 to 85 [-Werror,-Wconstant-conversion] u8 ff = (pfec & PFERR_FETCH_MASK) ? ~x : 0; ~~ ^~ Add an explicit cast to tell clang that everything works as intended here. Signed-off-by: Arnd Bergmann --- arch/x86/kvm/mmu.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index 17ece7b994b1..aea7f969ecb8 100644 --- a/arch/x86/kvm/mmu.c +++ b/arch/x86/kvm/mmu.c @@ -4602,11 +4602,11 @@ static void update_permission_bitmask(struct kvm_vcpu *vcpu, */ /* Faults from writes to non-writable pages */ - u8 wf = (pfec & PFERR_WRITE_MASK) ? ~w : 0; + u8 wf = (pfec & PFERR_WRITE_MASK) ? (u8)~w : 0; /* Faults from user mode accesses to supervisor pages */ - u8 uf = (pfec & PFERR_USER_MASK) ? ~u : 0; + u8 uf = (pfec & PFERR_USER_MASK) ? (u8)~u : 0; /* Faults from fetches of non-executable pages*/ - u8 ff = (pfec & PFERR_FETCH_MASK) ? ~x : 0; + u8 ff = (pfec & PFERR_FETCH_MASK) ? (u8)~x : 0; /* Faults from kernel mode fetches of user pages */ u8 smepf = 0; /* Faults from kernel mode accesses of user pages */ -- 2.20.0
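The values in the warnings follow from C integer promotion: `~` promotes its `u8` operand to `int` first, so the result is negative, and the cast truncates it back to eight bits. A short sketch follows; the byte values 0xCC, 0xF0 and 0xAA used in the comments are inferred from the numbers in the warnings (204, 240, 170) rather than taken from mmu.c, and the function names are invented for the example.

```c
#include <stdint.h>

typedef uint8_t u8;

/* ~ on a u8 is computed in int, so the result is negative: ~0xCC == -205. */
static int promoted_not(u8 v)
{
	return ~v;
}

/* The explicit cast from the patch keeps the low eight bits: 0xCC -> 0x33. */
static u8 byte_not(u8 v)
{
	return (u8)~v;
}
```

The cast does not change the stored value; it makes the truncation explicit so that clang stops reporting the implicit one.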
Re: [PATCH] dax: Fix missed PMD wakeups
On Thu 11-07-19 08:25:50, Matthew Wilcox wrote:
> On Thu, Jul 11, 2019 at 07:13:50AM -0700, Matthew Wilcox wrote:
> > However, the XA_RETRY_ENTRY might be a good choice. It doesn't normally
> > appear in an XArray (it may appear if you're looking at a deleted node,
> > but since we're holding the lock, we can't see deleted nodes).
> ...
> @@ -254,7 +267,7 @@ static void wait_entry_unlocked(struct xa_state *xas, void *entry)
>  static void put_unlocked_entry(struct xa_state *xas, void *entry)
>  {
>  	/* If we were the only waiter woken, wake the next one */
> -	if (entry)
> +	if (entry && dax_is_conflict(entry))

This should be !dax_is_conflict(entry)...

>  		dax_wake_entry(xas, entry, false);
>  }

Otherwise the patch looks good to me so feel free to add:

Reviewed-by: Jan Kara

once you fix this.

								Honza
-- 
Jan Kara
SUSE Labs, CR
[PATCH] dma: ste_dma40: fix unneeded variable warning
clang-9 points out that there are two variables that depending on the configuration may only be used in an ARRAY_SIZE() expression but not referenced: drivers/dma/ste_dma40.c:145:12: error: variable 'd40_backup_regs' is not needed and will not be emitted [-Werror,-Wunneeded-internal-declaration] static u32 d40_backup_regs[] = { ^ drivers/dma/ste_dma40.c:214:12: error: variable 'd40_backup_regs_chan' is not needed and will not be emitted [-Werror,-Wunneeded-internal-declaration] static u32 d40_backup_regs_chan[] = { Mark these __maybe_unused to shut up the warning. Signed-off-by: Arnd Bergmann --- drivers/dma/ste_dma40.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/dma/ste_dma40.c b/drivers/dma/ste_dma40.c index 89d710899010..de8bfd9a76e9 100644 --- a/drivers/dma/ste_dma40.c +++ b/drivers/dma/ste_dma40.c @@ -142,7 +142,7 @@ enum d40_events { * when the DMA hw is powered off. * TODO: Add save/restore of D40_DREG_GCC on dma40 v3 or later, if that works. */ -static u32 d40_backup_regs[] = { +static __maybe_unused u32 d40_backup_regs[] = { D40_DREG_LCPA, D40_DREG_LCLA, D40_DREG_PRMSE, @@ -211,7 +211,7 @@ static u32 d40_backup_regs_v4b[] = { #define BACKUP_REGS_SZ_V4B ARRAY_SIZE(d40_backup_regs_v4b) -static u32 d40_backup_regs_chan[] = { +static __maybe_unused u32 d40_backup_regs_chan[] = { D40_CHAN_REG_SSCFG, D40_CHAN_REG_SSELT, D40_CHAN_REG_SSPTR, -- 2.20.0
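The situation clang-9 complains about can be reproduced with a small stand-alone file. This is a sketch under assumptions: the two macros are user-space approximations of the kernel's `ARRAY_SIZE()` and `__maybe_unused`, and the array name and register offsets are invented for illustration. The array is only ever used inside `sizeof`, so no code references its contents.

```c
#include <assert.h>
#include <stddef.h>

/* User-space approximations of the kernel macros (assumed definitions). */
#define ARRAY_SIZE(a)	(sizeof(a) / sizeof((a)[0]))
#define __maybe_unused	__attribute__((unused))

/* Hypothetical register list; only its size is used below. */
static __maybe_unused const unsigned int backup_regs[] = {
	0x10, 0x14, 0x18,
};

/* sizeof does not evaluate its operand, so the array itself is never read. */
size_t backup_regs_count(void)
{
	return ARRAY_SIZE(backup_regs);
}
```

The attribute tells the compiler that an unreferenced definition is intentional, which is the effect the patch relies on to silence the warning.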
Re: [PATCH] xen/trace: avoid clang warning on function pointers
On Fri, Jul 12, 2019 at 10:59 AM Arnd Bergmann wrote:
>
> clang-9 does not like the way that the is_signed_type() compares
> function pointers deep inside of the trace event macros:
>
> In file included from arch/x86/xen/trace.c:21:
> In file included from include/trace/events/xen.h:475:
> In file included from include/trace/define_trace.h:102:
> In file included from include/trace/trace_events.h:467:
> include/trace/events/xen.h:69:7: error: ordered comparison of function
> pointers ('xen_mc_callback_fn_t' (aka 'void (*)(void *)') and
> 'xen_mc_callback_fn_t') [-Werror,-Wordered-compare-function-pointers]
> __field(xen_mc_callback_fn_t, fn)
> ^
> include/trace/trace_events.h:415:29: note: expanded from macro '__field'
> #define __field(type, item) __field_ext(type, item, FILTER_OTHER)
> ^
> include/trace/trace_events.h:401:6: note: expanded from macro '__field_ext'
> is_signed_type(type), filter_type);\
> ^
> include/linux/trace_events.h:540:44: note: expanded from macro
> 'is_signed_type'
> #define is_signed_type(type) (((type)(-1)) < (type)1)
> ^
> note: (skipping 1 expansions in backtrace; use -fmacro-backtrace-limit=0 to
> see all)
> include/trace/trace_events.h:77:16: note: expanded from macro 'TRACE_EVENT'
> PARAMS(tstruct), \
> ~~~^~~~
> include/linux/tracepoint.h:95:25: note: expanded from macro 'PARAMS'
> #define PARAMS(args...) args
> ^
> include/trace/trace_events.h:455:2: note: expanded from macro
> 'DECLARE_EVENT_CLASS'
> tstruct;\
> ^~~
>
> I guess the warning is reasonable in principle, though this seems to
> be the only instance we get in the entire kernel today.
> Shut up the warning by making it a void pointer in the exported
> structure.
>

Thanks for bringing this up (again), Arnd.

As this is a known CBL issue please add...

Link: https://github.com/ClangBuiltLinux/linux/issues/97

...and...

Tested-by: Sedat Dilek

For the sake of completeness see also the comments of Steven Rostedt
and user "Honeybyte" in the above Link - if not known/read.

- Sedat -

P.S.: I have been using this patch for 6 months in my
for-5.x/clang-warningfree local Git repository.

> Fixes: c796f213a693 ("xen/trace: add multicall tracing")
> Signed-off-by: Arnd Bergmann
> ---
>  include/trace/events/xen.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/include/trace/events/xen.h b/include/trace/events/xen.h
> index 9a0e8af21310..f75b77414ac1 100644
> --- a/include/trace/events/xen.h
> +++ b/include/trace/events/xen.h
> @@ -66,7 +66,7 @@ TRACE_EVENT(xen_mc_callback,
>  	TP_PROTO(xen_mc_callback_fn_t fn, void *data),
>  	TP_ARGS(fn, data),
>  	TP_STRUCT__entry(
> -		__field(xen_mc_callback_fn_t, fn)
> +		__field(void *, fn)
>  		__field(void *, data)
>  	),
>  	TP_fast_assign(
> --
> 2.20.0
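The macro at the centre of the warning is easy to try on its own. The definition below is copied from the diagnostic text above; the two wrapper functions are hypothetical names added for illustration. For integer types the ordered comparison is well defined and answers "is this type signed?". For a pointer type the same expansion becomes an ordered comparison of two pointers, which is what clang flags when the type is a function pointer.

```c
#include <assert.h>
#include <stdbool.h>

/* As quoted in the diagnostic from include/linux/trace_events.h. */
#define is_signed_type(type)	(((type)(-1)) < (type)1)

/* -1 stays negative in a signed type, so the comparison is true. */
bool int_is_signed(void)
{
	return is_signed_type(int);
}

/* -1 wraps to the maximum value in an unsigned type, so it is false. */
bool uint_is_signed(void)
{
	return is_signed_type(unsigned int);
}
```

Switching the traced field to `void *` keeps the macro usable without ever instantiating it with a function pointer type.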
RE: [PATCH] clk: renesas: cpg-mssr: Fix reset control race condition
Hi Geert-san,

> From: Geert Uytterhoeven, Sent: Thursday, July 11, 2019 10:04 PM
>
> The module reset code in the Renesas CPG/MSSR driver uses
> read-modify-write (RMW) operations to write to a Software Reset Register
> (SRCRn), and simple writes to write to a Software Reset Clearing
> Register (SRSTCLRn), as was mandated by the R-Car Gen2 and Gen3 Hardware
> User's Manuals.
>
> However, this may cause a race condition when two devices are reset in
> parallel: if the reset for device A completes in the middle of the RMW
> operation for device B, device A may be reset again, causing subtle
> failures (e.g. i2c timeouts):
>
>     thread A                        thread B
>
>     val = SRCRn
>     val |= bit A
>     SRCRn = val
>
>     delay
>
>                                     val = SRCRn (bit A is set)
>
>     SRSTCLRn = bit A
>     (bit A in SRCRn is cleared)
>
>                                     val |= bit B
>                                     SRCRn = val (bit A and B are set)
>
> This can be reproduced on e.g. Salvator-XS using:
>
>     $ while true; do i2cdump -f -y 4 0x6A b > /dev/null; done &
>     $ while true; do i2cdump -f -y 2 0x10 b > /dev/null; done &
>
>     i2c-rcar e651.i2c: error -110 : 4002
>     i2c-rcar e66d8000.i2c: error -110 : 4002
>
> According to the R-Car Gen3 Hardware Manual Errata for Rev.
> 0.80 of Feb 28, 2018, reflected in Rev. 1.00 of the R-Car Gen3 Hardware
> User's Manual, writes to SRCRn do not require read-modify-write cycles.
>
> Note that the R-Car Gen2 Hardware User's Manual has not been updated
> yet, and still says a read-modify-write sequence is required. According
> to the hardware team, the reset hardware block is the same on both R-Car
> Gen2 and Gen3, though.
>
> Hence fix the issue by replacing the read-modify-write operations on
> SRCRn by simple writes.
>
> Reported-by: Yao Lihua
> Fixes: 6197aa65c4905532 ("clk: renesas: cpg-mssr: Add support for reset control")
> Signed-off-by: Geert Uytterhoeven
> ---

Thank you for the patch! Our test team tested this patch, so

Tested-by: Linh Phung

> So far I haven't been able to reproduce the issue on R-Car Gen2 (after
> forcing i2c reset on Gen2, too). Perhaps my Koelsch doesn't have enough
> CPU cores. What about Lager?

According to the test team, the issue could not be reproduced on Lager
either. Should we investigate why?

Best regards,
Yoshihiro Shimoda

> Hi Mike, Stephen,
>
> As this is a bugfix, can you please take this directly, if accepted?
>
> Thanks!
> ---
>  drivers/clk/renesas/renesas-cpg-mssr.c | 16 ++--
>  1 file changed, 2 insertions(+), 14 deletions(-)
>
> diff --git a/drivers/clk/renesas/renesas-cpg-mssr.c b/drivers/clk/renesas/renesas-cpg-mssr.c
> index 52bbb9ce3807db31..d4075b13067429cd 100644
> --- a/drivers/clk/renesas/renesas-cpg-mssr.c
> +++ b/drivers/clk/renesas/renesas-cpg-mssr.c
> @@ -572,17 +572,11 @@ static int cpg_mssr_reset(struct reset_controller_dev *rcdev,
>  	unsigned int reg = id / 32;
>  	unsigned int bit = id % 32;
>  	u32 bitmask = BIT(bit);
> -	unsigned long flags;
> -	u32 value;
>
>  	dev_dbg(priv->dev, "reset %u%02u\n", reg, bit);
>
>  	/* Reset module */
> -	spin_lock_irqsave(&priv->rmw_lock, flags);
> -	value = readl(priv->base + SRCR(reg));
> -	value |= bitmask;
> -	writel(value, priv->base + SRCR(reg));
> -	spin_unlock_irqrestore(&priv->rmw_lock, flags);
> +	writel(bitmask, priv->base + SRCR(reg));
>
>  	/* Wait for at least one cycle of the RCLK clock (@ ca. 32 kHz) */
>  	udelay(35);
> @@ -599,16 +593,10 @@ static int cpg_mssr_assert(struct reset_controller_dev *rcdev, unsigned long id)
>  	unsigned int reg = id / 32;
>  	unsigned int bit = id % 32;
>  	u32 bitmask = BIT(bit);
> -	unsigned long flags;
> -	u32 value;
>
>  	dev_dbg(priv->dev, "assert %u%02u\n", reg, bit);
>
> -	spin_lock_irqsave(&priv->rmw_lock, flags);
> -	value = readl(priv->base + SRCR(reg));
> -	value |= bitmask;
> -	writel(value, priv->base + SRCR(reg));
> -	spin_unlock_irqrestore(&priv->rmw_lock, flags);
> +	writel(bitmask, priv->base + SRCR(reg));
>  	return 0;
>  }
>
> --
> 2.17.1
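The interleaving from the commit message can be replayed deterministically in a toy model. This is a sketch under stated assumptions, not driver code: `struct reset_block` and its helpers are invented for illustration, and the model assumes (as the patch implies) that 1 bits written to SRCRn assert the matching resets while 0 bits are ignored, and that 1 bits written to SRSTCLRn release them.

```c
#include <assert.h>
#include <stdint.h>

#define BIT_A	(1u << 0)
#define BIT_B	(1u << 1)

/* Toy model: 'state' holds the modules currently held in reset. */
struct reset_block {
	uint32_t state;
};

/* Assumed SRCRn semantics: 1 bits assert reset, 0 bits are ignored. */
static void srcr_write(struct reset_block *rb, uint32_t val)
{
	rb->state |= val;
}

static uint32_t srcr_read(const struct reset_block *rb)
{
	return rb->state;
}

/* Assumed SRSTCLRn semantics: 1 bits release the matching reset. */
static void srstclr_write(struct reset_block *rb, uint32_t val)
{
	rb->state &= ~val;
}

/* Replays the racy interleaving with read-modify-write on SRCRn. */
uint32_t replay_rmw(void)
{
	struct reset_block rb = { 0 };
	uint32_t val_a, val_b;

	val_a = srcr_read(&rb);		/* thread A: val = SRCRn */
	val_a |= BIT_A;			/* thread A: val |= bit A */
	srcr_write(&rb, val_a);		/* thread A: SRCRn = val */

	val_b = srcr_read(&rb);		/* thread B: reads while bit A is set */

	srstclr_write(&rb, BIT_A);	/* thread A: releases its reset */

	val_b |= BIT_B;			/* thread B: stale bit A still in val */
	srcr_write(&rb, val_b);		/* thread B: A is put back into reset */

	return rb.state;
}

/* Same ordering of events, but with plain writes as in the patch. */
uint32_t replay_plain_write(void)
{
	struct reset_block rb = { 0 };

	srcr_write(&rb, BIT_A);		/* thread A asserts its reset */
	srstclr_write(&rb, BIT_A);	/* thread A releases its reset */
	srcr_write(&rb, BIT_B);		/* thread B touches only bit B */

	return rb.state;
}
```

In the read-modify-write replay, device A ends up back in reset alongside B. With plain writes there is no stale value to write back, so only B is in reset.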
Re: Re: [PATCH] media: v4l: Add packed YUV444 24bpp pixel format
Hi,

On Thu 11 Jul 19, 13:57, Mirela Rabulea wrote:
> On Thu, 2019-07-11 at 10:18 +0200, Paul Kocialkowski wrote:
> > Hi,
> >
> > On Wed 03 Jul 19, 18:15, Mirela Rabulea wrote:
> > >
> > > The added format is V4L2_PIX_FMT_YUV24, this is a packed
> > > YUV 4:4:4 format, with 8 bits for each component, 24 bits
> > > per sample.
> > >
> > > This format is used by the i.MX 8QuadMax and i.MX
> > > 8DualXPlus/8QuadXPlus JPEG encoder/decoder.
> > So this format is not aligned to 32-bit words at all and we can
> > expect to see cases where a single 32-bit word contains data for
> > two pixels?
> >
> > Nothing wrong with that, just checking whether I understood this
> > right :)
> >
>
> Hi Paul,
> yes, your understanding is correct.

Out of curiosity, is the JPEG block associated with (one of) the Hantro
VPUs or is it a totally different and unrelated hardware block?

Anyway the change looks good to me:
Reviewed-by: Paul Kocialkowski

Cheers,

Paul

-- 
Paul Kocialkowski, Bootlin
Embedded Linux and kernel engineering
https://bootlin.com
Re: [PATCH v2] printk: Do not lose last line in kmsg buffer dump
On (07/12/19 11:12), Petr Mladek wrote: > > For example, with the following two final prints: > > > > [6.427502] A > > [6.427769] 12345 > > > > A dump of a 64-byte buffer filled by kmsg_dump_get_buffer(), before this > > patch: > > > > : 3c 30 3e 5b 20 20 20 20 36 2e 35 32 32 31 39 37 <0>[6.522197 > > 0010: 5d 20 41 41 41 41 41 41 41 41 41 41 41 41 41 0a ] A. > > 0020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > 0030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > > > After this patch: > > > > : 3c 30 3e 5b 20 20 20 20 36 2e 34 35 36 36 37 38 <0>[6.456678 > > 0010: 5d 20 42 42 42 42 42 42 42 42 31 32 33 34 35 0a ] 12345. > > 0020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > 0030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > > > Signed-off-by: Vincent Whitchurch > > I think that I need vacation. I have got lost in all the checks > and got it wrongly in the morning. > > This patch fixes the calculation of messages that might fit > into the buffer. It makes sure that the function that writes > the messages will really allow to write them. > > It seems to be the correct fix. > > Reviewed-by: Petr Mladek Looks correct to me as well. Reviewed-by: Sergey Senozhatsky -ss
[HELP REQUESTED from the community] Was: Staging status of speakup
Hello,

To readers of the linux-speakup list: could you help on this so we can
get Speakup in mainline?

Neither Okash nor I completely know what user consequences the files in
/sys/accessibility/speakup/ have, so could people give brief
explanations for each file (something like 3-6 lines of explanation)?

The i18n/ files have already been documented in section 14.1 of
spkguide.txt, so we do not need help for them.

Thanks!
Samuel

Greg KH, on Fri, 12 Jul 2019 10:38:19 +0200, wrote:
> Can you make up a patch to create a
> drivers/staging/speakup/sysfs-speakup file with the needed information?
> That way it will be much easier to determine exactly what these sysfs
> files do and my review can be easier, and perhaps not needed at all :)
[PATCH] [v2] mic: avoid statically declaring a 'struct device'.
Generally, declaring a platform device as a static variable is a bad idea and can cause all kinds of problems, in particular with the DMA configuration and lifetime rules. A specific problem we hit here is from a bug in clang that warns about certain (otherwise valid) macros when used in static variables: drivers/misc/mic/card/mic_x100.c:285:27: warning: shift count >= width of type [-Wshift-count-overflow] static u64 mic_dma_mask = DMA_BIT_MASK(64); ^~~~ include/linux/dma-mapping.h:141:54: note: expanded from macro 'DMA_BIT_MASK' #define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL<<(n))-1)) ^ ~~~ A slightly better way here is to create the platform device dynamically and set the dma mask in the probe function. This avoids the warning and some other problems, but is still not ideal because the device creation should really be separated from the driver, and the fact that the device has no parent means we have to force the dma mask rather than having it set up from the bus that the device is actually on. Fixes: dd8d8d44df64 ("misc: mic: MIC card driver specific changes to enable SCIF") Signed-off-by: Arnd Bergmann --- v2: rewrite to use platform_device_register_simple() and make it actually build Please merge after -rc1 is out. 
--- drivers/misc/mic/card/mic_x100.c | 28 1 file changed, 12 insertions(+), 16 deletions(-) diff --git a/drivers/misc/mic/card/mic_x100.c b/drivers/misc/mic/card/mic_x100.c index 266ffb6f6c44..c8bff2916d3d 100644 --- a/drivers/misc/mic/card/mic_x100.c +++ b/drivers/misc/mic/card/mic_x100.c @@ -237,6 +237,9 @@ static int __init mic_probe(struct platform_device *pdev) mdrv->dev = &pdev->dev; snprintf(mdrv->name, sizeof(mic_driver_name), mic_driver_name); + /* FIXME: use dma_set_mask_and_coherent() and check result */ + dma_coerce_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(64)); + mdev->mmio.pa = MIC_X100_MMIO_BASE; mdev->mmio.len = MIC_X100_MMIO_LEN; mdev->mmio.va = devm_ioremap(&pdev->dev, MIC_X100_MMIO_BASE, @@ -282,18 +285,6 @@ static void mic_platform_shutdown(struct platform_device *pdev) mic_remove(pdev); } -static u64 mic_dma_mask = DMA_BIT_MASK(64); - -static struct platform_device mic_platform_dev = { - .name = mic_driver_name, - .id = 0, - .num_resources = 0, - .dev = { - .dma_mask = &mic_dma_mask, - .coherent_dma_mask = DMA_BIT_MASK(64), - }, -}; - static struct platform_driver __refdata mic_platform_driver = { .probe = mic_probe, .remove = mic_remove, @@ -303,6 +294,8 @@ static struct platform_driver __refdata mic_platform_driver = { }, }; +static struct platform_device *mic_platform_dev; + static int __init mic_init(void) { int ret; @@ -316,9 +309,12 @@ static int __init mic_init(void) request_module("mic_x100_dma"); mic_init_card_debugfs(); - ret = platform_device_register(&mic_platform_dev); + + mic_platform_dev = platform_device_register_simple(mic_driver_name, + 0, NULL, 0); + ret = PTR_ERR_OR_ZERO(mic_platform_dev); if (ret) { - pr_err("platform_device_register ret %d\n", ret); + pr_err("platform_device_register_full ret %d\n", ret); goto cleanup_debugfs; } ret = platform_driver_register(&mic_platform_driver); @@ -329,7 +325,7 @@ static int __init mic_init(void) return ret; device_unregister: - platform_device_unregister(&mic_platform_dev); + 
platform_device_unregister(mic_platform_dev); cleanup_debugfs: mic_exit_card_debugfs(); done: @@ -339,7 +335,7 @@ static int __init mic_init(void) static void __exit mic_exit(void) { platform_driver_unregister(&mic_platform_driver); - platform_device_unregister(&mic_platform_dev); + platform_device_unregister(mic_platform_dev); mic_exit_card_debugfs(); } -- 2.20.0
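The macro named in the warning can be exercised in user space to see why it special-cases 64. The definition is the one quoted in the diagnostic above; `dma_mask_for()` is a hypothetical wrapper added for illustration. Shifting a 64-bit value by 64 is undefined in C, so the macro selects `~0ULL` for that case and the shift branch is never evaluated at run time. The warning arises because clang still inspects the unselected branch when the macro appears in a static initializer.

```c
#include <assert.h>
#include <stdint.h>

/* As quoted in the diagnostic from include/linux/dma-mapping.h. */
#define DMA_BIT_MASK(n)	(((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/* Builds the mask at run time; for bits == 64 the shift is never evaluated. */
uint64_t dma_mask_for(unsigned int bits)
{
	return DMA_BIT_MASK(bits);
}
```

Setting the mask from the probe function, as the patch does, moves the macro out of a static initializer and so sidesteps the diagnostic.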
Re: Staging status of speakup
On Fri, Jul 12, 2019 at 9:38 AM Greg Kroah-Hartman wrote: > > On Sun, Jul 07, 2019 at 08:57:10AM +0200, Greg Kroah-Hartman wrote: > > On Sat, Jul 06, 2019 at 08:08:57PM +0100, Okash Khawaja wrote: > > > On Fri, 15 Mar 2019 20:18:31 -0700 > > > Greg Kroah-Hartman wrote: > > > > > > > On Fri, Mar 15, 2019 at 01:01:27PM +, Okash Khawaja wrote: > > > > > Hi, > > > > > > > > > > We have made progress on the items in TODO file of speakup driver in > > > > > staging directory and wanted to get some clarity on the remaining > > > > > items. Below is a summary of status of each item along with the > > > > > quotes from TODO file. > > > > > > > > > > 1. "The first issue has to do with the way speakup communicates > > > > > with serial ports. Currently, we communicate directly with the > > > > > hardware ports. This however conflicts with the standard serial > > > > > port drivers, which poses various problems. This is also not > > > > > working for modern hardware such as PCI-based serial ports. Also, > > > > > there is not a way we can communicate with USB devices. The > > > > > current serial port handling code is in serialio.c in this > > > > > directory." > > > > > > > > > > Drivers for all external synths now use TTY to communcate with the > > > > > devices. Only ones still using direct communication with hardware > > > > > ports are internal synths: acntpc, decpc, dtlk and keypc. These are > > > > > typically ISA cards and generally hardware which is difficult to > > > > > make work. We can leave these in staging. > > > > > > > > Ok, that's fine. > > > > > > > > > 2. "Some places are currently using in_atomic() because speakup > > > > > functions are called in various contexts, and a couple of things > > > > > can't happen in these cases. Pushing work to some worker thread > > > > > would probably help, as was already done for the serial port > > > > > driving part." > > > > > > > > > > There aren't any uses of in_atomic anymore. 
Commit d7500135802c > > > > > "Staging: speakup: Move pasting into a work item" was the last one > > > > > that removed such uses. > > > > > > > > Great, let's remove that todo item then. > > > > > > > > > 3. "There is a duplication of the selection functions in > > > > > selections.c. These functions should get exported from > > > > > drivers/char/selection.c (clear_selection notably) and used from > > > > > there instead." > > > > > > > > > > This is yet to be done. I guess drivers/char/selection.c is now > > > > > under drivers/tty/vt/selection.c. > > > > > > > > Yes, someone should update the todo item :) > > > > > > > > > 4. "The kobjects may have to move to a more proper place in /sys.The > > > > > discussion on lkml resulted to putting speech synthesizers in the > > > > > "speech" class, and the speakup screen reader itself > > > > > into /sys/class/vtconsole/vtcon0/speakup, the nasty path being > > > > > handled by userland tools." > > > > > > > > > > Although this makes logical sense, the change will mean changing > > > > > interface with userspace and hence the user space tools. I tried to > > > > > search the lkml discussion but couldn't find it. It will be good to > > > > > know your thoughts on this. > > > > > > > > I don't remember, sorry. I can review the kobject/sysfs usage if you > > > > think it is "good enough" now and see if I find anything > > > > objectionable. > > > > > > > > > Finally there is an issue where text in output buffer sometimes gets > > > > > garbled on SMP systems, but we can continue working on it after the > > > > > driver is moved out of staging, if that's okay. Basically we need a > > > > > reproducer of this issue. > > > > > > > > > > In addition to above, there are likely code style issues which will > > > > > need to be fixed. > > > > > > > > > > We are very keen to get speakup out of staging both, for settling > > > > > the driver but also for getting included in distros which build > > > > > only the mainline drivers. 
> > > > > > > > That's great, I am glad to see this happen. How about work on the > > > > selection thing and then I can review the kobject stuff in a few > > > > weeks, and then we can start moving things for 5.2? > > > > > > Hi Greg, > > > > > > Apologies for the delay. I de-duplicated selection code in speakup to > > > use code that's already in kernel (commit ids 496124e5e16e and > > > 41f13084506a). Following items are what remain now: > > > > > > 1. moving kobjects location > > > 2. fixing garbled text > > > > > > I couldn't replicate garbled text but Simon (also in CC list) is > > > looking into it. > > > > > > Can you please advise on the way forward? > > > > I don't think the "garbled text" is an issue to get this out of staging > > if others do not see this. It can be fixed like any other bug at a > > later point if it is figured out. > > > > The kobject stuff does need to be looked at. Let me carve out some time > > next week to do that and I will let you know what I see/recommend. > > At first glance,
Re: [PATCH v3] arm64: dts: sdm845: Add video nodes
On 7/2/2019 5:42 PM, Aniket Masule wrote: From: Malathi Gottam This adds video nodes to sdm845 based on the examples in the bindings. Signed-off-by: Malathi Gottam Co-developed-by: Aniket Masule Signed-off-by: Aniket Masule Reviewed-by: Rajendra Nayak --- arch/arm64/boot/dts/qcom/sdm845.dtsi | 30 ++ 1 file changed, 30 insertions(+) diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi b/arch/arm64/boot/dts/qcom/sdm845.dtsi index fcb9330..f3cd94f 100644 --- a/arch/arm64/boot/dts/qcom/sdm845.dtsi +++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi @@ -1893,6 +1893,36 @@ }; }; + video-codec@aa0 { + compatible = "qcom,sdm845-venus"; + reg = <0 0x0aa0 0 0xff000>; + interrupts = ; + power-domains = <&videocc VENUS_GDSC>; + clocks = <&videocc VIDEO_CC_VENUS_CTL_CORE_CLK>, +<&videocc VIDEO_CC_VENUS_AHB_CLK>, +<&videocc VIDEO_CC_VENUS_CTL_AXI_CLK>; + clock-names = "core", "iface", "bus"; + iommus = <&apps_smmu 0x10a0 0x8>, +<&apps_smmu 0x10b0 0x0>; + memory-region = <&venus_mem>; + + video-core0 { + compatible = "venus-decoder"; + clocks = <&videocc VIDEO_CC_VCODEC0_CORE_CLK>, +<&videocc VIDEO_CC_VCODEC0_AXI_CLK>; + clock-names = "core", "bus"; + power-domains = <&videocc VCODEC0_GDSC>; + }; + + video-core1 { + compatible = "venus-encoder"; + clocks = <&videocc VIDEO_CC_VCODEC1_CORE_CLK>, +<&videocc VIDEO_CC_VCODEC1_AXI_CLK>; + clock-names = "core", "bus"; + power-domains = <&videocc VCODEC1_GDSC>; + }; + }; + videocc: clock-controller@ab0 { compatible = "qcom,sdm845-videocc"; reg = <0 0x0ab0 0 0x1>; -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation