[PATCH] mtd: nand: vf610_nfc: add NULL check on of_match_device() return value
Check return value from call to of_match_device() in order to prevent a NULL pointer dereference. In case of NULL print error message and return -ENODEV Signed-off-by: Gustavo A. R. Silva --- drivers/mtd/nand/vf610_nfc.c | 5 + 1 file changed, 5 insertions(+) diff --git a/drivers/mtd/nand/vf610_nfc.c b/drivers/mtd/nand/vf610_nfc.c index 744ab10..ca0ab96 100644 --- a/drivers/mtd/nand/vf610_nfc.c +++ b/drivers/mtd/nand/vf610_nfc.c @@ -674,6 +674,11 @@ static int vf610_nfc_probe(struct platform_device *pdev) } of_id = of_match_device(vf610_nfc_dt_ids, &pdev->dev); + if (!of_id) { + dev_err(&pdev->dev, "Failed to match device!\n"); + return -ENODEV; + } + nfc->variant = (enum vf610_nfc_variant)of_id->data; for_each_available_child_of_node(nfc->dev->of_node, child) { -- 2.5.0
Re: [PATCH v12 02/10] powerpc/powernv: Autoload IMC device driver module
Hi Maddy/Anju, Comments below :) Anju T Sudhakar writes: > Code to create platform device for the IMC counters. > Paltform devices are created based on the IMC compatibility > string. > > New Config flag "CONFIG_HV_PERF_IMC_CTRS" add to contain the > IMC counter changes. I don't think we need a separate config, it can just use CONFIG_PPC_POWERNV. I don't think we'll ever want to turn it off for powernv, unless we're trying to build a small kernel, in which case we'll turn of PERF entirely. > diff --git a/arch/powerpc/platforms/powernv/opal-imc.c > b/arch/powerpc/platforms/powernv/opal-imc.c > new file mode 100644 > index ..5b1045c81af4 > --- /dev/null > +++ b/arch/powerpc/platforms/powernv/opal-imc.c > @@ -0,0 +1,73 @@ > +/* > + * OPAL IMC interface detection driver > + * Supported on POWERNV platform > + * > + * Copyright (C) 2017 Madhavan Srinivasan, IBM Corporation. > + * (C) 2017 Anju T Sudhakar, IBM Corporation. > + * (C) 2017 Hemant K Shaw, IBM Corporation. > + * > + * This program is free software; you can redistribute it and/or > + * modify it under the terms of the GNU General Public License > + * as published by the Free Software Foundation; either version > + * 2 of the License, or later version. > + * > + * This program is distributed in the hope that it will be useful, > + * but WITHOUT ANY WARRANTY; without even the implied warranty of > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > + * GNU General Public License for more details. We usually don't include that part in every file. > + */ > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > + > +static int opal_imc_counters_probe(struct platform_device *pdev) > +{ > + struct device_node *imc_dev = NULL; > + > + if (!pdev || !pdev->dev.of_node) > + return -ENODEV; We don't need that level of paranoia :) > + /* > + * Check whether this is kdump kernel. If yes, just return. > + */ > + if (is_kdump_kernel()) > + return -ENODEV; Hmm, that's a bit unusual. Is there any particular reason to do that for this driver? > + imc_dev = pdev->dev.of_node; > + if (!imc_dev) > + return -ENODEV; > + > + return 0; > +} > + > +static const struct of_device_id opal_imc_match[] = { > + { .compatible = IMC_DTB_COMPAT }, > + {}, > +}; > + > +static struct platform_driver opal_imc_driver = { > + .driver = { > + .name = "opal-imc-counters", > + .of_match_table = opal_imc_match, > + }, > + .probe = opal_imc_counters_probe, > +}; > + This can't be built as a module, so it should not be using MODULE macros. > +MODULE_DEVICE_TABLE(of, opal_imc_match); Drop that. > +module_platform_driver(opal_imc_driver); Use builtin_platform_driver(). > +MODULE_DESCRIPTION("PowerNV OPAL IMC driver"); > +MODULE_LICENSE("GPL"); Drop those. > diff --git a/arch/powerpc/platforms/powernv/opal.c > b/arch/powerpc/platforms/powernv/opal.c > index 59684b4af4d1..fbdca259ea76 100644 > --- a/arch/powerpc/platforms/powernv/opal.c > +++ b/arch/powerpc/platforms/powernv/opal.c > @@ -705,6 +707,17 @@ static void opal_pdev_init(const char *compatible) > of_platform_device_create(np, NULL, NULL); > } > > +#ifdef CONFIG_HV_PERF_IMC_CTRS > +static void __init opal_imc_init_dev(void) > +{ > + struct device_node *np; > + > + np = of_find_compatible_node(NULL, NULL, IMC_DTB_COMPAT); > + if (np) > + of_platform_device_create(np, NULL, NULL); > +} > +#endif That doesn't need the #ifdef. > static int kopald(void *unused) > { > unsigned long timeout = msecs_to_jiffies(opal_heartbeat) + 1; > @@ -778,6 +791,11 @@ static int __init opal_init(void) > /* Setup a heatbeat thread if requested by OPAL */ > opal_init_heartbeat(); > > +#ifdef CONFIG_HV_PERF_IMC_CTRS > + /* Detect IMC pmu counters support and create PMUs */ > + opal_imc_init_dev(); > +#endif > + Neither here. > /* Create leds platform devices */ > leds = of_find_node_by_path("/ibm,opal/leds"); > if (leds) { > -- > 2.11.0 cheers
[PATCH] iio: adc: rockchip_saradc: add NULL check on of_match_device() return value
Check return value from call to of_match_device() in order to prevent a NULL pointer dereference. In case of NULL print error message and return -ENODEV Signed-off-by: Gustavo A. R. Silva --- drivers/iio/adc/rockchip_saradc.c | 5 + 1 file changed, 5 insertions(+) diff --git a/drivers/iio/adc/rockchip_saradc.c b/drivers/iio/adc/rockchip_saradc.c index ae6d332..2bf2ed1 100644 --- a/drivers/iio/adc/rockchip_saradc.c +++ b/drivers/iio/adc/rockchip_saradc.c @@ -224,6 +224,11 @@ static int rockchip_saradc_probe(struct platform_device *pdev) info = iio_priv(indio_dev); match = of_match_device(rockchip_saradc_match, &pdev->dev); + if (!match) { + dev_err(&pdev->dev, "failed to match device\n"); + return -ENODEV; + } + info->data = match->data; mem = platform_get_resource(pdev, IORESOURCE_MEM, 0); -- 2.5.0
Re: [PATCH 21/21] x86/intel_rdt/mbm: Handle counter overflow
On Thu, 6 Jul 2017, Shivappa Vikas wrote: > On Sun, 2 Jul 2017, Thomas Gleixner wrote: > > On Mon, 26 Jun 2017, Vikas Shivappa wrote: > > > +static void mbm_update(struct rdt_domain *d, int rmid) > > > +{ > > > + struct rmid_read rr; > > > + > > > + rr.first = false; > > > + rr.d = d; > > > + > > > + if (is_mbm_total_enabled()) { > > > + rr.evtid = QOS_L3_MBM_TOTAL_EVENT_ID; > > > + __mon_event_count(rmid, &rr); > > > > This is broken as it is not protected against a concurrent read from user > > space which comes in via a smp function call. > > The read from user also has the rdtgroup_mutex. Which is again, completely non obvious and undocumented in the code. Aside of that, are you really serious about serializing the world and everything on a single global mutex? Thanks, tglx
[PATCH v2] net: axienet: add support for standard phy-mode binding
Keep supporting proprietary "xlnx,phy-type" attribute and add support for MII connectivity to the PHY. Signed-off-by: Alvaro Gamez Machado --- Changes since v1: * Renamed phy_type to phy_mode. No other instances of this struct member were found except for those we wanted to change, so there's no other hidden meaning behind this. * Added Device Tree Binding document specifying required and optional properties, an example. Also, make a explicit note of why this driver is incompatible with AXI DMA driver .../devicetree/bindings/net/xilinx_axienet.txt | 55 ++ drivers/net/ethernet/xilinx/xilinx_axienet.h | 4 +- drivers/net/ethernet/xilinx/xilinx_axienet_main.c | 48 ++- 3 files changed, 93 insertions(+), 14 deletions(-) create mode 100644 Documentation/devicetree/bindings/net/xilinx_axienet.txt diff --git a/Documentation/devicetree/bindings/net/xilinx_axienet.txt b/Documentation/devicetree/bindings/net/xilinx_axienet.txt new file mode 100644 index ..38f9ec076743 --- /dev/null +++ b/Documentation/devicetree/bindings/net/xilinx_axienet.txt @@ -0,0 +1,55 @@ +XILINX AXI ETHERNET Device Tree Bindings + + +Also called AXI 1G/2.5G Ethernet Subsystem, the xilinx axi ethernet IP core +provides connectivity to an external ethernet PHY supporting different +interfaces: MII, GMII, RGMII, SGMII, 1000BaseX. It also includes two +segments of memory for buffering TX and RX, as well as the capability of +offloading TX/RX checksum calculation off the processor. + +Management configuration is done through the AXI interface, while payload is +sent and received through means of an AXI DMA controller. This driver +includes the DMA driver code, so this driver is incompatible with AXI DMA +driver. + +For more details about mdio please refer phy.txt file in the same directory. + +Required properties: +- compatible : Must be one of "xlnx,axi-ethernet-1.00.a", + "xlnx,axi-ethernet-1.01.a", "xlnx,axi-ethernet-2.01.a" +- reg : Address and length of the IO space. +- interrupts : Should be a list of two interrupt, TX and RX. +- phy-handle : Should point to the external phy device. + See ethernet.txt file in the same directory. +- xlnx,rxmem : Set to allocated memory buffer for Rx/Tx in the hardware + +Optional properties: +- phy-mode : See ethernet.txt +- xlnx,phy-type: Deprecated, do not use, but still accepted in preference + to phy-mode. +- xlnx,txcsum : 0 or empty for disabling TX checksum offload, + 1 to enable partial TX checksum offload, + 2 to enable full TX checksum offload +- xlnx,rxcsum : Same values as xlnx,txcsum but for RX checksum offload + +Example: + axi_ethernet_eth: ethernet@40c0 { + compatible = "xlnx,axi-ethernet-1.00.a"; + device_type = "network"; + interrupt-parent = <µblaze_0_axi_intc>; + interrupts = <2 0>; + phy-mode = "mii"; + reg = <0x40c0 0x4>; + xlnx,rxcsum = <0x2>; + xlnx,rxmem = <0x800>; + xlnx,txcsum = <0x2>; + phy-handle = <&phy0>; + axi_ethernetlite_0_mdio: mdio { + #address-cells = <1>; + #size-cells = <0>; + phy0: phy@0 { + device_type = "ethernet-phy"; + reg = <1>; + }; + }; + }; diff --git a/drivers/net/ethernet/xilinx/xilinx_axienet.h b/drivers/net/ethernet/xilinx/xilinx_axienet.h index af27f7d1cbf3..48e939a5fa31 100644 --- a/drivers/net/ethernet/xilinx/xilinx_axienet.h +++ b/drivers/net/ethernet/xilinx/xilinx_axienet.h @@ -389,7 +389,7 @@ struct axidma_bd { * @dma_err_tasklet: Tasklet structure to process Axi DMA errors * @tx_irq:Axidma TX IRQ number * @rx_irq:Axidma RX IRQ number - * @phy_type: Phy type to identify between MII/GMII/RGMII/SGMII/1000 Base-X + * @phy_mode: Phy type to identify between MII/GMII/RGMII/SGMII/1000 Base-X * @options: AxiEthernet option word * @last_link: Phy link state in which the PHY was negotiated earlier * @features: Stores the extended features supported by the axienet hw @@ -432,7 +432,7 @@ struct axienet_local { int tx_irq; int rx_irq; - u32 phy_type; + u32 phy_mode; u32 options;/* Current options word */ u32 last_link; diff --git a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c index 33c595f4691d..2f759d448383 100644 --- a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c +++ b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c @@ -531,11 +531,11 @@ static void axienet_adjust_link(struct net_device
[PATCH] include: platform_device: add pdev_info(), pdev_warn, ... convencience macros
--- include/linux/platform_device.h | 7 +++ 1 file changed, 7 insertions(+) diff --git a/include/linux/platform_device.h b/include/linux/platform_device.h index 98c2a7c7108e..723c209d3760 100644 --- a/include/linux/platform_device.h +++ b/include/linux/platform_device.h @@ -368,4 +368,11 @@ extern int platform_pm_restore(struct device *dev); #define USE_PLATFORM_PM_SLEEP_OPS #endif +#define pdev_crit(pdev, fmt, args...) dev_crit(&pdev->dev, fmt, ##args) +#define pdev_err(pdev, fmt, args...) dev_err(&pdev->dev, fmt, ##args) +#define pdev_warn(pdev, fmt, args...) dev_warn(&pdev->dev, fmt, ##args) +#define pdev_notice(pdev, fmt, args...)dev_notice(&pdev->dev, fmt, ##args) +#define pdev_info(pdev, fmt, args...) dev_info(&pdev->dev, fmt, ##args) +#define pdev_debug(pdev, fmt, args...) dev_debug(&pdev->dev, fmt, ##args) + #endif /* _PLATFORM_DEVICE_H_ */ -- 2.11.0.rc0.7.gbe5a750
Re: [PATCH 19/21] x86/intel_rdt/mbm: Basic counting of MBM events (total and local)
On Thu, 6 Jul 2017, Shivappa Vikas wrote: > On Sun, 2 Jul 2017, Thomas Gleixner wrote: > > > INIT_LIST_HEAD(&r->evt_list); > > > > > > if (rdt_mon_features & (1 << QOS_L3_OCCUP_EVENT_ID)) > > > list_add_tail(&llc_occupancy_event.list, &r->evt_list); > > > + if (is_mbm_total_enabled()) > > > + list_add_tail(&mbm_total_event.list, &r->evt_list); > > > + if (is_mbm_local_enabled()) > > > + list_add_tail(&mbm_local_event.list, &r->evt_list); > > > > Confused. This hooks all monitoring features to RDT_RESOURCE_L3. Why? > > They are really L3 resource events as per the spec. > CPUID.(EAX=0FH, ECX=0):EDX.L3[bit 1] = 1 if L3 monitoring and we query for all > the llc_occupancy, l3 total and local b/w with the same resource id 1. Then this should be documented somewhere in the code
[PATCH] iio: adc: meson-saradc: add NULL check on of_match_device() return value
Check return value from call to of_match_device() in order to prevent a NULL pointer dereference. In case of NULL print error message and return -ENODEV Signed-off-by: Gustavo A. R. Silva --- drivers/iio/adc/meson_saradc.c | 5 + 1 file changed, 5 insertions(+) diff --git a/drivers/iio/adc/meson_saradc.c b/drivers/iio/adc/meson_saradc.c index 83da50e..07dbcc3 100644 --- a/drivers/iio/adc/meson_saradc.c +++ b/drivers/iio/adc/meson_saradc.c @@ -915,6 +915,11 @@ static int meson_sar_adc_probe(struct platform_device *pdev) init_completion(&priv->done); match = of_match_device(meson_sar_adc_of_match, &pdev->dev); + if (!match) { + dev_err(&pdev->dev, "failed to match device\n"); + return -ENODEV; + } + priv->data = match->data; indio_dev->name = priv->data->name; -- 2.5.0
Re: [PATCH 13/21] x86/intel_rdt/cqm: Add cpus file support
On Thu, 6 Jul 2017, Shivappa Vikas wrote: > On Sun, 2 Jul 2017, Thomas Gleixner wrote: > > > + /* Check whether cpus belong to parent ctrl group */ > > > + cpumask_andnot(tmpmask, newmask, &pr->cpu_mask); > > > + if (cpumask_weight(tmpmask)) { > > > + ret = -EINVAL; > > > + goto out; > > > + } > > > + > > > + /* Check whether cpus are dropped from this group */ > > > + cpumask_andnot(tmpmask, &rdtgrp->cpu_mask, newmask); > > > + if (cpumask_weight(tmpmask)) { > > > + /* Give any dropped cpus to parent rdtgroup */ > > > + cpumask_or(&pr->cpu_mask, &pr->cpu_mask, tmpmask); > > > > This does not make any sense. The check above verifies that all cpus in > > newmask belong to the parent->cpu_mask. If they don't then you return > > -EINVAL, but here you give them back to parent->cpu_mask. How is that > > supposed to work? You never get into this code path! > > The parent->cpu_mask always is the parent->cpus_valid_mask if i understand > right. With monitor group, the cpu is present is always present in "one" > ctrl_mon group and one mon_group. And the mon group can have only cpus in its > parent. May be it needs a comment? (its explaind in the documentation patch). Sigh, the code needs to be written in a way that it is halfways obvious what's going on. > # mkdir /sys/fs/resctrl/p1 > # mkdir /sys/fs/resctrl/p1/mon_groups/m1 > # echo 5-10 > /sys/fs/resctr/p1/cpus_list > Say p1 has RMID 2 > cpus 5-10 have RMID 2 So what you say, is that parent is always the resource control group itself. Can we please have a proper distinction in the code? I tripped over that ambigiousities several times. The normal meaning of parent->child relations is that both have the same type. While this is the case at the implementation detail level (both are type struct rdtgroup), from a conceptual level they are different: parent is a resource group and child is a monitoring group That should be expressed in the code, at the very least by variable naming, so it becomes immediately clear that this operates on two different entities. The proper solution is to have different data types or at least embedd the monitoring bits in a seperate entity inside of struct rdtgroup. struct mongroup { monitoring stuff; }; struct rdtgroup { common stuff; struct mongroup mon; }; So the code can operate on r->mon.foo or mon->foo which makes it entirely clear what kind of operation this is. Sigh, cramming everything into a single struct without distinction is the same as operating on a pile of global variables, which is the most common pattern used by people learning C. You certainly belong not to that group, so dammit, get your act together and structure the code so it's obvious and maintainable. Thanks, tglx
Re: [PATCH v2 4/6] cpufreq: schedutil: update CFS util only if used
On 2017-07-04 10:34, Patrick Bellasi wrote: Currently the utilization of the FAIR class is collected before locking the policy. Although that should not be a big issue for most cases, we also don't really know how much latency there can be between the utilization reading and its usage. Let's get the FAIR utilization right before its usage to be better in sync with the current status of a CPU. Signed-off-by: Patrick Bellasi Cc: Ingo Molnar Cc: Peter Zijlstra Cc: Rafael J. Wysocki Cc: Viresh Kumar Cc: linux-kernel@vger.kernel.org Cc: linux...@vger.kernel.org --- kernel/sched/cpufreq_schedutil.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c index 98704d8..df433f1 100644 --- a/kernel/sched/cpufreq_schedutil.c +++ b/kernel/sched/cpufreq_schedutil.c @@ -308,10 +308,9 @@ static void sugov_update_shared(struct update_util_data *hook, u64 time, if (unlikely(current == sg_policy->thread)) return; - sugov_get_util(&util, &max); - raw_spin_lock(&sg_policy->update_lock); + sugov_get_util(&util, &max); sg_cpu->util = util; sg_cpu->max = max; Given that the utilization update hooks are called with the per-cpu rq lock held (for all classes), I don't think PELT utilization can change throughout the lifetime of the cpufreq_update_{util,this_cpu} call? Even with Viresh's remote cpu callback series we'd still have to hold the rq lock across cpufreq_update_util.. what can change today is 'max' (arch_scale_cpu_capacity) when a cpufreq policy is shared, so the patch is still needed for that reason I think? Thanks, Vikram -- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
[PATCH] pwm: sun4i: add NULL check on of_match_device() return value
Check return value from call to of_match_device() in order to prevent a NULL pointer dereference. In case of NULL print error message and return -ENODEV Signed-off-by: Gustavo A. R. Silva --- drivers/pwm/pwm-sun4i.c | 4 1 file changed, 4 insertions(+) diff --git a/drivers/pwm/pwm-sun4i.c b/drivers/pwm/pwm-sun4i.c index 1284ffa..295ca19 100644 --- a/drivers/pwm/pwm-sun4i.c +++ b/drivers/pwm/pwm-sun4i.c @@ -321,6 +321,10 @@ static int sun4i_pwm_probe(struct platform_device *pdev) const struct of_device_id *match; match = of_match_device(sun4i_pwm_dt_ids, &pdev->dev); + if (!match) { + dev_err(&pdev->dev, "failed to match device\n"); + return -ENODEV; + } pwm = devm_kzalloc(&pdev->dev, sizeof(*pwm), GFP_KERNEL); if (!pwm) -- 2.5.0
Re: [next-20170609] Oops while running CPU off-on (cpuset.c/cpuset_can_attach)
On Wed, 2017-07-05 at 11:28 -0400, Tejun Heo wrote: > Hello, Abdul. > > Thanks for the debug info. Can you please see whether the following > patch fixes the issue? It is my pleasure and yes the patch fixes the problem. > If the problem is too difficult to reproduce The problem was reproducible all the time. With the patch fix, I tried multiple times and long runs of cpu off-on cycles but no Oops is seen. Thank you for spending your valuable time on fixing this issue. Reported-and-tested-by : Abdul Haleem > to confirm the fix by seeing whether it no longer triggers, please let > me know. We can instead apply a patch which triggers WARN on the > failing condition to confirm the diagnosis. > > Thanks. > > diff --git a/kernel/cgroup/cgroup-internal.h b/kernel/cgroup/cgroup-internal.h > index 793565c05742..8b4c3c2f2509 100644 > --- a/kernel/cgroup/cgroup-internal.h > +++ b/kernel/cgroup/cgroup-internal.h > @@ -33,6 +33,9 @@ struct cgroup_taskset { > struct list_headsrc_csets; > struct list_headdst_csets; > > + /* the number of tasks in the set */ > + int nr_tasks; > + > /* the subsys currently being processed */ > int ssid; > > diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c > index dbfd7028b1c6..e3c4152741a3 100644 > --- a/kernel/cgroup/cgroup.c > +++ b/kernel/cgroup/cgroup.c > @@ -1954,6 +1954,8 @@ static void cgroup_migrate_add_task(struct task_struct > *task, > if (!cset->mg_src_cgrp) > return; > > + mgctx->tset.nr_tasks++; > + > list_move_tail(&task->cg_list, &cset->mg_tasks); > if (list_empty(&cset->mg_node)) > list_add_tail(&cset->mg_node, > @@ -2047,16 +2049,18 @@ static int cgroup_migrate_execute(struct cgroup_mgctx > *mgctx) > return 0; > > /* check that we can legitimately attach to the cgroup */ > - do_each_subsys_mask(ss, ssid, mgctx->ss_mask) { > - if (ss->can_attach) { > - tset->ssid = ssid; > - ret = ss->can_attach(tset); > - if (ret) { > - failed_ssid = ssid; > - goto out_cancel_attach; > + if (tset->nr_tasks) { > + do_each_subsys_mask(ss, ssid, mgctx->ss_mask) { > + if (ss->can_attach) { > + tset->ssid = ssid; > + ret = ss->can_attach(tset); > + if (ret) { > + failed_ssid = ssid; > + goto out_cancel_attach; > + } > } > - } > - } while_each_subsys_mask(); > + } while_each_subsys_mask(); > + } > > /* >* Now that we're guaranteed success, proceed to move all tasks to > @@ -2085,25 +2089,29 @@ static int cgroup_migrate_execute(struct cgroup_mgctx > *mgctx) >*/ > tset->csets = &tset->dst_csets; > > - do_each_subsys_mask(ss, ssid, mgctx->ss_mask) { > - if (ss->attach) { > - tset->ssid = ssid; > - ss->attach(tset); > - } > - } while_each_subsys_mask(); > + if (tset->nr_tasks) { > + do_each_subsys_mask(ss, ssid, mgctx->ss_mask) { > + if (ss->attach) { > + tset->ssid = ssid; > + ss->attach(tset); > + } > + } while_each_subsys_mask(); > + } > > ret = 0; > goto out_release_tset; > > out_cancel_attach: > - do_each_subsys_mask(ss, ssid, mgctx->ss_mask) { > - if (ssid == failed_ssid) > - break; > - if (ss->cancel_attach) { > - tset->ssid = ssid; > - ss->cancel_attach(tset); > - } > - } while_each_subsys_mask(); > + if (tset->nr_tasks) { > + do_each_subsys_mask(ss, ssid, mgctx->ss_mask) { > + if (ssid == failed_ssid) > + break; > + if (ss->cancel_attach) { > + tset->ssid = ssid; > + ss->cancel_attach(tset); > + } > + } while_each_subsys_mask(); > + } > out_release_tset: > spin_lock_irq(&css_set_lock); > list_splice_init(&tset->dst_csets, &tset->src_csets); > -- Regard's Abdul Haleem IBM Linux Technology Centre
Re: [RFC][PATCH] exec: Use init rlimits for setuid exec
On Thu, Jul 6, 2017 at 10:49 PM, Kees Cook wrote: > On Thu, Jul 6, 2017 at 10:39 PM, Linus Torvalds >> And I think the credentials switch (which is the point of no return >> anyway) happens before we start mmap'ing the executable etc. We used >> to have some odd code there and do it in the completely wrong order >> (checking that the binary was executable for the *old* user, which >> makes no sense, iirc) > > Yeah, it all happens in setup_new_exec(). The first thing is layout > selection, then switching credentials. It could be made to take a hint > from GNU_STACK (which was parsed before setup_new_exec() is called), > check security_bprm_secureexec() and then make the rlimit changes, all > before the layout selection. At Andy's suggestion I'm using security_bprm_secureexec() to test for setuid-ness. However, this seems to expect the credentials to have already been installed. And yet ... the following patch still works correctly when I call it "early". I'm going to look again in the morning. diff --git a/fs/exec.c b/fs/exec.c index b60804216b59..a4d2433a44ec 100644 --- a/fs/exec.c +++ b/fs/exec.c @@ -1334,9 +1334,20 @@ EXPORT_SYMBOL(would_dump); void setup_new_exec(struct linux_binprm * bprm) { + /* This is the point of no return */ + + /* +* If this is a setuid execution, reset the stack limit to +* a sane default to avoid bad behavior from the prior rlimits. +*/ + if (security_bprm_secureexec(bprm)) { + struct rlimit default_stack = { _STK_LIM, RLIM_INFINITY }; + + current->signal->rlim[RLIMIT_STACK] = default_stack; + } + arch_pick_mmap_layout(current->mm); - /* This is the point of no return */ current->sas_ss_sp = current->sas_ss_size = 0; if (uid_eq(current_euid(), current_uid()) && gid_eq(current_egid(), current_gid())) -- Kees Cook Pixel Security
[PATCH] crypto: brcm: add NULL check on of_match_device() return value
Check return value from call to of_match_device() in order to prevent a NULL pointer dereference. In case of NULL print error message and return -ENODEV Signed-off-by: Gustavo A. R. Silva --- drivers/crypto/bcm/cipher.c | 5 + 1 file changed, 5 insertions(+) diff --git a/drivers/crypto/bcm/cipher.c b/drivers/crypto/bcm/cipher.c index 9cfd36c..f1a826f 100644 --- a/drivers/crypto/bcm/cipher.c +++ b/drivers/crypto/bcm/cipher.c @@ -4813,6 +4813,11 @@ static int spu_dt_read(struct platform_device *pdev) int err; match = of_match_device(of_match_ptr(bcm_spu_dt_ids), dev); + if (!match) { + dev_err(&pdev->dev, "Failed to match device\n"); + return -ENODEV; + } + matched_spu_type = match->data; if (iproc_priv.spu.num_spu > 1) { -- 2.5.0
Re: [RFC v2 0/5] surface heterogeneous memory performance information
On Thu, 2017-07-06 at 15:52 -0600, Ross Zwisler wrote: > Quick Summary > > Platforms in the very near future will have multiple types of memory > attached to a single CPU. These disparate memory ranges will have some > characteristics in common, such as CPU cache coherence, but they can have > wide ranges of performance both in terms of latency and bandwidth. > > For example, consider a system that contains persistent memory, standard > DDR memory and High Bandwidth Memory (HBM), all attached to the same CPU. > There could potentially be an order of magnitude or more difference in > performance between the slowest and fastest memory attached to that CPU. > > With the current Linux code NUMA nodes are CPU-centric, so all the memory > attached to a given CPU will be lumped into the same NUMA node. This makes > it very difficult for userspace applications to understand the performance > of different memory ranges on a given CPU. > > We solve this issue by providing userspace with performance information on > individual memory ranges. This performance information is exposed via > sysfs: > > # grep . mem_tgt2/* mem_tgt2/local_init/* 2>/dev/null > mem_tgt2/firmware_id:1 > mem_tgt2/is_cached:0 > mem_tgt2/is_enabled:1 > mem_tgt2/is_isolated:0 Could you please explain these charactersitics, are they in the patches to follow? > mem_tgt2/phys_addr_base:0x0 > mem_tgt2/phys_length_bytes:0x8 > mem_tgt2/local_init/read_bw_MBps:30720 > mem_tgt2/local_init/read_lat_nsec:100 > mem_tgt2/local_init/write_bw_MBps:30720 > mem_tgt2/local_init/write_lat_nsec:100 > How to these numbers compare to normal system memory? > This allows applications to easily find the memory that they want to use. > We expect that the existing NUMA APIs will be enhanced to use this new > information so that applications can continue to use them to select their > desired memory. > > This series is built upon acpica-1705: > > https://github.com/zetalog/linux/commits/acpica-1705 > > And you can find a working tree here: > > https://git.kernel.org/pub/scm/linux/kernel/git/zwisler/linux.git/log/?h=hmem_sysfs > > Lots of Details > > This patch set is only concerned with CPU-addressable memory types, not > on-device memory like what we have with Jerome Glisse's HMM series: > > https://lwn.net/Articles/726691/ > > This patch set works by enabling the new Heterogeneous Memory Attribute > Table (HMAT) table, newly defined in ACPI 6.2. One major conceptual change > in ACPI 6.2 related to this work is that proximity domains no longer need > to contain a processor. We can now have memory-only proximity domains, > which means that we can now have memory-only Linux NUMA nodes. > > Here is an example configuration where we have a single processor, one > range of regular memory and one range of HBM: > > +---+ ++ > | Processor | | Memory | > | prox domain 0 +---+ prox domain 1 | > | NUMA node 1 | | NUMA node 2| > +---+---+ ++ > | > +---+--+ > | HBM | > | prox domain 2| > | NUMA node 0 | > +--+ > > This gives us one initiator (the processor) and two targets (the two memory > ranges). Each of these three has its own ACPI proximity domain and > associated Linux NUMA node. Note also that while there is a 1:1 mapping > from each proximity domain to each NUMA node, the numbers don't necessarily > match up. Additionally we can have extra NUMA nodes that don't map back to > ACPI proximity domains. Could you expand on proximity domains, are they the same as node distance or is this ACPI terminology for something more? > > The above configuration could also have the processor and one of the two > memory ranges sharing a proximity domain and NUMA node, but for the > purposes of the HMAT the two memory ranges will always need to be > separated. > > The overall goal of this series and of the HMAT is to allow users to > identify memory using its performance characteristics. This can broadly be > done in one of two ways: > > Option 1: Provide the user with a way to map between proximity domains and > NUMA nodes and a way to access the HMAT directly (probably via > /sys/firmware/acpi/tables). Then, through possibly a library and a daemon, > provide an API so that applications can either request information about > memory ranges, or request memory allocations that meet a given set of > performance characteristics. > > Option 2: Provide the user with HMAT performance data directly in sysfs, > allowing applications to directly access it without the need for the > library and daemon. > > The kernel work for option 1 is started by patches 1-3. These just surface > the minimal amount of information in sysfs to allow userspace to map > between proximity domains and NUMA nodes so that the raw data in the HMAT > table can be understood. > > Patches 4
[PATCH] input: tegra-kbc: add NULL check on of_match_device() return value
Check return value from call to of_match_device() in order to prevent a NULL pointer dereference. In case of NULL print error message and return -ENODEV Signed-off-by: Gustavo A. R. Silva --- drivers/input/keyboard/tegra-kbc.c | 4 1 file changed, 4 insertions(+) diff --git a/drivers/input/keyboard/tegra-kbc.c b/drivers/input/keyboard/tegra-kbc.c index 0c07e10..742c5ac 100644 --- a/drivers/input/keyboard/tegra-kbc.c +++ b/drivers/input/keyboard/tegra-kbc.c @@ -617,6 +617,10 @@ static int tegra_kbc_probe(struct platform_device *pdev) const struct of_device_id *match; match = of_match_device(tegra_kbc_of_match, &pdev->dev); + if (!match) { + dev_err(&pdev->dev, "failed to match device\n"); + return -ENODEV; + } kbc = devm_kzalloc(&pdev->dev, sizeof(*kbc), GFP_KERNEL); if (!kbc) { -- 2.5.0
[RFC PATCH v4] acpi: indicate to platform when hot remove returns busy
In hotplug logic, it always indicates non-specific failure to platform through _OST when handing acpi hot-remove event failed. Then platform terminates the hot-remove process but it can not identify the reason. Base on current hot-remove code, there have two situations that it returns busy: - OSPM try to offline an individual device, but the device offline function returns busy. - When the ejection event is applied to an "not offlined yet" container. OSPM send kobject change event to userspace and returns busy. Both of them will returns -EBUSY to acpi device hotplug function then hotplug function indicates non-specific failure to platform just like any other error, e.g. -ENODEV or -EIO. The benefit to platform for identifying the OS busy state is that platform can be applied different approach to handle the busy but not just terminate the hot-remove process by unknown reason. For example, platform can wait for a while then triggers hot-remove again. This RFC patch adds one more parameter to the handler function of acpi generic hotplug event to give the function a chance to propose the return code of _OST. In this case, it sets ost return code to ACPI_OST_SC_DEVICE_BUSY when the acpi hot remove function returns -EBUSY. v4: Use switch-case statements to simplify code. (Andy Shevchenko, Rafael J. Wysocki) v3: Removed redundant 'else' in acpi_ost_status_code(). (Andy Shevchenko) v2: Do not overwrite ost code in acpi_generic_hotplug_event(). Move the "error code to ost code" logic to a help function. (Andy Shevchenko) Cc: "Rafael J. Wysocki" Cc: Len Brown Cc: Michal Hocko Reviewed-by: Andy Shevchenko Signed-off-by: "Lee, Chun-Yi" --- drivers/acpi/scan.c | 18 +- 1 file changed, 13 insertions(+), 5 deletions(-) diff --git a/drivers/acpi/scan.c b/drivers/acpi/scan.c index d531629..ce88175 100644 --- a/drivers/acpi/scan.c +++ b/drivers/acpi/scan.c @@ -404,10 +404,6 @@ void acpi_device_hotplug(struct acpi_device *adev, u32 src) error = dock_notify(adev, src); } else if (adev->flags.hotplug_notify) { error = acpi_generic_hotplug_event(adev, src); - if (error == -EPERM) { - ost_code = ACPI_OST_SC_EJECT_NOT_SUPPORTED; - goto err_out; - } } else { int (*notify)(struct acpi_device *, u32); @@ -423,8 +419,20 @@ void acpi_device_hotplug(struct acpi_device *adev, u32 src) else goto out; } - if (!error) + switch (error) { + case 0: ost_code = ACPI_OST_SC_SUCCESS; + break; + case -EPERM: + ost_code = ACPI_OST_SC_EJECT_NOT_SUPPORTED; + break; + case -EBUSY: + ost_code = ACPI_OST_SC_DEVICE_BUSY; + break; + default: + ost_code = ACPI_OST_SC_NON_SPECIFIC_FAILURE; + break; + } err_out: acpi_evaluate_ost(adev->handle, src, ost_code, NULL); -- 2.10.2
[PATCH v3] acpi: handle the acpi hotplug schedule error
Kernel should decrements the reference count of acpi device when the scheduling of acpi hotplug work failed, and evaluates _OST to notify BIOS the failure. v3: More simplify the code. (Rafael J. Wysocki) v2: To simplify the code. (Andy Shevchenko) Cc: "Rafael J. Wysocki" Cc: Len Brown Cc: Michal Hocko Reviewed-by: Andy Shevchenko Signed-off-by: "Lee, Chun-Yi" --- drivers/acpi/bus.c | 8 ++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/drivers/acpi/bus.c b/drivers/acpi/bus.c index 784bda6..9d4fea6 100644 --- a/drivers/acpi/bus.c +++ b/drivers/acpi/bus.c @@ -432,11 +432,15 @@ static void acpi_bus_notify(acpi_handle handle, u32 type, void *data) (driver->flags & ACPI_DRIVER_ALL_NOTIFY_EVENTS)) driver->ops.notify(adev, type); - if (hotplug_event && ACPI_SUCCESS(acpi_hotplug_schedule(adev, type))) + if (!hotplug_event) { + acpi_bus_put_acpi_device(adev); + return; + } + + if (ACPI_SUCCESS(acpi_hotplug_schedule(adev, type))) return; acpi_bus_put_acpi_device(adev); - return; err: acpi_evaluate_ost(handle, type, ost_code, NULL); -- 2.10.2
Re: [PATCH 14/21] x86/intel_rdt/cqm: Add mon_data
On Thu, 6 Jul 2017, Shivappa Vikas wrote: > On Sun, 2 Jul 2017, Thomas Gleixner wrote: > > > +static bool __mon_event_count(u32 rmid, struct rmid_read *rr) > > > +{ > > > + u64 tval; > > > + > > > + tval = __rmid_read(rmid, rr->evtid); > > > + if (tval & (RMID_VAL_ERROR | RMID_VAL_UNAVAIL)) { > > > + rr->val = tval; > > > + return false; > > > + } > > > + switch (rr->evtid) { > > > + case QOS_L3_OCCUP_EVENT_ID: > > > + rr->val += tval; > > > + return true; > > > + default: > > > + return false; > > > > I have no idea what that return code means. > > false for the invalid event id and all errors for __rmid_read. (IOW all errors > for __mon_event-read) Sure, but why bool? What's wrong with proper error return codes, so issues can be distinguished and potentially propagated in the callchain? Thanks, tglx
[PATCH] usb: dwc3: omap: remove IRQ_NOAUTOEN used with shared irq
IRQ_NOAUTOEN cannot be used with shared IRQs, since commit 04c848d39879 ("genirq: Warn when IRQ_NOAUTOEN is used with shared interrupts") and kernel now throws a warn dump. But OMAP DWC3 driver uses this flag. As per commit 12a7f17fac5b ("usb: dwc3: omap: fix race of pm runtime with irq handler in probe") that introduced this flag, PM runtime can race with IRQ handler when deferred probing happens due to extcon, therefore IRQ_NOAUTOEN needs to be set so that irq is not enabled until extcon is registered. Remove setting of IRQ_NOAUTOEN and move the registration of shared irq to a point after dwc3_omap_extcon_register() and of_platform_populate(). This avoids possibility of probe deferring and above said race condition. Signed-off-by: Vignesh R --- drivers/usb/dwc3/dwc3-omap.c | 18 -- 1 file changed, 8 insertions(+), 10 deletions(-) diff --git a/drivers/usb/dwc3/dwc3-omap.c b/drivers/usb/dwc3/dwc3-omap.c index 98926504b55b..f5aaa0cf3873 100644 --- a/drivers/usb/dwc3/dwc3-omap.c +++ b/drivers/usb/dwc3/dwc3-omap.c @@ -512,15 +512,6 @@ static int dwc3_omap_probe(struct platform_device *pdev) /* check the DMA Status */ reg = dwc3_omap_readl(omap->base, USBOTGSS_SYSCONFIG); - irq_set_status_flags(omap->irq, IRQ_NOAUTOEN); - ret = devm_request_threaded_irq(dev, omap->irq, dwc3_omap_interrupt, - dwc3_omap_interrupt_thread, IRQF_SHARED, - "dwc3-omap", omap); - if (ret) { - dev_err(dev, "failed to request IRQ #%d --> %d\n", - omap->irq, ret); - goto err1; - } ret = dwc3_omap_extcon_register(omap); if (ret < 0) @@ -532,8 +523,15 @@ static int dwc3_omap_probe(struct platform_device *pdev) goto err1; } + ret = devm_request_threaded_irq(dev, omap->irq, dwc3_omap_interrupt, + dwc3_omap_interrupt_thread, IRQF_SHARED, + "dwc3-omap", omap); + if (ret) { + dev_err(dev, "failed to request IRQ #%d --> %d\n", + omap->irq, ret); + goto err1; + } dwc3_omap_enable_irqs(omap); - enable_irq(omap->irq); return 0; err1: -- 2.13.0
[PATCH] fpga: altera-hps2fpga: add NULL check on of_match_device() return value
Check return value from call to of_match_device() in order to prevent a NULL pointer dereference. In case of NULL print error message and return -ENODEV Signed-off-by: Gustavo A. R. Silva --- drivers/fpga/altera-hps2fpga.c | 5 + 1 file changed, 5 insertions(+) diff --git a/drivers/fpga/altera-hps2fpga.c b/drivers/fpga/altera-hps2fpga.c index 3066b80..ca8212c 100644 --- a/drivers/fpga/altera-hps2fpga.c +++ b/drivers/fpga/altera-hps2fpga.c @@ -143,6 +143,11 @@ static int alt_fpga_bridge_probe(struct platform_device *pdev) int ret; of_id = of_match_device(altera_fpga_of_match, dev); + if (!of_id) { + dev_err(dev, "failed to match device\n"); + return -ENODEV; + } + priv = (struct altera_hps2fpga_data *)of_id->data; priv->bridge_reset = of_reset_control_get_by_index(dev->of_node, 0); -- 2.5.0
[PATCH] drm/tegra: vic: add NULL check on of_match_device() return value
Check return value from call to of_match_device() in order to prevent a NULL pointer dereference. In case of NULL print error message and return. Signed-off-by: Gustavo A. R. Silva --- drivers/gpu/drm/tegra/vic.c | 5 + 1 file changed, 5 insertions(+) diff --git a/drivers/gpu/drm/tegra/vic.c b/drivers/gpu/drm/tegra/vic.c index 47cb1aa..7e11825 100644 --- a/drivers/gpu/drm/tegra/vic.c +++ b/drivers/gpu/drm/tegra/vic.c @@ -283,6 +283,11 @@ static int vic_probe(struct platform_device *pdev) int err; match = of_match_device(vic_match, dev); + if (!match) { + dev_err(&pdev->dev, "failed to match device\n"); + return -ENODEV; + } + vic_config = (struct vic_config *)match->data; vic = devm_kzalloc(dev, sizeof(*vic), GFP_KERNEL); -- 2.5.0
Re: [PATCH] x86/boot/KASLR: exclude EFI_BOOT_SERVICES_{CODE|DATA} from KASLR's choice
On Fri, Jul 07, 2017 at 11:07:59AM +0800, Baoquan He wrote: > On 07/06/17 at 03:57pm, Matt Fleming wrote: > > On Thu, 06 Jul, at 08:31:07AM, Naoya Horiguchi wrote: > > > + for (i = 0; i < nr_desc; i++) { > > > + md = (efi_memory_desc_t *)(pmap + (i * e->efi_memdesc_size)); > > > + > > > + /* > > > + * EFI_BOOT_SERVICES_{CODE|DATA} are avoided because boot > > > + * services regions could be accessed after ExitBootServices() > > > + * due to the workaround for buggy firmware. > > > + */ > > > + if (!(md->type == EFI_LOADER_CODE || > > > + md->type == EFI_LOADER_DATA || > > > + md->type == EFI_CONVENTIONAL_MEMORY)) > > > + continue; > > > > Wouldn't it make more sense to *only* use EFI_CONVENTIONAL_MEMORY? > > > > You can't re-use EFI_LOADER_* regions because the kaslr code is run so > > early in boot that you've no idea if data the kernel will need is in > > those EFI_LOADER_* regions. > > > > For example, we pass struct setup_data objects inside of > > EFI_LOADER_DATA regions. > > It doesn't matter because we have tried to avoid those memory setup_data > resides in in mem_avoid_overlap(). Here discarding EFI_LOADER_* could > discard the whole regions while setup_data could occupy small part of > them. Hi Matt, Baoquan, I added these three checks to accept any regions corresponding to E820_TYPE_RAM except EFI_BOOT_SERVICES_*, just thinking of that it's minimum surprising. Baoquan gave a good justification on that, so I'll leave it as-is in next version. Thanks, Naoya Horiguchi
[PATCH v2] f2fs: use spin_{,un}lock_irq{save,restore}
From: Chao Yu generic/361 reports below warning, this is because: once, there is someone entering into critical region of sbi.cp_lock, if write_end_io. f2fs_stop_checkpoint is invoked from an triggered IRQ, we will encounter deadlock. So this patch changes to use spin_{,un}lock_irq{save,restore} to create critical region without IRQ enabled to avoid potential deadlock. irq event stamp: 83391573 loop: Write error at byte offset 438729728, length 1024. hardirqs last enabled at (83391573): [] restore_all+0xf/0x65 hardirqs last disabled at (83391572): [] reschedule_interrupt+0x30/0x3c loop: Write error at byte offset 438860288, length 1536. softirqs last enabled at (83389244): [] __do_softirq+0x1ae/0x476 softirqs last disabled at (83389237): [] do_softirq_own_stack+0x2c/0x40 loop: Write error at byte offset 438990848, length 2048. WARNING: inconsistent lock state 4.12.0-rc2+ #30 Tainted: G O inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage. xfs_io/7959 [HC1[1]:SC0[0]:HE0:SE1] takes: (&(&sbi->cp_lock)->rlock){?.+...}, at: [] f2fs_stop_checkpoint+0x1c/0x50 [f2fs] {HARDIRQ-ON-W} state was registered at: __lock_acquire+0x527/0x7b0 lock_acquire+0xae/0x220 _raw_spin_lock+0x42/0x50 do_checkpoint+0x165/0x9e0 [f2fs] write_checkpoint+0x33f/0x740 [f2fs] __f2fs_sync_fs+0x92/0x1f0 [f2fs] f2fs_sync_fs+0x12/0x20 [f2fs] sync_filesystem+0x67/0x80 generic_shutdown_super+0x27/0x100 kill_block_super+0x22/0x50 kill_f2fs_super+0x3a/0x40 [f2fs] deactivate_locked_super+0x3d/0x70 deactivate_super+0x40/0x60 cleanup_mnt+0x39/0x70 __cleanup_mnt+0x10/0x20 task_work_run+0x69/0x80 exit_to_usermode_loop+0x57/0x85 do_fast_syscall_32+0x18c/0x1b0 entry_SYSENTER_32+0x4c/0x7b irq event stamp: 1957420 hardirqs last enabled at (1957419): [] _raw_spin_unlock_irq+0x27/0x50 hardirqs last disabled at (1957420): [] call_function_single_interrupt+0x30/0x3c softirqs last enabled at (1953784): [] __do_softirq+0x1ae/0x476 softirqs last disabled at (1953773): [] do_softirq_own_stack+0x2c/0x40 other info that might help us debug this: Possible unsafe locking scenario: CPU0 lock(&(&sbi->cp_lock)->rlock); lock(&(&sbi->cp_lock)->rlock); *** DEADLOCK *** 2 locks held by xfs_io/7959: #0: (sb_writers#13){.+.+.+}, at: [] vfs_write+0x16a/0x190 #1: (&sb->s_type->i_mutex_key#16){+.+.+.}, at: [] f2fs_file_write_iter+0x25/0x140 [f2fs] stack backtrace: CPU: 2 PID: 7959 Comm: xfs_io Tainted: G O4.12.0-rc2+ #30 Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 Call Trace: dump_stack+0x5f/0x92 print_usage_bug+0x1d3/0x1dd ? check_usage_backwards+0xe0/0xe0 mark_lock+0x23d/0x280 __lock_acquire+0x699/0x7b0 ? __this_cpu_preempt_check+0xf/0x20 ? trace_hardirqs_off_caller+0x91/0xe0 lock_acquire+0xae/0x220 ? f2fs_stop_checkpoint+0x1c/0x50 [f2fs] _raw_spin_lock+0x42/0x50 ? f2fs_stop_checkpoint+0x1c/0x50 [f2fs] f2fs_stop_checkpoint+0x1c/0x50 [f2fs] f2fs_write_end_io+0x147/0x150 [f2fs] bio_endio+0x7a/0x1e0 blk_update_request+0xad/0x410 blk_mq_end_request+0x16/0x60 lo_complete_rq+0x3c/0x70 __blk_mq_complete_request_remote+0x11/0x20 flush_smp_call_function_queue+0x6d/0x120 ? debug_smp_processor_id+0x12/0x20 generic_smp_call_function_single_interrupt+0x12/0x30 smp_call_function_single_interrupt+0x25/0x40 call_function_single_interrupt+0x37/0x3c EIP: _raw_spin_unlock_irq+0x2d/0x50 EFLAGS: 0296 CPU: 2 EAX: 0001 EBX: d2ccc51c ECX: 0001 EDX: c1aacebd ESI: EDI: EBP: c96c9d1c ESP: c96c9d18 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 ? inherit_task_group.isra.98.part.99+0x6b/0xb0 __add_to_page_cache_locked+0x1d4/0x290 add_to_page_cache_lru+0x38/0xb0 pagecache_get_page+0x8e/0x200 f2fs_write_begin+0x96/0xf00 [f2fs] ? trace_hardirqs_on_caller+0xdd/0x1c0 ? current_time+0x17/0x50 ? trace_hardirqs_on+0xb/0x10 generic_perform_write+0xa9/0x170 __generic_file_write_iter+0x1a2/0x1f0 ? f2fs_preallocate_blocks+0x137/0x160 [f2fs] f2fs_file_write_iter+0x6e/0x140 [f2fs] ? __lock_acquire+0x429/0x7b0 __vfs_write+0xc1/0x140 vfs_write+0x9b/0x190 SyS_pwrite64+0x63/0xa0 do_fast_syscall_32+0xa1/0x1b0 entry_SYSENTER_32+0x4c/0x7b EIP: 0xb7786c61 EFLAGS: 0293 CPU: 2 EAX: ffda EBX: 0003 ECX: 08416000 EDX: 1000 ESI: 18b24000 EDI: EBP: 0003 ESP: bf9b36b0 DS: 007b ES: 007b FS: GS: 0033 SS: 007b Fixes: aaec2b1d1879 ("f2fs: introduce cp_lock to protect updating of ckpt_flags") Cc: sta...@vger.kernel.org Signed-off-by: Chao Yu --- v2: spread spin_{,un}lock_irq{save,restore}. fs/f2fs/checkpoint.c | 11 ++- fs/f2fs/f2fs.h | 18 -- 2 files changed, 18 insertions(+), 11 deletions(-) diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c index 954917d582f8..56bbf592e487 10064
[PATCH] drm/tegra: sor: add NULL check on of_match_device() return value
Check return value from call to of_match_device() in order to prevent a NULL pointer dereference. In case of NULL print error message and return. Signed-off-by: Gustavo A. R. Silva --- drivers/gpu/drm/tegra/sor.c | 4 1 file changed, 4 insertions(+) diff --git a/drivers/gpu/drm/tegra/sor.c b/drivers/gpu/drm/tegra/sor.c index a8f5289..ca3d9b1 100644 --- a/drivers/gpu/drm/tegra/sor.c +++ b/drivers/gpu/drm/tegra/sor.c @@ -2538,6 +2538,10 @@ static int tegra_sor_probe(struct platform_device *pdev) int err; match = of_match_device(tegra_sor_of_match, &pdev->dev); + if (!match) { + dev_err(&pdev->dev, "failed to match device\n"); + return -ENODEV; + } sor = devm_kzalloc(&pdev->dev, sizeof(*sor), GFP_KERNEL); if (!sor) -- 2.5.0
Re: [RFC][PATCH] exec: Use init rlimits for setuid exec
On Thu, Jul 6, 2017 at 11:02 PM, Linus Torvalds wrote: > So 2+MB is still definitely something people can do (and probably *do* do). With the default 8MB stack, most people are already limited to 2MB here. I guess the question is, do people raise their stack rlimit to gain more arguments? Should I pick a different value for the args? -Kees -- Kees Cook Pixel Security
Re: [PATCH] mm: make allocation counters per-order
Hi Roman, [auto build test WARNING on mmotm/master] [also build test WARNING on next-20170706] [cannot apply to v4.12] [if your patch is applied to the wrong git tree, please drop us a note to help improve the system] url: https://github.com/0day-ci/linux/commits/Roman-Gushchin/mm-make-allocation-counters-per-order/20170707-091630 base: git://git.cmpxchg.org/linux-mmotm.git master config: s390-default_defconfig (attached as .config) compiler: s390x-linux-gnu-gcc (Debian 6.1.1-9) 6.1.1 20160705 reproduce: wget https://raw.githubusercontent.com/01org/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # save the attached .config to linux build tree make.cross ARCH=s390 All warnings (new ones prefixed by >>): mm/memcontrol.c: In function 'memory_stat_show': >> mm/memcontrol.c:5262:1: warning: the frame size of 1136 bytes is larger than >> 1024 bytes [-Wframe-larger-than=] } ^ vim +5262 mm/memcontrol.c f10eb7534 Roman Gushchin 2017-06-30 5246 events[PGSCAN_DIRECT]); f10eb7534 Roman Gushchin 2017-06-30 5247 seq_printf(m, "pgsteal %lu\n", events[PGSTEAL_KSWAPD] + f10eb7534 Roman Gushchin 2017-06-30 5248 events[PGSTEAL_DIRECT]); f10eb7534 Roman Gushchin 2017-06-30 5249 seq_printf(m, "pgactivate %lu\n", events[PGACTIVATE]); f10eb7534 Roman Gushchin 2017-06-30 5250 seq_printf(m, "pgdeactivate %lu\n", events[PGDEACTIVATE]); f10eb7534 Roman Gushchin 2017-06-30 5251 seq_printf(m, "pglazyfree %lu\n", events[PGLAZYFREE]); f10eb7534 Roman Gushchin 2017-06-30 5252 seq_printf(m, "pglazyfreed %lu\n", events[PGLAZYFREED]); f10eb7534 Roman Gushchin 2017-06-30 5253 2a2e48854 Johannes Weiner 2017-05-03 5254 seq_printf(m, "workingset_refault %lu\n", 71cd31135 Johannes Weiner 2017-05-03 5255 stat[WORKINGSET_REFAULT]); 2a2e48854 Johannes Weiner 2017-05-03 5256 seq_printf(m, "workingset_activate %lu\n", 71cd31135 Johannes Weiner 2017-05-03 5257 stat[WORKINGSET_ACTIVATE]); 2a2e48854 Johannes Weiner 2017-05-03 5258 seq_printf(m, "workingset_nodereclaim %lu\n", 71cd31135 Johannes Weiner 2017-05-03 5259 stat[WORKINGSET_NODERECLAIM]); 2a2e48854 Johannes Weiner 2017-05-03 5260 587d9f726 Johannes Weiner 2016-01-20 5261 return 0; 587d9f726 Johannes Weiner 2016-01-20 @5262 } 587d9f726 Johannes Weiner 2016-01-20 5263 241994ed8 Johannes Weiner 2015-02-11 5264 static struct cftype memory_files[] = { 241994ed8 Johannes Weiner 2015-02-11 5265 { 241994ed8 Johannes Weiner 2015-02-11 5266 .name = "current", f5fc3c5d8 Johannes Weiner 2015-11-05 5267 .flags = CFTYPE_NOT_ON_ROOT, 241994ed8 Johannes Weiner 2015-02-11 5268 .read_u64 = memory_current_read, 241994ed8 Johannes Weiner 2015-02-11 5269 }, 241994ed8 Johannes Weiner 2015-02-11 5270 { :: The code at line 5262 was first introduced by commit :: 587d9f726aaec52157e4156e50363dbe6cb82bdb mm: memcontrol: basic memory statistics in cgroup2 memory controller :: TO: Johannes Weiner :: CC: Linus Torvalds --- 0-DAY kernel test infrastructureOpen Source Technology Center https://lists.01.org/pipermail/kbuild-all Intel Corporation .config.gz Description: application/gzip
Re: [RFC][PATCH] exec: Use init rlimits for setuid exec
On Thu, Jul 6, 2017 at 10:45 PM, Kees Cook wrote: > On Thu, Jul 6, 2017 at 10:36 PM, Andy Lutomirski wrote: >> >> Aren't there real use cases that use many megs of arguments? > > They'd be relatively new since the args were pretty limited before. > I'd be curious to see them. "megs" yes. "many megs" no. The traditional kernel limit was 32 pages (so 128kB on x86, explaining our MAX_ARG value). We moved to the much nider "two active VM's at the same time" model a fairly long time ago, though - it was back in v2.6.23 or so. So about 10 years ago. I would have expected lots of scripts to have been written since that just end up going *far* over the old 128kB limit, because it's really easy to do. Things like big directories and the shell expanding "*" can easily be a megabyte of arguments. I know I used to have scripts where I had to use "xargs" in the past, and with the > 128kB change I just stopped, because "a couple of megabytes" is enough for a lot of things where 128kB wasn't necessarily. Oh, one example is actually the kernel source tree. I don't do it any more (because "git grep" is much better), but I used to do things like grep something $(find . -name '*.[ch]') all the time. And that actually currently *just* overflows the 2MB argument size, but used to work (easily) ten years ago. Oh, how the kernel has grown.. Yes, yes, *portably* you should always have done find . -print0 -name '*.[ch]' | xargs -0 grep but be honest now: that first thing is what you actually write when you do some throw-away one-liner. So 2+MB is still definitely something people can do (and probably *do* do). Linus
Re: [PATCH 2/2] net: ethernet: fsl: add phy reset after clk enable option
On 07/07/2017 07:30 AM, Andy Duan wrote: From: Richard Leitner Sent: Thursday, July 06, 2017 9:06 PM To: Andy Duan ; robh...@kernel.org; mark.rutl...@arm.com Cc: net...@vger.kernel.org; devicet...@vger.kernel.org; linux- ker...@vger.kernel.org; d...@g0hl1n.net; Richard Leitner Subject: [PATCH 2/2] net: ethernet: fsl: add phy reset after clk enable option Some PHYs (for example the LAN8710) doesn't allow turning the clocks off and on again without reset (according to their datasheet). Exactly this behaviour was introduced for power saving reasons by commit e8fcfcd5684a ("net: fec: optimize the clock management to save power") Therefore add a devictree option to perform a PHY reset and configuration after every clock enable. For a better understanding here's a outline of the time response of the clock and reset lines before and after this patch: v--fec_probe() v--fec_enet_open() v v w/o patch eCLK: ___| w/o patch nRST: __-- w/o patch CONF: ___XX___ w/ patch eCLK: ___| w/ patch nRST: __--__-- w/ patch CONF: ___XX__XX___ ^ ^ ^--fec_probe() ^--fec_enet_open() In our case this problem does occur at about every 10th to 50th POR of an LAN8710 connected to an i.MX6 SoC. The typical symptom of this problem is a "swinging" ethernet link. Similar issues were experienced by users of the NXP forum: https://community.nxp.com/thread/389902 https://community.nxp.com/message/309354 With this patch applied the issue didn't occur for at least a few hundred PORs of our board. Fixes: e8fcfcd5684a ("net: fec: optimize the clock management to sa...") Signed-off-by: Richard Leitner ... *ndev, bool enable) ret = clk_prepare_enable(fep->clk_ref); if (ret) goto failed_clk_ref; + + /* reset the phy after clk is enabled if it's configured */ + if (fep->phy_reset_after_clk_enable) { + ret = fec_reset_phy(ndev); + if (ret) + goto failed_reset; + if (ndev->phydev) { + ret = phy_init_hw(ndev->phydev); + if (ret) + goto failed_reset; + } + } Since it is common issue so long as using the PHY, can you move it into smsc phy driver like in .smsc_phy_reset() function ? And get the reset pin from phy dts node. During my first glance at this problem I had the same "feeling" that it should go into smsc.c. Therefore I've tried that already, but I haven't found a suitable "place" for that. My basic problem is: Where do I know (from smsc.c view) when the enetrefclk was disabled/enabled again without changing fec_main.c? Some more points that come into my mind: - The smsc_phy_reset function is registered as "soft_reset". Would it be OK to use nRST in it? - Would it be OK to call the phy_init_hw function from within the smsc_phy_reset? - IMHO I'd have to move the reset gpio binding inside the phy node then. Isn't that a pretty big change doing that for all PHYs/FECs? Would it be "worth" it? Sorry for these many (maybe noobish) questions, but I'm pretty new to the kernels fec/phy stuff ;-) Andy Thanks & regards, Richard.L
[PATCH] mmc: mxcmmc: fix error return code in mxcmci_probe()
platform_get_irq() returns an error code, but the mxcmmc driver ignores it and always returns -EINVAL. This is not correct, and prevents -EPROBE_DEFER from being propagated properly. Print error message and propagate the return value of platform_get_irq on failure. Signed-off-by: Gustavo A. R. Silva --- drivers/mmc/host/mxcmmc.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/mmc/host/mxcmmc.c b/drivers/mmc/host/mxcmmc.c index fb3ca82..f3c2832 100644 --- a/drivers/mmc/host/mxcmmc.c +++ b/drivers/mmc/host/mxcmmc.c @@ -1014,8 +1014,10 @@ static int mxcmci_probe(struct platform_device *pdev) res = platform_get_resource(pdev, IORESOURCE_MEM, 0); irq = platform_get_irq(pdev, 0); - if (irq < 0) - return -EINVAL; + if (irq < 0) { + dev_err(&pdev->dev, "failed to get IRQ: %d\n", irq); + return irq; + } mmc = mmc_alloc_host(sizeof(*host), &pdev->dev); if (!mmc) -- 2.5.0
Re: [PATCH v2 4/6] cpufreq: schedutil: update CFS util only if used
On Tue, Jul 4, 2017 at 10:34 AM, Patrick Bellasi wrote: > Currently the utilization of the FAIR class is collected before locking > the policy. Although that should not be a big issue for most cases, we > also don't really know how much latency there can be between the > utilization reading and its usage. > > Let's get the FAIR utilization right before its usage to be better in > sync with the current status of a CPU. > > Signed-off-by: Patrick Bellasi > Cc: Ingo Molnar > Cc: Peter Zijlstra > Cc: Rafael J. Wysocki > Cc: Viresh Kumar > Cc: linux-kernel@vger.kernel.org > Cc: linux...@vger.kernel.org > --- > kernel/sched/cpufreq_schedutil.c | 3 +-- > 1 file changed, 1 insertion(+), 2 deletions(-) > > diff --git a/kernel/sched/cpufreq_schedutil.c > b/kernel/sched/cpufreq_schedutil.c > index 98704d8..df433f1 100644 > --- a/kernel/sched/cpufreq_schedutil.c > +++ b/kernel/sched/cpufreq_schedutil.c > @@ -308,10 +308,9 @@ static void sugov_update_shared(struct update_util_data > *hook, u64 time, > if (unlikely(current == sg_policy->thread)) > return; > > - sugov_get_util(&util, &max); > - > raw_spin_lock(&sg_policy->update_lock); > > + sugov_get_util(&util, &max); > sg_cpu->util = util; > sg_cpu->max = max; Is your concern that there will we spinlock contention before calling sugov_get_util? If that's the case, with your patch it seems to me such contention (and hence spinning) itself could cause the utilization to be inflated - thus calling sugov_get_util after acquiring the lock will not be as accurate as before. In that case it seems to me its better to let sugov_get_util be called before acquiring the lock (as before). thanks, -Joel
Re: [RFC v2 3/5] hmem: add heterogeneous memory sysfs support
On 07/06/2017 02:52 PM, Ross Zwisler wrote: [...] > diff --git a/drivers/acpi/Makefile b/drivers/acpi/Makefile > index b1aacfc..31e3f20 100644 > --- a/drivers/acpi/Makefile > +++ b/drivers/acpi/Makefile > @@ -72,6 +72,7 @@ obj-$(CONFIG_ACPI_PROCESSOR)+= processor.o > obj-$(CONFIG_ACPI) += container.o > obj-$(CONFIG_ACPI_THERMAL) += thermal.o > obj-$(CONFIG_ACPI_NFIT) += nfit/ > +obj-$(CONFIG_ACPI_HMEM) += hmem/ > obj-$(CONFIG_ACPI) += acpi_memhotplug.o > obj-$(CONFIG_ACPI_HOTPLUG_IOAPIC) += ioapic.o > obj-$(CONFIG_ACPI_BATTERY) += battery.o Hi Ross, Following are a series of suggestions, intended to clarify naming just enough so that, when Jerome's HMM patchset lands, we'll be able to tell the difference between the two types of Heterogeneous Memory. > diff --git a/drivers/acpi/hmem/Kconfig b/drivers/acpi/hmem/Kconfig > new file mode 100644 > index 000..09282be > --- /dev/null > +++ b/drivers/acpi/hmem/Kconfig > @@ -0,0 +1,7 @@ > +config ACPI_HMEM > + bool "ACPI Heterogeneous Memory Support" How about: bool "ACPI Heterogeneous Memory Attribute Table Support" The idea here, and throughout, is that this type of Heterogeneous Memory support is all about "the Heterogeneous Memory that you found via ACPI's Heterogeneous Memory Attribute Table". That's different from "the Heterogeneous Memory that you found when you installed a PCIe device that supports HMM". Or, at least it is different, until the day that someone decides to burn in support for an HMM device, into the ACPI tables. Seems unlikely, though. :) And even so, I think it would still work. > + depends on ACPI_NUMA > + depends on SYSFS > + help > + Exports a sysfs representation of the ACPI Heterogeneous Memory > + Attributes Table (HMAT). > diff --git a/drivers/acpi/hmem/Makefile b/drivers/acpi/hmem/Makefile > new file mode 100644 > index 000..d2aa546 > --- /dev/null > +++ b/drivers/acpi/hmem/Makefile > @@ -0,0 +1,2 @@ > +obj-$(CONFIG_ACPI_HMEM) := hmem.o > +hmem-y := core.o initiator.o target.o > diff --git a/drivers/acpi/hmem/core.c b/drivers/acpi/hmem/core.c > new file mode 100644 > index 000..f7638db > --- /dev/null > +++ b/drivers/acpi/hmem/core.c > @@ -0,0 +1,569 @@ > +/* > + * Heterogeneous memory representation in sysfs Heterogeneous memory, as discovered via ACPI's Heterogeneous Memory Attribute Table: representation in sysfs > + * > + * Copyright (c) 2017, Intel Corporation. > + * > + * This program is free software; you can redistribute it and/or modify it > + * under the terms and conditions of the GNU General Public License, > + * version 2, as published by the Free Software Foundation. > + * > + * This program is distributed in the hope it will be useful, but WITHOUT > + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or > + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for > + * more details. > + */ > + > +#include [...] > diff --git a/drivers/acpi/hmem/hmem.h b/drivers/acpi/hmem/hmem.h > new file mode 100644 > index 000..38ff540 > --- /dev/null > +++ b/drivers/acpi/hmem/hmem.h > @@ -0,0 +1,47 @@ > +/* > + * Heterogeneous memory representation in sysfs Heterogeneous memory, as discovered via ACPI's Heterogeneous Memory Attribute Table: representation in sysfs > + * > + * Copyright (c) 2017, Intel Corporation. > + * > + * This program is free software; you can redistribute it and/or modify it > + * under the terms and conditions of the GNU General Public License, > + * version 2, as published by the Free Software Foundation. > + * > + * This program is distributed in the hope it will be useful, but WITHOUT > + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or > + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for > + * more details. > + */ > + > +#ifndef _ACPI_HMEM_H_ > +#define _ACPI_HMEM_H_ > + > +struct memory_initiator { > + struct list_head list; > + struct device dev; > + > + /* only one of the following three will be set */ > + struct acpi_srat_cpu_affinity *cpu; > + struct acpi_srat_x2apic_cpu_affinity *x2apic; > + struct acpi_srat_gicc_affinity *gicc; > + > + int pxm; > + bool is_registered; > +}; > +#define to_memory_initiator(d) container_of((d), struct memory_initiator, > dev) > + > +struct memory_target { > + struct list_head list; > + struct device dev; > + struct acpi_srat_mem_affinity *ma; > + struct acpi_hmat_address_range *spa; > + struct memory_initiator *local_init; > + > + bool is_cached; > + bool is_registered; > +}; > +#define to_memory_target(d) container_of((d), struct memory_target, dev) > + > +extern const struct attribute_group *memory_initiator_attribute_groups[]; > +extern const struct attribute_group *memory_target_attribute_groups[]; > +#endif /* _ACPI_HMEM_H_ */ > diff --git a/drivers/acpi/hmem/initiator.c b/drivers/acpi/hmem/i
[PATCH] drm/udl: Make page_flip asynchronous
In page_flip vblank is sent with no delay. Driver does not know when the actual update is present on the display and has no means for getting this information from a device. It is practically impossible to say exactly *when* as there is also i.e. a usb delay. When we are unable to determine when the vblank actually happens we may assume it will behave accordingly, i.e. it will present frames with proper timing. In the worst case scenario it should take up to duration of one frame (we may get new frame in the device just after presenting current one so we would need to wait for the whole frame). Because of the asynchronous nature of the delay we need to synchronize: * read/write vrefresh/page_flip data when changing mode and preparing/executing vblank * USB requests to prevent interleaved access to URBs for two different frame buffers All those changes are backports from ChromeOS: 1. https://chromium-review.googlesource.com/250622 2. https://chromium-review.googlesource.com/249450 partially, only change in udl_modeset.c for 'udl_flip_queue' 3. https://chromium-review.googlesource.com/321378 4. https://chromium-review.googlesource.com/324119 + fixes for checkpatch and latest drm changes Cc: h...@chromium.org Cc: marc...@chromium.org Cc: za...@chromium.org Cc: db...@google.com Signed-off-by: Dawid Kurek --- drivers/gpu/drm/udl/udl_drv.h | 4 + drivers/gpu/drm/udl/udl_fb.c | 28 --- drivers/gpu/drm/udl/udl_main.c| 3 + drivers/gpu/drm/udl/udl_modeset.c | 160 ++ 4 files changed, 171 insertions(+), 24 deletions(-) diff --git a/drivers/gpu/drm/udl/udl_drv.h b/drivers/gpu/drm/udl/udl_drv.h index 2a75ab8..b93fb8d 100644 --- a/drivers/gpu/drm/udl/udl_drv.h +++ b/drivers/gpu/drm/udl/udl_drv.h @@ -47,6 +47,7 @@ struct urb_list { }; struct udl_fbdev; +struct udl_flip_queue; struct udl_device { struct device *dev; @@ -66,6 +67,9 @@ struct udl_device { atomic_t bytes_identical; /* saved effort with backbuffer comparison */ atomic_t bytes_sent; /* to usb, after compression including overhead */ atomic_t cpu_kcycles_used; /* transpired during pixel processing */ + + struct udl_flip_queue *flip_queue; + struct mutex transfer_lock; /* to serialize transfers */ }; struct udl_gem_object { diff --git a/drivers/gpu/drm/udl/udl_fb.c b/drivers/gpu/drm/udl/udl_fb.c index 4a65003..6dd9a49 100644 --- a/drivers/gpu/drm/udl/udl_fb.c +++ b/drivers/gpu/drm/udl/udl_fb.c @@ -82,7 +82,7 @@ int udl_handle_damage(struct udl_framebuffer *fb, int x, int y, { struct drm_device *dev = fb->base.dev; struct udl_device *udl = dev->dev_private; - int i, ret; + int i, ret = 0; char *cmd; cycles_t start_cycles, end_cycles; int bytes_sent = 0; @@ -91,18 +91,22 @@ int udl_handle_damage(struct udl_framebuffer *fb, int x, int y, int aligned_x; int bpp = fb->base.format->cpp[0]; + mutex_lock(&udl->transfer_lock); + if (!fb->active_16) - return 0; + goto out; if (!fb->obj->vmapping) { ret = udl_gem_vmap(fb->obj); if (ret == -ENOMEM) { DRM_ERROR("failed to vmap fb\n"); - return 0; + ret = 0; + goto out; } if (!fb->obj->vmapping) { DRM_ERROR("failed to vmapping\n"); - return 0; + ret = 0; + goto out; } } @@ -112,14 +116,18 @@ int udl_handle_damage(struct udl_framebuffer *fb, int x, int y, if ((width <= 0) || (x + width > fb->base.width) || - (y + height > fb->base.height)) - return -EINVAL; + (y + height > fb->base.height)) { + ret = -EINVAL; + goto out; + } start_cycles = get_cycles(); urb = udl_get_urb(dev); - if (!urb) - return 0; + if (!urb) { + ret = 0; + goto out; + } cmd = urb->transfer_buffer; for (i = y; i < y + height ; i++) { @@ -151,7 +159,9 @@ int udl_handle_damage(struct udl_framebuffer *fb, int x, int y, >> 10)), /* Kcycles */ &udl->cpu_kcycles_used); - return 0; +out: + mutex_unlock(&udl->transfer_lock); + return ret; } static int udl_fb_mmap(struct fb_info *info, struct vm_area_struct *vma) diff --git a/drivers/gpu/drm/udl/udl_main.c b/drivers/gpu/drm/udl/udl_main.c index a9d93b8..d376233 100644 --- a/drivers/gpu/drm/udl/udl_main.c +++ b/drivers/gpu/drm/udl/udl_main.c @@ -319,6 +319,8 @@ int udl_driver_load(struct drm_device *dev, unsigned long flags) if (!udl) return -ENOMEM; + mutex_init(&udl->transfer_lock); + udl->udev = ud
Re: [RFC][PATCH] exec: Use init rlimits for setuid exec
On Thu, Jul 6, 2017 at 10:39 PM, Linus Torvalds wrote: > On Thu, Jul 6, 2017 at 10:15 PM, Kees Cook wrote: >> >> I always say this backwards. :P Default is top-down (allocate at high >> addresses and work down toward low). With unlimited stack, allocations >> start at low addresses and work up. Here's the results (shown with >> randomize_va_space sysctl set to 0): > > But this doesn't affect the stack layout itself. > > So we could do the stack copying without much caring, because that > happens first, right? > > So I think we can do all the envp/argv copying first, and then - as we > change the credentials, change the rlimit. And the string copies > wouldn't need to care much - although I guess they are also fine > checking against a possible *smaller* stack rlimit, which is actually > what we'd want. Yup, agreed. > And I think the credentials switch (which is the point of no return > anyway) happens before we start mmap'ing the executable etc. We used > to have some odd code there and do it in the completely wrong order > (checking that the binary was executable for the *old* user, which > makes no sense, iirc) Yeah, it all happens in setup_new_exec(). The first thing is layout selection, then switching credentials. It could be made to take a hint from GNU_STACK (which was parsed before setup_new_exec() is called), check security_bprm_secureexec() and then make the rlimit changes, all before the layout selection. > So I'm getting the sense that none of this should be a problem. > > But it's entirely possible that I missed something, and am just full > of shit. Our execve() path has traditionally been very hard to read. > It's actually gotten a bit better, but the whole "jump back and forth > between the generic fs/exec.c code and the binfmt code" is certainly > still there. Yeah, there might be something else lurking here, but so far, I'm satisfied this will work too. -Kees -- Kees Cook Pixel Security
Re: [PATCH v2] ASoC: samsung: i2s: Support more resolution rates
On Fri, Jul 07, 2017 at 10:31:10AM +0900, Jaechul Lee wrote: > This driver can support more frequencies over 96KHz. There are no reasons > to limit the frequency range below 96KHz. If codecs/amps or something else > can't support higher resolution rates, the constraints would be set rates > properly because each drivers have its own limits. > > I added the 'pcm_rates' field to the dai_data to be set rates by the > compatibilities. As a result, rates will be set each devices respectively. > For example of exynos5433, rates will be set from 8KHz to 192KHz. Code looks okay but commtit message could be still improved. Simply, the goal of this change is to add 192 kHz ferquency to Exynos5433 and Exynos7 DAI. Reviewed-by: Krzysztof Kozlowski Best regards, Krzysztof > > Signed-off-by: Jaechul Lee > --- > v2: > - changed the name of variable to pcm_rates. > > - removed duplicated code. > - modified commit message. > --- > sound/soc/samsung/i2s.c | 20 +--- > 1 file changed, 13 insertions(+), 7 deletions(-) > > diff --git a/sound/soc/samsung/i2s.c b/sound/soc/samsung/i2s.c > index af3ba4d4ccc5..c9f87f7bae99 100644 > --- a/sound/soc/samsung/i2s.c > +++ b/sound/soc/samsung/i2s.c > @@ -50,6 +50,7 @@ struct samsung_i2s_variant_regs { > > struct samsung_i2s_dai_data { > u32 quirks; > + unsigned int pcm_rates; > const struct samsung_i2s_variant_regs *i2s_variant_regs; > }; > > @@ -1076,13 +1077,13 @@ static const struct snd_soc_component_driver > samsung_i2s_component = { > .name = "samsung-i2s", > }; > > -#define SAMSUNG_I2S_RATESSNDRV_PCM_RATE_8000_96000 > - > #define SAMSUNG_I2S_FMTS (SNDRV_PCM_FMTBIT_S8 | \ > SNDRV_PCM_FMTBIT_S16_LE | \ > SNDRV_PCM_FMTBIT_S24_LE) > > -static struct i2s_dai *i2s_alloc_dai(struct platform_device *pdev, bool sec) > +static struct i2s_dai *i2s_alloc_dai(struct platform_device *pdev, > + const struct samsung_i2s_dai_data *i2s_dai_data, > + bool sec) > { > struct i2s_dai *i2s; > > @@ -1101,13 +1102,13 @@ static struct i2s_dai *i2s_alloc_dai(struct > platform_device *pdev, bool sec) > i2s->i2s_dai_drv.resume = i2s_resume; > i2s->i2s_dai_drv.playback.channels_min = 1; > i2s->i2s_dai_drv.playback.channels_max = 2; > - i2s->i2s_dai_drv.playback.rates = SAMSUNG_I2S_RATES; > + i2s->i2s_dai_drv.playback.rates = i2s_dai_data->pcm_rates; > i2s->i2s_dai_drv.playback.formats = SAMSUNG_I2S_FMTS; > > if (!sec) { > i2s->i2s_dai_drv.capture.channels_min = 1; > i2s->i2s_dai_drv.capture.channels_max = 2; > - i2s->i2s_dai_drv.capture.rates = SAMSUNG_I2S_RATES; > + i2s->i2s_dai_drv.capture.rates = i2s_dai_data->pcm_rates; > i2s->i2s_dai_drv.capture.formats = SAMSUNG_I2S_FMTS; > } > return i2s; > @@ -1242,7 +1243,7 @@ static int samsung_i2s_probe(struct platform_device > *pdev) > i2s_dai_data = (struct samsung_i2s_dai_data *) > platform_get_device_id(pdev)->driver_data; > > - pri_dai = i2s_alloc_dai(pdev, false); > + pri_dai = i2s_alloc_dai(pdev, i2s_dai_data, false); > if (!pri_dai) { > dev_err(&pdev->dev, "Unable to alloc I2S_pri\n"); > return -ENOMEM; > @@ -1316,7 +1317,7 @@ static int samsung_i2s_probe(struct platform_device > *pdev) > goto err_disable_clk; > > if (quirks & QUIRK_SEC_DAI) { > - sec_dai = i2s_alloc_dai(pdev, true); > + sec_dai = i2s_alloc_dai(pdev, i2s_dai_data, true); > if (!sec_dai) { > dev_err(&pdev->dev, "Unable to alloc I2S_sec\n"); > ret = -ENOMEM; > @@ -1452,29 +1453,34 @@ static const struct samsung_i2s_variant_regs > i2sv5_i2s1_regs = { > > static const struct samsung_i2s_dai_data i2sv3_dai_type = { > .quirks = QUIRK_NO_MUXPSR, > + .pcm_rates = SNDRV_PCM_RATE_8000_96000, > .i2s_variant_regs = &i2sv3_regs, > }; > > static const struct samsung_i2s_dai_data i2sv5_dai_type = { > .quirks = QUIRK_PRI_6CHAN | QUIRK_SEC_DAI | QUIRK_NEED_RSTCLR | > QUIRK_SUPPORTS_IDMA, > + .pcm_rates = SNDRV_PCM_RATE_8000_96000, > .i2s_variant_regs = &i2sv3_regs, > }; > > static const struct samsung_i2s_dai_data i2sv6_dai_type = { > .quirks = QUIRK_PRI_6CHAN | QUIRK_SEC_DAI | QUIRK_NEED_RSTCLR | > QUIRK_SUPPORTS_TDM | QUIRK_SUPPORTS_IDMA, > + .pcm_rates = SNDRV_PCM_RATE_8000_96000, > .i2s_variant_regs = &i2sv6_regs, > }; > > static const struct samsung_i2s_dai_data i2sv7_dai_type = { > .quirks = QUIRK_PRI_6CHAN | QUIRK_SEC_DAI | QUIRK_NEED_RSTCLR | > QUIRK_SUPPORTS_TDM, > + .pcm_rates = SNDRV_PCM_RATE_8000_1
Re: [RFC][PATCH] exec: Use init rlimits for setuid exec
On Thu, Jul 6, 2017 at 10:36 PM, Andy Lutomirski wrote: > On Thu, Jul 6, 2017 at 10:15 PM, Kees Cook wrote: >> On Thu, Jul 6, 2017 at 10:10 PM, Kees Cook wrote: >>> On Thu, Jul 6, 2017 at 9:48 PM, Andy Lutomirski wrote: How about a much simpler solution: don't read rlimit at all in copy_strings(), let alone try to enforce it. Instead, just before the point of no return, check how much stack space is already used and, if it's more than an appropriate threshold (e.g. 1/4 of the rlimit), abort. Sure, this adds overhead if we're going to abort, but does that really matter? >>> >>> We should avoid using up tons of memory and then failing. Better to >>> cap it as we use it. Plumbing a sane value into this shouldn't be hard >>> at all. Just making this a hardcoded 2MB seems sane (1/4 of 8MB). > > Aren't there real use cases that use many megs of arguments? They'd be relatively new since the args were pretty limited before. I'd be curious to see them. > We could probably get away with saying max(rlimit(RLIMIT_STACK), 2MB) > as long as we make sure later on that we don't screw up if we've > overallocated? min, not max, but yeah. Here's part of what I have for get_arg_page(): rlim = current->signal->rlim; - if (size > READ_ONCE(rlim[RLIMIT_STACK].rlim_cur) / 4) + arg_stack = READ_ONCE(rlim[RLIMIT_STACK].rlim_cur); + arg_stack = min_t(unsigned long, arg_stack, _STK_LIM) / 4; + if (size > arg_stack) goto fail; >>> IIUC, this is a big deal on 32-bit. Unlimited stack triggers top-down >>> mmap instead of bottom-up. I mean, I'd be delighted to get rid of >>> this, but I thought it was relied on by userspace. >> >> I always say this backwards. :P Default is top-down (allocate at high >> addresses and work down toward low). With unlimited stack, allocations >> start at low addresses and work up. Here's the results (shown with >> randomize_va_space sysctl set to 0): > > Uhh, crikey! Where's the code that does that? That was the call path I quoted earlier: > The stack rlimit defines the mmap layout too: > > do_execveat_common() -> > exec_binprm() -> > search_binary_handler() -> > fmt->load_binary (load_elf_binary()) -> > setup_new_exec() -> > arch_pick_mmap_layout() -> > mmap_is_legacy() -> > rlimit(RLIMIT_STACK) == RLIM_INFINITY i.e. arch_pick_mmap_layout(). -Kees -- Kees Cook Pixel Security
[PATCH] staging: rtl8712: fix "Alignment match open parenthesis"
Fix checkpatch issues: "CHECK: Alignment should match open parenthesis". Signed-off-by: Arushi Singhal --- drivers/staging/rtl8712/mlme_linux.c| 4 ++-- drivers/staging/rtl8712/rtl8712_cmd.c | 2 +- drivers/staging/rtl8712/rtl8712_efuse.c | 2 +- 3 files changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/staging/rtl8712/mlme_linux.c b/drivers/staging/rtl8712/mlme_linux.c index 2037265..a077069 100644 --- a/drivers/staging/rtl8712/mlme_linux.c +++ b/drivers/staging/rtl8712/mlme_linux.c @@ -111,8 +111,8 @@ void r8712_os_indicate_disconnect(struct _adapter *adapter) */ memcpy(&backupPMKIDList[0], - &adapter->securitypriv.PMKIDList[0], - sizeof(struct RT_PMKID_LIST) * NUM_PMKID_CACHE); + &adapter->securitypriv.PMKIDList[0], + sizeof(struct RT_PMKID_LIST) * NUM_PMKID_CACHE); backupPMKIDIndex = adapter->securitypriv.PMKIDIndex; backupTKIPCountermeasure = adapter->securitypriv.btkip_countermeasure; diff --git a/drivers/staging/rtl8712/rtl8712_cmd.c b/drivers/staging/rtl8712/rtl8712_cmd.c index 5346c65..0104ace 100644 --- a/drivers/staging/rtl8712/rtl8712_cmd.c +++ b/drivers/staging/rtl8712/rtl8712_cmd.c @@ -385,7 +385,7 @@ int r8712_cmd_thread(void *context) if (blnPending) wr_sz += 8; /* Append 8 bytes */ r8712_write_mem(padapter, RTL8712_DMA_H2CCMD, wr_sz, - (u8 *)pdesc); + (u8 *)pdesc); pcmdpriv->cmd_seq++; if (pcmd->cmdcode == GEN_CMD_CODE(_CreateBss)) { pcmd->res = H2C_SUCCESS; diff --git a/drivers/staging/rtl8712/rtl8712_efuse.c b/drivers/staging/rtl8712/rtl8712_efuse.c index 205298e..d90213e 100644 --- a/drivers/staging/rtl8712/rtl8712_efuse.c +++ b/drivers/staging/rtl8712/rtl8712_efuse.c @@ -347,7 +347,7 @@ static u8 fix_header(struct _adapter *padapter, u8 header, u16 header_addr) ret = false; if (value == 0xFF) /* write again */ efuse_one_byte_write(padapter, addr, - pkt.data[i * 2]); +pkt.data[i * 2]); } if (!efuse_one_byte_read(padapter, addr + 1, &value)) { ret = false; -- 2.7.4
Re: [RFC][PATCH] exec: Use init rlimits for setuid exec
On Thu, Jul 6, 2017 at 10:15 PM, Kees Cook wrote: > > I always say this backwards. :P Default is top-down (allocate at high > addresses and work down toward low). With unlimited stack, allocations > start at low addresses and work up. Here's the results (shown with > randomize_va_space sysctl set to 0): But this doesn't affect the stack layout itself. So we could do the stack copying without much caring, because that happens first, right? So I think we can do all the envp/argv copying first, and then - as we change the credentials, change the rlimit. And the string copies wouldn't need to care much - although I guess they are also fine checking against a possible *smaller* stack rlimit, which is actually what we'd want. And I think the credentials switch (which is the point of no return anyway) happens before we start mmap'ing the executable etc. We used to have some odd code there and do it in the completely wrong order (checking that the binary was executable for the *old* user, which makes no sense, iirc) So I'm getting the sense that none of this should be a problem. But it's entirely possible that I missed something, and am just full of shit. Our execve() path has traditionally been very hard to read. It's actually gotten a bit better, but the whole "jump back and forth between the generic fs/exec.c code and the binfmt code" is certainly still there. Linus
Re: [RFC][PATCH] exec: Use init rlimits for setuid exec
On Thu, Jul 6, 2017 at 10:15 PM, Kees Cook wrote: > On Thu, Jul 6, 2017 at 10:10 PM, Kees Cook wrote: >> On Thu, Jul 6, 2017 at 9:48 PM, Andy Lutomirski wrote: >>> How about a much simpler solution: don't read rlimit at all in >>> copy_strings(), let alone try to enforce it. Instead, just before the >>> point of no return, check how much stack space is already used and, if >>> it's more than an appropriate threshold (e.g. 1/4 of the rlimit), >>> abort. Sure, this adds overhead if we're going to abort, but does >>> that really matter? >> >> We should avoid using up tons of memory and then failing. Better to >> cap it as we use it. Plumbing a sane value into this shouldn't be hard >> at all. Just making this a hardcoded 2MB seems sane (1/4 of 8MB). Aren't there real use cases that use many megs of arguments? We could probably get away with saying max(rlimit(RLIMIT_STACK), 2MB) as long as we make sure later on that we don't screw up if we've overallocated? >> >>> I don't see why using rlimit for layout control makes any sense >>> whatsoever. Is there some historical reason we need that? As far as >>> I can see (on insufficient inspection) is that the kernel is trying to >>> guarantee that, if we have so much arg crap that our remaining stack >>> is less than 128k, then we don't exceed our limit by a little bit. >> >> IIUC, this is a big deal on 32-bit. Unlimited stack triggers top-down >> mmap instead of bottom-up. I mean, I'd be delighted to get rid of >> this, but I thought it was relied on by userspace. > > I always say this backwards. :P Default is top-down (allocate at high > addresses and work down toward low). With unlimited stack, allocations > start at low addresses and work up. Here's the results (shown with > randomize_va_space sysctl set to 0): Uhh, crikey! Where's the code that does that?
[PATCH] regulator: qcom_smd: add NULL check on of_match_device() return value
Check return value from call to of_match_device() in order to prevent a NULL pointer dereference. In case of NULL print error message and return. Signed-off-by: Gustavo A. R. Silva --- drivers/regulator/qcom_smd-regulator.c | 5 + 1 file changed, 5 insertions(+) diff --git a/drivers/regulator/qcom_smd-regulator.c b/drivers/regulator/qcom_smd-regulator.c index f35994a..940fe1b 100644 --- a/drivers/regulator/qcom_smd-regulator.c +++ b/drivers/regulator/qcom_smd-regulator.c @@ -570,6 +570,11 @@ static int rpm_reg_probe(struct platform_device *pdev) } match = of_match_device(rpm_of_match, &pdev->dev); + if (!match) { + dev_err(&pdev->dev, "failed to match device\n"); + return -ENODEV; + } + for (reg = match->data; reg->name; reg++) { vreg = devm_kzalloc(&pdev->dev, sizeof(*vreg), GFP_KERNEL); if (!vreg) -- 2.5.0
Re: [RFC v2 0/5] surface heterogeneous memory performance information
On 07/06/2017 02:52 PM, Ross Zwisler wrote: [...] > > The naming collision between Jerome's "Heterogeneous Memory Management > (HMM)" and this "Heterogeneous Memory (HMEM)" series is unfortunate, but I > was trying to stick with the word "Heterogeneous" because of the naming of > the ACPI 6.2 Heterogeneous Memory Attribute Table table. Suggestions for > better naming are welcome. > Hi Ross, Say, most of the places (file names, function and variable names, and even print statements) where this patchset uses hmem or HMEM, it really seems to mean, the Heterogeneous Memory Attribute Table. That's not *always* true, but given that it's a pretty severe naming conflict, how about just changing: hmem --> hmat HMEM --> HMAT ...everywhere? Then you still have Heterogeneous Memory in the name, but there is enough lexical distance (is that a thing? haha) between HMM and HMAT to keep us all sane. :) With or without the above suggestion, there are a few places (Kconfig, comments, prints) where we can more easily make it clear that HMM != HMEM (or HMAT), so for those I can just comment on them separately in the individual patches. thanks, john h
[PATCH] regulator: qcom_rpm-regulator: add NULL check on of_match_device() return value
Check return value from call to of_match_device() in order to prevent a NULL pointer dereference. In case of NULL print error message and return. Signed-off-by: Gustavo A. R. Silva --- drivers/regulator/qcom_rpm-regulator.c | 5 + 1 file changed, 5 insertions(+) diff --git a/drivers/regulator/qcom_rpm-regulator.c b/drivers/regulator/qcom_rpm-regulator.c index 1b2acc4..88dc0b0 100644 --- a/drivers/regulator/qcom_rpm-regulator.c +++ b/drivers/regulator/qcom_rpm-regulator.c @@ -959,6 +959,11 @@ static int rpm_reg_probe(struct platform_device *pdev) } match = of_match_device(rpm_of_match, &pdev->dev); + if (!match) { + dev_err(&pdev->dev, "failed to match device\n"); + return -ENODEV; + } + for (reg = match->data; reg->name; reg++) { vreg = devm_kmalloc(&pdev->dev, sizeof(*vreg), GFP_KERNEL); if (!vreg) -- 2.5.0
RE: [PATCH 2/2] net: ethernet: fsl: add phy reset after clk enable option
From: Richard Leitner Sent: Thursday, July 06, 2017 9:06 PM >To: Andy Duan ; robh...@kernel.org; >mark.rutl...@arm.com >Cc: net...@vger.kernel.org; devicet...@vger.kernel.org; linux- >ker...@vger.kernel.org; d...@g0hl1n.net; Richard Leitner > >Subject: [PATCH 2/2] net: ethernet: fsl: add phy reset after clk enable option > >Some PHYs (for example the LAN8710) doesn't allow turning the clocks off and >on again without reset (according to their datasheet). Exactly this behaviour >was introduced for power saving reasons by commit e8fcfcd5684a >("net: fec: optimize the clock management to save power") Therefore add a >devictree option to perform a PHY reset and configuration after every clock >enable. > >For a better understanding here's a outline of the time response of the clock >and reset lines before and after this patch: > > v--fec_probe() v--fec_enet_open() > v v > w/o patch eCLK: >___| > w/o patch nRST: __-- > w/o patch CONF: >___XX___ > > w/ patch eCLK: >___| > w/ patch nRST: __--__-- > w/ patch CONF: >___XX__XX___ > ^ ^ > ^--fec_probe() ^--fec_enet_open() > >In our case this problem does occur at about every 10th to 50th POR of an >LAN8710 connected to an i.MX6 SoC. The typical symptom of this problem is a >"swinging" ethernet link. Similar issues were experienced by users of the NXP >forum: > https://community.nxp.com/thread/389902 > https://community.nxp.com/message/309354 >With this patch applied the issue didn't occur for at least a few hundred PORs >of our board. > >Fixes: e8fcfcd5684a ("net: fec: optimize the clock management to sa...") >Signed-off-by: Richard Leitner >--- > Documentation/devicetree/bindings/net/fsl-fec.txt | 3 +++ > drivers/net/ethernet/freescale/fec.h | 1 + > drivers/net/ethernet/freescale/fec_main.c | 16 > 3 files changed, 20 insertions(+) > >diff --git a/Documentation/devicetree/bindings/net/fsl-fec.txt >b/Documentation/devicetree/bindings/net/fsl-fec.txt >index 6f55bdd..1766579 100644 >--- a/Documentation/devicetree/bindings/net/fsl-fec.txt >+++ b/Documentation/devicetree/bindings/net/fsl-fec.txt >@@ -23,6 +23,9 @@ Optional properties: > - phy-handle : phandle to the PHY device connected to this device. > - fixed-link : Assume a fixed link. See fixed-link.txt in the same directory. > Use instead of phy-handle. >+- phy-reset-after-clk-enable : If present then a phy reset and >+configuration >+ will be performed everytime after the clocks are enabled. This is >+required >+ for some PHYs to work properly. > - fsl,num-tx-queues : The property is valid for enet-avb IP, which supports > hw multi queues. Should specify the tx queue number, otherwise set tx >queue > number to 1. >diff --git a/drivers/net/ethernet/freescale/fec.h >b/drivers/net/ethernet/freescale/fec.h >index 3da8084..d4f658a 100644 >--- a/drivers/net/ethernet/freescale/fec.h >+++ b/drivers/net/ethernet/freescale/fec.h >@@ -538,6 +538,7 @@ struct fec_enet_private { > int phy_reset_duration; > int phy_reset_post_delay; > boolphy_reset_active_high; >+ boolphy_reset_after_clk_enable; > > struct napi_struct napi; > int csum_flags; >diff --git a/drivers/net/ethernet/freescale/fec_main.c >b/drivers/net/ethernet/freescale/fec_main.c >index cbbf7b8..4e744f3 100644 >--- a/drivers/net/ethernet/freescale/fec_main.c >+++ b/drivers/net/ethernet/freescale/fec_main.c >@@ -68,6 +68,7 @@ > > static void set_multicast_list(struct net_device *ndev); static void >fec_enet_itr_coal_init(struct net_device *ndev); >+static int fec_reset_phy(struct net_device *ndev); > > #define DRIVER_NAME "fec" > >@@ -1862,6 +1863,18 @@ static int fec_enet_clk_enable(struct net_device >*ndev, bool enable) > ret = clk_prepare_enable(fep->clk_ref); > if (ret) > goto failed_clk_ref; >+ >+ /* reset the phy after clk is enabled if it's configured */ >+ if (fep->phy_reset_after_clk_enable) { >+ ret = fec_reset_phy(ndev); >+ if (ret) >+ goto failed_reset; >+ if (ndev->phydev) { >+ ret = phy_init_hw(ndev->phydev); >+ if (ret) >+ goto failed_reset; >+ } >+ } Since it is common issue so long as using the PHY, can you move it into smsc phy driver like in .smsc_phy_reset() function ? And get the reset
Re: [PATCH v2 3/6] cpufreq: schedutil: ensure max frequency while running RT/DL tasks
Hi, On Wed, Jul 5, 2017 at 6:41 AM, Patrick Bellasi wrote: [..] > >> But what about clearing the sched-class's flag from .pick_next_task() >> callback >> when they return NULL ? >> >> What about something like this instead (completely untested), with which we >> don't need the 2/3 patch as well: > > Just had a fast check but I think something like that can work. > We had an internal discussion with Brendan (in CC now) which had a > similar proposal. > > Main counter arguments for me was: > 1. we wanna to reduce the pollution of scheduling classes code with >schedutil specific code, unless strictly necessary > 2. we never had a "clear bit" semantics for flags updates > > Thus this proposal seemed to me less of a discontinuity wrt the > current interface. However, something similar to what you propose > below should also work. Let's collect some more feedback... > I was going to say something similar. I feel its much cleaner if the scheduler clears the flags that it set, thus taking "ownership" of the flag. I feel that will avoid complication like this where the governor has to peek into what's currently running and such (and also help with the suggestion I made for patch 2/3. Maybe the interface needs an extension for "clear flag" semantics? thanks, -Joel
Re: [git pull] vfs.git pile 10
On Thu, Jul 6, 2017 at 2:11 AM, Al Viro wrote: > uaccess str...() dead code removals. Side note: you left a couple of references to strlen_user() still in the tree. None of them *matter* (two comments and one declaration for the function that no longer exists), but it just strikes me as a bit of an odd oversight. Are those removals perhaps in other trees (and you didn't want to clash) or was it literally just "oops, I caught almost everything except for those three cases"? Linus
Re: [RFC][PATCH] exec: Use init rlimits for setuid exec
On Thu, Jul 6, 2017 at 10:10 PM, Kees Cook wrote: > On Thu, Jul 6, 2017 at 9:48 PM, Andy Lutomirski wrote: >> How about a much simpler solution: don't read rlimit at all in >> copy_strings(), let alone try to enforce it. Instead, just before the >> point of no return, check how much stack space is already used and, if >> it's more than an appropriate threshold (e.g. 1/4 of the rlimit), >> abort. Sure, this adds overhead if we're going to abort, but does >> that really matter? > > We should avoid using up tons of memory and then failing. Better to > cap it as we use it. Plumbing a sane value into this shouldn't be hard > at all. Just making this a hardcoded 2MB seems sane (1/4 of 8MB). > >> I don't see why using rlimit for layout control makes any sense >> whatsoever. Is there some historical reason we need that? As far as >> I can see (on insufficient inspection) is that the kernel is trying to >> guarantee that, if we have so much arg crap that our remaining stack >> is less than 128k, then we don't exceed our limit by a little bit. > > IIUC, this is a big deal on 32-bit. Unlimited stack triggers top-down > mmap instead of bottom-up. I mean, I'd be delighted to get rid of > this, but I thought it was relied on by userspace. I always say this backwards. :P Default is top-down (allocate at high addresses and work down toward low). With unlimited stack, allocations start at low addresses and work up. Here's the results (shown with randomize_va_space sysctl set to 0): $ ulimit -s 8192 $ cat /proc/self/maps 08048000-0805 r-xp fc:01 843/bin/cat 0805-08051000 r--p 7000 fc:01 843/bin/cat 08051000-08052000 rw-p 8000 fc:01 843/bin/cat 08052000-08073000 rw-p 00:00 0 [heap] b7be7000-b7de7000 r--p fc:01 403307 /usr/lib/locale/locale-archive b7de7000-b7f9a000 r-xp fc:01 276980 /lib/i386-linux-gnu/libc-2.24.so b7f9a000-b7f9b000 ---p 001b3000 fc:01 276980 /lib/i386-linux-gnu/libc-2.24.so b7f9b000-b7f9d000 r--p 001b3000 fc:01 276980 /lib/i386-linux-gnu/libc-2.24.so b7f9d000-b7f9e000 rw-p 001b5000 fc:01 276980 /lib/i386-linux-gnu/libc-2.24.so b7f9e000-b7fa1000 rw-p 00:00 0 b7fb2000-b7fd7000 rw-p 00:00 0 b7fd7000-b7fd9000 r--p 00:00 0 [vvar] b7fd9000-b7fdb000 r-xp 00:00 0 [vdso] b7fdb000-b7ffd000 r-xp fc:01 276959 /lib/i386-linux-gnu/ld-2.24.so b7ffd000-b7ffe000 r--p 002d7000 fc:01 403307 /usr/lib/locale/locale-archive b7ffe000-b7fff000 r--p 00022000 fc:01 276959 /lib/i386-linux-gnu/ld-2.24.so b7fff000-b800 rw-p 00023000 fc:01 276959 /lib/i386-linux-gnu/ld-2.24.so bffdf000-c000 rw-p 00:00 0 [stack] $ ulimit -s unlimited $ cat /proc/self/maps 08048000-0805 r-xp fc:01 843/bin/cat 0805-08051000 r--p 7000 fc:01 843/bin/cat 08051000-08052000 rw-p 8000 fc:01 843/bin/cat 08052000-08073000 rw-p 00:00 0 [heap] 4000-40022000 r-xp fc:01 276959 /lib/i386-linux-gnu/ld-2.24.so 40022000-40023000 r--p 002d7000 fc:01 403307 /usr/lib/locale/locale-archive 40023000-40024000 r--p 00022000 fc:01 276959 /lib/i386-linux-gnu/ld-2.24.so 40024000-40025000 rw-p 00023000 fc:01 276959 /lib/i386-linux-gnu/ld-2.24.so 40025000-40027000 r--p 00:00 0 [vvar] 40027000-40029000 r-xp 00:00 0 [vdso] 40029000-4004e000 rw-p 00:00 0 4005f000-40212000 r-xp fc:01 276980 /lib/i386-linux-gnu/libc-2.24.so 40212000-40213000 ---p 001b3000 fc:01 276980 /lib/i386-linux-gnu/libc-2.24.so 40213000-40215000 r--p 001b3000 fc:01 276980 /lib/i386-linux-gnu/libc-2.24.so 40215000-40216000 rw-p 001b5000 fc:01 276980 /lib/i386-linux-gnu/libc-2.24.so 40216000-40219000 rw-p 00:00 0 40219000-40419000 r--p fc:01 403307 /usr/lib/locale/locale-archive bffdf000-c000 rw-p 00:00 0 [stack] -- Kees Cook Pixel Security
[PATCH] HID: hid-logitech-hidpp: add NULL check on devm_kmemdup() return value
Check return value from call to devm_kmemdup() in order to prevent a NULL pointer dereference. Signed-off-by: Gustavo A. R. Silva --- drivers/hid/hid-logitech-hidpp.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/hid/hid-logitech-hidpp.c b/drivers/hid/hid-logitech-hidpp.c index 41b3946..501e16a 100644 --- a/drivers/hid/hid-logitech-hidpp.c +++ b/drivers/hid/hid-logitech-hidpp.c @@ -2732,6 +2732,9 @@ static int hidpp_initialize_battery(struct hidpp_device *hidpp) hidpp_battery_props, sizeof(hidpp_battery_props), GFP_KERNEL); + if (!battery_props) + return -ENOMEM; + num_battery_props = ARRAY_SIZE(hidpp_battery_props) - 2; if (hidpp->capabilities & HIDPP_CAPABILITY_BATTERY_MILEAGE) -- 2.5.0
Re: [PATCH] perf report: Fix broken arrow at row 0 in annotate view
Hi Arnaldo, Could this patch be merged? Otherwise the jump arrow is broken when it's displayed at the row 0 in annotate view. Thanks Jin Yao On 6/8/2017 2:01 PM, Jin Yao wrote: When the jump instruction is displayed at the row 0 in annotate view, the arrow is broken. An example: 16.86 │ ┌──je 82 0.01 │ movsd (%rsp),%xmm0 │ movsd 0x8(%rsp),%xmm4 │ movsd 0x8(%rsp),%xmm1 │ movsd (%rsp),%xmm3 │ divsd %xmm4,%xmm0 │ divsd %xmm3,%xmm1 │ movsd (%rsp),%xmm2 │ addsd %xmm1,%xmm0 │ addsd %xmm2,%xmm0 │ movsd %xmm0,(%rsp) │82: sub$0x1,%ebx 83.03 │↑ jne38 │ add$0x10,%rsp │ xor%eax,%eax │ pop%rbx │← retq The patch increments the row number before checking with 0. Signed-off-by: Jin Yao --- tools/perf/ui/browser.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tools/perf/ui/browser.c b/tools/perf/ui/browser.c index a4d3762..83874b0 100644 --- a/tools/perf/ui/browser.c +++ b/tools/perf/ui/browser.c @@ -704,7 +704,7 @@ static void __ui_browser__line_arrow_down(struct ui_browser *browser, ui_browser__gotorc(browser, row, column + 1); SLsmg_draw_hline(2); - if (row++ == 0) + if (++row == 0) goto out; } else row = 0;
Re: [RFC][PATCH] exec: Use init rlimits for setuid exec
On Thu, Jul 6, 2017 at 9:48 PM, Andy Lutomirski wrote: > How about a much simpler solution: don't read rlimit at all in > copy_strings(), let alone try to enforce it. Instead, just before the > point of no return, check how much stack space is already used and, if > it's more than an appropriate threshold (e.g. 1/4 of the rlimit), > abort. Sure, this adds overhead if we're going to abort, but does > that really matter? We should avoid using up tons of memory and then failing. Better to cap it as we use it. Plumbing a sane value into this shouldn't be hard at all. Just making this a hardcoded 2MB seems sane (1/4 of 8MB). > I don't see why using rlimit for layout control makes any sense > whatsoever. Is there some historical reason we need that? As far as > I can see (on insufficient inspection) is that the kernel is trying to > guarantee that, if we have so much arg crap that our remaining stack > is less than 128k, then we don't exceed our limit by a little bit. IIUC, this is a big deal on 32-bit. Unlimited stack triggers top-down mmap instead of bottom-up. I mean, I'd be delighted to get rid of this, but I thought it was relied on by userspace. -Kees -- Kees Cook Pixel Security
Re: [git pull] vfs.git pile 11
On Thu, Jul 6, 2017 at 2:20 PM, Al Viro wrote: > > Linus, could you hold that one back until tomorrow? I want to tweak the > last commit in there a bit, but I want to give it a local beating first... Ok, dropping this one. All your other branches are merged now. Linus
Re: [RFC][PATCH] exec: Use init rlimits for setuid exec
On Thu, Jul 6, 2017 at 9:48 PM, Andy Lutomirski wrote: > > How about a much simpler solution: don't read rlimit at all in > copy_strings(), let alone try to enforce it. People have historically relied on E2BIG and then splitting things into multiple chunks (ie do the whole 'xargs' thing). But I agree that *if* we abort cleanly with E2BIG, then it's fine if we copy a bit too much first. It's just that we need to check early enough that an E2BIG error is possible (ie not after the "point of no return"). And yes, I think we can get rid of the layout differences based on rlimit. After all, if I read that right the thing currently clamps that "gap" thing to 128MB anyway (and that's on the *low* side). Linus
Re: [PATCH] PCI: Do not enable extended tags on pre-dated (v1.x) systems
On Wed, Jul 5, 2017 at 9:19 PM, Sinan Kaya wrote: > According to extended tags ECN document, all PCIe receivers are expected > to support extended tags support. It should be safe to enable extended > tags on endpoints without checking compatibility. > > This assumption seems to be working fine except for the legacy systems. > The ECN has been written against PCIE spec version 2.0. Therefore, we need > to exclude all version 1.0 devices from this change as there is HW out > there that can't handle extended tags. > > Note that the default value of Extended Tags Enable bit is implementation > specific. Therefore, we are clearing the bit by default when incompatible > HW is found without assuming that value is zero. > > Reported-by: Wim ten Have > Link: > https://pcisig.com/sites/default/files/specification_documents/ECN_Extended_Tag_Enable_Default_05Sept2008_final.pdf > Link: https://bugzilla.redhat.com/show_bug.cgi?id=1467674 > Fixes: 60db3a4d8cc9 ("PCI: Enable PCIe Extended Tags if supported") > Tested-by: Wim ten Have > Signed-off-by: Sinan Kaya > --- > drivers/pci/probe.c | 52 +--- > 1 file changed, 45 insertions(+), 7 deletions(-) > > diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c > index 19c8950..5e39013 100644 > --- a/drivers/pci/probe.c > +++ b/drivers/pci/probe.c > @@ -1684,21 +1684,58 @@ static void program_hpp_type2(struct pci_dev *dev, > struct hpp_type2 *hpp) > */ > } > > -static void pci_configure_extended_tags(struct pci_dev *dev) > +static bool pcie_bus_exttags_supported(struct pci_bus *bus) > +{ > + bool exttags_supported = true; > + struct pci_dev *bridge; > + int rc; > + u16 flags; > + > + bridge = bus->self; > + while (bridge) { > + if (pci_is_pcie(bridge)) { > + rc = pcie_capability_read_word(bridge, PCI_EXP_FLAGS, > + &flags); > + if (!rc && ((flags & PCI_EXP_FLAGS_VERS) < 2)) { > + exttags_supported = false; > + break; > + } > + } > + if (!bridge->bus->parent) > + break; > + bridge = bridge->bus->parent->self; > + } > + > + return exttags_supported; > +} > + > +static int pcie_bus_configure_exttags(struct pci_dev *dev, void *data) > { > u32 dev_cap; > int ret; > + bool supported; > > if (!pci_is_pcie(dev)) > - return; > + return 0; > > ret = pcie_capability_read_dword(dev, PCI_EXP_DEVCAP, &dev_cap); > if (ret) > - return; > + return 0; > > - if (dev_cap & PCI_EXP_DEVCAP_EXT_TAG) > - pcie_capability_set_word(dev, PCI_EXP_DEVCTL, > -PCI_EXP_DEVCTL_EXT_TAG); > + if (dev_cap & PCI_EXP_DEVCAP_EXT_TAG) { > + supported = pcie_bus_exttags_supported(dev->bus); > + Maybe checking the version of this endpoint at first? Do you expect a v1 endpoint to be working under v2+ ports? -- Thanks, Jike
[PATCH v4 2/2] perf report: Implement visual marker for macro fusion in annotate
For marking the fused instructions clearly, This patch adds a line before the first instruction of pair and joins it with the arrow of the jump. For example, when je is selected in annotate view, the line before cmpl is displayed and joins the arrow of je. │ ┌──cmpl $0x0,argp_program_version_hook 81.93 │ ├──je 20 │ │ lock cmpxchg %esi,0x38a9a4(%rip) │ │↓ jne29 │ │↓ jmp43 11.47 │20:└─→cmpxch %esi,0x38a999(%rip) That means the cmpl+je is fused instruction pair and they should be considered together. Change-log: --- v4: Since the first patch in patch set is changed, this is mainly changed for compilation. v3: Use Arnaldo's fix to let the display be better. To get the evsel->evlist->env->cpuid, save the evsel in annotate_browser. v2: No more changes, just uses a new function "ins__is_fused" to check if the instructions are fused. v1: Initial post Signed-off-by: Jin Yao --- tools/perf/ui/browser.c | 29 + tools/perf/ui/browser.h | 2 ++ tools/perf/ui/browsers/annotate.c | 26 ++ tools/perf/util/annotate.c| 5 + tools/perf/util/annotate.h| 1 + 5 files changed, 63 insertions(+) diff --git a/tools/perf/ui/browser.c b/tools/perf/ui/browser.c index a4d3762..9ef7677 100644 --- a/tools/perf/ui/browser.c +++ b/tools/perf/ui/browser.c @@ -738,6 +738,35 @@ void __ui_browser__line_arrow(struct ui_browser *browser, unsigned int column, __ui_browser__line_arrow_down(browser, column, start, end); } +void ui_browser__mark_fused(struct ui_browser *browser, unsigned int column, + unsigned int row, bool arrow_down) +{ + unsigned int end_row; + + if (row >= browser->top_idx) + end_row = row - browser->top_idx; + else + return; + + SLsmg_set_char_set(1); + + if (arrow_down) { + ui_browser__gotorc(browser, end_row, column - 1); + SLsmg_write_char(SLSMG_ULCORN_CHAR); + ui_browser__gotorc(browser, end_row, column); + SLsmg_draw_hline(2); + ui_browser__gotorc(browser, end_row + 1, column - 1); + SLsmg_write_char(SLSMG_LTEE_CHAR); + } else { + ui_browser__gotorc(browser, end_row, column - 1); + SLsmg_write_char(SLSMG_LTEE_CHAR); + ui_browser__gotorc(browser, end_row, column); + SLsmg_draw_hline(2); + } + + SLsmg_set_char_set(0); +} + void ui_browser__init(void) { int i = 0; diff --git a/tools/perf/ui/browser.h b/tools/perf/ui/browser.h index be3b70e..a12eff7 100644 --- a/tools/perf/ui/browser.h +++ b/tools/perf/ui/browser.h @@ -43,6 +43,8 @@ void ui_browser__printf(struct ui_browser *browser, const char *fmt, ...); void ui_browser__write_graph(struct ui_browser *browser, int graph); void __ui_browser__line_arrow(struct ui_browser *browser, unsigned int column, u64 start, u64 end); +void ui_browser__mark_fused(struct ui_browser *browser, unsigned int column, + unsigned int row, bool arrow_down); void __ui_browser__show_title(struct ui_browser *browser, const char *title); void ui_browser__show_title(struct ui_browser *browser, const char *title); int ui_browser__show(struct ui_browser *browser, const char *title, diff --git a/tools/perf/ui/browsers/annotate.c b/tools/perf/ui/browsers/annotate.c index ae05ebb..b376637 100644 --- a/tools/perf/ui/browsers/annotate.c +++ b/tools/perf/ui/browsers/annotate.c @@ -273,6 +273,25 @@ static bool disasm_line__is_valid_jump(struct disasm_line *dl, struct symbol *sy return true; } +static bool is_fused(struct annotate_browser *ab, struct disasm_line *cursor) +{ + struct disasm_line *pos = list_prev_entry(cursor, node); + const char *name; + + if (!pos) + return false; + + if (ins__is_lock(&pos->ins)) + name = pos->ops.locked.ins.name; + else + name = pos->ins.name; + + if (!name || !cursor->ins.name) + return false; + + return ins__is_fused(ab->arch, name, cursor->ins.name); +} + static void annotate_browser__draw_current_jump(struct ui_browser *browser) { struct annotate_browser *ab = container_of(browser, struct annotate_browser, b); @@ -308,6 +327,13 @@ static void annotate_browser__draw_current_jump(struct ui_browser *browser) ui_browser__set_color(browser, HE_COLORSET_JUMP_ARROWS); __ui_browser__line_arrow(browser, pcnt_width + 2 + ab->addr_width, from, to); + + if (is_fused(ab, cursor)) { + ui_browser__mark_fused(browser, + pcnt_width + 3 + ab->addr_width, + from - 1, + to > from ? tru
[PATCH v4 1/2] perf util: Check for fused instruction
Macro fusion merges two instructions to a single micro-op. Intel core platform performs this hardware optimization under limited circumstances. For example, CMP + JCC can be "fused" and executed /retired together. While with sampling this can result in the sample sometimes being on the JCC and sometimes on the CMP. So for the fused instruction pair, they could be considered together. On Nehalem, fused instruction pairs: cmp/test + jcc. On other new CPU: cmp/test/add/sub/and/inc/dec + jcc. This patch adds an x86-specific function which checks if 2 instructions are in a "fused" pair. For non-x86 arch, the function is just NULL. Change-log: --- v4: Move the CPU model checking to symbol__disassemble and save the CPU family/model in arch structure. It avoids checking every time when jump arrow printed. v3: Add checking for Nehalem (CMP, TEST). For other newer Intel CPUs just check it by default (CMP, TEST, ADD, SUB, AND, INC, DEC). v2: Remove the origial weak function. Arnaldo points out that doing it as a weak function that will be overridden by the host arch doesn't work. So now it's implemented as an arch-specific function. v1: Initial post Signed-off-by: Jin Yao --- tools/perf/arch/x86/annotate/instructions.c | 46 + tools/perf/builtin-top.c| 2 +- tools/perf/ui/browsers/annotate.c | 4 ++- tools/perf/ui/gtk/annotate.c| 2 +- tools/perf/util/annotate.c | 22 -- tools/perf/util/annotate.h | 3 +- 6 files changed, 73 insertions(+), 6 deletions(-) diff --git a/tools/perf/arch/x86/annotate/instructions.c b/tools/perf/arch/x86/annotate/instructions.c index c1625f2..d84b720 100644 --- a/tools/perf/arch/x86/annotate/instructions.c +++ b/tools/perf/arch/x86/annotate/instructions.c @@ -76,3 +76,49 @@ static struct ins x86__instructions[] = { { .name = "xbeginq",.ops = &jump_ops, }, { .name = "retq", .ops = &ret_ops, }, }; + +static bool x86__ins_is_fused(struct arch *arch, const char *ins1, + const char *ins2) +{ + if (arch->family != 6 || arch->model < 0x1e || strstr(ins2, "jmp")) + return false; + + if (arch->model == 0x1e) { + /* Nehalem */ + if ((strstr(ins1, "cmp") && !strstr(ins1, "xchg")) || +strstr(ins1, "test")) { + return true; + } + } else { + /* Newer platform */ + if ((strstr(ins1, "cmp") && !strstr(ins1, "xchg")) || +strstr(ins1, "test") || +strstr(ins1, "add") || +strstr(ins1, "sub") || +strstr(ins1, "and") || +strstr(ins1, "inc") || +strstr(ins1, "dec")) { + return true; + } + } + + return false; +} + +static int x86__cpuid_parse(struct arch *arch, char *cpuid) +{ + unsigned int family, model, stepping; + int ret; + + /* +* cpuid = "GenuineIntel,family,model,stepping" +*/ + ret = sscanf(cpuid, "%*[^,],%u,%u,%u", &family, &model, &stepping); + if (ret == 3) { + arch->family = family; + arch->model = model; + return 0; + } + + return -1; +} diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c index 6052376..022486d 100644 --- a/tools/perf/builtin-top.c +++ b/tools/perf/builtin-top.c @@ -134,7 +134,7 @@ static int perf_top__parse_source(struct perf_top *top, struct hist_entry *he) return err; } - err = symbol__disassemble(sym, map, NULL, 0, NULL); + err = symbol__disassemble(sym, map, NULL, 0, NULL, NULL); if (err == 0) { out_assign: top->sym_filter_entry = he; diff --git a/tools/perf/ui/browsers/annotate.c b/tools/perf/ui/browsers/annotate.c index 27f41f2..ae05ebb 100644 --- a/tools/perf/ui/browsers/annotate.c +++ b/tools/perf/ui/browsers/annotate.c @@ -9,6 +9,7 @@ #include "../../util/symbol.h" #include "../../util/evsel.h" #include "../../util/config.h" +#include "../../util/evlist.h" #include #include #include @@ -1074,7 +1075,8 @@ int symbol__tui_annotate(struct symbol *sym, struct map *map, } err = symbol__disassemble(sym, map, perf_evsel__env_arch(evsel), - sizeof_bdl, &browser.arch); + sizeof_bdl, &browser.arch, + evsel->evlist->env->cpuid); if (err) { char msg[BUFSIZ]; symbol__strerror_disassemble(sym, map, err, msg, sizeof(msg)); diff --git a/tools/perf/ui/gtk/annotate.c b/tools/perf/ui/gtk/annotate.c index d903fd4..87e3760 100644 --- a/tools/perf/ui/gtk/annotate.c +++ b/tools/perf/ui/gtk/annotate.c @@ -169,7 +169,7 @@ static i
[PATCH v4 0/2] perf report: Implement visual marker for macro fusion in annotate
Macro fusion merges two instructions to a single micro-op. Intel core platform performs this hardware optimization under limited circumstances. For example, CMP + JCC can be "fused" and executed /retired together. While with sampling this can result in the sample sometimes being on the JCC and sometimes on the CMP. So for the fused instruction pair, they could be considered together. On Nehalem, fused instruction pairs: cmp/test + jcc. On other new CPU: cmp/test/add/sub/and/inc/dec + jcc. This patch series marks the case clearly by joining the fused instruction pair in the arrow of the jump. For example: │ ┌──cmpl $0x0,argp_program_version_hook 81.93 │ ├──je 20 │ │ lock cmpxchg %esi,0x38a9a4(%rip) │ │↓ jne29 │ │↓ jmp43 11.47 │20:└─→cmpxch %esi,0x38a999(%rip) Change-log: --- v4: Move the CPU model checking to symbol__disassemble and save the family/model in arch structure. It avoids checking everytime when display the jump arrows. The patch set is still performing the fused checking when user moves the cursor on the jump instruction. v3: 1. Add checking for Nehalem (CMP, TEST). For other newer Intel CPUs just check it by default (CMP, TEST, ADD, SUB, AND, INC, DEC). 2. Use Arnaldo's fix to let the display be better v2: According to Arnaldo's comments, remove the weak function and use an arch-specific function instead to check fused instruction pair. v1: Inital post Jin Yao (2): perf util: Check for fused instruction perf report: Implement visual marker for macro fusion in annotate tools/perf/arch/x86/annotate/instructions.c | 46 + tools/perf/builtin-top.c| 2 +- tools/perf/ui/browser.c | 29 ++ tools/perf/ui/browser.h | 2 ++ tools/perf/ui/browsers/annotate.c | 30 ++- tools/perf/ui/gtk/annotate.c| 2 +- tools/perf/util/annotate.c | 27 +++-- tools/perf/util/annotate.h | 4 ++- 8 files changed, 136 insertions(+), 6 deletions(-) -- 2.7.4
Re: [PATCH] iio: multiplexer: add NULL check on devm_kzalloc() and devm_kmemdup() return values
On 2017-07-07 06:53, Gustavo A. R. Silva wrote: > Check return values from call to devm_kzalloc() and devm_kmemup() If someone cares enough: s/devm_kmemup/evm_kmemdup/ > in order to prevent a NULL pointer dereference. > > This issue was detected using Coccinelle and the following semantic patch: > > @@ > expression x; > identifier fld; > @@ > > * x = devm_kzalloc(...); >... when != x == NULL >x->fld > > Cc: Peter Rosin > Signed-off-by: Gustavo A. R. Silva Either way, Reviewed-by: Peter Rosin Thanks!
linux-next: Tree for Jul 7
Hi all, Please do not add any v4.14 material to you linux-next included branches until after v4.13-rc1 has been released. Changes since 20170706: The f2fs tree gained a conflict against Linus' tree. The akpm tree lost a patch that turned up elsewhere. Non-merge commits (relative to Linus' tree): 4006 3387 files changed, 456543 insertions(+), 62457 deletions(-) I have created today's linux-next tree at git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git (patches at http://www.kernel.org/pub/linux/kernel/next/ ). If you are tracking the linux-next tree using git, you should not use "git pull" to do so as that will try to merge the new linux-next release with the old one. You should use "git fetch" and checkout or reset to the new master. You can see which trees have been included by looking in the Next/Trees file in the source. There are also quilt-import.log and merge.log files in the Next directory. Between each merge, the tree was built with a ppc64_defconfig for powerpc and an allmodconfig (with CONFIG_BUILD_DOCSRC=n) for x86_64, a multi_v7_defconfig for arm and a native build of tools/perf. After the final fixups (if any), I do an x86_64 modules_install followed by builds for x86_64 allnoconfig, powerpc allnoconfig (32 and 64 bit), ppc44x_defconfig, allyesconfig and pseries_le_defconfig and i386, sparc and sparc64 defconfig. And finally, a simple boot test of the powerpc pseries_le_defconfig kernel in qemu. Below is a summary of the state of the merge. I am currently merging 266 trees (counting Linus' and 41 trees of bug fix patches pending for the current merge release). Stats about the size of the tree over time can be seen at http://neuling.org/linux-next-size.html . Status of my local build tests will be at http://kisskb.ellerman.id.au/linux-next . If maintainers want to give advice about cross compilers/configs that work, we are always open to add more builds. Thanks to Randy Dunlap for doing many randconfig builds. And to Paul Gortmaker for triage and bug fixes. -- Cheers, Stephen Rothwell $ git checkout master $ git reset --hard stable Merging origin/master (90311148415a Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi) Merging fixes/master (b4b8cbf679c4 Cavium CNN55XX: fix broken default Kconfig entry) Merging kbuild-current/fixes (ad8181060788 kconfig: fix sparse warnings in nconfig) Merging arc-current/for-curr (c0bc126f97fb Linux 4.12-rc7) Merging arm-current/fixes (9e25ebfe56ec ARM: 8685/1: ensure memblock-limit is pmd-aligned) Merging m68k-current/for-linus (204a2be30a7a m68k: Remove ptrace_signal_deliver) Merging metag-fixes/fixes (b884a190afce metag/usercopy: Add missing fixups) Merging powerpc-fixes/fixes (d6bd8194e286 powerpc/32: Avoid miscompilation w/GCC 4.6.3 - don't inline copy_to/from_user()) Merging sparc/master (dbd2667a4fb9 sparc64: Fix gup_huge_pmd) Merging fscrypt-current/for-stable (42d97eb0ade3 fscrypt: fix renaming and linking special files) Merging net/master (f630c38ef0d7 vrf: fix bug_on triggered by rx when destroying a vrf) Merging ipsec/master (ca3a1b856636 esp6_offload: Fix IP6CB(skb)->nhoff for ESP GRO) Merging netfilter/master (c644bd79c0a7 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf) Merging ipvs/master (3c5ab3f395d6 ipvs: SNAT packet replies only for NATed connections) Merging wireless-drivers/master (35abcd4f9f30 brcmfmac: fix uninitialized warning in brcmf_usb_probe_phase2()) Merging mac80211/master (4b153ca989a9 Merge tag 'mac80211-for-davem-2017-06-16' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211) Merging sound-current/for-linus (6ede2b7df92f ALSA: opl4: Move inline before return type) Merging pci-current/for-linus (98dbf5af4fdd PCI: endpoint: Select CRC32 to fix test build error) Merging driver-core.current/driver-core-linus (650fc870a2ef Merge tag 'docs-4.13' of git://git.lwn.net/linux) Merging tty.current/tty-linus (650fc870a2ef Merge tag 'docs-4.13' of git://git.lwn.net/linux) Merging usb.current/usb-linus (dec08194ffec xhci: Limit USB2 port wake support for AMD Promontory hosts) Merging usb-gadget-fixes/fixes (f50b878fed33 USB: gadget: fix GPF in gadgetfs) Merging usb-serial-fixes/usb-linus (996fab55d864 USB: serial: qcserial: new Sierra Wireless EM7305 device ID) Merging usb-chipidea-fixes/ci-for-usb-stable (cbb22ebcfb99 usb: chipidea: core: check before accessing ci_role in ci_role_show) Merging phy/fixes (9605bc46433d phy: qualcomm: phy-qcom-qmp: fix application of sizeof to pointer) Merging staging.current/staging-linus (650fc870a2ef Merge tag 'docs-4.13' of git://git.lwn.net/linux) Merging char-misc.current/char-misc-linus (650fc870a2ef Merge tag 'docs-4.13' of git://git.lwn.net/linux) Merging input-current/for-linus (ede2e7cdc58e Merge
[PATCH] arcnet: com20020-pci: Fix an error handling path in 'com20020pci_probe()'
If this memory allocation fails, we should go through the error handling path as done everywhere else in this function before returning. Signed-off-by: Christophe JAILLET --- drivers/net/arcnet/com20020-pci.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/net/arcnet/com20020-pci.c b/drivers/net/arcnet/com20020-pci.c index 239de38fbd6a..8e89a2ac071e 100644 --- a/drivers/net/arcnet/com20020-pci.c +++ b/drivers/net/arcnet/com20020-pci.c @@ -196,8 +196,10 @@ static int com20020pci_probe(struct pci_dev *pdev, card = devm_kzalloc(&pdev->dev, sizeof(struct com20020_dev), GFP_KERNEL); - if (!card) - return -ENOMEM; + if (!card) { + ret = -ENOMEM; + goto out_port; + } card->index = i; card->pci_priv = priv; -- 2.11.0
[PATCH] iio: multiplexer: add NULL check on devm_kzalloc() and devm_kmemdup() return values
Check return values from call to devm_kzalloc() and devm_kmemup() in order to prevent a NULL pointer dereference. This issue was detected using Coccinelle and the following semantic patch: @@ expression x; identifier fld; @@ * x = devm_kzalloc(...); ... when != x == NULL x->fld Cc: Peter Rosin Signed-off-by: Gustavo A. R. Silva --- Changes in v2: Add NULL check on devm_kmemup() return value. drivers/iio/multiplexer/iio-mux.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/drivers/iio/multiplexer/iio-mux.c b/drivers/iio/multiplexer/iio-mux.c index 37ba007..74831fc 100644 --- a/drivers/iio/multiplexer/iio-mux.c +++ b/drivers/iio/multiplexer/iio-mux.c @@ -285,6 +285,9 @@ static int mux_configure_channel(struct device *dev, struct mux *mux, child->ext_info_cache = devm_kzalloc(dev, sizeof(*child->ext_info_cache) * num_ext_info, GFP_KERNEL); + if (!child->ext_info_cache) + return -ENOMEM; + for (i = 0; i < num_ext_info; ++i) { child->ext_info_cache[i].size = -1; @@ -309,6 +312,9 @@ static int mux_configure_channel(struct device *dev, struct mux *mux, child->ext_info_cache[i].data = devm_kmemdup(dev, page, ret + 1, GFP_KERNEL); + if (!child->ext_info_cache[i].data) + return -ENOMEM; + child->ext_info_cache[i].data[ret] = 0; child->ext_info_cache[i].size = ret; } -- 2.5.0
Re: [PATCH] iio: multiplexer: add NULL check on devm_kzalloc() return value
Quoting Peter Rosin : On 2017-07-07 06:35, Gustavo A. R. Silva wrote: Hi Peter, Quoting Peter Rosin : On 2017-07-07 00:08, Gustavo A. R. Silva wrote: Check return value from call to devm_kzalloc() in order to prevent a NULL pointer dereference. Right, thanks for finding that one! There's another one inside the for loop that is just starting in the context of this patch. Care to fix checking the return value of that devm_kmemdup as well? Sure, I'll send a new patch shortly. And someone should perhaps teach Coccinelle about devm_kmemdup... Good catch, I just implemented that script. This issue was detected using Coccinelle and the following semantic patch: @@ expression x; identifier fld; @@ * x = devm_kzalloc(...); ... when != x == NULL x->fld One of these blank lines should perhaps be a "Fixes:" tag? mmm, I don't get this... If you add a Fixes-tag, like below, you help the stable kernel maintainers decide what to look at. In this case it might be overkill since the thing you fix is so fresh and does not apply to any old kernel. But I think it is a good habit... Fixes: 7ba9df54b091 ("iio: multiplexer: new iio category and iio-mux driver") (and it is a bit unusual to see two blank lines before the SoB-tag) Sorry for not spelling it out the first time. Oh, I get it now. Thank you very much for clarifying. I will definitely keep that in mind for future fixes when that could apply. Thanks! -- Gustavo A. R. Silva
Re: [RFC][PATCH] exec: Use init rlimits for setuid exec
On Thu, Jul 6, 2017 at 12:12 PM, Kees Cook wrote: > On Thu, Jul 6, 2017 at 10:52 AM, Linus Torvalds > wrote: >> On Thu, Jul 6, 2017 at 10:29 AM, Kees Cook wrote: (a) minimal: just use our existing default stack (and stack _only_) limit value for suid binaries that actually get extra permissions: { _STK_LIM, RLIM_INFINITY }. >>> >>> This would look a lot like the existing patch; it'd just not copy the >>> init process rlimits. >> >> Can't we just do the final rlimit setting so late in execve that we >> don't need that whole "saved_rlimit" thing? > > The stack rlimit defines the mmap layout too: > > do_execveat_common() -> > exec_binprm() -> > search_binary_handler() -> > fmt->load_binary (load_elf_binary()) -> > setup_new_exec() -> > arch_pick_mmap_layout() -> > mmap_is_legacy() -> > rlimit(RLIMIT_STACK) == RLIM_INFINITY > > exec_binprm() happens after the other stack setup (copy_strings()), so > if we wanted to avoid saved_rlimit, we'd have to replumb how > arch_pick_mmap_layout() works and how copy_strings() performs its > calculations (neither looks too terrible). How about a much simpler solution: don't read rlimit at all in copy_strings(), let alone try to enforce it. Instead, just before the point of no return, check how much stack space is already used and, if it's more than an appropriate threshold (e.g. 1/4 of the rlimit), abort. Sure, this adds overhead if we're going to abort, but does that really matter? I don't see why using rlimit for layout control makes any sense whatsoever. Is there some historical reason we need that? As far as I can see (on insufficient inspection) is that the kernel is trying to guarantee that, if we have so much arg crap that our remaining stack is less than 128k, then we don't exceed our limit by a little bit.
Re: [PATCH v2] arm: eBPF JIT compiler
Okay Kees. I will take a look at it. Best, Shubham Bansal On Fri, Jul 7, 2017 at 10:12 AM, Kees Cook wrote: > On Wed, Jul 5, 2017 at 8:49 PM, Shubham Bansal > wrote: >> Hi Kees, >> >> Problem is my ARM machine don't have clang and iproute2 which is >> keeping me from testing the bpf tail calls. >> >> You should do the following to test it,. >> >> 1. tools/testing/selftests/bpf/ >> 2. make >> 3. sudo ./test_progs >> >> And, before testing, you have to do "make headers_install". >> These tests are for tail calls with the attached patch. If its too >> much work, Can you please upload your arm image so that I can test it? >> I just need a good machine. > > I've got all this set up now, and it faults during the test: > > Unable to handle kernel NULL pointer dereference at virtual address 0008 > ... > CPU: 0 PID: 1922 Comm: test_progs Not tainted 4.12.0+ #60 > ... > PC is at __htab_map_lookup_elem+0x54/0x1f4 > > I'll see if I can send you this disk image... > > -Kees > > > -- > Kees Cook > Pixel Security
Re: [PATCH 2/2] serial: earlycon: Make early_con as __initdata
On (07/06/17 11:38), Matt Redfearn wrote: > All early console drivers that may be registered as the earlycon are > marked __init to be placed in the init section. The drivers' code and > data are freed during free_initmem_default() but the early console is > not unregistered in printk_late_init() as the init_section_intersects() > test fails. This leads to the earlycon still being active, potentially > with dangling pointers into the freed init section, which may have been > poisoned by the slab debugger. Attempting to use the boot console after > this will likely lead to weird behaviour and kernel crashes. people want to use early-con as a panic() console fallback. -ss > Fix this by marking the early_con struct __initdata so that the > init_section_intersects() test succeeds and the console is unregistered > in printk_late_init() before the init section is freed. > > The 8250 earlycon, on which the generic earlycon was based, had these > attributes on it's console struct. The switch to the generic > implementation in commit d2fd6810a823 ("tty/serial: convert 8250 to > generic earlycon") appears to have broken unregistraton of the boot > console when its code and data are in __init. > > Fixes: d2fd6810a823 ("tty/serial: convert 8250 to generic earlycon") > Signed-off-by: Matt Redfearn > > --- > > drivers/tty/serial/earlycon.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/drivers/tty/serial/earlycon.c b/drivers/tty/serial/earlycon.c > index c3651540e1ba..388aaafcca2b 100644 > --- a/drivers/tty/serial/earlycon.c > +++ b/drivers/tty/serial/earlycon.c > @@ -29,13 +29,13 @@ > > #include > > -static struct console early_con = { > +static struct console early_con __initdata = { > .name = "uart", /* fixed up at earlycon registration */ > .flags =CON_PRINTBUFFER | CON_BOOT, > .index =0, > }; > > -static struct earlycon_device early_console_dev = { > +static struct earlycon_device early_console_dev __initdata = { > .con = &early_con, > }; > > -- > 2.7.4 >
Re: [PATCH] iio: multiplexer: add NULL check on devm_kzalloc() return value
On 2017-07-07 06:35, Gustavo A. R. Silva wrote: > Hi Peter, > > Quoting Peter Rosin : > >> On 2017-07-07 00:08, Gustavo A. R. Silva wrote: >>> Check return value from call to devm_kzalloc() >>> in order to prevent a NULL pointer dereference. >> >> Right, thanks for finding that one! There's another one inside the >> for loop that is just starting in the context of this patch. Care >> to fix checking the return value of that devm_kmemdup as well? >> > > Sure, I'll send a new patch shortly. > >> And someone should perhaps teach Coccinelle about devm_kmemdup... >> > > Good catch, I just implemented that script. > >>> This issue was detected using Coccinelle and the following semantic patch: >>> >>> @@ >>> expression x; >>> identifier fld; >>> @@ >>> >>> * x = devm_kzalloc(...); >>> ... when != x == NULL >>> x->fld >>> >>> >> >> One of these blank lines should perhaps be a "Fixes:" tag? >> > > mmm, I don't get this... If you add a Fixes-tag, like below, you help the stable kernel maintainers decide what to look at. In this case it might be overkill since the thing you fix is so fresh and does not apply to any old kernel. But I think it is a good habit... Fixes: 7ba9df54b091 ("iio: multiplexer: new iio category and iio-mux driver") (and it is a bit unusual to see two blank lines before the SoB-tag) Sorry for not spelling it out the first time. Cheers, peda
Re: [x86/time] 03fa63cc96: ACPI_Error:Table[DMAR]is_not_invalidated_during_early_boot_stage(#/tbxface -#)
Hi Thomas, At 07/07/2017 11:04 AM, Ye Xiaolong wrote: On 07/07, Dou Liyang wrote: Hi xiaolong, Really thanks for your testing. At 07/07/2017 09:54 AM, Ye Xiaolong wrote: On 07/06, Thomas Gleixner wrote: On Thu, 6 Jul 2017, kernel test robot wrote: commit: 03fa63cc96ab35592e0a7d522b8edbc1e6b02d22 ("x86/time: Initialize interrupt mode behind timer init") ++++ || 43436935b7 | 03fa63cc96 | ++++ | boot_successes | 0 | 4 | ++++ So 03fa63cc96 makes the box boot again. I'm confused as usual by the output of this tool., kern :info : [0.005000] tsc: Fast TSC calibration using PIT kern :info : [0.006000] tsc: Detected 2195.020 MHz processor kern :info : [0.007000] Calibrating delay loop (skipped), value calculated using timer frequency.. 4390.04 BogoMIPS (lpj=2195020) kern :info : [0.008001] pid_max: default: 90112 minimum: 704 kern :info : [0.009037] ACPI: Core revision 20170303 kern :err : [0.010002] ACPI Error: Table [DMAR] is not invalidated during early boot stage (20170303/tbxface-193) Sure we have a error message here, but compared to what? Compared to something which does not boot at all? Sorry for the confusion, here commit 43436935b7 boot failed due to OOM which happened at the late stage of kernel boot while the ACPI error showed at the early boot stage for commit 03fa63cc96 and it didn't appear in 43436935b7's dmesg. let's make the problem clearly firstly: 1) Commit 43436935b7 ("x86/xen: Bypass intr mode setup in enlighten_pv system") made kernel boot failed, which caused by OOM. It is so interesting! This patch only work for XEN in pv mode, if we boot a kernel w/o XEN, it will not work in fact. 2) Commit 03fa63cc96 ("x86/time: Initialize interrupt mode behind timer init") can make the kernel boot success again, but with an ACPI error happened. Indeed, the ACPI Error appeared with this patch. If x2apic is enabled in x86-64, initializing interrupt mode contains irq remapping setup and the ACPI DMAR table initialization. default_setup_apic_routing() enable_IR_x2apic() irq_remapping_prepare() intel_prepare_irq_remapping() dmar_table_init() this patch make interrupt mode setup before acpi_early_init() which has the check of ACPI table state in acpi_reallocate_root_table() where the ACPI Error generated. I have got a machine which supports x2apic. I am tracking the code, try to find out more. Thanks, dou. And both *1* and *2* used the same configuration showed in the attachment. Does anything I missed? Yes, this is exactly what I meant. Thanks, Xiaolong Thanks, dou. Thanks, Xiaolong Thanks, tglx
Re: [PATCH 1/2] printk: Unconditionally unregister boot consoles if in init section
On (07/06/17 11:38), Matt Redfearn wrote: > Commit 4c30c6f566c0 ("kernel/printk: do not turn off bootconsole in > printk_late_init() if keep_bootcon") added a check on keep_bootcon to > ensure that boot consoles were kept around until the real console is > registered. > This can lead to problems if the boot console data and code are in the > init section, since it can be freed before the boot console is > deregistered. This was fixed by commit 81cc26f2bd11 ("printk: only > unregister boot consoles when necessary"). > The keep_bootcon flag prevents the unregistration of a boot console, > even if it's data and code reside in the init section and are about to > be freed. This can lead to crashes and weird panics when the bootconsole > is accessed after free, especially if page poisoning is in use and the > code / data have been overwritten with a poison value. > To prevent this, always free the boot console if it is within the init > section. > > Fixes 81cc26f2bd11 ("printk: only unregister boot consoles when necessary"). > Signed-off-by: Matt Redfearn well... hm. don't specify `keep_bootcon' if your bootcon access initdata? keeping `early_printk' sometimes can be helpful and people even want to use `early_printk' as a panic() console fallback (because of those nasty deadlocks that spoil printk->call_console_drivers()). if someone asked to `keep_bootcon' but we actually don't keep it, then what's the purpose of the flag and pr_info("debug: skip boot console de-registration.\n")? > diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c > index a1db38abac5b..b5411f44eed4 100644 > --- a/kernel/printk/printk.c > +++ b/kernel/printk/printk.c > @@ -2660,7 +2660,7 @@ static int __init printk_late_init(void) > int ret; > > for_each_console(con) { > - if (!keep_bootcon && con->flags & CON_BOOT) { > + if (con->flags & CON_BOOT) { > /* >* Make sure to unregister boot consoles whose data >* resides in the init section before the init section what about bootconsoles that are still not getting de-registered due to keep_bootcon in register_console()? -ss
Re: [PATCH v2 2/6] cpufreq: schedutil: reset sg_cpus's flags at IDLE enter
On Tue, Jul 4, 2017 at 10:34 AM, Patrick Bellasi wrote: > Currently, sg_cpu's flags are set to the value defined by the last call of > the cpufreq_update_util()/cpufreq_update_this_cpu(); for RT/DL classes > this corresponds to the SCHED_CPUFREQ_{RT/DL} flags always being set. > > When multiple CPU shares the same frequency domain it might happen that a > CPU which executed a RT task, right before entering IDLE, has one of the > SCHED_CPUFREQ_RT_DL flags set, permanently, until it exits IDLE. > > Although such an idle CPU is _going to be_ ignored by the > sugov_next_freq_shared(): > 1. this kind of "useless RT requests" are ignored only if more then > TICK_NSEC have elapsed since the last update > 2. we can still potentially trigger an already too late switch to > MAX, which starts also a new throttling interval > 3. the internal state machine is not consistent with what the > scheduler knows, i.e. the CPU is now actually idle > > Thus, in sugov_next_freq_shared(), where utilisation and flags are > aggregated across all the CPUs of a frequency domain, it can turn out > that all the CPUs of that domain can run unnecessary at the maximum OPP > until another event happens in the idle CPU, which eventually clear the > SCHED_CPUFREQ_{RT/DL} flag, or the IDLE CPUs gets ignored after > TICK_NSEC since the CPU entering IDLE. > > Such a behaviour can harm the energy efficiency of systems where RT > workloads are not so frequent and other CPUs in the same frequency > domain are running small utilisation workloads, which is a quite common > scenario in mobile embedded systems. > > This patch proposes a solution which is aligned with the current principle > to update the flags each time a scheduling event happens. The scheduling > of the idle_task on a CPU is considered one of such meaningful events. > That's why when the idle_task is selected for execution we poke the > schedutil policy to reset the flags for that CPU. > > No frequency transitions are activated at that point, which is fair in > case the RT workload should come back in the future. However, this still > allows other CPUs in the same frequency domain to scale down the > frequency in case that should be possible. > > Signed-off-by: Patrick Bellasi > Cc: Ingo Molnar > Cc: Peter Zijlstra > Cc: Rafael J. Wysocki > Cc: Viresh Kumar > Cc: linux-kernel@vger.kernel.org > Cc: linux...@vger.kernel.org > > --- > Changes from v1: > - added "unlikely()" around the statement (SteveR) > --- > include/linux/sched/cpufreq.h| 1 + > kernel/sched/cpufreq_schedutil.c | 7 +++ > kernel/sched/idle_task.c | 4 > 3 files changed, 12 insertions(+) > > diff --git a/include/linux/sched/cpufreq.h b/include/linux/sched/cpufreq.h > index d2be2cc..36ac8d2 100644 > --- a/include/linux/sched/cpufreq.h > +++ b/include/linux/sched/cpufreq.h > @@ -10,6 +10,7 @@ > #define SCHED_CPUFREQ_RT (1U << 0) > #define SCHED_CPUFREQ_DL (1U << 1) > #define SCHED_CPUFREQ_IOWAIT (1U << 2) > +#define SCHED_CPUFREQ_IDLE (1U << 3) > > #define SCHED_CPUFREQ_RT_DL(SCHED_CPUFREQ_RT | SCHED_CPUFREQ_DL) > > diff --git a/kernel/sched/cpufreq_schedutil.c > b/kernel/sched/cpufreq_schedutil.c > index eaba6d6..004ae18 100644 > --- a/kernel/sched/cpufreq_schedutil.c > +++ b/kernel/sched/cpufreq_schedutil.c > @@ -304,6 +304,12 @@ static void sugov_update_shared(struct update_util_data > *hook, u64 time, > > sg_cpu->util = util; > sg_cpu->max = max; > + > + /* CPU is entering IDLE, reset flags without triggering an update */ > + if (unlikely(flags & SCHED_CPUFREQ_IDLE)) { > + sg_cpu->flags = 0; > + goto done; > + } Instead of defining a new flag for idle, wouldn't another way be to just clear the flag from the RT scheduling class with an extra call to cpufreq_update_util with flags = 0 during dequeue_rt_entity? That seems to me to be also the right place to clear the flag since the flag is set in the corresponding class to begin with. thanks, -Joel
Re: [PATCH v2] arm: eBPF JIT compiler
On Wed, Jul 5, 2017 at 8:49 PM, Shubham Bansal wrote: > Hi Kees, > > Problem is my ARM machine don't have clang and iproute2 which is > keeping me from testing the bpf tail calls. > > You should do the following to test it,. > > 1. tools/testing/selftests/bpf/ > 2. make > 3. sudo ./test_progs > > And, before testing, you have to do "make headers_install". > These tests are for tail calls with the attached patch. If its too > much work, Can you please upload your arm image so that I can test it? > I just need a good machine. I've got all this set up now, and it faults during the test: Unable to handle kernel NULL pointer dereference at virtual address 0008 ... CPU: 0 PID: 1922 Comm: test_progs Not tainted 4.12.0+ #60 ... PC is at __htab_map_lookup_elem+0x54/0x1f4 I'll see if I can send you this disk image... -Kees -- Kees Cook Pixel Security
Re: [PATCH] iio: multiplexer: add NULL check on devm_kzalloc() return value
Hi Peter, Quoting Peter Rosin : On 2017-07-07 00:08, Gustavo A. R. Silva wrote: Check return value from call to devm_kzalloc() in order to prevent a NULL pointer dereference. Right, thanks for finding that one! There's another one inside the for loop that is just starting in the context of this patch. Care to fix checking the return value of that devm_kmemdup as well? Sure, I'll send a new patch shortly. And someone should perhaps teach Coccinelle about devm_kmemdup... Good catch, I just implemented that script. This issue was detected using Coccinelle and the following semantic patch: @@ expression x; identifier fld; @@ * x = devm_kzalloc(...); ... when != x == NULL x->fld One of these blank lines should perhaps be a "Fixes:" tag? mmm, I don't get this... Cheers, peda Signed-off-by: Gustavo A. R. Silva --- drivers/iio/multiplexer/iio-mux.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/iio/multiplexer/iio-mux.c b/drivers/iio/multiplexer/iio-mux.c index 37ba007..a8d672b 100644 --- a/drivers/iio/multiplexer/iio-mux.c +++ b/drivers/iio/multiplexer/iio-mux.c @@ -285,6 +285,9 @@ static int mux_configure_channel(struct device *dev, struct mux *mux, child->ext_info_cache = devm_kzalloc(dev, sizeof(*child->ext_info_cache) * num_ext_info, GFP_KERNEL); + if (!child->ext_info_cache) + return -ENOMEM; + for (i = 0; i < num_ext_info; ++i) { child->ext_info_cache[i].size = -1; Thanks! -- Gustavo A. R. Silva
Re: [PATCH] target: make device_mutex and device_list static
On Wed, 2017-07-05 at 13:15 -0500, Mike Christie wrote: > On 07/04/2017 03:44 AM, Colin King wrote: > > From: Colin Ian King > > > > Variables device_mutex and device_list static are local to the source, > > so make them static. > > > > Cleans up sparse warnings: > > "symbol 'device_list' was not declared. Should it be static?" > > "symbol 'device_mutex' was not declared. Should it be static?" > > > > Signed-off-by: Colin Ian King > > --- > > drivers/target/target_core_device.c | 4 ++-- > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > > diff --git a/drivers/target/target_core_device.c > > b/drivers/target/target_core_device.c > > index 3ae8fbf01bdf..bbcef3bc66c8 100644 > > --- a/drivers/target/target_core_device.c > > +++ b/drivers/target/target_core_device.c > > @@ -49,8 +49,8 @@ > > #include "target_core_pr.h" > > #include "target_core_ua.h" > > > > -DEFINE_MUTEX(device_mutex); > > -LIST_HEAD(device_list); > > +static DEFINE_MUTEX(device_mutex); > > +static LIST_HEAD(device_list); > > static DEFINE_IDR(devices_idr); > > > > static struct se_hba *lun0_hba; > > > > My fault. Thanks. > > Reviewed-by: Mike Christie Applied. Thanks Colin + MNC.
Re: [PATCH] extcon: int3496: constify acpi_device_id.
On 2017년 07월 07일 01:55, Arvind Yadav wrote: > acpi_device_id are not supposed to change at runtime. All functions > working with acpi_device_id provided by work with > const acpi_device_id. So mark the non-const structs as const. > > File size before: >text data bss dec hex filename >1733 352 02085 825 > drivers/extcon/extcon-intel-int3496.o > > File size After adding 'const': >text data bss dec hex filename >1797 272 02069 815 > drivers/extcon/extcon-intel-int3496.o > > Signed-off-by: Arvind Yadav > --- > drivers/extcon/extcon-intel-int3496.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/extcon/extcon-intel-int3496.c > b/drivers/extcon/extcon-intel-int3496.c > index 9d17984..49bf9c6 100644 > --- a/drivers/extcon/extcon-intel-int3496.c > +++ b/drivers/extcon/extcon-intel-int3496.c > @@ -174,7 +174,7 @@ static int int3496_remove(struct platform_device *pdev) > return 0; > } > > -static struct acpi_device_id int3496_acpi_match[] = { > +static const struct acpi_device_id int3496_acpi_match[] = { > { "INT3496" }, > { } > }; > Applied it. -- Best Regards, Chanwoo Choi Samsung Electronics
Re: [PATCH] iio: multiplexer: add NULL check on devm_kzalloc() return value
On 2017-07-07 00:08, Gustavo A. R. Silva wrote: > Check return value from call to devm_kzalloc() > in order to prevent a NULL pointer dereference. Right, thanks for finding that one! There's another one inside the for loop that is just starting in the context of this patch. Care to fix checking the return value of that devm_kmemdup as well? And someone should perhaps teach Coccinelle about devm_kmemdup... > This issue was detected using Coccinelle and the following semantic patch: > > @@ > expression x; > identifier fld; > @@ > > * x = devm_kzalloc(...); > ... when != x == NULL > x->fld > > One of these blank lines should perhaps be a "Fixes:" tag? Cheers, peda > Signed-off-by: Gustavo A. R. Silva > --- > drivers/iio/multiplexer/iio-mux.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/drivers/iio/multiplexer/iio-mux.c > b/drivers/iio/multiplexer/iio-mux.c > index 37ba007..a8d672b 100644 > --- a/drivers/iio/multiplexer/iio-mux.c > +++ b/drivers/iio/multiplexer/iio-mux.c > @@ -285,6 +285,9 @@ static int mux_configure_channel(struct device *dev, > struct mux *mux, > child->ext_info_cache = devm_kzalloc(dev, >sizeof(*child->ext_info_cache) * >num_ext_info, GFP_KERNEL); > + if (!child->ext_info_cache) > + return -ENOMEM; > + > for (i = 0; i < num_ext_info; ++i) { > child->ext_info_cache[i].size = -1; > >
[PATCH v3 2/3] dt-bindings: cpufreq: move MediaTek cpufreq dt-bindings document to proper place
From: Sean Wang The old place is Documentation/devicetree/bindings/clock/ that would let people hard to find how to use MediaTek cpufreq driver, so moving it to the appropriate place as other cpufreq drivers done would be better. Signed-off-by: Sean Wang Acked-by: Viresh Kumar --- .../bindings/{clock/mt8173-cpu-dvfs.txt => cpufreq/cpufreq-mediatek.txt} | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename Documentation/devicetree/bindings/{clock/mt8173-cpu-dvfs.txt => cpufreq/cpufreq-mediatek.txt} (100%) diff --git a/Documentation/devicetree/bindings/clock/mt8173-cpu-dvfs.txt b/Documentation/devicetree/bindings/cpufreq/cpufreq-mediatek.txt similarity index 100% rename from Documentation/devicetree/bindings/clock/mt8173-cpu-dvfs.txt rename to Documentation/devicetree/bindings/cpufreq/cpufreq-mediatek.txt -- 2.7.4
[PATCH v3 0/3] some fixups for MediaTek cpufreq driver
From: Sean Wang Changes since v2: - correct the typo in the binding document Changes since v1: - drop those patches already accepted - refine the commit messages and Kconfig dependency - Kconfig menu entry and file name itself are updated with more generic name to drop "MT8173" since this driver actually supports all MediaTek SoCs. - generate patchset again with git format-patch -C -M --thread=shallow - fix binding examples by replacing '@' with '-' as the OPP nodes will never have a "reg" property. Hi, The purpose of the series is - (patch 1 to 3) to fix up current Mediatek cpufreq driver can't work with the latest tree since one required CPU clock muxer missing would cause the driver getting the resource fails when driver probe gets called. - (patch 4) to enable cpufreq feature on MT2701/MT7623 platform. - (patch 5 to 6) to update the binding document to reflect latest driver logic and add more examples guiding people how to apply for Mediatek cpufreq driver. Sean Wang (3): cpufreq: mediatek: Add support of cpufreq to MT2701/MT7623 SoC dt-bindings: cpufreq: move MediaTek cpufreq dt-bindings document to proper place dt-bindings: cpufreq: enhance MediaTek cpufreq dt-binding document .../devicetree/bindings/clock/mt8173-cpu-dvfs.txt | 83 --- .../bindings/cpufreq/cpufreq-mediatek.txt | 247 + drivers/cpufreq/Kconfig.arm| 8 +- drivers/cpufreq/Makefile | 2 +- .../cpufreq/{mt8173-cpufreq.c => mtk-cpufreq.c}| 2 + 5 files changed, 254 insertions(+), 88 deletions(-) delete mode 100644 Documentation/devicetree/bindings/clock/mt8173-cpu-dvfs.txt create mode 100644 Documentation/devicetree/bindings/cpufreq/cpufreq-mediatek.txt rename drivers/cpufreq/{mt8173-cpufreq.c => mtk-cpufreq.c} (99%) -- 2.7.4
[PATCH v3 3/3] dt-bindings: cpufreq: enhance MediaTek cpufreq dt-binding document
From: Sean Wang Update binding document with adding operating-points-v2 as the required property and the cooling level as the optional properties and adding more examples guiding people how to use MediaTek cpufreq driver for MediaTek SoCs. Signed-off-by: Sean Wang Acked-by: Viresh Kumar --- .../bindings/cpufreq/cpufreq-mediatek.txt | 170 - 1 file changed, 167 insertions(+), 3 deletions(-) diff --git a/Documentation/devicetree/bindings/cpufreq/cpufreq-mediatek.txt b/Documentation/devicetree/bindings/cpufreq/cpufreq-mediatek.txt index 52b457c..f640308 100644 --- a/Documentation/devicetree/bindings/cpufreq/cpufreq-mediatek.txt +++ b/Documentation/devicetree/bindings/cpufreq/cpufreq-mediatek.txt @@ -1,4 +1,5 @@ -Device Tree Clock bindins for CPU DVFS of Mediatek MT8173 SoC +Binding for MediaTek's CPUFreq driver += Required properties: - clocks: A list of phandle + clock-specifier pairs for the clocks listed in clock names. @@ -9,6 +10,8 @@ Required properties: transition and not stable yet. Please refer to Documentation/devicetree/bindings/clk/clock-bindings.txt for generic clock consumer properties. +- operating-points-v2: Please refer to Documentation/devicetree/bindings/opp/opp.txt + for detail. - proc-supply: Regulator for Vproc of CPU cluster. Optional properties: @@ -17,9 +20,166 @@ Optional properties: Vsram to fit SoC specific needs. When absent, the voltage scaling flow is handled by hardware, hence no software "voltage tracking" is needed. +- #cooling-cells: +- cooling-min-level: +- cooling-max-level: + Please refer to Documentation/devicetree/bindings/thermal/thermal.txt + for detail. + +Example 1 (MT7623 SoC): + + cpu_opp_table: opp_table { + compatible = "operating-points-v2"; + opp-shared; + + opp-59800 { + opp-hz = /bits/ 64 <59800>; + opp-microvolt = <105>; + }; + + opp-74750 { + opp-hz = /bits/ 64 <74750>; + opp-microvolt = <105>; + }; + + opp-104000 { + opp-hz = /bits/ 64 <104000>; + opp-microvolt = <115>; + }; + + opp-119600 { + opp-hz = /bits/ 64 <119600>; + opp-microvolt = <120>; + }; + + opp-13 { + opp-hz = /bits/ 64 <13>; + opp-microvolt = <130>; + }; + }; + + cpu0: cpu@0 { + device_type = "cpu"; + compatible = "arm,cortex-a7"; + reg = <0x0>; + clocks = <&infracfg CLK_INFRA_CPUSEL>, +<&apmixedsys CLK_APMIXED_MAINPLL>; + clock-names = "cpu", "intermediate"; + operating-points-v2 = <&cpu_opp_table>; + #cooling-cells = <2>; + cooling-min-level = <0>; + cooling-max-level = <7>; + }; + cpu@1 { + device_type = "cpu"; + compatible = "arm,cortex-a7"; + reg = <0x1>; + operating-points-v2 = <&cpu_opp_table>; + }; + cpu@2 { + device_type = "cpu"; + compatible = "arm,cortex-a7"; + reg = <0x2>; + operating-points-v2 = <&cpu_opp_table>; + }; + cpu@3 { + device_type = "cpu"; + compatible = "arm,cortex-a7"; + reg = <0x3>; + operating-points-v2 = <&cpu_opp_table>; + }; + +Example 2 (MT8173 SoC): + cpu_opp_table_a: opp_table_a { + compatible = "operating-points-v2"; + opp-shared; + + opp-50700 { + opp-hz = /bits/ 64 <50700>; + opp-microvolt = <859000>; + }; + + opp-70200 { + opp-hz = /bits/ 64 <70200>; + opp-microvolt = <908000>; + }; + + opp-100100 { + opp-hz = /bits/ 64 <100100>; + opp-microvolt = <983000>; + }; + + opp-110500 { + opp-hz = /bits/ 64 <110500>; + opp-microvolt = <1009000>; + }; + + opp-118300 { + opp-hz = /bits/ 64 <118300>; + opp-microvolt = <1028000>; + }; + + opp-140400 { + opp-hz = /bits/ 64 <140400>; + opp-microvolt = <1083000>; + }; + + opp-150800 { +
[PATCH v3 1/3] cpufreq: mediatek: Add support of cpufreq to MT2701/MT7623 SoC
From: Sean Wang MT2701/MT7623 is a 32-bit ARMv7 based quad-core (4 * Cortex-A7) with single cluster and this hardware is also compatible with the existing driver through enabling CPU frequency feature with operating-points-v2 bindings. Also, this driver actually supports all MediaTek SoCs, the Kconfig menu entry and file name itself should be updated with more generic name to drop "MT8173" Signed-off-by: Sean Wang Acked-by: Viresh Kumar --- drivers/cpufreq/Kconfig.arm | 8 drivers/cpufreq/Makefile| 2 +- drivers/cpufreq/{mt8173-cpufreq.c => mtk-cpufreq.c} | 2 ++ 3 files changed, 7 insertions(+), 5 deletions(-) rename drivers/cpufreq/{mt8173-cpufreq.c => mtk-cpufreq.c} (99%) diff --git a/drivers/cpufreq/Kconfig.arm b/drivers/cpufreq/Kconfig.arm index 74ed7e9..79aece7 100644 --- a/drivers/cpufreq/Kconfig.arm +++ b/drivers/cpufreq/Kconfig.arm @@ -87,14 +87,14 @@ config ARM_KIRKWOOD_CPUFREQ This adds the CPUFreq driver for Marvell Kirkwood SoCs. -config ARM_MT8173_CPUFREQ - tristate "Mediatek MT8173 CPUFreq support" +config ARM_MEDIATEK_CPUFREQ + tristate "CPU Frequency scaling support for MediaTek SoCs" depends on ARCH_MEDIATEK && REGULATOR - depends on ARM64 || (ARM_CPU_TOPOLOGY && COMPILE_TEST) + depends on ARM || ARM64 || COMPILE_TEST depends on !CPU_THERMAL || THERMAL select PM_OPP help - This adds the CPUFreq driver support for Mediatek MT8173 SoC. + This adds the CPUFreq driver support for MediaTek SoCs. config ARM_OMAP2PLUS_CPUFREQ bool "TI OMAP2+" diff --git a/drivers/cpufreq/Makefile b/drivers/cpufreq/Makefile index b7e78f0..4956f5d 100644 --- a/drivers/cpufreq/Makefile +++ b/drivers/cpufreq/Makefile @@ -58,7 +58,7 @@ obj-$(CONFIG_ARM_EXYNOS5440_CPUFREQ) += exynos5440-cpufreq.o obj-$(CONFIG_ARM_HIGHBANK_CPUFREQ) += highbank-cpufreq.o obj-$(CONFIG_ARM_IMX6Q_CPUFREQ)+= imx6q-cpufreq.o obj-$(CONFIG_ARM_KIRKWOOD_CPUFREQ) += kirkwood-cpufreq.o -obj-$(CONFIG_ARM_MT8173_CPUFREQ) += mt8173-cpufreq.o +obj-$(CONFIG_ARM_MEDIATEK_CPUFREQ) += mtk-cpufreq.o obj-$(CONFIG_ARM_OMAP2PLUS_CPUFREQ)+= omap-cpufreq.o obj-$(CONFIG_ARM_PXA2xx_CPUFREQ) += pxa2xx-cpufreq.o obj-$(CONFIG_PXA3xx) += pxa3xx-cpufreq.o diff --git a/drivers/cpufreq/mt8173-cpufreq.c b/drivers/cpufreq/mtk-cpufreq.c similarity index 99% rename from drivers/cpufreq/mt8173-cpufreq.c rename to drivers/cpufreq/mtk-cpufreq.c index fd1886f..481ec77 100644 --- a/drivers/cpufreq/mt8173-cpufreq.c +++ b/drivers/cpufreq/mtk-cpufreq.c @@ -575,6 +575,8 @@ static struct platform_driver mt8173_cpufreq_platdrv = { /* List of machines supported by this driver */ static const struct of_device_id mt8173_cpufreq_machines[] __initconst = { + { .compatible = "mediatek,mt2701", }, + { .compatible = "mediatek,mt7623", }, { .compatible = "mediatek,mt817x", }, { .compatible = "mediatek,mt8173", }, { .compatible = "mediatek,mt8176", }, -- 2.7.4
Re: [RFC PATCH v1 3/8] sched/cpufreq_schedutil: make worker kthread be SCHED_DEADLINE
Hi Juri, On Wed, Jul 5, 2017 at 1:59 AM, Juri Lelli wrote: > Worker kthread needs to be able to change frequency for all other > threads. > > Make it special, just under STOP class. > > Signed-off-by: Juri Lelli > Cc: Peter Zijlstra > Cc: Ingo Molnar > Cc: Rafael J. Wysocki > Cc: Viresh Kumar > Cc: Luca Abeni > Cc: Claudio Scordino > --- > Changes from RFCv0: > > - use a high bit of sched_flags and don't expose the new flag to >userspace (also add derisory comments) (Peter) > --- > include/linux/sched.h| 1 + > kernel/sched/core.c | 15 +-- > kernel/sched/cpufreq_schedutil.c | 15 --- > kernel/sched/deadline.c | 13 + > kernel/sched/sched.h | 20 +++- > 5 files changed, 58 insertions(+), 6 deletions(-) > > diff --git a/include/linux/sched.h b/include/linux/sched.h > index 1f0f427e0292..0ba767fdedae 100644 > --- a/include/linux/sched.h > +++ b/include/linux/sched.h > @@ -1359,6 +1359,7 @@ extern int idle_cpu(int cpu); > extern int sched_setscheduler(struct task_struct *, int, const struct > sched_param *); > extern int sched_setscheduler_nocheck(struct task_struct *, int, const > struct sched_param *); > extern int sched_setattr(struct task_struct *, const struct sched_attr *); > +extern int sched_setattr_nocheck(struct task_struct *, const struct > sched_attr *); > extern struct task_struct *idle_task(int cpu); > > /** > diff --git a/kernel/sched/core.c b/kernel/sched/core.c > index 5186797908dc..0e40c7ff2bf4 100644 > --- a/kernel/sched/core.c > +++ b/kernel/sched/core.c > @@ -3998,7 +3998,9 @@ static int __sched_setscheduler(struct task_struct *p, > } > > if (attr->sched_flags & > - ~(SCHED_FLAG_RESET_ON_FORK | SCHED_FLAG_RECLAIM)) > + ~(SCHED_FLAG_RESET_ON_FORK | > + SCHED_FLAG_RECLAIM | > + SCHED_FLAG_SPECIAL)) > return -EINVAL; I was thinking if its better to name SCHED_FLAG_SPECIAL something more descriptive than "special", since it will not be clear looking at other parts of the code that this is used for the frequency scaling thread or that its a DL task. I was thinking since this flag is specifically setup for the sugov thread frequency scaling purpose, a better flag name than SPECIAL might be SCHED_FLAG_DL_FREQ_SCALE or SCHED_FLAG_FREQ_SCALE. But I am Ok with whatever makes sense to do here. Thanks, Regards, -Joel
Re: [PATCH 1/2] genirq: Get the fwnode back for irqchips being probed via ACPI namespace
On 2017/7/6 21:05, Marc Zyngier wrote: > On 06/07/17 10:01, Hanjun Guo wrote: >> Hi Marc, >> >> On 2017/7/6 15:43, Marc Zyngier wrote: >>> On 06/07/17 05:35, Hanjun Guo wrote: From: Hanjun Guo commit d59f6617eef0 (genirq: Allow fwnode to carry name information only) forgot to do "domain->fwnode = fwnode;" for irqchips being probed via ACPI namesapce (DSDT/SSDT), that will break platforms with irqchip such as mbigen or qcom irq combiner, set the fwnode back to fix the issue. Reported-by: John Garry Signed-off-by: Hanjun Guo --- kernel/irq/irqdomain.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/kernel/irq/irqdomain.c b/kernel/irq/irqdomain.c index 14fe862..1bc38fa 100644 --- a/kernel/irq/irqdomain.c +++ b/kernel/irq/irqdomain.c @@ -151,7 +151,6 @@ struct irq_domain *__irq_domain_add(struct fwnode_handle *fwnode, int size, domain->flags |= IRQ_DOMAIN_NAME_ALLOCATED; break; default: - domain->fwnode = fwnode; domain->name = fwid->name; break; } @@ -172,7 +171,6 @@ struct irq_domain *__irq_domain_add(struct fwnode_handle *fwnode, int size, strreplace(name, '/', ':'); domain->name = name; - domain->fwnode = fwnode; domain->flags |= IRQ_DOMAIN_NAME_ALLOCATED; } @@ -196,6 +194,7 @@ struct irq_domain *__irq_domain_add(struct fwnode_handle *fwnode, int size, INIT_RADIX_TREE(&domain->revmap_tree, GFP_KERNEL); domain->ops = ops; domain->host_data = host_data; + domain->fwnode = fwnode; domain->hwirq_max = hwirq_max; domain->revmap_size = size; domain->revmap_direct_max_irq = direct_max; >>> This doesn't seem right. >>> >>> Why is is_fwnode_irqchip() returning false when presented with an >>> irqchip probed via the ACPI namespace? That's what you should consider >>> fixing instead of moving that code around. >> irqchips probed via the ACPI namespace will have a fwnode handler >> with the fwnode type FWNODE_ACPI, similar to irqchips probed >> via DT having a fwnode handler with type FWNODE_OF, in this >> function with DT case, fwnode is set for irqdomain's fwnode, >> ACPI just reuse that code because they are similar. >> >> is_fwnode_irqchip() just checks if it's FWNODE_IRQCHIP, which is only >> available for irqchips probing via ACPI static tables such as ITS, GIC >> on ARM, or via DMAR on x86, those cases don't have a fwnode handler then >> need to create one via __irq_domain_alloc_fwnode(), which is different >> from DT/ACPI namesapce scan mechanism. So the patch above it's the best >> solution I got so far which just resume the code as before, correct me >> if you have something new :) > This violates the irqdomain API that takes two kind of fwnodes: > FWNODE_OF or FWNODE_IRQCHIP, and no others. You got away with it so > far, but it is over. > > You have two choices here: either you allocate a FWNODE_IRQCHIP in your > irqchip driver, and use this as a handle for your IRQ domain, or you > make the irqdomain code able to deal with any FWNODE_ACPI fwnode. Later one is better to me as allocate a FWNODE_IRQCHIP in the irqchip driver will override the FWNODE_ACPI handler. > > Does the following hack work for you? Yes, it works. shall we go this way with a proper fix? > > diff --git a/kernel/irq/irqdomain.c b/kernel/irq/irqdomain.c > index 14fe862aa2e3..37f4aa3985b3 100644 > --- a/kernel/irq/irqdomain.c > +++ b/kernel/irq/irqdomain.c > @@ -155,6 +155,8 @@ struct irq_domain *__irq_domain_add(struct fwnode_handle > *fwnode, int size, > domain->name = fwid->name; > break; > } > + } else if (is_acpi_device_node(fwnode)) { > + domain->fwnode = fwnode; > } else if (of_node) { > char *name; > > If it does, we can then find weird and wonderful ways to give the > domain a shiny name in debugfs... How about the weird way below (using dev_name() which shows ACPI HID + number) ? +#include #include +#include #include #include #include @@ -172,6 +174,15 @@ struct irq_domain *__irq_domain_add(struct fwnode_handle *fwnode, int size, domain->name = name; domain->flags |= IRQ_DOMAIN_NAME_ALLOCATED; + } else if (is_acpi_device_node(fwnode)) { + struct acpi_device *adev = to_acpi_device_node(fwnode); + + domain->name = kstrdup(dev_name(&adev->dev), GFP_KERNEL); + if (!domain->name) + return NULL; + + domain->fwnode = fwnode; + domain->flags |= IRQ_DOMAIN_NAME_ALLOCATED; } But my hack code retrieving the ACPI data in irq domain core is really weird :) Thanks Hanjun
Re: Make HWSPINLOCK a menuconfig to ease disabling
On Thu, Jul 06, 2017 at 11:23:50PM +, Linux Kernel wrote: > Make HWSPINLOCK a menuconfig to ease disabling > > So that there's no need to get into the submenu to disable all related > config > entries. Here's how that looks on x86... * * Hardware Spinlock drivers * Hardware Spinlock drivers (HWSPINLOCK) [N/m/y] (NEW) ? There is no help available for this option. Symbol: HWSPINLOCK [=n] Type : tristate Prompt: Hardware Spinlock drivers Location: -> Device Drivers Defined at drivers/hwspinlock/Kconfig:5 No help ? Ok whatever, let's see what happens.. Hardware Spinlock drivers (HWSPINLOCK) [N/m/y] (NEW) m And then nothing, because none of the drivers that depend on this are available for x86 because of stuff like this.. > config HWSPINLOCK_OMAP > tristate "OMAP Hardware Spinlock device" > depends on ARCH_OMAP4 || SOC_OMAP5 || SOC_DRA7XX || SOC_AM33XX || > SOC_AM43XX > config HWSPINLOCK_QCOM > tristate "Qualcomm Hardware Spinlock device" > depends on ARCH_QCOM > config HWSPINLOCK_SIRF > tristate "SIRF Hardware Spinlock device" > depends on ARCH_SIRF > config HSEM_U8500 > tristate "STE Hardware Semaphore functionality" > depends on ARCH_U8500 HWSPINLOCK needs all those depends too. Dave
Re: [RFC PATCH v1 4/8] sched/cpufreq_schedutil: split utilization signals
Hi Juri, On Wed, Jul 5, 2017 at 1:59 AM, Juri Lelli wrote: > To be able to treat utilization signals of different scheduling classes > in different ways (e.g., CFS signal might be stale while DEADLINE signal > is never stale by design) we need to split sugov_cpu::util signal in two: > util_cfs and util_dl. > > This patch does that by also changing sugov_get_util() parameter list. > After this change, aggregation of the different signals has to be performed > by sugov_get_util() users (so that they can decide what to do with the > different signals). > > Suggested-by: Rafael J. Wysocki > Signed-off-by: Juri Lelli > Cc: Peter Zijlstra > Cc: Ingo Molnar > Cc: Rafael J. Wysocki > Cc: Viresh Kumar > Cc: Luca Abeni > Cc: Claudio Scordino > --- > Changes from RFCv0: > > - refactor aggregation of utilization in sugov_aggregate_util() > --- > kernel/sched/cpufreq_schedutil.c | 28 > 1 file changed, 16 insertions(+), 12 deletions(-) > > diff --git a/kernel/sched/cpufreq_schedutil.c > b/kernel/sched/cpufreq_schedutil.c > index ba6227625f24..e835fa886225 100644 > --- a/kernel/sched/cpufreq_schedutil.c > +++ b/kernel/sched/cpufreq_schedutil.c > @@ -58,7 +58,8 @@ struct sugov_cpu { > u64 last_update; > > /* The fields below are only needed when sharing a policy. */ > - unsigned long util; > + unsigned long util_cfs; > + unsigned long util_dl; > unsigned long max; > unsigned int flags; > > @@ -154,20 +155,24 @@ static unsigned int get_next_freq(struct sugov_policy > *sg_policy, > return cpufreq_driver_resolve_freq(policy, freq); > } > > -static void sugov_get_util(unsigned long *util, unsigned long *max) > +static void sugov_get_util(struct sugov_cpu *sg_cpu) > { > struct rq *rq = this_rq(); > - unsigned long dl_util = (rq->dl.running_bw * SCHED_CAPACITY_SCALE) > - >> BW_SHIFT; > > - *max = arch_scale_cpu_capacity(NULL, smp_processor_id()); > + sg_cpu->max = arch_scale_cpu_capacity(NULL, smp_processor_id()); > + sg_cpu->util_cfs = rq->cfs.avg.util_avg; > + sg_cpu->util_dl = (rq->dl.running_bw * SCHED_CAPACITY_SCALE) > + >> BW_SHIFT; > +} > > +static unsigned long sugov_aggregate_util(struct sugov_cpu *sg_cpu) > +{ > /* > * Ideally we would like to set util_dl as min/guaranteed freq and > * util_cfs + util_dl as requested freq. However, cpufreq is not yet > * ready for such an interface. So, we only do the latter for now. > */ > - *util = min(rq->cfs.avg.util_avg + dl_util, *max); > + return min(sg_cpu->util_cfs + sg_cpu->util_dl, sg_cpu->max); > } I am wondering why the need for a separate aggregation API. To me, it looks like using sugov_get_util to set the sg_cpu util elements and then do the aggregation at the same time would have the same effect (without changing the existing parameter list). Is this to handle a future usecase where aggregation may need to be done differently? For all the user's of sugov_get_util, aggregation is done in the same way. Anyway if I missed something, sorry for the noise. thanks, -Joel
[PATCH] sched/cputime: Fix using smp_processor_id() in preemptible
From: Wanpeng Li BUG: using smp_processor_id() in preemptible [] code: 99-trinity/181 caller is debug_smp_processor_id+0x17/0x19 CPU: 0 PID: 181 Comm: 99-trinity Not tainted 4.12.0-01059-g2a42eb9 #1 Call Trace: dump_stack+0x82/0xb8 check_preemption_disabled+0xd1/0xe3 debug_smp_processor_id+0x17/0x19 vtime_delta+0xd/0x2c task_cputime+0x89/0xdb thread_group_cputime+0x11b/0x1ed thread_group_cputime_adjusted+0x1f/0x47 wait_consider_task+0x2a9/0xaf9 ? lock_acquire+0x97/0xa4 do_wait+0xdf/0x1f4 SYSC_wait4+0x8e/0xb5 ? list_add+0x34/0x34 SyS_wait4+0x9/0xb do_syscall_64+0x70/0x82 entry_SYSCALL64_slow_path+0x25/0x25 As Frederic pointed out: | Although those sched_clock_cpu() things seem to only matter when the | sched_clock() is unstable. And that stability is a condition for nohz_full | to work anyway. So probably sched_clock() alone would be enough. This patch fixes it by replacing sched_clock_cpu() in vtime_delta() by sched_clock() to avoid to call smp_processor_id() in preemptible context. Reported-by: Xiaolong Ye Cc: Thomas Gleixner Cc: Luiz Capitulino Cc: Frederic Weisbecker Cc: Rik van Riel Cc: Peter Zijlstra Cc: Ingo Molnar Signed-off-by: Wanpeng Li --- kernel/sched/cputime.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c index 6e3ea4a..d86b94e 100644 --- a/kernel/sched/cputime.c +++ b/kernel/sched/cputime.c @@ -683,7 +683,7 @@ static u64 vtime_delta(struct vtime *vtime) { unsigned long long clock; - clock = sched_clock_cpu(smp_processor_id()); + clock = sched_clock(); if (clock < vtime->starttime) return 0; -- 2.7.4
[PATCH] module: fix ddebug_remove_module()
ddebug_remove_module() use mod->name to find the ddebug_table of the module and remove it. But dynamic_debug_setup() use the first _ddebug->modname to create ddebug_table for the module. It's ok when the _ddebug->modname is the same with the mod->name. But livepatch module is special, it may contain _ddebugs of other modules, the modname of which is different from the name of livepatch module. So ddebug_remove_module() can't use mod->name to find the right ddebug_table and remove it. It can cause kernel crash when we cat the file /dynamic_debug/control. Signed-off-by: Zhou Chengming --- kernel/module.c | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/kernel/module.c b/kernel/module.c index 4a3665f..dac9805 100644 --- a/kernel/module.c +++ b/kernel/module.c @@ -2703,21 +2703,21 @@ static void add_kallsyms(struct module *mod, const struct load_info *info) } #endif /* CONFIG_KALLSYMS */ -static void dynamic_debug_setup(struct _ddebug *debug, unsigned int num) +static void dynamic_debug_setup(struct module *mod, struct _ddebug *debug, unsigned int num) { if (!debug) return; #ifdef CONFIG_DYNAMIC_DEBUG - if (ddebug_add_module(debug, num, debug->modname)) + if (ddebug_add_module(debug, num, mod->name)) pr_err("dynamic debug error adding module: %s\n", debug->modname); #endif } -static void dynamic_debug_remove(struct _ddebug *debug) +static void dynamic_debug_remove(struct module *mod, struct _ddebug *debug) { if (debug) - ddebug_remove_module(debug->modname); + ddebug_remove_module(mod->name); } void * __weak module_alloc(unsigned long size) @@ -3697,7 +3697,7 @@ static int load_module(struct load_info *info, const char __user *uargs, goto free_arch_cleanup; } - dynamic_debug_setup(info->debug, info->num_debug); + dynamic_debug_setup(mod, info->debug, info->num_debug); /* Ftrace init must be called in the MODULE_STATE_UNFORMED state */ ftrace_module_init(mod); @@ -3761,7 +3761,7 @@ static int load_module(struct load_info *info, const char __user *uargs, module_disable_nx(mod); ddebug_cleanup: - dynamic_debug_remove(info->debug); + dynamic_debug_remove(mod, info->debug); synchronize_sched(); kfree(mod->args); free_arch_cleanup: -- 1.8.3.1
Re: [PATCH] x86/boot/KASLR: exclude EFI_BOOT_SERVICES_{CODE|DATA} from KASLR's choice
On 07/06/17 at 03:57pm, Matt Fleming wrote: > On Thu, 06 Jul, at 08:31:07AM, Naoya Horiguchi wrote: > > + for (i = 0; i < nr_desc; i++) { > > + md = (efi_memory_desc_t *)(pmap + (i * e->efi_memdesc_size)); > > + > > + /* > > +* EFI_BOOT_SERVICES_{CODE|DATA} are avoided because boot > > +* services regions could be accessed after ExitBootServices() > > +* due to the workaround for buggy firmware. > > +*/ > > + if (!(md->type == EFI_LOADER_CODE || > > + md->type == EFI_LOADER_DATA || > > + md->type == EFI_CONVENTIONAL_MEMORY)) > > + continue; > > Wouldn't it make more sense to *only* use EFI_CONVENTIONAL_MEMORY? > > You can't re-use EFI_LOADER_* regions because the kaslr code is run so > early in boot that you've no idea if data the kernel will need is in > those EFI_LOADER_* regions. > > For example, we pass struct setup_data objects inside of > EFI_LOADER_DATA regions. It doesn't matter because we have tried to avoid those memory setup_data resides in in mem_avoid_overlap(). Here discarding EFI_LOADER_* could discard the whole regions while setup_data could occupy small part of them.
Re: [x86/time] 03fa63cc96: ACPI_Error:Table[DMAR]is_not_invalidated_during_early_boot_stage(#/tbxface -#)
On 07/07, Dou Liyang wrote: >Hi xiaolong, > >Really thanks for your testing. > >At 07/07/2017 09:54 AM, Ye Xiaolong wrote: >>On 07/06, Thomas Gleixner wrote: >>>On Thu, 6 Jul 2017, kernel test robot wrote: >>> commit: 03fa63cc96ab35592e0a7d522b8edbc1e6b02d22 ("x86/time: Initialize interrupt mode behind timer init") >>> ++++ || 43436935b7 | 03fa63cc96 | ++++ | boot_successes | 0 | 4 | ++++ >>> >>>So 03fa63cc96 makes the box boot again. I'm confused as usual by the >>>output of this tool., >>> kern :info : [0.005000] tsc: Fast TSC calibration using PIT kern :info : [0.006000] tsc: Detected 2195.020 MHz processor kern :info : [0.007000] Calibrating delay loop (skipped), value calculated using timer frequency.. 4390.04 BogoMIPS (lpj=2195020) kern :info : [0.008001] pid_max: default: 90112 minimum: 704 kern :info : [0.009037] ACPI: Core revision 20170303 kern :err : [0.010002] ACPI Error: Table [DMAR] is not invalidated during early boot stage (20170303/tbxface-193) >>> >>>Sure we have a error message here, but compared to what? Compared to >>>something which does not boot at all? >> >>Sorry for the confusion, here commit 43436935b7 boot failed due to OOM which >>happened at the late stage of kernel boot while the ACPI error showed at the >>early boot stage for commit 03fa63cc96 and it didn't appear in 43436935b7's >>dmesg. >> > >let's make the problem clearly firstly: > >1) Commit 43436935b7 ("x86/xen: Bypass intr mode setup in enlighten_pv >system") made kernel boot failed, which caused by OOM. > >2) Commit 03fa63cc96 ("x86/time: Initialize interrupt mode behind >timer init") can make the kernel boot success again, but with an ACPI >error happened. > >And both *1* and *2* used the same configuration showed in the >attachment. > >Does anything I missed? Yes, this is exactly what I meant. Thanks, Xiaolong > >Thanks, > > dou. > >>Thanks, >>Xiaolong >> >>> >>>Thanks, >>> >>> tglx >>> >> >> >> > >
Re: [lkp-robot] [sched/cputime] 2a42eb9594: BUG:using_smp_processor_id()in_preemptible
2017-07-07 10:09 GMT+08:00 kernel test robot : > > FYI, we noticed the following commit: > > commit: 2a42eb9594a1480b4ead9e036e06ee1290e5fa6d ("sched/cputime: Accumulate > vtime on top of nsec clocksource") > https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git sched/urgent > > in testcase: boot > > on test machine: qemu-system-x86_64 -enable-kvm -cpu Haswell,+smep,+smap -smp > 2 -m 512M > > caused below changes (please refer to attached dmesg/kmsg for entire > log/backtrace): > > > ++++ > || bac5b6b6b1 | 2a42eb9594 | > ++++ > | boot_successes | 8 | 2 | > | boot_failures | 0 | 6 | > | BUG:kernel_hang_in_test_stage | 0 | 2 | > | BUG:using_smp_processor_id()in_preemptible | 0 | 4 | > ++++ > > > > [ 55.289944] BUG: using smp_processor_id() in preemptible [] code: > 99-trinity/181 > [ 55.302094] caller is debug_smp_processor_id+0x17/0x19 > [ 55.313831] CPU: 0 PID: 181 Comm: 99-trinity Not tainted > 4.12.0-01059-g2a42eb9 #1 > [ 55.324812] Call Trace: > [ 55.328601] dump_stack+0x82/0xb8 > [ 55.333467] check_preemption_disabled+0xd1/0xe3 > [ 55.340370] debug_smp_processor_id+0x17/0x19 > [ 55.346815] vtime_delta+0xd/0x2c > [ 55.352843] task_cputime+0x89/0xdb > [ 55.358026] thread_group_cputime+0x11b/0x1ed > [ 55.364517] thread_group_cputime_adjusted+0x1f/0x47 > [ 55.371895] wait_consider_task+0x2a9/0xaf9 > [ 55.378025] ? lock_acquire+0x97/0xa4 > [ 55.383463] do_wait+0xdf/0x1f4 > [ 55.388199] SYSC_wait4+0x8e/0xb5 > [ 55.393219] ? list_add+0x34/0x34 > [ 55.398317] SyS_wait4+0x9/0xb > [ 55.403401] do_syscall_64+0x70/0x82 > [ 55.408977] entry_SYSCALL64_slow_path+0x25/0x25 > [ 55.409006] RIP: 0033:0x7ff2f15c3c3e > [ 55.409032] RSP: 002b:7ffd3e91eab0 EFLAGS: 0246 ORIG_RAX: > 003d > [ 55.409064] RAX: ffda RBX: 7ff2f1f0b6a0 RCX: > 7ff2f15c3c3e > [ 55.409070] RDX: 0001 RSI: 7ffd3e91eb18 RDI: > > [ 55.409075] RBP: 0001 R08: R09: > > [ 55.409101] R10: R11: 0246 R12: > > [ 55.409106] R13: R14: 0001 R15: > > [ 55.550819] random: trinity: uninitialized urandom read (4 bytes read) > mountall: Event failed > [ 55.947396] random: mountall: uninitialized urandom read (12 bytes read) > [ 56.044974] BUG: using smp_processor_id() in preemptible [] code: > mountall/194 > [ 56.057287] caller is debug_smp_processor_id+0x17/0x19 > [ 56.066939] CPU: 1 PID: 194 Comm: mountall Not tainted > 4.12.0-01059-g2a42eb9 #1 > [ 56.078248] Call Trace: > [ 56.082297] dump_stack+0x82/0xb8 > [ 56.087256] check_preemption_disabled+0xd1/0xe3 > [ 56.093512] debug_smp_processor_id+0x17/0x19 > [ 56.099800] vtime_delta+0xd/0x2c > [ 56.105862] task_cputime+0x89/0xdb > [ 56.111237] thread_group_cputime+0x11b/0x1ed > [ 56.117693] thread_group_cputime_adjusted+0x1f/0x47 > [ 56.125298] wait_consider_task+0x2a9/0xaf9 > [ 56.131983] ? do_raw_read_lock+0x39/0x3c > [ 56.138293] do_wait+0xdf/0x1f4 > [ 56.143000] SyS_waitid+0xac/0x1e3 > [ 56.148566] ? list_add+0x34/0x34 > [ 56.153694] do_syscall_64+0x70/0x82 > [ 56.159267] entry_SYSCALL64_slow_path+0x25/0x25 > [ 56.166324] RIP: 0033:0x7fdc9c642d2f > [ 56.171770] RSP: 002b:7ffe6f4dda30 EFLAGS: 0246 ORIG_RAX: > 00f7 > [ 56.181888] RAX: ffda RBX: 560f25433d70 RCX: > 7fdc9c642d2f > [ 56.192353] RDX: 7ffe6f4dda78 RSI: 00c7 RDI: > 0001 > [ 56.203076] RBP: 560f2543bf20 R08: R09: > 0020 > [ 56.212903] R10: 0004 R11: 0246 R12: > 560f25439be0 > [ 56.223108] R13: 00c7 R14: 00c7 R15: > 560f25222c1f > [ 56.522906] init: Failed to create pty - disabling logging for job > [ 56.535738] init: Temporary process spawn error: No such file or directory > [ 56.769705] init: Failed to create pty - disabling logging for job > [ 56.781761] init: Temporary process spawn error: No such file or directory > [ 57.258638] init: Failed to create pty - disabling logging for job > [ 57.271867] init: Temporary process spawn error: No such file or directory > [ 57.318928] init: Failed to create pty - disabling logging for job > [ 57.340639] init: Temporary process spawn error: No such file or directory > [ 57.499535] BUG: using smp_processor_id() in preemptible [] code: > sh/217 > [ 57.529507] caller is debug_smp_
[PATCH v2] clk: fractional-divider: fix up the fractional clk's jitter
add clk_fractional_divider_special_ops for rockchip specific requirements, fractional divider must set that denominator is 20 times larger than numerator to generate precise clock frequency. Otherwise the CLK jitter is very big, poor quality of the clock signal. RK document description: 3.1.9 Fractional divider usage To get specific frequency, clocks of I2S, SPDIF, UARTcan be generated by fractional divider. Generally you must set that denominator is 20 times larger than numerator to generate precise clock frequency. So the fractional divider applies only to generate low frequency clock like I2S, UART.igned-off-by: Elaine Zhang Signed-off-by: Elaine Zhang --- drivers/clk/clk-fractional-divider.c | 32 drivers/clk/rockchip/clk.c | 2 +- include/linux/clk-provider.h | 1 + 3 files changed, 34 insertions(+), 1 deletion(-) diff --git a/drivers/clk/clk-fractional-divider.c b/drivers/clk/clk-fractional-divider.c index aab904618eb6..3107b33327f9 100644 --- a/drivers/clk/clk-fractional-divider.c +++ b/drivers/clk/clk-fractional-divider.c @@ -158,6 +158,38 @@ struct clk_hw *clk_hw_register_fractional_divider(struct device *dev, } EXPORT_SYMBOL_GPL(clk_hw_register_fractional_divider); +static long clk_fd_round_rate_special(struct clk_hw *hw, unsigned long rate, + unsigned long *parent_rate) +{ + struct clk_hw *p_parent; + unsigned long p_rate, p_parent_rate; + + if (!rate || rate >= *parent_rate) + return *parent_rate; + + /* +* Get rate closer to *parent_rate to guarantee there is no overflow +* for m and n. In the result it will be the nearest rate left shifted +* by (scale - fd->nwidth) bits. +* fractional divider must set that denominator is 20 times larger than +* numerator to generate precise clock frequency. +*/ + p_rate = clk_hw_get_rate(clk_hw_get_parent(hw)); + if ((rate * 20 > p_rate) && (p_rate % rate != 0)) { + p_parent = clk_hw_get_parent(clk_hw_get_parent(hw)); + p_parent_rate = clk_hw_get_rate(p_parent); + *parent_rate = p_parent_rate; + } + return clk_fd_round_rate(hw, rate, parent_rate); +} + +const struct clk_ops clk_fractional_divider_special_ops = { + .recalc_rate = clk_fd_recalc_rate, + .round_rate = clk_fd_round_rate_special, + .set_rate = clk_fd_set_rate, +}; +EXPORT_SYMBOL_GPL(clk_fractional_divider_special_ops); + struct clk *clk_register_fractional_divider(struct device *dev, const char *name, const char *parent_name, unsigned long flags, void __iomem *reg, u8 mshift, u8 mwidth, u8 nshift, u8 nwidth, diff --git a/drivers/clk/rockchip/clk.c b/drivers/clk/rockchip/clk.c index fe1d393cf678..9eac9a579d18 100644 --- a/drivers/clk/rockchip/clk.c +++ b/drivers/clk/rockchip/clk.c @@ -210,7 +210,7 @@ static struct clk *rockchip_clk_register_frac_branch( div->nwidth = 16; div->nmask = GENMASK(div->nwidth - 1, 0) << div->nshift; div->lock = lock; - div_ops = &clk_fractional_divider_ops; + div_ops = &clk_fractional_divider_special_ops; clk = clk_register_composite(NULL, name, parent_names, num_parents, NULL, NULL, diff --git a/include/linux/clk-provider.h b/include/linux/clk-provider.h index a428aec36ace..3f5f36973df6 100644 --- a/include/linux/clk-provider.h +++ b/include/linux/clk-provider.h @@ -570,6 +570,7 @@ struct clk_fractional_divider { #define to_clk_fd(_hw) container_of(_hw, struct clk_fractional_divider, hw) extern const struct clk_ops clk_fractional_divider_ops; +extern const struct clk_ops clk_fractional_divider_special_ops; struct clk *clk_register_fractional_divider(struct device *dev, const char *name, const char *parent_name, unsigned long flags, void __iomem *reg, u8 mshift, u8 mwidth, u8 nshift, u8 nwidth, -- 1.9.1
[PATCH 3/3] rtmutex: remove unnecessary adjust prio
We don't need to adjust prio before new pi_waiter adding. The prio only need update after pi_waiter change or task priority change. Signed-off-by: Alex Shi Cc: Steven Rostedt Cc: Sebastian Siewior Cc: Mathieu Poirier Cc: Juri Lelli Cc: Thomas Gleixner To: linux-kernel@vger.kernel.org To: Ingo Molnar To: Peter Zijlstra --- kernel/locking/rtmutex.c | 1 - 1 file changed, 1 deletion(-) diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c index 28cd09e..d1fe41f 100644 --- a/kernel/locking/rtmutex.c +++ b/kernel/locking/rtmutex.c @@ -963,7 +963,6 @@ static int task_blocks_on_rt_mutex(struct rt_mutex *lock, return -EDEADLK; raw_spin_lock(&task->pi_lock); - rt_mutex_adjust_prio(task); waiter->task = task; waiter->lock = lock; waiter->prio = task->prio; -- 2.7.4
[PATCH v2 2/3] rtmutex: update rt-mutex
The rtmutex remove a pending owner bit in in rt_mutex::owner, in commit 8161239a8bcc ("rtmutex: Simplify PI algorithm and make highest prio task get lock") But the document was changed accordingly. Updating it to a meaningful state. BTW, as 'Steven Rostedt' mentioned: There is still technically a "Pending Owner", it's just not called that anymore. The pending owner happens to be the top_waiter of a lock that has no owner and has been woken up to grab the lock. Signed-off-by: Alex Shi Cc: Steven Rostedt Cc: Sebastian Siewior Cc: Mathieu Poirier Cc: Juri Lelli Cc: Thomas Gleixner To: linux-...@vger.kernel.org To: linux-kernel@vger.kernel.org To: Jonathan Corbet To: Ingo Molnar To: Peter Zijlstra --- Documentation/locking/rt-mutex.txt | 58 +- 1 file changed, 26 insertions(+), 32 deletions(-) diff --git a/Documentation/locking/rt-mutex.txt b/Documentation/locking/rt-mutex.txt index 243393d..35793e0 100644 --- a/Documentation/locking/rt-mutex.txt +++ b/Documentation/locking/rt-mutex.txt @@ -28,14 +28,13 @@ magic bullet for poorly designed applications, but it allows well-designed applications to use userspace locks in critical parts of an high priority thread, without losing determinism. -The enqueueing of the waiters into the rtmutex waiter list is done in +The enqueueing of the waiters into the rtmutex waiter tree is done in priority order. For same priorities FIFO order is chosen. For each rtmutex, only the top priority waiter is enqueued into the owner's -priority waiters list. This list too queues in priority order. Whenever +priority waiters tree. This tree too queues in priority order. Whenever the top priority waiter of a task changes (for example it timed out or -got a signal), the priority of the owner task is readjusted. [The -priority enqueueing is handled by "plists", see include/linux/plist.h -for more details.] +got a signal), the priority of the owner task is readjusted. The +priority enqueueing is handled by "pi_waiters". RT-mutexes are optimized for fastpath operations and have no internal locking overhead when locking an uncontended mutex or unlocking a mutex @@ -46,34 +45,29 @@ is used] The state of the rt-mutex is tracked via the owner field of the rt-mutex structure: -rt_mutex->owner holds the task_struct pointer of the owner. Bit 0 and 1 -are used to keep track of the "owner is pending" and "rtmutex has -waiters" state. +lock->owner holds the task_struct pointer of the owner. Bit 0 is used to +keep track of the "lock has waiters" state. - owner bit1bit0 - NULL 0 0 mutex is free (fast acquire possible) - NULL 0 1 invalid state - NULL 1 0 Transitional state* - NULL 1 1 invalid state - taskpointer 0 0 mutex is held (fast release possible) - taskpointer 0 1 task is pending owner - taskpointer 1 0 mutex is held and has waiters - taskpointer 1 1 task is pending owner and mutex has waiters + ownerbit0 + NULL 0 lock is free (fast acquire possible) + NULL 1 lock is free and has waiters and the top waiter + is going to take the lock* + taskpointer 0 lock is held (fast release possible) + taskpointer 1 lock is held and has waiters** -Pending-ownership handling is a performance optimization: -pending-ownership is assigned to the first (highest priority) waiter of -the mutex, when the mutex is released. The thread is woken up and once -it starts executing it can acquire the mutex. Until the mutex is taken -by it (bit 0 is cleared) a competing higher priority thread can "steal" -the mutex which puts the woken up thread back on the waiters list. +The fast atomic compare exchange based acquire and release is only +possible when bit 0 of lock->owner is 0. -The pending-ownership optimization is especially important for the -uninterrupted workflow of high-prio tasks which repeatedly -takes/releases locks that have lower-prio waiters. Without this -optimization the higher-prio thread would ping-pong to the lower-prio -task [because at unlock time we always assign a new owner]. +(*) It also can be a transitional state when grabbing the lock +with ->wait_lock is held. To prevent any fast path cmpxchg to the lock, +we need to set the bit0 before looking at the lock, and the owner may be +NULL in this small time, hence this can be a transitional state. -(*) The "mutex has waiters" bit gets set to take the lock. If the lock -doesn't already have an owner, this bit is quickly cleared if there are -no waiters. So this is a transitional state to synchronize with looking -at the owner field of the mutex and the mutex owner releasing the lock. +(**) There is a small time when bit 0 is set but there are no +waiters. This can happen when grabbing the lock in the slow path. +To prevent a cmpxchg of the owner releasing the lock
[PATCH v4 1/3] rtmutex: update rt-mutex-design
The rt-mutex-design documents didn't gotten meaningful update from its first version. Even after owner's pending bit was removed in commit 8161239a8bcc ("rtmutex: Simplify PI algorithm and make highest prio task get lock") and priority list 'plist' changed to rbtree. And Peter Zijlstra did some clean up and fix for deadline task changes on tip tree. So update it to latest code and make it meaningful. Steven Rostedt and Sebastian Siewior gave much of comments and input in this doc. Thanks! Signed-off-by: Alex Shi Cc: Steven Rostedt Cc: Sebastian Siewior Cc: Mathieu Poirier Cc: Juri Lelli Cc: Thomas Gleixner To: linux-...@vger.kernel.org To: linux-kernel@vger.kernel.org To: Jonathan Corbet To: Ingo Molnar To: Peter Zijlstra --- Documentation/locking/rt-mutex-design.txt | 432 -- 1 file changed, 105 insertions(+), 327 deletions(-) diff --git a/Documentation/locking/rt-mutex-design.txt b/Documentation/locking/rt-mutex-design.txt index 8666070..6c6e8c2 100644 --- a/Documentation/locking/rt-mutex-design.txt +++ b/Documentation/locking/rt-mutex-design.txt @@ -97,9 +97,9 @@ waiter - A waiter is a struct that is stored on the stack of a blocked a process being blocked on the mutex, it is fine to allocate the waiter on the process's stack (local variable). This structure holds a pointer to the task, as well as the mutex that - the task is blocked on. It also has the plist node structures to - place the task in the waiter_list of a mutex as well as the - pi_list of a mutex owner task (described below). + the task is blocked on. It also has rbtree node structures to + place the task in the waiters rbtree of a mutex as well as the + pi_waiters rbtree of a mutex owner task (described below). waiter is sometimes used in reference to the task that is waiting on a mutex. This is the same as waiter->task. @@ -179,53 +179,34 @@ again. | F->L5-+ +If process G has the highest priority in the chain, then all the tasks up +the chain (A and B in this example), must have their priorities increased +to that of G. -Plist -- - -Before I go further and talk about how the PI chain is stored through lists -on both mutexes and processes, I'll explain the plist. This is similar to -the struct list_head functionality that is already in the kernel. -The implementation of plist is out of scope for this document, but it is -very important to understand what it does. - -There are a few differences between plist and list, the most important one -being that plist is a priority sorted linked list. This means that the -priorities of the plist are sorted, such that it takes O(1) to retrieve the -highest priority item in the list. Obviously this is useful to store processes -based on their priorities. - -Another difference, which is important for implementation, is that, unlike -list, the head of the list is a different element than the nodes of a list. -So the head of the list is declared as struct plist_head and nodes that will -be added to the list are declared as struct plist_node. - - -Mutex Waiter List +Mutex Waiters Tree - -Every mutex keeps track of all the waiters that are blocked on itself. The mutex -has a plist to store these waiters by priority. This list is protected by -a spin lock that is located in the struct of the mutex. This lock is called -wait_lock. Since the modification of the waiter list is never done in -interrupt context, the wait_lock can be taken without disabling interrupts. +Every mutex keeps track of all the waiters that are blocked on itself. The +mutex has a rbtree to store these waiters by priority. This tree is protected +by a spin lock that is located in the struct of the mutex. This lock is called +wait_lock. -Task PI List +Task PI Tree -To keep track of the PI chains, each process has its own PI list. This is -a list of all top waiters of the mutexes that are owned by the process. -Note that this list only holds the top waiters and not all waiters that are +To keep track of the PI chains, each process has its own PI rbtree. This is +a tree of all top waiters of the mutexes that are owned by the process. +Note that this tree only holds the top waiters and not all waiters that are blocked on mutexes owned by the process. -The top of the task's PI list is always the highest priority task that +The top of the task's PI tree is always the highest priority task that is waiting on a mutex that is owned by the task. So if the task has inherited a priority, it will always be the priority of the task that is -at the top of this list. +at the top of this tree. -This list is stored in the task structure of a process as a plist called -pi_list. This list is protected by a spin lock also in the task structure, +This tree is stored in the task struct
Re: [PATCH v3 1/3] rtmutex: update rt-mutex-design
On 07/06/2017 09:25 PM, Steven Rostedt wrote: > > This looks fine. > Thanks a lot! >> >> >> >> -Reviewers: Ingo Molnar, Thomas Gleixner, Thomas Duetsch, and Randy Dunlap >> +Reviewers: Ingo Molnar, Thomas Gleixner, Thomas Duetsch, Randy Dunlap >> + and Sebastian Siewior > > Since the updated was not reviewed by all of the above, you should > change this to: > > Original Reviewers: Ingo Molnar, Thomas Gleixner, Thomas Deutsch, and > Randy Dunlap > > Update (7/6/2017) Reviewers: Steven Rostedt and Sebastian Siewior > This looks better. Thanks! :) I will send out the new patch ASAP. > Did Randy make comments? No. Regards Alex
Re: [x86/time] 03fa63cc96: ACPI_Error:Table[DMAR]is_not_invalidated_during_early_boot_stage(#/tbxface -#)
Hi xiaolong, Really thanks for your testing. At 07/07/2017 09:54 AM, Ye Xiaolong wrote: On 07/06, Thomas Gleixner wrote: On Thu, 6 Jul 2017, kernel test robot wrote: commit: 03fa63cc96ab35592e0a7d522b8edbc1e6b02d22 ("x86/time: Initialize interrupt mode behind timer init") ++++ || 43436935b7 | 03fa63cc96 | ++++ | boot_successes | 0 | 4 | ++++ So 03fa63cc96 makes the box boot again. I'm confused as usual by the output of this tool., kern :info : [0.005000] tsc: Fast TSC calibration using PIT kern :info : [0.006000] tsc: Detected 2195.020 MHz processor kern :info : [0.007000] Calibrating delay loop (skipped), value calculated using timer frequency.. 4390.04 BogoMIPS (lpj=2195020) kern :info : [0.008001] pid_max: default: 90112 minimum: 704 kern :info : [0.009037] ACPI: Core revision 20170303 kern :err : [0.010002] ACPI Error: Table [DMAR] is not invalidated during early boot stage (20170303/tbxface-193) Sure we have a error message here, but compared to what? Compared to something which does not boot at all? Sorry for the confusion, here commit 43436935b7 boot failed due to OOM which happened at the late stage of kernel boot while the ACPI error showed at the early boot stage for commit 03fa63cc96 and it didn't appear in 43436935b7's dmesg. let's make the problem clearly firstly: 1) Commit 43436935b7 ("x86/xen: Bypass intr mode setup in enlighten_pv system") made kernel boot failed, which caused by OOM. 2) Commit 03fa63cc96 ("x86/time: Initialize interrupt mode behind timer init") can make the kernel boot success again, but with an ACPI error happened. And both *1* and *2* used the same configuration showed in the attachment. Does anything I missed? Thanks, dou. Thanks, Xiaolong Thanks, tglx
Re: [PATCH 0/5 v4] Fix sp5100_tco watchdog driver regression
(Reposting without html subpart, second attempt) On Thu, Jul 6, 2017 at 11:09 PM, Marcelo "Marc" Ranolfi wrote: Hi Zoltán and all, I _was_ testing the patch series (the latest version), system worked OK for two days, but then I got a system crash (probably hardware related) and lost my files in the main partition. Posted about it in the F2FS mailing list (https://sourceforge.net/p/linux-f2fs/mailman/message/35928135/). While I don't believe that the freeze/crash was caused by the watchdog driver (many other things to blame: amdgpu driver on Southern Islands, CPU voltage lower than recommended...), it's worth mentioning that the computer was completely frozen for a little over 4 minutes, and did not reset as I'd expect the watchdog to. Just mentioning because I'm not sure of the implications. Once I get my system back online this weekend I'll have a good look at the code. Anyway, thanks again for the patches and I'm also expecting someone more qualified (than me, at least) to review them. Regards Marcelo "Marc" Ranolfi
Re: 答复: [f2fs-dev] [PATCH] f2fs: relax permission for atomic/volatile ioctls
-邮件原件- 发件人: Jaegeuk Kim [mailto:jaeg...@kernel.org] 发送时间: 2017年7月7日 10:12 收件人: gaoxiang (P) 抄送: Chao Yu; linux-fsde...@vger.kernel.org; linux-kernel@vger.kernel.org; linux-f2fs-de...@lists.sourceforge.net 主题: Re: 答复: [f2fs-dev] [PATCH] f2fs: relax permission for atomic/volatile ioctls On 07/07, gaoxiang (P) wrote: > Hi, > I think Sdcardfs should override task_struct cred fsuid/fsgid before calling > the underlay fs operation. > And it seems sdcardfs implementations misses override cred fsuid/fsgid before > the ioctl operation. Oh, good catch! Thank you. ;) ;) We are fixing it internally. And the AOSP sdcardfs patch would be https://android-review.googlesource.com/#/c/431259/ It is none of f2fs business. :) :) Thanks. > > Thanks. > > -邮件原件- > 发件人: Chao Yu [mailto:c...@kernel.org] > 发送时间: 2017年7月7日 8:58 > 收件人: Jaegeuk Kim > 抄送: linux-fsde...@vger.kernel.org; linux-kernel@vger.kernel.org; > linux-f2fs-de...@lists.sourceforge.net > 主题: Re: [f2fs-dev] [PATCH] f2fs: relax permission for atomic/volatile > ioctls > > On 2017/7/7 8:16, Jaegeuk Kim wrote: > > On 07/07, Chao Yu wrote: > >> On 2017/7/6 10:23, Jaegeuk Kim wrote: > >>> This patch allows atomic/volatile ioctls for sqlite under sdcardfs. > >> > >> Out of curiosity, we will lose some capable when passing through sdcardfs? > > > > I don't think so. But, it seems a test applicaion tries to access > > database from difference uid. > > Oh, is that really allowed? if the sqlite database is public in sdcard > directory, application needs to apply WRITE_EXTERNAL_STORAGE in order > to add itself to sdcard_rw group, then it can access the database. Right? > > Thanks, > > > > > Thanks, > > > >> > >> Thanks, > >> > >>> > >>> Signed-off-by: Jaegeuk Kim > --- > >>> fs/f2fs/file.c | 15 --- > >>> 1 file changed, 15 deletions(-) > >>> > >>> diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c index > >>> f5d6357e8360..dd8f5d2caa48 100644 > >>> --- a/fs/f2fs/file.c > >>> +++ b/fs/f2fs/file.c > >>> @@ -1587,9 +1587,6 @@ static int f2fs_ioc_start_atomic_write(struct file > >>> *filp) > >>> struct inode *inode = file_inode(filp); > >>> int ret; > >>> > >>> - if (!inode_owner_or_capable(inode)) > >>> - return -EACCES; > >>> - > >>> if (!S_ISREG(inode->i_mode)) > >>> return -EINVAL; > >>> > >>> @@ -1636,9 +1633,6 @@ static int f2fs_ioc_commit_atomic_write(struct file > >>> *filp) > >>> struct inode *inode = file_inode(filp); > >>> int ret; > >>> > >>> - if (!inode_owner_or_capable(inode)) > >>> - return -EACCES; > >>> - > >>> ret = mnt_want_write_file(filp); > >>> if (ret) > >>> return ret; > >>> @@ -1672,9 +1666,6 @@ static int f2fs_ioc_start_volatile_write(struct > >>> file *filp) > >>> struct inode *inode = file_inode(filp); > >>> int ret; > >>> > >>> - if (!inode_owner_or_capable(inode)) > >>> - return -EACCES; > >>> - > >>> if (!S_ISREG(inode->i_mode)) > >>> return -EINVAL; > >>> > >>> @@ -1707,9 +1698,6 @@ static int f2fs_ioc_release_volatile_write(struct > >>> file *filp) > >>> struct inode *inode = file_inode(filp); > >>> int ret; > >>> > >>> - if (!inode_owner_or_capable(inode)) > >>> - return -EACCES; > >>> - > >>> ret = mnt_want_write_file(filp); > >>> if (ret) > >>> return ret; > >>> @@ -1736,9 +1724,6 @@ static int f2fs_ioc_abort_volatile_write(struct > >>> file *filp) > >>> struct inode *inode = file_inode(filp); > >>> int ret; > >>> > >>> - if (!inode_owner_or_capable(inode)) > >>> - return -EACCES; > >>> - > >>> ret = mnt_want_write_file(filp); > >>> if (ret) > >>> return ret; > >>> > > -- > Check out the vibrant tech community on one of the world's > most engaging tech sites, Slashdot.org! http://sdm.link/slashdot > ___ > Linux-f2fs-devel mailing list > linux-f2fs-de...@lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
Re: 答复: [f2fs-dev] [PATCH] f2fs: relax permission for atomic/volatile ioctls
On 07/07, gaoxiang (P) wrote: > Hi, > I think Sdcardfs should override task_struct cred fsuid/fsgid before calling > the underlay fs operation. > And it seems sdcardfs implementations misses override cred fsuid/fsgid before > the ioctl operation. Oh, good catch! Thank you. ;) > > Thanks. > > -邮件原件- > 发件人: Chao Yu [mailto:c...@kernel.org] > 发送时间: 2017年7月7日 8:58 > 收件人: Jaegeuk Kim > 抄送: linux-fsde...@vger.kernel.org; linux-kernel@vger.kernel.org; > linux-f2fs-de...@lists.sourceforge.net > 主题: Re: [f2fs-dev] [PATCH] f2fs: relax permission for atomic/volatile ioctls > > On 2017/7/7 8:16, Jaegeuk Kim wrote: > > On 07/07, Chao Yu wrote: > >> On 2017/7/6 10:23, Jaegeuk Kim wrote: > >>> This patch allows atomic/volatile ioctls for sqlite under sdcardfs. > >> > >> Out of curiosity, we will lose some capable when passing through sdcardfs? > > > > I don't think so. But, it seems a test applicaion tries to access database > > from > > difference uid. > > Oh, is that really allowed? if the sqlite database is public in sdcard > directory, > application needs to apply WRITE_EXTERNAL_STORAGE in order to add itself to > sdcard_rw group, then it can access the database. Right? > > Thanks, > > > > > Thanks, > > > >> > >> Thanks, > >> > >>> > >>> Signed-off-by: Jaegeuk Kim > --- > >>> fs/f2fs/file.c | 15 --- > >>> 1 file changed, 15 deletions(-) > >>> > >>> diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c > >>> index f5d6357e8360..dd8f5d2caa48 100644 > >>> --- a/fs/f2fs/file.c > >>> +++ b/fs/f2fs/file.c > >>> @@ -1587,9 +1587,6 @@ static int f2fs_ioc_start_atomic_write(struct file > >>> *filp) > >>> struct inode *inode = file_inode(filp); > >>> int ret; > >>> > >>> - if (!inode_owner_or_capable(inode)) > >>> - return -EACCES; > >>> - > >>> if (!S_ISREG(inode->i_mode)) > >>> return -EINVAL; > >>> > >>> @@ -1636,9 +1633,6 @@ static int f2fs_ioc_commit_atomic_write(struct file > >>> *filp) > >>> struct inode *inode = file_inode(filp); > >>> int ret; > >>> > >>> - if (!inode_owner_or_capable(inode)) > >>> - return -EACCES; > >>> - > >>> ret = mnt_want_write_file(filp); > >>> if (ret) > >>> return ret; > >>> @@ -1672,9 +1666,6 @@ static int f2fs_ioc_start_volatile_write(struct > >>> file *filp) > >>> struct inode *inode = file_inode(filp); > >>> int ret; > >>> > >>> - if (!inode_owner_or_capable(inode)) > >>> - return -EACCES; > >>> - > >>> if (!S_ISREG(inode->i_mode)) > >>> return -EINVAL; > >>> > >>> @@ -1707,9 +1698,6 @@ static int f2fs_ioc_release_volatile_write(struct > >>> file *filp) > >>> struct inode *inode = file_inode(filp); > >>> int ret; > >>> > >>> - if (!inode_owner_or_capable(inode)) > >>> - return -EACCES; > >>> - > >>> ret = mnt_want_write_file(filp); > >>> if (ret) > >>> return ret; > >>> @@ -1736,9 +1724,6 @@ static int f2fs_ioc_abort_volatile_write(struct > >>> file *filp) > >>> struct inode *inode = file_inode(filp); > >>> int ret; > >>> > >>> - if (!inode_owner_or_capable(inode)) > >>> - return -EACCES; > >>> - > >>> ret = mnt_want_write_file(filp); > >>> if (ret) > >>> return ret; > >>> > > -- > Check out the vibrant tech community on one of the world's most > engaging tech sites, Slashdot.org! http://sdm.link/slashdot > ___ > Linux-f2fs-devel mailing list > linux-f2fs-de...@lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
Re: [PATCH 2/9] RISC-V: Atomic and Locking Code
On Thu, Jul 06, 2017 at 06:04:13PM -0700, Palmer Dabbelt wrote: [...] > >> +#define __smp_load_acquire(p) > >> \ > >> +do { > >> \ > >> + union { typeof(*p) __val; char __c[1]; } __u = \ > >> + { .__val = (__force typeof(*p)) (v) }; \ > >> + compiletime_assert_atomic_type(*p); \ > >> + switch (sizeof(*p)) { \ > >> + case 1: \ > >> + case 2: \ > >> + __u.__val = READ_ONCE(*p); \ > >> + smb_mb(); \ > >> + break; \ > >> + case 4: \ > >> + __asm__ __volatile__ ( \ > >> + "amoor.w.aq %1, zero, %0" \ > >> + : "+A" (*p) \ > >> + : "=r" (__u.__val) \ > >> + : "memory");\ > >> + break; \ > >> + case 8: \ > >> + __asm__ __volatile__ ( \ > >> + "amoor.d.aq %1, zero, %0" \ > >> + : "+A" (*p) \ > >> + : "=r" (__u.__val) \ > >> + : "memory");\ > >> + break; \ > >> + } \ > >> + __u.__val; \ > >> +} while (0) > > > > 'creative' use of amoswap and amoor :-) > > > > You should really look at a normal load with ordering instruction > > though, that amoor.aq is a rmw and will promote the cacheline to > > exclusive (and dirty it). > > The thought here was that implementations could elide the MW by pattern > matching the "zero" (x0, the architectural zero register) forms of AMOs where > it's interesting. I talked to one of our microarchitecture guys, and while he > agrees that's easy he points out that eliding half the AMO may wreak havoc on > the consistency model. Since we're not sure what the memory model is actually > going to look like, we thought it'd be best to just write the simplest code > here > > /* >* TODO_RISCV_MEMORY_MODEL: While we could emit AMOs for the W and D sized >* accesses here, it's questionable if that actually helps or not: the lack > of >* offsets in the AMOs means they're usually preceded by an addi, so they >* probably won't save code space. For now we'll just emit the fence. >*/ > #define __smp_store_release(p, v) \ > ({ \ > compiletime_assert_atomic_type(*p); \ > smp_mb(); \ > WRITE_ONCE(*p, v); \ > }) > > #define __smp_load_acquire(p) \ > ({ \ > union{typeof(*p) __p; long __l;} __u; \ AFAICT, there seems to be an endian issue if you do this. No? Let us assume typeof(*p) is char and *p == 1, and on a big endian 32bit platform: > compiletime_assert_atomic_type(*p); \ > __u.__l = READ_ONCE(*p);\ READ_ONCE(*p) is 1 so __u.__l is 0x00 00 00 01 now > smp_mb(); \ > __u.__p;\ __u.__p is then 0x00. Am I missing something here? Even so why not use the simple definition as in include/asm-generic/barrier.h? Regards, Boqun > }) > [...] signature.asc Description: PGP signature
[PATCH v1] i2c: mediatek: send i2c master code at 400k
From: Jun Gao The speed of sending i2c master code in high-speed mode depends on source clock, clock-div and TIMING register. The source clock and clock-div of different SoC are not all the same. In order to send i2c master code at 400k in high-speed mode, a appropriate value should be set to TIMING register for a certain source clock and clock-div. Signed-off-by: Jun Gao --- drivers/i2c/busses/i2c-mt65xx.c | 65 + 1 file changed, 46 insertions(+), 19 deletions(-) diff --git a/drivers/i2c/busses/i2c-mt65xx.c b/drivers/i2c/busses/i2c-mt65xx.c index 45d6171..9bedf0b 100644 --- a/drivers/i2c/busses/i2c-mt65xx.c +++ b/drivers/i2c/busses/i2c-mt65xx.c @@ -50,7 +50,6 @@ #define I2C_FS_START_CON 0x1800 #define I2C_TIME_CLR_VALUE 0x #define I2C_TIME_DEFAULT_VALUE 0x0003 -#define I2C_FS_TIME_INIT_VALUE 0x1303 #define I2C_WRRD_TRANAC_VALUE 0x0002 #define I2C_RD_TRANAC_VALUE0x0001 @@ -154,6 +153,7 @@ struct mtk_i2c { bool use_push_pull; /* IO config push-pull mode */ u16 irq_stat; /* interrupt status */ + unsigned int clk_src_div; unsigned int speed_hz; /* The speed in transfer */ enum mtk_trans_op op; u16 timing_reg; @@ -285,23 +285,20 @@ static void mtk_i2c_init_hw(struct mtk_i2c *i2c) * less than or equal to i2c->speed_hz. The calculation try to get * sample_cnt and step_cn */ -static int mtk_i2c_set_speed(struct mtk_i2c *i2c, unsigned int parent_clk, -unsigned int clock_div) +static int mtk_i2c_calculate_speed(struct mtk_i2c *i2c, unsigned int clk_src, + unsigned int target_speed, + unsigned int *timing_step_cnt, + unsigned int *timing_sample_cnt) { - unsigned int clk_src; unsigned int step_cnt; unsigned int sample_cnt; unsigned int max_step_cnt; - unsigned int target_speed; unsigned int base_sample_cnt = MAX_SAMPLE_CNT_DIV; unsigned int base_step_cnt; unsigned int opt_div; unsigned int best_mul; unsigned int cnt_mul; - clk_src = parent_clk / clock_div; - target_speed = i2c->speed_hz; - if (target_speed > MAX_HS_MODE_SPEED) target_speed = MAX_HS_MODE_SPEED; @@ -347,16 +344,48 @@ static int mtk_i2c_set_speed(struct mtk_i2c *i2c, unsigned int parent_clk, return -EINVAL; } - step_cnt--; - sample_cnt--; + *timing_step_cnt = step_cnt - 1; + *timing_sample_cnt = sample_cnt - 1; + + return 0; +} + +static int mtk_i2c_set_speed(struct mtk_i2c *i2c, unsigned int parent_clk) +{ + unsigned int clk_src; + unsigned int step_cnt; + unsigned int sample_cnt; + unsigned int target_speed; + int ret; + + clk_src = parent_clk / i2c->clk_src_div; + target_speed = i2c->speed_hz; if (target_speed > MAX_FS_MODE_SPEED) { + /* Set master code speed register */ + ret = mtk_i2c_calculate_speed(i2c, clk_src, MAX_FS_MODE_SPEED, + &step_cnt, &sample_cnt); + if (ret < 0) + return ret; + + i2c->timing_reg = (sample_cnt << 8) | step_cnt; + /* Set the high speed mode register */ - i2c->timing_reg = I2C_FS_TIME_INIT_VALUE; + ret = mtk_i2c_calculate_speed(i2c, clk_src, target_speed, + &step_cnt, &sample_cnt); + if (ret < 0) + return ret; + i2c->high_speed_reg = I2C_TIME_DEFAULT_VALUE | (sample_cnt << 12) | (step_cnt << 8); } else { - i2c->timing_reg = (sample_cnt << 8) | (step_cnt << 0); + ret = mtk_i2c_calculate_speed(i2c, clk_src, target_speed, + &step_cnt, &sample_cnt); + if (ret < 0) + return ret; + + i2c->timing_reg = (sample_cnt << 8) | step_cnt; + /* Disable the high speed transaction */ i2c->high_speed_reg = I2C_TIME_CLR_VALUE; } @@ -647,8 +676,7 @@ static u32 mtk_i2c_functionality(struct i2c_adapter *adap) .functionality = mtk_i2c_functionality, }; -static int mtk_i2c_parse_dt(struct device_node *np, struct mtk_i2c *i2c, - unsigned int *clk_src_div) +static int mtk_i2c_parse_dt(struct device_node *np, struct mtk_i2c *i2c) { int ret; @@ -656,11 +684,11 @@ static int mtk_i2c_parse_dt(struct device_node *np, struct mtk_i2c *i2c, if (ret < 0) i2c->speed_hz = I2C_DEFAULT_SPEED; - ret = of_property_read_u32(np, "clock-div", clk_src_div); + ret = of_property_re
[lkp-robot] [sched/cputime] 2a42eb9594: BUG:using_smp_processor_id()in_preemptible
FYI, we noticed the following commit: commit: 2a42eb9594a1480b4ead9e036e06ee1290e5fa6d ("sched/cputime: Accumulate vtime on top of nsec clocksource") https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git sched/urgent in testcase: boot on test machine: qemu-system-x86_64 -enable-kvm -cpu Haswell,+smep,+smap -smp 2 -m 512M caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace): ++++ || bac5b6b6b1 | 2a42eb9594 | ++++ | boot_successes | 8 | 2 | | boot_failures | 0 | 6 | | BUG:kernel_hang_in_test_stage | 0 | 2 | | BUG:using_smp_processor_id()in_preemptible | 0 | 4 | ++++ [ 55.289944] BUG: using smp_processor_id() in preemptible [] code: 99-trinity/181 [ 55.302094] caller is debug_smp_processor_id+0x17/0x19 [ 55.313831] CPU: 0 PID: 181 Comm: 99-trinity Not tainted 4.12.0-01059-g2a42eb9 #1 [ 55.324812] Call Trace: [ 55.328601] dump_stack+0x82/0xb8 [ 55.333467] check_preemption_disabled+0xd1/0xe3 [ 55.340370] debug_smp_processor_id+0x17/0x19 [ 55.346815] vtime_delta+0xd/0x2c [ 55.352843] task_cputime+0x89/0xdb [ 55.358026] thread_group_cputime+0x11b/0x1ed [ 55.364517] thread_group_cputime_adjusted+0x1f/0x47 [ 55.371895] wait_consider_task+0x2a9/0xaf9 [ 55.378025] ? lock_acquire+0x97/0xa4 [ 55.383463] do_wait+0xdf/0x1f4 [ 55.388199] SYSC_wait4+0x8e/0xb5 [ 55.393219] ? list_add+0x34/0x34 [ 55.398317] SyS_wait4+0x9/0xb [ 55.403401] do_syscall_64+0x70/0x82 [ 55.408977] entry_SYSCALL64_slow_path+0x25/0x25 [ 55.409006] RIP: 0033:0x7ff2f15c3c3e [ 55.409032] RSP: 002b:7ffd3e91eab0 EFLAGS: 0246 ORIG_RAX: 003d [ 55.409064] RAX: ffda RBX: 7ff2f1f0b6a0 RCX: 7ff2f15c3c3e [ 55.409070] RDX: 0001 RSI: 7ffd3e91eb18 RDI: [ 55.409075] RBP: 0001 R08: R09: [ 55.409101] R10: R11: 0246 R12: [ 55.409106] R13: R14: 0001 R15: [ 55.550819] random: trinity: uninitialized urandom read (4 bytes read) mountall: Event failed [ 55.947396] random: mountall: uninitialized urandom read (12 bytes read) [ 56.044974] BUG: using smp_processor_id() in preemptible [] code: mountall/194 [ 56.057287] caller is debug_smp_processor_id+0x17/0x19 [ 56.066939] CPU: 1 PID: 194 Comm: mountall Not tainted 4.12.0-01059-g2a42eb9 #1 [ 56.078248] Call Trace: [ 56.082297] dump_stack+0x82/0xb8 [ 56.087256] check_preemption_disabled+0xd1/0xe3 [ 56.093512] debug_smp_processor_id+0x17/0x19 [ 56.099800] vtime_delta+0xd/0x2c [ 56.105862] task_cputime+0x89/0xdb [ 56.111237] thread_group_cputime+0x11b/0x1ed [ 56.117693] thread_group_cputime_adjusted+0x1f/0x47 [ 56.125298] wait_consider_task+0x2a9/0xaf9 [ 56.131983] ? do_raw_read_lock+0x39/0x3c [ 56.138293] do_wait+0xdf/0x1f4 [ 56.143000] SyS_waitid+0xac/0x1e3 [ 56.148566] ? list_add+0x34/0x34 [ 56.153694] do_syscall_64+0x70/0x82 [ 56.159267] entry_SYSCALL64_slow_path+0x25/0x25 [ 56.166324] RIP: 0033:0x7fdc9c642d2f [ 56.171770] RSP: 002b:7ffe6f4dda30 EFLAGS: 0246 ORIG_RAX: 00f7 [ 56.181888] RAX: ffda RBX: 560f25433d70 RCX: 7fdc9c642d2f [ 56.192353] RDX: 7ffe6f4dda78 RSI: 00c7 RDI: 0001 [ 56.203076] RBP: 560f2543bf20 R08: R09: 0020 [ 56.212903] R10: 0004 R11: 0246 R12: 560f25439be0 [ 56.223108] R13: 00c7 R14: 00c7 R15: 560f25222c1f [ 56.522906] init: Failed to create pty - disabling logging for job [ 56.535738] init: Temporary process spawn error: No such file or directory [ 56.769705] init: Failed to create pty - disabling logging for job [ 56.781761] init: Temporary process spawn error: No such file or directory [ 57.258638] init: Failed to create pty - disabling logging for job [ 57.271867] init: Temporary process spawn error: No such file or directory [ 57.318928] init: Failed to create pty - disabling logging for job [ 57.340639] init: Temporary process spawn error: No such file or directory [ 57.499535] BUG: using smp_processor_id() in preemptible [] code: sh/217 [ 57.529507] caller is debug_smp_processor_id+0x17/0x19 [ 57.537158] CPU: 1 PID: 217 Comm: sh Not tainted 4.12.0-01059-g2a42eb9 #1 [ 57.547065] Call Trace: [ 57.550929] dump_stack+0x82/0xb8 [ 57.555948] check_preemption_disabled+0xd1/0xe3 [ 57.562849] debug_smp_processor_id+0x17/0x1
Re: [PATCH] f2fs: relax permission for atomic/volatile ioctls
On Wed, Jul 05, 2017 at 07:23:26PM -0700, Jaegeuk Kim wrote: > This patch allows atomic/volatile ioctls for sqlite under sdcardfs. > > Signed-off-by: Jaegeuk Kim > --- > fs/f2fs/file.c | 15 --- > 1 file changed, 15 deletions(-) > > diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c > index f5d6357e8360..dd8f5d2caa48 100644 > --- a/fs/f2fs/file.c > +++ b/fs/f2fs/file.c > @@ -1587,9 +1587,6 @@ static int f2fs_ioc_start_atomic_write(struct file > *filp) > struct inode *inode = file_inode(filp); > int ret; > > - if (!inode_owner_or_capable(inode)) > - return -EACCES; > - Why not require that the file be open for writing (FMODE_WRITE)?
Re: [PATCH 1/3] drm: Widen vblank count to 64 bits. Change vblank time precision to ns
On 07/07/17 10:34 AM, Michel Dänzer wrote: > On 07/07/17 12:04 AM, Keith Packard wrote: >> Michel Dänzer writes: >> @@ -317,6 +317,9 @@ int via_driver_irq_postinstall(struct drm_device *dev) if (!dev_priv) return -EINVAL; + if (dev->driver->get_vblank_counter) + dev->max_vblank_count = 0x; >>> >>> What's the purpose of this? All drivers providing get_vblank_counter >>> should already initialize max_vblank_count correctly. >> >> Yeah, I couldn't prove that this driver did that, > > Which driver? > >> and as Daniel says, we haven't ever audited the drivers to make sure >> they do. > > I don't think that's what he meant, rather that with the change above, > all drivers have to be audited to make sure the added assignment doesn't > clobber an earlier assignment by the driver. ... and if there are any drivers that set dev->driver->get_vblank_counter but don't set dev->max_vblank_count to a non-0 value, that the hardware counter actually has 32 bits. I'd say don't bother, just drop this hunk. -- Earthling Michel Dänzer | http://www.amd.com Libre software enthusiast | Mesa and X developer signature.asc Description: OpenPGP digital signature
Re: [PATCH] f2fs: no need to create issue_discard_thread if it exists
It's fine unless create_discard_cmd_control is used in remount flow in the future. On 2017/7/6 21:16, Chao Yu wrote: > Hi Yunlong, > > It looks there is no way to create discard thread redundantly, > so here we don't need to check this? > > Thanks, > > On 2017/7/6 19:05, Yunlong Song wrote: >> Signed-off-by: Yunlong Song >> --- >> fs/f2fs/segment.c | 2 ++ >> 1 file changed, 2 insertions(+) >> >> diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c >> index 4c246e3..b48d004 100644 >> --- a/fs/f2fs/segment.c >> +++ b/fs/f2fs/segment.c >> @@ -1417,6 +1417,8 @@ static int create_discard_cmd_control(struct >> f2fs_sb_info *sbi) >> >> if (SM_I(sbi)->dcc_info) { >> dcc = SM_I(sbi)->dcc_info; >> +if (dcc->f2fs_issue_discard) >> +return err; >> goto init_thread; >> } >> >> > . > -- Thanks, Yunlong Song