date:20190423

Re: [PATCH v2 4/5] OPP: Update the bandwidth on OPP frequency changes

2019-04-23 Thread Viresh Kumar

On 23-04-19, 16:28, Georgi Djakov wrote:
> If the OPP bandwidth values are populated, we want to switch also the
> interconnect bandwidth in addition to frequency and voltage.
> 
> Signed-off-by: Georgi Djakov 
> ---
>  drivers/opp/core.c | 9 -
>  1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/opp/core.c b/drivers/opp/core.c
> index 97ee39ecdebd..91d1c2abfb3e 100644
> --- a/drivers/opp/core.c
> +++ b/drivers/opp/core.c
> @@ -707,7 +707,7 @@ int dev_pm_opp_set_rate(struct device *dev, unsigned long target_freq)
>   unsigned long freq, old_freq;
>   struct dev_pm_opp *old_opp, *opp;
>   struct clk *clk;
> - int ret;
> + int ret, i;
>  
>   if (unlikely(!target_freq)) {
>   dev_err(dev, "%s: Invalid target frequency %lu\n", __func__,
> @@ -780,6 +780,13 @@ int dev_pm_opp_set_rate(struct device *dev, unsigned long target_freq)
>   ret = _generic_set_opp_clk_only(dev, clk, freq);
>   }
>  
> + if (!ret && !IS_ERR_OR_NULL(opp_table->paths)) {

Can paths ever have an error value? I believe checking only for NULL
is sufficient.

> + for (i = 0; i < opp_table->path_count; i++) {
> + icc_set_bw(opp_table->paths[i], opp->bandwidth[i].avg,
> +opp->bandwidth[i].peak);
> + }
> + }
> +

I will set the path after required_opps are set.

>   /* Scaling down? Configure required OPPs after frequency */
>   if (!ret && freq < old_freq) {
>   ret = _set_required_opps(dev, opp_table, opp);

-- 
viresh
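
The two review comments above can be illustrated with a small user-space
sketch. It assumes that paths is either NULL or a fully populated array, so
a plain NULL check is enough. The struct layouts and the icc_set_bw() stub
below are simplified stand-ins for illustration, not the kernel definitions,
and opp_set_bw() is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel types; not the real definitions. */
struct icc_path { unsigned int avg_bw, peak_bw; };
struct opp_bw { unsigned int avg, peak; };

struct opp_table {
	struct icc_path **paths;	/* NULL when no interconnect paths exist */
	unsigned int path_count;
};

/* Stub modelled on icc_set_bw(): record the request and report success. */
static int icc_set_bw(struct icc_path *path, unsigned int avg, unsigned int peak)
{
	path->avg_bw = avg;
	path->peak_bw = peak;
	return 0;
}

/*
 * Apply the per-path bandwidth of one OPP. A NULL check suffices under the
 * assumption that paths never holds an ERR_PTR() value. Unlike the quoted
 * hunk, this sketch propagates the return value of icc_set_bw().
 */
static int opp_set_bw(struct opp_table *table, const struct opp_bw *bw)
{
	unsigned int i;
	int ret;

	if (!table->paths)
		return 0;

	for (i = 0; i < table->path_count; i++) {
		ret = icc_set_bw(table->paths[i], bw[i].avg, bw[i].peak);
		if (ret)
			return ret;
	}
	return 0;
}
```

In dev_pm_opp_set_rate() a call like this would sit after the required OPPs
have been configured, which is the ordering mentioned in the reply.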

Re: [PATCH 1/7] thermal/drivers/core: Remove the module Kconfig's option

2019-04-23 Thread Amit Kucheria

On Tue, Apr 23, 2019 at 9:22 PM Eduardo Valentin  wrote:
>
> Hello,
>
> On Tue, Apr 02, 2019 at 06:12:44PM +0200, Daniel Lezcano wrote:
> > The module support for the thermal subsystem makes little sense:
> >  - some subsystems relying on it are not modules, thus forcing the
> >    framework to be compiled in
> >  - it is compiled in for almost every config; the remaining ones
> >    are a few platforms where I don't see why we can not switch the thermal
> >    to 'y'. The drivers can stay in tristate.
> >  - platforms need the thermal to be ready as soon as possible at boot time
> >    in order to mitigate
> >
> > Usually the subsystem frameworks are compiled in and the plug-ins are built
> > as modules.
> >
> > Remove the module option. The removal of the module related dead code will
> > come after this patch gets in or is acked.
>
>
> I remember some bugzilla entry around this some time back.
>
> Rui, do you remember why you made this a module?
>
> I don't have a strong opinion here, but I would like to see
> a better description of why we are going in this direction rather
> than "most people don't use it as a module". Was there any
> specific technical motivation?

Speaking for Qualcomm platforms, we want the thermal subsystem
available as soon as possible for boot-time thermal mitigation, since
faster boot times mean hotter CPUs. Also, the dependency on the cpufreq
subsystem due to the cpufreq cooling device would be simplified with
this.

In fact, I now have a follow-on patch to move thermal init earlier
than fs_initcall, since we would no longer wait on modules to be available.

/Amit

Re: [RFC][PATCH 0/2] Access console drivers list under console_sem

2019-04-23 Thread Sergey Senozhatsky

On (04/23/19 09:48), Steven Rostedt wrote:
> > RFC
> > 
> > Normally, we grab the console_sem lock before we iterate the consoles
> > list, which is necessary if we want to be race free. The only exception
> > to this rule is console_flush_on_panic(). However, it seems that we are
> > not fully race free - register_console() iterates the console drivers list
> > in an unsafe manner in several places. E.g. the following scenario:
> > 
> > CPU0                        CPU1
> > register_console()          unregister_console()
> >                              console_lock()
> >  for_each_console()          // modify console_drivers
> >   con->foo                   kfree(con)
> > 
> > So I have two quick-n-dirty patches, which remove unsafe console list
> > access.
> > 
> > What do you think?
> 
> I just skimmed the patches and haven't done a thorough review, but the
> concept seems sane to me.

Thank you Steven. Let me know if anything doesn't work for you.

I have vague memories of a kernel Oops at con->foo dereferencing
(saw a message on linux-rt-users list a while ago, if I'm not
mistaken). And it seems that we have some races in printk code.

-ss
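
The locking rule discussed in this thread can be modelled in user space. The
sketch below is single-threaded, so it does not reproduce the race itself. It
only shows the invariant the patches aim for: every walk of the list happens
with the lock held, and an entry is freed only after it has been unlinked
under that lock. All names ending in _model, the console_sem_held flag, and
count_consoles_locked() are hypothetical and do not exist in printk.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct console {
	int id;
	struct console *next;
};

static struct console *console_drivers;	/* list head */
static int console_sem_held;			/* models ownership of console_sem */

static void console_lock(void)
{
	assert(!console_sem_held);
	console_sem_held = 1;
}

static void console_unlock(void)
{
	assert(console_sem_held);
	console_sem_held = 0;
}

/* Walk the list; only legal while the lock is held. */
static int count_consoles_locked(void)
{
	struct console *c;
	int n = 0;

	assert(console_sem_held);
	for (c = console_drivers; c; c = c->next)
		n++;
	return n;
}

static int register_console_model(int id)
{
	struct console *con = malloc(sizeof(*con));

	if (!con)
		return -1;
	con->id = id;

	console_lock();		/* list is modified only under the lock */
	con->next = console_drivers;
	console_drivers = con;
	console_unlock();
	return 0;
}

static int unregister_console_model(int id)
{
	struct console **pp, *victim = NULL;
	int found;

	console_lock();
	for (pp = &console_drivers; *pp; pp = &(*pp)->next) {
		if ((*pp)->id == id) {
			victim = *pp;
			*pp = victim->next;	/* unlink under the lock */
			break;
		}
	}
	console_unlock();

	found = victim != NULL;
	free(victim);		/* no longer reachable from the list */
	return found ? 0 : -1;
}
```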

Re: [PATCH 2/2] x86/pci: Clean up usage of X86_DEV_DMA_OPS

2019-04-23 Thread Christoph Hellwig

Is anyone going to pick this patch up?

[PATCH V5 15/16] PCI: tegra: Add Tegra194 PCIe support

2019-04-23 Thread Vidya Sagar

Add support for Synopsys DesignWare core IP based PCIe host controller
present in Tegra194 SoC.

Signed-off-by: Vidya Sagar 
---
Changes since [v4]:
* None

Changes since [v3]:
* None

Changes since [v2]:
* Changed 'nvidia,init-speed' to 'nvidia,init-link-speed'
* Changed 'nvidia,pex-wake' to 'nvidia,wake-gpios'
* Removed .runtime_suspend() & .runtime_resume() implementations

Changes since [v1]:
* Made CONFIG_PCIE_TEGRA194 as 'm' by default from its previous 'y' state
* Modified code as per changes made to DT documentation
* Refactored code to address Bjorn & Thierry's review comments
* Added goto to avoid recursion in tegra_pcie_dw_host_init() API
* Merged .scan_bus() of dw_pcie_host_ops implementation to
  tegra_pcie_dw_host_init() API

 drivers/pci/controller/dwc/Kconfig |   11 +
 drivers/pci/controller/dwc/Makefile|1 +
 drivers/pci/controller/dwc/pcie-tegra194.c | 1760 
 3 files changed, 1772 insertions(+)
 create mode 100644 drivers/pci/controller/dwc/pcie-tegra194.c

diff --git a/drivers/pci/controller/dwc/Kconfig 
b/drivers/pci/controller/dwc/Kconfig
index b450ad2823a5..f9992b6c5bf7 100644
--- a/drivers/pci/controller/dwc/Kconfig
+++ b/drivers/pci/controller/dwc/Kconfig
@@ -232,4 +232,15 @@ config PCIE_UNIPHIER
  Say Y here if you want PCIe controller support on UniPhier SoCs.
  This driver supports LD20 and PXs3 SoCs.
 
+config PCIE_TEGRA194
+   tristate "NVIDIA Tegra (T194) PCIe controller"
+   depends on TEGRA_BPMP && (ARCH_TEGRA || COMPILE_TEST)
+   depends on PCI_MSI_IRQ_DOMAIN
+   select PCIE_DW_HOST
+   select PHY_TEGRA194_PCIE_P2U
+   default m
+   help
+ Say Y here if you want support for DesignWare core based PCIe host
+ controller found in NVIDIA Tegra T194 SoC.
+
 endmenu
diff --git a/drivers/pci/controller/dwc/Makefile 
b/drivers/pci/controller/dwc/Makefile
index b5f3b83cc2b3..4362f0ea89ac 100644
--- a/drivers/pci/controller/dwc/Makefile
+++ b/drivers/pci/controller/dwc/Makefile
@@ -16,6 +16,7 @@ obj-$(CONFIG_PCIE_KIRIN) += pcie-kirin.o
 obj-$(CONFIG_PCIE_HISI_STB) += pcie-histb.o
 obj-$(CONFIG_PCI_MESON) += pci-meson.o
 obj-$(CONFIG_PCIE_UNIPHIER) += pcie-uniphier.o
+obj-$(CONFIG_PCIE_TEGRA194) += pcie-tegra194.o
 
 # The following drivers are for devices that use the generic ACPI
 # pci_root.c driver but don't support standard ECAM config access.
diff --git a/drivers/pci/controller/dwc/pcie-tegra194.c 
b/drivers/pci/controller/dwc/pcie-tegra194.c
new file mode 100644
index ..937038faebe5
--- /dev/null
+++ b/drivers/pci/controller/dwc/pcie-tegra194.c
@@ -0,0 +1,1760 @@
+// SPDX-License-Identifier: GPL-2.0+
+/*
+ * PCIe host controller driver for Tegra T194 SoC
+ *
+ * Copyright (C) 2019 NVIDIA Corporation.
+ *
+ * Author: Vidya Sagar 
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include "pcie-designware.h"
+#include 
+#include 
+#include "../../pci.h"
+#include "../../pcie/portdrv.h"
+
+#define dw_pcie_to_tegra_pcie(x) container_of(x, struct tegra_pcie_dw, pci)
+
+#define CTRL_5 5
+
+#define APPL_PINMUX                                 0x0
+#define APPL_PINMUX_PEX_RST                         BIT(0)
+#define APPL_PINMUX_CLKREQ_OVERRIDE_EN              BIT(2)
+#define APPL_PINMUX_CLKREQ_OVERRIDE                 BIT(3)
+#define APPL_PINMUX_CLK_OUTPUT_IN_OVERRIDE_EN       BIT(4)
+#define APPL_PINMUX_CLK_OUTPUT_IN_OVERRIDE          BIT(5)
+#define APPL_PINMUX_CLKREQ_OUT_OVRD_EN              BIT(9)
+#define APPL_PINMUX_CLKREQ_OUT_OVRD                 BIT(10)
+
+#define APPL_CTRL                                   0x4
+#define APPL_CTRL_SYS_PRE_DET_STATE                 BIT(6)
+#define APPL_CTRL_LTSSM_EN                          BIT(7)
+#define APPL_CTRL_HW_HOT_RST_EN                     BIT(20)
+#define APPL_CTRL_HW_HOT_RST_MODE_MASK              GENMASK(1, 0)
+#define APPL_CTRL_HW_HOT_RST_MODE_SHIFT             22
+#define APPL_CTRL_HW_HOT_RST_MODE_IMDT_RST          0x1
+
+#define APPL_INTR_EN_L0_0                           0x8
+#define APPL_INTR_EN_L0_0_LINK_STATE_INT_EN         BIT(0)
+#define APPL_INTR_EN_L0_0_MSI_RCV_INT_EN            BIT(4)
+#define APPL_INTR_EN_L0_0_INT_INT_EN                BIT(8)
+#define APPL_INTR_EN_L0_0_CDM_REG_CHK_INT_EN        BIT(19)
+#define APPL_INTR_EN_L0_0_SYS_INTR_EN               BIT(30)
+#define APPL_INTR_EN_L0_0_SYS_MSI_INTR_EN           BIT(31)
+
+#define APPL_INTR_STATUS_L0                         0xC
+#define APPL_INTR_STATUS_L0_LINK_STATE_INT          BIT(0)
+#define APPL_INTR_STATUS_L0_INT_INT                 BIT(8)
+#define APPL_INTR_STATUS_L0_CDM_REG_CHK_INT         BIT(18)
+
+#define APPL_INTR_EN_L1_0_0                         0x1C
+#define APPL_INTR_EN_L1_0_0_LINK_REQ_RST_NOT_INT_EN BIT(1)
+
+#define APPL_INTR_STATUS_L1_0_0                     0x20
+#define APPL_INTR_STATUS_L1_0_0_LINK_REQ_RST_NOT_CHGED
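
The posting is cut off before any of these definitions are used. As a hedged
illustration of how a MASK/SHIFT pair such as APPL_CTRL_HW_HOT_RST_MODE_* is
typically combined, here is a user-space sketch. BIT() and GENMASK() are
re-created locally for 32-bit values, and appl_ctrl_enable_hot_rst() is a
hypothetical helper, not a function from the driver.

```c
#include <assert.h>
#include <stdint.h>

/* User-space equivalents of the kernel's BIT() and GENMASK() (32-bit). */
#define BIT(n)		(UINT32_C(1) << (n))
#define GENMASK(h, l)	((UINT32_MAX << (l)) & (UINT32_MAX >> (31 - (h))))

#define APPL_CTRL_HW_HOT_RST_EN			BIT(20)
#define APPL_CTRL_HW_HOT_RST_MODE_MASK		GENMASK(1, 0)
#define APPL_CTRL_HW_HOT_RST_MODE_SHIFT		22
#define APPL_CTRL_HW_HOT_RST_MODE_IMDT_RST	0x1

/*
 * Hypothetical helper: select the "immediate reset" mode and enable
 * hardware hot reset in a copy of the APPL_CTRL register value.
 */
static uint32_t appl_ctrl_enable_hot_rst(uint32_t val)
{
	/* Clear the two-bit mode field at bits 23:22, then set the new mode. */
	val &= ~(APPL_CTRL_HW_HOT_RST_MODE_MASK << APPL_CTRL_HW_HOT_RST_MODE_SHIFT);
	val |= APPL_CTRL_HW_HOT_RST_MODE_IMDT_RST << APPL_CTRL_HW_HOT_RST_MODE_SHIFT;
	val |= APPL_CTRL_HW_HOT_RST_EN;
	return val;
}
```

Keeping the mask unshifted and applying the shift at the point of use is one
common convention; the alternative is to define the mask already in place.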

Re: [PATCHv5 1/6] PCI: mobiveil: Refactor Mobiveil PCIe Host Bridge IP driver

2019-04-23 Thread Subrahmanya Lingappa

ZQ,

On Fri, Apr 12, 2019 at 3:22 PM Z.q. Hou  wrote:
>
> From: Hou Zhiqiang 
>
> Refactor the Mobiveil PCIe Host Bridge IP driver to make
> it easier to add support for both RC and EP mode drivers.
> This patch moves the Mobiveil driver to a new directory,
> 'drivers/pci/controller/mobiveil', and refactors it according
> to the RC and EP abstraction.
>
> Signed-off-by: Hou Zhiqiang 
> Reviewed-by: Minghuan Lian 
> Reviewed-by: Subrahmanya Lingappa 
> ---
> V5:
>  - Regenerated this patch on the new base.
>  - Retouched the changelog.
>  - Updated the Copyright.
>
>  MAINTAINERS   |   2 +-
>  drivers/pci/controller/Kconfig|  11 +-
>  drivers/pci/controller/Makefile   |   2 +-
>  drivers/pci/controller/mobiveil/Kconfig   |  24 +
>  drivers/pci/controller/mobiveil/Makefile  |   4 +
>  .../pcie-mobiveil-host.c} | 570 +++---
>  .../controller/mobiveil/pcie-mobiveil-plat.c  |  56 ++
>  .../pci/controller/mobiveil/pcie-mobiveil.c   | 248 
>  .../pci/controller/mobiveil/pcie-mobiveil.h   | 211 +++
>  9 files changed, 636 insertions(+), 492 deletions(-)
>  create mode 100644 drivers/pci/controller/mobiveil/Kconfig
>  create mode 100644 drivers/pci/controller/mobiveil/Makefile
>  rename drivers/pci/controller/{pcie-mobiveil.c => 
> mobiveil/pcie-mobiveil-host.c} (53%)
>  create mode 100644 drivers/pci/controller/mobiveil/pcie-mobiveil-plat.c
>  create mode 100644 drivers/pci/controller/mobiveil/pcie-mobiveil.c
>  create mode 100644 drivers/pci/controller/mobiveil/pcie-mobiveil.h
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 1e64279f338a..1013e74b14f2 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -11877,7 +11877,7 @@ M:  Subrahmanya Lingappa 
> 
>  L: linux-...@vger.kernel.org
>  S: Supported
>  F: Documentation/devicetree/bindings/pci/mobiveil-pcie.txt
> -F: drivers/pci/controller/pcie-mobiveil.c
> +F: drivers/pci/controller/mobiveil/pcie-mobiveil*
>

Please add yourself as co-maintainer of the mobiveil driver.

>
>  PCI DRIVER FOR MVEBU (Marvell Armada 370 and Armada XP SOC support)
>  M: Thomas Petazzoni 
> diff --git a/drivers/pci/controller/Kconfig b/drivers/pci/controller/Kconfig
> index 6671946dbf66..0e981ed00a75 100644
> --- a/drivers/pci/controller/Kconfig
> +++ b/drivers/pci/controller/Kconfig
> @@ -241,16 +241,6 @@ config PCIE_MEDIATEK
>   Say Y here if you want to enable PCIe controller support on
>   MediaTek SoCs.
>
> -config PCIE_MOBIVEIL
> -   bool "Mobiveil AXI PCIe controller"
> -   depends on ARCH_ZYNQMP || COMPILE_TEST
> -   depends on OF
> -   depends on PCI_MSI_IRQ_DOMAIN
> -   help
> - Say Y here if you want to enable support for the Mobiveil AXI PCIe
> - Soft IP. It has up to 8 outbound and inbound windows
> - for address translation and it is a PCIe Gen4 IP.
> -
>  config PCIE_TANGO_SMP8759
> bool "Tango SMP8759 PCIe controller (DANGEROUS)"
> depends on ARCH_TANGO && PCI_MSI && OF
> @@ -281,4 +271,5 @@ config VMD
>   module will be called vmd.
>
>  source "drivers/pci/controller/dwc/Kconfig"
> +source "drivers/pci/controller/mobiveil/Kconfig"
>  endmenu
> diff --git a/drivers/pci/controller/Makefile b/drivers/pci/controller/Makefile
> index d56a507495c5..b79a615041a0 100644
> --- a/drivers/pci/controller/Makefile
> +++ b/drivers/pci/controller/Makefile
> @@ -26,11 +26,11 @@ obj-$(CONFIG_PCIE_ROCKCHIP) += pcie-rockchip.o
>  obj-$(CONFIG_PCIE_ROCKCHIP_EP) += pcie-rockchip-ep.o
>  obj-$(CONFIG_PCIE_ROCKCHIP_HOST) += pcie-rockchip-host.o
>  obj-$(CONFIG_PCIE_MEDIATEK) += pcie-mediatek.o
> -obj-$(CONFIG_PCIE_MOBIVEIL) += pcie-mobiveil.o
>  obj-$(CONFIG_PCIE_TANGO_SMP8759) += pcie-tango.o
>  obj-$(CONFIG_VMD) += vmd.o
>  # pcie-hisi.o quirks are needed even without CONFIG_PCIE_DW
>  obj-y  += dwc/
> +obj-y  += mobiveil/
>
>
>  # The following drivers are for devices that use the generic ACPI
> diff --git a/drivers/pci/controller/mobiveil/Kconfig 
> b/drivers/pci/controller/mobiveil/Kconfig
> new file mode 100644
> index ..64343c07bfed
> --- /dev/null
> +++ b/drivers/pci/controller/mobiveil/Kconfig
> @@ -0,0 +1,24 @@
> +# SPDX-License-Identifier: GPL-2.0
> +
> +menu "Mobiveil PCIe Core Support"
> +   depends on PCI
> +
> +config PCIE_MOBIVEIL
> +   bool
> +
> +config PCIE_MOBIVEIL_HOST
> +bool
> +   depends on PCI_MSI_IRQ_DOMAIN
> +select PCIE_MOBIVEIL
> +
> +config PCIE_MOBIVEIL_PLAT
> +   bool "Mobiveil AXI PCIe controller"
> +   depends on ARCH_ZYNQMP || COMPILE_TEST
> +   depends on OF
> +   select PCIE_MOBIVEIL_HOST
> +   help
> + Say Y here if you want to enable support for the Mobiveil AXI PCIe
> + Soft IP. It has up to 8 outbound and inbound windows
> + for address translation and it is a PCIe Gen4 IP.
> +
> +endmenu
> diff --git

Re: [PATCH] PM / devfreq: Return -ENODEV from try_then_request_governor

2019-04-23 Thread Tomeu Vizoso

On Tue, 23 Apr 2019 at 11:56, Enric Balletbo i Serra
 wrote:
>
> Hi Tomeu,
>
> On 23/4/19 10:11, Tomeu Vizoso wrote:
> > Callers don't expect it to return NULL, but an error code.
> >
> > Fixes Oops such as the one below, when one tries to set a governor that
> > isn't available:
> >
> > Unable to handle kernel NULL pointer dereference at virtual address 0018
> >
> > [] (governor_store) from [] 
> > (kernfs_fop_write+0x100/0x1e0)
> > [] (kernfs_fop_write) from [] (__vfs_write+0x2c/0x17c)
> > [] (__vfs_write) from [] (vfs_write+0xa4/0x184)
> > [] (vfs_write) from [] (ksys_write+0x4c/0xac)
> > [] (ksys_write) from [] (ret_fast_syscall+0x0/0x28)
> >
> > Signed-off-by: Tomeu Vizoso 
> > Fixes: 23c7b54ca1cd ("PM / devfreq: Fix devfreq_add_device() when drivers 
> > are built as modules.")
> > Reported-by: Alyssa Rosenzweig 
> > Cc: Enric Balletbo i Serra 
> > ---
>
> There is already a fix for that. The fix was initially sent in October [2] but
> unfortunately it got lost. I resent it and it is now queued [1]. Hopefully the
> Fixes tag will help to pick the fix into the proper kernel releases.

Actually, Steve Price sent a third patch for this same issue.

Glad to read that it's being merged.

Thanks,

Tomeu



> Thanks,
>  Enric
>
> [1]
> https://git.kernel.org/pub/scm/linux/kernel/git/mzx/devfreq.git/commit/?h=for-next&id=b53b0128052ffd687797d5f4deeb76327e7b5711
>
> [2] https://lkml.org/lkml/2018/10/16/744
>
> >  drivers/devfreq/devfreq.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
> > index 0ae3de76833b..5539e9be718d 100644
> > --- a/drivers/devfreq/devfreq.c
> > +++ b/drivers/devfreq/devfreq.c
> > @@ -254,7 +254,7 @@ static struct devfreq_governor 
> > *try_then_request_governor(const char *name)
> >   /* Restore previous state before return */
> >   mutex_lock(_list_lock);
> >   if (err)
> > - return NULL;
> > + return ERR_PTR(-ENODEV);
> >
> >   governor = find_devfreq_governor(name);
> >   }
> >

Re: Alleged fix for writer stall on -rcu branch dev

2019-04-23 Thread Paul E. McKenney

On Tue, Apr 23, 2019 at 05:25:00PM +0200, Sebastian Andrzej Siewior wrote:
> On 2019-04-15 13:04:03 [+0200], To Paul E. McKenney wrote:
> > 
> > good so nothing important so far. I hope the box gets to TREE08 soon.
> 
> test completed. Nothing new.

That took some time!  Thank you for running it!

Thanx, Paul

Re: [PATCH v2 1/5] dt-bindings: opp: Introduce bandwidth-MBps bindings

2019-04-23 Thread Viresh Kumar

On 23-04-19, 16:28, Georgi Djakov wrote:
> In addition to frequency and voltage, some devices may have bandwidth
> requirements for their interconnect throughput - for example a CPU
> or GPU may also need to increase or decrease their bandwidth to DDR
> memory based on the current operating performance point.
> 
> Extend the OPP tables with additional property to describe the bandwidth
> needs of a device. The average and peak bandwidth values depend on the
> hardware and its properties.
> 
> Signed-off-by: Georgi Djakov 
> ---
>  Documentation/devicetree/bindings/opp/opp.txt | 38 +++
>  .../devicetree/bindings/property-units.txt|  4 ++
>  2 files changed, 42 insertions(+)
> 
> diff --git a/Documentation/devicetree/bindings/opp/opp.txt 
> b/Documentation/devicetree/bindings/opp/opp.txt
> index 76b6c79604a5..830f0206aea7 100644
> --- a/Documentation/devicetree/bindings/opp/opp.txt
> +++ b/Documentation/devicetree/bindings/opp/opp.txt
> @@ -132,6 +132,9 @@ Optional properties:
>  - opp-level: A value representing the performance level of the device,
>expressed as a 32-bit integer.
>  
> +- bandwidth-MBps: The interconnect bandwidth is specified with an array containing
> +  the two integer values for average and peak bandwidth in megabytes per second.
> +
>  - clock-latency-ns: Specifies the maximum possible transition latency (in
>nanoseconds) for switching to this OPP from any other OPP.
>  
> @@ -546,3 +549,38 @@ Example 6: opp-microvolt-, opp-microamp-:
>   };
>   };
>  };
> +
> +Example 7: bandwidth-MBps:
> +Average and peak bandwidth values for the interconnects between CPU and DDR
> +memory and also between CPU and L3 are defined per each OPP. Bandwidth of both
> +interconnects is scaled together with CPU frequency.
> +
> +/ {
> + cpus {
> + CPU0: cpu@0 {
> + compatible = "arm,cortex-a53", "arm,armv8";
> + ...
> + operating-points-v2 = <&cpu_opp_table>;
> + /* path between CPU and DDR memory and CPU and L3 */
> + interconnects = < MASTER_CPU  SLAVE_DDR>,
> + < MASTER_CPU  SLAVE_L3>;
> + };
> + };
> +
> + cpu_opp_table: cpu_opp_table {
> + compatible = "operating-points-v2";
> + opp-shared;
> +
> + opp-2 {
> + opp-hz = /bits/ 64 <2>;
> + /* CPU<->DDR bandwidth: 457 MB/s average, 1525 MB/s peak
> +  * CPU<->L3 bandwidth: 914 MB/s average, 3050 MB/s peak */
> + bandwidth-MBps = <457 1525>, <914 3050>;
> + };
> + opp-4 {
> + opp-hz = /bits/ 64 <4>;
> + /* CPU<->DDR bandwidth: 915 MB/s average, 3051 MB/s peak
> +  * CPU<->L3 bandwidth: 1828 MB/s average, 6102 MB/s peak */
> + bandwidth-MBps = <915 3051>, <1828 6102>;
> + };
> + };
> diff --git a/Documentation/devicetree/bindings/property-units.txt 
> b/Documentation/devicetree/bindings/property-units.txt
> index bfd33734faca..9c3dbefcdae8 100644
> --- a/Documentation/devicetree/bindings/property-units.txt
> +++ b/Documentation/devicetree/bindings/property-units.txt
> @@ -41,3 +41,7 @@ Temperature
>  Pressure
>  
>  -kpascal : kiloPascal
> +
> +Throughput
> +
> +-MBps: megabytes per second

LGTM

-- 
viresh
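
To make the binding concrete, here is a hedged user-space sketch of how a
consumer might split the flat cell array of a bandwidth-MBps property into
one average/peak pair per interconnect path. parse_bandwidth_mbps() and
struct opp_bw are hypothetical names, not part of the posted series. The
factor of 1000 assumes the interconnect framework's kB/s unit and is meant to
mirror its MBps_to_icc() helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* MB/s to kB/s; assumed to match the framework's MBps_to_icc() helper. */
#define MBPS_TO_KBPS(x) ((x) * 1000u)

struct opp_bw {
	uint32_t avg_kbps;
	uint32_t peak_kbps;
};

/*
 * Split a flat cell array, as read from a property such as
 *   bandwidth-MBps = <457 1525>, <914 3050>;
 * into one pair per path. Returns the number of paths filled in,
 * or -1 if the array is empty, has an odd length, or is too long.
 */
static int parse_bandwidth_mbps(const uint32_t *cells, size_t ncells,
				struct opp_bw *out, size_t max_paths)
{
	size_t i, npaths;

	if (ncells == 0 || ncells % 2 != 0)
		return -1;

	npaths = ncells / 2;
	if (npaths > max_paths)
		return -1;

	for (i = 0; i < npaths; i++) {
		out[i].avg_kbps = MBPS_TO_KBPS(cells[2 * i]);
		out[i].peak_kbps = MBPS_TO_KBPS(cells[2 * i + 1]);
	}
	return (int)npaths;
}
```

The order of the pairs is assumed to follow the order of the entries in the
device's interconnects property, as in Example 7 of the quoted patch.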

[PATCH V5 16/16] arm64: Add Tegra194 PCIe driver to defconfig

2019-04-23 Thread Vidya Sagar

Add PCIe host controller driver for DesignWare core based
PCIe controller IP present in Tegra194.

Signed-off-by: Vidya Sagar 
---
Changes since [v4]:
* None

Changes since [v3]:
* None

Changes since [v2]:
* None

Changes since [v1]:
* Changed CONFIG_PCIE_TEGRA194 from 'y' to 'm'

 arch/arm64/configs/defconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm64/configs/defconfig b/arch/arm64/configs/defconfig
index 17daa971225e..72cf77c58e7c 100644
--- a/arch/arm64/configs/defconfig
+++ b/arch/arm64/configs/defconfig
@@ -87,6 +87,7 @@ CONFIG_PCIE_QCOM=y
 CONFIG_PCIE_ARMADA_8K=y
 CONFIG_PCIE_KIRIN=y
 CONFIG_PCIE_HISI_STB=y
+CONFIG_PCIE_TEGRA194=m
 CONFIG_ARM64_VA_BITS_48=y
 CONFIG_SCHED_MC=y
 CONFIG_NUMA=y
-- 
2.17.1

[PATCH V5 12/16] arm64: tegra: Add P2U and PCIe controller nodes to Tegra194 DT

2019-04-23 Thread Vidya Sagar

Add P2U (PIPE to UPHY) and PCIe controller nodes to device tree.
The Tegra194 SoC contains six PCIe controllers and twenty P2U instances
grouped into two different PHY bricks namely High-Speed IO (HSIO-12 P2Us)
and NVIDIA High Speed (NVHS-8 P2Us) respectively.

Signed-off-by: Vidya Sagar 
---
Changes since [v4]:
* None

Changes since [v3]:
* None

Changes since [v2]:
* Included 'hsio' or 'nvhs' in P2U node's label names to reflect which brick
  they belong to
* Removed leading zeros in unit address

Changes since [v1]:
* Flattened all P2U nodes by removing 'hsio-p2u' and 'nvhs-p2u' super nodes
* Changed P2U nodes compatible string from 'nvidia,tegra194-phy-p2u' to
  'nvidia,tegra194-p2u'
* Changed reg-name from 'base' to 'ctl'
* Updated all PCIe nodes according to the changes made to DT documentation file

 arch/arm64/boot/dts/nvidia/tegra194.dtsi | 449 +++
 1 file changed, 449 insertions(+)

diff --git a/arch/arm64/boot/dts/nvidia/tegra194.dtsi 
b/arch/arm64/boot/dts/nvidia/tegra194.dtsi
index c77ca211fa8f..dc433b446ff5 100644
--- a/arch/arm64/boot/dts/nvidia/tegra194.dtsi
+++ b/arch/arm64/boot/dts/nvidia/tegra194.dtsi
@@ -884,6 +884,166 @@
nvidia,interface = <3>;
};
};
+
+   p2u_hsio_0: p2u@3e1 {
+   compatible = "nvidia,tegra194-p2u";
+   reg = <0x03e1 0x1>;
+   reg-names = "ctl";
+
+   #phy-cells = <0>;
+   };
+
+   p2u_hsio_1: p2u@3e2 {
+   compatible = "nvidia,tegra194-p2u";
+   reg = <0x03e2 0x1>;
+   reg-names = "ctl";
+
+   #phy-cells = <0>;
+   };
+
+   p2u_hsio_2: p2u@3e3 {
+   compatible = "nvidia,tegra194-p2u";
+   reg = <0x03e3 0x1>;
+   reg-names = "ctl";
+
+   #phy-cells = <0>;
+   };
+
+   p2u_hsio_3: p2u@3e4 {
+   compatible = "nvidia,tegra194-p2u";
+   reg = <0x03e4 0x1>;
+   reg-names = "ctl";
+
+   #phy-cells = <0>;
+   };
+
+   p2u_hsio_4: p2u@3e5 {
+   compatible = "nvidia,tegra194-p2u";
+   reg = <0x03e5 0x1>;
+   reg-names = "ctl";
+
+   #phy-cells = <0>;
+   };
+
+   p2u_hsio_5: p2u@3e6 {
+   compatible = "nvidia,tegra194-p2u";
+   reg = <0x03e6 0x1>;
+   reg-names = "ctl";
+
+   #phy-cells = <0>;
+   };
+
+   p2u_hsio_6: p2u@3e7 {
+   compatible = "nvidia,tegra194-p2u";
+   reg = <0x03e7 0x1>;
+   reg-names = "ctl";
+
+   #phy-cells = <0>;
+   };
+
+   p2u_hsio_7: p2u@3e8 {
+   compatible = "nvidia,tegra194-p2u";
+   reg = <0x03e8 0x1>;
+   reg-names = "ctl";
+
+   #phy-cells = <0>;
+   };
+
+   p2u_hsio_8: p2u@3e9 {
+   compatible = "nvidia,tegra194-p2u";
+   reg = <0x03e9 0x1>;
+   reg-names = "ctl";
+
+   #phy-cells = <0>;
+   };
+
+   p2u_hsio_9: p2u@3ea {
+   compatible = "nvidia,tegra194-p2u";
+   reg = <0x03ea 0x1>;
+   reg-names = "ctl";
+
+   #phy-cells = <0>;
+   };
+
+   p2u_nvhs_0: p2u@3eb {
+   compatible = "nvidia,tegra194-p2u";
+   reg = <0x03eb 0x1>;
+   reg-names = "ctl";
+
+   #phy-cells = <0>;
+   };
+
+   p2u_nvhs_1: p2u@3ec {
+   compatible = "nvidia,tegra194-p2u";
+   reg = <0x03ec 0x1>;
+   reg-names = "ctl";
+
+   #phy-cells = <0>;
+   };
+
+   p2u_nvhs_2: p2u@3ed {
+   compatible = "nvidia,tegra194-p2u";
+   reg = <0x03ed 0x1>;
+   reg-names = "ctl";
+
+   #phy-cells = <0>;
+   };
+
+   p2u_nvhs_3: p2u@3ee {
+   compatible = "nvidia,tegra194-p2u";
+   reg = <0x03ee 0x1>;
+   reg-names = "ctl";
+
+   #phy-cells = <0>;
+   };
+
+   p2u_nvhs_4: p2u@3ef {
+

[PATCH V5 14/16] phy: tegra: Add PCIe PIPE2UPHY support

2019-04-23 Thread Vidya Sagar

Synopsys DesignWare core based PCIe controllers in Tegra 194 SoC interface
with Universal PHY (UPHY) module through a PIPE2UPHY (P2U) module.
For each PCIe lane of a controller, there is a P2U unit instantiated at
hardware level. This driver provides support for the programming required
for each P2U that is going to be used for a PCIe controller.

Signed-off-by: Vidya Sagar 
---
Changes since [v4]:
* None

Changes since [v3]:
* Rebased on top of linux-next top of the tree

Changes since [v2]:
* Replaced spaces with tabs in Kconfig file
* Sorted header file inclusion alphabetically

Changes since [v1]:
* Added COMPILE_TEST in Kconfig
* Removed empty phy_ops implementations
* Modified code according to DT documentation file modifications

 drivers/phy/tegra/Kconfig |   7 ++
 drivers/phy/tegra/Makefile|   1 +
 drivers/phy/tegra/pcie-p2u-tegra194.c | 120 ++
 3 files changed, 128 insertions(+)
 create mode 100644 drivers/phy/tegra/pcie-p2u-tegra194.c

diff --git a/drivers/phy/tegra/Kconfig b/drivers/phy/tegra/Kconfig
index a3b1de953fb7..06d423fa85b4 100644
--- a/drivers/phy/tegra/Kconfig
+++ b/drivers/phy/tegra/Kconfig
@@ -6,3 +6,10 @@ config PHY_TEGRA_XUSB
 
  To compile this driver as a module, choose M here: the module will
  be called phy-tegra-xusb.
+
+config PHY_TEGRA194_PCIE_P2U
+   tristate "NVIDIA Tegra P2U PHY Driver"
+   depends on ARCH_TEGRA || COMPILE_TEST
+   select GENERIC_PHY
+   help
+ Enable this to support the P2U (PIPE to UPHY) that is part of Tegra 
19x SOCs.
diff --git a/drivers/phy/tegra/Makefile b/drivers/phy/tegra/Makefile
index a93cd9a499b2..1aaca794f40c 100644
--- a/drivers/phy/tegra/Makefile
+++ b/drivers/phy/tegra/Makefile
@@ -5,3 +5,4 @@ phy-tegra-xusb-$(CONFIG_ARCH_TEGRA_124_SOC) += xusb-tegra124.o
 phy-tegra-xusb-$(CONFIG_ARCH_TEGRA_132_SOC) += xusb-tegra124.o
 phy-tegra-xusb-$(CONFIG_ARCH_TEGRA_210_SOC) += xusb-tegra210.o
 phy-tegra-xusb-$(CONFIG_ARCH_TEGRA_186_SOC) += xusb-tegra186.o
+obj-$(CONFIG_PHY_TEGRA194_PCIE_P2U) += pcie-p2u-tegra194.o
diff --git a/drivers/phy/tegra/pcie-p2u-tegra194.c 
b/drivers/phy/tegra/pcie-p2u-tegra194.c
new file mode 100644
index ..a5d85e411088
--- /dev/null
+++ b/drivers/phy/tegra/pcie-p2u-tegra194.c
@@ -0,0 +1,120 @@
+// SPDX-License-Identifier: GPL-2.0+
+/*
+ * P2U (PIPE to UPHY) driver for Tegra T194 SoC
+ *
+ * Copyright (C) 2019 NVIDIA Corporation.
+ *
+ * Author: Vidya Sagar 
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define P2U_PERIODIC_EQ_CTRL_GEN3  0xc0
+#define P2U_PERIODIC_EQ_CTRL_GEN3_PERIODIC_EQ_EN   BIT(0)
+#define P2U_PERIODIC_EQ_CTRL_GEN3_INIT_PRESET_EQ_TRAIN_EN  BIT(1)
+#define P2U_PERIODIC_EQ_CTRL_GEN4  0xc4
+#define P2U_PERIODIC_EQ_CTRL_GEN4_INIT_PRESET_EQ_TRAIN_EN  BIT(1)
+
+#define P2U_RX_DEBOUNCE_TIME   0xa4
+#define P2U_RX_DEBOUNCE_TIME_DEBOUNCE_TIMER_MASK   0x
+#define P2U_RX_DEBOUNCE_TIME_DEBOUNCE_TIMER_VAL    160
+
+struct tegra_p2u {
+   void __iomem *base;
+};
+
+static int tegra_p2u_power_on(struct phy *x)
+{
+   struct tegra_p2u *phy = phy_get_drvdata(x);
+   u32 val;
+
+   val = readl(phy->base + P2U_PERIODIC_EQ_CTRL_GEN3);
+   val &= ~P2U_PERIODIC_EQ_CTRL_GEN3_PERIODIC_EQ_EN;
+   val |= P2U_PERIODIC_EQ_CTRL_GEN3_INIT_PRESET_EQ_TRAIN_EN;
+   writel(val, phy->base + P2U_PERIODIC_EQ_CTRL_GEN3);
+
+   val = readl(phy->base + P2U_PERIODIC_EQ_CTRL_GEN4);
+   val |= P2U_PERIODIC_EQ_CTRL_GEN4_INIT_PRESET_EQ_TRAIN_EN;
+   writel(val, phy->base + P2U_PERIODIC_EQ_CTRL_GEN4);
+
+   val = readl(phy->base + P2U_RX_DEBOUNCE_TIME);
+   val &= ~P2U_RX_DEBOUNCE_TIME_DEBOUNCE_TIMER_MASK;
+   val |= P2U_RX_DEBOUNCE_TIME_DEBOUNCE_TIMER_VAL;
+   writel(val, phy->base + P2U_RX_DEBOUNCE_TIME);
+
+   return 0;
+}
+
+static const struct phy_ops ops = {
+   .power_on   = tegra_p2u_power_on,
+   .owner  = THIS_MODULE,
+};
+
+static int tegra_p2u_probe(struct platform_device *pdev)
+{
+   struct phy_provider *phy_provider;
+   struct device *dev = &pdev->dev;
+   struct phy *generic_phy;
+   struct tegra_p2u *phy;
+   struct resource *res;
+
+   phy = devm_kzalloc(dev, sizeof(*phy), GFP_KERNEL);
+   if (!phy)
+   return -ENOMEM;
+
+   res = platform_get_resource_byname(pdev, IORESOURCE_MEM, "ctl");
+   phy->base = devm_ioremap_resource(dev, res);
+   if (IS_ERR(phy->base))
+   return PTR_ERR_OR_ZERO(phy->base);
+
+   platform_set_drvdata(pdev, phy);
+
+   generic_phy = devm_phy_create(dev, NULL, &ops);
+   if (IS_ERR(generic_phy))
+   return PTR_ERR_OR_ZERO(generic_phy);
+
+   phy_set_drvdata(generic_phy, phy);
+
+   phy_provider = devm_of_phy_provider_register(dev, of_phy_simple_xlate);
+   if (IS_ERR(phy_provider))
+
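
The read-modify-write sequence in tegra_p2u_power_on() can be exercised
against a fake register file in user space. This is only a model: reg_read()
and reg_write() stand in for readl() and writel(), p2u_power_on_model() is a
hypothetical name, and the debounce timer mask is cut off in the posting, so
the 0xffff used here is an assumption. The GEN4 write is omitted for brevity.

```c
#include <assert.h>
#include <stdint.h>

#define BIT(n) (UINT32_C(1) << (n))

#define P2U_PERIODIC_EQ_CTRL_GEN3				0xc0
#define P2U_PERIODIC_EQ_CTRL_GEN3_PERIODIC_EQ_EN		BIT(0)
#define P2U_PERIODIC_EQ_CTRL_GEN3_INIT_PRESET_EQ_TRAIN_EN	BIT(1)

#define P2U_RX_DEBOUNCE_TIME					0xa4
/* ASSUMPTION: the mask value is truncated in the posting; 0xffff is a guess. */
#define P2U_RX_DEBOUNCE_TIME_DEBOUNCE_TIMER_MASK		0xffffu
#define P2U_RX_DEBOUNCE_TIME_DEBOUNCE_TIMER_VAL			160

/* Fake register file standing in for the ioremap()ed P2U window. */
static uint32_t regs[0x100 / 4];

static uint32_t reg_read(uint32_t off) { return regs[off / 4]; }
static void reg_write(uint32_t val, uint32_t off) { regs[off / 4] = val; }

static void p2u_power_on_model(void)
{
	uint32_t val;

	/* Disable periodic equalization, enable initial preset EQ training. */
	val = reg_read(P2U_PERIODIC_EQ_CTRL_GEN3);
	val &= ~P2U_PERIODIC_EQ_CTRL_GEN3_PERIODIC_EQ_EN;
	val |= P2U_PERIODIC_EQ_CTRL_GEN3_INIT_PRESET_EQ_TRAIN_EN;
	reg_write(val, P2U_PERIODIC_EQ_CTRL_GEN3);

	/* Replace only the timer field; the upper bits are preserved. */
	val = reg_read(P2U_RX_DEBOUNCE_TIME);
	val &= ~P2U_RX_DEBOUNCE_TIME_DEBOUNCE_TIMER_MASK;
	val |= P2U_RX_DEBOUNCE_TIME_DEBOUNCE_TIMER_VAL;
	reg_write(val, P2U_RX_DEBOUNCE_TIME);
}
```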

[PATCH V5 13/16] arm64: tegra: Enable PCIe slots in P2972-0000 board

2019-04-23 Thread Vidya Sagar

Enable PCIe controller nodes to enable the respective PCIe slots on the
P2972-0000 board. The slots are owned by the PCIe controllers as follows:
Controller-0 : M.2 Key-M slot
Controller-1 : On-board Marvell eSATA controller
Controller-3 : M.2 Key-E slot

Signed-off-by: Vidya Sagar 
---
Changes since [v4]:
* None

Changes since [v3]:
* None

Changes since [v2]:
* Changed P2U label names to reflect new format that includes 'hsio'/'nvhs'
  strings to reflect UPHY brick they belong to

Changes since [v1]:
* Dropped 'pcie-' from phy-names property strings

 .../arm64/boot/dts/nvidia/tegra194-p2888.dtsi |  2 +-
 .../boot/dts/nvidia/tegra194-p2972-.dts   | 41 +++
 2 files changed, 42 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/boot/dts/nvidia/tegra194-p2888.dtsi 
b/arch/arm64/boot/dts/nvidia/tegra194-p2888.dtsi
index 0fd5bd29fbf9..30a83d4c5b69 100644
--- a/arch/arm64/boot/dts/nvidia/tegra194-p2888.dtsi
+++ b/arch/arm64/boot/dts/nvidia/tegra194-p2888.dtsi
@@ -191,7 +191,7 @@
regulator-boot-on;
};
 
-   sd3 {
+   vdd_1v8ao: sd3 {
regulator-name = "VDD_1V8AO";
regulator-min-microvolt = 
<180>;
regulator-max-microvolt = 
<180>;
diff --git a/arch/arm64/boot/dts/nvidia/tegra194-p2972-.dts 
b/arch/arm64/boot/dts/nvidia/tegra194-p2972-.dts
index b62e96945846..7411c64e24a6 100644
--- a/arch/arm64/boot/dts/nvidia/tegra194-p2972-.dts
+++ b/arch/arm64/boot/dts/nvidia/tegra194-p2972-.dts
@@ -169,4 +169,45 @@
};
};
};
+
+   pcie@1418 {
+   status = "okay";
+
+   vddio-pex-ctl-supply = <&vdd_1v8ao>;
+
+   phys = <&p2u_hsio_2>, <&p2u_hsio_3>, <&p2u_hsio_4>,
+  <&p2u_hsio_5>;
+   phy-names = "p2u-0", "p2u-1", "p2u-2", "p2u-3";
+   };
+
+   pcie@1410 {
+   status = "okay";
+
+   vddio-pex-ctl-supply = <&vdd_1v8ao>;
+
+   phys = <&p2u_hsio_0>;
+   phy-names = "p2u-0";
+   };
+
+   pcie@1414 {
+   status = "okay";
+
+   vddio-pex-ctl-supply = <&vdd_1v8ao>;
+
+   phys = <&p2u_hsio_7>;
+   phy-names = "p2u-0";
+   };
+
+   pcie@141a {
+   status = "disabled";
+
+   vddio-pex-ctl-supply = <&vdd_1v8ao>;
+
+   phys = <&p2u_nvhs_0>, <&p2u_nvhs_1>, <&p2u_nvhs_2>,
+  <&p2u_nvhs_3>, <&p2u_nvhs_4>, <&p2u_nvhs_5>,
+  <&p2u_nvhs_6>, <&p2u_nvhs_7>;
+
+   phy-names = "p2u-0", "p2u-1", "p2u-2", "p2u-3", "p2u-4",
+   "p2u-5", "p2u-6", "p2u-7";
+   };
 };
-- 
2.17.1

[PATCH V5 11/16] dt-bindings: PHY: P2U: Add Tegra 194 P2U block

2019-04-23 Thread Vidya Sagar

Add support for the Tegra194 P2U (PIPE to UPHY) block, a glue module
instantiated once for each PCIe lane between the Synopsys DesignWare core
based PCIe IP and the Universal PHY block.
---
Changes since [v4]:
* None

Changes since [v3]:
* None

Changes since [v2]:
* Changed node label to reflect new format that includes either 'hsio' or
  'nvhs' in its name to reflect which UPHY brick they belong to

Changes since [v1]:
* This is a new patch in v2 series

 .../bindings/phy/phy-tegra194-p2u.txt | 28 +++
 1 file changed, 28 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/phy/phy-tegra194-p2u.txt

diff --git a/Documentation/devicetree/bindings/phy/phy-tegra194-p2u.txt 
b/Documentation/devicetree/bindings/phy/phy-tegra194-p2u.txt
new file mode 100644
index ..8b543cba483b
--- /dev/null
+++ b/Documentation/devicetree/bindings/phy/phy-tegra194-p2u.txt
@@ -0,0 +1,28 @@
+NVIDIA Tegra194 P2U binding
+
+Tegra194 has two PHY bricks namely HSIO (High Speed IO) and NVHS (NVIDIA High
+Speed) each interfacing with 12 and 8 P2U instances respectively.
+A P2U instance is a glue logic between Synopsys DesignWare Core PCIe IP's PIPE
+interface and PHY of HSIO/NVHS bricks. Each P2U instance represents one PCIe
+lane.
+
+Required properties:
+- compatible: For Tegra19x, must contain "nvidia,tegra194-p2u".
+- reg: Should be the physical address space and length of each respective P2U
+   instance.
+- reg-names: Must include the entry "ctl".
+
+Required properties for PHY port node:
+- #phy-cells: Defined by generic PHY bindings.  Must be 0.
+
+Refer to phy/phy-bindings.txt for the generic PHY binding properties.
+
+Example:
+
+p2u_hsio_0: p2u@3e1 {
+   compatible = "nvidia,tegra194-p2u";
+   reg = <0x03e1 0x1>;
+   reg-names = "ctl";
+
+   #phy-cells = <0>;
+};
-- 
2.17.1

[PATCH V5 10/16] dt-bindings: PCI: tegra: Add device tree support for T194

2019-04-23 Thread Vidya Sagar

Add support for Tegra194 PCIe controllers. These controllers are based
on Synopsys DesignWare core IP.

Signed-off-by: Vidya Sagar 
---
Changes since [v4]:
* None

Changes since [v3]:
* None

Changes since [v2]:
* Using only 'Cx' (x-being controller number) format to represent a controller
* Changed to 'value: description' format where applicable
* Changed 'nvidia,init-speed' to 'nvidia,init-link-speed'
* Provided more documentation for 'nvidia,init-link-speed' property
* Changed 'nvidia,pex-wake' to 'nvidia,wake-gpios'

Changes since [v1]:
* Added documentation for 'power-domains' property
* Removed 'window1' and 'window2' properties
* Removed '_clk' and '_rst' from clock and reset names
* Dropped 'pcie' from phy-names
* Added entry for BPMP-FW handle
* Removed offsets for some of the registers and added them in code, where they
  are picked up based on controller ID
* Changed 'nvidia,max-speed' to 'max-link-speed' and made it optional
* Changed 'nvidia,disable-clock-request' to 'supports-clkreq' with inverted
  operation
* Added more documentation for 'nvidia,update-fc-fixup' property
* Removed 'nvidia,enable-power-down' and 'nvidia,plat-gpios' properties
* Added '-us' to all properties that represent time in microseconds
* Moved P2U documentation to a separate file

 .../bindings/pci/nvidia,tegra194-pcie.txt | 187 ++
 1 file changed, 187 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/pci/nvidia,tegra194-pcie.txt

diff --git a/Documentation/devicetree/bindings/pci/nvidia,tegra194-pcie.txt b/Documentation/devicetree/bindings/pci/nvidia,tegra194-pcie.txt
new file mode 100644
index ..208dff126108
--- /dev/null
+++ b/Documentation/devicetree/bindings/pci/nvidia,tegra194-pcie.txt
@@ -0,0 +1,187 @@
+NVIDIA Tegra PCIe controller (Synopsys DesignWare Core based)
+
+This PCIe host controller is based on the Synopsys DesignWare PCIe IP
+and thus inherits all the common properties defined in designware-pcie.txt.
+
+Required properties:
+- compatible: For Tegra19x, must contain "nvidia,tegra194-pcie".
+- device_type: Must be "pci"
+- power-domains: A phandle to the node that controls power to the respective
+  PCIe controller and a specifier name for the PCIe controller. Following are
+  the specifiers for the different PCIe controllers
+TEGRA194_POWER_DOMAIN_PCIEX8B: C0
+TEGRA194_POWER_DOMAIN_PCIEX1A: C1
+TEGRA194_POWER_DOMAIN_PCIEX1A: C2
+TEGRA194_POWER_DOMAIN_PCIEX1A: C3
+TEGRA194_POWER_DOMAIN_PCIEX4A: C4
+TEGRA194_POWER_DOMAIN_PCIEX8A: C5
+  these specifiers are defined in
+  "include/dt-bindings/power/tegra194-powergate.h" file.
+- reg: A list of physical base address and length for each set of controller
+  registers. Must contain an entry for each entry in the reg-names property.
+- reg-names: Must include the following entries:
+  "appl": Controller's application logic registers
+  "config": As per the definition in designware-pcie.txt
+  "atu_dma": iATU and DMA registers. This is where the iATU (internal Address
+ Translation Unit) registers of the PCIe core are made available
+ for SW access.
+  "dbi": The aperture where root port's own configuration registers are
+ available
+- interrupts: A list of interrupt outputs of the controller. Must contain an
+  entry for each entry in the interrupt-names property.
+- interrupt-names: Must include the following entries:
+  "intr": The Tegra interrupt that is asserted for controller interrupts
+  "msi": The Tegra interrupt that is asserted when an MSI is received
+- bus-range: Range of bus numbers associated with this controller
+- #address-cells: Address representation for root ports (must be 3)
+  - cell 0 specifies the bus and device numbers of the root port:
+[23:16]: bus number
+[15:11]: device number
+  - cell 1 denotes the upper 32 address bits and should be 0
+  - cell 2 contains the lower 32 address bits and is used to translate to the
+CPU address space
+- #size-cells: Size representation for root ports (must be 2)
+- ranges: Describes the translation of addresses for root ports and standard
+  PCI regions. The entries must be 7 cells each, where the first three cells
+  correspond to the address as described for the #address-cells property
+  above, the fourth and fifth cells are for the physical CPU address to
+  translate to and the sixth and seventh cells are as described for the
+  #size-cells property above.
+  - Entries setup the mapping for the standard I/O, memory and
+prefetchable PCI regions. The first cell determines the type of region
+that is setup:
+- 0x8100: I/O memory region
+- 0x8200: non-prefetchable memory region
+- 0xc200: prefetchable memory region
+  Please refer to the standard PCI bus binding document for a more detailed
+  explanation.
+- #interrupt-cells: Size representation for interrupts (must be 1)
+- interrupt-map-mask and interrupt-map: Standard PCI IRQ mapping
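
The cell 0 layout described for #address-cells above can be checked with a
small user-space sketch. The helper names pci_cell0_bus() and
pci_cell0_device() are hypothetical and are not part of the binding or of any
kernel API; only the bit positions [23:16] and [15:11] come from the text.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: bus number lives in bits [23:16] of cell 0. */
static unsigned int pci_cell0_bus(uint32_t cell0)
{
	return (cell0 >> 16) & 0xff;
}

/* Hypothetical helper: device number lives in bits [15:11] of cell 0. */
static unsigned int pci_cell0_device(uint32_t cell0)
{
	return (cell0 >> 11) & 0x1f;
}
```

The two helpers only mask and shift, so they make no assumption about the
remaining bits of the cell.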

[PATCH V5 08/16] PCI: dwc: Add support to enable CDM register check

2019-04-23 Thread Vidya Sagar

Add support to enable CDM (Configuration Dependent Module) register check
for any data corruption based on the device-tree flag 'enable-cdm-check'.

Signed-off-by: Vidya Sagar 
Acked-by: Gustavo Pimentel 
---
Changes since [v4]:
* None

Changes since [v3]:
* None

Changes since [v2]:
* Changed code and commit description to reflect change in flag from
  'cdm-check' to 'enable-cdm-check'

Changes since [v1]:
* This is a new patch in v2 series

 drivers/pci/controller/dwc/pcie-designware.c | 7 +++
 drivers/pci/controller/dwc/pcie-designware.h | 9 +
 2 files changed, 16 insertions(+)

diff --git a/drivers/pci/controller/dwc/pcie-designware.c b/drivers/pci/controller/dwc/pcie-designware.c
index 417ee51ae502..535b72292f53 100644
--- a/drivers/pci/controller/dwc/pcie-designware.c
+++ b/drivers/pci/controller/dwc/pcie-designware.c
@@ -561,4 +561,11 @@ void dw_pcie_setup(struct dw_pcie *pci)
break;
}
dw_pcie_writel_dbi(pci, PCIE_LINK_WIDTH_SPEED_CONTROL, val);
+
+   if (of_property_read_bool(np, "enable-cdm-check")) {
+   val = dw_pcie_readl_dbi(pci, PCIE_PL_CHK_REG_CONTROL_STATUS);
+   val |= PCIE_PL_CHK_REG_CHK_REG_CONTINUOUS |
+  PCIE_PL_CHK_REG_CHK_REG_START;
+   dw_pcie_writel_dbi(pci, PCIE_PL_CHK_REG_CONTROL_STATUS, val);
+   }
 }
diff --git a/drivers/pci/controller/dwc/pcie-designware.h b/drivers/pci/controller/dwc/pcie-designware.h
index 67307842e003..0b3323932fed 100644
--- a/drivers/pci/controller/dwc/pcie-designware.h
+++ b/drivers/pci/controller/dwc/pcie-designware.h
@@ -83,6 +83,15 @@
 #define PCIE_MISC_CONTROL_1_OFF0x8BC
 #define PCIE_DBI_RO_WR_EN  BIT(0)
 
+#define PCIE_PL_CHK_REG_CONTROL_STATUS 0xB20
+#define PCIE_PL_CHK_REG_CHK_REG_START  BIT(0)
+#define PCIE_PL_CHK_REG_CHK_REG_CONTINUOUS BIT(1)
+#define PCIE_PL_CHK_REG_CHK_REG_COMPARISON_ERROR   BIT(16)
+#define PCIE_PL_CHK_REG_CHK_REG_LOGIC_ERRORBIT(17)
+#define PCIE_PL_CHK_REG_CHK_REG_COMPLETE   BIT(18)
+
+#define PCIE_PL_CHK_REG_ERR_ADDR   0xB28
+
 /*
  * iATU Unroll-specific register definitions
  * From 4.80 core version the address translation will be made by unroll
-- 
2.17.1

[PATCH V5 07/16] dt-bindings: PCI: designware: Add binding for CDM register check

2019-04-23 Thread Vidya Sagar

Add support to enable CDM (Configuration Dependent Module) registers check
for any data corruption. CDM registers include standard PCIe configuration
space registers, Port Logic registers and iATU and DMA registers.
Refer to Section S.4 of Synopsys DesignWare Cores PCI Express Controller
Databook Version 4.90a.

Signed-off-by: Vidya Sagar 
---
Changes since [v4]:
* None

Changes since [v3]:
* None

Changes since [v2]:
* Changed flag name from 'cdm-check' to 'enable-cdm-check'
* Added info about Port Logic and DMA registers being part of CDM

Changes since [v1]:
* This is a new patch in v2 series

 Documentation/devicetree/bindings/pci/designware-pcie.txt | 5 +
 1 file changed, 5 insertions(+)

diff --git a/Documentation/devicetree/bindings/pci/designware-pcie.txt b/Documentation/devicetree/bindings/pci/designware-pcie.txt
index 5561a1c060d0..85b872c42a9f 100644
--- a/Documentation/devicetree/bindings/pci/designware-pcie.txt
+++ b/Documentation/devicetree/bindings/pci/designware-pcie.txt
@@ -34,6 +34,11 @@ Optional properties:
 - clock-names: Must include the following entries:
- "pcie"
- "pcie_bus"
+- enable-cdm-check: This is a boolean property and if present enables
+   automatic checking of CDM (Configuration Dependent Module) registers
+   for data corruption. CDM registers include standard PCIe configuration
+   space registers, Port Logic registers, DMA and iATU (internal Address
+   Translation Unit) registers.
 RC mode:
 - num-viewport: number of view ports configured in hardware. If a platform
   does not specify it, the driver assumes 2.
-- 
2.17.1

[PATCH V5 09/16] Documentation/devicetree: Add PCIe supports-clkreq property

2019-04-23 Thread Vidya Sagar

Some host controllers need to know the existence of clkreq signal routing to
downstream devices to be able to advertise low power features like ASPM L1
substates. Without clkreq signal routing being present, enabling ASPM L1 sub
states might lead to downstream devices falling off the bus. Hence a new device
tree property 'supports-clkreq' is added to make such host controllers
aware of clkreq signal routing to downstream devices.

Signed-off-by: Vidya Sagar 
---
Changes since [v4]:
* None

Changes since [v3]:
* Rebased on top of linux-next top of the tree

Changes since [v2]:
* None

Changes since [v1]:
* This is a new patch in v2 series

 Documentation/devicetree/bindings/pci/pci.txt | 5 +
 1 file changed, 5 insertions(+)

diff --git a/Documentation/devicetree/bindings/pci/pci.txt b/Documentation/devicetree/bindings/pci/pci.txt
index 92c01db610df..d132f9efeb3e 100644
--- a/Documentation/devicetree/bindings/pci/pci.txt
+++ b/Documentation/devicetree/bindings/pci/pci.txt
@@ -24,6 +24,11 @@ driver implementation may support the following properties:
unsupported link speed, for instance, trying to do training for
unsupported link speed, etc.  Must be '4' for gen4, '3' for gen3, '2'
for gen2, and '1' for gen1. Any other values are invalid.
+- supports-clkreq:
+   If present this property specifies that CLKREQ signal routing exists from
+   root port to downstream device and host bridge drivers can do programming
+   which depends on CLKREQ signal existence. For example, programming root port
+   not to advertise ASPM L1 Sub-States support if there is no CLKREQ signal.
 
 PCI-PCI Bridge properties
 -
-- 
2.17.1

[PATCH V5 03/16] PCI: Export pcie_bus_config symbol

2019-04-23 Thread Vidya Sagar

Export pcie_bus_config so that host controller drivers which set it to a
specific configuration can be built as loadable modules.

Signed-off-by: Vidya Sagar 
---
Changes since [v4]:
* None

Changes since [v3]:
* None

Changes since [v2]:
* None

Changes since [v1]:
* This is a new patch in v2 series

 drivers/pci/pci.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index f5ff01dc4b13..731f78508601 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -94,6 +94,7 @@ unsigned long pci_hotplug_mem_size = DEFAULT_HOTPLUG_MEM_SIZE;
 unsigned long pci_hotplug_bus_size = DEFAULT_HOTPLUG_BUS_SIZE;
 
 enum pcie_bus_config_types pcie_bus_config = PCIE_BUS_DEFAULT;
+EXPORT_SYMBOL_GPL(pcie_bus_config);
 
 /*
  * The default CLS is used if arch didn't set CLS explicitly and not
-- 
2.17.1

[PATCH V5 04/16] PCI: dwc: Perform dbi regs write lock towards the end

2019-04-23 Thread Vidya Sagar

Remove the repeated write-enable and write-disable sequences for the DBI
registers. On Tegra194, writes to the BAR-0 register (offset: 0x10) are
controlled by the DBI write-lock enable bit, so once write permission is
disabled no further writes to the BAR-0 register in config space take place.
Hence disable write permission only towards the end.

Signed-off-by: Vidya Sagar 
---
Changes since [v4]:
* None

Changes since [v3]:
* None

Changes since [v2]:
* None

Changes since [v1]:
* None

 drivers/pci/controller/dwc/pcie-designware-host.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c b/drivers/pci/controller/dwc/pcie-designware-host.c
index 36fd3f5b48f6..e5e3571dd2fe 100644
--- a/drivers/pci/controller/dwc/pcie-designware-host.c
+++ b/drivers/pci/controller/dwc/pcie-designware-host.c
@@ -654,7 +654,6 @@ void dw_pcie_setup_rc(struct pcie_port *pp)
val &= 0x00ff;
val |= 0x0100;
dw_pcie_writel_dbi(pci, PCI_INTERRUPT_LINE, val);
-   dw_pcie_dbi_ro_wr_dis(pci);
 
/* Setup bus numbers */
val = dw_pcie_readl_dbi(pci, PCI_PRIMARY_BUS);
@@ -686,8 +685,6 @@ void dw_pcie_setup_rc(struct pcie_port *pp)
 
dw_pcie_wr_own_conf(pp, PCI_BASE_ADDRESS_0, 4, 0);
 
-   /* Enable write permission for the DBI read-only register */
-   dw_pcie_dbi_ro_wr_en(pci);
/* Program correct class for RC */
dw_pcie_wr_own_conf(pp, PCI_CLASS_DEVICE, 2, PCI_CLASS_BRIDGE_PCI);
/* Better disable write permission right after the update */
-- 
2.17.1

[PATCH V5 05/16] PCI: dwc: Move config space capability search API

2019-04-23 Thread Vidya Sagar

Move PCIe config space capability search API to common DesignWare file
as this can be used by both host and ep mode codes.

Signed-off-by: Vidya Sagar 
Acked-by: Gustavo Pimentel 
---
Changes from [v4]:
* Removed redundant APIs in pcie-designware-ep.c file after moving them
  to pcie-designware.c file based on Bjorn's comments.

Changes from [v3]:
* Rebased to linux-next top of the tree

Changes from [v2]:
* None

Changes from [v1]:
* Removed dw_pcie_find_next_ext_capability() API from here and made a
  separate patch for that

 .../pci/controller/dwc/pcie-designware-ep.c   | 37 +-
 drivers/pci/controller/dwc/pcie-designware.c  | 39 +++
 drivers/pci/controller/dwc/pcie-designware.h  |  2 +
 3 files changed, 43 insertions(+), 35 deletions(-)

diff --git a/drivers/pci/controller/dwc/pcie-designware-ep.c b/drivers/pci/controller/dwc/pcie-designware-ep.c
index 2bf5a35c0570..65f479250087 100644
--- a/drivers/pci/controller/dwc/pcie-designware-ep.c
+++ b/drivers/pci/controller/dwc/pcie-designware-ep.c
@@ -40,39 +40,6 @@ void dw_pcie_ep_reset_bar(struct dw_pcie *pci, enum pci_barno bar)
__dw_pcie_ep_reset_bar(pci, bar, 0);
 }
 
-static u8 __dw_pcie_ep_find_next_cap(struct dw_pcie *pci, u8 cap_ptr,
- u8 cap)
-{
-   u8 cap_id, next_cap_ptr;
-   u16 reg;
-
-   if (!cap_ptr)
-   return 0;
-
-   reg = dw_pcie_readw_dbi(pci, cap_ptr);
-   cap_id = (reg & 0x00ff);
-
-   if (cap_id > PCI_CAP_ID_MAX)
-   return 0;
-
-   if (cap_id == cap)
-   return cap_ptr;
-
-   next_cap_ptr = (reg & 0xff00) >> 8;
-   return __dw_pcie_ep_find_next_cap(pci, next_cap_ptr, cap);
-}
-
-static u8 dw_pcie_ep_find_capability(struct dw_pcie *pci, u8 cap)
-{
-   u8 next_cap_ptr;
-   u16 reg;
-
-   reg = dw_pcie_readw_dbi(pci, PCI_CAPABILITY_LIST);
-   next_cap_ptr = (reg & 0x00ff);
-
-   return __dw_pcie_ep_find_next_cap(pci, next_cap_ptr, cap);
-}
-
 static int dw_pcie_ep_write_header(struct pci_epc *epc, u8 func_no,
   struct pci_epf_header *hdr)
 {
@@ -612,9 +579,9 @@ int dw_pcie_ep_init(struct dw_pcie_ep *ep)
dev_err(dev, "Failed to reserve memory for MSI/MSI-X\n");
return -ENOMEM;
}
-   ep->msi_cap = dw_pcie_ep_find_capability(pci, PCI_CAP_ID_MSI);
+   ep->msi_cap = dw_pcie_find_capability(pci, PCI_CAP_ID_MSI);
 
-   ep->msix_cap = dw_pcie_ep_find_capability(pci, PCI_CAP_ID_MSIX);
+   ep->msix_cap = dw_pcie_find_capability(pci, PCI_CAP_ID_MSIX);
 
offset = dw_pcie_ep_find_ext_capability(pci, PCI_EXT_CAP_ID_REBAR);
if (offset) {
diff --git a/drivers/pci/controller/dwc/pcie-designware.c b/drivers/pci/controller/dwc/pcie-designware.c
index 8e0081ccf83b..ed21e861df82 100644
--- a/drivers/pci/controller/dwc/pcie-designware.c
+++ b/drivers/pci/controller/dwc/pcie-designware.c
@@ -20,6 +20,45 @@
 #define PCIE_PHY_DEBUG_R1_LINK_UP  (0x1 << 4)
 #define PCIE_PHY_DEBUG_R1_LINK_IN_TRAINING (0x1 << 29)
 
+/*
+ * These APIs are different from standard pci_find_*capability() APIs in the
+ * sense that former can only be used post device enumeration as they require
+ * 'struct pci_dev *' pointer whereas these APIs require 'struct dw_pcie *'
+ * pointer and can be used before link up also.
+ */
+static u8 __dw_pcie_find_next_cap(struct dw_pcie *pci, u8 cap_ptr,
+ u8 cap)
+{
+   u8 cap_id, next_cap_ptr;
+   u16 reg;
+
+   if (!cap_ptr)
+   return 0;
+
+   reg = dw_pcie_readw_dbi(pci, cap_ptr);
+   cap_id = (reg & 0x00ff);
+
+   if (cap_id > PCI_CAP_ID_MAX)
+   return 0;
+
+   if (cap_id == cap)
+   return cap_ptr;
+
+   next_cap_ptr = (reg & 0xff00) >> 8;
+   return __dw_pcie_find_next_cap(pci, next_cap_ptr, cap);
+}
+
+u8 dw_pcie_find_capability(struct dw_pcie *pci, u8 cap)
+{
+   u8 next_cap_ptr;
+   u16 reg;
+
+   reg = dw_pcie_readw_dbi(pci, PCI_CAPABILITY_LIST);
+   next_cap_ptr = (reg & 0x00ff);
+
+   return __dw_pcie_find_next_cap(pci, next_cap_ptr, cap);
+}
+
 int dw_pcie_read(void __iomem *addr, int size, u32 *val)
 {
if (!IS_ALIGNED((uintptr_t)addr, size)) {
diff --git a/drivers/pci/controller/dwc/pcie-designware.h b/drivers/pci/controller/dwc/pcie-designware.h
index 9ee98ced1ef6..35160b4ce929 100644
--- a/drivers/pci/controller/dwc/pcie-designware.h
+++ b/drivers/pci/controller/dwc/pcie-designware.h
@@ -248,6 +248,8 @@ struct dw_pcie {
 #define to_dw_pcie_from_ep(endpoint)   \
container_of((endpoint), struct dw_pcie, ep)
 
+u8 dw_pcie_find_capability(struct dw_pcie *pci, u8 cap);
+
 int dw_pcie_read(void __iomem *addr, int size, u32 *val);
 int dw_pcie_write(void __iomem *addr, int size, u32 val);
 
-- 
2.17.1
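
The capability walk moved by this patch can be exercised outside the kernel
against a mock config space. The sketch below is an assumption-laden model,
not the patch's code: it uses an iterative loop with a hop limit instead of
the recursion in __dw_pcie_find_next_cap(), reads from a plain byte array
instead of dw_pcie_readw_dbi(), and masks the low two pointer bits to stay in
bounds. The names find_capability() and sample_cfg are hypothetical; the
capability ID values are taken from pci_regs.h as I understand them, with
PCI_CAP_ID_MAX assumed to be 0x14.

```c
#include <assert.h>
#include <stdint.h>

#define PCI_CAPABILITY_LIST 0x34
#define PCI_CAP_ID_PM       0x01
#define PCI_CAP_ID_MSI      0x05
#define PCI_CAP_ID_EXP      0x10
#define PCI_CAP_ID_MSIX     0x11
#define PCI_CAP_ID_MAX      0x14 /* assumed value */

/* Mock 256-byte config space: PM @0x40 -> MSI @0x50 -> PCIe @0x70 -> end. */
static const uint8_t sample_cfg[256] = {
	[PCI_CAPABILITY_LIST] = 0x40,
	[0x40] = PCI_CAP_ID_PM,  [0x41] = 0x50,
	[0x50] = PCI_CAP_ID_MSI, [0x51] = 0x70,
	[0x70] = PCI_CAP_ID_EXP, [0x71] = 0x00,
};

/* Hypothetical model: return the offset of capability 'cap', or 0. */
static uint8_t find_capability(const uint8_t *cfg, uint8_t cap)
{
	uint8_t pos = cfg[PCI_CAPABILITY_LIST] & 0xfc;
	int ttl = 48; /* hop limit so a looping list cannot hang the walk */

	assert(cfg != 0);

	while (pos && ttl-- > 0) {
		uint8_t id = cfg[pos];

		if (id > PCI_CAP_ID_MAX)
			return 0;
		if (id == cap)
			return pos;
		/* Next pointer is the byte after the ID, dword aligned. */
		pos = cfg[pos + 1] & 0xfc;
	}
	return 0;
}
```

With sample_cfg, looking up PCI_CAP_ID_MSI yields 0x50 and looking up
PCI_CAP_ID_MSIX yields 0, mirroring how dw_pcie_ep_init() treats a zero
return as "capability not present".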

[PATCH V5 06/16] PCI: dwc: Add ext config space capability search API

2019-04-23 Thread Vidya Sagar

Add an extended configuration space capability search API that takes a
struct dw_pcie * pointer.

Signed-off-by: Vidya Sagar 
Acked-by: Gustavo Pimentel 
---
Changes from [v4]:
* None

Changes from [v3]:
* None

Changes from [v2]:
* None

Changes from [v1]:
* This is a new patch in v2 series

 drivers/pci/controller/dwc/pcie-designware.c | 41 
 drivers/pci/controller/dwc/pcie-designware.h |  1 +
 2 files changed, 42 insertions(+)

diff --git a/drivers/pci/controller/dwc/pcie-designware.c b/drivers/pci/controller/dwc/pcie-designware.c
index ed21e861df82..417ee51ae502 100644
--- a/drivers/pci/controller/dwc/pcie-designware.c
+++ b/drivers/pci/controller/dwc/pcie-designware.c
@@ -59,6 +59,47 @@ u8 dw_pcie_find_capability(struct dw_pcie *pci, u8 cap)
return __dw_pcie_find_next_cap(pci, next_cap_ptr, cap);
 }
 
+static int dw_pcie_find_next_ext_capability(struct dw_pcie *pci, int start,
+   int cap)
+{
+   u32 header;
+   int ttl;
+   int pos = PCI_CFG_SPACE_SIZE;
+
+   /* minimum 8 bytes per capability */
+   ttl = (PCI_CFG_SPACE_EXP_SIZE - PCI_CFG_SPACE_SIZE) / 8;
+
+   if (start)
+   pos = start;
+
+   header = dw_pcie_readl_dbi(pci, pos);
+   /*
+* If we have no capabilities, this is indicated by cap ID,
+* cap version and next pointer all being 0.
+*/
+   if (header == 0)
+   return 0;
+
+   while (ttl-- > 0) {
+   if (PCI_EXT_CAP_ID(header) == cap && pos != start)
+   return pos;
+
+   pos = PCI_EXT_CAP_NEXT(header);
+   if (pos < PCI_CFG_SPACE_SIZE)
+   break;
+
+   header = dw_pcie_readl_dbi(pci, pos);
+   }
+
+   return 0;
+}
+
+int dw_pcie_find_ext_capability(struct dw_pcie *pci, int cap)
+{
+   return dw_pcie_find_next_ext_capability(pci, 0, cap);
+}
+EXPORT_SYMBOL_GPL(dw_pcie_find_ext_capability);
+
 int dw_pcie_read(void __iomem *addr, int size, u32 *val)
 {
if (!IS_ALIGNED((uintptr_t)addr, size)) {
diff --git a/drivers/pci/controller/dwc/pcie-designware.h b/drivers/pci/controller/dwc/pcie-designware.h
index 35160b4ce929..67307842e003 100644
--- a/drivers/pci/controller/dwc/pcie-designware.h
+++ b/drivers/pci/controller/dwc/pcie-designware.h
@@ -249,6 +249,7 @@ struct dw_pcie {
container_of((endpoint), struct dw_pcie, ep)
 
 u8 dw_pcie_find_capability(struct dw_pcie *pci, u8 cap);
+int dw_pcie_find_ext_capability(struct dw_pcie *pci, int cap);
 
 int dw_pcie_read(void __iomem *addr, int size, u32 *val);
 int dw_pcie_write(void __iomem *addr, int size, u32 val);
-- 
2.17.1
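
The extended capability walk added here can likewise be tried against a mock
4 KiB config space. This is a sketch under stated assumptions: the loop
follows the structure of dw_pcie_find_next_ext_capability() in the hunk above,
but reads from an array instead of dw_pcie_readl_dbi(). The names
find_next_ext_capability(), EXT_HDR() and sample_ext_cfg are hypothetical, and
the header layout (ID in bits [15:0], next pointer in bits [31:20]) is the
standard PCIe extended capability header.

```c
#include <assert.h>
#include <stdint.h>

#define CFG_SPACE_SIZE     256
#define CFG_SPACE_EXP_SIZE 4096

#define EXT_CAP_ID(hdr)   ((int)((hdr) & 0x0000ffff))
#define EXT_CAP_NEXT(hdr) ((int)(((hdr) >> 20) & 0xffc))

/* Hypothetical: build a header from ID, version and next pointer. */
#define EXT_HDR(id, ver, next) \
	((uint32_t)(id) | ((uint32_t)(ver) << 16) | ((uint32_t)(next) << 20))

/* Mock space: AER @0x100 -> L1SS @0x148 -> DLF @0x158 -> end. */
static const uint32_t sample_ext_cfg[CFG_SPACE_EXP_SIZE / 4] = {
	[0x100 / 4] = EXT_HDR(0x01, 1, 0x148),
	[0x148 / 4] = EXT_HDR(0x1e, 1, 0x158),
	[0x158 / 4] = EXT_HDR(0x25, 1, 0x000),
};

/* Hypothetical model of the walk; returns the offset of 'cap', or 0. */
static int find_next_ext_capability(const uint32_t *cfg, int start, int cap)
{
	/* minimum 8 bytes per capability bounds the number of hops */
	int ttl = (CFG_SPACE_EXP_SIZE - CFG_SPACE_SIZE) / 8;
	int pos = start ? start : CFG_SPACE_SIZE;
	uint32_t header = cfg[pos / 4];

	assert(cfg != 0);

	/* An all-zero first header means there are no capabilities. */
	if (header == 0)
		return 0;

	while (ttl-- > 0) {
		if (EXT_CAP_ID(header) == cap && pos != start)
			return pos;

		pos = EXT_CAP_NEXT(header);
		if (pos < CFG_SPACE_SIZE)
			break;

		header = cfg[pos / 4];
	}
	return 0;
}
```

Passing a non-zero start skips the capability at that offset, which is what
lets a caller iterate over several instances of the same capability ID.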

[PATCH V5 02/16] PCI/PME: Export pcie_pme_disable_msi() & pcie_pme_no_msi() APIs

2019-04-23 Thread Vidya Sagar

Export pcie_pme_disable_msi() & pcie_pme_no_msi() APIs so that drivers
using them can be built as loadable modules.

Signed-off-by: Vidya Sagar 
---
Changes from [v4]:
* None

Changes from [v3]:
* None

Changes from [v2]:
* Exported pcie_pme_no_msi() API after making pcie_pme_msi_disabled a static

Changes from [v1]:
* This is a new patch in v2 series

 drivers/pci/pcie/pme.c | 14 +-
 drivers/pci/pcie/portdrv.h | 16 +++-
 2 files changed, 16 insertions(+), 14 deletions(-)

diff --git a/drivers/pci/pcie/pme.c b/drivers/pci/pcie/pme.c
index 54d593d10396..d5e0ea4a62fc 100644
--- a/drivers/pci/pcie/pme.c
+++ b/drivers/pci/pcie/pme.c
@@ -25,7 +25,19 @@
  * that using MSI for PCIe PME signaling doesn't play well with PCIe PME-based
  * wake-up from system sleep states.
  */
-bool pcie_pme_msi_disabled;
+static bool pcie_pme_msi_disabled;
+
+void pcie_pme_disable_msi(void)
+{
+   pcie_pme_msi_disabled = true;
+}
+EXPORT_SYMBOL_GPL(pcie_pme_disable_msi);
+
+bool pcie_pme_no_msi(void)
+{
+   return pcie_pme_msi_disabled;
+}
+EXPORT_SYMBOL_GPL(pcie_pme_no_msi);
 
 static int __init pcie_pme_setup(char *str)
 {
diff --git a/drivers/pci/pcie/portdrv.h b/drivers/pci/pcie/portdrv.h
index 1d50dc58ac40..7c8c3da4bd58 100644
--- a/drivers/pci/pcie/portdrv.h
+++ b/drivers/pci/pcie/portdrv.h
@@ -125,22 +125,12 @@ void pcie_port_bus_unregister(void);
 struct pci_dev;
 
 #ifdef CONFIG_PCIE_PME
-extern bool pcie_pme_msi_disabled;
-
-static inline void pcie_pme_disable_msi(void)
-{
-   pcie_pme_msi_disabled = true;
-}
-
-static inline bool pcie_pme_no_msi(void)
-{
-   return pcie_pme_msi_disabled;
-}
-
+void pcie_pme_disable_msi(void);
+bool pcie_pme_no_msi(void);
 void pcie_pme_interrupt_enable(struct pci_dev *dev, bool enable);
 #else /* !CONFIG_PCIE_PME */
 static inline void pcie_pme_disable_msi(void) {}
-static inline bool pcie_pme_no_msi(void) { return false; }
+static inline bool pcie_pme_no_msi(void) {}
 static inline void pcie_pme_interrupt_enable(struct pci_dev *dev, bool en) {}
 #endif /* !CONFIG_PCIE_PME */
 
-- 
2.17.1

Re: [PATCH v2] nvmem: core: add NVMEM_SYSFS Kconfig

2019-04-23 Thread Gaurav Kohli


Hi,

Sorry for the spam.

When is this patch planned to be merged?

Regards
Gaurav

On 4/16/2019 4:45 PM, Gaurav Kohli wrote:

Hi,

I have reviewed and tested this with NVMEM_SYSFS both enabled and disabled,
and it works as expected.


Please feel free to add:

Reviewed-by: Gaurav Kohli 
Tested-by: Gaurav Kohli 

Regards
Gaurav
On 4/16/2019 4:31 PM, Mika Westerberg wrote:

On Tue, Apr 16, 2019 at 10:59:24AM +0100, Srinivas Kandagatla wrote:

Many nvmem providers are not very keen on having default sysfs
nvmem entry, as most of the usecases for them are inside kernel
itself. And in some cases read/writes to some areas in nvmem are
restricted and trapped at secure monitor level, so accessing them
from userspace would result in board reboots.

This patch adds new NVMEM_SYSFS Kconfig to make binary sysfs entry
an optional one. This provision will give more flexibility to users.
This patch also moves existing sysfs code to a new file so that it's
not compiled in when it's not really required.

Signed-off-by: Srinivas Kandagatla 


Reviewed-by: Mika Westerberg 





--
Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center,
Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.

[PATCH V5 01/16] PCI: Add #defines for some of PCIe spec r4.0 features

2019-04-23 Thread Vidya Sagar

Add #defines only for the Data Link Feature and Physical Layer 16.0 GT/s
features.

Signed-off-by: Vidya Sagar 
Reviewed-by: Thierry Reding 
---
Changes from [v4]:
* None

Changes from [v3]:
* None

Changes from [v2]:
* Updated commit message and description to explicitly mention that defines are
  added only for some of the features and not all.

Changes from [v1]:
* None

 include/uapi/linux/pci_regs.h | 22 +-
 1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/include/uapi/linux/pci_regs.h b/include/uapi/linux/pci_regs.h
index f7d3e7831fa8..4da04b1faab3 100644
--- a/include/uapi/linux/pci_regs.h
+++ b/include/uapi/linux/pci_regs.h
@@ -703,7 +703,9 @@
 #define PCI_EXT_CAP_ID_DPC 0x1D/* Downstream Port Containment */
 #define PCI_EXT_CAP_ID_L1SS0x1E/* L1 PM Substates */
 #define PCI_EXT_CAP_ID_PTM 0x1F/* Precision Time Measurement */
-#define PCI_EXT_CAP_ID_MAX PCI_EXT_CAP_ID_PTM
+#define PCI_EXT_CAP_ID_DLF 0x25/* Data Link Feature */
+#define PCI_EXT_CAP_ID_PL  0x26/* Physical Layer 16.0 GT/s */
+#define PCI_EXT_CAP_ID_MAX PCI_EXT_CAP_ID_PL
 
 #define PCI_EXT_CAP_DSN_SIZEOF 12
 #define PCI_EXT_CAP_MCAST_ENDPOINT_SIZEOF 40
@@ -1043,4 +1045,22 @@
 #define  PCI_L1SS_CTL1_LTR_L12_TH_SCALE0xe000  /* LTR_L1.2_THRESHOLD_Scale */
 #define PCI_L1SS_CTL2  0x0c/* Control 2 Register */
 
+/* Data Link Feature */
+#define PCI_DLF_CAP0x04/* Capabilities Register */
+#define  PCI_DLF_LOCAL_DLF_SUP_MASK0x007f  /* Local Data Link Feature Supported */
+#define  PCI_DLF_EXCHANGE_ENABLE   0x8000  /* Data Link Feature Exchange Enable */
+#define PCI_DLF_STS0x08/* Status Register */
+#define  PCI_DLF_REMOTE_DLF_SUP_MASK   0x007f  /* Remote Data Link Feature Supported */
+#define  PCI_DLF_REMOTE_DLF_SUP_VALID  0x8000  /* Remote Data Link Feature Support Valid */
+
+/* Physical Layer 16.0 GT/s */
+#define PCI_PL_16GT_CAP0x04/* Capabilities Register */
+#define PCI_PL_16GT_CTRL   0x08/* Control Register */
+#define PCI_PL_16GT_STS0x0c/* Status Register */
+#define PCI_PL_16GT_LDPM_STS   0x10/* Local Data Parity Mismatch Status Register */
+#define PCI_PL_16GT_FRDPM_STS  0x14/* First Retimer Data Parity Mismatch Status Register */
+#define PCI_PL_16GT_SRDPM_STS  0x18/* Second Retimer Data Parity Mismatch Status Register */
+#define PCI_PL_16GT_RSVD   0x1C/* Reserved */
+#define PCI_PL_16GT_LE_CTRL0x20/* Lane Equalization Control Register */
+
 #endif /* LINUX_PCI_REGS_H */
-- 
2.17.1

[PATCH V5 00/16] Add Tegra194 PCIe support

2019-04-23 Thread Vidya Sagar

Tegra194 has six PCIe controllers based on Synopsys DesignWare core.
There are two Universal PHY (UPHY) blocks, supporting 12 (HSIO: High
Speed IO) and 8 (NVHS: NVIDIA High Speed) lanes respectively.
Controllers 0~4 use UPHY lanes from the HSIO brick whereas Controller 5 uses
UPHY lanes from the NVHS brick. Lane mapping in the HSIO UPHY brick to each
PCIe controller (0~4) is controlled in the XBAR module by BPMP-FW. Since the
PCIe core has a PIPE interface, a glue module called PIPE-to-UPHY (P2U) is
used to connect each UPHY lane (applicable to both HSIO and NVHS UPHY bricks)
to the PCIe controller.

This patch series
- Adds support for P2U PHY driver
- Adds support for PCIe host controller
- Adds device tree nodes for each PCIe controller
- Enables nodes applicable to p2972- platform
- Adds helper APIs in Designware core driver to get capability regs offset
- Adds defines for new feature registers of PCIe spec revision 4
- Makes changes in DesignWare core driver to get Tegra194 PCIe working

Testing done on P2972- platform
- Able to get PCIe link up with on-board Marvell eSATA controller
- Able to get PCIe link up with NVMe cards connected to M.2 Key-M slot
- Able to do data transfers with both SATA drives and NVMe cards

Note
- Enabling x8 slot on P2972- platform requires pinmux driver for Tegra194.
  It is being worked on currently and hence Controller:5 (i.e. x8 slot) is
  disabled in this patch series. A future patch series would enable this.
- This series is based on top of the following series
  Jisheng's patches to add support to .remove() in Designware sub-system
  https://patchwork.kernel.org/project/linux-pci/list/?series=98559
  (Jisheng's patches are now accepted and applied for v5.2)
  My patches made on top of Jisheng's patches to export various symbols
  https://patchwork.kernel.org/project/linux-pci/list/?series=101259

Changes since [v4]:
* Removed redundant APIs in pcie-designware-ep.c file after moving them
  to pcie-designware.c file based on Bjorn's review comments

Changes since [v3]:
* Rebased on top of linux-next top of the tree
* Addressed Gustavo's comments and added his Ack for some of the changes.

Changes since [v2]:
* Addressed review comments from Thierry

Changes since [v1]:
* Addressed review comments from Bjorn, Thierry, Jonathan, Rob & Kishon
* Added more patches in v2 series

Vidya Sagar (16):
  PCI: Add #defines for some of PCIe spec r4.0 features
  PCI/PME: Export pcie_pme_disable_msi() & pcie_pme_no_msi() APIs
  PCI: Export pcie_bus_config symbol
  PCI: dwc: Perform dbi regs write lock towards the end
  PCI: dwc: Move config space capability search API
  PCI: dwc: Add ext config space capability search API
  dt-bindings: PCI: designware: Add binding for CDM register check
  PCI: dwc: Add support to enable CDM register check
  Documentation/devicetree: Add PCIe supports-clkreq property
  dt-bindings: PCI: tegra: Add device tree support for T194
  dt-bindings: PHY: P2U: Add Tegra 194 P2U block
  arm64: tegra: Add P2U and PCIe controller nodes to Tegra194 DT
  arm64: tegra: Enable PCIe slots in P2972- board
  phy: tegra: Add PCIe PIPE2UPHY support
  PCI: tegra: Add Tegra194 PCIe support
  arm64: Add Tegra194 PCIe driver to defconfig

 .../bindings/pci/designware-pcie.txt  |5 +
 .../bindings/pci/nvidia,tegra194-pcie.txt |  187 ++
 Documentation/devicetree/bindings/pci/pci.txt |5 +
 .../bindings/phy/phy-tegra194-p2u.txt |   28 +
 .../arm64/boot/dts/nvidia/tegra194-p2888.dtsi |2 +-
 .../boot/dts/nvidia/tegra194-p2972-.dts   |   41 +
 arch/arm64/boot/dts/nvidia/tegra194.dtsi  |  449 +
 arch/arm64/configs/defconfig  |1 +
 drivers/pci/controller/dwc/Kconfig|   11 +
 drivers/pci/controller/dwc/Makefile   |1 +
 .../pci/controller/dwc/pcie-designware-ep.c   |   37 +-
 .../pci/controller/dwc/pcie-designware-host.c |3 -
 drivers/pci/controller/dwc/pcie-designware.c  |   87 +
 drivers/pci/controller/dwc/pcie-designware.h  |   12 +
 drivers/pci/controller/dwc/pcie-tegra194.c| 1760 +
 drivers/pci/pci.c |1 +
 drivers/pci/pcie/pme.c|   14 +-
 drivers/pci/pcie/portdrv.h|   16 +-
 drivers/phy/tegra/Kconfig |7 +
 drivers/phy/tegra/Makefile|1 +
 drivers/phy/tegra/pcie-p2u-tegra194.c |  120 ++
 include/uapi/linux/pci_regs.h |   22 +-
 22 files changed, 2756 insertions(+), 54 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/pci/nvidia,tegra194-pcie.txt
 create mode 100644 Documentation/devicetree/bindings/phy/phy-tegra194-p2u.txt
 create mode 100644 drivers/pci/controller/dwc/pcie-tegra194.c
 create mode 100644 drivers/phy/tegra/pcie-p2u-tegra194.c

-- 
2.17.1

[PATCH v4 2/2] thermal: rcar_gen3_thermal: disable interrupt in .remove

2019-04-23 Thread Jiada Wang

Currently the IRQ remains enabled after .remove. If the device is later
probed again, the IRQ is requested before .thermal_init, which may cause the
IRQ function to be called before the device is initialized.

This patch disables the interrupt in .remove, to ensure the IRQ function is
only called after the device is fully initialized.

Signed-off-by: Jiada Wang 
---
 drivers/thermal/rcar_gen3_thermal.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/thermal/rcar_gen3_thermal.c b/drivers/thermal/rcar_gen3_thermal.c
index 065e16f53285..280230951dfe 100644
--- a/drivers/thermal/rcar_gen3_thermal.c
+++ b/drivers/thermal/rcar_gen3_thermal.c
@@ -307,6 +307,9 @@ MODULE_DEVICE_TABLE(of, rcar_gen3_thermal_dt_ids);
 static int rcar_gen3_thermal_remove(struct platform_device *pdev)
 {
struct device *dev = &pdev->dev;
+   struct rcar_gen3_thermal_priv *priv = dev_get_drvdata(dev);
+
+   rcar_thermal_irq_set(priv, false);
 
pm_runtime_put(dev);
pm_runtime_disable(dev);
-- 
2.19.2

[PATCH v4 1/2] thermal: rcar_gen3_thermal: fix interrupt type

2019-04-23 Thread Jiada Wang

Currently the interrupt line is requested with IRQF_SHARED, which is
not appropriate because the line isn't shared between different
devices. IRQF_ONESHOT is the proper flag.

With IRQF_ONESHOT the hard IRQ handler is no longer needed, because
the interrupt status can be cleared in threaded interrupt context.

Because an IRQF_ONESHOT interrupt line is kept disabled until the
threaded handler has run, there is no need to protect reads and writes
of REG_GEN3_IRQSTR with a lock.

Fixes: 7d4b269776ec6 ("enable hardware interrupts for trip points")
Signed-off-by: Jiada Wang 
---
 drivers/thermal/rcar_gen3_thermal.c | 38 +
 1 file changed, 6 insertions(+), 32 deletions(-)

diff --git a/drivers/thermal/rcar_gen3_thermal.c b/drivers/thermal/rcar_gen3_thermal.c
index 88fa41cf16e8..065e16f53285 100644
--- a/drivers/thermal/rcar_gen3_thermal.c
+++ b/drivers/thermal/rcar_gen3_thermal.c
@@ -14,7 +14,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 
@@ -82,7 +81,6 @@ struct rcar_gen3_thermal_tsc {
 struct rcar_gen3_thermal_priv {
struct rcar_gen3_thermal_tsc *tscs[TSC_MAX_NUM];
unsigned int num_tscs;
-   spinlock_t lock; /* Protect interrupts on and off */
void (*thermal_init)(struct rcar_gen3_thermal_tsc *tsc);
 };
 
@@ -232,38 +230,16 @@ static irqreturn_t rcar_gen3_thermal_irq(int irq, void *data)
 {
struct rcar_gen3_thermal_priv *priv = data;
u32 status;
-   int i, ret = IRQ_HANDLED;
+   int i;
 
-   spin_lock(&priv->lock);
for (i = 0; i < priv->num_tscs; i++) {
status = rcar_gen3_thermal_read(priv->tscs[i], REG_GEN3_IRQSTR);
rcar_gen3_thermal_write(priv->tscs[i], REG_GEN3_IRQSTR, 0);
if (status)
-   ret = IRQ_WAKE_THREAD;
+   thermal_zone_device_update(priv->tscs[i]->zone,
+  THERMAL_EVENT_UNSPECIFIED);
}
 
-   if (ret == IRQ_WAKE_THREAD)
-   rcar_thermal_irq_set(priv, false);
-
-   spin_unlock(&priv->lock);
-
-   return ret;
-}
-
-static irqreturn_t rcar_gen3_thermal_irq_thread(int irq, void *data)
-{
-   struct rcar_gen3_thermal_priv *priv = data;
-   unsigned long flags;
-   int i;
-
-   for (i = 0; i < priv->num_tscs; i++)
-   thermal_zone_device_update(priv->tscs[i]->zone,
-  THERMAL_EVENT_UNSPECIFIED);
-
-   spin_lock_irqsave(&priv->lock, flags);
-   rcar_thermal_irq_set(priv, true);
-   spin_unlock_irqrestore(&priv->lock, flags);
-
return IRQ_HANDLED;
 }
 
@@ -371,8 +347,6 @@ static int rcar_gen3_thermal_probe(struct platform_device *pdev)
if (soc_device_match(r8a7795es1))
priv->thermal_init = rcar_gen3_thermal_init_r8a7795es1;
 
-   spin_lock_init(&priv->lock);
-
platform_set_drvdata(pdev, priv);
 
/*
@@ -390,9 +364,9 @@ static int rcar_gen3_thermal_probe(struct platform_device *pdev)
if (!irqname)
return -ENOMEM;
 
-   ret = devm_request_threaded_irq(dev, irq, rcar_gen3_thermal_irq,
-   rcar_gen3_thermal_irq_thread,
-   IRQF_SHARED, irqname, priv);
+   ret = devm_request_threaded_irq(dev, irq, NULL,
+   rcar_gen3_thermal_irq,
+   IRQF_ONESHOT, irqname, priv);
if (ret)
return ret;
}
-- 
2.19.2

[PATCH v4 0/2] thermal: rcar_gen3_thermal: fix IRQ issues

2019-04-23 Thread Jiada Wang

There are issues with interrupt handling in the rcar_gen3_thermal driver.

Currently the IRQ remains enabled after .remove. If the device is probed
again later, the IRQ is requested before .thermal_init runs, so the IRQ
handler may be triggered without being able to clear the IRQ status,
which causes the system to hang.

Since the IRQ line isn't shared between different devices, the proper
interrupt type flag is IRQF_ONESHOT.

This patch set fixes these interrupt handling related issues.

---
v4: remove 'spinlock_t lock'
add Fixes tag in ("thermal: rcar_gen3_thermal: fix interrupt type")
fix typos in ("thermal: rcar_gen3_thermal: disable interrupt in .remove")

v3: fix to use correct code base
remove unused "flag" variable in rcar_gen3_thermal_irq

v2: use irq type IRQF_ONESHOT instead of IRQF_SHARED
disable interrupt in .remove

v1: initial version

Jiada Wang (2):
  thermal: rcar_gen3_thermal: fix interrupt type
  thermal: rcar_gen3_thermal: disable interrupt in .remove

 drivers/thermal/rcar_gen3_thermal.c | 41 +++--
 1 file changed, 9 insertions(+), 32 deletions(-)

-- 
2.19.2

linux-next: manual merge of the tip tree with the asm-generic tree

2019-04-23 Thread Stephen Rothwell

Hi all,

Today's linux-next merge of the tip tree got a conflict in:

  arch/arm64/include/asm/Kbuild

between commit:

  c67fdc1f00cb ("arch: mostly remove <asm/segment.h>")

from the asm-generic tree and commit:

  46ad0840b158 ("locking/rwsem: Remove arch specific rwsem files")

from the tip tree.

I fixed it up (see below) and can carry the fix as necessary. This
is now fixed as far as linux-next is concerned, but any non trivial
conflicts should be mentioned to your upstream maintainer when your tree
is submitted for merging.  You may also want to consider cooperating
with the maintainer of the conflicting tree to minimise any particularly
complex conflicts.

-- 
Cheers,
Stephen Rothwell

diff --cc arch/arm64/include/asm/Kbuild
index afd8a3740b18,60a933b07001..
--- a/arch/arm64/include/asm/Kbuild
+++ b/arch/arm64/include/asm/Kbuild
@@@ -17,7 -16,7 +17,6 @@@ generic-y += mmiowb.
  generic-y += msi.h
  generic-y += qrwlock.h
  generic-y += qspinlock.h
- generic-y += rwsem.h
 -generic-y += segment.h
  generic-y += serial.h
  generic-y += set_memory.h
  generic-y += sizes.h



[PATCH v2 6/6] mm: reparent slab memory on cgroup removal

2019-04-23 Thread Roman Gushchin

Let's reparent memcg slab memory on memcg offlining. This allows us
to release the memory cgroup without waiting for the last outstanding
kernel object (e.g. dentry used by another application).

So instead of reparenting all accounted slab pages, let's do reparent
a relatively small amount of kmem_caches. Reparenting is performed as
a part of the deactivation process.

Since the parent cgroup is already charged, everything we need to do
is to splice the list of kmem_caches to the parent's kmem_caches list,
swap the memcg pointer and drop the css refcounter for each kmem_cache
and adjust the parent's css refcounter. Quite simple.

Please, note that kmem_cache->memcg_params.memcg isn't a stable
pointer anymore. It's safe to read it under rcu_read_lock() or
with slab_mutex held.

We can race with the slab allocation and deallocation paths. It's not
a big problem: parent's charge and slab global stats are always
correct, and we don't care anymore about the child usage and global
stats. The child cgroup is already offline, so we don't use or show it
anywhere.

Local slab stats (NR_SLAB_RECLAIMABLE and NR_SLAB_UNRECLAIMABLE)
aren't used anywhere except count_shadow_nodes(). But even there it
won't break anything: after reparenting "nodes" will be 0 on child
level (because we're already reparenting shrinker lists), and on
parent level page stats always were 0, and this patch won't change
anything.

Signed-off-by: Roman Gushchin 
---
 include/linux/slab.h |  4 ++--
 mm/memcontrol.c  | 14 --
 mm/slab.h| 14 +-
 mm/slab_common.c | 23 ---
 4 files changed, 39 insertions(+), 16 deletions(-)

diff --git a/include/linux/slab.h b/include/linux/slab.h
index 1b54e5f83342..109cab2ad9b4 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -152,7 +152,7 @@ void kmem_cache_destroy(struct kmem_cache *);
 int kmem_cache_shrink(struct kmem_cache *);
 
 void memcg_create_kmem_cache(struct mem_cgroup *, struct kmem_cache *);
-void memcg_deactivate_kmem_caches(struct mem_cgroup *);
+void memcg_deactivate_kmem_caches(struct mem_cgroup *, struct mem_cgroup *);
 
 /*
  * Please use this macro to create slab caches. Simply specify the
@@ -638,7 +638,7 @@ struct memcg_cache_params {
bool dying;
};
struct {
-   struct mem_cgroup *memcg;
+   struct mem_cgroup __rcu *memcg;
struct list_head children_node;
struct list_head kmem_caches_node;
struct percpu_ref refcnt;
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index c9896105d8d5..27ae253922da 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -3201,15 +3201,15 @@ static void memcg_offline_kmem(struct mem_cgroup *memcg)
 */
memcg->kmem_state = KMEM_ALLOCATED;
 
-   memcg_deactivate_kmem_caches(memcg);
-
-   kmemcg_id = memcg->kmemcg_id;
-   BUG_ON(kmemcg_id < 0);
-
parent = parent_mem_cgroup(memcg);
if (!parent)
parent = root_mem_cgroup;
 
+   memcg_deactivate_kmem_caches(memcg, parent);
+
+   kmemcg_id = memcg->kmemcg_id;
+   BUG_ON(kmemcg_id < 0);
+
/*
 * Change kmemcg_id of this cgroup and all its descendants to the
 * parent's id, and then move all entries from this cgroup's list_lrus
@@ -3242,7 +3242,6 @@ static void memcg_free_kmem(struct mem_cgroup *memcg)
if (memcg->kmem_state == KMEM_ALLOCATED) {
WARN_ON(!list_empty(&memcg->kmem_caches));
static_branch_dec(&memcg_kmem_enabled_key);
-   WARN_ON(page_counter_read(&memcg->kmem));
}
 }
 #else
@@ -4654,6 +4653,9 @@ mem_cgroup_css_alloc(struct cgroup_subsys_state *parent_css)
 
/* The following stuff does not apply to the root */
if (!parent) {
+#ifdef CONFIG_MEMCG_KMEM
+   INIT_LIST_HEAD(&memcg->kmem_caches);
+#endif
root_mem_cgroup = memcg;
return &memcg->css;
}
diff --git a/mm/slab.h b/mm/slab.h
index 61110b3035e7..68c5fc6e557e 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -289,10 +289,11 @@ static __always_inline int memcg_charge_slab(struct page *page,
struct lruvec *lruvec;
int ret;
 
-   memcg = s->memcg_params.memcg;
+   rcu_read_lock();
+   memcg = rcu_dereference(s->memcg_params.memcg);
ret = memcg_kmem_charge_memcg(page, gfp, order, memcg);
if (ret)
-   return ret;
+   goto out;
 
lruvec = mem_cgroup_lruvec(page_pgdat(page), memcg);
mod_lruvec_state(lruvec, cache_vmstat_idx(s), 1 << order);
@@ -300,8 +301,9 @@ static __always_inline int memcg_charge_slab(struct page *page,
/* transer try_charge() page references to kmem_cache */
percpu_ref_get_many(&s->memcg_params.refcnt, 1 << order);
css_put_many(&memcg->css, 1 << order);
-
-   return 0;
+out:
+   rcu_read_unlock();
+   return ret;
 }

RE: [LINUX PATCH v14] mtd: rawnand: pl353: Add basic driver for arm pl353 smc nand interface

2019-04-23 Thread Naga Sureshkumar Relli

Hi Helmut,

> -Original Message-
> From: Helmut Grohne 
> Sent: Tuesday, April 23, 2019 6:15 PM
> To: Naga Sureshkumar Relli 
> Cc: bbrezil...@kernel.org; miquel.ray...@bootlin.com; rich...@nod.at;
> dw...@infradead.org; computersforpe...@gmail.com; marek.va...@gmail.com; 
> linux-
> m...@lists.infradead.org; linux-kernel@vger.kernel.org; Michal Simek 
> ;
> nagasureshkumarre...@gmail.com
> Subject: Re: [LINUX PATCH v14] mtd: rawnand: pl353: Add basic driver for arm 
> pl353 smc
> nand interface
> 
> WARNING: This driver might brick the hardware. See below.
> 
> Hi Naga,
> 
> On Mon, Apr 15, 2019 at 04:40:13PM +0530, Naga Sureshkumar Relli wrote:
> > Changes in v14:
> >  - Removed legacy hooks as per Miquel comments
> 
> Thank you for the update.
> 
> > +static inline int pl353_wait_for_dev_ready(struct nand_chip *chip) {
> > +   unsigned long timeout = jiffies + PL353_NAND_DEV_BUSY_TIMEOUT;
> > +
> > +   do {
> > +   if (pl353_smc_get_nand_int_status_raw()) {
> > +   pl353_smc_clr_nand_int();
> > +   break;
> 
> A closing brace is missing here. This causes a compilation failure.
This happened while I was cleaning up the warnings reported by checkpatch.
Sorry for that, I will correct it.
> 
> > +
> > +   cpu_relax();
> 
> You previously used cond_resched (via nand_wait_ready) here. Why did you 
> change it to
> cpu_relax()?
I just replicated the pl353_wait_for_ecc_done() API definition.
Did you see any issue with this?
Anyway, I will replace cpu_relax() with cond_resched().
> 
> > +   } while (!time_after_eq(jiffies, timeout));
> > +
> > +   if (time_after_eq(jiffies, timeout)) {
> > +   pr_err("%s timed out\n", __func__);
> > +   return -ETIMEDOUT;
> > +   }
> > +
> > +   return 0;
> > +}
> 
> 
> > +static int pl353_nand_read_page_hwecc(struct nand_chip *chip,
> > + u8 *buf, int oob_required, int page) {
> > +   int i, stat, eccsize = chip->ecc.size;
> > +   int eccbytes = chip->ecc.bytes;
> > +   int eccsteps = chip->ecc.steps;
> > +   u8 *p = buf;
> > +   u8 *ecc_calc = chip->ecc.calc_buf;
> > +   u8 *ecc = chip->ecc.code_buf;
> > +   unsigned int max_bitflips = 0;
> > +   u8 *oob_ptr;
> > +   u32 ret;
> > +   unsigned long data_phase_addr;
> > +   unsigned long nand_offset = (unsigned long __force)xnfc->regs;
> 
> The variable xnfc is undeclared here. Consider swapping the line with the 
> next one.
> 
> > +   struct pl353_nand_controller *xnfc = to_pl353_nand(chip);
> > +   struct mtd_info *mtd = nand_to_mtd(chip);
> 
> After loading the driver, the device does not work. The dmesg output is:
> 
> nand: device found, Manufacturer ID: 0x2c, Chip ID: 0xda
> nand: Micron MT29F2G08ABAEAWP
> nand: 256 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64
> Bad block table not found for chip 0
> Bad block table not found for chip 0
> Scanning device for bad blocks
> nand_bbt: error while writing BBT block -524
> nand_bbt: error while writing BBT block -524
> nand_bbt: error while writing BBT block -524
> nand_bbt: error while writing BBT block -524
> No space left to write bad block table
> nand_bbt: error while writing bad block table -28
> pl353-nand e100.flash: could not scan the nand chip
> pl353-nand: probe of e100.flash failed with error -28
Did you follow the same steps that you tried earlier, i.e. updating the
"nand-bus-width" and "nand-ecc-mode" properties?
I haven't seen any issue in BBT scanning with this patch.

> 
> After trying the driver, the flash chip was bricked. Neither the old driver
> nor the uboot-xlnx driver nor the Xilinx fsbl are able to talk to the chip
> afterwards. This behaviour persists even after a full power cycle. I'll try
> reinitializing the flash chip next. I've only seen this behaviour once, so
> there is a slight chance that the cause is something else.
I sometimes faced the same problem during driver development.
What I did was forcibly create the BBT in the init code of the standalone
nandps driver example and, once that was done, reload the actual example.
Afterwards, u-boot and Linux were able to scan the BBT.

Thanks,
Naga Sureshkumar Relli
> 
> Helmut
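
For reference, the wait loop discussed above can be modelled in userspace.
This is only a sketch under stated assumptions: wait_for_ready(), the two
callback types and struct fake_dev are hypothetical names, the tick counter
stands in for jiffies, and the signed comparison mirrors what
time_after_eq() does in the kernel. It is not the pl353 driver code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical callbacks: device status and a monotonic tick counter. */
typedef bool (*ready_fn)(void *ctx);
typedef unsigned long (*ticks_fn)(void *ctx);

/* Poll until the device reports ready or the deadline has passed. */
static int wait_for_ready(ready_fn ready, ticks_fn now, void *ctx,
                          unsigned long timeout_ticks)
{
    unsigned long deadline;

    assert(ready && now);
    deadline = now(ctx) + timeout_ticks;

    do {
        if (ready(ctx))
            return 0;
        /* A kernel driver would call cond_resched() or cpu_relax() here. */
    } while ((long)(now(ctx) - deadline) < 0);

    /* Re-check once so that a late "ready" is not reported as a timeout. */
    return ready(ctx) ? 0 : -ETIMEDOUT;
}

/* Fake device for illustration: becomes ready once ticks reach ready_at. */
struct fake_dev {
    unsigned long ticks;
    unsigned long ready_at;
};

static bool fake_ready(void *ctx)
{
    struct fake_dev *d = ctx;

    return d->ticks >= d->ready_at;
}

static unsigned long fake_now(void *ctx)
{
    struct fake_dev *d = ctx;

    return d->ticks++; /* every clock read advances the fake time */
}
```

Returning from inside the loop and re-checking the status after the deadline
avoids evaluating the timeout condition twice, as the quoted version does.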

[PATCH] PCI: altera-msi: Allow building as module

2019-04-23 Thread Ley Foon Tan

The Altera MSI IP is a soft IP and is only available after
the FPGA image is programmed.

Make the driver buildable as a module to support the use case where
the FPGA image is programmed after the kernel has booted: the user
programs the FPGA image from the running kernel and only then loads
the MSI driver module.

Signed-off-by: Ley Foon Tan 
---
 drivers/pci/controller/Kconfig   |  2 +-
 drivers/pci/controller/pcie-altera-msi.c | 10 ++
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/controller/Kconfig b/drivers/pci/controller/Kconfig
index 4b550f9cdd56..920546cb84e2 100644
--- a/drivers/pci/controller/Kconfig
+++ b/drivers/pci/controller/Kconfig
@@ -181,7 +181,7 @@ config PCIE_ALTERA
  FPGA.
 
 config PCIE_ALTERA_MSI
-   bool "Altera PCIe MSI feature"
+   tristate "Altera PCIe MSI feature"
depends on PCIE_ALTERA
depends on PCI_MSI_IRQ_DOMAIN
help
diff --git a/drivers/pci/controller/pcie-altera-msi.c b/drivers/pci/controller/pcie-altera-msi.c
index 025ef7d9a046..16d938920ca5 100644
--- a/drivers/pci/controller/pcie-altera-msi.c
+++ b/drivers/pci/controller/pcie-altera-msi.c
@@ -10,6 +10,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -288,4 +289,13 @@ static int __init altera_msi_init(void)
 {
return platform_driver_register(&altera_msi_driver);
 }
+
+static void __exit altera_msi_exit(void)
+{
+   platform_driver_unregister(&altera_msi_driver);
+}
+
 subsys_initcall(altera_msi_init);
+MODULE_DEVICE_TABLE(of, altera_msi_of_match);
+module_exit(altera_msi_exit);
+MODULE_LICENSE("GPL v2");
-- 
2.19.0

[PATCH] PCI: altera: Allow building as module

2019-04-23 Thread Ley Foon Tan

The Altera PCIe Rootport IP is a soft IP and is only available after
the FPGA image is programmed.

Make the driver buildable as a module to support the use case where
the FPGA image is programmed after the kernel has booted: the user
programs the FPGA image from the running kernel and only then loads
the PCIe driver module.

Signed-off-by: Ley Foon Tan 
---
 drivers/pci/controller/Kconfig   |  2 +-
 drivers/pci/controller/pcie-altera.c | 28 ++--
 2 files changed, 27 insertions(+), 3 deletions(-)

diff --git a/drivers/pci/controller/Kconfig b/drivers/pci/controller/Kconfig
index 6012f3059acd..4b550f9cdd56 100644
--- a/drivers/pci/controller/Kconfig
+++ b/drivers/pci/controller/Kconfig
@@ -174,7 +174,7 @@ config PCIE_IPROC_MSI
  PCIe controller
 
 config PCIE_ALTERA
-   bool "Altera PCIe controller"
+   tristate "Altera PCIe controller"
depends on ARM || NIOS2 || ARM64 || COMPILE_TEST
help
  Say Y here if you want to enable PCIe controller support on Altera
diff --git a/drivers/pci/controller/pcie-altera.c b/drivers/pci/controller/pcie-altera.c
index 27edcebd1726..6c86bc69ace8 100644
--- a/drivers/pci/controller/pcie-altera.c
+++ b/drivers/pci/controller/pcie-altera.c
@@ -10,6 +10,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -705,6 +706,13 @@ static int altera_pcie_init_irq_domain(struct altera_pcie *pcie)
return 0;
 }
 
+static void altera_pcie_irq_teardown(struct altera_pcie *pcie)
+{
+   irq_set_chained_handler_and_data(pcie->irq, NULL, NULL);
+   irq_domain_remove(pcie->irq_domain);
+   irq_dispose_mapping(pcie->irq);
+}
+
 static int altera_pcie_parse_dt(struct altera_pcie *pcie)
 {
struct device *dev = &pcie->pdev->dev;
@@ -798,6 +806,7 @@ static int altera_pcie_probe(struct platform_device *pdev)
 
pcie = pci_host_bridge_priv(bridge);
pcie->pdev = pdev;
+   platform_set_drvdata(pdev, pcie);
 
match = of_match_device(altera_pcie_of_match, &pdev->dev);
if (!match)
@@ -855,13 +864,28 @@ static int altera_pcie_probe(struct platform_device *pdev)
return ret;
 }
 
+static int altera_pcie_remove(struct platform_device *pdev)
+{
+   struct altera_pcie *pcie = platform_get_drvdata(pdev);
+   struct pci_host_bridge *bridge = pci_host_bridge_from_priv(pcie);
+
+   pci_stop_root_bus(bridge->bus);
+   pci_remove_root_bus(bridge->bus);
+   pci_free_resource_list(&pcie->resources);
+   altera_pcie_irq_teardown(pcie);
+
+   return 0;
+}
+
 static struct platform_driver altera_pcie_driver = {
.probe  = altera_pcie_probe,
+   .remove = altera_pcie_remove,
.driver = {
.name   = "altera-pcie",
.of_match_table = altera_pcie_of_match,
-   .suppress_bind_attrs = true,
},
 };
 
-builtin_platform_driver(altera_pcie_driver);
+MODULE_DEVICE_TABLE(of, altera_pcie_of_match);
+module_platform_driver(altera_pcie_driver);
+MODULE_LICENSE("GPL v2");
-- 
2.19.0

[PATCH v2 0/6] mm: reparent slab memory on cgroup removal

2019-04-23 Thread Roman Gushchin

# Why do we need this?

We've noticed that the number of dying cgroups is steadily growing on most
of our hosts in production. The following investigation revealed an issue
in userspace memory reclaim code [1], accounting of kernel stacks [2],
and also the main reason: slab objects.

The underlying problem is quite simple: any page charged
to a cgroup holds a reference to it, so the cgroup can't be reclaimed unless
all charged pages are gone. If a slab object is actively used by other cgroups,
it won't be reclaimed, and will prevent the origin cgroup from being reclaimed.

Slab objects, and first of all the vfs cache, are shared between cgroups that
use the same underlying fs and, more importantly, between multiple
generations of the same workload. So if something runs periodically, each time
in a new cgroup (like how systemd works), we accumulate multiple dying
cgroups.

Strictly speaking pagecache isn't different here, but there is a key difference:
we disable protection and apply some extra pressure on LRUs of dying cgroups,
and these LRUs contain all charged pages.
My experiments show that with the disabled kernel memory accounting the number
of dying cgroups stabilizes at a relatively small number (~100, depends on
memory pressure and cgroup creation rate), and with kernel memory accounting
it grows pretty steadily up to several thousands.

Memory cgroups are quite complex and big objects (mostly due to percpu stats),
so it leads to noticeable memory losses. Memory occupied by dying cgroups
is measured in hundreds of megabytes. I've even seen a host with more than 100Gb
of memory wasted for dying cgroups. It leads to a degradation of performance
with the uptime, and generally limits the usage of cgroups.

My previous attempt [3] to fix the problem by applying extra pressure on slab
shrinker lists caused regressions with xfs and ext4, and has been reverted [4].
The following attempts to find the right balance [5, 6] were not successful.

So instead of trying to find a maybe non-existing balance, let's do reparent
the accounted slabs to the parent cgroup on cgroup removal.


# Implementation approach

There is however a significant problem with reparenting of slab memory:
there is no list of charged pages. Some of them are in shrinker lists,
but not all. Introducing a new list is really not an option.

But fortunately there is a way forward: every slab page has a stable pointer
to the corresponding kmem_cache. So the idea is to reparent kmem_caches
instead of slab pages.

It's actually simpler and cheaper, but requires some underlying changes:
1) Make kmem_caches to hold a single reference to the memory cgroup,
   instead of a separate reference per every slab page.
2) Stop setting page->mem_cgroup pointer for memcg slab pages and use
   page->kmem_cache->memcg indirection instead. It's used only on
   slab page release, so it shouldn't be a big issue.
3) Introduce a refcounter for non-root slab caches. It's required to
   be able to destroy kmem_caches when they become empty and release
   the associated memory cgroup.

There is a bonus: currently we release empty kmem_caches on cgroup
removal, but all others wait for the memory cgroup to be released.
These refactorings allow kmem_caches to be released as soon as they
become inactive and free.

Some additional implementation details are provided in corresponding
commit messages.
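
To illustrate the idea of reparenting caches rather than pages, here is a
minimal userspace sketch in C. It is only a model under simplifying
assumptions: struct memcg_model, struct cache_model and both helpers are
hypothetical stand-ins rather than kernel API, the list is singly linked,
and locking and RCU are left out entirely.

```c
#include <assert.h>
#include <stddef.h>

struct cache_model;

/* Hypothetical stand-in for a memory cgroup. */
struct memcg_model {
    struct memcg_model *parent;
    long refs;                  /* references held by caches */
    struct cache_model *caches; /* list of caches owned by this cgroup */
};

/* Hypothetical stand-in for a per-cgroup kmem_cache. */
struct cache_model {
    struct memcg_model *memcg;  /* current owner, changes on reparenting */
    struct cache_model *next;
};

/* Link a cache to a cgroup: one reference per cache, not per page. */
static void cache_attach(struct cache_model *c, struct memcg_model *m)
{
    c->memcg = m;
    c->next = m->caches;
    m->caches = c;
    m->refs++;
}

/* Move every cache of @child to its parent and transfer the references. */
static void reparent_caches(struct memcg_model *child)
{
    struct memcg_model *parent;
    struct cache_model *c, *last = NULL;
    long moved = 0;

    assert(child != NULL);
    parent = child->parent;
    if (!parent)
        return;

    for (c = child->caches; c; c = c->next) {
        c->memcg = parent;      /* swap the owner pointer */
        last = c;
        moved++;
    }
    if (!last)
        return;

    /* Splice the child's list in front of the parent's list. */
    last->next = parent->caches;
    parent->caches = child->caches;
    child->caches = NULL;

    parent->refs += moved;
    child->refs -= moved;
}
```

The cost of reparent_caches() in this model depends on the number of caches,
not on the number of slab pages, which is the property the approach relies on.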


# Results

Below is the average number of dying cgroups on two groups of our production
hosts. They do run some sort of web frontend workload, the memory pressure
is moderate. As we can see, with the kernel memory reparenting the number
stabilizes in 50s range; however with the original version it grows almost
linearly and doesn't show any signs of plateauing. The difference in slab
and percpu usage between patched and unpatched versions also grows linearly.
In 6 days it reached 200Mb.

day             0    1    2    3    4    5    6
original       39  338  580  827 1098 1349 1574
patched        23   44   45   47   50   46   55
mem diff(Mb)   53   73   99  137  148  182  209
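
As a quick check of the "almost linearly" claim, the average daily growth
can be computed from the first and last samples in the table. The helper
below is a hypothetical illustration, not part of the series; the arrays
simply repeat the numbers above.

```c
#include <assert.h>
#include <stddef.h>

/* Dying cgroup counts for days 0..6, copied from the table above. */
static const int original_dying[] = { 39, 338, 580, 827, 1098, 1349, 1574 };
static const int patched_dying[]  = { 23, 44, 45, 47, 50, 46, 55 };

/* Average per-day change between the first and the last sample. */
static double avg_daily_growth(const int *samples, size_t n)
{
    assert(samples != NULL);
    if (n < 2)
        return 0.0;
    return (double)(samples[n - 1] - samples[0]) / (double)(n - 1);
}
```

With these inputs the original kernel gains roughly 256 dying cgroups per
day ((1574 - 39) / 6), while the patched kernel gains roughly 5 per day
((55 - 23) / 6).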


# History

v2:
  1) switched to percpu kmem_cache refcounter
  2) a reference to kmem_cache is held during the allocation
  3) slabs stats are fixed for !MEMCG case (and the refactoring
 is separated into a standalone patch)
  4) kmem_cache reparenting is performed from deactivation context

v1:
  https://lkml.org/lkml/2019/4/17/1095


# Links

[1]: commit 68600f623d69 ("mm: don't miss the last page because of
round-off error")
[2]: commit 9b6f7e163cd0 ("mm: rework memcg kernel stack accounting")
[3]: commit 172b06c32b94 ("mm: slowly shrink slabs with a relatively
small number of objects")
[4]: commit a9a238e83fbb ("Revert "mm: slowly shrink slabs
with a relatively small number of objects"")
[5]: https://lkml.org/lkml/2019/1/28/1865
[6]: https://marc.info/?l=linux-mm&m=155064763626437&w=2


Roman Gushchin (6):
  mm: postpone kmem_cache memcg pointer

[PATCH v2 1/6] mm: postpone kmem_cache memcg pointer initialization to memcg_link_cache()

2019-04-23 Thread Roman Gushchin

Initialize kmem_cache->memcg_params.memcg pointer in
memcg_link_cache() rather than in init_memcg_params().

Once kmem_cache will hold a reference to the memory cgroup,
it will simplify the refcounting.

For non-root kmem_caches memcg_link_cache() is always called
before the kmem_cache becomes visible to a user, so it's safe.

Signed-off-by: Roman Gushchin 
---
 mm/slab.c|  2 +-
 mm/slab.h|  5 +++--
 mm/slab_common.c | 14 +++---
 mm/slub.c|  2 +-
 4 files changed, 12 insertions(+), 11 deletions(-)

diff --git a/mm/slab.c b/mm/slab.c
index b1eefe751d2a..57a332f524cf 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -1268,7 +1268,7 @@ void __init kmem_cache_init(void)
  nr_node_ids * sizeof(struct kmem_cache_node 
*),
  SLAB_HWCACHE_ALIGN, 0, 0);
list_add(&kmem_cache->list, &slab_caches);
-   memcg_link_cache(kmem_cache);
+   memcg_link_cache(kmem_cache, NULL);
slab_state = PARTIAL;
 
/*
diff --git a/mm/slab.h b/mm/slab.h
index 43ac818b8592..6a562ca72bca 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -289,7 +289,7 @@ static __always_inline void memcg_uncharge_slab(struct page *page, int order,
 }
 
 extern void slab_init_memcg_params(struct kmem_cache *);
-extern void memcg_link_cache(struct kmem_cache *s);
+extern void memcg_link_cache(struct kmem_cache *s, struct mem_cgroup *memcg);
 extern void slab_deactivate_memcg_cache_rcu_sched(struct kmem_cache *s,
void (*deact_fn)(struct kmem_cache *));
 
@@ -344,7 +344,8 @@ static inline void slab_init_memcg_params(struct kmem_cache *s)
 {
 }
 
-static inline void memcg_link_cache(struct kmem_cache *s)
+static inline void memcg_link_cache(struct kmem_cache *s,
+   struct mem_cgroup *memcg)
 {
 }
 
diff --git a/mm/slab_common.c b/mm/slab_common.c
index 58251ba63e4a..6e00bdf8618d 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -140,13 +140,12 @@ void slab_init_memcg_params(struct kmem_cache *s)
 }
 
 static int init_memcg_params(struct kmem_cache *s,
-   struct mem_cgroup *memcg, struct kmem_cache *root_cache)
+struct kmem_cache *root_cache)
 {
struct memcg_cache_array *arr;
 
if (root_cache) {
s->memcg_params.root_cache = root_cache;
-   s->memcg_params.memcg = memcg;
INIT_LIST_HEAD(&s->memcg_params.children_node);
INIT_LIST_HEAD(&s->memcg_params.kmem_caches_node);
return 0;
@@ -221,11 +220,12 @@ int memcg_update_all_caches(int num_memcgs)
return ret;
 }
 
-void memcg_link_cache(struct kmem_cache *s)
+void memcg_link_cache(struct kmem_cache *s, struct mem_cgroup *memcg)
 {
if (is_root_cache(s)) {
list_add(&s->root_caches_node, &slab_root_caches);
} else {
+   s->memcg_params.memcg = memcg;
list_add(&s->memcg_params.children_node,
 &s->memcg_params.root_cache->memcg_params.children);
list_add(&s->memcg_params.kmem_caches_node,
@@ -244,7 +244,7 @@ static void memcg_unlink_cache(struct kmem_cache *s)
 }
 #else
 static inline int init_memcg_params(struct kmem_cache *s,
-   struct mem_cgroup *memcg, struct kmem_cache *root_cache)
+   struct kmem_cache *root_cache)
 {
return 0;
 }
@@ -384,7 +384,7 @@ static struct kmem_cache *create_cache(const char *name,
s->useroffset = useroffset;
s->usersize = usersize;
 
-   err = init_memcg_params(s, memcg, root_cache);
+   err = init_memcg_params(s, root_cache);
if (err)
goto out_free_cache;
 
@@ -394,7 +394,7 @@ static struct kmem_cache *create_cache(const char *name,
 
s->refcount = 1;
list_add(&s->list, &slab_caches);
-   memcg_link_cache(s);
+   memcg_link_cache(s, memcg);
 out:
if (err)
return ERR_PTR(err);
@@ -997,7 +997,7 @@ struct kmem_cache *__init create_kmalloc_cache(const char *name,
 
create_boot_cache(s, name, size, flags, useroffset, usersize);
list_add(&s->list, &slab_caches);
-   memcg_link_cache(s);
+   memcg_link_cache(s, NULL);
s->refcount = 1;
return s;
 }
diff --git a/mm/slub.c b/mm/slub.c
index a34fbe1f6ede..2b9244529d76 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -4224,7 +4224,7 @@ static struct kmem_cache * __init bootstrap(struct kmem_cache *static_cache)
}
slab_init_memcg_params(s);
list_add(&s->list, &slab_caches);
-   memcg_link_cache(s);
+   memcg_link_cache(s, NULL);
return s;
 }
 
-- 
2.20.1

[PATCH v2 3/6] mm: introduce __memcg_kmem_uncharge_memcg()

2019-04-23 Thread Roman Gushchin

Let's separate the page counter modification code out of
__memcg_kmem_uncharge(), similar to how __memcg_kmem_charge() and
__memcg_kmem_charge_memcg() are split.

This will allow this code to be reused later through a new
memcg_kmem_uncharge_memcg() wrapper, which calls
__memcg_kmem_uncharge_memcg() if the memcg_kmem_enabled()
check passes.

Signed-off-by: Roman Gushchin 
---
 include/linux/memcontrol.h | 10 ++
 mm/memcontrol.c| 25 +
 2 files changed, 27 insertions(+), 8 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 36bdfe8e5965..deb209510902 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -1298,6 +1298,8 @@ int __memcg_kmem_charge(struct page *page, gfp_t gfp, int order);
 void __memcg_kmem_uncharge(struct page *page, int order);
 int __memcg_kmem_charge_memcg(struct page *page, gfp_t gfp, int order,
  struct mem_cgroup *memcg);
+void __memcg_kmem_uncharge_memcg(struct mem_cgroup *memcg,
+unsigned int nr_pages);
 
 extern struct static_key_false memcg_kmem_enabled_key;
 extern struct workqueue_struct *memcg_kmem_cache_wq;
@@ -1339,6 +1341,14 @@ static inline int memcg_kmem_charge_memcg(struct page *page, gfp_t gfp,
return __memcg_kmem_charge_memcg(page, gfp, order, memcg);
return 0;
 }
+
+static inline void memcg_kmem_uncharge_memcg(struct page *page, int order,
+struct mem_cgroup *memcg)
+{
+   if (memcg_kmem_enabled())
+   __memcg_kmem_uncharge_memcg(memcg, 1 << order);
+}
+
 /*
  * helper for accessing a memcg's index. It will be used as an index in the
  * child cache array in kmem_cache, and also to derive its name. This function
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 48a8f1c35176..b2c39f187cbb 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2750,6 +2750,22 @@ int __memcg_kmem_charge(struct page *page, gfp_t gfp, int order)
css_put(&memcg->css);
return ret;
 }
+
+/**
+ * __memcg_kmem_uncharge_memcg: uncharge a kmem page
+ * @memcg: memcg to uncharge
+ * @nr_pages: number of pages to uncharge
+ */
+void __memcg_kmem_uncharge_memcg(struct mem_cgroup *memcg,
+unsigned int nr_pages)
+{
+   if (!cgroup_subsys_on_dfl(memory_cgrp_subsys))
+   page_counter_uncharge(&memcg->kmem, nr_pages);
+
+   page_counter_uncharge(&memcg->memory, nr_pages);
+   if (do_memsw_account())
+   page_counter_uncharge(&memcg->memsw, nr_pages);
+}
 /**
  * __memcg_kmem_uncharge: uncharge a kmem page
  * @page: page to uncharge
@@ -2764,14 +2780,7 @@ void __memcg_kmem_uncharge(struct page *page, int order)
return;
 
VM_BUG_ON_PAGE(mem_cgroup_is_root(memcg), page);
-
-   if (!cgroup_subsys_on_dfl(memory_cgrp_subsys))
-   page_counter_uncharge(&memcg->kmem, nr_pages);
-
-   page_counter_uncharge(&memcg->memory, nr_pages);
-   if (do_memsw_account())
-   page_counter_uncharge(&memcg->memsw, nr_pages);
-
+   __memcg_kmem_uncharge_memcg(memcg, nr_pages);
page->mem_cgroup = NULL;
 
/* slab pages do not have PageKmemcg flag set */
-- 
2.20.1
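
The split introduced by this patch, an unconditional helper that adjusts the
counters plus an inline wrapper that checks whether accounting is enabled,
can be modelled in userspace as follows. This is a sketch under stated
assumptions: struct memcg_model, the three flags and the function names are
hypothetical stand-ins for the kernel objects and checks named in the
comments, and the counters are plain integers without hierarchy.

```c
#include <assert.h>
#include <stdbool.h>

struct page_counter_model {
    unsigned long usage;        /* charged pages */
};

struct memcg_model {
    struct page_counter_model memory;
    struct page_counter_model memsw;
    struct page_counter_model kmem;
};

/* Stand-ins for memcg_kmem_enabled(), !cgroup_subsys_on_dfl(...) and
 * do_memsw_account(); the values are chosen only for illustration. */
static bool kmem_accounting_enabled = true;
static bool legacy_hierarchy = true;
static bool memsw_accounting = false;

static void counter_uncharge(struct page_counter_model *c,
                             unsigned long nr_pages)
{
    assert(c->usage >= nr_pages);   /* never uncharge more than charged */
    c->usage -= nr_pages;
}

/* Unconditional part: adjust every counter that tracks kernel memory. */
static void uncharge_memcg_pages(struct memcg_model *m, unsigned long nr_pages)
{
    if (legacy_hierarchy)
        counter_uncharge(&m->kmem, nr_pages);

    counter_uncharge(&m->memory, nr_pages);
    if (memsw_accounting)
        counter_uncharge(&m->memsw, nr_pages);
}

/* Wrapper: converts the page order and skips the work when disabled. */
static void uncharge_memcg(struct memcg_model *m, int order)
{
    if (kmem_accounting_enabled)
        uncharge_memcg_pages(m, 1UL << order);
}
```

Callers that already know the owning cgroup can use the wrapper directly and
do not need to look the cgroup up through the page.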

[PATCH v2 4/6] mm: unify SLAB and SLUB page accounting

2019-04-23 Thread Roman Gushchin

Currently the page accounting code is duplicated in SLAB and SLUB
internals. Let's move it into new (un)charge_slab_page helpers
in the slab_common.c file. These helpers will be responsible
for statistics (global and memcg-aware) and memcg charging.
So they are replacing direct memcg_(un)charge_slab() calls.

Signed-off-by: Roman Gushchin 
---
 mm/slab.c | 19 +++
 mm/slab.h | 22 ++
 mm/slub.c | 14 ++
 3 files changed, 27 insertions(+), 28 deletions(-)

diff --git a/mm/slab.c b/mm/slab.c
index 14466a73d057..53e6b2687102 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -1389,7 +1389,6 @@ static struct page *kmem_getpages(struct kmem_cache *cachep, gfp_t flags,
int nodeid)
 {
struct page *page;
-   int nr_pages;
 
flags |= cachep->allocflags;
 
@@ -1399,17 +1398,11 @@ static struct page *kmem_getpages(struct kmem_cache *cachep, gfp_t flags,
return NULL;
}
 
-   if (memcg_charge_slab(page, flags, cachep->gfporder, cachep)) {
+   if (charge_slab_page(page, flags, cachep->gfporder, cachep)) {
__free_pages(page, cachep->gfporder);
return NULL;
}
 
-   nr_pages = (1 << cachep->gfporder);
-   if (cachep->flags & SLAB_RECLAIM_ACCOUNT)
-   mod_lruvec_page_state(page, NR_SLAB_RECLAIMABLE, nr_pages);
-   else
-   mod_lruvec_page_state(page, NR_SLAB_UNRECLAIMABLE, nr_pages);
-
__SetPageSlab(page);
/* Record if ALLOC_NO_WATERMARKS was set when allocating the slab */
if (sk_memalloc_socks() && page_is_pfmemalloc(page))
@@ -1424,12 +1417,6 @@ static struct page *kmem_getpages(struct kmem_cache *cachep, gfp_t flags,
 static void kmem_freepages(struct kmem_cache *cachep, struct page *page)
 {
int order = cachep->gfporder;
-   unsigned long nr_freed = (1 << order);
-
-   if (cachep->flags & SLAB_RECLAIM_ACCOUNT)
-   mod_lruvec_page_state(page, NR_SLAB_RECLAIMABLE, -nr_freed);
-   else
-   mod_lruvec_page_state(page, NR_SLAB_UNRECLAIMABLE, -nr_freed);
 
BUG_ON(!PageSlab(page));
__ClearPageSlabPfmemalloc(page);
@@ -1438,8 +1425,8 @@ static void kmem_freepages(struct kmem_cache *cachep, struct page *page)
page->mapping = NULL;
 
if (current->reclaim_state)
-   current->reclaim_state->reclaimed_slab += nr_freed;
-   memcg_uncharge_slab(page, order, cachep);
+   current->reclaim_state->reclaimed_slab += 1 << order;
+   uncharge_slab_page(page, order, cachep);
__free_pages(page, order);
 }
 
diff --git a/mm/slab.h b/mm/slab.h
index 4a261c97c138..0f5c5444acf1 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -205,6 +205,12 @@ ssize_t slabinfo_write(struct file *file, const char __user *buffer,
 void __kmem_cache_free_bulk(struct kmem_cache *, size_t, void **);
 int __kmem_cache_alloc_bulk(struct kmem_cache *, gfp_t, size_t, void **);
 
+static inline int cache_vmstat_idx(struct kmem_cache *s)
+{
+   return (s->flags & SLAB_RECLAIM_ACCOUNT) ?
+   NR_SLAB_RECLAIMABLE : NR_SLAB_UNRECLAIMABLE;
+}
+
 #ifdef CONFIG_MEMCG_KMEM
 
 /* List of all root caches. */
@@ -352,6 +358,22 @@ static inline void memcg_link_cache(struct kmem_cache *s,
 
 #endif /* CONFIG_MEMCG_KMEM */
 
+static __always_inline int charge_slab_page(struct page *page,
+   gfp_t gfp, int order,
+   struct kmem_cache *s)
+{
+   memcg_charge_slab(page, gfp, order, s);
+   mod_lruvec_page_state(page, cache_vmstat_idx(s), 1 << order);
+   return 0;
+}
+
+static __always_inline void uncharge_slab_page(struct page *page, int order,
+  struct kmem_cache *s)
+{
+   mod_lruvec_page_state(page, cache_vmstat_idx(s), -(1 << order));
+   memcg_uncharge_slab(page, order, s);
+}
+
 static inline struct kmem_cache *cache_from_obj(struct kmem_cache *s, void *x)
 {
struct kmem_cache *cachep;
diff --git a/mm/slub.c b/mm/slub.c
index 195f61785c7d..90563c0b3b5f 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1499,7 +1499,7 @@ static inline struct page *alloc_slab_page(struct kmem_cache *s,
else
page = __alloc_pages_node(node, flags, order);
 
-   if (page && memcg_charge_slab(page, flags, order, s)) {
+   if (page && charge_slab_page(page, flags, order, s)) {
__free_pages(page, order);
page = NULL;
}
@@ -1692,11 +1692,6 @@ static struct page *allocate_slab(struct kmem_cache *s, gfp_t flags, int node)
if (!page)
return NULL;
 
-   mod_lruvec_page_state(page,
-   (s->flags & SLAB_RECLAIM_ACCOUNT) ?
-   NR_SLAB_RECLAIMABLE : NR_SLAB_UNRECLAIMABLE,
-   1 << oo_order(oo));
-
inc_slabs_node(s, page_to_nid(page), page->objects);
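
For readers following the refactoring, here is a minimal userspace sketch of the pattern the commit message describes: one charge/uncharge pair that takes care of both the charge itself and the per-class statistics. All names (`sketch_cache`, `charge_pages`, the `budget` field) are hypothetical stand-ins, not kernel APIs. The sketch updates the counters only when the charge succeeds; that is an assumption about the desired behaviour, whereas the charge_slab_page() helper as posted above ignores the result of memcg_charge_slab() and always returns 0.

```c
#include <stdbool.h>

/* Two statistics classes, standing in for NR_SLAB_(UN)RECLAIMABLE. */
enum sketch_stat { STAT_RECLAIMABLE, STAT_UNRECLAIMABLE, STAT_COUNT };

/* Hypothetical cache descriptor: not the kernel's struct kmem_cache. */
struct sketch_cache {
	bool reclaim_account;	/* stands in for SLAB_RECLAIM_ACCOUNT */
	long budget;		/* pages this cache may still charge */
};

static long sketch_stats[STAT_COUNT];

/* Pick the statistics slot once, as cache_vmstat_idx() does in the patch. */
static enum sketch_stat cache_stat_idx(const struct sketch_cache *s)
{
	return s->reclaim_account ? STAT_RECLAIMABLE : STAT_UNRECLAIMABLE;
}

/* Charge 2^order pages; returns 0 on success, -1 if over budget. */
static int charge_pages(struct sketch_cache *s, int order)
{
	long nr = 1L << order;

	if (s->budget < nr)
		return -1;	/* charge failed: leave the statistics alone */
	s->budget -= nr;
	sketch_stats[cache_stat_idx(s)] += nr;
	return 0;
}

/* Undo a successful charge of 2^order pages. */
static void uncharge_pages(struct sketch_cache *s, int order)
{
	long nr = 1L << order;

	sketch_stats[cache_stat_idx(s)] -= nr;
	s->budget += nr;
}
```

Keeping the statistics update inside the helper means each allocator only needs the single call on its allocation and free paths, which is the de-duplication the patch is after.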

Re: [PATCH V2 1/3] Calculate Thermal Pressure

2019-04-23 Thread Thara Gopinath

On 04/18/2019 06:14 AM, Quentin Perret wrote:
> On Tuesday 16 Apr 2019 at 15:38:39 (-0400), Thara Gopinath wrote:
>> +/**
>> + * Function to update thermal pressure from cooling device
>> + * or any framework responsible for capping cpu maximum
>> + * capacity.
>> + */
>> +void sched_update_thermal_pressure(struct cpumask *cpus,
>> +   unsigned long cap_max_freq,
>> +   unsigned long max_freq)
>> +{
>> +int cpu;
>> +unsigned long flags = 0;
>> +struct thermal_pressure *cpu_thermal;
>> +
>> +for_each_cpu(cpu, cpus) {
> 
> Is it actually required to do this for each CPU ? You could calculate
> the whole thing once for the first CPU, and apply the result to all CPUs
> in the policy no ? All CPUs in a policy are capped and uncapped
> synchronously.
Hmm. You are right that all cpus in a policy are capped and uncapped
synchronously from the thermal framework point of view. But the thermal
pressure decay can happen at different times for each cpu and hence the
update has to be done on a per-cpu basis (especially to keep track of
age and other variables in the averaging and accumulating
algorithm). It can be separated out but I think it will just make the
solution more complicated.
> 
>> +cpu_thermal = per_cpu(thermal_pressure_cpu, cpu);
>> +if (!cpu_thermal)
>> +return;
>> +spin_lock_irqsave(&cpu_thermal->lock, flags);
>> +thermal_pressure_update(cpu_thermal, cap_max_freq, max_freq, 1);
>> +}
>> +}


-- 
Regards
Thara

Re: [PATCH] fs/ufs: Force type conversion from __fs16 to u16

2019-04-23 Thread Al Viro

On Tue, Apr 23, 2019 at 02:31:18PM +0530, Bharath Vedartham wrote:
> This patch fixes the sparse warning:
> warning: restricted __fs16 degrades to integer
> 
> inode->ui_u1.oldids.ui_suid is of type __fs16, a restricted integer.
> 0XFFFF is a 16 bit unsigned integer. Use __force to fix the sparse
> warning.

NAK.  As always, ask whether the code is correct and if so, why is
it correct.  Here, of course, the answer is that 16bit value with
all bits set is the same regardless of endianness.  If it was
le16, we could simply write cpu_to_le16(0xFFFF) and let the compiler
fold the constant expression; for fs16 it's beyond what gcc can
(reliably) do - you get essentially
	UFS_SB(sbp)->s_bytesex == BYTESEX_LE
		? (unsigned short)0xFFFF
		: (unsigned short)0xFFFF
which would be nice to optimize to 0xFFFF, but it's not guaranteed
to happen.
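
The endianness argument can be checked in isolation. The following is a small userspace sketch, not kernel code: `swab16_sketch` and `raw16_is_all_ones` are hypothetical names, and plain `uint16_t` stands in for the sparse-annotated `__fs16`. It shows that the all-ones pattern survives a byte swap, so testing the raw field needs no conversion.

```c
#include <stdbool.h>
#include <stdint.h>

/* Swap the two bytes of a 16-bit value, as a le16 <-> be16 conversion would. */
static uint16_t swab16_sketch(uint16_t v)
{
	return (uint16_t)((v << 8) | (v >> 8));
}

/*
 * True if the raw on-disk 16-bit field has all bits set.  No byte-order
 * conversion is needed: 0xFFFF reads the same in either byte order.
 */
static bool raw16_is_all_ones(uint16_t raw)
{
	return (uint16_t)~raw == 0;
}
```

Any other value does change under the swap (0x1234 becomes 0x3412), which is why the general case still needs fs16_to_cpu().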

So let's do this:

static inline bool ufs_reserved_uid(__fs16 uid)
{
return (__fs16)~uid == 0;   /* all 16 bits set */
}

and use it in... wait.  Wait, it *is* broken.  Look:
static inline u32
ufs_get_inode_uid(struct super_block *sb, struct ufs_inode *inode)
{
	switch (UFS_SB(sb)->s_flags & UFS_UID_MASK) {
	case UFS_UID_44BSD:
		return fs32_to_cpu(sb, inode->ui_u3.ui_44.ui_uid);
	case UFS_UID_EFT:
		if (inode->ui_u1.oldids.ui_suid == 0xFFFF)
			return fs32_to_cpu(sb, inode->ui_u3.ui_sun.ui_uid);
		/* Fall through */
	default:
		return fs16_to_cpu(sb, inode->ui_u1.oldids.ui_suid);
	}
}

OK, so
* old flavours: 16bit uids, stored in ->ui_u1.oldids.ui_suid
(offset 4)
* new flavours (44bsd, ufs2, openstep): 32bit uids, stored in
->ui_u3.ui_44.ui_uid (offset 0x70)
* Solaris flavours (sun, sunx86): 32bit uids; if smaller than
0xFFFF, stored where old flavours used to, if greater or equal -
stored in ->ui_u3.ui_sun.ui_uid (offset 0x74), with 0xFFFF stored
in the old place.

Makes sense, and ufs_set_inode_uid() matches that (with extra
piece of information - for new flavours the lower 16 bits of uid
are duplicated into the old place, for solaris - min(uid, 0xFFFF)
goes there).  However, the other place with similar logics is
static inline u32
ufs_get_inode_gid(struct super_block *sb, struct ufs_inode *inode)
{
	switch (UFS_SB(sb)->s_flags & UFS_UID_MASK) {
	case UFS_UID_44BSD:
		return fs32_to_cpu(sb, inode->ui_u3.ui_44.ui_gid);
	case UFS_UID_EFT:
		if (inode->ui_u1.oldids.ui_suid == 0xFFFF)
			return fs32_to_cpu(sb, inode->ui_u3.ui_sun.ui_gid);
		/* Fall through */
	default:
		return fs16_to_cpu(sb, inode->ui_u1.oldids.ui_sgid);
	}
}

See the problem?  For Solaris flavours we check the old *UID* location
to decide whether we want the new GID one or the old GID one.  That
makes no sense - after all, setting UID=1000, GID=80000 should be possible,
and that couldn't work; we would get 1000 stored in old UID location,
80000 - in new GID one, so this ufs_get_inode_gid() would end up
returning the old GID field, which is 16bit and could not store
80000, no matter what.

So we have a braino in ufs_get_inode_gid(), AFAICS since
252e211e90ce5 ("Add in SunOS 4.1.x compatible mode for UFS").
It should go
	if (inode->ui_u1.oldids.ui_sgid == 0xFFFF)
		return fs32_to_cpu(sb, inode->ui_u3.ui_sun.ui_gid);
instead of checking ui_suid...

The breakage there isn't what sparse complained about, but it
still needs fixing.  IMO it should go in two steps: first

diff --git a/fs/ufs/util.h b/fs/ufs/util.h
index 1fd3011ea623..7fd480b8 100644
--- a/fs/ufs/util.h
+++ b/fs/ufs/util.h
@@ -229,7 +229,7 @@ ufs_get_inode_gid(struct super_block *sb, struct ufs_inode *inode)
case UFS_UID_44BSD:
return fs32_to_cpu(sb, inode->ui_u3.ui_44.ui_gid);
case UFS_UID_EFT:
-   if (inode->ui_u1.oldids.ui_suid == 0xFFFF)
+   if (inode->ui_u1.oldids.ui_sgid == 0xFFFF)
return fs32_to_cpu(sb, inode->ui_u3.ui_sun.ui_gid);
/* Fall through */
default:
then introduction of

static inline bool solaris_xid_overflow(__fs16 xid)
{
/*
 * Solaris indicates the use of 32bit [UG]ID by storing
 * all-ones bit pattern in the corresponding old (16bit) field
 */
return (__fs16)~xid == 0;   /* all 16 bits set */
}

with these two lines turned into
if (solaris_xid_overflow(inode->ui_u1.oldids.ui_suid))
and
if (solaris_xid_overflow(inode->ui_u1.oldids.ui_sgid))
resp.
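
The GID bug described above can be reproduced with a small userspace model. This is only a sketch under stated assumptions: `struct sketch_inode` and its field names are hypothetical simplifications of the Solaris-flavour inode layout, byte order is ignored, and the IDs used to exercise it (1000 and 80000) are illustrative values, the second chosen simply because it does not fit in 16 bits.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the id fields of a Solaris-flavour inode. */
struct sketch_inode {
	uint16_t old_uid;	/* legacy 16-bit UID slot */
	uint16_t old_gid;	/* legacy 16-bit GID slot */
	uint32_t sun_uid;	/* 32-bit UID slot */
	uint32_t sun_gid;	/* 32-bit GID slot */
};

/* An all-ones 16-bit field means "look in the 32-bit field instead". */
static bool xid_overflow(uint16_t xid)
{
	return (uint16_t)~xid == 0;
}

/* Store both ids: full value in the 32-bit slot, min(id, 0xFFFF) in the old one. */
static void set_ids(struct sketch_inode *in, uint32_t uid, uint32_t gid)
{
	in->sun_uid = uid;
	in->sun_gid = gid;
	in->old_uid = uid < 0xFFFF ? (uint16_t)uid : 0xFFFF;
	in->old_gid = gid < 0xFFFF ? (uint16_t)gid : 0xFFFF;
}

/* Buggy variant: decides based on the old *UID* slot. */
static uint32_t get_gid_buggy(const struct sketch_inode *in)
{
	return xid_overflow(in->old_uid) ? in->sun_gid : in->old_gid;
}

/* Fixed variant: decides based on the old *GID* slot. */
static uint32_t get_gid_fixed(const struct sketch_inode *in)
{
	return xid_overflow(in->old_gid) ? in->sun_gid : in->old_gid;
}
```

With a small UID and a large GID, the buggy getter returns the saturated 16-bit marker instead of the real GID, while the fixed getter follows the marker to the 32-bit slot.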

Re: [PATCH v3 0/3] Refactor memory initialization hardening

2019-04-23 Thread Masahiro Yamada

On Wed, Apr 24, 2019 at 4:49 AM Kees Cook  wrote:
>
> This refactors the stack memory initialization configs in order to
> keep things together when adding Clang stack initialization, and in
> preparation for future heap memory initialization configs.
>
> I intend to carry this in the gcc-plugins tree, but I'd really like
> to get Acks from Masahiro (Kconfig changes, Makefile change), and
> from James (adding the new Kconfig.hardening to security/Kconfig).

If needed,
Acked-by: Masahiro Yamada 


> Thanks!
>
> -Kees
>
> v3:
> - clean up menu/if with a merged "depends on" (masahiro)
> - add CONFIG_COMPILE_TEST defaults (masahiro)
>
> v2:
> - add plugin menu (masahiro)
> - adjust patch subject prefixes (masahiro)
> - drop redundant "depends" (masahiro)
> - fixed early use of CC_HAS_AUTO_VAR_INIT (masahiro)
> - dropped default-enabled for STACK_INIT_ALL (masahiro)
>
>
> Kees Cook (3):
>   security: Create "kernel hardening" config area
>   security: Move stackleak config to Kconfig.hardening
>   security: Implement Clang's stack initialization
>
>  Makefile                    |   5 ++
>  scripts/gcc-plugins/Kconfig | 126 ++-
>  security/Kconfig            |   2 +
>  security/Kconfig.hardening  | 164 
>  4 files changed, 177 insertions(+), 120 deletions(-)
>  create mode 100644 security/Kconfig.hardening
>
> --
> 2.17.1
>


-- 
Best Regards
Masahiro Yamada

Re: Does vdso_install attempt to re-compile objects under root privilege?

2019-04-23 Thread Linus Torvalds

On Tue, Apr 23, 2019 at 4:57 PM Andy Lutomirski  wrote:
>
> To clarify, this is “fail if you can’t find the files to install, but don’t 
> even try to check whether those files are up to date”, right?

Ack. Exactly because the whole "check whether the files are
up-to-date" is generally part of the very complex dance of doing all
the version stuff etc.

   Linus

Re: [PATCH v2 1/3] security: Create "kernel hardening" config area

2019-04-23 Thread Masahiro Yamada

On Wed, Apr 24, 2019 at 4:36 AM Kees Cook  wrote:
>
> On Thu, Apr 11, 2019 at 6:39 PM Masahiro Yamada
>  wrote:
> >
> > On Fri, Apr 12, 2019 at 3:01 AM Kees Cook  wrote:
> > >
> > > Right now kernel hardening options are scattered around various Kconfig
> > > files. This can be a central place to collect these kinds of options
> > > going forward. This is initially populated with the memory initialization
> > > options from the gcc-plugins.
> > >
> > > Signed-off-by: Kees Cook 
> > > ---
> > > >  scripts/gcc-plugins/Kconfig | 74 +++--
> > > >  security/Kconfig            |  2 +
> > > >  security/Kconfig.hardening  | 93 +
> > >  3 files changed, 102 insertions(+), 67 deletions(-)
> > >  create mode 100644 security/Kconfig.hardening
> > >
> > > diff --git a/scripts/gcc-plugins/Kconfig b/scripts/gcc-plugins/Kconfig
> > > index 74271dba4f94..84d471dea2b7 100644
> > > --- a/scripts/gcc-plugins/Kconfig
> > > +++ b/scripts/gcc-plugins/Kconfig
> > > @@ -13,10 +13,11 @@ config HAVE_GCC_PLUGINS
> > >   An arch should select this symbol if it supports building with
> > >   GCC plugins.
> > >
> > > -menuconfig GCC_PLUGINS
> > > -   bool "GCC plugins"
> > > +config GCC_PLUGINS
> > > +   bool
> > > depends on HAVE_GCC_PLUGINS
> > > depends on PLUGIN_HOSTCC != ""
> > > +   default y
> > > help
> > > >   GCC plugins are loadable modules that provide extra features to the
> > > >   compiler. They are useful for runtime instrumentation and static analysis.
> > > @@ -25,6 +26,8 @@ menuconfig GCC_PLUGINS
> > >
> > >  if GCC_PLUGINS
> > >
> > > +menu "GCC plugins"
> > > +
> >
> >
> >
> > Just a tip to save "if" ... "endif" block.
> >
> >
> > If you like, you can write it as follows:
> >
> >
> > menu "GCC plugins"
> >   depends on GCC_PLUGINS
> >
> >   
> >
> > endmenu
>
> Ah yes, thanks! Adjusted.
>
> > > +menu "Memory initialization"
> > > +
> > > +choice
> > > +   prompt "Initialize kernel stack variables at function entry"
> > > +   depends on GCC_PLUGINS
> >
> > On second thought,
> > this 'depends on' is unnecessary
> > because INIT_STACK_NONE should be always visible.
>
> Oh yes, excellent point. Adjusted.
>
> > Another behavior change is
> > GCC_PLUGIN_STRUCTLEAK was previously enabled by all{yes,mod}config,
> > and in the compile-test coverage.
>
> I could set the defaults based on CONFIG_COMPILE_TEST, though? I.e.:
>
> prompt "Initialize kernel stack variables at function entry"
> default GCC_PLUGIN_STRUCTLEAK_BYREF_ALL if COMPILE_TEST && GCC_PLUGINS
> default INIT_STACK_ALL if COMPILE_TEST && CC_HAS_AUTO_VAR_INIT
> default INIT_STACK_NONE

Looks like a good idea to me.

Thanks.



-- 
Best Regards
Masahiro Yamada

Re: [PATCH next] sysctl: add proc_dointvec_jiffies_minmax to limit the min/max write value

2019-04-23 Thread Zhiqiang Liu



Friendly ping...

> From: Zhiqiang Liu 
> 
> In proc_dointvec_jiffies(), the written value is only checked for
> being larger than INT_MAX. A value less than zero is also accepted
> and written to the data.
> 
> However, in some scenarios, users would adopt the data to
> set timers or check whether time is expired. Generally, the data
> will be cast to an unsigned type variable, then the negative data
> becomes a very large unsigned value, which leads to long waits
> or other unpredictable problems.
> 
> Here, we add a new func, proc_dointvec_jiffies_minmax, to limit the
> min/max write value, which is similar to the proc_dointvec_minmax func.
> 
> Signed-off-by: Zhiqiang Liu 
> Reported-by: Qiang Ning 
> Reviewed-by: Jie Liu 
> ---
>  include/linux/sysctl.h |  2 ++
>  kernel/sysctl.c        | 44 +++-
>  2 files changed, 45 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/sysctl.h b/include/linux/sysctl.h
> index b769ecf..8bde8a0 100644
> --- a/include/linux/sysctl.h
> +++ b/include/linux/sysctl.h
> @@ -53,6 +53,8 @@ extern int proc_douintvec_minmax(struct ctl_table *table, int write,
>loff_t *ppos);
>  extern int proc_dointvec_jiffies(struct ctl_table *, int,
>void __user *, size_t *, loff_t *);
> +extern int proc_dointvec_jiffies_minmax(struct ctl_table *, int,
> +  void __user *, size_t *, loff_t *);
>  extern int proc_dointvec_userhz_jiffies(struct ctl_table *, int,
>   void __user *, size_t *, loff_t *);
>  extern int proc_dointvec_ms_jiffies(struct ctl_table *, int,
> diff --git a/kernel/sysctl.c b/kernel/sysctl.c
> index c9ec050..8e1eb59 100644
> --- a/kernel/sysctl.c
> +++ b/kernel/sysctl.c
> @@ -2967,10 +2967,15 @@ static int do_proc_dointvec_jiffies_conv(bool *negp, unsigned long *lvalp,
>int *valp,
>int write, void *data)
>  {
> + struct do_proc_dointvec_minmax_conv_param *param = data;
> +
>   if (write) {
>   if (*lvalp > INT_MAX / HZ)
>   return 1;
>   *valp = *negp ? -(*lvalp*HZ) : (*lvalp*HZ);
> + if ((param->min && (*param->min)*HZ > *valp) ||
> + (param->max && (*param->max)*HZ < *valp))
> + return -EINVAL;
>   } else {
>   int val = *valp;
>   unsigned long lval;
> @@ -3053,7 +3058,37 @@ int proc_dointvec_jiffies(struct ctl_table *table, int write,
> void __user *buffer, size_t *lenp, loff_t *ppos)
>  {
>  return do_proc_dointvec(table,write,buffer,lenp,ppos,
> - do_proc_dointvec_jiffies_conv,NULL);
> + do_proc_dointvec_jiffies_conv, NULL);
> +}
> +
> +/**
> + * proc_dointvec_jiffies_minmax - read a vector of integers as seconds with min/max values
> + * @table: the sysctl table
> + * @write: %TRUE if this is a write to the sysctl file
> + * @buffer: the user buffer
> + * @lenp: the size of the user buffer
> + * @ppos: file position
> + *
> + * Reads/writes up to table->maxlen/sizeof(unsigned int) integer
> + * values from/to the user buffer, treated as an ASCII string.
> + * The values read are assumed to be in seconds, and are converted into
> + * jiffies.
> + *
> + * This routine will ensure the values are within the range specified by
> + * table->extra1 (min) and table->extra2 (max).
> + *
> + * Returns 0 on success or -EINVAL on write when the range check fails.
> + */
> +int proc_dointvec_jiffies_minmax(struct ctl_table *table, int write,
> +   void __user *buffer, size_t *lenp, loff_t *ppos)
> +{
> + struct do_proc_dointvec_minmax_conv_param param = {
> + .min = (int *) table->extra1,
> + .max = (int *) table->extra2,
> + };
> +
> + return do_proc_dointvec(table, write, buffer, lenp, ppos,
> + do_proc_dointvec_jiffies_conv, &param);
>  }
> 
>  /**
> @@ -3301,6 +3336,12 @@ int proc_dointvec_jiffies(struct ctl_table *table, int write,
>   return -ENOSYS;
>  }
> 
> +int proc_dointvec_jiffies_minmax(struct ctl_table *table, int write,
> + void __user *buffer, size_t *lenp, loff_t *ppos)
> +{
> + return -ENOSYS;
> +}
> +
>  int proc_dointvec_userhz_jiffies(struct ctl_table *table, int write,
>   void __user *buffer, size_t *lenp, loff_t *ppos)
>  {
> @@ -3359,6 +3400,7 @@ static int proc_dointvec_minmax_bpf_stats(struct ctl_table *table, int write,
>  EXPORT_SYMBOL(proc_dointvec);
>  EXPORT_SYMBOL(proc_douintvec);
>  EXPORT_SYMBOL(proc_dointvec_jiffies);
> +EXPORT_SYMBOL(proc_dointvec_jiffies_minmax);
>  EXPORT_SYMBOL(proc_dointvec_minmax);
>  EXPORT_SYMBOL_GPL(proc_douintvec_minmax);
>  EXPORT_SYMBOL(proc_dointvec_userhz_jiffies);
>
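
The failure mode in the quoted commit message, and the kind of range check it proposes, can be sketched in userspace C. This is an illustration only: `SKETCH_HZ`, `as_timeout` and `secs_to_jiffies_checked` are hypothetical names, the tick rate of 100 is an assumed value, and the function is a simplified model rather than the kernel's conversion callback. As in the posted patch, the bounds are given in seconds and scaled by the tick rate before comparing.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_HZ 100	/* assumed tick rate, for illustration only */

/* What a consumer sees when it treats a signed jiffies value as unsigned. */
static unsigned long as_timeout(int jiffies_val)
{
	return (unsigned long)jiffies_val;
}

/*
 * Convert seconds to jiffies and range-check the result.
 * min_s/max_s are bounds in seconds; either may be NULL for "unbounded".
 * Returns true and stores the result on success, false if rejected.
 */
static bool secs_to_jiffies_checked(long long secs, const int *min_s,
				    const int *max_s, int *out)
{
	long long val;

	/* Reject anything whose jiffies value would not fit in an int. */
	if (secs > INT_MAX / SKETCH_HZ || secs < -(INT_MAX / SKETCH_HZ))
		return false;

	val = secs * SKETCH_HZ;
	if ((min_s && (long long)*min_s * SKETCH_HZ > val) ||
	    (max_s && (long long)*max_s * SKETCH_HZ < val))
		return false;

	*out = (int)val;
	return true;
}
```

Without bounds, a write of -1 second is stored as a negative jiffies count, and a consumer that casts it to an unsigned type ends up with a very large timeout. With a minimum of 0 the same write is rejected.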

Re: [PATCH] cpufreq: qoriq: Add ls1028a chip support

2019-04-23 Thread Viresh Kumar

On 24-04-19, 10:32, andy.t...@nxp.com wrote:
> From: Yuantian Tang 
> 
> Enable cpufreq feature on ls1028a chip by adding its compatible
> string.
> 
> Signed-off-by: Yuantian Tang 
> ---
>  drivers/cpufreq/qoriq-cpufreq.c |1 +
>  1 files changed, 1 insertions(+), 0 deletions(-)
> 
> diff --git a/drivers/cpufreq/qoriq-cpufreq.c b/drivers/cpufreq/qoriq-cpufreq.c
> index 4295e54..d308c4d 100644
> --- a/drivers/cpufreq/qoriq-cpufreq.c
> +++ b/drivers/cpufreq/qoriq-cpufreq.c
> @@ -280,6 +280,7 @@ static int qoriq_cpufreq_target(struct cpufreq_policy *policy,
>  
>   { .compatible = "fsl,ls1012a-clockgen", },
>   { .compatible = "fsl,ls1021a-clockgen", },
> + { .compatible = "fsl,ls1028a-clockgen", },
>   { .compatible = "fsl,ls1043a-clockgen", },
>   { .compatible = "fsl,ls1046a-clockgen", },
>   { .compatible = "fsl,ls1088a-clockgen", },

Acked-by: Viresh Kumar 

-- 
viresh

Re: [PATCH V4 05/16] PCI: dwc: Move config space capability search API

2019-04-23 Thread Oliver

On Wed, Apr 24, 2019 at 1:12 PM Vidya Sagar  wrote:
>
> On 4/24/2019 2:02 AM, Bjorn Helgaas wrote:
> > On Tue, Apr 23, 2019 at 01:57:19PM +0530, Vidya Sagar wrote:
> >> Move PCIe config space capability search API to common DesignWare file
> >> as this can be used by both host and ep mode codes.
> >>
> >> Signed-off-by: Vidya Sagar 
> >> Acked-by: Gustavo Pimentel 
> >> ---
> >> Changes from [v3]:
> >> * Rebased to linux-next top of the tree
> >>
> >> Changes from [v2]:
> >> * None
> >>
> >> Changes from [v1]:
> >> * Removed dw_pcie_find_next_ext_capability() API from here and made a
> >>separate patch for that
> >>
> >>   drivers/pci/controller/dwc/pcie-designware.c | 33 
> >>   drivers/pci/controller/dwc/pcie-designware.h |  2 ++
> >
> > You claim this is a "move", but I only see adds.  Where did it move
> > *from*?
> These are supposed to be moved from pcie-designware-ep.c file. That was the case
> with my old patches but when I rebased them onto ToT, I missed the change that
> removes them from pcie-designware-ep.c file. Thanks for catching this. I'll
> address it in the next patch.
>
> >
> > While you're at it, can you add a comment in the code about why we
> > can't use the regular pci_find_capability() interface?  It's really a
> > shame to have to reimplement that.
> Regular pci_find_capability() uses 'struct pci_dev *dev' pointer and can be used
> only after device enumeration is done. Whereas, these APIs are used particularly
> before link up and use 'struct dw_pcie *pci' pointer.

pci_bus_find_capability() can be used without enumerating the devices
if you have a pci_bus. It's probably not worth using here though since
you need this code anyway for endpoint mode.

>
> >
> >>   2 files changed, 35 insertions(+)
> >>
> >> diff --git a/drivers/pci/controller/dwc/pcie-designware.c b/drivers/pci/controller/dwc/pcie-designware.c
> >> index 8e0081ccf83b..6a98135244d6 100644
> >> --- a/drivers/pci/controller/dwc/pcie-designware.c
> >> +++ b/drivers/pci/controller/dwc/pcie-designware.c
> >> @@ -20,6 +20,39 @@
> >>   #define PCIE_PHY_DEBUG_R1_LINK_UP  (0x1 << 4)
> >>   #define PCIE_PHY_DEBUG_R1_LINK_IN_TRAINING (0x1 << 29)
> >>
> >> +static u8 __dw_pcie_find_next_cap(struct dw_pcie *pci, u8 cap_ptr,
> >> +  u8 cap)
> >> +{
> >> +u8 cap_id, next_cap_ptr;
> >> +u16 reg;
> >> +
> >> +reg = dw_pcie_readw_dbi(pci, cap_ptr);
> >> +next_cap_ptr = (reg & 0xff00) >> 8;
> >> +cap_id = (reg & 0x00ff);
> >> +
> >> +if (!next_cap_ptr || cap_id > PCI_CAP_ID_MAX)
> >> +return 0;
> >> +
> >> +if (cap_id == cap)
> >> +return cap_ptr;
> >> +
> >> +return __dw_pcie_find_next_cap(pci, next_cap_ptr, cap);
> >> +}
> >> +
> >> +u8 dw_pcie_find_capability(struct dw_pcie *pci, u8 cap)
> >> +{
> >> +u8 next_cap_ptr;
> >> +u16 reg;
> >> +
> >> +reg = dw_pcie_readw_dbi(pci, PCI_CAPABILITY_LIST);
> >> +next_cap_ptr = (reg & 0x00ff);
> >> +
> >> +if (!next_cap_ptr)
> >> +return 0;
> >> +
> >> +return __dw_pcie_find_next_cap(pci, next_cap_ptr, cap);
> >> +}
> >> +
> >>   int dw_pcie_read(void __iomem *addr, int size, u32 *val)
> >>   {
> >>  if (!IS_ALIGNED((uintptr_t)addr, size)) {
> >> diff --git a/drivers/pci/controller/dwc/pcie-designware.h b/drivers/pci/controller/dwc/pcie-designware.h
> >> index 9ee98ced1ef6..35160b4ce929 100644
> >> --- a/drivers/pci/controller/dwc/pcie-designware.h
> >> +++ b/drivers/pci/controller/dwc/pcie-designware.h
> >> @@ -248,6 +248,8 @@ struct dw_pcie {
> >>   #define to_dw_pcie_from_ep(endpoint)   \
> >>  container_of((endpoint), struct dw_pcie, ep)
> >>
> >> +u8 dw_pcie_find_capability(struct dw_pcie *pci, u8 cap);
> >> +
> >>   int dw_pcie_read(void __iomem *addr, int size, u32 *val);
> >>   int dw_pcie_write(void __iomem *addr, int size, u32 val);
> >>
> >> --
> >> 2.17.1
> >>
>
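
For reference, the capability-list walk discussed in this thread can be sketched against an in-memory copy of config space. This is a userspace illustration with hypothetical names (`cfg_read8`, `find_capability`), not the DesignWare DBI accessors. It is written as a bounded loop rather than recursion. Unlike the quoted helper, the sketch tests the capability ID before looking at the next pointer, so the final entry of the list (whose next pointer is 0) can still match.

```c
#include <stdint.h>

#define CAP_LIST_PTR	0x34	/* offset of the capabilities pointer */
#define CAP_FIRST_VALID	0x40	/* capabilities live above the standard header */

/* Hypothetical accessor: read one byte from a 256-byte config space copy. */
static uint8_t cfg_read8(const uint8_t *cfg, uint8_t off)
{
	return cfg[off];
}

/* Return the offset of capability `cap`, or 0 if it is not present. */
static uint8_t find_capability(const uint8_t *cfg, uint8_t cap)
{
	uint8_t pos = cfg_read8(cfg, CAP_LIST_PTR);
	int ttl = 48;	/* bound the walk so a looping list cannot hang us */

	while (ttl-- > 0) {
		uint8_t id;

		pos = (uint8_t)(pos & 0xFC);	/* pointers are dword aligned */
		if (pos < CAP_FIRST_VALID)
			break;			/* end of list (or bogus pointer) */

		id = cfg_read8(cfg, pos);
		if (id == 0xFF)
			break;			/* unimplemented register space */
		if (id == cap)
			return pos;

		pos = cfg_read8(cfg, (uint8_t)(pos + 1));	/* next pointer */
	}

	return 0;
}
```

The iteration limit is a design choice borrowed from common practice for such walks: malformed hardware or an uninitialised register block can otherwise produce a list that never terminates.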

Re: Does vdso_install attempt to re-compile objects under root privilege?

2019-04-23 Thread Masahiro Yamada

On Wed, Apr 24, 2019 at 8:40 AM Linus Torvalds
 wrote:
>
> On Tue, Apr 23, 2019 at 11:47 AM Andy Lutomirski  wrote:
> >
> > Hmm.  I suppose an alternative would be for vdso_install to fail if
> > the vdso isn't built?
>
> I absolutely abhor even the concept of building the kernel as root,
> and I think it should be actively disallowed. Our build system is
> good, but it's good as in "clever and complex" rather than necessarily
> good as in "very secure".
>
> So anybody who builds the kernel as root is doing something seriously
> wrong, in my opinion.
>
> That's partly exactly _because_ we have a lot of magical and very
> powerful build rules, and complicated implicit things going on.
>
> For example, our dependencies aren't even about just the files in the
> kernel repository itself, we have clever things like "if the compiler
> has been updated and features or version changes, we'll automatically
> rebuild, because it's part of our clever build system checks".
>
> But that is also part of the reason why I absolutely do *not* want any
> root-building to happen, because our build setup is simply way too
> clever.
>
> If root builds stuff, you'll end up with root-owned generated
> subdirectories or various config files etc, and even if you don't have
> security issues, it can complicate the build later as a regular user.
>
> I've had the build occasionally fail in odd ways, because some
> root-owned file was now no longer removable (usually it's the
> auto-generated header files in the directory, and the root-generated
> and owned directory is now not writable by the developer any more).
> And every time it happens, I shudder.
>
> So all of that simply boils down to "root should not be running those
> complex rules for our config and dependency magic".
>
> At the same time, "make install" obviously needs to be done as root.
>
> All of which is why I opine that "make install" should never build
> anything at all, it should purely be used as a "install previously
> built files".
>
> So yes, I'd much prefer just failing over trying to build as root (or
> even trying to figure out dependencies as root).
>
> > What's the ideal outcome here?
>
> I'd basically like the rule for "make install" to be that it never
> ever generates a single file in the build tree, so that there are
> never any root-owned (or root-overwritten) files there.
>
> So "make install" should even avoid all dependency checking, for the
> simple reason that if you happen to do a system update between "make"
> and "make install", our smart dependencies should never say "oh, the
> compiler version has changed, so now I'll rebuild everything as root
> just because 'make install'".
>
> So I think the ideal outcome is just "fail if you can't find the files
> to install".
>
>  Linus

I assume this is ACK to change vdso Makefiles.
I will send patches.

Thanks.

-- 
Best Regards
Masahiro Yamada

[PATCH] ARM: imx_v6_v7_defconfig: Enable CONFIG_THERMAL_STATISTICS

2019-04-23 Thread Anson Huang

Enable CONFIG_THERMAL_STATISTICS to extend the sysfs interface
for thermal cooling devices and expose some useful statistics.

Signed-off-by: Anson Huang 
---
 arch/arm/configs/imx_v6_v7_defconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm/configs/imx_v6_v7_defconfig b/arch/arm/configs/imx_v6_v7_defconfig
index 765003a..ea387cb 100644
--- a/arch/arm/configs/imx_v6_v7_defconfig
+++ b/arch/arm/configs/imx_v6_v7_defconfig
@@ -225,6 +225,7 @@ CONFIG_POWER_SUPPLY=y
 CONFIG_SENSORS_MC13783_ADC=y
 CONFIG_SENSORS_GPIO_FAN=y
 CONFIG_SENSORS_IIO_HWMON=y
+CONFIG_THERMAL_STATISTICS=y
 CONFIG_THERMAL_WRITABLE_TRIPS=y
 CONFIG_CPU_THERMAL=y
 CONFIG_IMX_THERMAL=y
-- 
2.7.4

RE: [PATCH] riscv: Support non-coherency memory model

2019-04-23 Thread Gary Guo




> -Original Message-
> From: Guo Ren 
> Sent: Wednesday, April 24, 2019 03:08
> To: Gary Guo 
> Cc: Christoph Hellwig ; linux-a...@vger.kernel.org; Palmer
> Dabbelt ; Andrew Waterman ; Arnd
> Bergmann ; Anup Patel ; Xiang
> Xiaoyan ; linux-kernel@vger.kernel.org; Mike
> Rapoport ; Vincent Chen ;
> Greentime Hu ; ren_...@c-sky.com; linux-
> ri...@lists.infradead.org; Marek Szyprowski ;
> Robin Murphy ; Scott Wood ;
> tech-privile...@lists.riscv.org
> Subject: Re: [PATCH] riscv: Support non-coherency memory model
> 
> Hi Gary,
> 
> On Tue, Apr 23, 2019 at 03:57:30PM +, Gary Guo wrote:
> > >>> Another point is we could get more attribute bits by modify the riscv
> > >>> spec:
> > >>>   - Remove Global bit, I think it's duplicate with the User bit in linux.
> > >>
> > >> It is in Linux, but it is conceptually very different.
> > > Yes, but hardware could ignore one of them and in riscv linux
> > > _PAGE_GLOBAL is no use at all, see:
> > > grep _PAGE_GLOBAL arch/riscv -r
> > >
> > > In fact, the _PAGE_KERNEL for pte doesn't contain _PAGE_GLOBAL and it
> > > works on FU540 and qemu. As I've mentioned page attribute bits is very
> > > precious, define a useless bit make people confused.
> >  >
> >
> > The fact that it isn't used yet doesn't imply it is not useful. We don't
> > use ASIDs at the moment, and without using ASIDs the "global" bit is
> > indeed not useful. However with ASIDs the bit will be vital for saving
> > TLB spaces. Without the global bit, the kernel pages become synonyms to
> > themselves (i.e. they have different tags in TLB but refer to the same
> > physical page).
> >
> > The global bit also exists in many other ISAs as well. It's definitely
> > not a "useless" bits.
> >
> > Moreover, this bit is already implemented in both Rocket and Ariane. It
> > is also in the spec for quite a while. The fact that Linux doesn't use
> > it at the moment is not a reason for removing it.
> >
> 
> Look:
> linux-next git:(riscv_asid_allocator_v2)$ grep GLOBAL arch/riscv -r
> arch/riscv/include/asm/pgtable-bits.h:#define _PAGE_GLOBAL    (1 << 5)    /* Global */
> arch/riscv/include/asm/pgtable-bits.h:    _PAGE_USER | _PAGE_GLOBAL))
> 
> Your patch tells us _PAGE_USER and _PAGE_GLOBAL are duplicate and why we
> couldn't make _PAGE_USER implies _PAGE_GLOBAL? Can you give an example
> of a real scene in PTE about:
>   _PAGE_USER:0 + _PAGE_GLOBAL:1
> or
>   _PAGE_USER:1 + _PAGE_GLOBAL:0
> 
> Of course I know USER & GLOBAL are conceptually very different, but
> there are only 10 attribute-bits for riscv (In fact we've wasted two bits
> to support huge RV32-pfn :P). So I think it is time to merge these two bits
> before hardware supports GLOBAL. Reserve them for future!

Two cases I can think of:
* vdso-like things. They're user pages that can really be shared across address
spaces (i.e. global). Kernels like L4 implement most system calls similarly to
the VDSO, so USER + GLOBAL is useful.
* hypervisor without H-extension: This requires shadow page tables. Supervisor
pages are mapped to supervisor shadow pages. However, these shadow pages cannot
be GLOBAL because they can't be shared between VMs. So !USER + !GLOBAL is
useful.

Remember Linux isn't the only supervisor software that RISC-V cares about!

> 
> Best Regards
>  Guo Ren
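
The point that the U and G bits are independent can be made concrete with a tiny classifier. This is a sketch only: the bit positions follow the RISC-V PTE layout (U at bit 4, G at bit 5, matching the `_PAGE_GLOBAL (1 << 5)` definition quoted above), while `enum pte_kind` and `classify_pte` are hypothetical names introduced for illustration.

```c
#include <stdint.h>

#define PTE_USER	(1u << 4)	/* U: accessible from user mode */
#define PTE_GLOBAL	(1u << 5)	/* G: mapping valid in all address spaces */

/* The four combinations of the two bits, all of which can be meaningful. */
enum pte_kind {
	SUPERVISOR_PRIVATE,	/* !U, !G: e.g. per-VM shadow pages */
	SUPERVISOR_GLOBAL,	/* !U,  G: e.g. kernel mappings shared by all */
	USER_PRIVATE,		/*  U, !G: e.g. ordinary process memory */
	USER_GLOBAL		/*  U,  G: e.g. vdso-like shared user pages */
};

/* Classify a PTE by its U and G bits, ignoring every other bit. */
static enum pte_kind classify_pte(uint64_t pte)
{
	int user = (pte & PTE_USER) != 0;
	int global = (pte & PTE_GLOBAL) != 0;

	if (user)
		return global ? USER_GLOBAL : USER_PRIVATE;
	return global ? SUPERVISOR_GLOBAL : SUPERVISOR_PRIVATE;
}
```

If one bit implied the other, two of the four kinds would be unrepresentable, which is the objection raised in the reply above.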

Re: [PATCH V4 06/16] PCI: dwc: Add ext config space capability search API

2019-04-23 Thread Vidya Sagar


On 4/24/2019 2:05 AM, Bjorn Helgaas wrote:

On Tue, Apr 23, 2019 at 01:57:20PM +0530, Vidya Sagar wrote:

Add extended configuration space capability search API using struct dw_pcie *
pointer

Signed-off-by: Vidya Sagar 
Acked-by: Gustavo Pimentel 
---
Changes from [v3]:
* None

Changes from [v2]:
* None

Changes from [v1]:
* This is a new patch in v2 series

  drivers/pci/controller/dwc/pcie-designware.c | 41 
  drivers/pci/controller/dwc/pcie-designware.h |  1 +
  2 files changed, 42 insertions(+)

diff --git a/drivers/pci/controller/dwc/pcie-designware.c b/drivers/pci/controller/dwc/pcie-designware.c
index 6a98135244d6..ecf5fe8842f6 100644
--- a/drivers/pci/controller/dwc/pcie-designware.c
+++ b/drivers/pci/controller/dwc/pcie-designware.c
@@ -53,6 +53,47 @@ u8 dw_pcie_find_capability(struct dw_pcie *pci, u8 cap)
return __dw_pcie_find_next_cap(pci, next_cap_ptr, cap);
  }
  


Please make sure there's a comment here about why
pci_find_ext_capability() can't be used (a comment covering both this
and pci_find_capability() is fine, if the reason is the same).

The reason is the same: the standard pci_find_ext_capability() takes a
'struct pci_dev *dev' pointer and can only be used after enumeration, whereas
the APIs being added here take a 'struct dw_pcie *pci' pointer and can also be
used before link up.
I'll add a comment in the other patch, where I'm moving these APIs from
pcie-designware-ep.c to pcie-designware.c.
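
For illustration, a minimal sketch of the same walk in plain C. It assumes a
mock config space held in a byte array instead of DBI register access, so it
needs neither a 'struct pci_dev' nor a 'struct dw_pcie'. The names
mock_readl(), mock_writel(), ext_cap_header() and find_ext_cap() are
hypothetical and not part of the driver. The header layout (ID in bits 15:0,
version in bits 19:16, next pointer in bits 31:20) is the PCIe extended
capability header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CFG_SPACE_SIZE		0x100	/* start of extended config space */
#define CFG_SPACE_EXP_SIZE	0x1000	/* total extended config space size */
#define EXT_CAP_ID(h)		((h) & 0x0000ffffu)
#define EXT_CAP_NEXT(h)		(((h) >> 20) & 0xffcu)

/* Read a little-endian dword from the mock config space. */
static uint32_t mock_readl(const uint8_t *cfg, int pos)
{
	return (uint32_t)cfg[pos] | ((uint32_t)cfg[pos + 1] << 8) |
	       ((uint32_t)cfg[pos + 2] << 16) | ((uint32_t)cfg[pos + 3] << 24);
}

/* Write a little-endian dword into the mock config space. */
static void mock_writel(uint8_t *cfg, int pos, uint32_t val)
{
	cfg[pos] = val & 0xff;
	cfg[pos + 1] = (val >> 8) & 0xff;
	cfg[pos + 2] = (val >> 16) & 0xff;
	cfg[pos + 3] = (val >> 24) & 0xff;
}

/* Build an extended capability header from its three fields. */
static uint32_t ext_cap_header(uint16_t id, uint8_t ver, uint16_t next)
{
	return (uint32_t)id | ((uint32_t)(ver & 0xf) << 16) |
	       ((uint32_t)(next & 0xffc) << 20);
}

/* Return the offset of extended capability 'cap', or 0 if not present. */
static int find_ext_cap(const uint8_t *cfg, int cap)
{
	/* Each capability takes at least 8 bytes, which bounds the walk. */
	int ttl = (CFG_SPACE_EXP_SIZE - CFG_SPACE_SIZE) / 8;
	int pos = CFG_SPACE_SIZE;
	uint32_t header;

	assert(cfg != NULL);

	header = mock_readl(cfg, pos);
	if (header == 0)	/* no extended capabilities at all */
		return 0;

	while (ttl-- > 0) {
		if (EXT_CAP_ID(header) == (uint32_t)cap)
			return pos;

		pos = (int)EXT_CAP_NEXT(header);
		if (pos < CFG_SPACE_SIZE)	/* end of list */
			break;

		header = mock_readl(cfg, pos);
	}

	return 0;
}
```

The hop limit matters: a malformed list that points back to itself would
otherwise loop forever.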




+static int dw_pcie_find_next_ext_capability(struct dw_pcie *pci, int start,
+   int cap)
+{
+   u32 header;
+   int ttl;
+   int pos = PCI_CFG_SPACE_SIZE;
+
+   /* minimum 8 bytes per capability */
+   ttl = (PCI_CFG_SPACE_EXP_SIZE - PCI_CFG_SPACE_SIZE) / 8;
+
+   if (start)
+   pos = start;
+
+   header = dw_pcie_readl_dbi(pci, pos);
+   /*
+* If we have no capabilities, this is indicated by cap ID,
+* cap version and next pointer all being 0.
+*/
+   if (header == 0)
+   return 0;
+
+   while (ttl-- > 0) {
+   if (PCI_EXT_CAP_ID(header) == cap && pos != start)
+   return pos;
+
+   pos = PCI_EXT_CAP_NEXT(header);
+   if (pos < PCI_CFG_SPACE_SIZE)
+   break;
+
+   header = dw_pcie_readl_dbi(pci, pos);
+   }
+
+   return 0;
+}
+
+int dw_pcie_find_ext_capability(struct dw_pcie *pci, int cap)
+{
+   return dw_pcie_find_next_ext_capability(pci, 0, cap);
+}
+EXPORT_SYMBOL_GPL(dw_pcie_find_ext_capability);
+
  int dw_pcie_read(void __iomem *addr, int size, u32 *val)
  {
if (!IS_ALIGNED((uintptr_t)addr, size)) {
diff --git a/drivers/pci/controller/dwc/pcie-designware.h 
b/drivers/pci/controller/dwc/pcie-designware.h
index 35160b4ce929..67307842e003 100644
--- a/drivers/pci/controller/dwc/pcie-designware.h
+++ b/drivers/pci/controller/dwc/pcie-designware.h
@@ -249,6 +249,7 @@ struct dw_pcie {
container_of((endpoint), struct dw_pcie, ep)
  
  u8 dw_pcie_find_capability(struct dw_pcie *pci, u8 cap);

+int dw_pcie_find_ext_capability(struct dw_pcie *pci, int cap);
  
  int dw_pcie_read(void __iomem *addr, int size, u32 *val);

  int dw_pcie_write(void __iomem *addr, int size, u32 val);
--
2.17.1

Re: [PATCH V4 05/16] PCI: dwc: Move config space capability search API

2019-04-23 Thread Vidya Sagar


On 4/24/2019 2:02 AM, Bjorn Helgaas wrote:

On Tue, Apr 23, 2019 at 01:57:19PM +0530, Vidya Sagar wrote:

Move PCIe config space capability search API to common DesignWare file
as this can be used by both host and ep mode codes.

Signed-off-by: Vidya Sagar 
Acked-by: Gustavo Pimentel 
---
Changes from [v3]:
* Rebased to linux-next top of the tree

Changes from [v2]:
* None

Changes from [v1]:
* Removed dw_pcie_find_next_ext_capability() API from here and made a
   separate patch for that

  drivers/pci/controller/dwc/pcie-designware.c | 33 
  drivers/pci/controller/dwc/pcie-designware.h |  2 ++


You claim this is a "move", but I only see adds.  Where did it move
*from*?

These are supposed to be moved from pcie-designware-ep.c. That was the case
with my old patches, but when I rebased them onto ToT I missed the change that
removes them from pcie-designware-ep.c. Thanks for catching this. I'll
address it in the next patch.



While you're at it, can you add a comment in the code about why we
can't use the regular pci_find_capability() interface?  It's really a
shame to have to reimplement that.

The regular pci_find_capability() takes a 'struct pci_dev *dev' pointer and can
be used only after device enumeration is done, whereas these APIs are used
specifically before link up and take a 'struct dw_pcie *pci' pointer.
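
As a rough illustration of what such a pre-enumeration search does, here is a
minimal sketch in plain C. It works on a mock 256-byte config space rather
than on DBI reads. The name find_cap() and the hop limit of 48 are choices
made for this sketch, not the driver's code. Unlike the recursive helper in
the patch, the sketch is iterative and compares the ID before following the
next pointer, so the last entry in the list can also match.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CAP_LIST_PTR	0x34	/* offset of the capabilities pointer */
#define CAP_FIRST_VALID	0x40	/* capabilities live above the header */

/*
 * Return the offset of standard capability 'cap' in a mock 256-byte
 * config space, or 0 if it is not present.
 */
static uint8_t find_cap(const uint8_t cfg[256], uint8_t cap)
{
	int ttl = 48;		/* hop limit chosen for this sketch */
	uint8_t pos;

	assert(cfg != NULL);

	pos = cfg[CAP_LIST_PTR];
	while (pos >= CAP_FIRST_VALID && ttl-- > 0) {
		uint8_t id;

		pos &= 0xfc;	/* pointers are dword aligned */
		id = cfg[pos];
		if (id == 0xff)	/* unimplemented space reads as all ones */
			break;
		if (id == cap)
			return pos;

		pos = cfg[pos + 1];	/* follow the next pointer */
	}

	return 0;
}
```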




  2 files changed, 35 insertions(+)

diff --git a/drivers/pci/controller/dwc/pcie-designware.c 
b/drivers/pci/controller/dwc/pcie-designware.c
index 8e0081ccf83b..6a98135244d6 100644
--- a/drivers/pci/controller/dwc/pcie-designware.c
+++ b/drivers/pci/controller/dwc/pcie-designware.c
@@ -20,6 +20,39 @@
  #define PCIE_PHY_DEBUG_R1_LINK_UP (0x1 << 4)
  #define PCIE_PHY_DEBUG_R1_LINK_IN_TRAINING(0x1 << 29)
  
+static u8 __dw_pcie_find_next_cap(struct dw_pcie *pci, u8 cap_ptr,

+ u8 cap)
+{
+   u8 cap_id, next_cap_ptr;
+   u16 reg;
+
+   reg = dw_pcie_readw_dbi(pci, cap_ptr);
+   next_cap_ptr = (reg & 0xff00) >> 8;
+   cap_id = (reg & 0x00ff);
+
+   if (!next_cap_ptr || cap_id > PCI_CAP_ID_MAX)
+   return 0;
+
+   if (cap_id == cap)
+   return cap_ptr;
+
+   return __dw_pcie_find_next_cap(pci, next_cap_ptr, cap);
+}
+
+u8 dw_pcie_find_capability(struct dw_pcie *pci, u8 cap)
+{
+   u8 next_cap_ptr;
+   u16 reg;
+
+   reg = dw_pcie_readw_dbi(pci, PCI_CAPABILITY_LIST);
+   next_cap_ptr = (reg & 0x00ff);
+
+   if (!next_cap_ptr)
+   return 0;
+
+   return __dw_pcie_find_next_cap(pci, next_cap_ptr, cap);
+}
+
  int dw_pcie_read(void __iomem *addr, int size, u32 *val)
  {
if (!IS_ALIGNED((uintptr_t)addr, size)) {
diff --git a/drivers/pci/controller/dwc/pcie-designware.h 
b/drivers/pci/controller/dwc/pcie-designware.h
index 9ee98ced1ef6..35160b4ce929 100644
--- a/drivers/pci/controller/dwc/pcie-designware.h
+++ b/drivers/pci/controller/dwc/pcie-designware.h
@@ -248,6 +248,8 @@ struct dw_pcie {
  #define to_dw_pcie_from_ep(endpoint)   \
container_of((endpoint), struct dw_pcie, ep)
  
+u8 dw_pcie_find_capability(struct dw_pcie *pci, u8 cap);

+
  int dw_pcie_read(void __iomem *addr, int size, u32 *val);
  int dw_pcie_write(void __iomem *addr, int size, u32 val);
  
--

2.17.1

Re: [EXT] Re: [PATCH v1 01/15] Revert "ARM: dts: imx6q: Use correct SDMA script for SPI5 core"

2019-04-23 Thread Robin Gong

On 2019-04-23 at 11:02 -0300, Fabio Estevam wrote:
> Hi Robin,
> 
> On Tue, Apr 23, 2019 at 10:50 AM Robin Gong 
> wrote:
> > 
> > 
> > This reverts commit df07101e1c4a29e820df02f9989a066988b160e6.
> You need to provide a detailed explanation in the commit log as to
> why
> the revert is needed.
Okay, will address your comments in V2.

RE: [PATCH V11 0/5] Add i.MX7ULP EVK PWM backlight support

2019-04-23 Thread Anson Huang

Gentle ping...

> -Original Message-
> From: Anson Huang
> Sent: Wednesday, April 10, 2019 9:47 AM
> To: thierry.red...@gmail.com; robh...@kernel.org; mark.rutl...@arm.com;
> shawn...@kernel.org; s.ha...@pengutronix.de; ker...@pengutronix.de;
> feste...@gmail.com; li...@armlinux.org.uk; ste...@agner.ch;
> ota...@ossystems.com.br; Leonard Crestez ;
> Robin Gong ; u.kleine-koe...@pengutronix.de; linux-
> p...@vger.kernel.org; devicet...@vger.kernel.org; linux-arm-
> ker...@lists.infradead.org; linux-kernel@vger.kernel.org
> Cc: dl-linux-imx 
> Subject: [PATCH V11 0/5] Add i.MX7ULP EVK PWM backlight support
> 
> i.MX7ULP EVK board has MIPI-DSI display, its backlight is supplied by TPM
> PWM module, this patch set enables i.MX7ULP TPM PWM driver support and
> also add backlight support for MIPI-DSI display.
> 
> Changes since V10:
>   - ONLY change the pwm driver patch.
> 
> Anson Huang (5):
>   dt-bindings: pwm: Add i.MX TPM PWM binding
>   pwm: Add i.MX TPM PWM driver support
>   ARM: imx_v6_v7_defconfig: Add TPM PWM support by default
>   ARM: dts: imx7ulp: Add tpm pwm support
>   ARM: dts: imx7ulp-evk: Add backlight support
> 
>  .../devicetree/bindings/pwm/imx-tpm-pwm.txt|  22 +
>  arch/arm/boot/dts/imx7ulp-evk.dts  |  21 +
>  arch/arm/boot/dts/imx7ulp.dtsi |  10 +
>  arch/arm/configs/imx_v6_v7_defconfig   |   1 +
>  drivers/pwm/Kconfig|  11 +
>  drivers/pwm/Makefile   |   1 +
>  drivers/pwm/pwm-imx-tpm.c  | 442 
> +
>  7 files changed, 508 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/pwm/imx-tpm-
> pwm.txt
>  create mode 100644 drivers/pwm/pwm-imx-tpm.c
> 
> --
> 2.7.4

Re: [PATCH 3/3] RAS/CEC: immediate soft-offline page when count_threshold == 1

2019-04-23 Thread WANG Chao

On 04/20/19 at 01:57P, Borislav Petkov wrote:
> On Thu, Apr 18, 2019 at 11:41:15AM +0800, WANG Chao wrote:
> > count_threshold == 1 isn't working as expected. CEC only does soft
> > offline the second time the same pfn is hit by a correctable error.
> 
> So this?
> 
> ---
> diff --git a/drivers/ras/cec.c b/drivers/ras/cec.c
> index b3c377ddf340..750a427e1a73 100644
> --- a/drivers/ras/cec.c
> +++ b/drivers/ras/cec.c
> @@ -333,6 +333,7 @@ int cec_add_elem(u64 pfn)
>  
>   mutex_lock(_mutex);
>  
> + /* Array full, free the LRU slot. */
>   if (ca->n == MAX_ELEMS)
>   WARN_ON(!del_lru_elem_unlocked(ca));
>  
> @@ -346,14 +347,9 @@ int cec_add_elem(u64 pfn)
>   (void *)>array[to],
>   (ca->n - to) * sizeof(u64));
>  
> - ca->array[to] = (pfn << PAGE_SHIFT) |
> - (DECAY_MASK << COUNT_BITS) | 1;
> + ca->array[to] = (pfn << PAGE_SHIFT) | 1;
>  
>   ca->n++;
> -
> - ret = 0;
> -
> - goto decay;
>   }
>  
>   count = COUNT(ca->array[to]);
> @@ -386,7 +382,6 @@ int cec_add_elem(u64 pfn)
>   goto unlock;
>   }
>  
> -decay:
>   ca->decay_count++;
>  
>   if (ca->decay_count >= CLEAN_ELEMS)

It looks good to me. Thanks for a better fix.

[PATCH -next] staging: kpc2000: fix platform_no_drv_owner.cocci warnings

2019-04-23 Thread YueHaibing

Remove .owner field if calls are used which set it automatically
Generated by: scripts/coccinelle/api/platform_no_drv_owner.cocci

Signed-off-by: YueHaibing 
---
 drivers/staging/kpc2000/kpc_spi/spi_driver.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/staging/kpc2000/kpc_spi/spi_driver.c 
b/drivers/staging/kpc2000/kpc_spi/spi_driver.c
index b38149b752fb..63b4616bf538 100644
--- a/drivers/staging/kpc2000/kpc_spi/spi_driver.c
+++ b/drivers/staging/kpc2000/kpc_spi/spi_driver.c
@@ -496,7 +496,6 @@ kp_spi_remove(struct platform_device *pldev)
 static struct platform_driver kp_spi_driver = {
 .driver = {
 .name = KP_DRIVER_NAME_SPI,
-.owner =THIS_MODULE,
 },
 .probe =kp_spi_probe,
 .remove =   kp_spi_remove,

[PATCH -next] staging: kpc2000: remove duplicated include from kp2000_module.c

2019-04-23 Thread YueHaibing

Remove duplicated include.

Signed-off-by: YueHaibing 
---
 drivers/staging/kpc2000/kpc2000/kp2000_module.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/staging/kpc2000/kpc2000/kp2000_module.c 
b/drivers/staging/kpc2000/kpc2000/kp2000_module.c
index 661b0b74ed66..fa3bd266ba54 100644
--- a/drivers/staging/kpc2000/kpc2000/kp2000_module.c
+++ b/drivers/staging/kpc2000/kpc2000/kp2000_module.c
@@ -5,7 +5,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include

[PATCH] cpufreq: qoriq: Add ls1028a chip support

2019-04-23 Thread andy . tang

From: Yuantian Tang 

Enable cpufreq feature on ls1028a chip by adding its compatible
string.

Signed-off-by: Yuantian Tang 
---
 drivers/cpufreq/qoriq-cpufreq.c |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/drivers/cpufreq/qoriq-cpufreq.c b/drivers/cpufreq/qoriq-cpufreq.c
index 4295e54..d308c4d 100644
--- a/drivers/cpufreq/qoriq-cpufreq.c
+++ b/drivers/cpufreq/qoriq-cpufreq.c
@@ -280,6 +280,7 @@ static int qoriq_cpufreq_target(struct cpufreq_policy 
*policy,
 
{ .compatible = "fsl,ls1012a-clockgen", },
{ .compatible = "fsl,ls1021a-clockgen", },
+   { .compatible = "fsl,ls1028a-clockgen", },
{ .compatible = "fsl,ls1043a-clockgen", },
{ .compatible = "fsl,ls1046a-clockgen", },
{ .compatible = "fsl,ls1088a-clockgen", },
-- 
1.7.1

[PATCH] dt-bindings: qoriq-clock: Add ls1028a chip compatible string

2019-04-23 Thread andy . tang

From: Yuantian Tang 

Add ls1028a chip compatible string in binding document.

Signed-off-by: Yuantian Tang 
---
 .../devicetree/bindings/clock/qoriq-clock.txt  |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/Documentation/devicetree/bindings/clock/qoriq-clock.txt 
b/Documentation/devicetree/bindings/clock/qoriq-clock.txt
index c655f28..9cf4a07 100644
--- a/Documentation/devicetree/bindings/clock/qoriq-clock.txt
+++ b/Documentation/devicetree/bindings/clock/qoriq-clock.txt
@@ -39,6 +39,7 @@ Required properties:
* "fsl,b4860-clockgen"
* "fsl,ls1012a-clockgen"
* "fsl,ls1021a-clockgen"
+   * "fsl,ls1028a-clockgen"
* "fsl,ls1043a-clockgen"
* "fsl,ls1046a-clockgen"
* "fsl,ls1088a-clockgen"
-- 
1.7.1

[PATCH v1 1/1] Return the verified kernel image signature in kexec_file_load

2019-04-23 Thread nramas




From: Lakshmi Ramasubramanian 

Signed-off-by: Lakshmi Ramasubramanian 
---
When CONFIG_KEXEC_VERIFY_SIG is selected the signature on
the new kernel image is verified in kexec_file_load.

The signature is embedded in the kernel image file.

This change returns the pointer to the verified signature and
the length of that signature. kexec_file_load can log this
signature for attestation (to attest the signer of the new kernel).
The change to log the kernel signature for attestation will be added
in a future change set.


 arch/x86/kernel/kexec-bzimage64.c  |  7 +--
 arch/x86/kernel/machine_kexec_64.c | 10 +++---
 crypto/asymmetric_keys/verify_pefile.c | 18 +-
 include/linux/kexec.h  |  8 ++--
 include/linux/verification.h   |  4 +++-
 kernel/kexec_file.c| 14 +++---
 6 files changed, 49 insertions(+), 12 deletions(-)

diff --git a/arch/x86/kernel/kexec-bzimage64.c 
b/arch/x86/kernel/kexec-bzimage64.c
index 9d7fd5e6689a..030abd8adbce 100644
--- a/arch/x86/kernel/kexec-bzimage64.c
+++ b/arch/x86/kernel/kexec-bzimage64.c
@@ -530,11 +530,14 @@ static int bzImage64_cleanup(void *loader_data)
 }

 #ifdef CONFIG_KEXEC_BZIMAGE_VERIFY_SIG
-static int bzImage64_verify_sig(const char *kernel, unsigned long kernel_len)
+static int bzImage64_verify_sig(const char *kernel, unsigned long kernel_len,
+   unsigned int *signature_len, void **signature)
 {
return verify_pefile_signature(kernel, kernel_len,
   NULL,
-  VERIFYING_KEXEC_PE_SIGNATURE);
+  VERIFYING_KEXEC_PE_SIGNATURE,
+  signature_len,
+  signature);
 }
 #endif

diff --git a/arch/x86/kernel/machine_kexec_64.c 
b/arch/x86/kernel/machine_kexec_64.c
index 6f5ca4ebe6e5..b556a9750dd4 100644
--- a/arch/x86/kernel/machine_kexec_64.c
+++ b/arch/x86/kernel/machine_kexec_64.c
@@ -405,15 +405,19 @@ int arch_kimage_file_post_load_cleanup(struct kimage 
*image)
 }

 #ifdef CONFIG_KEXEC_VERIFY_SIG
-int arch_kexec_kernel_verify_sig(struct kimage *image, void *kernel,
-unsigned long kernel_len)
+int arch_kexec_kernel_verify_sig(struct kimage *image,
+   void *kernel,
+   unsigned long kernel_len,
+   unsigned int *signature_len,
+   void **signature)
 {
if (!image->fops || !image->fops->verify_sig) {
pr_debug("kernel loader does not support signature 
verification.");
return -EKEYREJECTED;
}

-   return image->fops->verify_sig(kernel, kernel_len);
+   return image->fops->verify_sig(kernel, kernel_len,
+   signature_len, signature);
 }
 #endif

diff --git a/crypto/asymmetric_keys/verify_pefile.c 
b/crypto/asymmetric_keys/verify_pefile.c
index 672a94c2c3ff..588a7966922f 100644
--- a/crypto/asymmetric_keys/verify_pefile.c
+++ b/crypto/asymmetric_keys/verify_pefile.c
@@ -394,6 +394,12 @@ static int pefile_digest_pe(const void *pebuf, unsigned 
int pelen,
  * @pelen: Length of the binary image
  * @trust_keys: Signing certificate(s) to use as starting points
  * @usage: The use to which the key is being put.
+ * @signature_len: If non-NULL the number of bytes in the signature
+ * will be returned in this out parameter
+ * @signature: If non-NULL a pointer to the buffer containing
+ * the file signature will be returned in this out parameter.
+ * The pointer being returned is actually within the buffer
+ * pointed to by pebuf. So the caller should not try to free it.
  *
  * Validate that the certificate chain inside the PKCS#7 message inside the PE
  * binary image intersects keys we already know and trust.
@@ -418,7 +424,9 @@ static int pefile_digest_pe(const void *pebuf, unsigned int 
pelen,
  */
 int verify_pefile_signature(const void *pebuf, unsigned pelen,
struct key *trusted_keys,
-   enum key_being_used_for usage)
+   enum key_being_used_for usage,
+   unsigned int *signature_len,
+   void **signature)
 {
struct pefile_context ctx;
int ret;
@@ -448,6 +456,14 @@ int verify_pefile_signature(const void *pebuf, unsigned 
pelen,
 * contents.
 */
ret = pefile_digest_pe(pebuf, pelen, );
+   if (ret < 0)
+   goto error;
+
+   /* Check if the caller needs the file signature */
+   if (signature_len != NULL && signature != NULL) {
+   *signature_len = ctx.sig_len;
+   *signature = pebuf + ctx.sig_offset;
+   }

 error:
kfree(ctx.digest);
diff --git

Re: [PATCH] riscv: Support non-coherency memory model

2019-04-23 Thread Guo Ren

Hi Gary,

On Tue, Apr 23, 2019 at 03:57:30PM +, Gary Guo wrote:
> >>> Another point is we could get more attribute bits by modify the riscv
> >>> spec:
> >>>   - Remove Global bit, I think it's duplicate with the User bit in linux.
> >>
> >> It is in Linux, but it is conceptually very different.
> > Yes, but hardware could ignore one of them and in riscv linux
> > _PAGE_GLOBAL is no use at all, see:
> > grep _PAGE_GLOBAL arch/riscv -r
> > 
> > In fact, the _PAGE_KERNEL for pte doesn't contain _PAGE_GLOBAL and it
> > works on FU540 and qemu. As I've mentioned page attribute bits is very
> > precious, define a useless bit make people confused.
>  >
> 
> The fact that it isn't used yet doesn't imply it is not useful. We don't 
> use ASIDs at the moment, and without using ASIDs the "global" bit is 
> indeed not useful. However with ASIDs the bit will be vital for saving 
> TLB spaces. Without the global bit, the kernel pages become synonyms to 
> themselves (i.e. they have different tags in TLB but refer to the same 
> physical page).
> 
> The global bit also exists in many other ISAs as well. It's definitely 
> not a "useless" bits.
> 
> Moreover, this bit is already implemented in both Rocket and Ariane. It 
> is also in the spec for quite a while. The fact that Linux doesn't use 
> it at the moment is not a reason for removing it.
> 

Look:
linux-next git:(riscv_asid_allocator_v2)$ grep GLOBAL arch/riscv -r
arch/riscv/include/asm/pgtable-bits.h:#define _PAGE_GLOBAL(1 << 5)/* 
Global */
arch/riscv/include/asm/pgtable-bits.h:
_PAGE_USER | _PAGE_GLOBAL))

Your patch tells us _PAGE_USER and _PAGE_GLOBAL are duplicates, so why
couldn't we make _PAGE_USER imply _PAGE_GLOBAL? Can you give an example
of a real scenario in a PTE with:
  _PAGE_USER:0 + _PAGE_GLOBAL:1
or
  _PAGE_USER:1 + _PAGE_GLOBAL:0

Of course I know USER & GLOBAL are conceptually very different, but
there are only 10 attribute bits for riscv (in fact we've wasted two bits
to support huge RV32 pfn :P). So I think it is time to merge these two bits
before hardware supports GLOBAL. Reserve them for the future!

Best Regards
 Guo Ren

Re: [PATCH] KVM: x86: Add Intel CPUID.1F cpuid emulation support

2019-04-23 Thread Like Xu


On 2019/4/24 1:44, Sean Christopherson wrote:

On Tue, Apr 23, 2019 at 11:23:59AM +0800, Like Xu wrote:

On 2019/4/23 2:35, Sean Christopherson wrote:

  #define F(x) bit(X86_FEATURE_##x)
  int kvm_update_cpuid(struct kvm_vcpu *vcpu)
@@ -426,6 +436,7 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 
*entry, u32 function,
switch (function) {
case 0:
entry->eax = min(entry->eax, (u32)(f_intel_pt ? 0x14 : 0xd));
+   entry->eax = kvm_supported_intel_mcp() ? 0x1f : entry->eax;


This all seems unnecessary.  And by 'all', I mean the existing Intel PT
and XSAVE leaf checks, as well as the new mcp check.  entry->eax comes
directly from hardware, and unless I missed something, PT and XSAVE are
only exposed to the guest when they're supported in hardware.  In other
words, KVM will never need to adjust entry->eax to expose PT or XSAVE.


We call this function for both the KVM_GET_SUPPORTED_CPUID and
KVM_GET_EMULATED_CPUID cases, although a KVM user could reconfigure them via
the KVM_SET_CPUID* path.


Not that it matters, but __do_cpuid_ent() is only used for the non-emulated
case, KVM_GET_EMULATED_CPUID goes to __do_cpuid_ent_emulated().


That's true, and I should mention that we have two scenarios for getting the
vCPUID:

1. For kvm_dev, we have KVM_GET_EMULATED_CPUID for
kvm_dev_ioctl_get_cpuid (this is the one we're talking about);

2. For kvm_vcpu, we have KVM_GET_CPUID2 for kvm_vcpu_ioctl_get_cpuid2.

  

The original min() check was added by commit 0771671749b5 ("KVM: Enhance
guest cpuid management"), which doesn't provide any explicit information
on why KVM does min() in the first place.


Blindly exposing cpuid.0.eax (whatever the host hardware supports)
is not good practice for guest migration and raises compatibility
requirements.


Right, but isn't the f_intel_pt check for example completely irrelevant?
f_intel_pt is true if and only if hardware supports PT, i.e. CPUID.0.EAX
and thus entry->eax will already be >=0x14.


The f_intel_pt check covers not only hardware support but also the
module_param (pt_mode) setting.


So the case is that the host does have PT support (meaning host
CPUID.0.EAX is already >= 0x14 on Intel CPUs) but KVM doesn't want to
advertise it, and thus the min() operation is needed.
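
A minimal sketch of that clamping in plain C, mirroring the two quoted
'entry->eax' lines from the patch. The function name advertised_max_leaf()
and its parameters are hypothetical stand-ins: pt_enabled plays the role of
f_intel_pt (hardware support plus pt_mode), and has_v2_topology plays the
role of kvm_supported_intel_mcp().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Compute the maximum basic CPUID leaf advertised to the guest.
 * host_max is the host's CPUID.0.EAX value.
 */
static uint32_t advertised_max_leaf(uint32_t host_max, bool pt_enabled,
				    bool has_v2_topology)
{
	/* Cap at 0x14 when PT is advertised, otherwise at 0xd. */
	uint32_t limit = pt_enabled ? 0x14 : 0xd;
	uint32_t eax = host_max < limit ? host_max : limit;

	assert(limit <= 0x1f);

	/* Raise the limit to 0x1f only when the host exposes that leaf. */
	if (has_v2_topology)
		eax = 0x1f;

	return eax;
}
```

With a host that reports 0x1f but has PT disabled via the module parameter,
the result is 0xd, which is the case where min() is not a nop.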




I don't fully understand whether or not KVM needs to raise the minimum to
0xb regardless of h/w XSAVE support, but it's likely irrelevant in the end.

Anyways, back to 0x1f, kvm_supported_intel_mcp() returns true if and only
if hardware's CPUID.0.EAX >= 0x1f, 


According to the latest SDM, the max hardware CPUID.0.EAX is 0x1f, and the
BIOS would expose 0x1f only for multi-chip packaging CPUs (at least for now).



i.e. adjusting entry->eax is always a
nop.  So if KVM wants to advertise leaf 0x1f only when it's supported in
hardware then adjusting entry->eax is unnecessary, and if KVM wants to
unconditionally advertise 0x1f then adjusting entry->eax should also be
done unconditionally.


If we had no check on kvm_supported_intel_mcp() in the legacy code,
CPUID.0.EAX would be clamped by min() and thus be less than 0x1f, which means
the cpuid.1f info would not be exposed.


I know your point is to avoid min() entirely (I thought so at the time),
and I have pointed out that it's necessary for KVM feature settings.


If KVM wants to unconditionally advertise 0x1f (in the EMULATED way),
KVM needs to cover other side effects, and this patch only advertises 0x1f
when hardware has it.

It's very common for a guest to want to set 0x1f regardless of h/w support,
but that is another story.




Given that the original code
was "entry->eax = min(entry->eax, (u32)0xb);", my *guess* is that the
idea was to always report "Extended Topology Enumeration Leaf" as
supported so that userspace can enumerate the VM's topology to the guest
even when hardware itself doesn't do so.


If the host cpu model is too antiquated to support 0xb, it wouldn't report
0xb for sure. The host cpuid.0.eax has been over 0xb for a long time and
reached 0x1f in the latest SDM.

AFAICT, the original code keeps minimum cpuid.0.eax out of features guest
just used or at least it claimed to use.



Assuming we want to allow userspace to use "V2 Extended Topology
Enumeration Leaf" regardless of hardware support, then this can simply be:

   entry->eax = min(entry->eax, (u32)0x1f);

Or am I completely missing something?

[PATCH] tools lib traceevent: Change tag string for error

2019-04-23 Thread Leo Yan

The traceevent lib is used by the perf tool; when executing 'perf test -v 6'
it outputs the following error log on the ARM64 platform:

  running test 33 '*:*'trace-cmd: No such file or directory

  [...]

  trace-cmd: Invalid argument

The trace event parsing code originally came from trace-cmd, so it keeps
the tag string "trace-cmd" for errors. This easily gives the impression
that the perf tool launches the trace-cmd command for trace event
parsing, but in fact the parsing is done by the traceevent lib.

This patch changes the tag string to "libtraceevent" to avoid
confusion and make it easier for users to connect the error with the
traceevent lib.

Signed-off-by: Leo Yan 
---
 tools/lib/traceevent/parse-utils.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/lib/traceevent/parse-utils.c 
b/tools/lib/traceevent/parse-utils.c
index 77e4ec6402dd..e99867111387 100644
--- a/tools/lib/traceevent/parse-utils.c
+++ b/tools/lib/traceevent/parse-utils.c
@@ -14,7 +14,7 @@
 void __vwarning(const char *fmt, va_list ap)
 {
if (errno)
-   perror("trace-cmd");
+   perror("libtraceevent");
errno = 0;
 
fprintf(stderr, "  ");
-- 
2.17.1

Re: [PATCH v20 16/28] x86/sgx: Add provisioning

2019-04-23 Thread Jethro Beekman


On 2019-04-17 03:39, Jarkko Sakkinen wrote:

diff --git a/arch/x86/include/uapi/asm/sgx.h b/arch/x86/include/uapi/asm/sgx.h
index 7bf627ac4958..3b80acde8671 100644
--- a/arch/x86/include/uapi/asm/sgx.h
+++ b/arch/x86/include/uapi/asm/sgx.h
@@ -16,6 +16,8 @@
_IOW(SGX_MAGIC, 0x01, struct sgx_enclave_add_page)
  #define SGX_IOC_ENCLAVE_INIT \
_IOW(SGX_MAGIC, 0x02, struct sgx_enclave_init)
+#define SGX_IOC_ENCLAVE_SET_ATTRIBUTE \
+   _IOW(SGX_MAGIC, 0x03, struct sgx_enclave_set_attribute)


Need to update Documentation/ioctl/ioctl-number.txt as well

--
Jethro Beekman | Fortanix




Re: [PATCH 1/2] dt-bindings: iio: tsl2772: convert bindings to YAML format

2019-04-23 Thread Rob Herring

On Mon, Apr 22, 2019 at 7:52 AM Jonathan Cameron  wrote:
>
> On Tue, 16 Apr 2019 04:45:51 -0400
> Brian Masney  wrote:
>
> > Convert the tsl2772 device tree bindings to the new YAML format.
> >
> > Signed-off-by: Brian Masney 
> Hi Brian,
>
> Good to see this.  I'm afraid it's all a bit new to me so what
> I haven't yet understood is how prescriptive we should be.
> For example, are the phandle references below needed or not?
>
> So for a while yet I'm going to be relying on Rob and others
> to review these and put me on the right track.
>
> Jonathan
>
> > ---
> >  .../devicetree/bindings/iio/light/tsl2772.txt | 42 -
> >  .../bindings/iio/light/tsl2772.yaml   | 85 +++
> >  2 files changed, 85 insertions(+), 42 deletions(-)
> >  delete mode 100644 Documentation/devicetree/bindings/iio/light/tsl2772.txt
> >  create mode 100644 Documentation/devicetree/bindings/iio/light/tsl2772.yaml
> >

> > diff --git a/Documentation/devicetree/bindings/iio/light/tsl2772.yaml 
> > b/Documentation/devicetree/bindings/iio/light/tsl2772.yaml
> > new file mode 100644
> > index ..b3ac182288d2
> > --- /dev/null
> > +++ b/Documentation/devicetree/bindings/iio/light/tsl2772.yaml
> > @@ -0,0 +1,85 @@
> > +# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause)

Do you have the rights on the original file to add BSD license?

> > +%YAML 1.2
> > +---
> > +$id: http://devicetree.org/schemas/iio/light/tsl2772.yaml#
> > +$schema: http://devicetree.org/meta-schemas/core.yaml#
> > +
> > +title: AMS/TAOS Ambient Light Sensor (ALS) and Proximity Detector
> > +
> > +maintainers:
> > +  - Brian Masney 
> > +
> > +description: |
> > +  Ambient light sensing and proximity detection with an i2c interface.
> > +  https://ams.com/documents/20143/36005/TSL2772_DS000181_2-00.pdf
> > +
> > +properties:
> > +  compatible:
> > +enum:
> > +  - amstaos,tsl2571
> > +  - amstaos,tsl2671
> > +  - amstaos,tmd2671
> > +  - amstaos,tsl2771
> > +  - amstaos,tmd2771
> > +  - amstaos,tsl2572
> > +  - amstaos,tsl2672
> > +  - amstaos,tmd2672
> > +  - amstaos,tsl2772
> > +  - amstaos,tmd2772
> > +  - avago,apds9930
> > +
> > +  reg:
> > +description: The I2C address of the device

No need for description on common properties unless you have something
unique to add.

> > +maxItems: 1
> > +
> > +  amstaos,proximity-diodes:
> > +description: Proximity diodes to enable
> > +allOf:
> > +  - $ref: /schemas/types.yaml#/definitions/uint32-array
> > +  - minItems: 1
> > +maxItems: 2
> > +items:
> > +  minimum: 0
> > +  maximum: 1
>
> Do we need to represent that these can't be <1 0> ?
> (specified in old docs)
> We also have a tighter spec than the uint32-array format in types.yaml
> as don't allow <0>, <1> under the current binding where only <0, 1> is
> allowed.

The uint32-array definition is loose to avoid lots of warnings on
common properties where we can't define the exact size. Over time we
need to tighten up the dts syntax to be consistent (I have a dtc patch
which can spew out lots of warnings to fix these).

So this case is correct. '<0, 1>' means a single array with 2
elements. '<0>, <1>' means 2 arrays each with a single element. There
shouldn't be lots of occurrences.

> > +
> > +  interrupts:
> > +description: Interrupt generated by the device

No need for description.

> > +maxItems: 1
> > +
> > +  led-max-microamp:
> > +description: Current for the proximity LED

This device only has proximity LEDs, right?

> > +allOf:
> > +  - $ref: /schemas/types.yaml#/definitions/uint32
> > +  - enum: [13000, 25000, 5, 10]

We should assume a common schema for 'led-max-microamp', so we just
need the enum.

> > +
> > +  vdd-supply:
> > +$ref: /schemas/types.yaml#/definitions/phandle
> > +description: Regulator that provides power to the sensor

Same here for '*-supply'. So just the description is enough.

> > +
> > +  vddio-supply:
> > +$ref: /schemas/types.yaml#/definitions/phandle
> > +description: Regulator that provides power to the bus
> > +
> > +required:
> > +  - compatible
> > +  - reg
> > +
> > +examples:
> > +  - |
> > +#include 
> > +
> > +i2c {
> > +#address-cells = <1>;
> > +#size-cells = <0>;
> > +
> > +tsl2772@39 {

proximity-sensor@39

> > +compatible = "amstaos,tsl2772";
> > +reg = <0x39>;
> > +interrupts-extended = < 61 IRQ_TYPE_EDGE_FALLING>;
> > +vdd-supply = <_l17>;
> > +vddio-supply = <_lvs1>;
> > +amstaos,proximity-diodes = <0>;
> > +led-max-microamp = <10>;
> > +};
> > +};
> > +...
>

[PATCH] clk: qoriq: Add ls1028a clock configuration

2019-04-23 Thread andy . tang

From: Yuantian Tang 

Enable clock driver by adding clock configuration for ls1028a chip.

Signed-off-by: Yuantian Tang 
---
 drivers/clk/clk-qoriq.c |   68 +++
 1 files changed, 68 insertions(+), 0 deletions(-)

diff --git a/drivers/clk/clk-qoriq.c b/drivers/clk/clk-qoriq.c
index 1212a9b..8b0cb0b 100644
--- a/drivers/clk/clk-qoriq.c
+++ b/drivers/clk/clk-qoriq.c
@@ -245,6 +245,58 @@ static u32 cg_in(struct clockgen *cg, u32 __iomem *reg)
},
 };
 
+static const struct clockgen_muxinfo ls1028a_hwa1 = {
+   {
+   { CLKSEL_VALID, PLATFORM_PLL, PLL_DIV1 },
+   { CLKSEL_VALID, CGA_PLL1, PLL_DIV1 },
+   { CLKSEL_VALID, CGA_PLL1, PLL_DIV2 },
+   { CLKSEL_VALID, CGA_PLL1, PLL_DIV3 },
+   { CLKSEL_VALID, CGA_PLL1, PLL_DIV4 },
+   {},
+   { CLKSEL_VALID, CGA_PLL2, PLL_DIV2 },
+   { CLKSEL_VALID, CGA_PLL2, PLL_DIV3 },
+   },
+};
+
+static const struct clockgen_muxinfo ls1028a_hwa2 = {
+   {
+   { CLKSEL_VALID, PLATFORM_PLL, PLL_DIV1 },
+   { CLKSEL_VALID, CGA_PLL2, PLL_DIV1 },
+   { CLKSEL_VALID, CGA_PLL2, PLL_DIV2 },
+   { CLKSEL_VALID, CGA_PLL2, PLL_DIV3 },
+   { CLKSEL_VALID, CGA_PLL2, PLL_DIV4 },
+   {},
+   { CLKSEL_VALID, CGA_PLL1, PLL_DIV2 },
+   { CLKSEL_VALID, CGA_PLL1, PLL_DIV3 },
+   },
+};
+
+static const struct clockgen_muxinfo ls1028a_hwa3 = {
+   {
+   { CLKSEL_VALID, PLATFORM_PLL, PLL_DIV1 },
+   { CLKSEL_VALID, CGA_PLL1, PLL_DIV1 },
+   { CLKSEL_VALID, CGA_PLL1, PLL_DIV2 },
+   { CLKSEL_VALID, CGA_PLL1, PLL_DIV3 },
+   { CLKSEL_VALID, CGA_PLL1, PLL_DIV4 },
+   {},
+   { CLKSEL_VALID, CGA_PLL2, PLL_DIV2 },
+   { CLKSEL_VALID, CGA_PLL2, PLL_DIV3 },
+   },
+};
+
+static const struct clockgen_muxinfo ls1028a_hwa4 = {
+   {
+   { CLKSEL_VALID, PLATFORM_PLL, PLL_DIV1 },
+   { CLKSEL_VALID, CGA_PLL2, PLL_DIV1 },
+   { CLKSEL_VALID, CGA_PLL2, PLL_DIV2 },
+   { CLKSEL_VALID, CGA_PLL2, PLL_DIV3 },
+   { CLKSEL_VALID, CGA_PLL2, PLL_DIV4 },
+   {},
+   { CLKSEL_VALID, CGA_PLL1, PLL_DIV2 },
+   { CLKSEL_VALID, CGA_PLL1, PLL_DIV3 },
+   },
+};
+
 static const struct clockgen_muxinfo ls1043a_hwa1 = {
{
{},
@@ -508,6 +560,21 @@ static void __init t4240_init_periph(struct clockgen *cg)
.pll_mask = 0x03,
},
{
+   .compat = "fsl,ls1028a-clockgen",
+   .cmux_groups = {
+   _cmux_cga12
+   },
+   .hwaccel = {
+   _hwa1, _hwa2,
+   _hwa3, _hwa4
+   },
+   .cmux_to_group = {
+   0, 0, 0, 0, -1
+   },
+   .pll_mask = 0x07,
+   .flags = CG_VER3 | CG_LITTLE_ENDIAN,
+   },
+   {
.compat = "fsl,ls1043a-clockgen",
.init_periph = t2080_init_periph,
.cmux_groups = {
@@ -1423,6 +1490,7 @@ static void __init clockgen_init(struct device_node *np)
 CLK_OF_DECLARE(qoriq_clockgen_b4860, "fsl,b4860-clockgen", clockgen_init);
 CLK_OF_DECLARE(qoriq_clockgen_ls1012a, "fsl,ls1012a-clockgen", clockgen_init);
 CLK_OF_DECLARE(qoriq_clockgen_ls1021a, "fsl,ls1021a-clockgen", clockgen_init);
+CLK_OF_DECLARE(qoriq_clockgen_ls1028a, "fsl,ls1028a-clockgen", clockgen_init);
 CLK_OF_DECLARE(qoriq_clockgen_ls1043a, "fsl,ls1043a-clockgen", clockgen_init);
 CLK_OF_DECLARE(qoriq_clockgen_ls1046a, "fsl,ls1046a-clockgen", clockgen_init);
 CLK_OF_DECLARE(qoriq_clockgen_ls1088a, "fsl,ls1088a-clockgen", clockgen_init);
-- 
1.7.1

Re: [PATCH] ARM: dts: qcom-apq8064: Fix DSI PHY ref clk phandle

2019-04-23 Thread Stephen Rothwell

Hi Matthias,

On Tue, 23 Apr 2019 17:12:10 -0700 Matthias Kaehlcke  wrote:
>
> Commit 3560af5a56b5 ("ARM: dts: qcom-apq8064: Set 'xo_board' as
> ref clock of the DSI PHY") specifies the non-existing phandle
> 'xo_board' as DSI PHY ref clk. Fix this by using the correct
> phandle is 'cxo_board'.
> 
> Fixes: 3560af5a56b5 ("ARM: dts: qcom-apq8064: Set 'xo_board' as ref clock of 
> the DSI PHY")
> 
> Signed-off-by: Matthias Kaehlcke 

You should keep the tags in one group (no blank lines between them).
Also, you might like to add a Reported-by tag when you get reports from
others (unless you discovered this yourself, of course).
-- 
Cheers,
Stephen Rothwell



[PATCH v2 2/2] arm64: dts: mt8183: Add auxadc device node

2019-04-23 Thread Zhiyong Tao

Add auxadc device node for MT8183

Signed-off-by: Zhiyong Tao 
---
 arch/arm64/boot/dts/mediatek/mt8183-evb.dts |  4 
 arch/arm64/boot/dts/mediatek/mt8183.dtsi| 10 ++
 2 files changed, 14 insertions(+)

diff --git a/arch/arm64/boot/dts/mediatek/mt8183-evb.dts b/arch/arm64/boot/dts/mediatek/mt8183-evb.dts
index 9b525597e5ec..49909acc6efa 100644
--- a/arch/arm64/boot/dts/mediatek/mt8183-evb.dts
+++ b/arch/arm64/boot/dts/mediatek/mt8183-evb.dts
@@ -26,6 +26,10 @@
};
 };
 
+&auxadc {
+   status = "okay";
+};
+
  {
status = "okay";
 };
diff --git a/arch/arm64/boot/dts/mediatek/mt8183.dtsi b/arch/arm64/boot/dts/mediatek/mt8183.dtsi
index 75c4881bbe5e..57580d973316 100644
--- a/arch/arm64/boot/dts/mediatek/mt8183.dtsi
+++ b/arch/arm64/boot/dts/mediatek/mt8183.dtsi
@@ -269,6 +269,16 @@
clock-names = "spi", "wrap";
};
 
+   auxadc: auxadc@11001000 {
+   compatible = "mediatek,mt8183-auxadc",
+"mediatek,mt8173-auxadc";
+   reg = <0 0x11001000 0 0x1000>;
+   clocks = < CLK_INFRA_AUXADC>;
+   clock-names = "main";
+   #io-channel-cells = <1>;
+   status = "disabled";
+   };
+
uart0: serial@11002000 {
compatible = "mediatek,mt8183-uart",
 "mediatek,mt6577-uart";
-- 
2.12.5

[PATCH v2 1/2] dt-bindings: adc: mt8183: add binding document

2019-04-23 Thread Zhiyong Tao

Add the mt8183 compatible string to the binding document.

Signed-off-by: Zhiyong Tao 
---
 Documentation/devicetree/bindings/iio/adc/mt6577_auxadc.txt | 1 +
 1 file changed, 1 insertion(+)

diff --git a/Documentation/devicetree/bindings/iio/adc/mt6577_auxadc.txt b/Documentation/devicetree/bindings/iio/adc/mt6577_auxadc.txt
index 0df9befdaecc..936a0b4666da 100644
--- a/Documentation/devicetree/bindings/iio/adc/mt6577_auxadc.txt
+++ b/Documentation/devicetree/bindings/iio/adc/mt6577_auxadc.txt
@@ -15,6 +15,7 @@ Required properties:
 - "mediatek,mt2712-auxadc": For MT2712 family of SoCs
 - "mediatek,mt7622-auxadc": For MT7622 family of SoCs
 - "mediatek,mt8173-auxadc": For MT8173 family of SoCs
+- "mediatek,mt8183-auxadc", "mediatek,mt8173-auxadc": For MT8183 family of SoCs
   - reg: Address range of the AUXADC unit.
   - clocks: Should contain a clock specifier for each entry in clock-names
   - clock-names: Should contain "main".
-- 
2.12.5

[PATCH v2 0/2] AUXADC: Mediatek auxadc driver on MT8183

2019-04-23 Thread Zhiyong Tao

This series includes two patches:
1. Add the mt8183 auxadc compatible string to the binding document.
2. Add the mt8183 auxadc device node.

Changes in patch v2:
1) Change the auxadc compatible string in the binding document for mt8183.

Zhiyong Tao (2):
  dt-bindings: adc: mt8183: add binding document
  arm64: dts: mt8183: Add auxadc device node

 Documentation/devicetree/bindings/iio/adc/mt6577_auxadc.txt |  1 +
 arch/arm64/boot/dts/mediatek/mt8183-evb.dts |  4 
 arch/arm64/boot/dts/mediatek/mt8183.dtsi| 10 ++
 3 files changed, 15 insertions(+)

--
2.12.5

Re: [PATCH v20 15/28] x86/sgx: Add the Linux SGX Enclave Driver

2019-04-23 Thread Jethro Beekman


On 2019-04-23 17:26, Sean Christopherson wrote:
> On Tue, Apr 23, 2019 at 11:29:24PM +, Jethro Beekman wrote:
> > On 2019-04-22 14:58, Sean Christopherson wrote:
> > > Now that the core SGX code is approaching stability, I'd like to start
> > > sending RFCs for the EPC virtualization and KVM bits to hash out that side
> > > of things.  The ACPI crud is the last chunk of code that would require
> > > non-trivial changes to the core SGX code for the proposed virtualization
> > > implementation.  I'd strongly prefer to get it out of the way before
> > > sending the KVM RFCs.
> >
> > What kind of changes? Wouldn't KVM just be another consumer of the same API
> > used by the driver?
>
> Nope, userspace "only" needs to be able to mmap() arbitrary chunks of EPC.

I don't think this is sufficient. Don't you need enclave tracking in
order to support paging?


--
Jethro Beekman | Fortanix




Re: [PATCH 2/2] seccomp: disallow NEW_LISTENER and TSYNC flags

2019-04-23 Thread Kees Cook

On Tue, Apr 23, 2019 at 4:34 PM Tycho Andersen  wrote:
>
> On Tue, Apr 23, 2019 at 04:31:45PM -0700, Kees Cook wrote:
> > On Tue, Apr 23, 2019 at 3:09 PM Kees Cook  wrote:
> > >
> > > On Wed, Mar 6, 2019 at 12:14 PM Tycho Andersen  wrote:
> > > >
> > > > As the comment notes, the return codes for TSYNC and NEW_LISTENER 
> > > > conflict,
> > > > because they both return positive values, one in the case of success and
> > > > one in the case of error. So, let's disallow both of these flags 
> > > > together.
> > > >
> > > > While this is technically a userspace break, all the users I know of are
> > > > still waiting on me to land this feature in libseccomp, so I think 
> > > > it'll be
> > > > safe. Also, at present my use case doesn't require TSYNC at all, so this
> > > > isn't a big deal to disallow. If someone wanted to support this, a path
> > > > forward would be to add a new flag like
> > > > TSYNC_AND_LISTENER_YES_I_UNDERSTAND_THAT_TSYNC_WILL_JUST_RETURN_EAGAIN, 
> > > > but
> > > > the use cases are so different I don't see it really happening.
> > > >
> > > > Finally, it's worth noting that this does actually fix a UAF issue: at 
> > > > the end
> > > > of seccomp_set_mode_filter(), we have:
> > > >
> > > > if (flags & SECCOMP_FILTER_FLAG_NEW_LISTENER) {
> > > > if (ret < 0) {
> > > > listener_f->private_data = NULL;
> > > > fput(listener_f);
> > > > put_unused_fd(listener);
> > > > } else {
> > > > fd_install(listener, listener_f);
> > > > ret = listener;
> > > > }
> > > > }
> > > > out_free:
> > > > seccomp_filter_free(prepared);
> > > >
> > > > But if ret > 0 because TSYNC raced, we'll install the listener fd and 
> > > > then free
> > > > the filter out from underneath it, causing a UAF when the task closes 
> > > > it or
> > > > dies. This patch also switches the condition to be simply if (ret), so 
> > > > that
> > > > if someone does add the flag mentioned above, they won't have to 
> > > > remember
> > > > to fix this too.
> > > >
> > > > Signed-off-by: Tycho Andersen 
> > > > Fixes: 6a21cc50f0c7 ("seccomp: add a return code to trap to userspace")
> > > > CC: sta...@vger.kernel.org # v5.0+
> > >
> > > Thanks! Sorry I missed this. James, can you take this for Linus's
> > > fixes for v5.1? (Or should I send a pull request to you?)
> > >
> > > Acked-by: Kees Cook 
> > >
> > > Let's also add:
> > >
> > > Reported-by: syzbot+b562969adb2e04af3...@syzkaller.appspotmail.com
> > >
> > > > ---
> > > >  kernel/seccomp.c | 17 +++--
> > > >  1 file changed, 15 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/kernel/seccomp.c b/kernel/seccomp.c
> > > > index d0d355ded2f4..79bada51091b 100644
> > > > --- a/kernel/seccomp.c
> > > > +++ b/kernel/seccomp.c
> > > > @@ -500,7 +500,10 @@ seccomp_prepare_user_filter(const char __user 
> > > > *user_filter)
> > > >   *
> > > >   * Caller must be holding current->sighand->siglock lock.
> > > >   *
> > > > - * Returns 0 on success, -ve on error.
> > > > + * Returns 0 on success, -ve on error, or
> > > > + *   - in TSYNC mode: the pid of a thread which was either not in the 
> > > > correct
> > > > + * seccomp mode or did not have an ancestral seccomp filter
> > > > + *   - in NEW_LISTENER mode: the fd of the new listener
> > > >   */
> > > >  static long seccomp_attach_filter(unsigned int flags,
> > > >   struct seccomp_filter *filter)
> > > > @@ -1256,6 +1259,16 @@ static long seccomp_set_mode_filter(unsigned int 
> > > > flags,
> > > > if (flags & ~SECCOMP_FILTER_FLAG_MASK)
> > > > return -EINVAL;
> > > >
> > > > +   /*
> > > > +* In the successful case, NEW_LISTENER returns the new 
> > > > listener fd.
> > > > +* But in the failure case, TSYNC returns the thread that died. 
> > > > If you
> > > > +* combine these two flags, there's no way to tell whether 
> > > > something
> > > > +* succeded or failed. So, let's disallow this combination.
> > >
> > > also a tiny typo: succeeded
> > >
> > > > +*/
> > > > +   if ((flags & SECCOMP_FILTER_FLAG_TSYNC) &&
> > > > +   (flags && SECCOMP_FILTER_FLAG_NEW_LISTENER))
> >
> > also a typo: && should be &
>
> Oh, yes. Do you want me to send another version?

Nah, I fixed it up. :)


-- 
Kees Cook
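
For readers following the "&&" versus "&" exchange above, here is a minimal
standalone sketch of why the posted check misfires. It is not kernel code:
the DEMO_* names and both helper functions are hypothetical, and the bit
positions are chosen for illustration only.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative stand-ins for the two seccomp filter flags. */
#define DEMO_FLAG_TSYNC        (1UL << 0)
#define DEMO_FLAG_NEW_LISTENER (1UL << 3)

/* As posted: "flags && listener" is true whenever flags is non-zero. */
static bool demo_conflict_posted(unsigned long flags)
{
	unsigned long listener = DEMO_FLAG_NEW_LISTENER;

	return (flags & DEMO_FLAG_TSYNC) && (flags && listener);
}

/* As fixed up: the bitwise AND tests the NEW_LISTENER bit itself. */
static bool demo_conflict_fixed(unsigned long flags)
{
	unsigned long listener = DEMO_FLAG_NEW_LISTENER;

	return (flags & DEMO_FLAG_TSYNC) && (flags & listener);
}
```

With the logical operator, a caller passing TSYNC alone would be rejected
even though no listener was requested; the bitwise form rejects only the
combination of both flags.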

[PATCH] fs/quota: erase unused but set variable warning

2019-04-23 Thread Jiang Biao

Local variable *reserved* of remove_dquot_ref() is only used when
CONFIG_QUOTA_DEBUG is defined, but it is not guarded by that macro,
which leads to an unused-but-set-variable warning when compiling.

This patch wraps it in CONFIG_QUOTA_DEBUG, as is already done
in add_dquot_ref().

Signed-off-by: Jiang Biao 
---
 fs/quota/dquot.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c
index fc20e06c56ba..14ee4c6deba1 100644
--- a/fs/quota/dquot.c
+++ b/fs/quota/dquot.c
@@ -1049,7 +1049,9 @@ static void remove_dquot_ref(struct super_block *sb, int type,
struct list_head *tofree_head)
 {
struct inode *inode;
+#ifdef CONFIG_QUOTA_DEBUG
int reserved = 0;
+#endif
 
spin_lock(&sb->s_inode_list_lock);
list_for_each_entry(inode, &sb->s_inodes, i_sb_list) {
@@ -1061,8 +1063,10 @@ static void remove_dquot_ref(struct super_block *sb, int type,
 */
spin_lock(_data_lock);
if (!IS_NOQUOTA(inode)) {
+#ifdef CONFIG_QUOTA_DEBUG
if (unlikely(inode_get_rsv_space(inode) > 0))
reserved = 1;
+#endif
remove_inode_dquot_ref(inode, type, tofree_head);
}
spin_unlock(_data_lock);
-- 
2.17.2 (Apple Git-113)
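
The pattern this patch applies can be shown outside the kernel. The sketch
below is an assumption-laden illustration, not the dquot code:
DEMO_QUOTA_DEBUG stands in for CONFIG_QUOTA_DEBUG and demo_scan() is a
hypothetical function. The point is that a variable which is only read in
debug builds is also only declared and written in debug builds.

```c
#include <assert.h>

/* Hypothetical stand-in for CONFIG_QUOTA_DEBUG; uncomment to enable. */
/* #define DEMO_QUOTA_DEBUG 1 */

/*
 * Visit n entries. In debug builds, also record whether any entry still
 * holds a reservation. Returns the number of entries visited.
 */
static int demo_scan(const long *rsv, int n, int *warned)
{
	int i, visited = 0;
#ifdef DEMO_QUOTA_DEBUG
	int reserved = 0;	/* exists only in builds that also read it */
#else
	(void)rsv;		/* not inspected outside debug builds */
#endif

	for (i = 0; i < n; i++) {
#ifdef DEMO_QUOTA_DEBUG
		if (rsv[i] > 0)
			reserved = 1;
#endif
		visited++;
	}

#ifdef DEMO_QUOTA_DEBUG
	*warned = reserved;
#else
	*warned = 0;
#endif
	return visited;
}
```

Without the guards around the declaration and the assignment, a non-debug
build would set the variable and never read it, which is what the compiler
warns about.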

[RFC PATCH v5 4/4] x86/acrn: Add hypercall for ACRN guest

2019-04-23 Thread Zhao Yakui

When the ACRN hypervisor is detected, the guest needs a hypercall so that
it can query or configure some settings. For example, it can be used to
query the resources in the hypervisor and to manage the
CPU/memory/device/interrupt for the guest operating system.

So add the hypercall so that the ACRN guest can communicate with the
low-level ACRN hypervisor. It is implemented with the VMCALL instruction.

Co-developed-by: Jason Chen CJ 
Signed-off-by: Jason Chen CJ 
Signed-off-by: Zhao Yakui 
---
V1->V2: Refine the comments for the function of acrn_hypercall0/1/2
v2->v3: Use the "vmcall" mnemonic to replace hard-code byte definition
v4->v5: Use _ASM_X86_ACRN_HYPERCALL_H instead of _ASM_X86_ACRNHYPERCALL_H to
align the header file of acrn_hypercall.h
Use the "VMCALL" mnemonic in comment/commit log.
Uppercase r8/rdi/rsi/rax for hypercall parameter registers in comment.
---
 arch/x86/include/asm/acrn_hypercall.h | 82 +++
 1 file changed, 82 insertions(+)
 create mode 100644 arch/x86/include/asm/acrn_hypercall.h

diff --git a/arch/x86/include/asm/acrn_hypercall.h b/arch/x86/include/asm/acrn_hypercall.h
new file mode 100644
index 000..3594436
--- /dev/null
+++ b/arch/x86/include/asm/acrn_hypercall.h
@@ -0,0 +1,82 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef _ASM_X86_ACRN_HYPERCALL_H
+#define _ASM_X86_ACRN_HYPERCALL_H
+
+#include 
+
+#ifdef CONFIG_ACRN_GUEST
+
+/*
+ * Hypercalls for ACRN guest
+ *
+ * Hypercall number is passed in R8 register.
+ * Up to 2 arguments are passed in RDI, RSI.
+ * Return value will be placed in RAX.
+ */
+
+static inline long acrn_hypercall0(unsigned long hcall_id)
+{
+   register unsigned long r8 asm("r8") = hcall_id;
+   register long result asm("rax");
+
+   /* the hypercall is implemented with the VMCALL instruction.
+* asm indicates that inline assembler instruction is used.
+* volatile qualifier is added to avoid that it is dropped
+* because of compiler optimization.
+*/
+   asm volatile("vmcall"
+   : "=r"(result)
+   : "r"(r8));
+
+   return result;
+}
+
+static inline long acrn_hypercall1(unsigned long hcall_id,
+  unsigned long param1)
+{
+   register unsigned long r8 asm("r8") = hcall_id;
+   register long result asm("rax");
+
+   asm volatile("vmcall"
+   : "=r"(result)
+   : "D"(param1), "r"(r8));
+
+   return result;
+}
+
+static inline long acrn_hypercall2(unsigned long hcall_id,
+  unsigned long param1,
+  unsigned long param2)
+{
+   register unsigned long r8 asm("r8") = hcall_id;
+   register long result asm("rax");
+
+   asm volatile("vmcall"
+   : "=r"(result)
+   : "D"(param1), "S"(param2), "r"(r8));
+
+   return result;
+}
+
+#else
+
+static inline long acrn_hypercall0(unsigned long hcall_id)
+{
+   return -ENOTSUPP;
+}
+
+static inline long acrn_hypercall1(unsigned long hcall_id,
+  unsigned long param1)
+{
+   return -ENOTSUPP;
+}
+
+static inline long acrn_hypercall2(unsigned long hcall_id,
+  unsigned long param1,
+  unsigned long param2)
+{
+   return -ENOTSUPP;
+}
+#endif /* CONFIG_ACRN_GUEST */
+#endif /* _ASM_X86_ACRN_HYPERCALL_H */
-- 
2.7.4

[RFC PATCH v5 1/4] x86/Kconfig: Add new config symbol to unify conditional definition of hv_irq_callback_count

2019-04-23 Thread Zhao Yakui

Add a special Kconfig symbol X86_HV_CALLBACK_VECTOR so that guests
using the hypervisor interrupt callback counter can select it and thus
enable that counter. Select it when Xen or Hyper-V support is enabled.
No functional changes.

Signed-off-by: Zhao Yakui 
Reviewed-by: Borislav Petkov 
---
v3->v4: Follow the comments to refine the commit log.
v4->v5: No change
---
 arch/x86/Kconfig   | 3 +++
 arch/x86/include/asm/hardirq.h | 2 +-
 arch/x86/kernel/irq.c  | 2 +-
 arch/x86/xen/Kconfig   | 1 +
 drivers/hv/Kconfig | 1 +
 5 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 62fc3fd..2fc9297 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -791,6 +791,9 @@ config QUEUED_LOCK_STAT
  behavior of paravirtualized queued spinlocks and report
  them on debugfs.
 
+config X86_HV_CALLBACK_VECTOR
+   def_bool n
+
 source "arch/x86/xen/Kconfig"
 
 config KVM_GUEST
diff --git a/arch/x86/include/asm/hardirq.h b/arch/x86/include/asm/hardirq.h
index d9069bb..0753379 100644
--- a/arch/x86/include/asm/hardirq.h
+++ b/arch/x86/include/asm/hardirq.h
@@ -37,7 +37,7 @@ typedef struct {
 #ifdef CONFIG_X86_MCE_AMD
unsigned int irq_deferred_error_count;
 #endif
-#if IS_ENABLED(CONFIG_HYPERV) || defined(CONFIG_XEN)
+#ifdef CONFIG_X86_HV_CALLBACK_VECTOR
unsigned int irq_hv_callback_count;
 #endif
 #if IS_ENABLED(CONFIG_HYPERV)
diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c
index 59b5f2e..a147826 100644
--- a/arch/x86/kernel/irq.c
+++ b/arch/x86/kernel/irq.c
@@ -134,7 +134,7 @@ int arch_show_interrupts(struct seq_file *p, int prec)
seq_printf(p, "%10u ", per_cpu(mce_poll_count, j));
seq_puts(p, "  Machine check polls\n");
 #endif
-#if IS_ENABLED(CONFIG_HYPERV) || defined(CONFIG_XEN)
+#ifdef CONFIG_X86_HV_CALLBACK_VECTOR
if (test_bit(HYPERVISOR_CALLBACK_VECTOR, system_vectors)) {
seq_printf(p, "%*s: ", prec, "HYP");
for_each_online_cpu(j)
diff --git a/arch/x86/xen/Kconfig b/arch/x86/xen/Kconfig
index e07abef..ba5a418 100644
--- a/arch/x86/xen/Kconfig
+++ b/arch/x86/xen/Kconfig
@@ -7,6 +7,7 @@ config XEN
bool "Xen guest support"
depends on PARAVIRT
select PARAVIRT_CLOCK
+   select X86_HV_CALLBACK_VECTOR
depends on X86_64 || (X86_32 && X86_PAE)
depends on X86_LOCAL_APIC && X86_TSC
help
diff --git a/drivers/hv/Kconfig b/drivers/hv/Kconfig
index 1c1a251..cafcb97 100644
--- a/drivers/hv/Kconfig
+++ b/drivers/hv/Kconfig
@@ -6,6 +6,7 @@ config HYPERV
tristate "Microsoft Hyper-V client drivers"
depends on X86 && ACPI && X86_LOCAL_APIC && HYPERVISOR_GUEST
select PARAVIRT
+   select X86_HV_CALLBACK_VECTOR
help
  Select this option to run Linux as a Hyper-V client operating
  system.
-- 
2.7.4

[RFC PATCH v5 3/4] x86/acrn: Use HYPERVISOR_CALLBACK_VECTOR for ACRN guest upcall vector

2019-04-23 Thread Zhao Yakui

The Linux kernel uses HYPERVISOR_CALLBACK_VECTOR as the hypervisor upcall
vector, and it is already used for Xen and Hyper-V.
Once the ACRN hypervisor is detected, it uses the same vector to notify
the ACRN guest.

Co-developed-by: Jason Chen CJ 
Signed-off-by: Jason Chen CJ 
Signed-off-by: Zhao Yakui 
---
V1->V2: Remove the unused API definition of acrn_setup_intr_handler and
acrn_remove_intr_handler.
Adjust the order of header file
Add the declaration of acrn_hv_vector_handler and tracing
definition of acrn_hv_callback_vector.

v2->v3: Select the X86_HV_CALLBACK_VECTOR for ACRN guest
v3->v4: Refine the file name of acrnhyper.h to acrn.h
v4->v5: no change
---
 arch/x86/Kconfig|  1 +
 arch/x86/entry/entry_64.S   |  5 +
 arch/x86/include/asm/acrn.h | 11 +++
 arch/x86/kernel/cpu/acrn.c  | 22 ++
 4 files changed, 39 insertions(+)
 create mode 100644 arch/x86/include/asm/acrn.h

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 8dc4200..d7a10f6 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -848,6 +848,7 @@ config JAILHOUSE_GUEST
 config ACRN_GUEST
bool "ACRN Guest support"
depends on X86_64
+   select X86_HV_CALLBACK_VECTOR
help
  This option allows to run Linux as guest in ACRN hypervisor. Enabling
  this will allow the kernel to boot in virtualized environment under
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 1f0efdb..d1b8ad3 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -1129,6 +1129,11 @@ apicinterrupt3 HYPERV_STIMER0_VECTOR \
hv_stimer0_callback_vector hv_stimer0_vector_handler
 #endif /* CONFIG_HYPERV */
 
+#if IS_ENABLED(CONFIG_ACRN_GUEST)
+apicinterrupt3 HYPERVISOR_CALLBACK_VECTOR \
+   acrn_hv_callback_vector acrn_hv_vector_handler
+#endif
+
 idtentry debug do_debug has_error_code=0 paranoid=1 shift_ist=DEBUG_STACK
 idtentry int3 do_int3 has_error_code=0
 idtentry stack_segment do_stack_segment has_error_code=1
diff --git a/arch/x86/include/asm/acrn.h b/arch/x86/include/asm/acrn.h
new file mode 100644
index 000..43ab032
--- /dev/null
+++ b/arch/x86/include/asm/acrn.h
@@ -0,0 +1,11 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_X86_ACRN_H
+#define _ASM_X86_ACRN_H
+
+void acrn_hv_callback_vector(void);
+#ifdef CONFIG_TRACING
+#define trace_acrn_hv_callback_vector acrn_hv_callback_vector
+#endif
+
+void acrn_hv_vector_handler(struct pt_regs *regs);
+#endif /* _ASM_X86_ACRN_H */
diff --git a/arch/x86/kernel/cpu/acrn.c b/arch/x86/kernel/cpu/acrn.c
index f556640..d8072bf 100644
--- a/arch/x86/kernel/cpu/acrn.c
+++ b/arch/x86/kernel/cpu/acrn.c
@@ -9,7 +9,11 @@
  *
  */
 
+#include 
+#include 
+#include 
 #include 
+#include 
 
 static uint32_t __init acrn_detect(void)
 {
@@ -18,6 +22,8 @@ static uint32_t __init acrn_detect(void)
 
 static void __init acrn_init_platform(void)
 {
+   alloc_intr_gate(HYPERVISOR_CALLBACK_VECTOR,
+   acrn_hv_callback_vector);
 }
 
 static bool acrn_x2apic_available(void)
@@ -30,6 +36,22 @@ static bool acrn_x2apic_available(void)
return false;
 }
 
+static void (*acrn_intr_handler)(void);
+
+__visible void __irq_entry acrn_hv_vector_handler(struct pt_regs *regs)
+{
+   struct pt_regs *old_regs = set_irq_regs(regs);
+
+   entering_ack_irq();
+   inc_irq_stat(irq_hv_callback_count);
+
+   if (acrn_intr_handler)
+   acrn_intr_handler();
+
+   exiting_irq();
+   set_irq_regs(old_regs);
+}
+
 const __initconst struct hypervisor_x86 x86_hyper_acrn = {
.name   = "ACRN",
.detect = acrn_detect,
-- 
2.7.4
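
A small model of the handler body in this patch may help reviewers see its
behaviour while no callback is registered (the changelog notes that the
setup/remove API was dropped in v2, so nothing in this patch assigns the
pointer). This is a userspace sketch under that reading; every demo_* name
is hypothetical and interrupt entry/exit bookkeeping is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Stands in for the per-CPU irq_hv_callback_count statistic. */
static unsigned int demo_callback_count;

/* NULL until some driver installs a callback. */
static void (*demo_intr_handler)(void);

/* Models the handler body: always count, dispatch only if a callback is set. */
static void demo_vector_handler(void)
{
	demo_callback_count++;
	if (demo_intr_handler)
		demo_intr_handler();
}

/* Example callback a driver might install later. */
static int demo_seen;

static void demo_driver_cb(void)
{
	demo_seen++;
}
```

With the pointer left NULL, each upcall only increments the counter; once
a callback is installed, the same path also dispatches to it.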

[RFC PATCH v5 2/4] x86: Add the support of Linux guest on ACRN hypervisor

2019-04-23 Thread Zhao Yakui

ACRN is an open-source hypervisor maintained by the Linux Foundation.
It is built for embedded IoT with a small footprint and real-time features.
Add ACRN guest support so that Linux can be booted under the ACRN
hypervisor. The following patches set up the upcall notification vector
and enable the hypercall. Once the ACRN guest is supported, the ACRN driver
part can add the interface that is used to manage the virtualized
CPU/memory/device/interrupt for other guest systems.

Co-developed-by: Jason Chen CJ 
Signed-off-by: Jason Chen CJ 
Signed-off-by: Zhao Yakui 
---
v1->v2: Change the CONFIG_ACRN to CONFIG_ACRN_GUEST, which makes it easy to
understand.
Remove the export of x86_hyper_acrn.

v2->v3: Remove the unnecessary dependency of PARAVIRT
v3->v4: Refine the commit log and add meaningful description in Kconfig
v4->v5: Minor change for the commit log.
---
 arch/x86/Kconfig  | 12 
 arch/x86/include/asm/hypervisor.h |  1 +
 arch/x86/kernel/cpu/Makefile  |  1 +
 arch/x86/kernel/cpu/acrn.c| 39 +++
 arch/x86/kernel/cpu/hypervisor.c  |  4 
 5 files changed, 57 insertions(+)
 create mode 100644 arch/x86/kernel/cpu/acrn.c

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 2fc9297..8dc4200 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -845,6 +845,18 @@ config JAILHOUSE_GUEST
  cell. You can leave this option disabled if you only want to start
  Jailhouse and run Linux afterwards in the root cell.
 
+config ACRN_GUEST
+   bool "ACRN Guest support"
+   depends on X86_64
+   help
+ This option allows to run Linux as guest in ACRN hypervisor. Enabling
+ this will allow the kernel to boot in virtualized environment under
+ the ACRN hypervisor.
+ ACRN is a flexible, lightweight reference open-source hypervisor, built
+ with real-time and safety-criticality in mind. It is built for embedded
+ IOT with small footprint and real-time features. More details can be
+ found in https://projectacrn.org/
+
 endif #HYPERVISOR_GUEST
 
 source "arch/x86/Kconfig.cpu"
diff --git a/arch/x86/include/asm/hypervisor.h b/arch/x86/include/asm/hypervisor.h
index 8c5aaba..50a30f6 100644
--- a/arch/x86/include/asm/hypervisor.h
+++ b/arch/x86/include/asm/hypervisor.h
@@ -29,6 +29,7 @@ enum x86_hypervisor_type {
X86_HYPER_XEN_HVM,
X86_HYPER_KVM,
X86_HYPER_JAILHOUSE,
+   X86_HYPER_ACRN,
 };
 
 #ifdef CONFIG_HYPERVISOR_GUEST
diff --git a/arch/x86/kernel/cpu/Makefile b/arch/x86/kernel/cpu/Makefile
index cfd24f9..17a7cdf 100644
--- a/arch/x86/kernel/cpu/Makefile
+++ b/arch/x86/kernel/cpu/Makefile
@@ -44,6 +44,7 @@ obj-$(CONFIG_X86_CPU_RESCTRL) += resctrl/
 obj-$(CONFIG_X86_LOCAL_APIC)   += perfctr-watchdog.o
 
 obj-$(CONFIG_HYPERVISOR_GUEST) += vmware.o hypervisor.o mshyperv.o
+obj-$(CONFIG_ACRN_GUEST)   += acrn.o
 
 ifdef CONFIG_X86_FEATURE_NAMES
 quiet_cmd_mkcapflags = MKCAP   $@
diff --git a/arch/x86/kernel/cpu/acrn.c b/arch/x86/kernel/cpu/acrn.c
new file mode 100644
index 000..f556640
--- /dev/null
+++ b/arch/x86/kernel/cpu/acrn.c
@@ -0,0 +1,39 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * ACRN detection support
+ *
+ * Copyright (C) 2019 Intel Corporation. All rights reserved.
+ *
+ * Jason Chen CJ 
+ * Zhao Yakui 
+ *
+ */
+
+#include 
+
+static uint32_t __init acrn_detect(void)
+{
+   return hypervisor_cpuid_base("ACRNACRNACRN\0\0", 0);
+}
+
+static void __init acrn_init_platform(void)
+{
+}
+
+static bool acrn_x2apic_available(void)
+{
+   /* x2apic is not supported now.
+* Later it needs to check the X86_FEATURE_X2APIC bit of cpu info
+* returned by CPUID to determine whether the x2apic is
+* supported in Linux guest.
+*/
+   return false;
+}
+
+const __initconst struct hypervisor_x86 x86_hyper_acrn = {
+   .name   = "ACRN",
+   .detect = acrn_detect,
+   .type   = X86_HYPER_ACRN,
+   .init.init_platform = acrn_init_platform,
+   .init.x2apic_available  = acrn_x2apic_available,
+};
diff --git a/arch/x86/kernel/cpu/hypervisor.c b/arch/x86/kernel/cpu/hypervisor.c
index 479ca47..87e39ad 100644
--- a/arch/x86/kernel/cpu/hypervisor.c
+++ b/arch/x86/kernel/cpu/hypervisor.c
@@ -32,6 +32,7 @@ extern const struct hypervisor_x86 x86_hyper_xen_pv;
 extern const struct hypervisor_x86 x86_hyper_xen_hvm;
 extern const struct hypervisor_x86 x86_hyper_kvm;
 extern const struct hypervisor_x86 x86_hyper_jailhouse;
+extern const struct hypervisor_x86 x86_hyper_acrn;
 
 static const __initconst struct hypervisor_x86 * const hypervisors[] =
 {
@@ -49,6 +50,9 @@ static const __initconst struct hypervisor_x86 * const hypervisors[] =
 #ifdef CONFIG_JAILHOUSE_GUEST
    &x86_hyper_jailhouse,
 #endif
+#ifdef CONFIG_ACRN_GUEST
+   &x86_hyper_acrn,
+#endif
 };
 
 enum x86_hypervisor_type
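
The detection step in acrn_detect() relies on a 12-byte signature returned
by CPUID. The sketch below covers only the comparison, under the usual
convention that the signature bytes arrive in EBX, ECX and EDX in that
order, least significant byte first. demo_sig_matches() is a hypothetical
helper, not the kernel's hypervisor_cpuid_base(), and it does not execute
CPUID itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/*
 * Compare the 12 signature bytes held in EBX, ECX, EDX against sig.
 * sig must point to at least 12 bytes.
 */
static bool demo_sig_matches(uint32_t ebx, uint32_t ecx, uint32_t edx,
			     const char *sig)
{
	const uint32_t regs[3] = { ebx, ecx, edx };
	unsigned char bytes[12];
	int i;

	assert(sig != NULL);

	/* Unpack each register byte by byte, lowest byte first. */
	for (i = 0; i < 12; i++)
		bytes[i] = (regs[i / 4] >> (8 * (i % 4))) & 0xff;

	return memcmp(bytes, sig, 12) == 0;
}
```

For "ACRNACRNACRN" each of the three registers would hold 0x4E524341,
which is 'A', 'C', 'R', 'N' read from the low byte upward.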

[RFC PATCH v5 0/4] x86: Add the support of ACRN guest under x86

2019-04-23 Thread Zhao Yakui

ACRN is a flexible, lightweight reference hypervisor, built with real-time
and safety-criticality in mind, optimized to streamline embedded development
through an open source platform. It is built for embedded IOT with small
footprint and real-time features. More details can be found
in https://projectacrn.org/

This patch set allows Linux to run as a guest on the ACRN hypervisor, and it
works together with the follow-up patch set that manages Linux guests on the
ACRN hypervisor. It includes detection of the ACRN hypervisor, the upcall
notification vector from the hypervisor, and the hypercall. The hypervisor
detection is similar to Xen/VMware/Hyper-V. ACRN also uses an upcall
notification mechanism similar to that of Xen/Microsoft Hyper-V when it needs
to send a notification to the Linux OS. The hypercall provides the mechanism
by which a Linux guest can query/configure the ACRN hypervisor.

Following this patch set, we will send the ACRN driver part, which provides
the interface that can be used to manage the virtualized
CPU/memory/device/interrupt for other guest OSes after the ACRN hypervisor is
detected.

v1->v2: Change the CONFIG_ACRN to CONFIG_ACRN_GUEST, which makes it easy to
understand.
Remove the export of x86_hyper_acrn.
Remove the unused API definition of acrn_setup_intr_handler and
acrn_remove_intr_handler.
Adjust the order of header file
Add the declaration of acrn_hv_vector_handler and tracing
definition of acrn_hv_callback_vector.
Refine the comments for the function of acrn_hypercall0/1/2

v2->v3: Add one new config symbol to unify the conditional definition
of hv_irq_callback_count
Use the "vmcall" mnemonic to replace the hard-code byte definition
Remove the unnecessary dependency of CONFIG_PARAVIRT for ACRN_GUEST

v3->v4: Rename acrnhyper.h to acrn.h
Refine the commit log and some other minor changes (more comments and
redundant ifdef in acrn.h, sorting the header file in acrn.c)

v4->v5: Minor changes of comments/commit log in patch 04
Use _ASM_X86_ACRN_HYPERCALL_H instead of _ASM_X86_ACRNHYPERCALL_H.
Use the "VMCALL" mnemonic in comment/commit log.
Uppercase r8/rdi/rsi/rax for hypercall parameter register in comment.

Zhao Yakui (4):
  x86/Kconfig: Add new config symbol to unify conditional definition of
hv_irq_callback_count
  x86: Add the support of Linux guest on ACRN hypervisor
  x86/acrn: Use HYPERVISOR_CALLBACK_VECTOR for ACRN guest upcall vector
  x86/acrn: Add hypercall for ACRN guest

 arch/x86/Kconfig  | 16 +++
 arch/x86/entry/entry_64.S |  5 +++
 arch/x86/include/asm/acrn.h   | 11 +
 arch/x86/include/asm/acrn_hypercall.h | 81 +++
 arch/x86/include/asm/hardirq.h|  2 +-
 arch/x86/include/asm/hypervisor.h |  1 +
 arch/x86/kernel/cpu/Makefile  |  1 +
 arch/x86/kernel/cpu/acrn.c| 61 ++
 arch/x86/kernel/cpu/hypervisor.c  |  4 ++
 arch/x86/kernel/irq.c |  2 +-
 arch/x86/xen/Kconfig  |  1 +
 drivers/hv/Kconfig|  1 +
 12 files changed, 184 insertions(+), 2 deletions(-)
 create mode 100644 arch/x86/include/asm/acrn.h
 create mode 100644 arch/x86/include/asm/acrn_hypercall.h
 create mode 100644 arch/x86/kernel/cpu/acrn.c

-- 
2.7.4

Re: [PATCH] platform/x86: asus-wmi: Add fn-lock mode switch support

2019-04-23 Thread Chris Chiu

On Thu, Apr 18, 2019 at 2:46 PM Chris Chiu  wrote:
>
> Some of latest ASUS laptops support new fn-lock mode switching.
> This commit detect whether if the fn-lock option is enabled in
> BIOS setting, and toggle the fn-lock mode via a new WMI DEVID
> 0x00100023 when the corresponding notify code captured.
>
> The ASUS fn-lock mode switch is activated by pressing Fn+Esc.
> When on, keys F1 to F12 behave as applicable, with meanings
> defined by the application being used at the time. When off,
> F1 to F12 directly triggers hardware features, well known audio
> volume up/down, brightness up/down...etc, which were triggered
> by holding down Fn key and F-keys.
>
> Because there's no way to retrieve the fn-lock mode via existing
> WMI methods per ASUS spec, driver need to initialize and keep the
> fn-lock mode by itself.
>
> Signed-off-by: Chris Chiu 
> ---
>  drivers/platform/x86/asus-wmi.c| 36 
> ++
>  include/linux/platform_data/x86/asus-wmi.h |  1 +
>  2 files changed, 37 insertions(+)
>
> diff --git a/drivers/platform/x86/asus-wmi.c b/drivers/platform/x86/asus-wmi.c
> index 37b5de541270..5f52b66e40cb 100644
> --- a/drivers/platform/x86/asus-wmi.c
> +++ b/drivers/platform/x86/asus-wmi.c
> @@ -69,6 +69,7 @@ MODULE_LICENSE("GPL");
>  #define NOTIFY_KBD_BRTUP   0xc4
>  #define NOTIFY_KBD_BRTDWN  0xc5
>  #define NOTIFY_KBD_BRTTOGGLE   0xc7
> +#define NOTIFY_FNLOCK_TOGGLE   0x4e
>
>  #define ASUS_FAN_DESC  "cpu_fan"
>  #define ASUS_FAN_MFUN  0x13
> @@ -177,6 +178,8 @@ struct asus_wmi {
> struct workqueue_struct *hotplug_workqueue;
> struct work_struct hotplug_work;
>
> +   bool fnlock_locked;
> +
> struct asus_wmi_debug debug;
>
> struct asus_wmi_driver *driver;
> @@ -1619,6 +1622,24 @@ static int is_display_toggle(int code)
> return 0;
>  }
>
> +static bool asus_wmi_has_fnlock_key(struct asus_wmi *asus)
> +{
> +#define ASUS_WMI_FNLOCK_BIOS_DISABLED  BIT(0)
> +   u32 result;
> +
> +   asus_wmi_get_devstate(asus, ASUS_WMI_DEVID_FNLOCK, &result);
> +
> +   return (result & ASUS_WMI_DSTS_PRESENCE_BIT) &&
> +   !(result & ASUS_WMI_FNLOCK_BIOS_DISABLED);
> +}
> +
> +static void asus_wmi_fnlock_update(struct asus_wmi *asus)
> +{
> +   int mode = asus->fnlock_locked;
> +
> +   asus_wmi_set_devstate(ASUS_WMI_DEVID_FNLOCK, mode, NULL);
> +}
> +
>  static void asus_wmi_notify(u32 value, void *context)
>  {
> struct asus_wmi *asus = context;
> @@ -1680,6 +1701,12 @@ static void asus_wmi_notify(u32 value, void *context)
> goto exit;
> }
>
> +   if (code == NOTIFY_FNLOCK_TOGGLE) {
> +   asus->fnlock_locked = !asus->fnlock_locked;
> +   asus_wmi_fnlock_update(asus);
> +   goto exit;
> +   }
> +
> if (is_display_toggle(code) &&
> asus->driver->quirks->no_display_toggle)
> goto exit;
> @@ -2134,6 +2161,11 @@ static int asus_wmi_add(struct platform_device *pdev)
> } else
> err = asus_wmi_set_devstate(ASUS_WMI_DEVID_BACKLIGHT, 2, 
> NULL);
>
> +   if (asus_wmi_has_fnlock_key(asus)) {
> +   asus->fnlock_locked = true;
> +   asus_wmi_fnlock_update(asus);
> +   }
> +
> status = wmi_install_notify_handler(asus->driver->event_guid,
> asus_wmi_notify, asus);
> if (ACPI_FAILURE(status)) {
> @@ -2213,6 +2245,8 @@ static int asus_hotk_resume(struct device *device)
> if (!IS_ERR_OR_NULL(asus->kbd_led.dev))
> kbd_led_update(asus);
>
> +   if (asus_wmi_has_fnlock_key(asus))
> +   asus_wmi_fnlock_update(asus);
> return 0;
>  }
>
> @@ -2249,6 +2283,8 @@ static int asus_hotk_restore(struct device *device)
> if (!IS_ERR_OR_NULL(asus->kbd_led.dev))
> kbd_led_update(asus);
>
> +   if (asus_wmi_has_fnlock_key(asus))
> +   asus_wmi_fnlock_update(asus);
> return 0;
>  }
>
> diff --git a/include/linux/platform_data/x86/asus-wmi.h 
> b/include/linux/platform_data/x86/asus-wmi.h
> index 53dfc2541960..bfba245636a7 100644
> --- a/include/linux/platform_data/x86/asus-wmi.h
> +++ b/include/linux/platform_data/x86/asus-wmi.h
> @@ -67,6 +67,7 @@
>  /* Input */
>  #define ASUS_WMI_DEVID_TOUCHPAD0x00100011
>  #define ASUS_WMI_DEVID_TOUCHPAD_LED0x00100012
> +#define ASUS_WMI_DEVID_FNLOCK  0x00100023
>
>  /* Fan, Thermal */
>  #define ASUS_WMI_DEVID_THERMAL_CTRL0x00110011
> --
> 2.11.0
>

Gentle ping. Any comments or suggestions for this are appreciated.

Chris
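
To make the state handling described in the commit message easier to
review, here is a small model of it. It assumes, as the commit message
says, that the firmware mode cannot be read back, so the driver caches it.
The demo_* types and functions are hypothetical and only simulate the
write to firmware.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the driver-side state; not the asus-wmi structures. */
struct demo_fnlock {
	bool locked;		/* cached mode, since it cannot be queried */
	int last_written;	/* last value "sent" to firmware, -1 if none */
};

/* Push the cached mode to the (simulated) firmware. */
static void demo_fnlock_update(struct demo_fnlock *f)
{
	f->last_written = f->locked ? 1 : 0;
}

/* Probe time: start from a known mode and write it out. */
static void demo_fnlock_init(struct demo_fnlock *f)
{
	f->locked = true;
	f->last_written = -1;
	demo_fnlock_update(f);
}

/* Fn+Esc notification: flip the cached mode, then write it out. */
static void demo_fnlock_toggle(struct demo_fnlock *f)
{
	f->locked = !f->locked;
	demo_fnlock_update(f);
}
```

On resume or restore, calling demo_fnlock_update() again re-applies the
cached mode without changing it, which matches what the patch does in
asus_hotk_resume() and asus_hotk_restore().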

Re: [PATCH 1/2] dt-bindings: adc: mt8183: add binding document

2019-04-23 Thread Zhiyong Tao

On Tue, 2019-04-23 at 16:35 +0200, Matthias Brugger wrote:
> 
> On 22/04/2019 13:54, Zhiyong Tao wrote:
> > The commit adds mt8183 compatible node in binding document.
> > 
> > Signed-off-by: Zhiyong Tao 
> > ---
> >  Documentation/devicetree/bindings/iio/adc/mt6577_auxadc.txt | 1 +
> >  1 file changed, 1 insertion(+)
> > 
> > diff --git a/Documentation/devicetree/bindings/iio/adc/mt6577_auxadc.txt b/Documentation/devicetree/bindings/iio/adc/mt6577_auxadc.txt
> > index 0df9befdaecc..05bc79d8483c 100644
> > --- a/Documentation/devicetree/bindings/iio/adc/mt6577_auxadc.txt
> > +++ b/Documentation/devicetree/bindings/iio/adc/mt6577_auxadc.txt
> > @@ -15,6 +15,7 @@ Required properties:
> >  - "mediatek,mt2712-auxadc": For MT2712 family of SoCs
> >  - "mediatek,mt7622-auxadc": For MT7622 family of SoCs
> >  - "mediatek,mt8173-auxadc": For MT8173 family of SoCs
> > +- "mediatek,mt8183-auxadc": For MT8183 family of SoCs
> >- reg: Address range of the AUXADC unit.
> >- clocks: Should contain a clock specifier for each entry in clock-names
> >- clock-names: Should contain "main".
> > 
> 
> You are missing the logic in the driver to bind against this compatible.
> If there is nothing different from other SoCs then you could add a compatible
> with a fallback, like:
> 
> "mediatek,mt8183-auxadc", "mediatek,mt7622-auxadc": For MT8183 family of SoCs

==> Thanks for your suggestion. In v2, we will add the comment here:
- "mediatek,mt8183-auxadc", "mediatek,mt8173-auxadc": For MT8183 family
of SoCs.

> 
> Regards,
> Matthias

Re: [PATCH v4] tpm: fix an invalid condition in tpm_common_poll

2019-04-23 Thread Sasha Levin


On Tue, Apr 23, 2019 at 10:54:47PM +0200, Jonas Witschel wrote:

On 2019-04-09 15:44, Jarkko Sakkinen wrote:

On Mon, Apr 08, 2019 at 02:01:38PM +0200, Thibaut Sautereau wrote:

[...]
What's the status of this patch now? It's needed in linux-5.0.y as TPM
2.0 support is currently broken with those stable kernels without this
commit.


part of a PR.

https://lore.kernel.org/linux-integrity/20190329115544.ga27...@linux.intel.com/


It appears that the final version of the patch that was merged to
Linus's tree [1] does not include the "Cc: sta...@vger.kernel.org" tag.
If I understand correctly, this means that the patch will not be
automatically included in the -stable tree without further action. Is
there a specific reason not to apply this patch to 5.0.x, or did the tag
just get lost in the merge process?


Good catch; I see that Jarkko had the same comment on v3 but v4 ended up
being without the -stable tag without any explanation. I've queued this
for 5.0, it doesn't seem relevant for older branches.

--
Thanks,
Sasha

Re: [GIT PULL] arch: add pidfd and io_uring syscalls everywhere

2019-04-23 Thread Dmitry V. Levin

Hi,

On Tue, Apr 23, 2019 at 09:28:48PM +0200, Arnd Bergmann wrote:
> The following changes since commit 9e98c678c2d6ae3a17cb2de55d17f69dddaa231b:
> 
>   Linux 5.1-rc1 (2019-03-17 14:22:26 -0700)
> 
> are available in the Git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic.git
> syscalls-5.1
> 
> for you to fetch changes up to 39036cd2727395c3369b1051005da74059a85317:
> 
>   arch: add pidfd and io_uring syscalls everywhere (2019-04-15 16:31:17 +0200)
> 
> 
> arch: add pidfd and io_uring syscalls everywhere
> 
> This comes a bit late, but should be in 5.1 anyway: we want the newly
> added system calls to be synchronized across all architectures in
> the release.
> 
> I hope that in the future, any newly added system calls can be added
> to all architectures at the same time, and tested there while they
> are in linux-next, avoiding dependencies between the architecture
> maintainer trees and the tree that contains the new system call.

Does "everywhere" really mean everywhere?
The reason I'm asking this question is that sh64 seems to be excluded:
arch/sh/kernel/syscalls_64.S hasn't got any syscall entries since commit
v4.8-rc1~15^2~3.  Is sh64 supported in any way at all?


-- 
ldv



Re: [PATCH v20 15/28] x86/sgx: Add the Linux SGX Enclave Driver

2019-04-23 Thread Sean Christopherson

On Tue, Apr 23, 2019 at 11:29:24PM +, Jethro Beekman wrote:
> On 2019-04-22 14:58, Sean Christopherson wrote:
> >+Cc Jethro
> >
> >On Wed, Apr 17, 2019 at 01:39:25PM +0300, Jarkko Sakkinen wrote:
> >>Intel Software Guard eXtensions (SGX) is a set of CPU instructions that
> >>can be used by applications to set aside private regions of code and
> >>data. The code outside the enclave is disallowed to access the memory
> >>inside the enclave by the CPU access control.
> >>
> >>This commit adds the Linux SGX Enclave Driver that provides an ioctl API
> >>to manage enclaves. The address range for an enclave, commonly referred
> >>as ELRANGE in the documentation (e.g. Intel SDM), is reserved with
> >>mmap() against /dev/sgx/enclave. After that a set ioctls is used to
> >>build the enclave to the ELRANGE.
> >>
> >>Signed-off-by: Jarkko Sakkinen 
> >>Co-developed-by: Sean Christopherson 
> >>Signed-off-by: Sean Christopherson 
> >>Co-developed-by: Serge Ayoun 
> >>Signed-off-by: Serge Ayoun 
> >>Co-developed-by: Shay Katz-zamir 
> >>Signed-off-by: Shay Katz-zamir 
> >>Co-developed-by: Suresh Siddha 
> >>Signed-off-by: Suresh Siddha 
> >>---
> >
> >...
> >
> >>+#ifdef CONFIG_ACPI
> >>+static struct acpi_device_id sgx_device_ids[] = {
> >>+   {"INT0E0C", 0},
> >>+   {"", 0},
> >>+};
> >>+MODULE_DEVICE_TABLE(acpi, sgx_device_ids);
> >>+#endif
> >>+
> >>+static struct platform_driver sgx_drv = {
> >>+   .probe = sgx_drv_probe,
> >>+   .remove = sgx_drv_remove,
> >>+   .driver = {
> >>+   .name   = "sgx",
> >>+   .acpi_match_table   = ACPI_PTR(sgx_device_ids),
> >>+   },
> >>+};
> >
> >Where do we stand on removing the ACPI and platform_driver dependencies?
> >Can we get rid of them sooner rather than later?
> 
> You know my position on this...
> https://www.spinics.net/lists/linux-sgx/msg00624.html . I don't really have
> any new arguments.
> 
> Considering the amount of planned changes for the driver post-merge, I think
> it's crucial that the driver part can be swapped out with alternative
> implementations.

This gets far outside of my area of expertise as I think this is more of
a policy question as opposed to a technical question, e.g. do we export
functions simply to allow out-of-tree alternatives?

> >Now that the core SGX code is approaching stability, I'd like to start
> >sending RFCs for the EPC virtualization and KVM bits to hash out that side
> >of things.  The ACPI crud is the last chunk of code that would require
> >non-trivial changes to the core SGX code for the proposed virtualization
> >implementation.  I'd strongly prefer to get it out of the way before
> >sending the KVM RFCs.
> 
> What kind of changes? Wouldn't KVM just be another consumer of the same API
> used by the driver?

Nope, userspace "only" needs to be able to mmap() arbitrary chunks of EPC.
Except for EPC management, which is already built into the kernel, the
EPC virtualization code has effectively zero overlap with the driver.  Of
course this is all technically speculative since none of this is upstream...

Re: [v2 PATCH] mm: thp: fix false negative of shmem vma's THP eligibility

2019-04-23 Thread Yang Shi





On 4/23/19 11:34 AM, Yang Shi wrote:



On 4/23/19 10:52 AM, Michal Hocko wrote:

On Wed 24-04-19 00:43:01, Yang Shi wrote:
The commit 7635d9cbe832 ("mm, thp, proc: report THP eligibility for each
vma") introduced the THPeligible bit for processes' smaps. But, when checking
the eligibility for a shmem vma, __transparent_hugepage_enabled() is
called to override the result from shmem_huge_enabled().  It may result
in the anonymous vma's THP flag overriding shmem's.  For example, running a
simple test which creates THP for shmem, but with anonymous THP disabled,
when reading the process's smaps, it may show:

7fc92ec0-7fc92f00 rw-s  00:14 27764 /dev/shm/test
Size:   4096 kB
...
[snip]
...
ShmemPmdMapped: 4096 kB
...
[snip]
...
THPeligible:    0

And, /proc/meminfo does show THP allocated and PMD mapped too:

ShmemHugePages: 4096 kB
ShmemPmdMapped: 4096 kB

This doesn't make too much sense.  The anonymous THP flag should not
interfere with shmem THP.  Calling shmem_huge_enabled() together with
checking MMF_DISABLE_THP sounds good enough.  And, we could skip the stack
and dax vma checks since we have already checked that the vma is shmem.

Kirill, can we get a confirmation that this is really intended behavior
rather than an omission please? Is this documented? What is a global
knob to simply disable THP system wise?

I have to say that the THP tuning API is one giant mess :/

Btw. this patch also seems to fix khugepaged behavior because it previously
ignored both VM_NOHUGEPAGE and MMF_DISABLE_THP.


Second look shows this is not ignored. hugepage_vma_check() would check 
this for both anonymous vma and shmem vma before scanning. It is called 
before shmem_huge_enabled().




Aha, I didn't notice this. It looks like we need to separate the patch to fix
that khugepaged problem for both 5.1-rc and LTS.




Fixes: 7635d9cbe832 ("mm, thp, proc: report THP eligibility for each vma")

Cc: Michal Hocko 
Cc: Vlastimil Babka 
Cc: David Rientjes 
Cc: Kirill A. Shutemov 
Signed-off-by: Yang Shi 
---
v2: Check VM_NOHUGEPAGE per Michal Hocko

  mm/huge_memory.c | 4 ++--
  mm/shmem.c   | 3 +++
  2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 165ea46..5881e82 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -67,8 +67,8 @@ bool transparent_hugepage_enabled(struct vm_area_struct *vma)
  {
  if (vma_is_anonymous(vma))
  return __transparent_hugepage_enabled(vma);
-    if (vma_is_shmem(vma) && shmem_huge_enabled(vma))
-    return __transparent_hugepage_enabled(vma);
+    if (vma_is_shmem(vma))
+    return shmem_huge_enabled(vma);
    return false;
  }
diff --git a/mm/shmem.c b/mm/shmem.c
index 2275a0f..6f09a31 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -3873,6 +3873,9 @@ bool shmem_huge_enabled(struct vm_area_struct *vma)
  loff_t i_size;
  pgoff_t off;
  +    if ((vma->vm_flags & VM_NOHUGEPAGE) ||
+    test_bit(MMF_DISABLE_THP, &vma->vm_mm->flags))
+    return false;
  if (shmem_huge == SHMEM_HUGE_FORCE)
  return true;
  if (shmem_huge == SHMEM_HUGE_DENY)
--
1.8.3.1
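
The change in eligibility logic discussed in this thread can be sketched in plain user-space C. This is only a toy model under stated assumptions: `struct toy_vma` and its boolean fields are hypothetical stand-ins for the kernel's vma state and policy checks, and the helper names (`eligible_before`, `eligible_after`, `toy_shmem_huge_enabled`) are made up for illustration; none of this is kernel code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for a vma; not the kernel structure. */
struct toy_vma {
	bool is_anonymous;
	bool is_shmem;
	bool anon_thp_enabled;  /* models __transparent_hugepage_enabled() */
	bool shmem_thp_enabled; /* models the shmem_huge policy checks */
	bool nohugepage;        /* models VM_NOHUGEPAGE */
	bool mm_thp_disabled;   /* models MMF_DISABLE_THP */
};

/* Before the patch: a shmem result is overridden by the anonymous THP setting. */
static bool eligible_before(const struct toy_vma *v)
{
	if (v->is_anonymous)
		return v->anon_thp_enabled;
	if (v->is_shmem && v->shmem_thp_enabled)
		return v->anon_thp_enabled;
	return false;
}

/* Models shmem_huge_enabled() with the two checks the patch adds up front. */
static bool toy_shmem_huge_enabled(const struct toy_vma *v)
{
	if (v->nohugepage || v->mm_thp_disabled)
		return false;
	return v->shmem_thp_enabled;
}

/* After the patch: shmem vmas are judged by the shmem policy alone. */
static bool eligible_after(const struct toy_vma *v)
{
	if (v->is_anonymous)
		return v->anon_thp_enabled;
	if (v->is_shmem)
		return toy_shmem_huge_enabled(v);
	return false;
}
```

With shmem THP enabled and anonymous THP disabled, the first function reports the vma as not eligible while the second reports it as eligible, which mirrors the THPeligible mismatch shown in the smaps output above.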

linux-next: Signed-off-by missing for commit in the drm tree

2019-04-23 Thread Stephen Rothwell

Hi all,

Commit

  a9f58c456e9d ("drm/vmwgfx: Be more restrictive when dirtying resources")

is missing a Signed-off-by from its committer.

-- 
Cheers,
Stephen Rothwell



Re: [PATCH] fpga: stratix10-soc: fix use-after-free on s10_init()

2019-04-23 Thread Moritz Fischer

Hi Wen,

On Wed, Apr 24, 2019 at 07:32:05AM +0800, Wen Yang wrote:
> The refcount of fw_np has already been decreased by of_find_matching_node()
> so it shouldn't be used anymore.
> This patch adds an of_node_get() before of_find_matching_node() to avoid
> the use-after-free problem.
> 
> Fixes: e7eef1d7633a ("fpga: add intel stratix10 soc fpga manager driver")
> Signed-off-by: Wen Yang 
> Cc: Alan Tull 
> Cc: Moritz Fischer 
> Cc: Nicolas Saenz Julienne 
> Cc: linux-f...@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org

Reviewed-by: Moritz Fischer 
> ---
>  drivers/fpga/stratix10-soc.c | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/fpga/stratix10-soc.c b/drivers/fpga/stratix10-soc.c
> index 13851b3..215d337 100644
> --- a/drivers/fpga/stratix10-soc.c
> +++ b/drivers/fpga/stratix10-soc.c
> @@ -507,12 +507,16 @@ static int __init s10_init(void)
>   if (!fw_np)
>   return -ENODEV;
>  
> + of_node_get(fw_np);
>   np = of_find_matching_node(fw_np, s10_of_match);
> - if (!np)
> + if (!np) {
> + of_node_put(fw_np);
>   return -ENODEV;
> + }
>  
>   of_node_put(np);
>   ret = of_platform_populate(fw_np, s10_of_match, NULL, NULL);
> + of_node_put(fw_np);
>   if (ret)
>   return ret;
>  
> -- 
> 2.9.5
> 

Thanks,
Moritz
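
The reference-counting pattern in the patch above can be illustrated with a small user-space model. This is a sketch under stated assumptions: `struct toy_node`, `toy_node_get()`, `toy_node_put()`, `toy_find_matching()` and `toy_init()` are hypothetical names, and the only behavior taken from the thread is that the lookup drops one reference on its starting node and returns the match with a reference held.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy node; the real struct device_node is far richer. */
struct toy_node {
	int refcount;
	int matches;
	struct toy_node *next;
};

static int refcount_at_use; /* records the count while the node is in use */

static struct toy_node *toy_node_get(struct toy_node *n)
{
	if (n)
		n->refcount++;
	return n;
}

static void toy_node_put(struct toy_node *n)
{
	if (n)
		n->refcount--;
}

/*
 * Models the lookup described in the commit message: it consumes one
 * reference on @from and returns the first match with a reference taken.
 */
static struct toy_node *toy_find_matching(struct toy_node *from)
{
	struct toy_node *n;

	for (n = from ? from->next : NULL; n; n = n->next) {
		if (n->matches) {
			toy_node_get(n);
			break;
		}
	}
	toy_node_put(from);
	return n;
}

/* Mirrors the shape of the patched init path. */
static int toy_init(struct toy_node *tree_fw)
{
	struct toy_node *fw = toy_node_get(tree_fw); /* models the initial lookup */
	struct toy_node *np;

	toy_node_get(fw);            /* extra reference, as the patch adds */
	np = toy_find_matching(fw);  /* drops one reference on fw */
	if (!np) {
		toy_node_put(fw);
		return -1;
	}

	toy_node_put(np);
	refcount_at_use = fw->refcount; /* fw is still held at this point */
	toy_node_put(fw);
	return 0;
}
```

Without the extra `toy_node_get()` the model would already have dropped its own reference by the time the node is used, which is the situation the patch sets out to avoid.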

Re: [REGRESSION 5.0.x] Windows XP broken on KVM

2019-04-23 Thread Sean Christopherson

On Tue, Apr 23, 2019 at 10:23:17PM +0200, Greg Kroah-Hartman wrote:
> On Thu, Apr 18, 2019 at 09:56:02AM +0200, Paolo Bonzini wrote:
> > On 18/04/19 09:38, Takashi Iwai wrote:
> > > Hi,
> > > 
> > > we've got a regression report on the recent 5.0.x kernel, starting
> > > from 5.0.6, where Windows XP can't boot on KVM any longer.
> > > 
> > > The culprit seems to be the patch
> > >KVM: x86: update %rip after emulating IO
> > > with the upstream commit 45def77ebf79e2e8942b89ed79294d97ce914fa0.
> > > Reverting this alone fixed the problem.
> > > 
> > > The report is found at openSUSE bugzilla:
> > >   https://bugzilla.suse.com/show_bug.cgi?id=1132694
> > > 
> > > Is there already a followup fix?  If not, we need to revert it from
> > > stable, at least.
> > 
> > No, it's the first time I hear this and I actually test Windows XP
> > before every pull request I send to Linus...  I'll download 5.0.x and
> > test it there.
> 
> Any further ideas about this?

I followed up on the bugzilla to request more information.  My best guess
at this point is that the issue is related to an older version of Qemu or
a specific emulated device, but without additional details we're stuck.

Re: [RFC PATCH v2 11/17] sched: Basic tracking of matching tasks

2019-04-23 Thread Tim Chen



> +
> +void sched_core_enqueue(struct rq *rq, struct task_struct *p)
> +{

...

> +}
> +
> +void sched_core_dequeue(struct rq *rq, struct task_struct *p)
> +{

...

> +}
> +
> +/*
> + * Find left-most (aka, highest priority) task matching @cookie.
> + */
> +struct task_struct *sched_core_find(struct rq *rq, unsigned long cookie)
> +{

...


The sched_core_* functions are used only in the core.c
they are declared in.  We can convert them to static functions.

Thanks.

Tim

---
 kernel/sched/core.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 14e766d0df99..455e7ecc2f48 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -155,7 +155,7 @@ static inline bool __sched_core_less(struct task_struct *a, struct task_struct *
return false;
 }
 
-void sched_core_enqueue(struct rq *rq, struct task_struct *p)
+static void sched_core_enqueue(struct rq *rq, struct task_struct *p)
 {
struct rb_node *parent, **node;
struct task_struct *node_task;
@@ -182,7 +182,7 @@ void sched_core_enqueue(struct rq *rq, struct task_struct *p)
rb_insert_color(&p->core_node, &rq->core_tree);
 }
 
-void sched_core_dequeue(struct rq *rq, struct task_struct *p)
+static void sched_core_dequeue(struct rq *rq, struct task_struct *p)
 {
rq->core->core_task_seq++;
 
@@ -195,7 +195,7 @@ void sched_core_dequeue(struct rq *rq, struct task_struct *p)
 /*
  * Find left-most (aka, highest priority) task matching @cookie.
  */
-struct task_struct *sched_core_find(struct rq *rq, unsigned long cookie)
+static struct task_struct *sched_core_find(struct rq *rq, unsigned long cookie)
 {
struct rb_node *node = rq->core_tree.rb_node;
struct task_struct *node_task, *match;
@@ -221,7 +221,7 @@ struct task_struct *sched_core_find(struct rq *rq, unsigned long cookie)
return match;
 }
 
-struct task_struct *sched_core_next(struct task_struct *p, unsigned long cookie)
+static struct task_struct *sched_core_next(struct task_struct *p, unsigned long cookie)
 {
struct rb_node *node = &p->core_node;
 
@@ -282,7 +282,7 @@ static void __sched_core_disable(void)
printk("core sched disabled\n");
 }
 
-void sched_core_get(void)
+static void sched_core_get(void)
 {
mutex_lock(&sched_core_mutex);
if (!sched_core_count++)
@@ -290,7 +290,7 @@ void sched_core_get(void)
mutex_unlock(&sched_core_mutex);
 }
 
-void sched_core_put(void)
+static void sched_core_put(void)
 {
mutex_lock(&sched_core_mutex);
if (!--sched_core_count)
-- 
2.20.1

[PATCH v2 3/5 RFC] since cmdline args can be same for multiple kexec, log entry hash will collide. Prepend the kernel file name to the cmdline args to distinguish between cmdline args passed to subseq

2019-04-23 Thread Prakhar Srivastava

From: Prakhar Srivastava 

Signed-off-by: Prakhar Srivastava 
---

Currently for soft reboot (kexec_file_load) the kernel file and
signature are measured by IMA. The cmdline args used to load the kernel
are not measured.
The boot aggregate that gets calculated will not change since the
EFI loader has not been triggered.
Adding measurement of the kexec cmdline args and kernel version will add
some attestable criteria.

Since cmdline args can be the same for multiple kexec calls, the log entry
hashes will collide. Prepend the kernel file name to the cmdline args to
distinguish between cmdline args passed to subsequent kexec calls.

 kernel/kexec_core.c | 57 +
 kernel/kexec_file.c | 14 --
 kernel/kexec_internal.h |  3 +++
 3 files changed, 72 insertions(+), 2 deletions(-)

diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
index ae1a3ba24df5..97b77c780311 100644
--- a/kernel/kexec_core.c
+++ b/kernel/kexec_core.c
@@ -1151,3 +1151,60 @@ void __weak arch_kexec_protect_crashkres(void)
 
 void __weak arch_kexec_unprotect_crashkres(void)
 {}
+
+/**
+ * kexec_cmdline_prepend_img_name - prepare the buffer with cmdline
+ * that needs to be measured
+ * @outbuf - out buffer that contains the formatted string
+ * @kernel_fd - the file identifier for the kernel image
+ * @cmdline_ptr - ptr to the cmdline buffer
+ * @cmdline_len - len of the buffer.
+ *
+ * This generates a buffer in the format kernelfilename::cmdline
+ *
+ * On success return the size of the generated buffer.
+ * On failure return -EINVAL.
+ */
+int kexec_cmdline_prepend_img_name(char **outbuf, int kernel_fd,
+   const char *cmdline_ptr,
+   unsigned long cmdline_len)
+{
+   int ret = -EINVAL;
+   struct fd f = {};
+   int size = 0;
+   char *buf = NULL;
+   char delimiter[] = "::";
+
+   if (!outbuf || !cmdline_ptr)
+   goto out;
+
+   f = fdget(kernel_fd);
+   if (!f.file)
+   goto out;
+
+   size = (f.file->f_path.dentry->d_name.len + cmdline_len - 1+
+   ARRAY_SIZE(delimiter)) - 1;
+
+   buf = kzalloc(size, GFP_KERNEL);
+   if (!buf)
+   goto out;
+
+   memcpy(buf, f.file->f_path.dentry->d_name.name,
+   f.file->f_path.dentry->d_name.len);
+   memcpy(buf + f.file->f_path.dentry->d_name.len,
+   delimiter, ARRAY_SIZE(delimiter) - 1);
+   memcpy(buf + f.file->f_path.dentry->d_name.len +
+   ARRAY_SIZE(delimiter) - 1,
+   cmdline_ptr, cmdline_len - 1);
+
+   *outbuf = buf;
+   ret = size;
+
+   pr_debug("kexec cmdline buff: %s\n", buf);
+
+out:
+   if (f.file)
+   fdput(f);
+
+   return ret;
+}
diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
index 2a5234eb4b28..a487491d55b9 100644
--- a/kernel/kexec_file.c
+++ b/kernel/kexec_file.c
@@ -126,6 +126,8 @@ kimage_file_prepare_segments(struct kimage *image, int kernel_fd, int initrd_fd,
int ret = 0;
void *ldata;
loff_t size;
+   char *buff_to_measure = NULL;
+   int buff_to_measure_size = 0;
 
ret = kernel_read_file_from_fd(kernel_fd, &image->kernel_buf,
   &size, INT_MAX, READING_KEXEC_IMAGE);
@@ -183,8 +185,13 @@ kimage_file_prepare_segments(struct kimage *image, int kernel_fd, int initrd_fd,
goto out;
}
 
-   ima_buffer_check(image->cmdline_buf, cmdline_len - 1,
-   "kexec_cmdline");
+   /* IMA measures the cmdline args passed to the next kernel */
+   buff_to_measure_size = kexec_cmdline_prepend_img_name(&buff_to_measure,
+   kernel_fd, image->cmdline_buf, image->cmdline_buf_len);
+
+   ima_buffer_check(buff_to_measure, buff_to_measure_size,
+   "kexec_cmdline");
+
}
 
/* Call arch image load handlers */
@@ -200,6 +207,9 @@ kimage_file_prepare_segments(struct kimage *image, int kernel_fd, int initrd_fd,
/* In case of error, free up all allocated memory in this function */
if (ret)
kimage_file_post_load_cleanup(image);
+
+   kfree(buff_to_measure);
+
return ret;
 }
 
diff --git a/kernel/kexec_internal.h b/kernel/kexec_internal.h
index 799a8a452187..4d34a8ef4637 100644
--- a/kernel/kexec_internal.h
+++ b/kernel/kexec_internal.h
@@ -11,6 +11,9 @@ int kimage_load_segment(struct kimage *image, struct kexec_segment *segment);
 void kimage_terminate(struct kimage *image);
 int kimage_is_destination_range(struct kimage *image,
unsigned long start, unsigned long end);
+int kexec_cmdline_prepend_img_name(char **outbuf, int kernel_fd,
+   const char *cmdline_ptr,
+   unsigned long cmdline_len);
 
 extern struct mutex kexec_mutex;
 
-- 
2.17.1
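
The "filename::cmdline" buffer layout described in this patch can be sketched in user-space C as follows. This is a minimal illustration, not the kernel helper: `prepend_img_name()` and its parameters are hypothetical, and it uses `calloc()` in place of `kzalloc()`. As in the patch, `cmdline_len` is assumed to count the trailing NUL. Unlike the patch, which sizes the buffer to exactly the copied bytes and then prints it with `%s`, this sketch reserves one extra byte so the result is always NUL-terminated.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Build "<name>::<cmdline>" in a freshly allocated, NUL-terminated buffer.
 * Returns NULL on bad input or allocation failure; on success *out_len is
 * the string length excluding the terminating NUL. The caller frees it.
 */
static char *prepend_img_name(const char *name, size_t name_len,
			      const char *cmdline, size_t cmdline_len,
			      size_t *out_len)
{
	static const char delimiter[] = "::";
	const size_t delim_len = sizeof(delimiter) - 1;
	size_t args_len;
	char *buf;

	if (!name || !cmdline || !out_len || cmdline_len == 0)
		return NULL;

	/* cmdline_len includes the trailing NUL, so drop it from the copy. */
	args_len = cmdline_len - 1;

	/* One extra zeroed byte keeps the result safe to print as a string. */
	buf = calloc(name_len + delim_len + args_len + 1, 1);
	if (!buf)
		return NULL;

	memcpy(buf, name, name_len);
	memcpy(buf + name_len, delimiter, delim_len);
	memcpy(buf + name_len + delim_len, cmdline, args_len);

	*out_len = name_len + delim_len + args_len;
	return buf;
}
```

Two kexec calls that pass the same arguments but load different images then produce different buffers, which is the collision the patch description wants to avoid.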

[PATCH v2 2/5 RFC] use event name instead of enum to make the call generic

2019-04-23 Thread Prakhar Srivastava

From: Prakhar Srivastava 

Signed-off-by: Prakhar Srivastava 
---

Currently for soft reboot (kexec_file_load) the kernel file and
signature are measured by IMA. The cmdline args used to load the kernel
are not measured.
The boot aggregate that gets calculated will not change since the
EFI loader has not been triggered.
Adding measurement of the kexec cmdline args and kernel version will add
some attestable criteria.

Remove the enums that control the type of buffer entries; instead pass the
event name to be used.

 include/linux/ima.h   | 10 ++
 kernel/kexec_file.c   |  3 +++
 security/integrity/ima/ima.h  |  2 +-
 security/integrity/ima/ima_main.c | 30 ++
 4 files changed, 16 insertions(+), 29 deletions(-)

diff --git a/include/linux/ima.h b/include/linux/ima.h
index 733d0cb9dedc..5e41507c57e5 100644
--- a/include/linux/ima.h
+++ b/include/linux/ima.h
@@ -14,12 +14,6 @@
 #include 
 struct linux_binprm;
 
-enum __buffer_id {
-   KERNEL_VERSION,
-   KEXEC_CMDLINE,
-   MAX_BUFFER_ID = KEXEC_CMDLINE
-} buffer_id;
-
 #ifdef CONFIG_IMA
 extern int ima_bprm_check(struct linux_binprm *bprm);
 extern int ima_file_check(struct file *file, int mask, int opened);
@@ -29,7 +23,7 @@ extern int ima_read_file(struct file *file, enum kernel_read_file_id id);
 extern int ima_post_read_file(struct file *file, void *buf, loff_t size,
  enum kernel_read_file_id id);
 extern void ima_post_path_mknod(struct dentry *dentry);
-extern void ima_buffer_check(const void *buff, int size, enum buffer_id id);
+extern void ima_buffer_check(const void *buff, int size, char *eventname);
 #ifdef CONFIG_IMA_KEXEC
 extern void ima_add_kexec_buffer(struct kimage *image);
 #endif
@@ -72,7 +66,7 @@ static inline void ima_post_path_mknod(struct dentry *dentry)
 }
 
 static inline void ima_buffer_check(const void *buff, int size,
-   enum buffer_id id)
+   char *eventname)
 {
return;
 }
diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
index b118735fea9d..2a5234eb4b28 100644
--- a/kernel/kexec_file.c
+++ b/kernel/kexec_file.c
@@ -182,6 +182,9 @@ kimage_file_prepare_segments(struct kimage *image, int kernel_fd, int initrd_fd,
ret = -EINVAL;
goto out;
}
+
+   ima_buffer_check(image->cmdline_buf, cmdline_len - 1,
+   "kexec_cmdline");
}
 
/* Call arch image load handlers */
diff --git a/security/integrity/ima/ima.h b/security/integrity/ima/ima.h
index b71f2f6f7421..fcade3c103ed 100644
--- a/security/integrity/ima/ima.h
+++ b/security/integrity/ima/ima.h
@@ -181,8 +181,8 @@ enum ima_hooks {
FIRMWARE_CHECK,
KEXEC_KERNEL_CHECK,
KEXEC_INITRAMFS_CHECK,
-   BUFFER_CHECK,
POLICY_CHECK,
+   BUFFER_CHECK,
MAX_CHECK
 };
 
diff --git a/security/integrity/ima/ima_main.c b/security/integrity/ima/ima_main.c
index 6408cadaadbb..da82c705a5ed 100644
--- a/security/integrity/ima/ima_main.c
+++ b/security/integrity/ima/ima_main.c
@@ -160,8 +160,7 @@ void ima_file_free(struct file *file)
  * (Instead of using the file hash the buffer hash is used).
  * @buff - The buffer that needs to be added to the log
  * @size - size of buffer(in bytes)
- * @id - buffer id, this is differentiator for the various buffers
- * that can be measured.
+ * @eventname - event name to be used for buffer measurement.
  *
  * The buffer passed is added to the ima logs.
  * If the sig template is used, then the sig field contains the buffer.
@@ -170,7 +169,7 @@ void ima_file_free(struct file *file)
  * On error cases surface errors from ima calls.
  */
 static int process_buffer_measurement(const void *buff, int size,
-   enum buffer_id id)
+   char *eventname)
 {
int ret = -EINVAL;
struct ima_template_entry *entry = NULL;
@@ -185,23 +184,13 @@ static int process_buffer_measurement(const void *buff, int size,
int violation = 0;
int pcr = CONFIG_IMA_MEASURE_PCR_IDX;
 
-   if (!buff || size ==  0)
+   if (!buff || size ==  0 || !eventname)
goto err_out;
 
if (ima_get_action(NULL, 0, BUFFER_CHECK, &pcr) != IMA_MEASURE)
goto err_out;
 
-   switch (buffer_id) {
-   case KERNEL_VERSION:
-   name = "Kernel-version";
-   break;
-   case KEXEC_CMDLINE:
-   name = "Kexec-cmdline";
-   break;
-   default:
-   goto err_out;
-   }
-
+   name = eventname;
memset(iint, 0, sizeof(*iint));
memset(&hash, 0, sizeof(hash));
 
@@ -452,15 +441,16 @@ int ima_read_file(struct file *file, enum kernel_read_file_id read_id)
  * ima_buffer_check - based on policy, collect & store buffer measurement
  * @buf: pointer to buffer
  * @size: size of buffer
- * @buffer_id: caller

[PATCH v2 5/5 RFC] add the buffer to the event data in ima free entry data if store_template failed added check in templates for buffer

2019-04-23 Thread Prakhar Srivastava

From: Prakhar Srivastava 

Signed-off-by: Prakhar Srivastava 
---
Currently for soft reboot (kexec_file_load) the kernel file and
signature are measured by IMA. The cmdline args used to load the kernel
are not measured.
The boot aggregate that gets calculated will not change since the
EFI loader has not been triggered.
Adding measurement of the kexec cmdline args and kernel version will add
some attestable criteria.

This patch adds the buffer to be measured as the event data.
It also contains the changes necessary for the template.

 security/integrity/ima/ima_main.c | 36 +--
 security/integrity/ima/ima_template_lib.c |  3 +-
 security/integrity/integrity.h|  1 +
 3 files changed, 36 insertions(+), 4 deletions(-)

diff --git a/security/integrity/ima/ima_main.c b/security/integrity/ima/ima_main.c
index da82c705a5ed..204a7a1acb86 100644
--- a/security/integrity/ima/ima_main.c
+++ b/security/integrity/ima/ima_main.c
@@ -14,7 +14,7 @@
  *
  * File: ima_main.c
  * implements the IMA hooks: ima_bprm_check, ima_file_mmap,
- * and ima_file_check.
+ * ima_file_check and ima_buffer_check.
  */
 #include 
 #include 
@@ -180,16 +180,37 @@ static int process_buffer_measurement(const void *buff, int size,
struct ima_digest_data hdr;
char digest[IMA_MAX_DIGEST_SIZE];
} hash;
+   struct buffer_xattr {
+   enum evm_ima_xattr_type type;
+   u16 buff_length;
+   unsigned char buff[0];
+   };
char *name = NULL;
int violation = 0;
int pcr = CONFIG_IMA_MEASURE_PCR_IDX;
+   struct buffer_xattr *buffer_event_data = NULL;
+   int alloc_length = 0;
+   int action  = 0;
 
if (!buff || size ==  0 || !eventname)
goto err_out;
 
-   if (ima_get_action(NULL, 0, BUFFER_CHECK, &pcr) != IMA_MEASURE)
+   action = ima_get_action(NULL, 0, BUFFER_CHECK, &pcr);
+   if (!(action & IMA_AUDIT) && !(action & IMA_MEASURE))
goto err_out;
 
+   alloc_length = sizeof(struct buffer_xattr) + size;
+   buffer_event_data = kzalloc(alloc_length, GFP_KERNEL);
+   if (!buffer_event_data)
+   goto err_out;
+
+   buffer_event_data->type = IMA_BUFFER_CHECK;
+   buffer_event_data->buff_length = size;
+   memcpy(buffer_event_data->buff, buff, size);
+
+   event_data.xattr_value = (struct evm_ima_xattr_data *)buffer_event_data;
+   event_data.xattr_len = alloc_length;
+
name = eventname;
memset(iint, 0, sizeof(*iint));
memset(&hash, 0, sizeof(hash));
@@ -208,16 +229,25 @@ static int process_buffer_measurement(const void *buff, int size,
if (ret < 0)
goto err_out;
 
-   ret = ima_store_template(entry, violation, NULL,
+   if (action & IMA_MEASURE)
+   ret = ima_store_template(entry, violation, NULL,
buff, pcr);
+
if (ret < 0) {
ima_free_template_entry(entry);
goto err_out;
}
 
+   if (action & IMA_AUDIT)
+   ima_audit_measurement(iint, event_data.filename);
+
+   kfree(buffer_event_data);
return 0;
 
 err_out:
+
+   kfree(buffer_event_data);
+
pr_err("Error in adding buffer measure: %d\n", ret);
return ret;
 }
diff --git a/security/integrity/ima/ima_template_lib.c b/security/integrity/ima/ima_template_lib.c
index f9ba37b3928d..6050ef774355 100644
--- a/security/integrity/ima/ima_template_lib.c
+++ b/security/integrity/ima/ima_template_lib.c
@@ -322,7 +322,8 @@ int ima_eventsig_init(struct ima_event_data *event_data,
int xattr_len = event_data->xattr_len;
int rc = 0;
 
-   if ((!xattr_value) || (xattr_value->type != EVM_IMA_XATTR_DIGSIG))
+   if ((!xattr_value) || !((xattr_value->type == EVM_IMA_XATTR_DIGSIG) ||
+(xattr_value->type == IMA_BUFFER_CHECK)))
goto out;
 
rc = ima_write_template_field_data(xattr_value, xattr_len, fmt,
diff --git a/security/integrity/integrity.h b/security/integrity/integrity.h
index 24520b4ef3b0..a674ae5be231 100644
--- a/security/integrity/integrity.h
+++ b/security/integrity/integrity.h
@@ -58,6 +58,7 @@ enum evm_ima_xattr_type {
EVM_XATTR_HMAC,
EVM_IMA_XATTR_DIGSIG,
IMA_XATTR_DIGEST_NG,
+   IMA_BUFFER_CHECK,
IMA_XATTR_LAST
 };
 
-- 
2.17.1

[PATCH v2 4/5 RFC] added a buffer_check LSM hook

2019-04-23 Thread Prakhar Srivastava

From: Prakhar Srivastava 

Signed-off-by: Prakhar Srivastava 
---
Currently for soft reboot (kexec_file_load) the kernel file and
signature are measured by IMA. The cmdline args used to load the kernel
are not measured.
The boot aggregate that gets calculated will not change since the
EFI loader has not been triggered.
Adding measurement of the kexec cmdline args and kernel version will add
some attestable criteria.

This patch adds an LSM hook for buffer_check.
Suggested by Mimi Zohar.

 include/linux/lsm_hooks.h | 3 +++
 include/linux/security.h  | 5 +
 security/security.c   | 7 +++
 3 files changed, 15 insertions(+)

diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
index 080f34e66017..854bf3cac716 100644
--- a/include/linux/lsm_hooks.h
+++ b/include/linux/lsm_hooks.h
@@ -1568,6 +1568,8 @@ union security_list_options {
int (*inode_setsecctx)(struct dentry *dentry, void *ctx, u32 ctxlen);
int (*inode_getsecctx)(struct inode *inode, void **ctx, u32 *ctxlen);
 
+   int (*buffer_check)(const void *buff, int size, const char *eventname);
+
 #ifdef CONFIG_SECURITY_NETWORK
int (*unix_stream_connect)(struct sock *sock, struct sock *other,
struct sock *newsk);
@@ -1813,6 +1815,7 @@ struct security_hook_heads {
struct list_head inode_notifysecctx;
struct list_head inode_setsecctx;
struct list_head inode_getsecctx;
+   struct list_head buffer_check;
 #ifdef CONFIG_SECURITY_NETWORK
struct list_head unix_stream_connect;
struct list_head unix_may_send;
diff --git a/include/linux/security.h b/include/linux/security.h
index af675b576645..cbba0e119234 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -377,6 +377,8 @@ void security_inode_invalidate_secctx(struct inode *inode);
 int security_inode_notifysecctx(struct inode *inode, void *ctx, u32 ctxlen);
 int security_inode_setsecctx(struct dentry *dentry, void *ctx, u32 ctxlen);
 int security_inode_getsecctx(struct inode *inode, void **ctx, u32 *ctxlen);
+
+void security_buffer_measure(const void *buff, int size, char *eventname);
 #else /* CONFIG_SECURITY */
 struct security_mnt_opts {
 };
@@ -776,6 +778,9 @@ static inline void security_inode_getsecid(struct inode *inode, u32 *secid)
*secid = 0;
 }
 
+static inline void security_buffer_measure(const void *buff, int size, char *eventname)
+{ }
+
 static inline int security_inode_copy_up(struct dentry *src, struct cred **new)
 {
return 0;
diff --git a/security/security.c b/security/security.c
index 38316bb28b16..a0dfdb015412 100644
--- a/security/security.c
+++ b/security/security.c
@@ -320,6 +320,13 @@ int security_bprm_check(struct linux_binprm *bprm)
return ima_bprm_check(bprm);
 }
 
+void security_buffer_measure(const void *buff, int size, char *eventname)
+{
+   call_void_hook(buffer_check, buff, size, eventname);
+   return ima_buffer_check(buff, size, eventname);
+}
+
+
 void security_bprm_committing_creds(struct linux_binprm *bprm)
 {
call_void_hook(bprm_committing_creds, bprm);
-- 
2.17.1

[PATCH v2 1/5 RFC] added ima hook for buffer, being enabled as a policy

2019-04-23 Thread Prakhar Srivastava

From: Prakhar Srivastava 

Signed-off-by: Prakhar Srivastava 
---
Currently, for a soft reboot (kexec_file_load), the kernel file and
its signature are measured by IMA. The cmdline args used to load the
kernel are not measured.
The boot aggregate that gets calculated will not change, since the
EFI loader has not been triggered.
Measuring the kexec cmdline args and the kernel version adds some
attestable criteria.

This adds a new IMA hook, ima_buffer_check, and a policy entry, BUFFER_CHECK.
This enables measurements of buffer hashes into the IMA log.

 Documentation/ABI/testing/ima_policy |  1 +
 include/linux/ima.h  | 13 +++-
 security/integrity/ima/ima.h |  1 +
 security/integrity/ima/ima_main.c| 95 
 security/integrity/ima/ima_policy.c  | 14 +++-
 5 files changed, 122 insertions(+), 2 deletions(-)

diff --git a/Documentation/ABI/testing/ima_policy 
b/Documentation/ABI/testing/ima_policy
index bb0f9a135e21..676088c7ab26 100644
--- a/Documentation/ABI/testing/ima_policy
+++ b/Documentation/ABI/testing/ima_policy
@@ -28,6 +28,7 @@ Description:
base:   func:= 
[BPRM_CHECK][MMAP_CHECK][FILE_CHECK][MODULE_CHECK]
[FIRMWARE_CHECK]
[KEXEC_KERNEL_CHECK] [KEXEC_INITRAMFS_CHECK]
+   [BUFFER_CHECK]
mask:= [[^]MAY_READ] [[^]MAY_WRITE] [[^]MAY_APPEND]
   [[^]MAY_EXEC]
fsmagic:= hex value
diff --git a/include/linux/ima.h b/include/linux/ima.h
index 7f6952f8d6aa..733d0cb9dedc 100644
--- a/include/linux/ima.h
+++ b/include/linux/ima.h
@@ -14,6 +14,12 @@
 #include 
 struct linux_binprm;
 
+enum __buffer_id {
+   KERNEL_VERSION,
+   KEXEC_CMDLINE,
+   MAX_BUFFER_ID = KEXEC_CMDLINE
+} buffer_id;
+
 #ifdef CONFIG_IMA
 extern int ima_bprm_check(struct linux_binprm *bprm);
 extern int ima_file_check(struct file *file, int mask, int opened);
@@ -23,7 +29,7 @@ extern int ima_read_file(struct file *file, enum 
kernel_read_file_id id);
 extern int ima_post_read_file(struct file *file, void *buf, loff_t size,
  enum kernel_read_file_id id);
 extern void ima_post_path_mknod(struct dentry *dentry);
-
+extern void ima_buffer_check(const void *buff, int size, enum buffer_id id);
 #ifdef CONFIG_IMA_KEXEC
 extern void ima_add_kexec_buffer(struct kimage *image);
 #endif
@@ -65,6 +71,11 @@ static inline void ima_post_path_mknod(struct dentry *dentry)
return;
 }
 
+static inline void ima_buffer_check(const void *buff, int size,
+   enum buffer_id id)
+{
+   return;
+}
 #endif /* CONFIG_IMA */
 
 #ifndef CONFIG_IMA_KEXEC
diff --git a/security/integrity/ima/ima.h b/security/integrity/ima/ima.h
index b563fbd4d122..b71f2f6f7421 100644
--- a/security/integrity/ima/ima.h
+++ b/security/integrity/ima/ima.h
@@ -181,6 +181,7 @@ enum ima_hooks {
FIRMWARE_CHECK,
KEXEC_KERNEL_CHECK,
KEXEC_INITRAMFS_CHECK,
+   BUFFER_CHECK,
POLICY_CHECK,
MAX_CHECK
 };
diff --git a/security/integrity/ima/ima_main.c 
b/security/integrity/ima/ima_main.c
index 2aebb7984437..6408cadaadbb 100644
--- a/security/integrity/ima/ima_main.c
+++ b/security/integrity/ima/ima_main.c
@@ -155,6 +155,84 @@ void ima_file_free(struct file *file)
ima_check_last_writer(iint, inode, file);
 }
 
+/*
+ * process_buffer_measurement - Measure the buffer passed to ima log.
+ * (Instead of using the file hash the buffer hash is used).
+ * @buff - The buffer that needs to be added to the log
+ * @size - size of buffer(in bytes)
+ * @id - buffer id, this is differentiator for the various buffers
+ * that can be measured.
+ *
+ * The buffer passed is added to the ima logs.
+ * If the sig template is used, then the sig field contains the buffer.
+ *
+ * On success return 0.
+ * On error cases surface errors from ima calls.
+ */
+static int process_buffer_measurement(const void *buff, int size,
+   enum buffer_id id)
+{
+   int ret = -EINVAL;
+   struct ima_template_entry *entry = NULL;
+   struct integrity_iint_cache tmp_iint, *iint = &tmp_iint;
+   struct ima_event_data event_data = {iint, NULL, NULL,
+   NULL, 0, NULL};
+   struct {
+   struct ima_digest_data hdr;
+   char digest[IMA_MAX_DIGEST_SIZE];
+   } hash;
+   char *name = NULL;
+   int violation = 0;
+   int pcr = CONFIG_IMA_MEASURE_PCR_IDX;
+
+   if (!buff || size ==  0)
+   goto err_out;
+
+   if (ima_get_action(NULL, 0, BUFFER_CHECK, &pcr) != IMA_MEASURE)
+   goto err_out;
+
+   switch (buffer_id) {
+   case KERNEL_VERSION:
+   name = "Kernel-version";
+   break;
+   case KEXEC_CMDLINE:
+   name = "Kexec-cmdline";
+   break;
+   default:
+   goto err_out;
+   }
+
+

[PATCH] ARM: dts: qcom-apq8064: Fix DSI PHY ref clk phandle

2019-04-23 Thread Matthias Kaehlcke

Commit 3560af5a56b5 ("ARM: dts: qcom-apq8064: Set 'xo_board' as
ref clock of the DSI PHY") specifies the non-existing phandle
'xo_board' as DSI PHY ref clk. Fix this by using the correct
phandle, 'cxo_board'.

Fixes: 3560af5a56b5 ("ARM: dts: qcom-apq8064: Set 'xo_board' as ref clock of 
the DSI PHY")

Signed-off-by: Matthias Kaehlcke 
---
 arch/arm/boot/dts/qcom-apq8064.dtsi | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm/boot/dts/qcom-apq8064.dtsi 
b/arch/arm/boot/dts/qcom-apq8064.dtsi
index 2fff5074f2d7..65975df6a8c3 100644
--- a/arch/arm/boot/dts/qcom-apq8064.dtsi
+++ b/arch/arm/boot/dts/qcom-apq8064.dtsi
@@ -1305,7 +1305,7 @@
reg-names = "dsi_pll", "dsi_phy", "dsi_phy_regulator";
clock-names = "iface_clk", "ref";
clocks = < DSI_M_AHB_CLK>,
-<&xo_board>;
+<&cxo_board>;
};
 
 
-- 
2.21.0.593.g511ec345e18-goog

Re: [RFC PATCH v2 11/17] sched: Basic tracking of matching tasks

2019-04-23 Thread Tim Chen

On 4/23/19 9:18 AM, Vineeth Remanan Pillai wrote:

> +/* real prio, less is less */
> +static inline bool __prio_less(struct task_struct *a, struct task_struct *b, 
> bool core_cmp)
> +{
> + u64 vruntime;
> +
> + int pa = __task_prio(a), pb = __task_prio(b);
> +
> + if (-pa < -pb)
> + return true;
> +
> + if (-pb < -pa)
> + return false;
> +
> + if (pa == -1) /* dl_prio() doesn't work because of stop_class above */
> + return !dl_time_before(a->dl.deadline, b->dl.deadline);
> +
> + vruntime = b->se.vruntime;
> + if (core_cmp) {
> + vruntime -= task_cfs_rq(b)->min_vruntime;
> + vruntime += task_cfs_rq(a)->min_vruntime;
> + }
> + if (pa == MAX_RT_PRIO + MAX_NICE) /* fair */
> + return !((s64)(a->se.vruntime - vruntime) <= 0);
> +
> + return false;
> +}
> +
> +static inline bool cpu_prio_less(struct task_struct *a, struct task_struct 
> *b)
> +{
> + return __prio_less(a, b, false);
> +}
> +
> +static inline bool core_prio_less(struct task_struct *a, struct task_struct 
> *b)
> +{
> + return __prio_less(a, b, true);
> +}
> +
> +static inline bool __sched_core_less(struct task_struct *a, struct 
> task_struct *b)
> +{
> + if (a->core_cookie < b->core_cookie)
> + return true;
> +
> + if (a->core_cookie > b->core_cookie)
> + return false;
> +
> + /* flip prio, so high prio is leftmost */
> + if (cpu_prio_less(b, a))
> + return true;
> +
> + return false;
> +}
> +

A minor nitpick.  I find it more straightforward to keep the vruntime
base readjustment in core_prio_less rather than passing a core_cmp
bool around.

Tim


diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 455e7ecc2f48..5917fb85669b 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -100,15 +87,13 @@ static inline struct cfs_rq *task_cfs_rq(struct 
task_struct *p)
  */
 
 /* real prio, less is less */
-static inline bool __prio_less(struct task_struct *a, struct task_struct *b, 
bool core_cmp)
+static inline bool __prio_less(struct task_struct *a, struct task_struct *b, 
u64 vruntime)
 {
-   u64 vruntime;
-
int pa = __task_prio(a), pb = __task_prio(b);
 
trace_printk("(%s/%d;%d,%Lu,%Lu) ?< (%s/%d;%d,%Lu,%Lu)\n",
-a->comm, a->pid, pa, a->se.vruntime, a->dl.deadline,
-b->comm, b->pid, pa, b->se.vruntime, b->dl.deadline);
+   a->comm, a->pid, pa, a->se.vruntime, a->dl.deadline,
+   b->comm, b->pid, pa, b->se.vruntime, b->dl.deadline);
 
if (-pa < -pb)
return true;
@@ -119,11 +104,6 @@ static inline bool __prio_less(struct task_struct *a, 
struct task_struct *b, boo
if (pa == -1) /* dl_prio() doesn't work because of stop_class above */
return !dl_time_before(a->dl.deadline, b->dl.deadline);
 
-   vruntime = b->se.vruntime;
-   if (core_cmp) {
-   vruntime -= task_cfs_rq(b)->min_vruntime;
-   vruntime += task_cfs_rq(a)->min_vruntime;
-   }
if (pa == MAX_RT_PRIO + MAX_NICE) /* fair */
return !((s64)(a->se.vruntime - vruntime) <= 0);
 
@@ -132,12 +112,17 @@ static inline bool __prio_less(struct task_struct *a, 
struct task_struct *b, boo
 
 static inline bool cpu_prio_less(struct task_struct *a, struct task_struct *b)
 {
-   return __prio_less(a, b, false);
+   return __prio_less(a, b, b->se.vruntime);
 }
 
 static inline bool core_prio_less(struct task_struct *a, struct task_struct *b)
 {
-   return __prio_less(a, b, true);
+   u64 vruntime = b->se.vruntime;
+
+   vruntime -= task_cfs_rq(b)->min_vruntime;
+   vruntime += task_cfs_rq(a)->min_vruntime;
+
+   return __prio_less(a, b, vruntime);
 }
 
 static inline bool __sched_core_less(struct task_struct *a, struct task_struct 
*b)
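
For readers following the vruntime rebasing above, here is a minimal
userspace sketch of the same comparison. It is not kernel code: toy_rq,
toy_task and the toy_* functions are hypothetical stand-ins, and only the
fair-class branch of __prio_less is modelled. It assumes each task's
runqueue exposes a min_vruntime, as in the diff.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's cfs_rq and task_struct. */
struct toy_rq {
	uint64_t min_vruntime;
};

struct toy_task {
	uint64_t vruntime;
	const struct toy_rq *rq;
};

/*
 * Fair-class branch only: a is "less" (lower priority) when it has
 * accumulated more virtual runtime than the reference value. The signed
 * difference keeps the comparison valid across u64 wraparound.
 */
bool toy_fair_less(const struct toy_task *a, uint64_t vruntime)
{
	return (int64_t)(a->vruntime - vruntime) > 0;
}

/* Same runqueue: compare raw vruntimes. */
bool toy_cpu_prio_less(const struct toy_task *a, const struct toy_task *b)
{
	return toy_fair_less(a, b->vruntime);
}

/* Different runqueues: rebase b's vruntime onto a's runqueue first. */
bool toy_core_prio_less(const struct toy_task *a, const struct toy_task *b)
{
	uint64_t vruntime = b->vruntime;

	vruntime -= b->rq->min_vruntime;
	vruntime += a->rq->min_vruntime;

	return toy_fair_less(a, vruntime);
}
```

With a at vruntime 1100 on a runqueue whose min_vruntime is 1000, and b at
5050 on a runqueue whose min_vruntime is 5000, the raw comparison says a is
not less than b, while the rebased comparison says it is, because a is 100
ahead of its base and b only 50.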

[PATCH v3 1/3] RISC-V: Add RISC-V specific arch_match_cpu_phys_id

2019-04-23 Thread Atish Patra

OF/DT core has a hook for architecture specific logical cpuid to hartid
mapping. By implementing this, we can pass the logical cpu id to cpu
node parsing functions.

Fix the instances where logical cpuid is expected as an argument in
of_get_cpu_node.

Signed-off-by: Atish Patra 
---
 arch/riscv/kernel/cpu.c | 3 +--
 arch/riscv/kernel/smp.c | 5 +
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/arch/riscv/kernel/cpu.c b/arch/riscv/kernel/cpu.c
index cf2fca12414a..c8d2a3223099 100644
--- a/arch/riscv/kernel/cpu.c
+++ b/arch/riscv/kernel/cpu.c
@@ -136,8 +136,7 @@ static void c_stop(struct seq_file *m, void *v)
 static int c_show(struct seq_file *m, void *v)
 {
unsigned long cpu_id = (unsigned long)v - 1;
-   struct device_node *node = of_get_cpu_node(cpuid_to_hartid_map(cpu_id),
-  NULL);
+   struct device_node *node = of_get_cpu_node(cpu_id, NULL);
const char *compat, *isa, *mmu;
 
seq_printf(m, "processor\t: %lu\n", cpu_id);
diff --git a/arch/riscv/kernel/smp.c b/arch/riscv/kernel/smp.c
index 0c41d07ec281..94db72662f60 100644
--- a/arch/riscv/kernel/smp.c
+++ b/arch/riscv/kernel/smp.c
@@ -70,6 +70,11 @@ void riscv_cpuid_to_hartid_mask(const struct cpumask *in, 
struct cpumask *out)
for_each_cpu(cpu, in)
cpumask_set_cpu(cpuid_to_hartid_map(cpu), out);
 }
+
+bool arch_match_cpu_phys_id(int cpu, u64 phys_id)
+{
+   return phys_id == cpuid_to_hartid_map(cpu);
+}
 /* Unsupported */
 int setup_profiling_timer(unsigned int multiplier)
 {
-- 
2.21.0
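
As an illustration of the mapping this hook exposes, here is a small
userspace sketch. The table contents, TOY_NR_CPUS and the toy_* names are
assumptions made for the example; the kernel's cpuid_to_hartid_map() is
populated at boot and is not reproduced here.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_NR_CPUS 4

/* Hypothetical logical cpuid -> hartid table; index is the logical cpu. */
uint64_t toy_cpuid_to_hartid[TOY_NR_CPUS] = { 2, 0, 3, 1 };

/* Mirrors the shape of arch_match_cpu_phys_id(): does cpu own phys_id? */
bool toy_match_cpu_phys_id(int cpu, uint64_t phys_id)
{
	if (cpu < 0 || cpu >= TOY_NR_CPUS)
		return false;
	return phys_id == toy_cpuid_to_hartid[cpu];
}

/*
 * What a caller holding only a hart id (for example from a cpu node's
 * reg property) can do with the hook: find the logical cpu, or -1.
 */
int toy_hartid_to_cpuid(uint64_t hartid)
{
	for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++) {
		if (toy_match_cpu_phys_id(cpu, hartid))
			return cpu;
	}
	return -1;
}
```

The point of the hook is that callers can keep passing logical cpu ids and
let the architecture decide which physical id each one corresponds to.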

[PATCH v3 3/3] RISC-V: Support nr_cpus command line option.

2019-04-23 Thread Atish Patra

If the nr_cpus command line option is set, the maximum number of
possible cpus should be set to that value.

Signed-off-by: Atish Patra 
---
 arch/riscv/kernel/smpboot.c | 10 +-
 2 files changed, 9 insertions(+), 1 deletion(-)
 create mode 100644 arch/riscv/kernel/smpboot.

diff --git a/arch/riscv/kernel/smpboot.c b/arch/riscv/kernel/smpboot.c
index a8ad200581aa..7a0b62252524 100644
--- a/arch/riscv/kernel/smpboot.c
+++ b/arch/riscv/kernel/smpboot.c
@@ -84,11 +84,19 @@ void __init setup_smp(void)
}
 
cpuid_to_hartid_map(cpuid) = hart;
-   set_cpu_possible(cpuid, true);
cpuid++;
}
 
BUG_ON(!found_boot_cpu);
+
+   if (cpuid > nr_cpu_ids)
+   pr_warn("Total number of cpus [%d] is greater than nr_cpus 
option value [%d]\n",
+   cpuid, nr_cpu_ids);
+
+   for (cpuid = 1; cpuid < nr_cpu_ids; cpuid++) {
+   if (cpuid_to_hartid_map(cpuid) != INVALID_HARTID)
+   set_cpu_possible(cpuid, true);
+   }
 }
 
 int __cpu_up(unsigned int cpu, struct task_struct *tidle)
-- 
2.21.0
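
The effect of the new loop can be sketched in userspace as follows. This is
an illustration only: toy_mark_possible, TOY_NR_CPUS and TOY_INVALID_HARTID
are hypothetical names, and the sketch assumes logical cpu 0 is the boot
cpu and is always possible, which the patch leaves to code outside the hunk.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_NR_CPUS 8
#define TOY_INVALID_HARTID UINT64_MAX

/*
 * Mark cpus possible, capped by nr_cpu_ids. Slots whose hartid is
 * TOY_INVALID_HARTID were never discovered and stay impossible.
 * Returns the number of cpus marked possible.
 */
int toy_mark_possible(const uint64_t *hartid_map, int nr_cpu_ids,
		      bool *possible)
{
	int count = 1;

	for (int cpuid = 0; cpuid < TOY_NR_CPUS; cpuid++)
		possible[cpuid] = false;

	/* Assumption: logical cpu 0 is the boot cpu. */
	possible[0] = true;

	for (int cpuid = 1; cpuid < nr_cpu_ids && cpuid < TOY_NR_CPUS; cpuid++) {
		if (hartid_map[cpuid] != TOY_INVALID_HARTID) {
			possible[cpuid] = true;
			count++;
		}
	}
	return count;
}
```

With four discovered harts and nr_cpu_ids set to 2, only cpus 0 and 1 end
up possible; with nr_cpu_ids at 8, all four discovered cpus do.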

[PATCH v3 0/3] Miscellaneous kernel command line fixes

2019-04-23 Thread Atish Patra

Assorted command line option fixes for RISC-V.

Changes from v2->v3.
1. Merged patch 1 & 2 into one patch.

Changes from v1->v2.
1. Update pr_err string in patch (4/4) as per review.

Atish Patra (3):
RISC-V: Add RISC-V specific arch_match_cpu_phys_id
RISC-V: Implement nosmp commandline option.
RISC-V: Support nr_cpus command line option.

arch/riscv/kernel/cpu.c |  3 +--
arch/riscv/kernel/smp.c |  5 +
arch/riscv/kernel/smpboot.  |  0
arch/riscv/kernel/smpboot.c | 22 --
4 files changed, 26 insertions(+), 4 deletions(-)
create mode 100644 arch/riscv/kernel/smpboot.

--
2.21.0

[PATCH v3 2/3] RISC-V: Implement nosmp commandline option.

2019-04-23 Thread Atish Patra

The nosmp command line option sets max_cpus to zero. No secondary harts
will boot if this is enabled. But the present cpu mask will still
contain all possible cpus.

Fix the present cpu mask for the nosmp usecase.

Signed-off-by: Atish Patra 
---
 arch/riscv/kernel/smpboot.c | 12 +++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/arch/riscv/kernel/smpboot.c b/arch/riscv/kernel/smpboot.c
index eb533b5c2c8c..a8ad200581aa 100644
--- a/arch/riscv/kernel/smpboot.c
+++ b/arch/riscv/kernel/smpboot.c
@@ -47,6 +47,17 @@ void __init smp_prepare_boot_cpu(void)
 
 void __init smp_prepare_cpus(unsigned int max_cpus)
 {
+   int cpuid;
+
+   /* This covers non-smp usecase mandated by "nosmp" option */
+   if (max_cpus == 0)
+   return;
+
+   for_each_possible_cpu(cpuid) {
+   if (cpuid == smp_processor_id())
+   continue;
+   set_cpu_present(cpuid, true);
+   }
 }
 
 void __init setup_smp(void)
@@ -74,7 +85,6 @@ void __init setup_smp(void)
 
cpuid_to_hartid_map(cpuid) = hart;
set_cpu_possible(cpuid, true);
-   set_cpu_present(cpuid, true);
cpuid++;
}
 
-- 
2.21.0
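
A userspace sketch of the resulting smp_prepare_cpus() behaviour, under
stated assumptions: the possible and present masks are modelled as plain
bool arrays, boot_cpu stands in for smp_processor_id(), and all toy_* names
are hypothetical.

```c
#include <stdbool.h>

#define TOY_NR_CPUS 4

/*
 * With max_cpus == 0 ("nosmp"), leave the present mask untouched so only
 * the boot cpu remains present. Otherwise mark every possible secondary
 * cpu as present.
 */
void toy_prepare_cpus(unsigned int max_cpus, int boot_cpu,
		      const bool *possible, bool *present)
{
	if (max_cpus == 0)
		return;

	for (int cpuid = 0; cpuid < TOY_NR_CPUS; cpuid++) {
		if (!possible[cpuid] || cpuid == boot_cpu)
			continue;
		present[cpuid] = true;
	}
}
```

Moving set_cpu_present() out of setup_smp() and behind the max_cpus check
is what keeps secondary harts out of the present mask under nosmp.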

Re: [PATCH] trace: Fix preempt_enable_no_resched() abuse

2019-04-23 Thread Steven Rostedt

On Tue, 23 Apr 2019 22:03:18 +0200
Peter Zijlstra  wrote:

> On Tue, Apr 23, 2019 at 09:55:59PM +0200, Peter Zijlstra wrote:
> > On Tue, Apr 23, 2019 at 03:41:32PM -0400, Waiman Long wrote:  
> 
> > > I saw a number of instances of
> > > preempt_enable_no_resched() without a schedule() right next to it.  
> > 
> > Look more closely.. and let me know, if true, those are bugs that need
> > fixing.
> > 
> > Argghhh.. BPF...  
> 
> /me shakes head, Steve...

/me points finger to Frederic ;-)


> 
> ---
> Subject: trace: Fix preempt_enable_no_resched() abuse
> 
> Unless the very next line is schedule(), or implies it, one must not use
> preempt_enable_no_resched(). It can cause a preemption to go missing and
> thereby cause arbitrary delays, breaking the PREEMPT=y invariant.
> 
> Cc: Steven Rostedt 
> Fixes: 37886f6a9f62 ("ring-buffer: add api to allow a tracer to change clock 
> source")

That commit just moved the buggy code. That tag should be:

Fixes: 2c2d7329d8af ("tracing/ftrace: use preempt_enable_no_resched_notrace in 
ring_buffer_time_stamp()")

OK, it isn't quite fair to pin all the blame on Frederic, because
that commit did fix a bug. But the real fix for that bug was your fix here:

499d79559ffe4b ("sched/core: More notrace annotations")

-- Steve


> Signed-off-by: Peter Zijlstra (Intel) 
> ---
>  kernel/trace/ring_buffer.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
> index 41b6f96e5366..4ee8d8aa3d0f 100644
> --- a/kernel/trace/ring_buffer.c
> +++ b/kernel/trace/ring_buffer.c
> @@ -762,7 +762,7 @@ u64 ring_buffer_time_stamp(struct ring_buffer
> *buffer, int cpu) 
>   preempt_disable_notrace();
>   time = rb_time_stamp(buffer);
> - preempt_enable_no_resched_notrace();
> + preempt_enable_notrace();
>  
>   return time;
>  }
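
To make the "missing preemption" concrete, here is a toy userspace model of
a preempt counter and a need_resched flag. It is a sketch, not how the
kernel implements these primitives: struct toy_cpu and the toy_* functions
are hypothetical, and a "reschedule" is just a counter increment.

```c
#include <stdbool.h>

/* Hypothetical per-cpu state: nesting depth and a pending-resched flag. */
struct toy_cpu {
	int preempt_count;
	bool need_resched;
	int reschedules;
};

void toy_preempt_disable(struct toy_cpu *c)
{
	c->preempt_count++;
}

/* Drops the count but never looks at need_resched. */
void toy_preempt_enable_no_resched(struct toy_cpu *c)
{
	c->preempt_count--;
}

/* Drops the count and honours a pending reschedule request. */
void toy_preempt_enable(struct toy_cpu *c)
{
	c->preempt_count--;
	if (c->preempt_count == 0 && c->need_resched) {
		c->need_resched = false;
		c->reschedules++;
	}
}
```

If need_resched is set while the count is raised, the no_resched variant
leaves the request pending until something else happens to reschedule,
which is the arbitrary delay the changelog describes; the plain variant
acts on it immediately.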

Re: Does vdso_install attempt to re-compile objects under root privilege?

2019-04-23 Thread Andy Lutomirski




> On Apr 23, 2019, at 4:38 PM, Linus Torvalds  
> wrote:
> 
>> On Tue, Apr 23, 2019 at 11:47 AM Andy Lutomirski  wrote:
>> 
>> Hmm.  I suppose an alternative would be for vdso_install to fail if
>> the vdso isn't built?
> 
> I absolutely abhor even the concept of building the kernel as root,
> and I think it should be actively disallowed. Our build system is
> good, but it's good as in "clever and complex" rather than necessarily
> good as in "very secure".
> 
> So anybody who builds the kernel as root is doing something seriously
> wrong, in my opinion.
> 
> That's partly exactly _because_ we have a lot of magical and very
> powerful build rules, and complicated implicit things going on.
> 
> For example, our dependencies aren't even about just the files in the
> kernel repository itself, we have clever things like "if the compiler
> has been updated and features or version changes, we'll automatically
> rebuild, because it's part of our clever build system checks".
> 
> But that is also part of the reason why I absolutely do *not* want any
> root-building to happen, because our build setup is simply way too
> clever.
> 
> If root builds stuff, you'll end up with root-owned generated
> subdirectories or various config files etc, and even if you don't have
> security issues, it can complicate the build later as a regular user.
> 
> I've had the build occasionally fail in odd ways, because some
> root-owned file was now no longer removable (usually it's the
> auto-generated header files in the directory, and the root-generated
> and owned directory is now not writable by the developer any more).
> And every time it happens, I shudder.
> 
> So all of that simply boils down to "root should not be running those
> complex rules for our config and dependency magic".
> 
> At the same time, "make install" obviously needs to be done as root.
> 
> All of which is why I opine that "make install" should never build
> anything at all, it should purely be used as a "install previously
> built files".
> 
> So yes, I'd much prefer just failing over trying to build as root (or
> even trying to figure out dependencies as root).
> 
>> What's the ideal outcome here?
> 
> I'd basically like the rule for "make install" to be that it never
> ever generates a single file in the build tree, so that there are
> never any root-owned (or root-overwritten) files there.
> 
> So "make install" should even avoid all dependency checking, for the
> simple reason that if you happen to do a system update between "make"
> and "make install", our smart dependencies should never say "oh, the
> compiler version has changed, so now I'll rebuild everything as root
> just because 'make install'".
> 
> So I think the ideal outcome is just "fail if you can't find the files
> to install".
> 
> 

To clarify, this is “fail if you can’t find the files to install, but don’t 
even try to check whether those files are up to date”, right?
