[PATCH 1/2] gpio: davinci: Fix the compiler warning with ARM64 config enabled

2019-06-04 Thread Keerthy
Fix the compiler warning with ARM64 config enabled
as the current mask assumes 32 bit by default.

Signed-off-by: Keerthy 
---
 drivers/gpio/gpio-davinci.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpio/gpio-davinci.c b/drivers/gpio/gpio-davinci.c
index 3bbf5804bd11..0977590eb996 100644
--- a/drivers/gpio/gpio-davinci.c
+++ b/drivers/gpio/gpio-davinci.c
@@ -297,7 +297,7 @@ static int davinci_gpio_probe(struct platform_device *pdev)
 static void gpio_irq_disable(struct irq_data *d)
 {
struct davinci_gpio_regs __iomem *g = irq2regs(d);
-   u32 mask = (u32) irq_data_get_irq_handler_data(d);
+   uintptr_t mask = (uintptr_t)irq_data_get_irq_handler_data(d);
 
writel_relaxed(mask, >clr_falling);
writel_relaxed(mask, >clr_rising);
@@ -306,7 +306,7 @@ static void gpio_irq_disable(struct irq_data *d)
 static void gpio_irq_enable(struct irq_data *d)
 {
struct davinci_gpio_regs __iomem *g = irq2regs(d);
-   u32 mask = (u32) irq_data_get_irq_handler_data(d);
+   uintptr_t mask = (uintptr_t)irq_data_get_irq_handler_data(d);
unsigned status = irqd_get_trigger_type(d);
 
status &= IRQ_TYPE_EDGE_FALLING | IRQ_TYPE_EDGE_RISING;
@@ -447,7 +447,7 @@ davinci_gpio_irq_map(struct irq_domain *d, unsigned int irq,
"davinci_gpio");
irq_set_irq_type(irq, IRQ_TYPE_NONE);
irq_set_chip_data(irq, (__force void *)g);
-   irq_set_handler_data(irq, (void *)__gpio_mask(hw));
+   irq_set_handler_data(irq, (void *)(uintptr_t)__gpio_mask(hw));
 
return 0;
 }
-- 
2.17.1
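
Note on the cast fix above: the following is a minimal userspace sketch, not
code from the driver, of why going through uintptr_t avoids the pointer/integer
size warning on 64-bit builds. The helper names gpio_mask(), mask_to_cookie()
and cookie_to_mask() are hypothetical stand-ins for __gpio_mask() and the
irq_set_handler_data()/irq_data_get_irq_handler_data() pair.

```c
#include <stdint.h>

/* Hypothetical stand-in for __gpio_mask(): one bit per GPIO in a 32-pin bank. */
static uint32_t gpio_mask(unsigned int gpio)
{
	return UINT32_C(1) << (gpio % 32);
}

/*
 * Store the mask in a pointer-sized cookie. Casting through uintptr_t
 * widens the 32-bit value to pointer width before it becomes a void *,
 * so the compiler sees no size mismatch on 64-bit targets.
 */
static void *mask_to_cookie(uint32_t mask)
{
	return (void *)(uintptr_t)mask;
}

/* Recover the mask; the narrowing back to 32 bits is explicit. */
static uint32_t cookie_to_mask(void *cookie)
{
	return (uint32_t)(uintptr_t)cookie;
}
```

A caller would pass mask_to_cookie(gpio_mask(n)) where a void * is expected and
call cookie_to_mask() on the way back; the round trip preserves the mask on
both 32-bit and 64-bit builds.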



[PATCH 2/2] gpio: Davinci: Add K3 dependencies

2019-06-04 Thread Keerthy
Add K3 dependencies to enable the driver on K3 platforms.

Signed-off-by: Keerthy 
---
 drivers/gpio/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpio/Kconfig b/drivers/gpio/Kconfig
index 62f3fe06cd2f..28dba62e2219 100644
--- a/drivers/gpio/Kconfig
+++ b/drivers/gpio/Kconfig
@@ -174,7 +174,7 @@ config GPIO_CLPS711X
 config GPIO_DAVINCI
bool "TI Davinci/Keystone GPIO support"
default y if ARCH_DAVINCI
-   depends on ARM && (ARCH_DAVINCI || ARCH_KEYSTONE)
+   depends on (ARM || ARM64) && (ARCH_DAVINCI || ARCH_KEYSTONE || ARCH_K3)
help
  Say yes here to enable GPIO support for TI Davinci/Keystone SoCs.
 
-- 
2.17.1



[PATCH 0/2] gpio: davinci: Add support for TI K3 AM6 platform

2019-06-04 Thread Keerthy
K3 AM6 platform has 2 instances of gpio banks on main domain
and 1 instance on wakeup domain. All are capable of generating
banked interrupts.

Keerthy (2):
  gpio: davinci: Fix the compiler warning with ARM64 config enabled
  gpio: Davinci: Add K3 Specific dependencies

 drivers/gpio/Kconfig        | 2 +-
 drivers/gpio/gpio-davinci.c | 6 +++---
 2 files changed, 4 insertions(+), 4 deletions(-)

-- 
2.17.1



[PATCH -next] net: ethernet: mediatek: fix mtk_eth_soc build errors & warnings

2019-06-04 Thread Randy Dunlap
From: Randy Dunlap 

Fix build errors in Mediatek mtk_eth_soc driver.

It looks like these 3 source files were meant to be linked together
since 2 of them are library-like functions,
but they are currently being built as 3 loadable modules.

Fixes these build errors:

  WARNING: modpost: missing MODULE_LICENSE() in drivers/net/ethernet/mediatek/mtk_eth_path.o
  WARNING: modpost: missing MODULE_LICENSE() in drivers/net/ethernet/mediatek/mtk_sgmii.o
  ERROR: "mtk_sgmii_init" [drivers/net/ethernet/mediatek/mtk_eth_soc.ko] undefined!
  ERROR: "mtk_setup_hw_path" [drivers/net/ethernet/mediatek/mtk_eth_soc.ko] undefined!
  ERROR: "mtk_sgmii_setup_mode_force" [drivers/net/ethernet/mediatek/mtk_eth_soc.ko] undefined!
  ERROR: "mtk_sgmii_setup_mode_an" [drivers/net/ethernet/mediatek/mtk_eth_soc.ko] undefined!
  ERROR: "mtk_w32" [drivers/net/ethernet/mediatek/mtk_eth_path.ko] undefined!
  ERROR: "mtk_r32" [drivers/net/ethernet/mediatek/mtk_eth_path.ko] undefined!

This changes the loadable module name from mtk_eth_soc to mtk_eth.
I didn't see a way to leave it as mtk_eth_soc.

Reported-by: kbuild test robot 
Signed-off-by: Randy Dunlap 
Cc: Sean Wang 
Cc: John Crispin 
Cc: Felix Fietkau 
Cc: Nelson Chang 
---
 drivers/net/ethernet/mediatek/Makefile |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- linux-next-20190604.orig/drivers/net/ethernet/mediatek/Makefile
+++ linux-next-20190604/drivers/net/ethernet/mediatek/Makefile
@@ -3,5 +3,5 @@
 # Makefile for the Mediatek SoCs built-in ethernet macs
 #
 
-obj-$(CONFIG_NET_MEDIATEK_SOC) += mtk_eth_soc.o mtk_sgmii.o \
- mtk_eth_path.o
+obj-$(CONFIG_NET_MEDIATEK_SOC) += mtk_eth.o
+mtk_eth-y := mtk_eth_soc.o mtk_sgmii.o mtk_eth_path.o




RE: [PATCH 0/3] (Qualcomm) UFS device reset support

2019-06-04 Thread Avri Altman
Hi,

> 
> On Tue, Jun 4, 2019 at 12:22 AM Bjorn Andersson
>  wrote:
> >
> > This series exposes the ufs_reset line as a gpio, adds support for ufshcd to
> > acquire and toggle this and then adds this to SDM845 MTP.
> >
> > Bjorn Andersson (3):
> >   pinctrl: qcom: sdm845: Expose ufs_reset as gpio
> >   scsi: ufs: Allow resetting the UFS device
> >   arm64: dts: qcom: sdm845-mtp: Specify UFS device-reset GPIO
> 
> Adding similar change as in sdm845-mtp to the not yet upstream
> blueline dts, I validated this allows my micron UFS pixel3 to boot.
> 
> Tested-by: John Stultz 
Maybe ufs_hba_variant_ops would be the proper place to add this?

Thanks,
Avri



> 
> thanks
> -john


Re: [PATCH v3 1/4] iommu: Add gfp parameter to iommu_ops::map

2019-06-04 Thread Christoph Hellwig
On Mon, May 06, 2019 at 07:52:03PM +0100, Tom Murphy via iommu wrote:
> We can remove the mutex lock from amd_iommu_map and amd_iommu_unmap.
> iommu_map doesn’t lock while mapping and so no two calls should touch
> the same iova range. The AMD driver already handles the page table page
> allocations without locks so we can safely remove the locks.

Btw, this really should be a separate patch.


[PATCH v6 1/5] usb: fsl: Set USB_EN bit to select ULPI phy

2019-06-04 Thread Yinbo Zhu
From: Nikhil Badola 

Set USB_EN bit to select ULPI phy for USB controller version 2.5

Signed-off-by: Nikhil Badola 
Signed-off-by: Yinbo Zhu 
---
 drivers/usb/host/ehci-fsl.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/usb/host/ehci-fsl.c b/drivers/usb/host/ehci-fsl.c
index e3d0c1c25160..38674b7aa51e 100644
--- a/drivers/usb/host/ehci-fsl.c
+++ b/drivers/usb/host/ehci-fsl.c
@@ -122,6 +122,12 @@ static int fsl_ehci_drv_probe(struct platform_device *pdev)
tmp |= 0x4;
iowrite32be(tmp, hcd->regs + FSL_SOC_USB_CTRL);
}
+
+   /* Set USB_EN bit to select ULPI phy for USB controller version 2.5 */
+   if (pdata->controller_ver == FSL_USB_VER_2_5 &&
+   pdata->phy_mode == FSL_USB2_PHY_ULPI)
+   iowrite32be(USB_CTRL_USB_EN, hcd->regs + FSL_SOC_USB_CTRL);
+
/*
 * Enable UTMI phy and program PTS field in UTMI mode before asserting
 * controller reset for USB Controller version 2.5
-- 
2.17.1



[PATCH v6 3/5] usb: linux/fsl_device: Add platform member has_fsl_erratum_a006918

2019-06-04 Thread Yinbo Zhu
This patch is to add member has_fsl_erratum_a006918 in platform data

Signed-off-by: Yinbo Zhu 
---
 include/linux/fsl_devices.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/linux/fsl_devices.h b/include/linux/fsl_devices.h
index cb2b46f57af3..5d231ce8709b 100644
--- a/include/linux/fsl_devices.h
+++ b/include/linux/fsl_devices.h
@@ -98,6 +98,7 @@ struct fsl_usb2_platform_data {
unsigned has_fsl_erratum_14:1;
unsigned has_fsl_erratum_a005275:1;
unsigned has_fsl_erratum_a005697:1;
+   unsigned has_fsl_erratum_a006918:1;
unsigned check_phy_clk_valid:1;
 
/* register save area for suspend/resume */
-- 
2.17.1



[PATCH v6 5/5] usb: fsl: Change string format for errata property

2019-06-04 Thread Yinbo Zhu
From: Nikhil Badola 

Remove USB errata checking code from driver. Applicability of erratum
is retrieved by reading corresponding property in device tree.
This property is written during device tree fixup.

Signed-off-by: Ramneek Mehresh 
Signed-off-by: Nikhil Badola 
Signed-off-by: Yinbo Zhu 
---
 drivers/usb/host/fsl-mph-dr-of.c | 7 ++-
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/drivers/usb/host/fsl-mph-dr-of.c b/drivers/usb/host/fsl-mph-dr-of.c
index 762b97600ab0..ae8f60f6e6a5 100644
--- a/drivers/usb/host/fsl-mph-dr-of.c
+++ b/drivers/usb/host/fsl-mph-dr-of.c
@@ -226,11 +226,8 @@ static int fsl_usb2_mph_dr_of_probe(struct platform_device *ofdev)
of_property_read_bool(np, "fsl,usb_erratum-a005697");
pdata->has_fsl_erratum_a006918 =
of_property_read_bool(np, "fsl,usb_erratum-a006918");
-
-   if (of_get_property(np, "fsl,usb_erratum_14", NULL))
-   pdata->has_fsl_erratum_14 = 1;
-   else
-   pdata->has_fsl_erratum_14 = 0;
+   pdata->has_fsl_erratum_14 =
+   of_property_read_bool(np, "fsl,usb_erratum-14");
 
/*
 * Determine whether phy_clk_valid needs to be checked
-- 
2.17.1



[PATCH v6 4/5] usb: host: Stops USB controller init if PLL fails to lock

2019-06-04 Thread Yinbo Zhu
From: Ramneek Mehresh 

USB erratum-A006918 workaround tries to start internal PHY inside
uboot (when PLL fails to lock). However, if the workaround also
fails, then USB initialization is also stopped inside Linux.
Erratum-A006918 workaround failure creates "fsl,erratum_a006918"
node in device-tree. Presence of this node in device-tree is
used to stop USB controller initialization in Linux.

Signed-off-by: Ramneek Mehresh 
Signed-off-by: Suresh Gupta 
Signed-off-by: Yinbo Zhu 
---
Change in v6:
add a "Fall through" comment

 drivers/usb/host/ehci-fsl.c  | 10 +-
 drivers/usb/host/fsl-mph-dr-of.c |  3 ++-
 2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/drivers/usb/host/ehci-fsl.c b/drivers/usb/host/ehci-fsl.c
index 8f3bf3efb038..86ae37086a74 100644
--- a/drivers/usb/host/ehci-fsl.c
+++ b/drivers/usb/host/ehci-fsl.c
@@ -234,8 +234,16 @@ static int ehci_fsl_setup_phy(struct usb_hcd *hcd,
break;
case FSL_USB2_PHY_UTMI_WIDE:
portsc |= PORT_PTS_PTW;
-   /* fall through */
case FSL_USB2_PHY_UTMI:
+   /* Presence of this node "has_fsl_erratum_a006918"
+* in device-tree is used to stop USB controller
+* initialization in Linux
+*/
+   if (pdata->has_fsl_erratum_a006918) {
+   dev_warn(dev, "USB PHY clock invalid\n");
+   return -EINVAL;
+   }
+
case FSL_USB2_PHY_UTMI_DUAL:
/* PHY_CLK_VALID bit is de-featured from all controller
 * versions below 2.4 and is to be checked only for
diff --git a/drivers/usb/host/fsl-mph-dr-of.c b/drivers/usb/host/fsl-mph-dr-of.c
index 4f8b8a08c914..762b97600ab0 100644
--- a/drivers/usb/host/fsl-mph-dr-of.c
+++ b/drivers/usb/host/fsl-mph-dr-of.c
@@ -224,13 +224,14 @@ static int fsl_usb2_mph_dr_of_probe(struct platform_device *ofdev)
of_property_read_bool(np, "fsl,usb-erratum-a005275");
pdata->has_fsl_erratum_a005697 =
of_property_read_bool(np, "fsl,usb_erratum-a005697");
+   pdata->has_fsl_erratum_a006918 =
+   of_property_read_bool(np, "fsl,usb_erratum-a006918");
 
if (of_get_property(np, "fsl,usb_erratum_14", NULL))
pdata->has_fsl_erratum_14 = 1;
else
pdata->has_fsl_erratum_14 = 0;
 
-
/*
 * Determine whether phy_clk_valid needs to be checked
 * by reading property in device tree
-- 
2.17.1



[PATCH v6 2/5] usb: phy: Workaround for USB erratum-A005728

2019-06-04 Thread Yinbo Zhu
From: Suresh Gupta 

The PHY_CLK_VALID bit for the UTMI PHY in USBDR is not set even
if the PHY is providing a valid clock. The workaround
involves resetting the PHY and checking the PHY_CLK_VALID bit
multiple times. If the PHY_CLK_VALID bit is still not set
after 5 retries, it is safe to declare that the PHY
clock is not available.
This erratum applies to USBDR versions lower than 2.4.

Signed-off-by: Suresh Gupta 
Signed-off-by: Yinbo Zhu 
---
Change in v6:
Indented the code in ehci-fsl.c 

 drivers/usb/host/ehci-fsl.c | 37 ++---
 drivers/usb/host/ehci-fsl.h |  3 +++
 2 files changed, 29 insertions(+), 11 deletions(-)

diff --git a/drivers/usb/host/ehci-fsl.c b/drivers/usb/host/ehci-fsl.c
index 38674b7aa51e..8f3bf3efb038 100644
--- a/drivers/usb/host/ehci-fsl.c
+++ b/drivers/usb/host/ehci-fsl.c
@@ -183,6 +183,17 @@ static int fsl_ehci_drv_probe(struct platform_device *pdev)
return retval;
 }
 
+static bool usb_phy_clk_valid(struct usb_hcd *hcd)
+{
+   void __iomem *non_ehci = hcd->regs;
+   bool ret = true;
+
+   if (!(ioread32be(non_ehci + FSL_SOC_USB_CTRL) & PHY_CLK_VALID))
+   ret = false;
+
+   return ret;
+}
+
 static int ehci_fsl_setup_phy(struct usb_hcd *hcd,
   enum fsl_usb2_phy_modes phy_mode,
   unsigned int port_offset)
@@ -226,6 +237,16 @@ static int ehci_fsl_setup_phy(struct usb_hcd *hcd,
/* fall through */
case FSL_USB2_PHY_UTMI:
case FSL_USB2_PHY_UTMI_DUAL:
+   /* PHY_CLK_VALID bit is de-featured from all controller
+* versions below 2.4 and is to be checked only for
+* internal UTMI phy
+*/
+   if (pdata->controller_ver > FSL_USB_VER_2_4 &&
+   pdata->have_sysif_regs && !usb_phy_clk_valid(hcd)) {
+   dev_err(dev, "USB PHY clock invalid\n");
+   return -EINVAL;
+   }
+
if (pdata->have_sysif_regs && pdata->controller_ver) {
/* controller version 1.6 or above */
tmp = ioread32be(non_ehci + FSL_SOC_USB_CTRL);
@@ -249,17 +270,11 @@ static int ehci_fsl_setup_phy(struct usb_hcd *hcd,
break;
}
 
-   /*
-* check PHY_CLK_VALID to determine phy clock presence before writing
-* to portsc
-*/
-   if (pdata->check_phy_clk_valid) {
-   if (!(ioread32be(non_ehci + FSL_SOC_USB_CTRL) &
-   PHY_CLK_VALID)) {
-   dev_warn(hcd->self.controller,
-"USB PHY clock invalid\n");
-   return -EINVAL;
-   }
+   if (pdata->have_sysif_regs &&
+   pdata->controller_ver > FSL_USB_VER_1_6 &&
+   !usb_phy_clk_valid(hcd)) {
+   dev_warn(hcd->self.controller, "USB PHY clock invalid\n");
+   return -EINVAL;
}
 
ehci_writel(ehci, portsc, >regs->port_status[port_offset]);
diff --git a/drivers/usb/host/ehci-fsl.h b/drivers/usb/host/ehci-fsl.h
index cbc422032e50..9d18c6e6ab27 100644
--- a/drivers/usb/host/ehci-fsl.h
+++ b/drivers/usb/host/ehci-fsl.h
@@ -50,4 +50,7 @@
 #define UTMI_PHY_EN (1<<9)
 #define ULPI_PHY_CLK_SEL (1<<10)
 #define PHY_CLK_VALID  (1<<17)
+
+/* Retry count for checking UTMI PHY CLK validity */
+#define UTMI_PHY_CLK_VALID_CHK_RETRY 5
 #endif /* _EHCI_FSL_H */
-- 
2.17.1
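
Note on the retry described in the commit message above: the hunks shown here
add the UTMI_PHY_CLK_VALID_CHK_RETRY constant but not a loop that uses it, so
the following is only a sketch of what "reset the PHY and check PHY_CLK_VALID
up to 5 times" could look like. The callback types, phy_clk_valid_with_retry()
and the fake_phy simulation are all hypothetical and exist only so that the
logic can run without hardware.

```c
#include <stdbool.h>
#include <stdint.h>

#define PHY_CLK_VALID			(1u << 17)
/* Retry count for checking UTMI PHY CLK validity (value from the patch). */
#define UTMI_PHY_CLK_VALID_CHK_RETRY	5

/* Hypothetical accessors standing in for ioread32be() and a PHY reset. */
typedef uint32_t (*read_ctrl_fn)(void *ctx);
typedef void (*reset_phy_fn)(void *ctx);

/*
 * Check PHY_CLK_VALID; if it is clear, reset the PHY and try again,
 * giving up after UTMI_PHY_CLK_VALID_CHK_RETRY attempts.
 */
static bool phy_clk_valid_with_retry(read_ctrl_fn read_ctrl,
				     reset_phy_fn reset_phy, void *ctx)
{
	int retry;

	for (retry = 0; retry < UTMI_PHY_CLK_VALID_CHK_RETRY; retry++) {
		if (read_ctrl(ctx) & PHY_CLK_VALID)
			return true;
		reset_phy(ctx);
	}
	return false;
}

/* Simulated PHY: reports a valid clock once it has been reset often enough. */
struct fake_phy {
	int resets;
	int resets_needed;
};

static uint32_t fake_read_ctrl(void *ctx)
{
	struct fake_phy *phy = ctx;

	return phy->resets >= phy->resets_needed ? PHY_CLK_VALID : 0;
}

static void fake_reset_phy(void *ctx)
{
	struct fake_phy *phy = ctx;

	phy->resets++;
}
```

With a simulated PHY that needs two resets the helper returns true after the
third read; with one that never recovers it returns false after five resets,
at which point a driver would report the clock as unavailable.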



[PATCH] nilfs2: do not use unexported cpu_to_le32()/le32_to_cpu() in uapi header

2019-06-04 Thread Masahiro Yamada
cpu_to_le32/le32_to_cpu is defined in include/linux/byteorder/generic.h,
which is not exported to user-space.

UAPI headers must use the ones prefixed with double-underscore.

Detected by compile-testing exported headers:

./usr/include/linux/nilfs2_ondisk.h: In function ‘nilfs_checkpoint_set_snapshot’:
./usr/include/linux/nilfs2_ondisk.h:536:17: error: implicit declaration of function ‘cpu_to_le32’ [-Werror=implicit-function-declaration]
  cp->cp_flags = cpu_to_le32(le32_to_cpu(cp->cp_flags) |  \
 ^
./usr/include/linux/nilfs2_ondisk.h:552:1: note: in expansion of macro ‘NILFS_CHECKPOINT_FNS’
 NILFS_CHECKPOINT_FNS(SNAPSHOT, snapshot)
 ^~~~
./usr/include/linux/nilfs2_ondisk.h:536:29: error: implicit declaration of function ‘le32_to_cpu’ [-Werror=implicit-function-declaration]
  cp->cp_flags = cpu_to_le32(le32_to_cpu(cp->cp_flags) |  \
 ^
./usr/include/linux/nilfs2_ondisk.h:552:1: note: in expansion of macro ‘NILFS_CHECKPOINT_FNS’
 NILFS_CHECKPOINT_FNS(SNAPSHOT, snapshot)
 ^~~~
./usr/include/linux/nilfs2_ondisk.h: In function ‘nilfs_segment_usage_set_clean’:
./usr/include/linux/nilfs2_ondisk.h:622:19: error: implicit declaration of function ‘cpu_to_le64’ [-Werror=implicit-function-declaration]
  su->su_lastmod = cpu_to_le64(0);
   ^~~

Signed-off-by: Masahiro Yamada 
---

 include/uapi/linux/nilfs2_ondisk.h | 24 
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/include/uapi/linux/nilfs2_ondisk.h b/include/uapi/linux/nilfs2_ondisk.h
index a7e66ab11d1d..c23f91ae5fe8 100644
--- a/include/uapi/linux/nilfs2_ondisk.h
+++ b/include/uapi/linux/nilfs2_ondisk.h
@@ -29,7 +29,7 @@
 
 #include 
 #include 
-
+#include 
 
 #define NILFS_INODE_BMAP_SIZE  7
 
@@ -533,19 +533,19 @@ enum {
 static inline void \
 nilfs_checkpoint_set_##name(struct nilfs_checkpoint *cp)   \
 {  \
-   cp->cp_flags = cpu_to_le32(le32_to_cpu(cp->cp_flags) |  \
-  (1UL << NILFS_CHECKPOINT_##flag));   \
+   cp->cp_flags = __cpu_to_le32(__le32_to_cpu(cp->cp_flags) |  \
+(1UL << NILFS_CHECKPOINT_##flag)); \
 }  \
 static inline void \
 nilfs_checkpoint_clear_##name(struct nilfs_checkpoint *cp) \
 {  \
-   cp->cp_flags = cpu_to_le32(le32_to_cpu(cp->cp_flags) &  \
+   cp->cp_flags = __cpu_to_le32(__le32_to_cpu(cp->cp_flags) &  \
   ~(1UL << NILFS_CHECKPOINT_##flag));  \
 }  \
 static inline int  \
 nilfs_checkpoint_##name(const struct nilfs_checkpoint *cp) \
 {  \
-   return !!(le32_to_cpu(cp->cp_flags) &   \
+   return !!(__le32_to_cpu(cp->cp_flags) & \
  (1UL << NILFS_CHECKPOINT_##flag));\
 }
 
@@ -595,20 +595,20 @@ enum {
 static inline void \
 nilfs_segment_usage_set_##name(struct nilfs_segment_usage *su) \
 {  \
-   su->su_flags = cpu_to_le32(le32_to_cpu(su->su_flags) |  \
+   su->su_flags = __cpu_to_le32(__le32_to_cpu(su->su_flags) |  \
   (1UL << NILFS_SEGMENT_USAGE_##flag));\
 }  \
 static inline void \
 nilfs_segment_usage_clear_##name(struct nilfs_segment_usage *su)   \
 {  \
su->su_flags =  \
-   cpu_to_le32(le32_to_cpu(su->su_flags) & \
+   __cpu_to_le32(__le32_to_cpu(su->su_flags) & \
~(1UL << NILFS_SEGMENT_USAGE_##flag));  \
 }  \
 static inline int  \
 nilfs_segment_usage_##name(const struct nilfs_segment_usage *su)   \
 {  \
-   return !!(le32_to_cpu(su->su_flags) &   \
+   return !!(__le32_to_cpu(su->su_flags) & \
  (1UL << NILFS_SEGMENT_USAGE_##flag)); \
 }
 
@@ 

Re: [PATCH] arm64: dts: qcom: sdm845-mtp: Add Truly display

2019-06-04 Thread Vivek Gautam
On Tue, May 14, 2019 at 2:39 AM Bjorn Andersson
 wrote:
>
> Bring in the Truly display and enable the DSI channels to make the
> mdss/gpu probe, even though we're lacking LABIB, preventing us from
> seeing anything on the screen.
>
> Signed-off-by: Bjorn Andersson 
> ---

Looks good to me, and it works well too with a WIP lab-ibb driver change.

Reviewed-by: Vivek Gautam 
Tested-by: Vivek Gautam 

>  arch/arm64/boot/dts/qcom/sdm845-mtp.dts | 79 +
>  1 file changed, 79 insertions(+)
>
> diff --git a/arch/arm64/boot/dts/qcom/sdm845-mtp.dts 
> b/arch/arm64/boot/dts/qcom/sdm845-mtp.dts
> index 02b8357c8ce8..83198a19ff57 100644
> --- a/arch/arm64/boot/dts/qcom/sdm845-mtp.dts
> +++ b/arch/arm64/boot/dts/qcom/sdm845-mtp.dts
> @@ -352,6 +352,77 @@
> status = "okay";
>  };
>
> + {
> +   status = "okay";
> +   vdda-supply = <_mipi_dsi0_1p2>;
> +
> +   qcom,dual-dsi-mode;
> +   qcom,master-dsi;
> +
> +   ports {
> +   port@1 {
> +   endpoint {
> +   remote-endpoint = <_in_0>;
> +   data-lanes = <0 1 2 3>;
> +   };
> +   };
> +   };
> +
> +   panel@0 {
> +   compatible = "truly,nt35597-2K-display";
> +   reg = <0>;
> +   vdda-supply = <_l14a_1p88>;
> +
> +   reset-gpios = < 6 GPIO_ACTIVE_LOW>;
> +   mode-gpios = < 52 GPIO_ACTIVE_HIGH>;
> +
> +   ports {
> +   #address-cells = <1>;
> +   #size-cells = <0>;
> +
> +   port@0 {
> +   reg = <0>;
> +   truly_in_0: endpoint {
> +   remote-endpoint = <_out>;
> +   };
> +   };
> +
> +   port@1 {
> +   reg = <1>;
> +   truly_in_1: endpoint {
> +   remote-endpoint = <_out>;
> +   };
> +   };
> +   };
> +   };
> +};
> +
> +_phy {
> +   status = "okay";
> +   vdds-supply = <_mipi_dsi0_pll>;
> +};
> +
> + {
> +   status = "okay";
> +   vdda-supply = <_mipi_dsi1_1p2>;
> +
> +   qcom,dual-dsi-mode;
> +
> +   ports {
> +   port@1 {
> +   endpoint {
> +   remote-endpoint = <_in_1>;
> +   data-lanes = <0 1 2 3>;
> +   };
> +   };
> +   };
> +};
> +
> +_phy {
> +   status = "okay";
> +   vdds-supply = <_mipi_dsi1_pll>;
> +};
> +
>   {
> protected-clocks = ,
>,
> @@ -365,6 +436,14 @@
> clock-frequency = <40>;
>  };
>
> + {
> +   status = "okay";
> +};
> +
> +_mdp {
> +   status = "okay";
> +};
> +
>  _id_1 {
> status = "okay";
>  };
> --
> 2.18.0
>


-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation


[PATCH] lib/test_stackinit: Handle Clang auto-initialization pattern

2019-06-04 Thread Kees Cook
While the gcc plugin for automatic stack variable initialization (i.e.
CONFIG_GCC_PLUGIN_STRUCTLEAK_BYREF_ALL) performs initialization with
0x00 bytes, the Clang automatic stack variable initialization (i.e.
CONFIG_INIT_STACK_ALL) uses various type-specific patterns that are
typically 0xAA. Therefore the stackinit selftest has been fixed to check
that bytes are no longer the test fill pattern of 0xFF (instead of looking
for bytes that have become 0x00). This retains the test coverage for the
0x00 pattern of the gcc plugin while adding coverage for the mostly 0xAA
pattern of Clang.

Signed-off-by: Kees Cook 
---
 lib/test_stackinit.c | 21 +++--
 1 file changed, 15 insertions(+), 6 deletions(-)

diff --git a/lib/test_stackinit.c b/lib/test_stackinit.c
index e97dc54b4fdf..2d7d257a430e 100644
--- a/lib/test_stackinit.c
+++ b/lib/test_stackinit.c
@@ -12,7 +12,7 @@
 
 /* Exfiltration buffer. */
 #define MAX_VAR_SIZE   128
-static char check_buf[MAX_VAR_SIZE];
+static u8 check_buf[MAX_VAR_SIZE];
 
 /* Character array to trigger stack protector in all functions. */
 #define VAR_BUFFER  32
@@ -106,9 +106,18 @@ static noinline __init int test_ ## name (void)
\
\
/* Fill clone type with zero for per-field init. */ \
memset(, 0x00, sizeof(zero));  \
+   /* Clear entire check buffer for 0xFF overlap test. */  \
+   memset(check_buf, 0x00, sizeof(check_buf)); \
/* Fill stack with 0xFF. */ \
ignored = leaf_ ##name((unsigned long), 1,  \
FETCH_ARG_ ## which(zero)); \
+   /* Verify all bytes overwritten with 0xFF. */   \
+   for (sum = 0, i = 0; i < target_size; i++)  \
+   sum += (check_buf[i] != 0xFF);  \
+   if (sum) {  \
+   pr_err(#name ": leaf fill was not 0xFF!?\n");   \
+   return 1;   \
+   }   \
/* Clear entire check buffer for later bit tests. */\
memset(check_buf, 0x00, sizeof(check_buf)); \
/* Extract stack-defined variable contents. */  \
@@ -126,9 +135,9 @@ static noinline __init int test_ ## name (void) 
\
return 1;   \
}   \
\
-   /* Look for any set bits in the check region. */\
-   for (i = 0; i < sizeof(check_buf); i++) \
-   sum += (check_buf[i] != 0); \
+   /* Look for any bytes still 0xFF in check region. */\
+   for (sum = 0, i = 0; i < target_size; i++)  \
+   sum += (check_buf[i] == 0xFF);  \
\
if (sum == 0)   \
pr_info(#name " ok\n"); \
@@ -162,13 +171,13 @@ static noinline __init int leaf_ ## name(unsigned long 
sp,\
 * Keep this buffer around to make sure we've got a \
 * stack frame of SOME kind...  \
 */ \
-   memset(buf, (char)(sp && 0xff), sizeof(buf));   \
+   memset(buf, (char)(sp & 0xff), sizeof(buf));\
/* Fill variable with 0xFF. */  \
if (fill) { \
fill_start =   \
fill_size = sizeof(var);\
memset(fill_start,  \
-  (char)((sp && 0xff) | forced_mask),  \
+  (char)((sp & 0xff) | forced_mask),   \
   fill_size);  \
}   \
\
-- 
2.17.1


-- 
Kees Cook
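
Note on the last two hunks above: the switch from `&&` to `&` is a real
behavioural fix, not a style change. The sketch below isolates the two
expressions in plain C; fill_logical() and fill_bitwise() are hypothetical
names, and the sample stack pointer value is made up.

```c
/*
 * Old expression: "sp && 0xff" is a logical AND, so it collapses to
 * 0 or 1 regardless of the low byte of sp.
 */
static unsigned char fill_logical(unsigned long sp, unsigned char forced_mask)
{
	return (unsigned char)((sp && 0xff) | forced_mask);
}

/* Fixed expression: "sp & 0xff" keeps the low byte of sp. */
static unsigned char fill_bitwise(unsigned long sp, unsigned char forced_mask)
{
	return (unsigned char)((sp & 0xff) | forced_mask);
}
```

For sp = 0x7ffc1234 and a zero mask the logical form yields 1 while the bitwise
form yields 0x34. When the mask is 0xff, which the "Fill variable with 0xFF"
comment suggests is the case for the variable fill, both forms give 0xff, so
the difference shows up mainly in the memset() of buf.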


Re: [PATCH] media: do not use C++ style comments in uapi headers

2019-06-04 Thread Joe Perches
On Wed, 2019-06-05 at 07:10 +0200, Greg KH wrote:
> On Wed, Jun 05, 2019 at 01:10:41PM +0900, Masahiro Yamada wrote:
> > On Wed, Jun 5, 2019 at 3:21 AM Arnd Bergmann  wrote:
[]
> > This means we cannot reliably use uint{8,16,32,64}_t in UAPI headers.
> 
> We should not be doing that as they are in the userspace "namespace" of
> variables, not in the kernel namespace.  We've been over this many times
> in the past :(

Just not very successfully...

$ git grep -w -P 'u?_?int(?:8|16|32|64)_t' include/uapi | wc -l
342

$ git grep -w -P --name-only 'u?_?int(?:8|16|32|64)_t' include/uapi | wc -l
13

Documentation helps a bit, checkpatch helps as well.
Maintainer knowledge and vigilance probably helps the most.



Re: [PATCH] media: do not use C++ style comments in uapi headers

2019-06-04 Thread Greg KH
On Wed, Jun 05, 2019 at 01:10:41PM +0900, Masahiro Yamada wrote:
> On Wed, Jun 5, 2019 at 3:21 AM Arnd Bergmann  wrote:
> > > > >
> > > > > There are two ways to define fixed-width type.
> > > > >
> > > > > [1] #include , __u8, __u16, __u32, __u64
> > > > >
> > > > >   vs
> > > > >
> > > > > [2] #include , uint8_t, uint16_t, uint32_t, uint64_t
> > > > >
> > > > >
> > > > > Both are used in UAPI headers.
> > > > > IIRC,  was standardized by C99.
> > > > >
> > > > > So, we have already relied on C99 in user-space too.
> >
> > A related problem is that using the stdint.h types requires
> > including stdint.h first, but the C library requires that including
> > one standard header does not include another one recursively.
> >
> > So if sys/socket.h includes linux/socket.h, that must not include
> > stdint.h or any other header file that does so.
> 
> 
> This means we cannot reliably use uint{8,16,32,64}_t in UAPI headers.

We should not be doing that as they are in the userspace "namespace" of
variables, not in the kernel namespace.  We've been over this many times
in the past :(

> [1] If we include  from linux/foo.h
> 
> If sys/foo.h includes  and ,
> it violates the C library requirement.
> 
> 
> [2] If we do not include  from linux/foo.h
> 
> If sys/foo.h includes , but not ,
> we get 'unknown type name' errors.

We need to just use the proper __u{8,16,32,64} variable types instead,
that is exactly what they are there for.

thanks,

greg k-h


Re: [PATCH AUTOSEL 5.1 06/60] driver core: platform: Fix the usage of platform device name(pdev->name)

2019-06-04 Thread Greg Kroah-Hartman
On Tue, Jun 04, 2019 at 07:21:16PM -0400, Sasha Levin wrote:
> From: Venkata Narendra Kumar Gutta 
> 
> [ Upstream commit edb16da34b084c66763f29bee42b4e6bb33c3d66 ]
> 
> Platform core is using pdev->name as the platform device name to do
> the binding of the devices with the drivers. But, when the platform
> driver overrides the platform device name with dev_set_name(),
> the pdev->name is pointing to a location which is freed and becomes
> an invalid parameter to do the binding match.
> 
> use-after-free instance:
> 
> [   33.325013] BUG: KASAN: use-after-free in strcmp+0x8c/0xb0
> [   33.330646] Read of size 1 at addr ffc10beae600 by task modprobe
> [   33.339068] CPU: 5 PID: 518 Comm: modprobe Tainted:
>   G S  W  O  4.19.30+ #3
> [   33.346835] Hardware name: MTP (DT)
> [   33.350419] Call trace:
> [   33.352941]  dump_backtrace+0x0/0x3b8
> [   33.356713]  show_stack+0x24/0x30
> [   33.360119]  dump_stack+0x160/0x1d8
> [   33.363709]  print_address_description+0x84/0x2e0
> [   33.368549]  kasan_report+0x26c/0x2d0
> [   33.372322]  __asan_report_load1_noabort+0x2c/0x38
> [   33.377248]  strcmp+0x8c/0xb0
> [   33.380306]  platform_match+0x70/0x1f8
> [   33.384168]  __driver_attach+0x78/0x3a0
> [   33.388111]  bus_for_each_dev+0x13c/0x1b8
> [   33.392237]  driver_attach+0x4c/0x58
> [   33.395910]  bus_add_driver+0x350/0x560
> [   33.399854]  driver_register+0x23c/0x328
> [   33.403886]  __platform_driver_register+0xd0/0xe0
> 
> So, use dev_name(>dev), which fetches the platform device name from
> the kobject(dev->kobj->name) of the device instead of the pdev->name.
> 
> Signed-off-by: Venkata Narendra Kumar Gutta 
> Signed-off-by: Greg Kroah-Hartman 
> Signed-off-by: Sasha Levin 
> ---
>  drivers/base/platform.c | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)

Please drop this from everywhere as it was reverted from Linus's tree
because it causes big problems.

thanks,

greg k-h


Re: [PATCH] wcd9335: fix a incorrect use of kstrndup()

2019-06-04 Thread Jiri Slaby
On 29. 05. 19, 3:53, Gen Zhang wrote:
> In wcd9335_codec_enable_dec(), 'widget_name' is allocated by kstrndup().
> However, according to doc: "Note: Use kmemdup_nul() instead if the size
> is known exactly."

Except the size is not known exactly. It is at most 15, not 15. Right?

> So we should use kmemdup_nul() here instead of
> kstrndup().
> 
> Signed-off-by: Gen Zhang 
> ---
> diff --git a/sound/soc/codecs/wcd9335.c b/sound/soc/codecs/wcd9335.c
> index a04a7ce..85737fe 100644
> --- a/sound/soc/codecs/wcd9335.c
> +++ b/sound/soc/codecs/wcd9335.c
> @@ -2734,7 +2734,7 @@ static int wcd9335_codec_enable_dec(struct 
> snd_soc_dapm_widget *w,
>   char *dec;
>   u8 hpf_coff_freq;
>  
> - widget_name = kstrndup(w->name, 15, GFP_KERNEL);
> + widget_name = kmemdup_nul(w->name, 15, GFP_KERNEL);
>   if (!widget_name)
>   return -ENOMEM;
>  

thanks,
-- 
js
suse labs
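
Note on the objection above: a userspace sketch of the difference between the
two helpers, assuming the documented semantics (kstrndup() stops at the first
NUL within the limit, kmemdup_nul() copies exactly the given number of bytes).
dup_strn() and dup_mem_nul() are hypothetical userspace analogues, not the
kernel functions.

```c
#include <stdlib.h>
#include <string.h>

/* Analogue of kstrndup(): copy at most max bytes, stopping at the first NUL. */
static char *dup_strn(const char *s, size_t max)
{
	size_t len = 0;
	char *p;

	while (len < max && s[len] != '\0')
		len++;
	p = malloc(len + 1);
	if (!p)
		return NULL;
	memcpy(p, s, len);
	p[len] = '\0';
	return p;
}

/* Analogue of kmemdup_nul(): copy exactly len bytes, then append a NUL. */
static char *dup_mem_nul(const char *s, size_t len)
{
	char *p = malloc(len + 1);

	if (!p)
		return NULL;
	memcpy(p, s, len);	/* reads len bytes even if s is shorter */
	p[len] = '\0';
	return p;
}
```

With a 4-character name in a 16-byte buffer, dup_strn(name, 15) copies 4 bytes,
while dup_mem_nul(name, 15) copies all 15, including whatever follows the
terminator. If the source string were shorter than 15 bytes and not padded,
that read would run past the end of the string, which is why "at most 15" does
not qualify as a size that is known exactly.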


Re: [PATCH] arm64: dts: sdm845: Add iommus property to qup1

2019-06-04 Thread Vivek Gautam
On Wed, Jun 5, 2019 at 4:16 AM Stephen Boyd  wrote:
>
> Quoting Bjorn Andersson (2019-06-04 15:37:00)
> > On Tue 04 Jun 15:29 PDT 2019, Stephen Boyd wrote:
> >
> > > The SMMU that sits in front of the QUP needs to be programmed properly
> > > so that the i2c geni driver can allocate DMA descriptors. Failure to do
> > > this leads to faults when using devices such as an i2c touchscreen where
> > > the transaction is larger than 32 bytes and we use a DMA buffer.
> > >
> >
> > I'm pretty sure I've run into this problem, but before we marked the
> > smmu bypass_disable and as such didn't get the fault, thanks.
> >
> > >  arm-smmu 1500.iommu: Unexpected global fault, this could be serious
> > >  arm-smmu 1500.iommu: GFSR 0x0002, GFSYNR0 0x0002, 
> > > GFSYNR1 0x06c0, GFSYNR2 0x
> > >
> > > Add the right SID and mask so this works.
> > >
> > > Cc: Sibi Sankar 
> > > Signed-off-by: Stephen Boyd 
> > > ---
> > >  arch/arm64/boot/dts/qcom/sdm845.dtsi | 1 +
> > >  1 file changed, 1 insertion(+)
> > >
> > > diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi 
> > > b/arch/arm64/boot/dts/qcom/sdm845.dtsi
> > > index fcb93300ca62..2e57e861e17c 100644
> > > --- a/arch/arm64/boot/dts/qcom/sdm845.dtsi
> > > +++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi
> > > @@ -900,6 +900,7 @@
> > >   #address-cells = <2>;
> > >   #size-cells = <2>;
> > >   ranges;
> > > + iommus = <_smmu 0x6c0 0x3>;
> >
> > According to the docs this stream belongs to TZ, the HLOS stream should
> > be 0x6c3.
>
> Aye, I saw this line in the downstream kernel but it doesn't work for
> me. If I specify <_smmu 0x6c3 0x0> it still blows up. I wonder if
> my firmware perhaps is missing some initialization here to make the QUP
> operate in HLOS mode? Otherwise, I thought that the 0x3 at the end was
> the mask and so it should be split off to the second cell in the DT
> specifier but that seemed a little weird.

Two things here -
0x6c0 - TZ SID. Do you see the above fault on MTP sdm845 devices?
0x6c3/0x6c6 - HLOS SIDs.

Cheza will throw faults for anything that is programmed by TZ on MTP,
as all of that should be handled in HLOS. The firmware of all these
peripherals assumes that the SID reservation is done (whether in TZ or HLOS).

I am inclined to move the iommus property for all 'TZ' SIDs to the board dts
files. MTP wouldn't need those SIDs. So, the SoC-level dtsi will have just the
HLOS SIDs.

P.S.
As you rightly said, the second cell in the iommus property is the mask, so
that the iommu is able to reserve all the SIDs that are covered by the
starting SID and the mask.


Best regards
Vivek
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation
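
Note on the SID/mask pair discussed above: a minimal sketch of stream matching,
assuming the usual ARM SMMU stream-match semantics in which bits set in the
mask are ignored when an incoming stream ID is compared against the SID.
sid_matches() is a hypothetical helper, not a kernel function.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * An incoming stream ID matches an entry when it agrees with the entry's
 * SID on every bit that is NOT set in the mask.
 */
static bool sid_matches(uint16_t id, uint16_t sid, uint16_t mask)
{
	return ((id ^ sid) & (uint16_t)~mask) == 0;
}
```

Under that assumption, an entry of 0x6c0 with mask 0x3 covers stream IDs 0x6c0
through 0x6c3, so it includes both the TZ SID 0x6c0 and the HLOS SID 0x6c3 but
not 0x6c6. An entry of 0x6c3 with mask 0x0 matches only 0x6c3.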


[PATCH] phy: renesas: rcar-gen3-usb2: fix imbalance powered flag

2019-06-04 Thread Yoshihiro Shimoda
The powered flag should be set for any other phys anyway. Otherwise,
after the device tree for the usb phy was revised, the following
warning happens during a second system suspend. So, this patch fixes
the issue.

[   56.026531] unbalanced disables for USB20_VBUS0
[   56.031108] WARNING: CPU: 3 PID: 513 at drivers/regulator/core.c:2593 _regulator_disable+0xe0/0x1c0
[   56.040146] Modules linked in: rcar_du_drm rcar_lvds drm_kms_helper drm drm_panel_orientation_quirks vsp1 videobuf2_vmalloc videobuf2_dma_contig videobuf2_memops videobuf2_v4l2 videobuf2_common videodev snd_soc_rcar renesas_usbhs snd_soc_audio_graph_card media snd_soc_simple_card_utils crct10dif_ce renesas_usb3 snd_soc_ak4613 rcar_fcp pwm_rcar usb_dmac phy_rcar_gen3_usb3 pwm_bl ipv6
[   56.074047] CPU: 3 PID: 513 Comm: kworker/u16:19 Not tainted 5.2.0-rc3-1-g5f20a19 #6
[   56.082129] Hardware name: Renesas Salvator-X board based on r8a7795 ES2.0+ (DT)
[   56.089524] Workqueue: events_unbound async_run_entry_fn
[   56.094832] pstate: 4005 (nZcv daif -PAN -UAO)
[   56.099617] pc : _regulator_disable+0xe0/0x1c0
[   56.104054] lr : _regulator_disable+0xe0/0x1c0
[   56.108489] sp : 121c3ae0
[   56.111796] x29: 121c3ae0 x28: 
[   56.117102] x27:  x26: 10fe0e60
[   56.122407] x25: 0002 x24: 0001
[   56.127712] x23: 0002 x22: 8006f99d4000
[   56.133017] x21: 8006f99cc000 x20: 8006f9846800
[   56.138322] x19: 8006f9846800 x18: 
[   56.143626] x17:  x16: 
[   56.148931] x15: 112f96c8 x14: 921c37f7
[   56.154235] x13: 121c3805 x12: 11312000
[   56.159540] x11: 05f5e0ff x10: 112f9f20
[   56.164844] x9 : 112d3018 x8 : 01ad
[   56.170149] x7 : ffcc x6 : 8006ff768180
[   56.175453] x5 : 8006ff768180 x4 : 
[   56.180758] x3 : 8006ff76ef10 x2 : 8006ff768180
[   56.186062] x1 : 3d2eccbaead8fb00 x0 : 
[   56.191367] Call trace:
[   56.193808]  _regulator_disable+0xe0/0x1c0
[   56.197899]  regulator_disable+0x40/0x78
[   56.201820]  rcar_gen3_phy_usb2_power_off+0x3c/0x50
[   56.206692]  phy_power_off+0x48/0xd8
[   56.210263]  usb_phy_roothub_power_off+0x30/0x50
[   56.214873]  usb_phy_roothub_suspend+0x1c/0x50
[   56.219311]  hcd_bus_suspend+0x13c/0x168
[   56.223226]  generic_suspend+0x4c/0x58
[   56.226969]  usb_suspend_both+0x1ac/0x238
[   56.230972]  usb_suspend+0xcc/0x170
[   56.234455]  usb_dev_suspend+0x10/0x18
[   56.238199]  dpm_run_callback.isra.6+0x20/0x68
[   56.242635]  __device_suspend+0x110/0x308
[   56.246637]  async_suspend+0x24/0xa8
[   56.250205]  async_run_entry_fn+0x40/0xf8
[   56.254210]  process_one_work+0x1e0/0x320
[   56.258211]  worker_thread+0x40/0x450
[   56.261867]  kthread+0x124/0x128
[   56.265094]  ret_from_fork+0x10/0x18
[   56.268661] ---[ end trace 86d7ec5de5c517af ]---
[   56.273290] phy phy-ee080200.usb-phy.10: phy poweroff failed --> -5

Reported-by: Geert Uytterhoeven 
Fixes: 549b6b55b005 ("phy: renesas: rcar-gen3-usb2: enable/disable independent irqs")
Signed-off-by: Yoshihiro Shimoda 
---
 drivers/phy/renesas/phy-rcar-gen3-usb2.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/phy/renesas/phy-rcar-gen3-usb2.c b/drivers/phy/renesas/phy-rcar-gen3-usb2.c
index 1322185..dd2d7290 100644
--- a/drivers/phy/renesas/phy-rcar-gen3-usb2.c
+++ b/drivers/phy/renesas/phy-rcar-gen3-usb2.c
@@ -437,15 +437,15 @@ static int rcar_gen3_phy_usb2_power_on(struct phy *p)
struct rcar_gen3_chan *channel = rphy->ch;
void __iomem *usb2_base = channel->base;
u32 val;
-   int ret;
+   int ret = 0;
 
if (!rcar_gen3_are_all_rphys_power_off(channel))
-   return 0;
+   goto out;
 
if (channel->vbus) {
ret = regulator_enable(channel->vbus);
if (ret)
-   return ret;
+   goto out;
}
 
val = readl(usb2_base + USB2_USBCTR);
@@ -454,6 +454,8 @@ static int rcar_gen3_phy_usb2_power_on(struct phy *p)
val &= ~USB2_USBCTR_PLL_RST;
writel(val, usb2_base + USB2_USBCTR);
 
+out:
+   /* The powered flag should be set for any other phys anyway */
rphy->powered = true;
 
return 0;
-- 
2.7.4



Re: bcache: oops when writing to writeback_percent without a cache device

2019-06-04 Thread Coly Li
On 2019/6/5 1:24 AM, Bjørn Forsman wrote:
> On Tue, 4 Jun 2019 at 17:41, Coly Li  wrote:
>>
>> On 2019/6/4 10:59 PM, Coly Li wrote:
>>> On 2019/6/4 7:00 PM, Bjørn Forsman wrote:
>>>> Hi all,
>>>>
>>>> I get a kernel oops from bcache when writing to
>>>> /sys/block/bcache0/bcache/writeback_percent and there is no attached
>>>> cache device. See the oops itself below my signature.
>>>>
>>>> This is on Linux 4.19.46. I looked in git and see many commits to
>>>> bcache lately, but none seem to address this particular issue.
>>>>
>>>> Background: I'm writing to .../writeback_percent with
>>>> systemd-tmpfiles. I'd rather not replace it with a script that figures
>>>> out whether or not the kernel will oops if writing to the sysfs file
>>>> -- the kernel should not oops in the first place.
>>>
>>> Hi Bjorn,
>>>
>>> Thank you for the reporting. I believe this is a case we missed in
>>> testings. When a bcache device is not attached, it does not make sense
>>> to update the writeback rate in period by the changing of writeback_percent.
>>>
>>> I will post a patch for your testing soon.
>>
>> Hi Bjorn,
>>
>> Could you please to try this patch ? Hope it may help a bit.
> 
> Hi Coly,
> 
> Thanks for the quick patch! I tested it on linux 5.2-rc2 and it indeed
> fixes the problem.
> 
> There is one typo in the patch/commit message: s/writebac/writeback/
> 

Hi Bjorn,

Thanks for the patch review. Do you mind if I add a Reviewed-by: tag with
your name and email address?

-- 

Coly Li


Re: [PATCH v6 01/10] mm: add missing smp read barrier on getting memcg kmem_cache pointer

2019-06-04 Thread Shakeel Butt
On Tue, Jun 4, 2019 at 7:45 PM Roman Gushchin  wrote:
>
> Johannes noticed that reading the memcg kmem_cache pointer in
> cache_from_memcg_idx() is performed using READ_ONCE() macro,
> which doesn't implement a SMP barrier, which is required
> by the logic.
>
> Add a proper smp_rmb() to be paired with smp_wmb() in
> memcg_create_kmem_cache().
>
> The same applies to memcg_create_kmem_cache() itself,
> which reads the same value without barriers and READ_ONCE().
>
> Suggested-by: Johannes Weiner 
> Signed-off-by: Roman Gushchin 

Reviewed-by: Shakeel Butt 

This seems independent of the series. Shouldn't this be Cc'ed to stable?

> ---
>  mm/slab.h| 1 +
>  mm/slab_common.c | 3 ++-
>  2 files changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/mm/slab.h b/mm/slab.h
> index 739099af6cbb..1176b61bb8fc 100644
> --- a/mm/slab.h
> +++ b/mm/slab.h
> @@ -260,6 +260,7 @@ cache_from_memcg_idx(struct kmem_cache *s, int idx)
>  * memcg_caches issues a write barrier to match this (see
>  * memcg_create_kmem_cache()).
>  */
> +   smp_rmb();
> cachep = READ_ONCE(arr->entries[idx]);
> rcu_read_unlock();
>
> diff --git a/mm/slab_common.c b/mm/slab_common.c
> index 58251ba63e4a..8092bdfc05d5 100644
> --- a/mm/slab_common.c
> +++ b/mm/slab_common.c
> @@ -652,7 +652,8 @@ void memcg_create_kmem_cache(struct mem_cgroup *memcg,
>  * allocation (see memcg_kmem_get_cache()), several threads can try to
>  * create the same cache, but only one of them may succeed.
>  */
> -   if (arr->entries[idx])
> +   smp_rmb();
> +   if (READ_ONCE(arr->entries[idx]))
> goto out_unlock;
>
> cgroup_name(css->cgroup, memcg_name_buf, sizeof(memcg_name_buf));
> --
> 2.20.1
>
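
As a userspace analogue of the pairing described in the commit message (a writer that initialises an object and then publishes its pointer, and a reader that must not see the pointer without the initialised contents), here is a minimal sketch using C11 release/acquire atomics. It is not the kernel code: the kernel uses its own primitives, and the struct and function names below are invented for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for the published object. */
struct cache {
	int object_size;
};

static struct cache the_cache;
static _Atomic(struct cache *) slot;	/* NULL until published */

/* Writer: initialise first, then publish with release ordering. */
static void publish_cache(int object_size)
{
	the_cache.object_size = object_size;
	atomic_store_explicit(&slot, &the_cache, memory_order_release);
}

/* Reader: the acquire load orders the pointer read before the dereference. */
static int read_object_size(void)
{
	struct cache *c = atomic_load_explicit(&slot, memory_order_acquire);

	return c ? c->object_size : -1;
}

/* Single-threaded demonstration of the publish/read sequence. */
static int demo_publish_then_read(int object_size)
{
	publish_cache(object_size);
	return read_object_size();
}
```

The single-threaded demo only shows the API shape; the ordering guarantee matters when the writer and the reader run on different CPUs.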


Re: [PATCH 0/3] Enhance virtio rpmsg bus driver buffer allocation

2019-06-04 Thread Bjorn Andersson
On Thu 31 Jan 07:41 PST 2019, Xiang Xiao wrote:

> Hi,
> This series enhance the buffer allocation by:
> 1.Support the different buffer number in rx/tx direction
> 2.Get the individual rx/tx buffer size from config space
> 
> Here is the related OpenAMP change:
> https://github.com/OpenAMP/open-amp/pull/155
> 

This looks pretty reasonable, but can you confirm that it's possible to
use new firmware with an old Linux kernel when introducing this?


But ever since we discussed Loic's similar proposal earlier I've been
questioning if the fixed buffer size isn't just an artifact of how we
preallocate our buffers. The virtqueue seems to support arbitrary sizes
of buffers and I see that the receive function in OpenAMP has been fixed
to put back the buffer of the size that was received, rather than 512
bytes. So it seems that Linux would be able to send messages of whatever
size to OpenAMP and it would handle them.

The question is if we could do the same in the other direction, perhaps
by letting the OpenAMP side do its message allocation when it's
sending, rather than Linux pushing inbufs to be filled by the remote.

This would remove the problem of always having suboptimal buffer sizes.

Regards,
Bjorn

> Xiang Xiao (3):
>   rpmsg: virtio_rpmsg_bus: allow the different vring size for send/recv
>   rpmsg: virtio_rpmsg_bus: allocate rx/tx buffer separately
>   rpmsg: virtio_rpmsg_bus: get buffer size from config space
> 
>  drivers/rpmsg/virtio_rpmsg_bus.c  | 127 +++---
>  include/uapi/linux/virtio_rpmsg.h |  24 +++
>  2 files changed, 100 insertions(+), 51 deletions(-)
>  create mode 100644 include/uapi/linux/virtio_rpmsg.h
> 
> -- 
> 2.7.4
> 


Re: [PATCH] RDMA/ucma: Use struct_size() helper

2019-06-04 Thread Leon Romanovsky
On Tue, Jun 04, 2019 at 10:42:22AM -0500, Gustavo A. R. Silva wrote:
> Make use of the struct_size() helper instead of an open-coded version
> in order to avoid any potential type mistakes, in particular in the
> context in which this code is being used.

What does "in particular in the context in which this code is being
used" mean?

>
> So, replace the following form:
>
> sizeof(*resp) + (i * sizeof(struct ib_path_rec_data))
>
> with:
>
> struct_size(resp, path_data, i)

This is already written in the commit itself.

>
> This code was detected with the help of Coccinelle.
>
> Signed-off-by: Gustavo A. R. Silva 
> ---
>  drivers/infiniband/core/ucma.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/drivers/infiniband/core/ucma.c b/drivers/infiniband/core/ucma.c
> index 140a338a135f..cbe460076611 100644
> --- a/drivers/infiniband/core/ucma.c
> +++ b/drivers/infiniband/core/ucma.c
> @@ -951,8 +951,7 @@ static ssize_t ucma_query_path(struct ucma_context *ctx,
>   }
>   }
>
> - if (copy_to_user(response, resp,
> -  sizeof(*resp) + (i * sizeof(struct ib_path_rec_data))))
> + if (copy_to_user(response, resp, struct_size(resp, path_data, i)))
>   ret = -EFAULT;
>
>   kfree(resp);
> --
> 2.21.0
>
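
For context on what the helper buys over the open-coded form, here is a minimal userspace sketch of the same idea: compute header size plus count times element size, and saturate to SIZE_MAX if the arithmetic overflows. The struct layouts and function names are hypothetical stand-ins, and the sketch relies on the GCC/Clang overflow builtins rather than the kernel's own macros.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the response header and its array element. */
struct path_rec {
	uint32_t flags;
	uint32_t reserved;
};

struct query_resp {
	uint32_t num_paths;
	uint32_t reserved;
	struct path_rec path_data[];	/* flexible array member */
};

/* header + elem * count, saturating to SIZE_MAX on overflow. */
static size_t flex_struct_size(size_t header, size_t elem, size_t count)
{
	size_t bytes;

	if (__builtin_mul_overflow(elem, count, &bytes) ||
	    __builtin_add_overflow(bytes, header, &bytes))
		return SIZE_MAX;
	return bytes;
}

static size_t query_resp_size(size_t count)
{
	return flex_struct_size(sizeof(struct query_resp),
				sizeof(struct path_rec), count);
}
```

A saturated result makes a later allocation or copy fail loudly instead of silently operating on a wrapped, too-small size.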


Re: [PATCH v6 00/10] mm: reparent slab memory on cgroup removal

2019-06-04 Thread Andrew Morton
On Tue, 4 Jun 2019 19:44:44 -0700 Roman Gushchin  wrote:

> So instead of trying to find a maybe non-existing balance, let's do reparent
> the accounted slabs to the parent cgroup on cgroup removal.

s/slabs/slab caches/.  Take more care with the terminology, please...

> There is a bonus: currently we do release empty kmem_caches on cgroup
> removal, however all other are waiting for the releasing of the memory cgroup.
> These refactorings allow kmem_caches to be released as soon as they
> become inactive and free.

Unclear.

s/All other/releasing of all non-empty slab caches depends upon the releasing/

I think?


Re: [PATCH] media: do not use C++ style comments in uapi headers

2019-06-04 Thread Masahiro Yamada
On Wed, Jun 5, 2019 at 3:21 AM Arnd Bergmann  wrote:
> > > >
> > > > There are two ways to define fixed-width type.
> > > >
> > > > [1] #include <linux/types.h>, __u8, __u16, __u32, __u64
> > > >
> > > >   vs
> > > >
> > > > [2] #include <stdint.h>, uint8_t, uint16_t, uint32_t, uint64_t
> > > >
> > > >
> > > > Both are used in UAPI headers.
> > > > IIRC, <stdint.h> was standardized by C99.
> > > >
> > > > So, we have already relied on C99 in user-space too.
>
> A related problem is that using the stdint.h types requires
> including stdint.h first, but the C library requires that including
> one standard header does not include another one recursively.
>
> So if sys/socket.h includes linux/socket.h, that must not include
> stdint.h or any other header file that does so.


This means we cannot reliably use uint{8,16,32,64}_t in UAPI headers.


[1] If we include <stdint.h> from linux/foo.h

If sys/foo.h includes <stdint.h> and <linux/foo.h>,
it violates the C library requirement.


[2] If we do not include <stdint.h> from linux/foo.h

If sys/foo.h includes <linux/foo.h>, but not <stdint.h>,
we get 'unknown type name' errors.


-- 
Best Regards
Masahiro Yamada


Re: linux-next: Tree for Jun 4 (drivers/iio/addac/adt7316.c)

2019-06-04 Thread Randy Dunlap
On 6/3/19 11:09 PM, Stephen Rothwell wrote:
> Hi all,
> 
> Changes since 20190603:
> 

on x86_64:

when GPIOLIB is not set/enabled:

../drivers/staging/iio/addac/adt7316.c: In function ‘adt7316_store_update_DAC’:
../drivers/staging/iio/addac/adt7316.c:947:3: error: implicit declaration of function ‘gpiod_set_value’ [-Werror=implicit-function-declaration]
   gpiod_set_value(chip->ldac_pin, 0);
   ^
  CC [M]  drivers/target/target_core_tpg.o
../drivers/staging/iio/addac/adt7316.c: In function ‘adt7316_setup_irq’:
../drivers/staging/iio/addac/adt7316.c:1805:2: error: implicit declaration of function ‘irqd_get_trigger_type’ [-Werror=implicit-function-declaration]
  irq_type = irqd_get_trigger_type(irq_get_irq_data(chip->bus.irq));
  ^
../drivers/staging/iio/addac/adt7316.c:1805:2: error: implicit declaration of function ‘irq_get_irq_data’ [-Werror=implicit-function-declaration]
../drivers/staging/iio/addac/adt7316.c: In function ‘adt7316_probe’:
../drivers/staging/iio/addac/adt7316.c:2156:2: error: implicit declaration of function ‘devm_gpiod_get_optional’ [-Werror=implicit-function-declaration]
  chip->ldac_pin = devm_gpiod_get_optional(dev, "adi,ldac",
  ^
../drivers/staging/iio/addac/adt7316.c:2157:8: error: ‘GPIOD_OUT_LOW’ undeclared (first use in this function)
GPIOD_OUT_LOW);
^

Full randconfig file is attached.


-- 
~Randy
#
# Automatically generated file; DO NOT EDIT.
# Linux/x86_64 5.2.0-rc3 Kernel Configuration
#

#
# Compiler: gcc (SUSE Linux) 4.8.5
#
CONFIG_CC_IS_GCC=y
CONFIG_GCC_VERSION=40805
CONFIG_CLANG_VERSION=0
CONFIG_CC_HAS_ASM_GOTO=y
CONFIG_CC_HAS_WARN_MAYBE_UNINITIALIZED=y
CONFIG_CC_DISABLE_WARN_MAYBE_UNINITIALIZED=y
CONFIG_CONSTRUCTORS=y
CONFIG_IRQ_WORK=y
CONFIG_BUILDTIME_EXTABLE_SORT=y
CONFIG_THREAD_INFO_IN_TASK=y

#
# General setup
#
CONFIG_INIT_ENV_ARG_LIMIT=32
CONFIG_COMPILE_TEST=y
CONFIG_LOCALVERSION=""
CONFIG_BUILD_SALT=""
CONFIG_HAVE_KERNEL_GZIP=y
CONFIG_HAVE_KERNEL_BZIP2=y
CONFIG_HAVE_KERNEL_LZMA=y
CONFIG_HAVE_KERNEL_XZ=y
CONFIG_HAVE_KERNEL_LZO=y
CONFIG_HAVE_KERNEL_LZ4=y
# CONFIG_KERNEL_GZIP is not set
CONFIG_KERNEL_BZIP2=y
# CONFIG_KERNEL_LZMA is not set
# CONFIG_KERNEL_XZ is not set
# CONFIG_KERNEL_LZO is not set
# CONFIG_KERNEL_LZ4 is not set
CONFIG_DEFAULT_HOSTNAME="(none)"
# CONFIG_SWAP is not set
# CONFIG_SYSVIPC is not set
# CONFIG_CROSS_MEMORY_ATTACH is not set
CONFIG_USELIB=y
CONFIG_HAVE_ARCH_AUDITSYSCALL=y

#
# IRQ subsystem
#
CONFIG_GENERIC_IRQ_PROBE=y
CONFIG_GENERIC_IRQ_SHOW=y
CONFIG_GENERIC_IRQ_EFFECTIVE_AFF_MASK=y
CONFIG_GENERIC_PENDING_IRQ=y
CONFIG_GENERIC_IRQ_MIGRATION=y
CONFIG_IRQ_DOMAIN=y
CONFIG_IRQ_DOMAIN_HIERARCHY=y
CONFIG_GENERIC_IRQ_MATRIX_ALLOCATOR=y
CONFIG_GENERIC_IRQ_RESERVATION_MODE=y
CONFIG_IRQ_FORCED_THREADING=y
CONFIG_SPARSE_IRQ=y
# CONFIG_GENERIC_IRQ_DEBUGFS is not set
# end of IRQ subsystem

CONFIG_CLOCKSOURCE_WATCHDOG=y
CONFIG_ARCH_CLOCKSOURCE_DATA=y
CONFIG_ARCH_CLOCKSOURCE_INIT=y
CONFIG_CLOCKSOURCE_VALIDATE_LAST_CYCLE=y
CONFIG_GENERIC_TIME_VSYSCALL=y
CONFIG_GENERIC_CLOCKEVENTS=y
CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y
CONFIG_GENERIC_CLOCKEVENTS_MIN_ADJUST=y
CONFIG_GENERIC_CMOS_UPDATE=y

#
# Timers subsystem
#
CONFIG_TICK_ONESHOT=y
CONFIG_HZ_PERIODIC=y
# CONFIG_NO_HZ_IDLE is not set
# CONFIG_NO_HZ_FULL is not set
# CONFIG_NO_HZ is not set
CONFIG_HIGH_RES_TIMERS=y
# end of Timers subsystem

CONFIG_PREEMPT_NONE=y
# CONFIG_PREEMPT_VOLUNTARY is not set
# CONFIG_PREEMPT is not set
CONFIG_PREEMPT_COUNT=y

#
# CPU/Task time and stats accounting
#
CONFIG_TICK_CPU_ACCOUNTING=y
# CONFIG_VIRT_CPU_ACCOUNTING_GEN is not set
# CONFIG_IRQ_TIME_ACCOUNTING is not set
# CONFIG_BSD_PROCESS_ACCT is not set
# CONFIG_PSI is not set
# end of CPU/Task time and stats accounting

CONFIG_CPU_ISOLATION=y

#
# RCU Subsystem
#
CONFIG_TREE_RCU=y
# CONFIG_RCU_EXPERT is not set
CONFIG_SRCU=y
CONFIG_TREE_SRCU=y
CONFIG_TASKS_RCU=y
CONFIG_RCU_STALL_COMMON=y
CONFIG_RCU_NEED_SEGCBLIST=y
# end of RCU Subsystem

CONFIG_BUILD_BIN2C=y
CONFIG_IKCONFIG=m
CONFIG_IKCONFIG_PROC=y
# CONFIG_IKHEADERS is not set
CONFIG_LOG_BUF_SHIFT=17
CONFIG_LOG_CPU_MAX_BUF_SHIFT=12
CONFIG_PRINTK_SAFE_LOG_BUF_SHIFT=13
CONFIG_HAVE_UNSTABLE_SCHED_CLOCK=y
CONFIG_ARCH_SUPPORTS_NUMA_BALANCING=y
CONFIG_ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH=y
CONFIG_ARCH_SUPPORTS_INT128=y
CONFIG_CGROUPS=y
CONFIG_PAGE_COUNTER=y
CONFIG_MEMCG=y
CONFIG_MEMCG_KMEM=y
# CONFIG_BLK_CGROUP is not set
CONFIG_CGROUP_SCHED=y
CONFIG_FAIR_GROUP_SCHED=y
CONFIG_CFS_BANDWIDTH=y
# CONFIG_RT_GROUP_SCHED is not set
# CONFIG_CGROUP_PIDS is not set
# CONFIG_CGROUP_RDMA is not set
CONFIG_CGROUP_FREEZER=y
# CONFIG_CGROUP_HUGETLB is not set
# CONFIG_CPUSETS is not set
# CONFIG_CGROUP_DEVICE is not set
CONFIG_CGROUP_CPUACCT=y
# CONFIG_CGROUP_PERF is not set
CONFIG_CGROUP_DEBUG=y
CONFIG_NAMESPACES=y
# CONFIG_UTS_NS is not set
# CONFIG_USER_NS is not set
CONFIG_PID_NS=y
CONFIG_CHECKPOINT_RESTORE=y
CONFIG_SCHED_AUTOGROUP=y
# CONFIG_SYSFS_DEPRECATED is not set
CONFIG_RELAY=y
CONFIG_BLK_DEV_INITRD=y
CONFIG_INITRAMFS_SOURCE=""
CONFIG_RD_GZIP=y
# 

RE: rcar_gen3_phy_usb2: unbalanced disables for USB20_VBUS0

2019-06-04 Thread Yoshihiro Shimoda
Hi Geert-san,

Thank you very much for your report!

> From: Geert Uytterhoeven, Sent: Wednesday, June 5, 2019 3:06 AM
> 
> Hi Shimoda-san,
> 
> Using a tree based on renesas-drivers-2019-06-04-v5.2-rc3, I started seeing
> the following warning during a second system suspend (s2idle):

> So far I've seen this on Salvator-X with R-Car H3 ES1.0 or M3-W, and
> on Salvator-XS with R-Car M3-N, but not (yet?) on H3 ES2.0.

I could reproduce this issue on R-Car H3 ES3.0 with Suspend-to-RAM.
# Embarrassingly, I could not use s2idle because the board did not wake up via ravb.
# https://elinux.org/R-Car/Boards/Salvator-X#Suspend-to-Idle

> Unfortunately the issue seems to be fairly timing-sensitive, so I failed
> to bisect it.
> 
> I have added some debug.  While this didn't help me finding the cause
> of the above warning, it did discover another imbalance:

Thank you for trying it. I have investigated this issue and found the
root cause.

After the following patch was applied, multiple phy devices are generated.
https://git.kernel.org/pub/scm/linux/kernel/git/geert/renesas-drivers.git/commit/drivers/phy/renesas/phy-rcar-gen3-usb2.c?h=renesas-drivers-2019-06-04-v5.2-rc3=549b6b55b00558183cef4af2c2bb61d4f2ffe508

However, the power_on function should set the "powered" flag for every phy,
even when another phy of the same channel is already powered.
Otherwise, the imbalance described above occurs.
The powered flag is needed to avoid setting the "PLL_RST" register multiple times.
# I think regulator_{en,dis}able() don't need such a condition though.

I'll submit a bugfix patch with your Reported-by tag later.

Best regards,
Yoshihiro Shimoda
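
To make the imbalance easier to follow, here is a small userspace model of the behaviour described above. It is a simplified sketch, not the driver: it assumes two phys sharing one channel, that power_off clears the flag and disables VBUS once every phy reports off, and it uses a plain counter in place of the regulator. All names are invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_RPHYS 2	/* hypothetical: two phys sharing one channel */

struct chan_model {
	bool powered[NUM_RPHYS];
	int vbus_enable_count;	/* stands in for the regulator use count */
};

static bool all_rphys_off(const struct chan_model *ch)
{
	for (int i = 0; i < NUM_RPHYS; i++)
		if (ch->powered[i])
			return false;
	return true;
}

static void model_power_on(struct chan_model *ch, int idx, bool with_fix)
{
	if (!all_rphys_off(ch)) {
		/* Another phy already enabled VBUS; only the flag is left. */
		if (with_fix)
			ch->powered[idx] = true;
		return;
	}
	ch->vbus_enable_count++;
	ch->powered[idx] = true;
}

static void model_power_off(struct chan_model *ch, int idx)
{
	ch->powered[idx] = false;
	if (!all_rphys_off(ch))
		return;
	ch->vbus_enable_count--;	/* below zero models an unbalanced disable */
}

/* Power both phys on, then off, and return the final VBUS count. */
static int run_cycle(bool with_fix)
{
	struct chan_model ch = { { false, false }, 0 };

	model_power_on(&ch, 0, with_fix);
	model_power_on(&ch, 1, with_fix);
	model_power_off(&ch, 0);
	model_power_off(&ch, 1);
	return ch.vbus_enable_count;
}
```

In this model the unfixed path ends at -1 (one enable, two disables), while setting the flag for the second phy keeps the count at 0.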



Re: [RFC PATCH v2 1/3] vfio: Use capability chains to handle device specific irq

2019-06-04 Thread Zhenyu Wang
On 2019.06.04 17:55:32 +0800, Tina Zhang wrote:
> Caps the number of irqs with fixed indexes and uses capability chains
> to chain device specific irqs.
> 
> VFIO vGPU leverages this mechanism to trigger primary plane and cursor
> plane page flip event to the user space.
> 
> Signed-off-by: Tina Zhang 
> ---
>  include/uapi/linux/vfio.h | 23 ++-
>  1 file changed, 22 insertions(+), 1 deletion(-)
> 
> diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
> index 02bb7ad6e986..9b5e25937c7d 100644
> --- a/include/uapi/linux/vfio.h
> +++ b/include/uapi/linux/vfio.h
> @@ -444,11 +444,31 @@ struct vfio_irq_info {
>  #define VFIO_IRQ_INFO_MASKABLE   (1 << 1)
>  #define VFIO_IRQ_INFO_AUTOMASKED (1 << 2)
>  #define VFIO_IRQ_INFO_NORESIZE   (1 << 3)
> +#define VFIO_IRQ_INFO_FLAG_CAPS  (1 << 4) /* Info supports caps */
>   __u32   index;  /* IRQ index */
> + __u32   cap_offset; /* Offset within info struct of first cap */
>   __u32   count;  /* Number of IRQs within this index */

This would break the ABI for get irq info. I think the irq cap chain can just
follow vfio_irq_info.

>  };
>  #define VFIO_DEVICE_GET_IRQ_INFO _IO(VFIO_TYPE, VFIO_BASE + 9)
>  
> +/*
> + * The irq type capability allows irqs unique to a specific device or
> + * class of devices to be exposed.
> + *
> + * The structures below define version 1 of this capability.
> + */
> +#define VFIO_IRQ_INFO_CAP_TYPE  3
> +
> +struct vfio_irq_info_cap_type {
> + struct vfio_info_cap_header header;
> + __u32 type; /* global per bus driver */
> + __u32 subtype;  /* type specific */
> +};
> +
> +#define VFIO_IRQ_TYPE_GFX  (1)
> +#define VFIO_IRQ_SUBTYPE_GFX_PRI_PLANE_FLIP  (1)
> +#define VFIO_IRQ_SUBTYPE_GFX_CUR_PLANE_FLIP  (2)
> +

Do we really need to split this per plane? I'd prefer a single
VFIO_IRQ_SUBTYPE_GFX_DISPLAY_EVENT so user space can probe changes for all planes.

>  /**
>   * VFIO_DEVICE_SET_IRQS - _IOW(VFIO_TYPE, VFIO_BASE + 10, struct vfio_irq_set)
>   *
> @@ -550,7 +570,8 @@ enum {
>   VFIO_PCI_MSIX_IRQ_INDEX,
>   VFIO_PCI_ERR_IRQ_INDEX,
>   VFIO_PCI_REQ_IRQ_INDEX,
> - VFIO_PCI_NUM_IRQS
> + VFIO_PCI_NUM_IRQS = 5   /* Fixed user ABI, IRQ indexes >=5 use   */
> + /* device specific cap to define content */
>  };
>  
>  /*
> -- 
> 2.17.1
> 

-- 
Open Source Technology Center, Intel ltd.

$gpg --keyserver wwwkeys.pgp.net --recv-keys 4D781827
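
For readers unfamiliar with capability chains, here is a minimal sketch of how user space might walk one. It assumes the header layout of struct vfio_info_cap_header (a 16-bit id, a 16-bit version and a 32-bit offset to the next capability, measured from the start of the info buffer, with 0 ending the chain). The struct is mirrored locally, and the buffer contents and offsets are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local mirror of the capability header layout. */
struct cap_header {
	uint16_t id;
	uint16_t version;
	uint32_t next;	/* offset of the next capability, 0 terminates */
};

/* Return the offset of capability `id`, or 0 if it is not in the chain. */
static uint32_t find_cap(const uint8_t *info, size_t info_size,
			 uint32_t first, uint16_t id)
{
	uint32_t off = first;

	while (off != 0 && (size_t)off + sizeof(struct cap_header) <= info_size) {
		struct cap_header hdr;

		memcpy(&hdr, info + off, sizeof(hdr));	/* avoids unaligned access */
		if (hdr.id == id)
			return off;
		if (hdr.next <= off)	/* end of chain, or refuse to walk backwards */
			break;
		off = hdr.next;
	}
	return 0;
}

/* Build a two-entry chain at made-up offsets 16 and 32, then look up `id`. */
static uint32_t demo_lookup(uint16_t id)
{
	uint8_t buf[64] = { 0 };
	struct cap_header a = { .id = 1, .version = 1, .next = 32 };
	struct cap_header b = { .id = 3, .version = 1, .next = 0 };

	memcpy(buf + 16, &a, sizeof(a));
	memcpy(buf + 32, &b, sizeof(b));
	return find_cap(buf, sizeof(buf), 16, id);
}
```

Whether the chain starts at a cap_offset field or directly after the fixed struct, as suggested in the review, only changes the `first` argument in this sketch.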


Re: rcu_read_lock lost its compiler barrier

2019-06-04 Thread Paul E. McKenney
On Wed, Jun 05, 2019 at 10:21:17AM +0800, Herbert Xu wrote:
> On Tue, Jun 04, 2019 at 02:14:49PM -0700, Paul E. McKenney wrote:
> >
> > Yeah, I know, even with the "volatile" keyword, it is not entirely clear
> > how much reordering the compiler is allowed to do.  I was relying on
> > https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html, which says:
> 
> The volatile keyword doesn't give any guarantees of this kind.
> The key to ensuring ordering between unrelated variable/register
> reads/writes is the memory clobber:
> 
>   6.47.2.6 Clobbers and Scratch Registers
> 
>   ...
> 
>   "memory" The "memory" clobber tells the compiler that the assembly
>   code performs memory reads or writes to items other than those
>   listed in the input and output operands (for example, accessing
>   the memory pointed to by one of the input parameters). To ensure
>   memory contains correct values, GCC may need to flush specific
>   register values to memory before executing the asm. Further,
>   the compiler does not assume that any values read from memory
>   before an asm remain unchanged after that asm; it reloads them as
>   needed. Using the "memory" clobber effectively forms a read/write
>   memory barrier for the compiler.
> 
>   Note that this clobber does not prevent the processor from
>   doing speculative reads past the asm statement. To prevent that,
>   you need processor-specific fence instructions.
> 
> IOW you need a barrier().

Understood.  Does the patch I sent out a few hours ago cover it?  Or is
something else needed?

Other than updates to the RCU requirements documentation, which is
forthcoming.

Thanx, Paul
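
For reference, a compiler barrier of the kind discussed above can be sketched as an empty asm statement with a "memory" clobber. The macro and function names below are illustrative, and the example only constrains the compiler: as the quoted GCC text notes, it emits no CPU fence.

```c
#include <assert.h>

/* Empty asm with a "memory" clobber: a compiler-only barrier. */
#define compiler_barrier() __asm__ __volatile__("" : : : "memory")

static int shared_flag;

/*
 * Read shared_flag twice. The barrier forbids the compiler from reusing the
 * first load for the second one; it must reload the value from memory.
 */
static int read_twice_sum(void)
{
	int first = shared_flag;

	compiler_barrier();
	return first + shared_flag;
}

static int demo_barrier(int value)
{
	shared_flag = value;
	return read_twice_sum();
}
```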



Re: [PATCH 2/3] i2c: slave-mqueue: add a slave backend to receive and queue messages

2019-06-04 Thread Eduardo Valentin
Hey Andy,

Long time no see :-)

On Tue, Jun 04, 2019 at 08:16:11PM +0300, Andy Shevchenko wrote:
> On Thu, May 30, 2019 at 09:33:46PM -0700, Eduardo Valentin wrote:
> > From: Haiyue Wang 
> > 
> > Some protocols over I2C are designed for bi-directional transferring
> > messages by using I2C Master Write protocol. Like the MCTP (Management
> > Component Transport Protocol) and IPMB (Intelligent Platform Management
> > Bus), they both require that the userspace can receive messages from
> > I2C drivers under slave mode.
> > 
> > This new slave mqueue backend is used to receive and queue messages, it
> > will expose these messages to userspace by sysfs bin file.
> > 
> > Note: DT interface and a couple of minor fixes here and there
> > by Eduardo, so I kept the original authorship here.
> 
> > +#define MQ_MSGBUF_SIZE CONFIG_I2C_SLAVE_MQUEUE_MESSAGE_SIZE
> > +#define MQ_QUEUE_SIZE  CONFIG_I2C_SLAVE_MQUEUE_QUEUE_SIZE
> 
> > +#define MQ_QUEUE_NEXT(x)   (((x) + 1) & (MQ_QUEUE_SIZE - 1))
> 
> Also possible ((x + 1) % ..._SIZE)

Right.. but I suppose the original idea is to avoid divisions on the hotpath.

So, I am actually fine with the limitation of only using power of 2.
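
A minimal sketch of the two index-advance forms under discussion, assuming a power-of-two queue size (the size of 8 is made up for illustration). For such sizes the mask and the modulo give the same result, which is why the BUILD_BUG_ON() on the size matters for the mask form.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_QUEUE_SIZE 8u	/* hypothetical; must be a power of two */

/* Mask form, as in MQ_QUEUE_NEXT(): valid only for power-of-two sizes. */
static unsigned int next_masked(unsigned int x)
{
	return (x + 1) & (DEMO_QUEUE_SIZE - 1);
}

/* Modulo form: valid for any non-zero size. */
static unsigned int next_modulo(unsigned int x)
{
	return (x + 1) % DEMO_QUEUE_SIZE;
}

/* Check that both forms agree over several wrap-arounds. */
static bool forms_agree(void)
{
	for (unsigned int x = 0; x < 4 * DEMO_QUEUE_SIZE; x++)
		if (next_masked(x) != next_modulo(x))
			return false;
	return true;
}
```

With an unsigned operand and a constant power-of-two size, compilers typically reduce the modulo to the same mask, so the difference is mostly about which sizes are allowed.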

> 
> > +   mq = dev_get_drvdata(container_of(kobj, struct device, kobj));
> 
> kobj_to_dev()

Well, yeah, I guess this is a nit, but I can add that in case of a real need
for a v7.

> 
> > +static int i2c_slave_mqueue_probe(struct i2c_client *client,
> > + const struct i2c_device_id *id)
> > +{
> > +   struct device *dev = &client->dev;
> > +   struct mq_queue *mq;
> > +   int ret, i;
> > +   void *buf;
> > +
> > +   mq = devm_kzalloc(dev, sizeof(*mq), GFP_KERNEL);
> > +   if (!mq)
> > +   return -ENOMEM;
> > +
> 
> > +   BUILD_BUG_ON(!is_power_of_2(MQ_QUEUE_SIZE));
> 
> Perhaps start function with this kind of assertions?
> 


Same here: if there is a strong ask for a v7, I can move this up.

> > +
> > +   buf = devm_kmalloc_array(dev, MQ_QUEUE_SIZE, MQ_MSGBUF_SIZE,
> > +GFP_KERNEL);
> > +   if (!buf)
> > +   return -ENOMEM;
> > +
> > +   for (i = 0; i < MQ_QUEUE_SIZE; i++)
> > +   mq->queue[i].buf = buf + i * MQ_MSGBUF_SIZE;
> 
> 
> Just wondering if kfifo API can bring an advantage here?
> 

Well, then again, I suppose the idea here is to keep things simple. I am not
sure we need to go to kfifo, as the protocol on top of this is perfectly fine
with the current discipline of simply dropping older messages.


> > +   return 0;
> > +}
> 
> > +static const struct of_device_id i2c_slave_mqueue_of_match[] = {
> > +   {
> > +   .compatible = "i2c-slave-mqueue",
> > +   },
> 
> > +   { },
> 
> No need for comma here.

It does not hurt to have it either :-)

> 
> > +};
> 
> > +
> > +static struct i2c_driver i2c_slave_mqueue_driver = {
> > +   .driver = {
> > +   .name   = "i2c-slave-mqueue",
> 
> > +   .of_match_table = of_match_ptr(i2c_slave_mqueue_of_match),
> 
> Wouldn't compiler warn you due to unused data?
> Perhaps drop of_match_ptr() for good...


Not sure what you meant here. I don't see any compiler warning.
Also, of_match_ptr seems to be widely used in the kernel.
> 
> > +   },
> > +   .probe  = i2c_slave_mqueue_probe,
> > +   .remove = i2c_slave_mqueue_remove,
> > +   .id_table   = i2c_slave_mqueue_id,
> > +};
> 
> -- 
> With Best Regards,
> Andy Shevchenko
> 
> 

-- 
All the best,
Eduardo Valentin


Re: [PATCH v3 net-next 00/17] PTP support for the SJA1105 DSA driver

2019-06-04 Thread David Miller
From: Vladimir Oltean 
Date: Tue,  4 Jun 2019 20:07:39 +0300

> This patchset adds the following:
> 
>  - A timecounter/cyclecounter based PHC for the free-running
>timestamping clock of this switch.
> 
>  - A state machine implemented in the DSA tagger for SJA1105, which
>keeps track of metadata follow-up Ethernet frames (the switch's way
>of transmitting RX timestamps).

This series doesn't apply cleanly to net-next, please respin.

Thank you.


[GIT PULL] pstore fixes for v5.2-rc4

2019-06-04 Thread Kees Cook
Hi Linus,

Please pull these pstore fixes for v5.2-rc4. They've been in linux-next
for a bit now and catch some pstore corner cases found recently.

Thanks!

-Kees

The following changes since commit a188339ca5a396acc588e5851ed7e19f66b0ebd9:

  Linux 5.2-rc1 (2019-05-19 15:47:09 -0700)

are available in the Git repository at:

  https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git tags/pstore-v5.2-rc4

for you to fetch changes up to 8880fa32c557600f5f624084152668ed3c2ea51e:

  pstore/ram: Run without kernel crash dump region (2019-05-31 01:19:06 -0700)


pstore fixes for v5.2-rc4

- Avoid NULL deref when unloading/reloading ramoops module (Pi-Hsun Shih)
- Run ramoops without crash dump region


Kees Cook (1):
  pstore/ram: Run without kernel crash dump region

Pi-Hsun Shih (1):
  pstore: Set tfm to NULL on free_buf_for_compression

 fs/pstore/platform.c |  7 +--
 fs/pstore/ram.c  | 36 +++-
 2 files changed, 28 insertions(+), 15 deletions(-)

-- 
Kees Cook


Re: [PATCH net] tcp: avoid creating multiple req socks with the same tuples

2019-06-04 Thread Eric Dumazet
On Tue, Jun 4, 2019 at 7:07 PM maowenan  wrote:
>
>
>
> On 2019/6/4 23:24, Eric Dumazet wrote:
> > On Tue, Jun 4, 2019 at 7:47 AM Mao Wenan  wrote:
> >>
> >> There is one issue with bonding mode BOND_MODE_BROADCAST and
> >> two slaves with different affinity, so packets will be handled
> >> by different cpus. These are two pre-conditions in this case.
> >>
> >> When two slaves receive the same syn packets at the same time,
> >> two request sock(reqsk) will be created if below situation happens:
> >> 1. syn1 arrived tcp_conn_request, create reqsk1 and have not yet called
> >> inet_csk_reqsk_queue_hash_add.
> >> 2. syn2 arrived tcp_v4_rcv, it goes to tcp_conn_request and create reqsk2
> >> because it can't find reqsk1 in the __inet_lookup_skb.
> >>
> >> Then reqsk1 and reqsk2 are both added to the established hash table, and
> >> two synacks with different seq (seq1 and seq2) are sent to the client.
> >> When the tcp ack arrives, it is processed in tcp_v4_rcv and
> >> tcp_check_req. If __inet_lookup_skb finds reqsk2 while the tcp ack
> >> packet acknowledges seq1, it fails the check:
> >> TCP_SKB_CB(skb)->ack_seq != tcp_rsk(req)->snt_isn + 1)
> >> and then a tcp rst is sent to the client and the connection is closed.
> >>
> >> To fix this, do a lookup before calling inet_csk_reqsk_queue_hash_add
> >> to add reqsk2 to the hash table. If it finds the existing reqsk1 with the
> >> same five-tuple, it removes reqsk2 and does not send a synack to the client.
> >>
> >> Signed-off-by: Mao Wenan 
> >> ---
> >>  net/ipv4/tcp_input.c | 9 +
> >>  1 file changed, 9 insertions(+)
> >>
> >> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> >> index 08a477e74cf3..c75eeb1fe098 100644
> >> --- a/net/ipv4/tcp_input.c
> >> +++ b/net/ipv4/tcp_input.c
> >> @@ -6569,6 +6569,15 @@ int tcp_conn_request(struct request_sock_ops *rsk_ops,
> >> bh_unlock_sock(fastopen_sk);
> >> sock_put(fastopen_sk);
> >> } else {
> >> +   struct sock *sk1 = req_to_sk(req);
> >> +   struct sock *sk2 = NULL;
> >> +   sk2 = __inet_lookup_established(sock_net(sk1), &tcp_hashinfo,
> >> +   sk1->sk_daddr, sk1->sk_dport,
> >> +   sk1->sk_rcv_saddr, sk1->sk_num,
> >> +   inet_iif(skb), inet_sdif(skb));
> >> +   if (sk2 != NULL)
> >> +   goto drop_and_release;
> >> +
> >> tcp_rsk(req)->tfo_listener = false;
> >> if (!want_cookie)
> >> inet_csk_reqsk_queue_hash_add(sk, req,
> >
> > This issue has been discussed last year.
> Can you share discussion information?


https://www.spinics.net/lists/netdev/msg507423.html


>
> >
> > I am afraid your patch does not solve all races.
> >
> > The lookup you add is lockless, so this is racy.
> That's right, it is already inside the race window.
> >
> > Really the only way to solve this is to make sure that _when_ the
> > bucket lock is held,
> > we do not insert a request socket if the 4-tuple is already in the
> > chain (probably in inet_ehash_insert())
> >
>
> put lookup code in spin_lock() of inet_ehash_insert(), is it ok like this?
> will it affect performance?
>
> in inet_ehash_insert():
> ...
> spin_lock(lock);
> +   reqsk = __inet_lookup_established(sock_net(sk), &tcp_hashinfo,
> +   sk->sk_daddr, sk->sk_dport,
> +   sk->sk_rcv_saddr, sk->sk_num,
> +   sk_bound_dev_if, sk_bound_dev_if);
> +   if (reqsk) {

You should test this before asking :)


> +   spin_unlock(lock);
> +   return ret;
> +   }
> +
> if (osk) {
> WARN_ON_ONCE(sk->sk_hash != osk->sk_hash);
> ret = sk_nulls_del_node_init_rcu(osk);
> }
> if (ret)
> __sk_nulls_add_node_rcu(sk, list);
> spin_unlock(lock);
> ...
>
> > This needs more tricky changes than your patch.
> >
> > .
> >
>
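
The point about doing the duplicate check while the bucket lock is held can be sketched in userspace as follows. This is a generic check-then-insert under one lock, not the kernel's ehash code: the types, names and the pthread mutex are stand-ins chosen for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct tuple {
	uint32_t saddr, daddr;
	uint16_t sport, dport;
};

struct node {
	struct tuple key;
	struct node *next;
};

struct bucket {
	pthread_mutex_t lock;
	struct node *head;
};

static bool tuple_equal(const struct tuple *a, const struct tuple *b)
{
	return a->saddr == b->saddr && a->daddr == b->daddr &&
	       a->sport == b->sport && a->dport == b->dport;
}

/*
 * Insert n unless an entry with the same 4-tuple is already chained.
 * Lookup and insert happen under the same lock, so two racing callers
 * cannot both succeed. Returns true if n was inserted.
 */
static bool bucket_insert_unique(struct bucket *b, struct node *n)
{
	bool inserted = true;

	pthread_mutex_lock(&b->lock);
	for (struct node *cur = b->head; cur; cur = cur->next) {
		if (tuple_equal(&cur->key, &n->key)) {
			inserted = false;
			break;
		}
	}
	if (inserted) {
		n->next = b->head;
		b->head = n;
	}
	pthread_mutex_unlock(&b->lock);
	return inserted;
}

/* Two requests for the same tuple: returns how many were inserted. */
static int demo_duplicate_syn(void)
{
	struct bucket b;
	struct tuple t = { 0x0a000001, 0x0a000002, 40000, 80 };
	struct node req1 = { t, NULL }, req2 = { t, NULL };
	int inserted = 0;

	pthread_mutex_init(&b.lock, NULL);
	b.head = NULL;
	inserted += bucket_insert_unique(&b, &req1);
	inserted += bucket_insert_unique(&b, &req2);
	pthread_mutex_destroy(&b.lock);
	return inserted;
}
```

A lockless lookup followed by a separate locked insert leaves a window between the two steps, which is the race pointed out above.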


Re: [PATCH net-next v4 00/10] net: dsa: mv88e6xxx: support for mv88e6250

2019-06-04 Thread David Miller
From: Rasmus Villemoes 
Date: Tue, 4 Jun 2019 07:34:22 +

> This adds support for the mv88e6250 chip. Initially based on the
> mv88e6240, this time around, I've been through each ->ops callback and
> checked that it makes sense, either replacing with a 6250 specific
> variant or dropping it if no equivalent functionality seems to exist
> for the 6250. Along the way, I found a few oddities in the existing
> code, mostly sent as separate patches/questions.
> 
> The one relevant to the 6250 is the ieee_pri_map callback, where the
> existing mv88e6085_g1_ieee_pri_map() is actually wrong for many of the
> existing users. I've put the mv88e6250_g1_ieee_pri_map() patch first
> in case some of the existing chips get switched over to use that and
> it is deemed important enough for -stable.
 ...

Series applied, thanks.


[PATCH] spi: mediatek: add SPI_LSB_FIRST support

2019-06-04 Thread Leilk Liu
This patch adds SPI_LSB_FIRST feature support.

Signed-off-by: Leilk Liu 
---
 drivers/spi/spi-mt65xx.c |   15 ++-
 include/linux/platform_data/spi-mt65xx.h |2 --
 2 files changed, 6 insertions(+), 11 deletions(-)

diff --git a/drivers/spi/spi-mt65xx.c b/drivers/spi/spi-mt65xx.c
index 0cce6f0..7f4dc18 100644
--- a/drivers/spi/spi-mt65xx.c
+++ b/drivers/spi/spi-mt65xx.c
@@ -131,8 +131,6 @@ struct mtk_spi {
  * supplies it.
  */
 static const struct mtk_chip_config mtk_default_chip_info = {
-   .rx_mlsb = 1,
-   .tx_mlsb = 1,
.cs_pol = 0,
.sample_sel = 0,
 };
@@ -203,14 +201,13 @@ static int mtk_spi_prepare_message(struct spi_master *master,
reg_val &= ~SPI_CMD_CPOL;
 
/* set the mlsbx and mlsbtx */
-   if (chip_config->tx_mlsb)
-   reg_val |= SPI_CMD_TXMSBF;
-   else
+   if (spi->mode & SPI_LSB_FIRST) {
reg_val &= ~SPI_CMD_TXMSBF;
-   if (chip_config->rx_mlsb)
-   reg_val |= SPI_CMD_RXMSBF;
-   else
reg_val &= ~SPI_CMD_RXMSBF;
+   } else {
+   reg_val |= SPI_CMD_TXMSBF;
+   reg_val |= SPI_CMD_RXMSBF;
+   }
 
/* set the tx/rx endian */
 #ifdef __LITTLE_ENDIAN
@@ -607,7 +604,7 @@ static int mtk_spi_probe(struct platform_device *pdev)
 
master->auto_runtime_pm = true;
master->dev.of_node = pdev->dev.of_node;
-   master->mode_bits = SPI_CPOL | SPI_CPHA;
+   master->mode_bits = SPI_CPOL | SPI_CPHA | SPI_LSB_FIRST;
 
master->set_cs = mtk_spi_set_cs;
master->prepare_message = mtk_spi_prepare_message;
diff --git a/include/linux/platform_data/spi-mt65xx.h b/include/linux/platform_data/spi-mt65xx.h
index ba4e4bb..8d5df58 100644
--- a/include/linux/platform_data/spi-mt65xx.h
+++ b/include/linux/platform_data/spi-mt65xx.h
@@ -14,8 +14,6 @@
 
 /* Board specific platform_data */
 struct mtk_chip_config {
-   u32 tx_mlsb;
-   u32 rx_mlsb;
u32 cs_pol;
u32 sample_sel;
 };
-- 
1.7.9.5
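
The bit-order selection in the patch above can be sketched as a standalone
function. This is only an illustration: the EX_* bit positions are made up for
the example and are not the real SPI_LSB_FIRST, SPI_CMD_TXMSBF or
SPI_CMD_RXMSBF values, and ex_apply_bit_order() is a hypothetical helper.

```c
#include <stdint.h>

/* Illustrative bit positions only; the real values live in the kernel headers. */
#define EX_SPI_LSB_FIRST (1u << 3)
#define EX_CMD_TXMSBF    (1u << 12)
#define EX_CMD_RXMSBF    (1u << 13)

/*
 * Derive the TX/RX "MSB first" command bits from the per-device mode:
 * LSB-first mode clears both bits, otherwise both are set.
 */
static uint32_t ex_apply_bit_order(uint32_t reg_val, uint32_t mode)
{
	const uint32_t msbf = EX_CMD_TXMSBF | EX_CMD_RXMSBF;

	if (mode & EX_SPI_LSB_FIRST)
		reg_val &= ~msbf;
	else
		reg_val |= msbf;

	return reg_val;
}
```

Taking the bit order from the mode flag rather than from board platform data
means TX and RX always agree, which is why the two per-direction fields can be
dropped from struct mtk_chip_config.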



Re: [PATCH net-next] vmxnet3: turn off lro when rxcsum is disabled

2019-06-04 Thread David Miller
From: Ronak Doshi 
Date: Mon, 3 Jun 2019 23:58:38 -0700

> Currently, when rx csum is disabled, vmxnet3 driver does not turn
> off lro, which can cause performance issues if user does not turn off
> lro explicitly. This patch adds fix_features support which is used to
> turn off LRO whenever RXCSUM is disabled.
> 
> Signed-off-by: Ronak Doshi 
> Acked-by: Rishi Mehta 

Applied.
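
The dependency described in the commit message (LRO must be off whenever
RXCSUM is off) amounts to a small feature fix-up. A minimal sketch, assuming
made-up flag bits and a hypothetical ex_fix_features() helper rather than the
driver's actual callback:

```c
#include <stdint.h>

typedef uint64_t ex_features_t;

/* Illustrative flag bits; not the kernel's NETIF_F_* values. */
#define EX_F_RXCSUM (1ULL << 0)
#define EX_F_LRO    (1ULL << 1)

/*
 * Return the requested feature set with LRO masked out whenever
 * RX checksum offload is not requested.
 */
static ex_features_t ex_fix_features(ex_features_t features)
{
	if (!(features & EX_F_RXCSUM))
		features &= ~EX_F_LRO;

	return features;
}
```

Because the fix-up is applied to every requested feature set, a user who only
disables rx checksumming gets LRO disabled as well without a second command.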


Re: [PATCH] net: ipvlan: Fix ipvlan device tso disabled while NETIF_F_IP_CSUM is set

2019-06-04 Thread David Miller
From: Miaohe Lin 
Date: Tue, 4 Jun 2019 06:07:34 +

> There are some NICs, such as hinic, with NETIF_F_IP_CSUM and NETIF_F_TSO
> on but NETIF_F_HW_CSUM off. And ipvlan device features will be
> NETIF_F_TSO on with NETIF_F_IP_CSUM and NETIF_F_IP_CSUM both off as
> IPVLAN_FEATURES only care about NETIF_F_HW_CSUM. So TSO will be
> disabled in netdev_fix_features.
> For example:
> Features for enp129s0f0:
> rx-checksumming: on
> tx-checksumming: on
> tx-checksum-ipv4: on
> tx-checksum-ip-generic: off [fixed]
> tx-checksum-ipv6: on
> 
> Fixes: a188222b6ed2 ("net: Rename NETIF_F_ALL_CSUM to NETIF_F_CSUM_MASK")
> Signed-off-by: Miaohe Lin 

Applied.


Re: [PATCH v5 2/8] KVM: x86: Implement CET CPUID support for Guest

2019-06-04 Thread Yang Weijiang
On Tue, Jun 04, 2019 at 12:58:01PM -0700, Sean Christopherson wrote:
> On Wed, May 22, 2019 at 03:00:55PM +0800, Yang Weijiang wrote:
> > CET SHSTK and IBT features are introduced here so that
> > CPUID.(EAX=7, ECX=0):ECX[bit 7] and EDX[bit 20] reflect them.
> > CET xsave components for supervisor and user mode are reported
> > via CPUID.(EAX=0xD, ECX=1):ECX[bit 11] and ECX[bit 12]
> > respectively.
> > 
> > To make the code look clean, wrap CPUID(0xD,n>=1) report code in
> > a helper function now.
> 
> Create the helper in a separate patch so that it's introduced without
> any functional changes.
OK, will add a new patch to put the helper.
>  
> > Signed-off-by: Yang Weijiang 
> > Co-developed-by: Zhang Yi Z 
> > ---
> >  arch/x86/include/asm/kvm_host.h |  4 +-
> >  arch/x86/kvm/cpuid.c| 97 +
> >  arch/x86/kvm/vmx/vmx.c  |  6 ++
> >  arch/x86/kvm/x86.h  |  4 ++
> >  4 files changed, 76 insertions(+), 35 deletions(-)
> > 
> > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> > index a5db4475e72d..8c3f0ddc7676 100644
> > --- a/arch/x86/include/asm/kvm_host.h
> > +++ b/arch/x86/include/asm/kvm_host.h
> > @@ -91,7 +91,8 @@
> >   | X86_CR4_PGE | X86_CR4_PCE | X86_CR4_OSFXSR | X86_CR4_PCIDE \
> >   | X86_CR4_OSXSAVE | X86_CR4_SMEP | X86_CR4_FSGSBASE \
> >   | X86_CR4_OSXMMEXCPT | X86_CR4_LA57 | X86_CR4_VMXE \
> > - | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP))
> > + | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP \
> > + | X86_CR4_CET))
> 
> As I mentioned in v4, the patch ordering is wrong.  Features shouldn't be
> advertised to userspace or exposed to the guest until they're fully
> supported in KVM, i.e. the bulk of this patch to advertise the CPUID bits
> and allow CR4.CET=1 belongs at the end of the series.
> 
How about merging it into patch 6/8?
> >  #define CR8_RESERVED_BITS (~(unsigned long)X86_CR8_TPR)
> >  
> > @@ -1192,6 +1193,7 @@ struct kvm_x86_ops {
> > int (*nested_enable_evmcs)(struct kvm_vcpu *vcpu,
> >uint16_t *vmcs_version);
> > uint16_t (*nested_get_evmcs_version)(struct kvm_vcpu *vcpu);
> > +   u64 (*supported_xss)(void);
> >  };
> >  
> >  struct kvm_arch_async_pf {
> > diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> > index fd3951638ae4..b9fc967fe55a 100644
> > --- a/arch/x86/kvm/cpuid.c
> > +++ b/arch/x86/kvm/cpuid.c
> > @@ -65,6 +65,11 @@ u64 kvm_supported_xcr0(void)
> > return xcr0;
> >  }
> >  
> > +u64 kvm_supported_xss(void)
> > +{
> > +   return KVM_SUPPORTED_XSS & kvm_x86_ops->supported_xss();
> > +}
> > +
> >  #define F(x) bit(X86_FEATURE_##x)
> >  
> >  int kvm_update_cpuid(struct kvm_vcpu *vcpu)
> > @@ -316,6 +321,50 @@ static int __do_cpuid_ent_emulated(struct kvm_cpuid_entry2 *entry,
> > return 0;
> >  }
> >  
> > +static inline int __do_cpuid_dx_leaf(struct kvm_cpuid_entry2 *entry, int *nent,
> > +int maxnent, u64 xss_mask, u64 xcr0_mask,
> > +u32 eax_mask)
> > +{
> > +   int idx, i;
> > +   u64 mask;
> > +   u64 supported;
> > +
> > +   for (idx = 1, i = 1; idx < 64; ++idx) {
> > +   mask = ((u64)1 << idx);
> > +   if (*nent >= maxnent)
> > +   return -EINVAL;
> > +
> > +   do_cpuid_1_ent(&entry[i], 0xD, idx);
> > +   if (idx == 1) {
> > +   entry[i].eax &= eax_mask;
> > +   cpuid_mask(&entry[i].eax, CPUID_D_1_EAX);
> > +   supported = xcr0_mask | xss_mask;
> > +   entry[i].ebx = 0;
> > +   entry[i].edx = 0;
> > +   entry[i].ecx &= xss_mask;
> > +   if (entry[i].eax & (F(XSAVES) | F(XSAVEC))) {
> > +   entry[i].ebx =
> > +   xstate_required_size(supported,
> > +true);
> > +   }
> > +   } else {
> > +   supported = (entry[i].ecx & 1) ? xss_mask :
> > +xcr0_mask;
> > +   if (entry[i].eax == 0 || !(supported & mask))
> > +   continue;
> > +   entry[i].ecx &= 1;
> > +   entry[i].edx = 0;
> > +   if (entry[i].ecx)
> > +   entry[i].ebx = 0;
> > +   }
> > +   entry[i].flags |=
> > +   KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
> > +   ++*nent;
> > +   ++i;
> > +   }
> > +   return 0;
> > +}
> > +
> >  static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
> >  u32 index, int *nent, int maxnent)
> >  {
> > @@ -405,12 +454,13 @@ static inline int __do_cpuid_ent(struct 
> > kvm_cpuid_entry2 *entry, u32 

[PATCH v6 05/10] mm: introduce __memcg_kmem_uncharge_memcg()

2019-06-04 Thread Roman Gushchin
Let's separate the page counter modification code out of
__memcg_kmem_uncharge(), similar to how __memcg_kmem_charge()
and __memcg_kmem_charge_memcg() are split.

This will allow reusing this code later via a new
memcg_kmem_uncharge_memcg() wrapper, which calls
__memcg_kmem_uncharge_memcg() if the memcg_kmem_enabled()
check passes.

Signed-off-by: Roman Gushchin 
Reviewed-by: Shakeel Butt 
---
 include/linux/memcontrol.h | 10 ++
 mm/memcontrol.c| 25 +
 2 files changed, 27 insertions(+), 8 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 3ca57bacfdd2..9abf31bbe53a 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -1304,6 +1304,8 @@ int __memcg_kmem_charge(struct page *page, gfp_t gfp, int order);
 void __memcg_kmem_uncharge(struct page *page, int order);
 int __memcg_kmem_charge_memcg(struct page *page, gfp_t gfp, int order,
  struct mem_cgroup *memcg);
+void __memcg_kmem_uncharge_memcg(struct mem_cgroup *memcg,
+unsigned int nr_pages);
 
 extern struct static_key_false memcg_kmem_enabled_key;
 extern struct workqueue_struct *memcg_kmem_cache_wq;
@@ -1345,6 +1347,14 @@ static inline int memcg_kmem_charge_memcg(struct page *page, gfp_t gfp,
return __memcg_kmem_charge_memcg(page, gfp, order, memcg);
return 0;
 }
+
+static inline void memcg_kmem_uncharge_memcg(struct page *page, int order,
+struct mem_cgroup *memcg)
+{
+   if (memcg_kmem_enabled())
+   __memcg_kmem_uncharge_memcg(memcg, 1 << order);
+}
+
 /*
  * helper for accessing a memcg's index. It will be used as an index in the
  * child cache array in kmem_cache, and also to derive its name. This function
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index bdb66871cdec..3427396da612 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2731,6 +2731,22 @@ int __memcg_kmem_charge(struct page *page, gfp_t gfp, int order)
css_put(&memcg->css);
return ret;
 }
+
+/**
+ * __memcg_kmem_uncharge_memcg: uncharge a kmem page
+ * @memcg: memcg to uncharge
+ * @nr_pages: number of pages to uncharge
+ */
+void __memcg_kmem_uncharge_memcg(struct mem_cgroup *memcg,
+unsigned int nr_pages)
+{
+   if (!cgroup_subsys_on_dfl(memory_cgrp_subsys))
+   page_counter_uncharge(&memcg->kmem, nr_pages);
+
+   page_counter_uncharge(&memcg->memory, nr_pages);
+   if (do_memsw_account())
+   page_counter_uncharge(&memcg->memsw, nr_pages);
+}
 /**
  * __memcg_kmem_uncharge: uncharge a kmem page
  * @page: page to uncharge
@@ -2745,14 +2761,7 @@ void __memcg_kmem_uncharge(struct page *page, int order)
return;
 
VM_BUG_ON_PAGE(mem_cgroup_is_root(memcg), page);
-
-   if (!cgroup_subsys_on_dfl(memory_cgrp_subsys))
-   page_counter_uncharge(&memcg->kmem, nr_pages);
-
-   page_counter_uncharge(&memcg->memory, nr_pages);
-   if (do_memsw_account())
-   page_counter_uncharge(&memcg->memsw, nr_pages);
-
+   __memcg_kmem_uncharge_memcg(memcg, nr_pages);
page->mem_cgroup = NULL;
 
/* slab pages do not have PageKmemcg flag set */
-- 
2.20.1
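
The split introduced by this patch can be modelled in a few lines of userspace
C. This is a sketch under stated assumptions: the ex_* structures are
simplified stand-ins for the memcg page counters, and the two boolean fields
replace the cgroup_subsys_on_dfl() and do_memsw_account() checks.

```c
#include <stdbool.h>

/* Simplified stand-in for a page counter. */
struct ex_counter {
	long usage;
};

struct ex_memcg {
	struct ex_counter memory, kmem, memsw;
	bool legacy_hierarchy;  /* stands in for !cgroup_subsys_on_dfl() */
	bool memsw_accounting;  /* stands in for do_memsw_account() */
};

static void ex_counter_uncharge(struct ex_counter *c, unsigned int nr_pages)
{
	c->usage -= (long)nr_pages;
}

/* Counter-only part: needs just the memcg and a page count. */
static void ex_uncharge_memcg(struct ex_memcg *memcg, unsigned int nr_pages)
{
	if (memcg->legacy_hierarchy)
		ex_counter_uncharge(&memcg->kmem, nr_pages);

	ex_counter_uncharge(&memcg->memory, nr_pages);
	if (memcg->memsw_accounting)
		ex_counter_uncharge(&memcg->memsw, nr_pages);
}

/* Wrapper that converts an allocation order into a page count. */
static void ex_uncharge_order(struct ex_memcg *memcg, int order)
{
	ex_uncharge_memcg(memcg, 1u << order);
}
```

Keeping the counter updates separate from the page handling lets a caller that
already knows the memcg uncharge it without touching page->mem_cgroup.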



[PATCH v6 03/10] mm: rename slab delayed deactivation functions and fields

2019-06-04 Thread Roman Gushchin
The delayed work/rcu deactivation infrastructure of non-root
kmem_caches can be also used for asynchronous release of these
objects. Let's get rid of the word "deactivation" in corresponding
names to make the code look better after generalization.

It's easier to make the renaming first, so that the generalized
code will look consistent from scratch.

Let's rename struct memcg_cache_params fields:
  deact_fn -> work_fn
  deact_rcu_head -> rcu_head
  deact_work -> work

And RCU/delayed work callbacks in slab common code:
  kmemcg_deactivate_rcufn -> kmemcg_rcufn
  kmemcg_deactivate_workfn -> kmemcg_workfn

This patch contains no functional changes, only renamings.

Signed-off-by: Roman Gushchin 
---
 include/linux/slab.h |  6 +++---
 mm/slab.h|  2 +-
 mm/slab_common.c | 30 +++---
 3 files changed, 19 insertions(+), 19 deletions(-)

diff --git a/include/linux/slab.h b/include/linux/slab.h
index 9449b19c5f10..47923c173f30 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -642,10 +642,10 @@ struct memcg_cache_params {
struct list_head children_node;
struct list_head kmem_caches_node;
 
-   void (*deact_fn)(struct kmem_cache *);
+   void (*work_fn)(struct kmem_cache *);
union {
-   struct rcu_head deact_rcu_head;
-   struct work_struct deact_work;
+   struct rcu_head rcu_head;
+   struct work_struct work;
};
};
};
diff --git a/mm/slab.h b/mm/slab.h
index c16e5af0fb59..8ff90f42548a 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -292,7 +292,7 @@ static __always_inline void memcg_uncharge_slab(struct page *page, int order,
 extern void slab_init_memcg_params(struct kmem_cache *);
 extern void memcg_link_cache(struct kmem_cache *s, struct mem_cgroup *memcg);
 extern void slab_deactivate_memcg_cache_rcu_sched(struct kmem_cache *s,
-   void (*deact_fn)(struct kmem_cache *));
+   void (*work_fn)(struct kmem_cache *));
 
 #else /* CONFIG_MEMCG_KMEM */
 
diff --git a/mm/slab_common.c b/mm/slab_common.c
index 77df6029de8e..d019ee66bdc4 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -692,17 +692,17 @@ void memcg_create_kmem_cache(struct mem_cgroup *memcg,
put_online_cpus();
 }
 
-static void kmemcg_deactivate_workfn(struct work_struct *work)
+static void kmemcg_workfn(struct work_struct *work)
 {
struct kmem_cache *s = container_of(work, struct kmem_cache,
-   memcg_params.deact_work);
+   memcg_params.work);
 
get_online_cpus();
get_online_mems();
 
mutex_lock(&slab_mutex);
 
-   s->memcg_params.deact_fn(s);
+   s->memcg_params.work_fn(s);
 
mutex_unlock(&slab_mutex);
 
@@ -713,36 +713,36 @@ static void kmemcg_deactivate_workfn(struct work_struct *work)
css_put(&s->memcg_params.memcg->css);
 }
 
-static void kmemcg_deactivate_rcufn(struct rcu_head *head)
+static void kmemcg_rcufn(struct rcu_head *head)
 {
struct kmem_cache *s = container_of(head, struct kmem_cache,
-   memcg_params.deact_rcu_head);
+   memcg_params.rcu_head);
 
/*
-* We need to grab blocking locks.  Bounce to ->deact_work.  The
+* We need to grab blocking locks.  Bounce to ->work.  The
 * work item shares the space with the RCU head and can't be
 * initialized eariler.
 */
-   INIT_WORK(&s->memcg_params.deact_work, kmemcg_deactivate_workfn);
-   queue_work(memcg_kmem_cache_wq, &s->memcg_params.deact_work);
+   INIT_WORK(&s->memcg_params.work, kmemcg_workfn);
+   queue_work(memcg_kmem_cache_wq, &s->memcg_params.work);
 }
 
 /**
  * slab_deactivate_memcg_cache_rcu_sched - schedule deactivation after a
  *sched RCU grace period
  * @s: target kmem_cache
- * @deact_fn: deactivation function to call
+ * @work_fn: deactivation function to call
  *
- * Schedule @deact_fn to be invoked with online cpus, mems and slab_mutex
+ * Schedule @work_fn to be invoked with online cpus, mems and slab_mutex
  * held after a sched RCU grace period.  The slab is guaranteed to stay
- * alive until @deact_fn is finished.  This is to be used from
+ * alive until @work_fn is finished.  This is to be used from
  * __kmemcg_cache_deactivate().
  */
 void slab_deactivate_memcg_cache_rcu_sched(struct kmem_cache *s,
-  void (*deact_fn)(struct kmem_cache *))
+  void (*work_fn)(struct kmem_cache *))
 {
if (WARN_ON_ONCE(is_root_cache(s)) ||
-   WARN_ON_ONCE(s->memcg_params.deact_fn))
+   

[PATCH v6 07/10] mm: synchronize access to kmem_cache dying flag using a spinlock

2019-06-04 Thread Roman Gushchin
Currently the memcg_params.dying flag and the corresponding
workqueue used for the asynchronous deactivation of kmem_caches
are synchronized using the slab_mutex.

This makes it impossible to check the flag from irq context,
which will be required in order to implement asynchronous release
of kmem_caches.

So let's switch over to the irq-save flavor of the spinlock-based
synchronization.

Signed-off-by: Roman Gushchin 
---
 mm/slab_common.c | 19 +++
 1 file changed, 15 insertions(+), 4 deletions(-)

diff --git a/mm/slab_common.c b/mm/slab_common.c
index 09b26673b63f..2914a8f0aa85 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -130,6 +130,7 @@ int __kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags, size_t nr,
 #ifdef CONFIG_MEMCG_KMEM
 
 LIST_HEAD(slab_root_caches);
+static DEFINE_SPINLOCK(memcg_kmem_wq_lock);
 
 void slab_init_memcg_params(struct kmem_cache *s)
 {
@@ -629,6 +630,7 @@ void memcg_create_kmem_cache(struct mem_cgroup *memcg,
struct memcg_cache_array *arr;
struct kmem_cache *s = NULL;
char *cache_name;
+   bool dying;
int idx;
 
get_online_cpus();
@@ -640,7 +642,13 @@ void memcg_create_kmem_cache(struct mem_cgroup *memcg,
 * The memory cgroup could have been offlined while the cache
 * creation work was pending.
 */
-   if (memcg->kmem_state != KMEM_ONLINE || root_cache->memcg_params.dying)
+   if (memcg->kmem_state != KMEM_ONLINE)
+   goto out_unlock;
+
+   spin_lock_irq(&memcg_kmem_wq_lock);
+   dying = root_cache->memcg_params.dying;
+   spin_unlock_irq(&memcg_kmem_wq_lock);
+   if (dying)
goto out_unlock;
 
idx = memcg_cache_id(memcg);
@@ -735,14 +743,17 @@ static void kmemcg_cache_deactivate(struct kmem_cache *s)
 
__kmemcg_cache_deactivate(s);
 
+   spin_lock_irq(&memcg_kmem_wq_lock);
if (s->memcg_params.root_cache->memcg_params.dying)
-   return;
+   goto unlock;
 
/* pin memcg so that @s doesn't get destroyed in the middle */
css_get(&s->memcg_params.memcg->css);
 
s->memcg_params.work_fn = __kmemcg_cache_deactivate_after_rcu;
call_rcu(&s->memcg_params.rcu_head, kmemcg_rcufn);
+unlock:
+   spin_unlock_irq(&memcg_kmem_wq_lock);
 }
 
 void memcg_deactivate_kmem_caches(struct mem_cgroup *memcg)
@@ -852,9 +863,9 @@ static int shutdown_memcg_caches(struct kmem_cache *s)
 
 static void flush_memcg_workqueue(struct kmem_cache *s)
 {
-   mutex_lock(&slab_mutex);
+   spin_lock_irq(&memcg_kmem_wq_lock);
s->memcg_params.dying = true;
-   mutex_unlock(&slab_mutex);
+   spin_unlock_irq(&memcg_kmem_wq_lock);
 
/*
 * SLAB and SLUB deactivate the kmem_caches through call_rcu. Make
-- 
2.20.1



[PATCH v6 08/10] mm: rework non-root kmem_cache lifecycle management

2019-06-04 Thread Roman Gushchin
Currently each charged slab page holds a reference to the cgroup to
which it's charged. Kmem_caches are held by the memcg and are released
all together with the memory cgroup. This means that no kmem_cache
is released as long as at least one reference to the memcg exists,
which is very far from optimal.

Let's rework it in a way that allows releasing individual kmem_caches
as soon as the cgroup is offline, the kmem_cache is empty and there
are no pending allocations.

To make it possible, let's introduce a new percpu refcounter for
non-root kmem caches. The counter is initialized to the percpu mode,
and is switched to the atomic mode during kmem_cache deactivation. The
counter is bumped for every charged page and also for every running
allocation. So the kmem_cache can't be released unless all allocations
complete.

To shut down non-active empty kmem_caches, let's reuse the work queue,
previously used for the kmem_cache deactivation. Once the reference
counter reaches 0, let's schedule an asynchronous kmem_cache release.

* I used the following simple approach to test the performance
(stolen from another patchset by T. Harding):

time find / -name fname-no-exist
echo 2 > /proc/sys/vm/drop_caches
repeat 10 times

Results:

        orig            patched

real    0m1.455s        real    0m1.355s
user    0m0.206s        user    0m0.219s
sys     0m0.855s        sys     0m0.807s

real    0m1.487s        real    0m1.699s
user    0m0.221s        user    0m0.256s
sys     0m0.806s        sys     0m0.948s

real    0m1.515s        real    0m1.505s
user    0m0.183s        user    0m0.215s
sys     0m0.876s        sys     0m0.858s

real    0m1.291s        real    0m1.380s
user    0m0.193s        user    0m0.198s
sys     0m0.843s        sys     0m0.786s

real    0m1.364s        real    0m1.374s
user    0m0.180s        user    0m0.182s
sys     0m0.868s        sys     0m0.806s

real    0m1.352s        real    0m1.312s
user    0m0.201s        user    0m0.212s
sys     0m0.820s        sys     0m0.761s

real    0m1.302s        real    0m1.349s
user    0m0.205s        user    0m0.203s
sys     0m0.803s        sys     0m0.792s

real    0m1.334s        real    0m1.301s
user    0m0.194s        user    0m0.201s
sys     0m0.806s        sys     0m0.779s

real    0m1.426s        real    0m1.434s
user    0m0.216s        user    0m0.181s
sys     0m0.824s        sys     0m0.864s

real    0m1.350s        real    0m1.295s
user    0m0.200s        user    0m0.190s
sys     0m0.842s        sys     0m0.811s

So it looks like the difference is not noticeable in this test.

Signed-off-by: Roman Gushchin 
---
 include/linux/slab.h |  3 +-
 mm/memcontrol.c  | 51 +---
 mm/slab.h| 45 +++--
 mm/slab_common.c | 79 ++--
 4 files changed, 100 insertions(+), 78 deletions(-)

diff --git a/include/linux/slab.h b/include/linux/slab.h
index 47923c173f30..1b54e5f83342 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -16,6 +16,7 @@
 #include 
 #include 
 #include 
+#include 
 
 
 /*
@@ -152,7 +153,6 @@ int kmem_cache_shrink(struct kmem_cache *);
 
 void memcg_create_kmem_cache(struct mem_cgroup *, struct kmem_cache *);
 void memcg_deactivate_kmem_caches(struct mem_cgroup *);
-void memcg_destroy_kmem_caches(struct mem_cgroup *);
 
 /*
  * Please use this macro to create slab caches. Simply specify the
@@ -641,6 +641,7 @@ struct memcg_cache_params {
struct mem_cgroup *memcg;
struct list_head children_node;
struct list_head kmem_caches_node;
+   struct percpu_ref refcnt;
 
void (*work_fn)(struct kmem_cache *);
union {
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 3427396da612..49084e2d81ff 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2591,12 +2591,13 @@ static void memcg_schedule_kmem_cache_create(struct mem_cgroup *memcg,
 {
struct memcg_kmem_cache_create_work *cw;
 
+   if (!css_tryget_online(&memcg->css))
+   return;
+
cw = kmalloc(sizeof(*cw), GFP_NOWAIT | __GFP_NOWARN);
if (!cw)
return;
 
-   css_get(&memcg->css);
-
cw->memcg = memcg;
cw->cachep = cachep;
INIT_WORK(&cw->work, memcg_kmem_cache_create_func);
@@ -2631,6 +2632,7 @@ struct kmem_cache *memcg_kmem_get_cache(struct kmem_cache *cachep)
 {
struct mem_cgroup *memcg;
struct kmem_cache *memcg_cachep;
+   struct memcg_cache_array *arr;
int kmemcg_id;
 
VM_BUG_ON(!is_root_cache(cachep));
@@ -2638,14 +2640,29 @@ struct kmem_cache *memcg_kmem_get_cache(struct kmem_cache *cachep)
if (memcg_kmem_bypass())
return cachep;
 
-   memcg = get_mem_cgroup_from_current();
+   rcu_read_lock();
+
+   if (unlikely(current->active_memcg))
+   memcg = current->active_memcg;
+   else
+   
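
The lifecycle described in the commit message (a reference per charged page
and per running allocation, with release scheduled once the count drops to
zero after deactivation) can be sketched with a plain atomic counter. This is
a simplified model with hypothetical ex_* names: it does not reproduce the
percpu mode of struct percpu_ref, and a flag stands in for queueing the
release work.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct ex_cache {
	atomic_long refcnt;  /* starts at 1: the base reference */
	bool released;       /* stand-in for "release work was queued" */
};

static void ex_cache_init(struct ex_cache *c)
{
	atomic_init(&c->refcnt, 1);
	c->released = false;
}

/* Taken for every charged page and every running allocation. */
static void ex_cache_get(struct ex_cache *c)
{
	atomic_fetch_add(&c->refcnt, 1);
}

/* Dropping the last reference triggers the (modelled) release. */
static void ex_cache_put(struct ex_cache *c)
{
	/* atomic_fetch_sub() returns the value before the decrement. */
	if (atomic_fetch_sub(&c->refcnt, 1) == 1)
		c->released = true;
}

/* Deactivation drops the base reference taken at init time. */
static void ex_cache_deactivate(struct ex_cache *c)
{
	ex_cache_put(c);
}
```

In this model, a cache that still has charged pages survives deactivation and
is released only when the last page is uncharged; an empty cache is released
as soon as it is deactivated.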

[PATCH v6 02/10] mm: postpone kmem_cache memcg pointer initialization to memcg_link_cache()

2019-06-04 Thread Roman Gushchin
Initialize kmem_cache->memcg_params.memcg pointer in
memcg_link_cache() rather than in init_memcg_params().

Once kmem_cache holds a reference to the memory cgroup,
this will simplify the refcounting.

For non-root kmem_caches memcg_link_cache() is always called
before the kmem_cache becomes visible to a user, so it's safe.

Signed-off-by: Roman Gushchin 
Reviewed-by: Shakeel Butt 
Acked-by: Vladimir Davydov 
Acked-by: Johannes Weiner 
---
 mm/slab.c|  2 +-
 mm/slab.h|  5 +++--
 mm/slab_common.c | 14 +++---
 mm/slub.c|  2 +-
 4 files changed, 12 insertions(+), 11 deletions(-)

diff --git a/mm/slab.c b/mm/slab.c
index 9e3eee5568b6..a4091f8b3655 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -1239,7 +1239,7 @@ void __init kmem_cache_init(void)
  nr_node_ids * sizeof(struct kmem_cache_node *),
  SLAB_HWCACHE_ALIGN, 0, 0);
list_add(&kmem_cache->list, &slab_caches);
-   memcg_link_cache(kmem_cache);
+   memcg_link_cache(kmem_cache, NULL);
slab_state = PARTIAL;
 
/*
diff --git a/mm/slab.h b/mm/slab.h
index 1176b61bb8fc..c16e5af0fb59 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -290,7 +290,7 @@ static __always_inline void memcg_uncharge_slab(struct page *page, int order,
 }
 
 extern void slab_init_memcg_params(struct kmem_cache *);
-extern void memcg_link_cache(struct kmem_cache *s);
+extern void memcg_link_cache(struct kmem_cache *s, struct mem_cgroup *memcg);
 extern void slab_deactivate_memcg_cache_rcu_sched(struct kmem_cache *s,
void (*deact_fn)(struct kmem_cache *));
 
@@ -345,7 +345,8 @@ static inline void slab_init_memcg_params(struct kmem_cache *s)
 {
 }
 
-static inline void memcg_link_cache(struct kmem_cache *s)
+static inline void memcg_link_cache(struct kmem_cache *s,
+   struct mem_cgroup *memcg)
 {
 }
 
diff --git a/mm/slab_common.c b/mm/slab_common.c
index 8092bdfc05d5..77df6029de8e 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -140,13 +140,12 @@ void slab_init_memcg_params(struct kmem_cache *s)
 }
 
 static int init_memcg_params(struct kmem_cache *s,
-   struct mem_cgroup *memcg, struct kmem_cache *root_cache)
+struct kmem_cache *root_cache)
 {
struct memcg_cache_array *arr;
 
if (root_cache) {
s->memcg_params.root_cache = root_cache;
-   s->memcg_params.memcg = memcg;
INIT_LIST_HEAD(&s->memcg_params.children_node);
INIT_LIST_HEAD(&s->memcg_params.kmem_caches_node);
return 0;
@@ -221,11 +220,12 @@ int memcg_update_all_caches(int num_memcgs)
return ret;
 }
 
-void memcg_link_cache(struct kmem_cache *s)
+void memcg_link_cache(struct kmem_cache *s, struct mem_cgroup *memcg)
 {
if (is_root_cache(s)) {
list_add(&s->root_caches_node, &slab_root_caches);
} else {
+   s->memcg_params.memcg = memcg;
list_add(&s->memcg_params.children_node,
 &s->memcg_params.root_cache->memcg_params.children);
list_add(&s->memcg_params.kmem_caches_node,
@@ -244,7 +244,7 @@ static void memcg_unlink_cache(struct kmem_cache *s)
 }
 #else
 static inline int init_memcg_params(struct kmem_cache *s,
-   struct mem_cgroup *memcg, struct kmem_cache *root_cache)
+   struct kmem_cache *root_cache)
 {
return 0;
 }
@@ -384,7 +384,7 @@ static struct kmem_cache *create_cache(const char *name,
s->useroffset = useroffset;
s->usersize = usersize;
 
-   err = init_memcg_params(s, memcg, root_cache);
+   err = init_memcg_params(s, root_cache);
if (err)
goto out_free_cache;
 
@@ -394,7 +394,7 @@ static struct kmem_cache *create_cache(const char *name,
 
s->refcount = 1;
list_add(&s->list, &slab_caches);
-   memcg_link_cache(s);
+   memcg_link_cache(s, memcg);
 out:
if (err)
return ERR_PTR(err);
@@ -998,7 +998,7 @@ struct kmem_cache *__init create_kmalloc_cache(const char *name,
 
create_boot_cache(s, name, size, flags, useroffset, usersize);
list_add(&s->list, &slab_caches);
-   memcg_link_cache(s);
+   memcg_link_cache(s, NULL);
s->refcount = 1;
return s;
 }
diff --git a/mm/slub.c b/mm/slub.c
index 1802c87799ff..9cb2eef62a37 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -4213,7 +4213,7 @@ static struct kmem_cache * __init bootstrap(struct kmem_cache *static_cache)
}
slab_init_memcg_params(s);
list_add(&s->list, &slab_caches);
-   memcg_link_cache(s);
+   memcg_link_cache(s, NULL);
return s;
 }
 
-- 
2.20.1



[PATCH v6 06/10] mm: unify SLAB and SLUB page accounting

2019-06-04 Thread Roman Gushchin
Currently the page accounting code is duplicated in SLAB and SLUB
internals. Let's move it into new (un)charge_slab_page helpers
in the slab_common.c file. These helpers will be responsible
for statistics (global and memcg-aware) and memcg charging.
So they are replacing direct memcg_(un)charge_slab() calls.

Signed-off-by: Roman Gushchin 
Reviewed-by: Shakeel Butt 
Acked-by: Christoph Lameter 
Acked-by: Vladimir Davydov 
Acked-by: Johannes Weiner 
---
 mm/slab.c | 19 +++
 mm/slab.h | 25 +
 mm/slub.c | 14 ++
 3 files changed, 30 insertions(+), 28 deletions(-)

diff --git a/mm/slab.c b/mm/slab.c
index 4b865393ebb4..b417824a9b15 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -1360,7 +1360,6 @@ static struct page *kmem_getpages(struct kmem_cache *cachep, gfp_t flags,
int nodeid)
 {
struct page *page;
-   int nr_pages;
 
flags |= cachep->allocflags;
 
@@ -1370,17 +1369,11 @@ static struct page *kmem_getpages(struct kmem_cache *cachep, gfp_t flags,
return NULL;
}
 
-   if (memcg_charge_slab(page, flags, cachep->gfporder, cachep)) {
+   if (charge_slab_page(page, flags, cachep->gfporder, cachep)) {
__free_pages(page, cachep->gfporder);
return NULL;
}
 
-   nr_pages = (1 << cachep->gfporder);
-   if (cachep->flags & SLAB_RECLAIM_ACCOUNT)
-   mod_lruvec_page_state(page, NR_SLAB_RECLAIMABLE, nr_pages);
-   else
-   mod_lruvec_page_state(page, NR_SLAB_UNRECLAIMABLE, nr_pages);
-
__SetPageSlab(page);
/* Record if ALLOC_NO_WATERMARKS was set when allocating the slab */
if (sk_memalloc_socks() && page_is_pfmemalloc(page))
@@ -1395,12 +1388,6 @@ static struct page *kmem_getpages(struct kmem_cache *cachep, gfp_t flags,
 static void kmem_freepages(struct kmem_cache *cachep, struct page *page)
 {
int order = cachep->gfporder;
-   unsigned long nr_freed = (1 << order);
-
-   if (cachep->flags & SLAB_RECLAIM_ACCOUNT)
-   mod_lruvec_page_state(page, NR_SLAB_RECLAIMABLE, -nr_freed);
-   else
-   mod_lruvec_page_state(page, NR_SLAB_UNRECLAIMABLE, -nr_freed);
 
BUG_ON(!PageSlab(page));
__ClearPageSlabPfmemalloc(page);
@@ -1409,8 +1396,8 @@ static void kmem_freepages(struct kmem_cache *cachep, struct page *page)
page->mapping = NULL;
 
if (current->reclaim_state)
-   current->reclaim_state->reclaimed_slab += nr_freed;
-   memcg_uncharge_slab(page, order, cachep);
+   current->reclaim_state->reclaimed_slab += 1 << order;
+   uncharge_slab_page(page, order, cachep);
__free_pages(page, order);
 }
 
diff --git a/mm/slab.h b/mm/slab.h
index d35d85794247..345fb59afb8f 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -205,6 +205,12 @@ ssize_t slabinfo_write(struct file *file, const char __user *buffer,
 void __kmem_cache_free_bulk(struct kmem_cache *, size_t, void **);
 int __kmem_cache_alloc_bulk(struct kmem_cache *, gfp_t, size_t, void **);
 
+static inline int cache_vmstat_idx(struct kmem_cache *s)
+{
+   return (s->flags & SLAB_RECLAIM_ACCOUNT) ?
+   NR_SLAB_RECLAIMABLE : NR_SLAB_UNRECLAIMABLE;
+}
+
 #ifdef CONFIG_MEMCG_KMEM
 
 /* List of all root caches. */
@@ -362,6 +368,25 @@ static inline struct kmem_cache *virt_to_cache(const void *obj)
return page->slab_cache;
 }
 
+static __always_inline int charge_slab_page(struct page *page,
+   gfp_t gfp, int order,
+   struct kmem_cache *s)
+{
+   int ret = memcg_charge_slab(page, gfp, order, s);
+
+   if (!ret)
+   mod_lruvec_page_state(page, cache_vmstat_idx(s), 1 << order);
+
+   return ret;
+}
+
+static __always_inline void uncharge_slab_page(struct page *page, int order,
+  struct kmem_cache *s)
+{
+   mod_lruvec_page_state(page, cache_vmstat_idx(s), -(1 << order));
+   memcg_uncharge_slab(page, order, s);
+}
+
 static inline struct kmem_cache *cache_from_obj(struct kmem_cache *s, void *x)
 {
struct kmem_cache *cachep;
diff --git a/mm/slub.c b/mm/slub.c
index ae3b1e49ecec..6a5174b51cd6 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1488,7 +1488,7 @@ static inline struct page *alloc_slab_page(struct kmem_cache *s,
else
page = __alloc_pages_node(node, flags, order);
 
-   if (page && memcg_charge_slab(page, flags, order, s)) {
+   if (page && charge_slab_page(page, flags, order, s)) {
__free_pages(page, order);
page = NULL;
}
@@ -1681,11 +1681,6 @@ static struct page *allocate_slab(struct kmem_cache *s, gfp_t flags, int node)
if (!page)
return NULL;
 
-   mod_lruvec_page_state(page,
-   (s->flags & SLAB_RECLAIM_ACCOUNT) ?

[PATCH v6 10/10] mm: reparent slab memory on cgroup removal

2019-06-04 Thread Roman Gushchin
Let's reparent memcg slab memory on memcg offlining. This allows us
to release the memory cgroup without waiting for the last outstanding
kernel object (e.g. dentry used by another application).

So instead of reparenting all accounted slab pages, let's reparent
a relatively small number of kmem_caches. Reparenting is performed as
a part of the deactivation process.

Since the parent cgroup is already charged, all we need to do
is to splice the list of kmem_caches to the parent's kmem_caches list,
swap the memcg pointer, drop the css refcounter for each kmem_cache
and adjust the parent's css refcounter. Quite simple.

Please note that kmem_cache->memcg_params.memcg isn't a stable
pointer anymore. It's safe to read it under rcu_read_lock() or
with slab_mutex held.

We can race with the slab allocation and deallocation paths. It's not
a big problem: parent's charge and slab global stats are always
correct, and we don't care anymore about the child usage and global
stats. The child cgroup is already offline, so we don't use or show it
anywhere.

Local slab stats (NR_SLAB_RECLAIMABLE and NR_SLAB_UNRECLAIMABLE)
aren't used anywhere except count_shadow_nodes(). But even there it
won't break anything: after reparenting "nodes" will be 0 on child
level (because we're already reparenting shrinker lists), and on
parent level page stats always were 0, and this patch won't change
anything.

Signed-off-by: Roman Gushchin 
---
 include/linux/slab.h |  4 ++--
 mm/list_lru.c|  8 +++-
 mm/memcontrol.c  | 14 --
 mm/slab.h| 23 +--
 mm/slab_common.c | 22 +++---
 5 files changed, 53 insertions(+), 18 deletions(-)

diff --git a/include/linux/slab.h b/include/linux/slab.h
index 1b54e5f83342..109cab2ad9b4 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -152,7 +152,7 @@ void kmem_cache_destroy(struct kmem_cache *);
 int kmem_cache_shrink(struct kmem_cache *);
 
 void memcg_create_kmem_cache(struct mem_cgroup *, struct kmem_cache *);
-void memcg_deactivate_kmem_caches(struct mem_cgroup *);
+void memcg_deactivate_kmem_caches(struct mem_cgroup *, struct mem_cgroup *);
 
 /*
  * Please use this macro to create slab caches. Simply specify the
@@ -638,7 +638,7 @@ struct memcg_cache_params {
bool dying;
};
struct {
-   struct mem_cgroup *memcg;
+   struct mem_cgroup __rcu *memcg;
struct list_head children_node;
struct list_head kmem_caches_node;
struct percpu_ref refcnt;
diff --git a/mm/list_lru.c b/mm/list_lru.c
index 0f1f6b06b7f3..0b2319897e86 100644
--- a/mm/list_lru.c
+++ b/mm/list_lru.c
@@ -77,11 +77,15 @@ list_lru_from_kmem(struct list_lru_node *nlru, void *ptr,
if (!nlru->memcg_lrus)
goto out;
 
+   rcu_read_lock();
memcg = mem_cgroup_from_kmem(ptr);
-   if (!memcg)
+   if (!memcg) {
+   rcu_read_unlock();
goto out;
+   }
 
l = list_lru_from_memcg_idx(nlru, memcg_cache_id(memcg));
+   rcu_read_unlock();
 out:
if (memcg_ptr)
*memcg_ptr = memcg;
@@ -131,12 +135,14 @@ bool list_lru_add(struct list_lru *lru, struct list_head *item)
 
spin_lock(&nlru->lock);
if (list_empty(item)) {
+   rcu_read_lock();
l = list_lru_from_kmem(nlru, item, &memcg);
list_add_tail(item, &l->list);
/* Set shrinker bit if the first element was added */
if (!l->nr_items++)
memcg_set_shrinker_bit(memcg, nid,
   lru_shrinker_id(lru));
+   rcu_read_unlock();
nlru->nr_items++;
spin_unlock(&nlru->lock);
return true;
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index c097b1fc74ec..0f64a2c06803 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -3209,15 +3209,15 @@ static void memcg_offline_kmem(struct mem_cgroup *memcg)
 */
memcg->kmem_state = KMEM_ALLOCATED;
 
-   memcg_deactivate_kmem_caches(memcg);
-
-   kmemcg_id = memcg->kmemcg_id;
-   BUG_ON(kmemcg_id < 0);
-
parent = parent_mem_cgroup(memcg);
if (!parent)
parent = root_mem_cgroup;
 
+   memcg_deactivate_kmem_caches(memcg, parent);
+
+   kmemcg_id = memcg->kmemcg_id;
+   BUG_ON(kmemcg_id < 0);
+
/*
 * Change kmemcg_id of this cgroup and all its descendants to the
 * parent's id, and then move all entries from this cgroup's list_lrus
@@ -3250,7 +3250,6 @@ static void memcg_free_kmem(struct mem_cgroup *memcg)
if (memcg->kmem_state == KMEM_ALLOCATED) {
WARN_ON(!list_empty(&memcg->kmem_caches));
static_branch_dec(&memcg_kmem_enabled_key);
-   WARN_ON(page_counter_read(&memcg->kmem));
}
 }
 #else

[PATCH v6 04/10] mm: generalize postponed non-root kmem_cache deactivation

2019-06-04 Thread Roman Gushchin
Currently SLUB uses a work scheduled after an RCU grace period
to deactivate a non-root kmem_cache. This mechanism can be reused
for kmem_caches release, but requires generalization for SLAB
case.

Introduce kmemcg_cache_deactivate() function, which calls
allocator-specific __kmemcg_cache_deactivate() and schedules
execution of __kmemcg_cache_deactivate_after_rcu() with all
necessary locks in a worker context after an rcu grace period.

Here is the new calling scheme:
  kmemcg_cache_deactivate()
    __kmemcg_cache_deactivate()                  SLAB/SLUB-specific
    kmemcg_rcufn()                               rcu
      kmemcg_workfn()                            work
        __kmemcg_cache_deactivate_after_rcu()    SLAB/SLUB-specific

instead of:
  __kmemcg_cache_deactivate()                    SLAB/SLUB-specific
    slab_deactivate_memcg_cache_rcu_sched()      SLUB-only
      kmemcg_rcufn()                             rcu
        kmemcg_workfn()                          work
          kmemcg_cache_deact_after_rcu()         SLUB-only

For consistency, all allocator-specific functions start with "__".
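
To make the ordering of the new scheme concrete, here is a small userspace
model of it. It is a sketch only: every toy_* name is hypothetical, the RCU
grace period and the workqueue are replaced by two flags that the caller
flips by hand, and no locks are taken.

```c
#include <assert.h>
#include <stddef.h>

enum { STEP_DEACTIVATE = 1, STEP_RCU, STEP_WORK, STEP_AFTER_RCU };

/* Hypothetical stand-in for a non-root kmem_cache. */
struct toy_cache {
	void (*work_fn)(struct toy_cache *);  /* run later, from "worker" context */
	int rcu_pending;                      /* models a pending call_rcu() */
	int work_pending;                     /* models a queued work item */
	int trace[4];                         /* order in which steps ran */
	int ntrace;
};

void toy_record(struct toy_cache *s, int step)
{
	assert(s->ntrace < 4);
	s->trace[s->ntrace++] = step;
}

/* Allocator-specific hooks (the "__" functions in the scheme). */
void toy_alloc_deactivate(struct toy_cache *s)
{
	toy_record(s, STEP_DEACTIVATE);
}

void toy_alloc_deactivate_after_rcu(struct toy_cache *s)
{
	toy_record(s, STEP_AFTER_RCU);
}

/* Generic entry point: immediate part now, the rest after a grace period. */
void toy_cache_deactivate(struct toy_cache *s)
{
	toy_alloc_deactivate(s);
	s->work_fn = toy_alloc_deactivate_after_rcu;
	s->rcu_pending = 1;
}

/* RCU callback: only hands the job over to the worker. */
void toy_grace_period_elapsed(struct toy_cache *s)
{
	if (!s->rcu_pending)
		return;
	s->rcu_pending = 0;
	toy_record(s, STEP_RCU);
	s->work_pending = 1;
}

/* Worker: the context in which the deferred hook finally runs. */
void toy_run_worker(struct toy_cache *s)
{
	if (!s->work_pending)
		return;
	s->work_pending = 0;
	toy_record(s, STEP_WORK);
	s->work_fn(s);
}
```

Running the worker before the grace period has elapsed is a no-op in this
model, which mirrors the guarantee that the deferred hook never runs early.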

Signed-off-by: Roman Gushchin 
---
 mm/slab.c|  4 
 mm/slab.h|  3 +--
 mm/slab_common.c | 27 ---
 mm/slub.c|  8 +---
 4 files changed, 14 insertions(+), 28 deletions(-)

diff --git a/mm/slab.c b/mm/slab.c
index a4091f8b3655..4b865393ebb4 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -2252,6 +2252,10 @@ void __kmemcg_cache_deactivate(struct kmem_cache *cachep)
 {
__kmem_cache_shrink(cachep);
 }
+
+void __kmemcg_cache_deactivate_after_rcu(struct kmem_cache *s)
+{
+}
 #endif
 
 int __kmem_cache_shutdown(struct kmem_cache *cachep)
diff --git a/mm/slab.h b/mm/slab.h
index 8ff90f42548a..d35d85794247 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -172,6 +172,7 @@ int __kmem_cache_shutdown(struct kmem_cache *);
 void __kmem_cache_release(struct kmem_cache *);
 int __kmem_cache_shrink(struct kmem_cache *);
 void __kmemcg_cache_deactivate(struct kmem_cache *s);
+void __kmemcg_cache_deactivate_after_rcu(struct kmem_cache *s);
 void slab_kmem_cache_release(struct kmem_cache *);
 
 struct seq_file;
@@ -291,8 +292,6 @@ static __always_inline void memcg_uncharge_slab(struct page *page, int order,
 
 extern void slab_init_memcg_params(struct kmem_cache *);
 extern void memcg_link_cache(struct kmem_cache *s, struct mem_cgroup *memcg);
-extern void slab_deactivate_memcg_cache_rcu_sched(struct kmem_cache *s,
-   void (*work_fn)(struct kmem_cache *));
 
 #else /* CONFIG_MEMCG_KMEM */
 
diff --git a/mm/slab_common.c b/mm/slab_common.c
index d019ee66bdc4..09b26673b63f 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -709,7 +709,7 @@ static void kmemcg_workfn(struct work_struct *work)
put_online_mems();
put_online_cpus();
 
-   /* done, put the ref from slab_deactivate_memcg_cache_rcu_sched() */
+   /* done, put the ref from kmemcg_cache_deactivate() */
css_put(&s->memcg_params.memcg->css);
 }
 
@@ -727,31 +727,21 @@ static void kmemcg_rcufn(struct rcu_head *head)
queue_work(memcg_kmem_cache_wq, &s->memcg_params.work);
 }
 
-/**
- * slab_deactivate_memcg_cache_rcu_sched - schedule deactivation after a
- *sched RCU grace period
- * @s: target kmem_cache
- * @work_fn: deactivation function to call
- *
- * Schedule @work_fn to be invoked with online cpus, mems and slab_mutex
- * held after a sched RCU grace period.  The slab is guaranteed to stay
- * alive until @work_fn is finished.  This is to be used from
- * __kmemcg_cache_deactivate().
- */
-void slab_deactivate_memcg_cache_rcu_sched(struct kmem_cache *s,
-  void (*work_fn)(struct kmem_cache *))
+static void kmemcg_cache_deactivate(struct kmem_cache *s)
 {
if (WARN_ON_ONCE(is_root_cache(s)) ||
WARN_ON_ONCE(s->memcg_params.work_fn))
return;
 
+   __kmemcg_cache_deactivate(s);
+
if (s->memcg_params.root_cache->memcg_params.dying)
return;
 
/* pin memcg so that @s doesn't get destroyed in the middle */
css_get(&s->memcg_params.memcg->css);
 
-   s->memcg_params.work_fn = work_fn;
+   s->memcg_params.work_fn = __kmemcg_cache_deactivate_after_rcu;
call_rcu(&s->memcg_params.rcu_head, kmemcg_rcufn);
 }
 
@@ -774,7 +764,7 @@ void memcg_deactivate_kmem_caches(struct mem_cgroup *memcg)
if (!c)
continue;
 
-   __kmemcg_cache_deactivate(c);
+   kmemcg_cache_deactivate(c);
arr->entries[idx] = NULL;
}
mutex_unlock(&slab_mutex);
@@ -867,11 +857,10 @@ static void flush_memcg_workqueue(struct kmem_cache *s)
mutex_unlock(&slab_mutex);
 
/*
-* SLUB deactivates the kmem_caches through call_rcu. Make
+* SLAB and SLUB deactivate the kmem_caches through call_rcu. Make
 * sure all registered 

[PATCH v6 00/10] mm: reparent slab memory on cgroup removal

2019-06-04 Thread Roman Gushchin
# Why do we need this?

We've noticed that the number of dying cgroups is steadily growing on most
of our hosts in production. The subsequent investigation revealed an issue
in the userspace memory reclaim code [1], in the accounting of kernel stacks [2],
and also the main reason: slab objects.

The underlying problem is quite simple: any page charged
to a cgroup holds a reference to it, so the cgroup can't be reclaimed unless
all charged pages are gone. If a slab object is actively used by other cgroups,
it won't be reclaimed, and will prevent the origin cgroup from being reclaimed.

Slab objects, and first of all the vfs cache, are shared between cgroups which
use the same underlying fs and, what's even more important, are shared
between multiple generations of the same workload. So if something runs
periodically, each time in a new cgroup (like how systemd works), we
accumulate multiple dying cgroups.

Strictly speaking pagecache isn't different here, but there is a key difference:
we disable protection and apply some extra pressure on LRUs of dying cgroups,
and these LRUs contain all charged pages.
My experiments show that with the disabled kernel memory accounting the number
of dying cgroups stabilizes at a relatively small number (~100, depends on
memory pressure and cgroup creation rate), and with kernel memory accounting
it grows pretty steadily up to several thousands.

Memory cgroups are quite complex and big objects (mostly due to percpu stats),
so it leads to noticeable memory losses. Memory occupied by dying cgroups
is measured in hundreds of megabytes. I've even seen a host with more than 100Gb
of memory wasted for dying cgroups. It leads to a degradation of performance
with the uptime, and generally limits the usage of cgroups.

My previous attempt [3] to fix the problem by applying extra pressure on slab
shrinker lists caused regressions with xfs and ext4, and has been reverted [4].
The following attempts to find the right balance [5, 6] were not successful.

So instead of trying to find a possibly non-existent balance, let's reparent
the accounted slabs to the parent cgroup on cgroup removal.


# Implementation approach

There is however a significant problem with reparenting of slab memory:
there is no list of charged pages. Some of them are in shrinker lists,
but not all. Introducing a new list is really not an option.

But fortunately there is a way forward: every slab page has a stable pointer
to the corresponding kmem_cache. So the idea is to reparent kmem_caches
instead of slab pages.

It's actually simpler and cheaper, but requires some underlying changes:
1) Make kmem_caches hold a single reference to the memory cgroup,
   instead of a separate reference per every slab page.
2) Stop setting page->mem_cgroup pointer for memcg slab pages and use
   page->kmem_cache->memcg indirection instead. It's used only on
   slab page release, so it shouldn't be a big issue.
3) Introduce a refcounter for non-root slab caches. It's required to
   be able to destroy kmem_caches when they become empty and release
   the associated memory cgroup.
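
A rough userspace model of changes 1) and 3) may help: the cache holds exactly
one cgroup reference, pages pin the cache instead of the cgroup, and the
cgroup reference is released when the last cache reference goes away. All
toy_* names are hypothetical and plain counters replace percpu_ref and the css
refcounter; this is an illustration, not the code from the series.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model; the toy_* names are not kernel APIs. */
struct toy_memcg {
	long refs;                  /* references pinning the cgroup */
};

struct toy_cache {
	struct toy_memcg *memcg;
	long refcnt;                /* 1 base reference + 1 per charged page */
};

/* The cache takes its single cgroup reference once, at creation. */
void toy_cache_create(struct toy_cache *s, struct toy_memcg *memcg)
{
	s->memcg = memcg;
	s->refcnt = 1;
	memcg->refs++;
}

/* Dropping the last cache reference releases the cgroup reference. */
void toy_cache_put(struct toy_cache *s, long nr)
{
	assert(nr > 0 && s->refcnt >= nr);
	s->refcnt -= nr;
	if (s->refcnt == 0) {
		s->memcg->refs--;
		s->memcg = NULL;
	}
}

/* Charging pages pins the cache, not the cgroup. */
void toy_charge_pages(struct toy_cache *s, long nr_pages)
{
	assert(nr_pages > 0 && s->refcnt > 0);
	s->refcnt += nr_pages;
}

void toy_uncharge_pages(struct toy_cache *s, long nr_pages)
{
	toy_cache_put(s, nr_pages);
}

/* Deactivation drops the base reference; pages may still pin the cache. */
void toy_cache_kill(struct toy_cache *s)
{
	toy_cache_put(s, 1);
}
```

However many pages are charged, the cgroup's count changes only twice: once
at cache creation and once when the cache becomes both inactive and empty.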

There is a bonus: currently we release empty kmem_caches on cgroup
removal, but all others wait for the release of the memory cgroup.
These refactorings allow kmem_caches to be released as soon as they
become inactive and free.

Some additional implementation details are provided in corresponding
commit messages.


# Results

Below is the average number of dying cgroups on two groups of our production
hosts. They run some sort of web frontend workload, and the memory pressure
is moderate. As we can see, with the kernel memory reparenting the number
stabilizes in the 60s range; however with the original version it grows almost
linearly and doesn't show any signs of plateauing. The difference in slab
and percpu usage between patched and unpatched versions also grows linearly.
In 7 days it exceeded 200Mb.

day             0    1    2    3    4    5    6    7
original       56  362  628  752 1070 1250 1490 1560
patched        23   46   51   55   60   57   67   69
mem diff(Mb)   22   74  123  152  164  182  214  241


# History

v6:
  1) split biggest patches into parts to make the review easier
  2) changed synchronization around the dying flag
  3) sysfs entry removal on deactivation is back
  4) got rid of redundant rcu wait on kmem_cache release
  5) fixed getting memcg pointer in mem_cgroup_from_kmem()
  6) fixed missed smp_rmb()
  7) removed redundant CONFIG_SLOB
  8) some renames and cosmetic fixes

v5:
  1) fixed a compilation warning around missing kmemcg_queue_cache_shutdown()
  2) s/rcu_read_lock()/rcu_read_unlock() in memcg_kmem_get_cache()

v4:
  1) removed excessive memcg != parent check in memcg_deactivate_kmem_caches()
  2) fixed rcu_read_lock() usage in memcg_charge_slab()
  3) fixed synchronization around dying flag in kmemcg_queue_cache_shutdown()
  4) refreshed test results data
  5) reworked PageTail() checks in 

[PATCH v6 01/10] mm: add missing smp read barrier on getting memcg kmem_cache pointer

2019-06-04 Thread Roman Gushchin
Johannes noticed that reading the memcg kmem_cache pointer in
cache_from_memcg_idx() is performed using the READ_ONCE() macro,
which doesn't implement an SMP barrier, which is required
by the logic.

Add a proper smp_rmb() to be paired with smp_wmb() in
memcg_create_kmem_cache().

The same applies to memcg_create_kmem_cache() itself,
which reads the same value without barriers and READ_ONCE().
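
For readers less familiar with barrier pairing, the general publish/lookup
pattern can be sketched in portable C11. This is an analogue, not the kernel
code: release/acquire fences from <stdatomic.h> stand in for smp_wmb() and
smp_rmb(), relaxed atomic accesses stand in for WRITE_ONCE()/READ_ONCE(), the
toy_* names are hypothetical, and the fence placement follows the C11 rules
rather than the exact placement in the diff below.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define TOY_NR_ENTRIES 4

/* Hypothetical stand-in for a per-memcg kmem_cache. */
struct toy_cache {
	int object_size;
};

static _Atomic(struct toy_cache *) toy_entries[TOY_NR_ENTRIES];

/* Writer: fully initialize the object, then publish the pointer. */
void toy_publish(int idx, struct toy_cache *c, int object_size)
{
	assert(idx >= 0 && idx < TOY_NR_ENTRIES);
	c->object_size = object_size;
	/* order the initialization before the pointer store (write side) */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&toy_entries[idx], c, memory_order_relaxed);
}

/* Reader: load the pointer, then order later reads of the object after it. */
struct toy_cache *toy_lookup(int idx)
{
	struct toy_cache *c;

	assert(idx >= 0 && idx < TOY_NR_ENTRIES);
	c = atomic_load_explicit(&toy_entries[idx], memory_order_relaxed);
	/* pairs with the release fence in toy_publish() (read side) */
	atomic_thread_fence(memory_order_acquire);
	return c;
}
```

A reader that gets a non-NULL pointer from toy_lookup() is then guaranteed to
see the initialized object_size, which is the property the paired barriers
are meant to provide.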

Suggested-by: Johannes Weiner 
Signed-off-by: Roman Gushchin 
---
 mm/slab.h| 1 +
 mm/slab_common.c | 3 ++-
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/mm/slab.h b/mm/slab.h
index 739099af6cbb..1176b61bb8fc 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -260,6 +260,7 @@ cache_from_memcg_idx(struct kmem_cache *s, int idx)
 * memcg_caches issues a write barrier to match this (see
 * memcg_create_kmem_cache()).
 */
+   smp_rmb();
cachep = READ_ONCE(arr->entries[idx]);
rcu_read_unlock();
 
diff --git a/mm/slab_common.c b/mm/slab_common.c
index 58251ba63e4a..8092bdfc05d5 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -652,7 +652,8 @@ void memcg_create_kmem_cache(struct mem_cgroup *memcg,
 * allocation (see memcg_kmem_get_cache()), several threads can try to
 * create the same cache, but only one of them may succeed.
 */
-   if (arr->entries[idx])
+   smp_rmb();
+   if (READ_ONCE(arr->entries[idx]))
goto out_unlock;
 
cgroup_name(css->cgroup, memcg_name_buf, sizeof(memcg_name_buf));
-- 
2.20.1



[PATCH v6 09/10] mm: stop setting page->mem_cgroup pointer for slab pages

2019-06-04 Thread Roman Gushchin
Every slab page charged to a non-root memory cgroup has a pointer
to the memory cgroup and holds a reference to it, which protects
a non-empty memory cgroup from being released. At the same time
the page has a pointer to the corresponding kmem_cache, and also
holds a reference to the kmem_cache. And the kmem_cache itself
holds a reference to the cgroup.

So there is clearly some redundancy, which allows us to stop setting
the page->mem_cgroup pointer and rely on getting the memcg pointer
indirectly via the kmem_cache. Further, it will make it easier to change
this pointer, without a need to go over all charged pages.

So let's stop setting page->mem_cgroup pointer for slab pages,
and stop using the css refcounter directly for protecting
the memory cgroup from going away. Instead rely on kmem_cache
as an intermediate object.

Make sure that vmstats and shrinker lists are working as previously,
as well as /proc/kpagecgroup interface.

Signed-off-by: Roman Gushchin 
---
 mm/list_lru.c   |  3 +-
 mm/memcontrol.c | 12 
 mm/slab.h   | 74 -
 3 files changed, 70 insertions(+), 19 deletions(-)

diff --git a/mm/list_lru.c b/mm/list_lru.c
index 927d85be32f6..0f1f6b06b7f3 100644
--- a/mm/list_lru.c
+++ b/mm/list_lru.c
@@ -12,6 +12,7 @@
 #include 
 #include 
 #include 
+#include "slab.h"
 
 #ifdef CONFIG_MEMCG_KMEM
 static LIST_HEAD(list_lrus);
@@ -63,7 +64,7 @@ static __always_inline struct mem_cgroup *mem_cgroup_from_kmem(void *ptr)
if (!memcg_kmem_enabled())
return NULL;
page = virt_to_head_page(ptr);
-   return page->mem_cgroup;
+   return memcg_from_slab_page(page);
 }
 
 static inline struct list_lru_one *
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 49084e2d81ff..c097b1fc74ec 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -485,7 +485,10 @@ ino_t page_cgroup_ino(struct page *page)
unsigned long ino = 0;
 
rcu_read_lock();
-   memcg = READ_ONCE(page->mem_cgroup);
+   if (PageHead(page) && PageSlab(page))
+   memcg = memcg_from_slab_page(page);
+   else
+   memcg = READ_ONCE(page->mem_cgroup);
while (memcg && !(memcg->css.flags & CSS_ONLINE))
memcg = parent_mem_cgroup(memcg);
if (memcg)
@@ -2727,9 +2730,6 @@ int __memcg_kmem_charge_memcg(struct page *page, gfp_t gfp, int order,
cancel_charge(memcg, nr_pages);
return -ENOMEM;
}
-
-   page->mem_cgroup = memcg;
-
return 0;
 }
 
@@ -2752,8 +2752,10 @@ int __memcg_kmem_charge(struct page *page, gfp_t gfp, int order)
memcg = get_mem_cgroup_from_current();
if (!mem_cgroup_is_root(memcg)) {
ret = __memcg_kmem_charge_memcg(page, gfp, order, memcg);
-   if (!ret)
+   if (!ret) {
+   page->mem_cgroup = memcg;
__SetPageKmemcg(page);
+   }
}
css_put(&memcg->css);
return ret;
diff --git a/mm/slab.h b/mm/slab.h
index 5d2b8511e6fb..7ead47cb9338 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -255,30 +255,67 @@ static inline struct kmem_cache *memcg_root_cache(struct kmem_cache *s)
return s->memcg_params.root_cache;
 }
 
+/*
+ * Expects a pointer to a slab page. Please note, that PageSlab() check
+ * isn't sufficient, as it returns true also for tail compound slab pages,
+ * which do not have slab_cache pointer set.
+ * So this function assumes that the page can pass PageHead() and PageSlab()
+ * checks.
+ */
+static inline struct mem_cgroup *memcg_from_slab_page(struct page *page)
+{
+   struct kmem_cache *s;
+
+   s = READ_ONCE(page->slab_cache);
+   if (s && !is_root_cache(s))
+   return s->memcg_params.memcg;
+
+   return NULL;
+}
+
+/*
+ * Charge the slab page belonging to the non-root kmem_cache.
+ * Can be called for non-root kmem_caches only.
+ */
 static __always_inline int memcg_charge_slab(struct page *page,
 gfp_t gfp, int order,
 struct kmem_cache *s)
 {
+   struct mem_cgroup *memcg;
+   struct lruvec *lruvec;
int ret;
 
-   if (is_root_cache(s))
-   return 0;
-
-   ret = memcg_kmem_charge_memcg(page, gfp, order, s->memcg_params.memcg);
+   memcg = s->memcg_params.memcg;
+   ret = memcg_kmem_charge_memcg(page, gfp, order, memcg);
if (ret)
return ret;
 
+   lruvec = mem_cgroup_lruvec(page_pgdat(page), memcg);
+   mod_lruvec_state(lruvec, cache_vmstat_idx(s), 1 << order);
+
+   /* transer try_charge() page references to kmem_cache */
percpu_ref_get_many(&s->memcg_params.refcnt, 1 << order);
+   css_put_many(&memcg->css, 1 << order);
 
return 0;
 }
 
+/*
+ * Uncharge a slab page belonging to a non-root kmem_cache.
+ * Can be called for non-root kmem_caches only.
+ */
 static __always_inline void 

Re: [PATCH 3/3] rpmsg: virtio_rpmsg_bus: get buffer size from config space

2019-06-04 Thread xiang xiao
On Tue, Jun 4, 2019 at 10:25 PM Arnaud Pouliquen
 wrote:
>
> Hello Xiang,
>
> On 5/9/19 3:00 PM, xiang xiao wrote:
> > On Thu, May 9, 2019 at 8:36 PM Arnaud Pouliquen  
> > wrote:
> >>
> >> Hello Xiang,
> >>
> >> Similar mechanism has been proposed by Loic 2 years ago (link to the
> >> series here https://lkml.org/lkml/2017/3/28/349).
> >>
> >> Did you see them? Regarding history, patches seem just on hold...
> >>
> >
> > > Just saw this patchset, so it's a common problem hit by many vendors;
> > > the rpmsg framework needs to address it. :)
> >
> >> Main differences (except interesting RX/TX size split) seems that you
> >> - don't use the virtio_config_ops->get
> >
> > > virtio_cread calls virtio_config_ops->get internally, so the idea is the
> > > same for both patches; just the implementation detail is different.
> >
> >> - define a new feature VIRTIO_RPMSG_F_NS.
> >
> > > I added this flag to keep compatibility with old remote peers, and
> > > also to follow common virtio driver practice.
> > I discussed with Loic; he is OK to go further with your patch and
> > abandon his. Please find some remarks inline below.
> >
> >>
> >> Regards
> >> Arnaud
> >>
> >>
> >> On 1/31/19 4:41 PM, Xiang Xiao wrote:
> > >>> 512 bytes isn't always suitable for all cases; let the firmware
> > >>> maker decide the best value from the resource table.
> > >>> This is enabled by the VIRTIO_RPMSG_F_BUFSZ feature bit.
> >>>
> >>> Signed-off-by: Xiang Xiao 
> >>> ---
> >>>  drivers/rpmsg/virtio_rpmsg_bus.c  | 50 
> >>> +--
> >>>  include/uapi/linux/virtio_rpmsg.h | 24 +++
> >>>  2 files changed, 56 insertions(+), 18 deletions(-)
> >>>  create mode 100644 include/uapi/linux/virtio_rpmsg.h
> >>>
> >>> diff --git a/drivers/rpmsg/virtio_rpmsg_bus.c 
> >>> b/drivers/rpmsg/virtio_rpmsg_bus.c
> >>> index 59c4554..049dd97 100644
> >>> --- a/drivers/rpmsg/virtio_rpmsg_bus.c
> >>> +++ b/drivers/rpmsg/virtio_rpmsg_bus.c
> >>> @@ -16,6 +16,7 @@
> >>>  #include 
> >>>  #include 
> >>>  #include 
> >>> +#include 
> >>>  #include 
> >>>  #include 
> >>>  #include 
> >>> @@ -38,7 +39,8 @@
> >>>   * @sbufs:   kernel address of tx buffers
> >>>   * @num_rbufs:   total number of buffers for rx
> >>>   * @num_sbufs:   total number of buffers for tx
> >>> - * @buf_size:size of one rx or tx buffer
> >>> + * @rbuf_size:   size of one rx buffer
> >>> + * @sbuf_size:   size of one tx buffer
> >>>   * @last_sbuf:   index of last tx buffer used
> >>>   * @rbufs_dma:   dma base addr of rx buffers
> >>>   * @sbufs_dma:   dma base addr of tx buffers
> >>> @@ -61,7 +63,8 @@ struct virtproc_info {
> >>>   void *rbufs, *sbufs;
> >>>   unsigned int num_rbufs;
> >>>   unsigned int num_sbufs;
> >>> - unsigned int buf_size;
> >>> + unsigned int rbuf_size;
> >>> + unsigned int sbuf_size;
> >>>   int last_sbuf;
> >>>   dma_addr_t rbufs_dma;
> >>>   dma_addr_t sbufs_dma;
> >>> @@ -73,9 +76,6 @@ struct virtproc_info {
> >>>   struct rpmsg_endpoint *ns_ept;
> >>>  };
> >>>
> >>> -/* The feature bitmap for virtio rpmsg */
> > >>> -#define VIRTIO_RPMSG_F_NS 0 /* RP supports name service notifications */
> >>> -
> >>>  /**
> >>>   * struct rpmsg_hdr - common header for all rpmsg messages
> >>>   * @src: source address
> >>> @@ -452,7 +452,7 @@ static void *get_a_tx_buf(struct virtproc_info *vrp)
> >>>
> >>>   /* either pick the next unused tx buffer */
> >>>   if (vrp->last_sbuf < vrp->num_sbufs)
> >>> - ret = vrp->sbufs + vrp->buf_size * vrp->last_sbuf++;
> >>> + ret = vrp->sbufs + vrp->sbuf_size * vrp->last_sbuf++;
> >>>   /* or recycle a used one */
> >>>   else
> > >>>   ret = virtqueue_get_buf(vrp->svq, &len);
> >>> @@ -578,7 +578,7 @@ static int rpmsg_send_offchannel_raw(struct 
> >>> rpmsg_device *rpdev,
> >>>* messaging), or to improve the buffer allocator, to support
> >>>* variable-length buffer sizes.
> >>>*/
> >>> - if (len > vrp->buf_size - sizeof(struct rpmsg_hdr)) {
> >>> + if (len > vrp->sbuf_size - sizeof(struct rpmsg_hdr)) {
> >>>   dev_err(dev, "message is too big (%d)\n", len);
> >>>   return -EMSGSIZE;
> >>>   }
> >>> @@ -718,7 +718,7 @@ static int rpmsg_recv_single(struct virtproc_info 
> >>> *vrp, struct device *dev,
> >>>* We currently use fixed-sized buffers, so trivially sanitize
> >>>* the reported payload length.
> >>>*/
> >>> - if (len > vrp->buf_size ||
> >>> + if (len > vrp->rbuf_size ||
> >>>   msg->len > (len - sizeof(struct rpmsg_hdr))) {
> >>>   dev_warn(dev, "inbound msg too big: (%d, %d)\n", len, 
> >>> msg->len);
> >>>   return -EINVAL;
> >>> @@ -751,7 +751,7 @@ static int rpmsg_recv_single(struct virtproc_info 
> >>> *vrp, struct device *dev,
> >>>   dev_warn(dev, "msg received with no recipient\n");
> >>>
> >>>   /* publish the real size of the buffer */
> >>> - 

Re: [PATCH v3] scsi: ibmvscsi: Don't use rc uninitialized in ibmvscsi_do_work

2019-06-04 Thread Martin K. Petersen


Nathan,

> clang warns:
>
> drivers/scsi/ibmvscsi/ibmvscsi.c:2126:7: warning: variable 'rc' is used
> uninitialized whenever switch case is taken [-Wsometimes-uninitialized]
> case IBMVSCSI_HOST_ACTION_NONE:
>  ^

Applied to 5.3/scsi-queue, thanks!

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH V2 2/2] zswap: Add module parameter malloc_movable_if_support

2019-06-04 Thread Hui Zhu
On Wed, Jun 5, 2019 at 1:12 AM, Shakeel Butt wrote:
>
> On Sun, Jun 2, 2019 at 2:47 AM Hui Zhu  wrote:
> >
> > This is the second version that was updated according to the comments
> > from Sergey Senozhatsky in https://lkml.org/lkml/2019/5/29/73
> >
> > zswap compresses swap pages into a dynamically allocated RAM-based
> > > memory pool.  The memory pool can be zbud, z3fold or zsmalloc.
> > > All of them allocate unmovable pages, which increases the number of
> > > unmovable page blocks and is bad for anti-fragmentation.
> >
> > > zsmalloc supports page migration if a movable page is requested:
> > handle = zs_malloc(zram->mem_pool, comp_len,
> > GFP_NOIO | __GFP_HIGHMEM |
> > __GFP_MOVABLE);
> >
> > > And commit "zpool: Add malloc_support_movable to zpool_driver" adds
> > > zpool_malloc_support_movable, which checks malloc_support_movable to
> > > determine whether a zpool supports allocating movable memory.
> > >
> > > This commit adds the module parameter malloc_movable_if_support to enable
> > > or disable zpool block allocation with gfp flags __GFP_HIGHMEM | __GFP_MOVABLE
> > > when the zpool supports allocating movable memory (disabled by default).
> >
> > > The following is a test log from a PC that has 8G memory and 2G swap.
> > >
> > > When it is disabled:
> >  echo lz4 > /sys/module/zswap/parameters/compressor
> >  echo zsmalloc > /sys/module/zswap/parameters/zpool
> >  echo 1 > /sys/module/zswap/parameters/enabled
> >  swapon /swapfile
> >  cd /home/teawater/kernel/vm-scalability/
> > > /home/teawater/kernel/vm-scalability# export unit_size=$((9 * 1024 * 1024 * 1024))
> > /home/teawater/kernel/vm-scalability# ./case-anon-w-seq
> > 2717908992 bytes / 3977932 usecs = 667233 KB/s
> > 2717908992 bytes / 4160702 usecs = 637923 KB/s
> > 2717908992 bytes / 4354611 usecs = 609516 KB/s
> > 293359 usecs to free memory
> > 340304 usecs to free memory
> > 205781 usecs to free memory
> > 2717908992 bytes / 5588016 usecs = 474982 KB/s
> > 166124 usecs to free memory
> > /home/teawater/kernel/vm-scalability# cat /proc/pagetypeinfo
> > Page block order: 9
> > Pages per block:  512
> >
> > > Free pages count per migrate type at order       0      1      2      3      4      5      6      7      8      9     10
> > > Node 0, zone    DMA, type   Unmovable            1      1      1      0      2      1      1      0      1      0      0
> > > Node 0, zone    DMA, type     Movable            0      0      0      0      0      0      0      0      0      1      3
> > > Node 0, zone    DMA, type Reclaimable            0      0      0      0      0      0      0      0      0      0      0
> > > Node 0, zone    DMA, type  HighAtomic            0      0      0      0      0      0      0      0      0      0      0
> > > Node 0, zone    DMA, type         CMA            0      0      0      0      0      0      0      0      0      0      0
> > > Node 0, zone    DMA, type     Isolate            0      0      0      0      0      0      0      0      0      0      0
> > > Node 0, zone  DMA32, type   Unmovable            5     10      9      8      8      5      1      2      3      0      0
> > > Node 0, zone  DMA32, type     Movable           15     16     14     12     14     10      9      6      6      5    776
> > > Node 0, zone  DMA32, type Reclaimable            0      0      0      0      0      0      0      0      0      0      0
> > > Node 0, zone  DMA32, type  HighAtomic            0      0      0      0      0      0      0      0      0      0      0
> > > Node 0, zone  DMA32, type         CMA            0      0      0      0      0      0      0      0      0      0      0
> > > Node 0, zone  DMA32, type     Isolate            0      0      0      0      0      0      0      0      0      0      0
> > > Node 0, zone Normal, type   Unmovable         7097   6914   6473   5642   4373   2664   1220    319     78      4      0
> > > Node 0, zone Normal, type     Movable         2092   3216   2820   2266   1585    946    559    359    237    258    378
> > > Node 0, zone Normal, type Reclaimable           47     88    122     80     34      9      5      4      2      1      2
> > > Node 0, zone Normal, type  HighAtomic            0      0      0      0      0      0      0      0      0      0      0
> > > Node 0, zone Normal, type         CMA            0      0      0      0      0      0      0      0      0      0      0
> > > Node 0, zone Normal, type     Isolate            0      0      0      0      0      0      0      0      0      0      0
> > >
> > > Number of blocks type   Unmovable     Movable Reclaimable  HighAtomic         CMA     Isolate
> > > Node 0, zone    DMA             1           7           0           0           0           0
> > > Node 0, zone  DMA32             4        1652           0           0           0           0
> > > Node 0, zone Normal           834        1572          25           0           0           0
> > When it 

Re: [PATCH v5 1/8] KVM: VMX: Define CET VMCS fields and control bits

2019-06-04 Thread Yang Weijiang
On Tue, Jun 04, 2019 at 07:46:13AM -0700, Sean Christopherson wrote:
> On Wed, May 22, 2019 at 03:00:54PM +0800, Yang Weijiang wrote:
> > CET(Control-flow Enforcement Technology) is an upcoming Intel® processor
> > family feature that blocks return/jump-oriented programming (ROP) attacks.
> > It provides the following capabilities to defend
> > against ROP/JOP style control-flow subversion attacks:
> > 
> > - Shadow Stack (SHSTK):
> >   A second stack for the program that is used exclusively for
> >   control transfer operations.
> > 
> > - Indirect Branch Tracking (IBT):
> >   Free branch protection to defend against jump/call oriented
> >   programming.
> 
> What is "free" referring to here?  The software enabling certainly isn't
> free, and I doubt the hardware/ucode cost is completely free.
>
Thank you for pointing it out!
"free" comes from the spec., I guess the author means the major effort of
enabling IBT is in the compiler and HW, with no extra effort for SW enabling.
But as you mentioned, there is actually dedicated effort needed to enable it,
so I will change the wording.

> > Several new CET MSRs are defined in kernel to support CET:
> > MSR_IA32_{U,S}_CET - MSRs to control the CET settings for user
> > mode and supervisor mode respectively.
> > 
> > MSR_IA32_PL{0,1,2,3}_SSP - MSRs to store shadow stack pointers for
> > CPL-0,1,2,3 levels.
> > 
> > MSR_IA32_INT_SSP_TAB - MSR to store base address of shadow stack
> > pointer table.
> 
> For consistency (within the changelog), these should be list style, e.g.:
> 
> 
>   - MSR_IA32_{U,S}_CET: Control CET settings for user mode and supervisor
> mode respectively.
> 
>   - MSR_IA32_PL{0,1,2,3}_SSP: Store shadow stack pointers for CPL levels.
> 
>   - MSR_IA32_INT_SSP_TAB: Stores base address of shadow stack pointer
>   table.
> 
OK, will change it in next version.
> > Two XSAVES state components are introduced for CET:
> > IA32_XSS:[bit 11] - bit for save/restore user mode CET states
> > IA32_XSS:[bit 12] - bit for save/restore supervisor mode CET states.
> 
> Likewise, use a consistent list format.
> 
> > 6 VMCS fields are introduced for CET, {HOST,GUEST}_S_CET is to store
> > CET settings in supervisor mode. {HOST,GUEST}_SSP is to store shadow
> > stack pointers in supervisor mode. {HOST,GUEST}_INTR_SSP_TABLE is to
> > store base address of shadow stack pointer table.
> 
> It'd probably be easier to use a list format for the fields, e.g.:
> 
> 6 VMCS fields are introduced for CET:
> 
>   - {HOST,GUEST}_S_CET: stores CET settings for supervisor mode.
> 
>   - {HOST,GUEST}_SSP: stores shadow stack pointers for supervisor mode.
> 
>   - {HOST,GUEST}_INTR_SSP_TABLE: stores the base address of the shadow
>  stack pointer table.
> 
OK, will modify it.
> > If VM_EXIT_LOAD_HOST_CET_STATE = 1, the host's CET MSRs are restored
> > from below VMCS fields at VM-Exit:
> > - HOST_S_CET
> > - HOST_SSP
> > - HOST_INTR_SSP_TABLE
> 
> Personal preference, I like indenting lists like this with a space or two
> so that the list is clearly delineated.
Good suggestion, thanks!
> 
> > If VM_ENTRY_LOAD_GUEST_CET_STATE = 1, the guest's CET MSRs are loaded
> > from below VMCS fields at VM-Entry:
> > - GUEST_S_CET
> > - GUEST_SSP
> > - GUEST_INTR_SSP_TABLE
> > 
> > Apart from VMCS auto-load fields, KVM calls kvm_load_guest_fpu() and
> > kvm_put_guest_fpu() to save/restore the guest CET MSR states at
> > VM exit/entry. XSAVES/XRSTORS are executed underneath these functions
> > if they are supported. The CET xsave area is consolidated with other
> > XSAVE components in thread_struct.fpu field.
> > 
> > When context switch happens during task switch/interrupt/exception etc.,
> > Kernel also relies on above functions to switch CET states properly.
> 
> These paragraphs about the FPU and KVM behavior don't belong in this
> patch.

OK, it looks redundant, will remove it.
>  
> > Signed-off-by: Yang Weijiang 
> > Co-developed-by: Zhang Yi Z 
> 
> Co-developed-by needs to be accompanied by a SOB.  And your SOB should
> be last since you sent the patch.  This comment applies to all patches.
> 
> See "12) When to use Acked-by:, Cc:, and Co-developed-by:" in
> Documentation/process/submitting-patches.rst for details (I recommend
> looking at a v5.2-rc* version, a docs update was merged for v5.2).
Got it, will change all the signatures.


Re: [PATCH -next] scsi: lpfc: Make some symbols static

2019-06-04 Thread Martin K. Petersen


YueHaibing,

> Fix sparse warnings:
>
> drivers/scsi/lpfc/lpfc_sli.c:115:1: warning: symbol 'lpfc_sli4_pcimem_bcopy' 
> was not declared. Should it be static?
> drivers/scsi/lpfc/lpfc_sli.c:7854:1: warning: symbol 
> 'lpfc_sli4_process_missed_mbox_completions' was not declared. Should it be 
> static?
> drivers/scsi/lpfc/lpfc_nvmet.c:223:27: warning: symbol 
> 'lpfc_nvmet_get_ctx_for_xri' was not declared. Should it be static?
> drivers/scsi/lpfc/lpfc_nvmet.c:245:27: warning: symbol 
> 'lpfc_nvmet_get_ctx_for_oxid' was not declared. Should it be static?
> drivers/scsi/lpfc/lpfc_init.c:75:10: warning: symbol 'lpfc_present_cpu' was 
> not declared. Should it be static?

Applied to 5.3/scsi-queue. Thanks.

-- 
Martin K. Petersen  Oracle Linux Engineering
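
The sparse warnings quoted above fire when a function with external linkage is defined without a prior declaration. A minimal user-space sketch of the usual fix follows; copy_words() and sum_after_copy() are hypothetical names, not lpfc functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Public entry point: this prototype would normally live in a header. */
uint32_t sum_after_copy(const uint32_t *src, size_t nwords);

/*
 * Hypothetical helper used only inside this file. Marking it static gives
 * it internal linkage, so sparse no longer expects a declaration and the
 * symbol stays out of the global namespace.
 */
static void copy_words(uint32_t *dst, const uint32_t *src, size_t nwords)
{
	size_t i;

	assert(dst && src);
	for (i = 0; i < nwords; i++)
		dst[i] = src[i];
}

uint32_t sum_after_copy(const uint32_t *src, size_t nwords)
{
	uint32_t tmp[8] = { 0 };
	uint32_t sum = 0;
	size_t i;

	if (nwords > 8)
		nwords = 8;	/* clamp to the local buffer */
	copy_words(tmp, src, nwords);
	for (i = 0; i < nwords; i++)
		sum += tmp[i];
	return sum;
}
```

A symbol that really is used from another file needs a declaration in a shared header instead of the static keyword.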


Re: [PATCH -next] scsi: lpfc: Remove set but not used variables 'qp'

2019-06-04 Thread Martin K. Petersen


YueHaibing,

> Fixes gcc '-Wunused-but-set-variable' warnings:
>
> drivers/scsi/lpfc/lpfc_init.c: In function lpfc_setup_cq_lookup:
> drivers/scsi/lpfc/lpfc_init.c:9359:30: warning: variable qp set but not used 
> [-Wunused-but-set-variable]

Applied to 5.3/scsi-queue, thanks.

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH v3 7/8] dmaengine: fsl-edma: add i.mx7ulp edma2 version support

2019-06-04 Thread Robin Gong
On 2019-06-04 at 12:37 +, Vinod Koul wrote:
> On 29-05-19, 17:08, yibin.g...@nxp.com wrote:
> > 
> > From: Robin Gong 
> > 
> >   Add edma2 for i.mx7ulp by version v3, since v2 has already
> Why leading spaces at start of line?
Sorry for the typo, will fix in next version
> 
> > 
> > been used by mcf-edma.
> > The big changes based on v1 are belows:
> > 1. only one dmamux.
> > 2. another clock dma_clk except dmamux clk.
> > 3. 16 independent interrupts instead of only one interrupt for
> > all channels.
> > 
> > Signed-off-by: Robin Gong 
> > ---
> >  drivers/dma/fsl-edma-common.c | 18 +++-
> >  drivers/dma/fsl-edma-common.h |  3 ++
> >  drivers/dma/fsl-edma.c| 67
> > +++
> >  3 files changed, 87 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/dma/fsl-edma-common.c b/drivers/dma/fsl-edma-
> > common.c
> > index 45d70d3..0d9915c 100644
> > --- a/drivers/dma/fsl-edma-common.c
> > +++ b/drivers/dma/fsl-edma-common.c
> > @@ -90,6 +90,19 @@ static void mux_configure8(struct fsl_edma_chan
> > *fsl_chan, void __iomem *addr,
> >     iowrite8(val8, addr + off);
> >  }
> >  
> > +void mux_configure32(struct fsl_edma_chan *fsl_chan, void __iomem
> > *addr,
> > +    u32 off, u32 slot, bool enable)
> > +{
> > +   u32 val;
> > +
> > +   if (enable)
> > +   val = EDMAMUX_CHCFG_ENBL << 24 | slot;
> > +   else
> > +   val = EDMAMUX_CHCFG_DIS;
> > +
> > +   iowrite32(val, addr + off * 4);
> > +}
> > +
> >  void fsl_edma_chan_mux(struct fsl_edma_chan *fsl_chan,
> >     unsigned int slot, bool enable)
> >  {
> > @@ -102,7 +115,10 @@ void fsl_edma_chan_mux(struct fsl_edma_chan
> > *fsl_chan,
> >     muxaddr = fsl_chan->edma->muxbase[ch / chans_per_mux];
> >     slot = EDMAMUX_CHCFG_SOURCE(slot);
> >  
> > -   mux_configure8(fsl_chan, muxaddr, ch_off, slot, enable);
> > +   if (fsl_chan->edma->version == v3)
> > +   mux_configure32(fsl_chan, muxaddr, ch_off, slot,
> > enable);
> > +   else
> > +   mux_configure8(fsl_chan, muxaddr, ch_off, slot,
> > enable);
> >  }
> >  EXPORT_SYMBOL_GPL(fsl_edma_chan_mux);
> >  
> > diff --git a/drivers/dma/fsl-edma-common.h b/drivers/dma/fsl-edma-
> > common.h
> > index 014ab74..07482d2 100644
> > --- a/drivers/dma/fsl-edma-common.h
> > +++ b/drivers/dma/fsl-edma-common.h
> > @@ -125,6 +125,7 @@ struct fsl_edma_chan {
> >     dma_addr_t  dma_dev_addr;
> >     u32 dma_dev_size;
> >     enum dma_data_direction dma_dir;
> > +   charchan_name[16];
> >  };
> >  
> >  struct fsl_edma_desc {
> > @@ -139,6 +140,7 @@ struct fsl_edma_desc {
> >  enum edma_version {
> >     v1, /* 32ch, Vybrid, mpc57x, etc */
> >     v2, /* 64ch Coldfire */
> > +   v3, /* 32ch, i.mx7ulp */
> >  };
> >  
> >  struct fsl_edma_drvdata {
> > @@ -154,6 +156,7 @@ struct fsl_edma_engine {
> >     void __iomem*membase;
> >     void __iomem*muxbase[DMAMUX_NR];
> >     struct clk  *muxclk[DMAMUX_NR];
> > +   struct clk  *dmaclk;
> >     u32 dmamux_nr;
> >     struct mutexfsl_edma_mutex;
> >     const struct fsl_edma_drvdata *drvdata;
> > diff --git a/drivers/dma/fsl-edma.c b/drivers/dma/fsl-edma.c
> > index cf18301..45b26d6 100644
> > --- a/drivers/dma/fsl-edma.c
> > +++ b/drivers/dma/fsl-edma.c
> > @@ -165,6 +165,51 @@ fsl_edma_irq_init(struct platform_device
> > *pdev, struct fsl_edma_engine *fsl_edma
> >     return 0;
> >  }
> >  
> > +static int
> > +fsl_edma2_irq_init(struct platform_device *pdev,
> > +      struct fsl_edma_engine *fsl_edma)
> > +{
> > +   struct device_node *np = pdev->dev.of_node;
> > +   int i, ret, irq;
> > +   int count = 0;
> Superfluous initialization of count!
Will fix it in v4.
> 
> > 
> > +
> > +   count = of_irq_count(np);
> > +   dev_info(&pdev->dev, "%s Found %d interrupts\r\n",
> > __func__, count);
> Consider using debug level..
Will downgrade the print level in v4.
> 
> > 
> > +   if (count <= 2) {
> > +   dev_err(&pdev->dev, "Interrupts in DTS not
> > correct.\n");
> > +   return -EINVAL;
> > +   }
> > +   /*
> > +    * 16 channel independent interrupts + 1 error interrupt
> > on i.mx7ulp.
> > +    * 2 channel share one interrupt, for example, ch0/ch16,
> > ch1/ch17...
> > +    * For now, just simply request irq without IRQF_SHARED
> > flag, since 16
> > +    * channels are enough on i.mx7ulp whose M4 domain own
> > some peripherals.
> > +    */
> > +   for (i = 0; i < count; i++) {
> > +   irq = platform_get_irq(pdev, i);
> > +   if (irq < 0)
> > +   return -ENXIO;
> > +
> > +   sprintf(fsl_edma->chans[i].chan_name, "eDMA2-
> > CH%02d", i);
> > +
> > +   /* The last IRQ is for eDMA err */
> > +   if (i == count - 1)
> > +   ret = devm_request_irq(&pdev->dev, irq,
> > +   
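
The 32-bit mux register value built by the quoted mux_configure32() can be checked in isolation. This is a user-space sketch only: the three EDMAMUX_* values are assumed from fsl-edma-common.h (they do not appear in this thread), and mux_value32() and mux_offset32() are hypothetical helper names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values; not shown anywhere in the quoted patch. */
#define EDMAMUX_CHCFG_DIS	0x0u
#define EDMAMUX_CHCFG_ENBL	0x80u
#define EDMAMUX_CHCFG_SOURCE(n)	((n) & 0x3Fu)

/* Register value written by the quoted mux_configure32(). */
static uint32_t mux_value32(uint32_t slot, bool enable)
{
	/* The caller has already masked slot with EDMAMUX_CHCFG_SOURCE(). */
	assert((slot & ~0x3Fu) == 0);

	if (enable)
		return EDMAMUX_CHCFG_ENBL << 24 | slot;	/* enable bit lands in bit 31 */
	return EDMAMUX_CHCFG_DIS;
}

/* Byte offset of one channel's register, as in "addr + off * 4". */
static uint32_t mux_offset32(uint32_t ch_off)
{
	return ch_off * 4;
}
```

The shift binds tighter than the OR, so the enable bit ends up in the top byte while the slot stays in the low six bits.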

Re: [PATCH v3 5/8] dmaengine: fsl-edma: add drvdata for vf610

2019-06-04 Thread Robin Gong
On Tue, 2019-06-04 at 18:03 +0530, Vinod Koul wrote:
> On 29-05-19, 17:08, yibin.g...@nxp.com wrote:
> 
> > 
> > @@ -205,8 +228,9 @@ static int fsl_edma_probe(struct
> > platform_device *pdev)
> >     if (!fsl_edma)
> >     return -ENOMEM;
> >  
> > -   fsl_edma->version = v1;
> > -   fsl_edma->dmamux_nr = DMAMUX_NR;
> > +   fsl_edma->drvdata = drvdata;
> > +   fsl_edma->version = drvdata->version;
> > +   fsl_edma->dmamux_nr = drvdata->dmamuxs;
> And can we avoid the duplication here, you have version and dmamuxs
> represented in two places. But right now it looks logical so the
> removal
> should be done after this series
I kept 'version'/'dmamux_nr' here at the last minute to avoid further code
changes in the other eDMA drivers such as mcf-edma.c and fsl-edma-common.c
(replacing every version/dmamux_nr with the new 'drvdata'); I also have no
board to test mcf-edma on. But if you insist, I will try to refine it in the
next version.
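
One possible shape for the later cleanup, sketched in user space: the engine keeps only the drvdata pointer and reads version and mux count through it. The field names follow the quoted hunk; the accessor names and the dmamuxs value in the table entry are illustrative assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum edma_version { v1, v2, v3 };

/* Per-SoC constants; field names follow the quoted hunk. */
struct fsl_edma_drvdata {
	enum edma_version version;
	uint32_t dmamuxs;
};

/* Trimmed engine: only the drvdata pointer, no cached copies. */
struct fsl_edma_engine {
	const struct fsl_edma_drvdata *drvdata;
};

/* Hypothetical accessors that would replace the duplicated fields. */
static enum edma_version edma_version_of(const struct fsl_edma_engine *e)
{
	assert(e && e->drvdata);
	return e->drvdata->version;
}

static uint32_t edma_dmamux_nr(const struct fsl_edma_engine *e)
{
	assert(e && e->drvdata);
	return e->drvdata->dmamuxs;
}

/* Example table entry; the dmamuxs value is illustrative only. */
static const struct fsl_edma_drvdata vf610_data = {
	.version = v1,
	.dmamuxs = 2,
};
```

Every user of the old fields, including mcf-edma.c, would have to switch to the accessors, which is the untested churn mentioned above.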
> 

Re: [PATCH 0/6] hisi_sas: Some misc patches

2019-06-04 Thread Martin K. Petersen


John,

> This patchset introduces some misc patches for the driver. Nothing
> particularly stands out, maybe apart from a patch to delete a PHY's
> timer when necessary.

Applied to 5.3/scsi-queue, thanks.

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: rcu_read_lock lost its compiler barrier

2019-06-04 Thread Herbert Xu
On Tue, Jun 04, 2019 at 02:14:49PM -0700, Paul E. McKenney wrote:
>
> Yeah, I know, even with the "volatile" keyword, it is not entirely clear
> how much reordering the compiler is allowed to do.  I was relying on
> https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html, which says:

The volatile keyword doesn't give any guarantees of this kind.
The key to ensuring ordering between unrelated variable/register
reads/writes is the memory clobber:

6.47.2.6 Clobbers and Scratch Registers

...

"memory" The "memory" clobber tells the compiler that the assembly
code performs memory reads or writes to items other than those
listed in the input and output operands (for example, accessing
the memory pointed to by one of the input parameters). To ensure
memory contains correct values, GCC may need to flush specific
register values to memory before executing the asm. Further,
the compiler does not assume that any values read from memory
before an asm remain unchanged after that asm; it reloads them as
needed. Using the "memory" clobber effectively forms a read/write
memory barrier for the compiler.

Note that this clobber does not prevent the processor from
doing speculative reads past the asm statement. To prevent that,
you need processor-specific fence instructions.

IOW you need a barrier().
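
A minimal sketch of such a barrier in user space, assuming GCC or Clang extended asm: an empty asm statement with a "memory" clobber. publish() and consume() are hypothetical names used only for illustration.

```c
#include <assert.h>

/* Empty asm with a "memory" clobber: a compiler-only barrier. */
#define barrier() __asm__ __volatile__("" : : : "memory")

static int shared_flag;
static int shared_data;

/* The compiler may not move the data store below the flag store. */
static void publish(int value)
{
	assert(value >= 0);	/* -1 is reserved as "not ready" below */
	shared_data = value;
	barrier();
	shared_flag = 1;
}

/* The compiler must reload shared_data after the flag check. */
static int consume(void)
{
	if (!shared_flag)
		return -1;
	barrier();
	return shared_data;
}
```

As the quoted documentation notes, this constrains only the compiler; ordering between CPUs still needs processor-specific fence instructions.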

Thanks,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [RFC PATCH v4] rtl8xxxu: Improve TX performance of RTL8723BU on rtl8xxxu driver

2019-06-04 Thread Chris Chiu
On Tue, Jun 4, 2019 at 3:21 AM Jes Sorensen  wrote:
>
> On 5/31/19 5:12 AM, Chris Chiu wrote:
> > We have 3 laptops which connect the wifi by the same RTL8723BU.
> > The PCI VID/PID of the wifi chip is 10EC:B720 which is supported.
> > They have the same problem with the in-kernel rtl8xxxu driver, the
> > iperf (as a client to an ethernet-connected server) gets ~1Mbps.
> > Nevertheless, the signal strength is reported as around -40dBm,
> > which is quite good. From the wireshark capture, the tx rate for each
> > data and qos data packet is only 1Mbps. Compared to the Realtek driver
> > at https://github.com/lwfinger/rtl8723bu, the same iperf test gets
> > ~12Mbps or better. The signal strength is reported similarly around
> > -40dBm. That's why we want to improve.
> >
> > After reading the source code of the rtl8xxxu driver and Realtek's, the
> > major difference is that Realtek's driver has a watchdog which will keep
> > monitoring the signal quality and updating the rate mask just like the
> > rtl8xxxu_gen2_update_rate_mask() does if signal quality changes.
> > And this kind of watchdog also exists in rtlwifi driver of some specific
> > chips, ex rtl8192ee, rtl8188ee, rtl8723ae, rtl8821ae...etc. They have
> > the same member function named dm_watchdog and will invoke the
> > corresponding dm_refresh_rate_adaptive_mask to adjust the tx rate
> > mask.
> >
> > With this commit, the tx rate of each data and qos data packet will
> > be 39Mbps (MCS4) with the 0xF0 as the tx rate mask. The 20th bit
> > to 23rd bit means MCS4 to MCS7. It means that the firmware still picks
> > the lowest rate from the rate mask and explains why the tx rate of
> > data and qos data is always lowest 1Mbps because the default rate mask
> > passed is always 0xFFF ranges from the basic CCK rate, OFDM rate,
> > and MCS rate. However, with Realtek's driver, the tx rate observed from
> > wireshark under the same condition is almost 65Mbps or 72Mbps.
> >
> > I believe the firmware of RTL8723BU may need fix. And I think we
> > can still bring in the dm_watchdog as rtlwifi to improve from the
> > driver side. Please leave precious comments for my commits and
> > suggest what I can do better. Or suggest if there's any better idea
> > to fix this. Thanks.
> >
> > Signed-off-by: Chris Chiu 
>
> I am really pleased to see you're investigating some of these issues,
> since I've been pretty swamped and not had time to work on this driver
> for a long time.
>
> The firmware should allow for two rate modes, either firmware handled or
> controlled by the driver. Ideally we would want the driver to handle it,
> but I never was able to make that work reliably.
>
> This fix should at least improve the situation, and it may explain some
> of the performance issues with the 8192eu as well?
>
> > diff --git a/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu.h 
> > b/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu.h
> > index 8828baf26e7b..216f603827a8 100644
> > --- a/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu.h
> > +++ b/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu.h
> > @@ -1195,6 +1195,44 @@ struct rtl8723bu_c2h {
> >
> >  struct rtl8xxxu_fileops;
> >
> > +/*mlme related.*/
> > +enum wireless_mode {
> > + WIRELESS_MODE_UNKNOWN = 0,
> > + /* Sub-Element */
> > + WIRELESS_MODE_B = BIT(0),
> > + WIRELESS_MODE_G = BIT(1),
> > + WIRELESS_MODE_A = BIT(2),
> > + WIRELESS_MODE_N_24G = BIT(3),
> > + WIRELESS_MODE_N_5G = BIT(4),
> > + WIRELESS_AUTO = BIT(5),
> > + WIRELESS_MODE_AC = BIT(6),
> > + WIRELESS_MODE_MAX = 0x7F,
> > +};
> > +
> > +/* from rtlwifi/wifi.h */
> > +enum ratr_table_mode_new {
> > + RATEID_IDX_BGN_40M_2SS = 0,
> > + RATEID_IDX_BGN_40M_1SS = 1,
> > + RATEID_IDX_BGN_20M_2SS_BN = 2,
> > + RATEID_IDX_BGN_20M_1SS_BN = 3,
> > + RATEID_IDX_GN_N2SS = 4,
> > + RATEID_IDX_GN_N1SS = 5,
> > + RATEID_IDX_BG = 6,
> > + RATEID_IDX_G = 7,
> > + RATEID_IDX_B = 8,
> > + RATEID_IDX_VHT_2SS = 9,
> > + RATEID_IDX_VHT_1SS = 10,
> > + RATEID_IDX_MIX1 = 11,
> > + RATEID_IDX_MIX2 = 12,
> > + RATEID_IDX_VHT_3SS = 13,
> > + RATEID_IDX_BGN_3SS = 14,
> > +};
> > +
> > +#define RTL8XXXU_RATR_STA_INIT 0
> > +#define RTL8XXXU_RATR_STA_HIGH 1
> > +#define RTL8XXXU_RATR_STA_MID  2
> > +#define RTL8XXXU_RATR_STA_LOW  3
> > +
>
> >  extern struct rtl8xxxu_fileops rtl8192cu_fops;
> >  extern struct rtl8xxxu_fileops rtl8192eu_fops;
> > diff --git a/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu_8723b.c 
> > b/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu_8723b.c
> > index 26b674aca125..2071ab9fd001 100644
> > --- a/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu_8723b.c
> > +++ b/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu_8723b.c
> > @@ -1645,6 +1645,148 @@ static void rtl8723bu_init_statistics(struct 
> > rtl8xxxu_priv *priv)
> >   rtl8xxxu_write32(priv, REG_OFDM0_FA_RSTC, val32);
> >  }
> >
> > +static u8 rtl8723b_signal_to_rssi(int signal)
> > +{
> > + 
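
The watchdog idea described in the commit message can be sketched in user space as follows. The level names come from the quoted patch; the RSSI thresholds, the 0..100 scale, and the helper names are hypothetical and not taken from either driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Level names taken from the quoted patch. */
#define RTL8XXXU_RATR_STA_INIT 0
#define RTL8XXXU_RATR_STA_HIGH 1
#define RTL8XXXU_RATR_STA_MID  2
#define RTL8XXXU_RATR_STA_LOW  3

/* Hypothetical thresholds on a 0..100 RSSI scale. */
#define RSSI_HIGH_THRESHOLD 50
#define RSSI_LOW_THRESHOLD  20

struct ra_state {
	int level;		/* last level reported to the firmware */
	unsigned int updates;	/* how many times the mask was re-sent */
};

static int rssi_to_level(uint8_t rssi)
{
	if (rssi > RSSI_HIGH_THRESHOLD)
		return RTL8XXXU_RATR_STA_HIGH;
	if (rssi > RSSI_LOW_THRESHOLD)
		return RTL8XXXU_RATR_STA_MID;
	return RTL8XXXU_RATR_STA_LOW;
}

/*
 * One watchdog tick. Returns true when the signal level changed, which is
 * the point where a driver would push a new rate mask to the firmware.
 */
static bool watchdog_tick(struct ra_state *st, uint8_t rssi)
{
	int level = rssi_to_level(rssi);

	assert(st);
	if (level == st->level)
		return false;
	st->level = level;
	st->updates++;
	return true;
}
```

Updating only on a level change keeps firmware commands rare; a real implementation would probably add hysteresis so a signal hovering at a threshold does not flap.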

Re: [PATCH] USB: move usb debugfs directory creation to the usb common core

2019-06-04 Thread Chunfeng Yun
On Tue, 2019-06-04 at 13:59 +0200, Greg Kroah-Hartman wrote:
> On Tue, Jun 04, 2019 at 11:32:58AM +0200, Greg Kroah-Hartman wrote:
> > The USB gadget subsystem wants to use the USB debugfs root directory, so
> > move it to the common "core" USB code so that it is properly initialized
> > and removed as needed.
> > 
> > Signed-off-by: Greg Kroah-Hartman 
> > 
> > ---
> > 
> > This should be the "correct" version of this, Chunfeng, can you test
> > this to verify it works for you?
I'll test it ASAP, thanks a lot

> > 
> > 
> > diff --git a/drivers/usb/common/common.c b/drivers/usb/common/common.c
> > index 18f5dcf58b0d..3b5e4263ffef 100644
> > --- a/drivers/usb/common/common.c
> > +++ b/drivers/usb/common/common.c
> > @@ -15,6 +15,7 @@
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> >  
> >  static const char *const ep_type_names[] = {
> > [USB_ENDPOINT_XFER_CONTROL] = "ctrl",
> > @@ -291,4 +292,21 @@ struct device *usb_of_get_companion_dev(struct device 
> > *dev)
> >  EXPORT_SYMBOL_GPL(usb_of_get_companion_dev);
> >  #endif
> >  
> > +struct dentry *usb_debug_root;
> > +EXPORT_SYMBOL_GPL(usb_debug_root);
> > +
> > +static int usb_common_init(void)
> > +{
> > +   usb_debug_root = debugfs_create_dir("usb", NULL);
> > +   return 0;
> > +}
> > +
> > +static void usb_common_exit(void)
> > +{
> > +   debugfs_remove_recursive(usb_debug_root);
> > +}
> > +
> > +module_init(usb_common_init);
> > +module_exit(usb_common_exit);
> > +
> >  MODULE_LICENSE("GPL");
> > diff --git a/drivers/usb/core/usb.c b/drivers/usb/core/usb.c
> > index 7fcb9f782931..f3d6b1ab80cb 100644
> > --- a/drivers/usb/core/usb.c
> > +++ b/drivers/usb/core/usb.c
> > @@ -1185,19 +1185,17 @@ static struct notifier_block usb_bus_nb = {
> > .notifier_call = usb_bus_notify,
> >  };
> >  
> > -struct dentry *usb_debug_root;
> > -EXPORT_SYMBOL_GPL(usb_debug_root);
> > +static struct dentry *usb_devices_root;
> >  
> >  static void usb_debugfs_init(void)
> >  {
> > -   usb_debug_root = debugfs_create_dir("usb", NULL);
> > -   debugfs_create_file("devices", 0444, usb_debug_root, NULL,
> > -   _devices_fops);
> > +   usb_devices_root = debugfs_create_file("devices", 0444, usb_debug_root,
> > +  NULL, _devices_fops);
> >  }
> >  
> >  static void usb_debugfs_cleanup(void)
> >  {
> > -   debugfs_remove_recursive(usb_debug_root);
> > +   debugfs_remove_recursive(usb_devices_root);
> 
> That should just be debugfs_remove();
> 
> I'll fix it up after someone tests this :)
> 
> thanks,
> 
> greg k-h




Re: [PATCH] scsi: ufs: Check that space was properly alloced in copy_query_response

2019-06-04 Thread Martin K. Petersen


Avri,

> struct ufs_dev_cmd is the main container that supports device management
> commands. In the case of a read descriptor request, we assume that the
> proper space was allocated in dev_cmd to hold the returning descriptor.
>
> This is no longer true, as there are flows that don't use dev_cmd
> for device management requests, and was wrong in the first place.

Applied to 5.2/scsi-fixes, thanks!

-- 
Martin K. Petersen  Oracle Linux Engineering


linux-next: build failure after merge of the tpmdd tree

2019-06-04 Thread Stephen Rothwell
Hi all,

After merging the tpmdd tree, today's linux-next build (arm
multi_v7_defconfig) failed like this:

include/linux/tpm_eventlog.h: In function '__calc_tpm2_event_size':
drivers/firmware/efi/tpm.c:7:35: error: implicit declaration of function 
'early_memremap'; did you mean 'early_memtest'? 
[-Werror=implicit-function-declaration]
 #define TPM_MEMREMAP(start, size) early_memremap(start, size)
   ^~
include/linux/tpm_eventlog.h:182:13: note: in expansion of macro 'TPM_MEMREMAP'
   mapping = TPM_MEMREMAP((unsigned long)marker_start,
 ^~~~
In file included from drivers/firmware/efi/tpm.c:13:
include/linux/tpm_eventlog.h:182:11: warning: assignment to 'void *' from 'int' 
makes pointer from integer without a cast [-Wint-conversion]
   mapping = TPM_MEMREMAP((unsigned long)marker_start,
   ^
drivers/firmware/efi/tpm.c:8:35: error: implicit declaration of function 
'early_memunmap'; did you mean 'early_memtest'? 
[-Werror=implicit-function-declaration]
 #define TPM_MEMUNMAP(start, size) early_memunmap(start, size)
   ^~
include/linux/tpm_eventlog.h:207:4: note: in expansion of macro 'TPM_MEMUNMAP'
TPM_MEMUNMAP(mapping, mapping_size);
^~~~
In file included from drivers/firmware/efi/tpm.c:13:
include/linux/tpm_eventlog.h:209:12: warning: assignment to 'void *' from 'int' 
makes pointer from integer without a cast [-Wint-conversion]
mapping = TPM_MEMREMAP((unsigned long)marker,
^
include/linux/tpm_eventlog.h:243:11: warning: assignment to 'void *' from 'int' 
makes pointer from integer without a cast [-Wint-conversion]
   mapping = TPM_MEMREMAP((unsigned long)marker,
   ^
In file included from ./arch/arm/include/generated/asm/early_ioremap.h:1,
 from drivers/firmware/efi/tpm.c:15:
include/asm-generic/early_ioremap.h: At top level:
include/asm-generic/early_ioremap.h:13:14: error: conflicting types for 
'early_memremap'
 extern void *early_memremap(resource_size_t phys_addr,
  ^~
drivers/firmware/efi/tpm.c:7:35: note: previous implicit declaration of 
'early_memremap' was here
 #define TPM_MEMREMAP(start, size) early_memremap(start, size)
   ^~
include/linux/tpm_eventlog.h:182:13: note: in expansion of macro 'TPM_MEMREMAP'
   mapping = TPM_MEMREMAP((unsigned long)marker_start,
 ^~~~
In file included from ./arch/arm/include/generated/asm/early_ioremap.h:1,
 from drivers/firmware/efi/tpm.c:15:
include/asm-generic/early_ioremap.h:20:13: warning: conflicting types for 
'early_memunmap'
 extern void early_memunmap(void *addr, unsigned long size);
 ^~
drivers/firmware/efi/tpm.c:8:35: note: previous implicit declaration of 
'early_memunmap' was here
 #define TPM_MEMUNMAP(start, size) early_memunmap(start, size)
   ^~
include/linux/tpm_eventlog.h:207:4: note: in expansion of macro 'TPM_MEMUNMAP'
TPM_MEMUNMAP(mapping, mapping_size);
^~~~
drivers/firmware/efi/tpm.c: In function 'efi_tpm_eventlog_init':
drivers/firmware/efi/tpm.c:81:10: warning: passing argument 1 of 
'tpm2_calc_event_log_size' makes pointer from integer without a cast 
[-Wint-conversion]
  tbl_size = tpm2_calc_event_log_size(efi.tpm_final_log
  ~
  + sizeof(final_tbl->version)
  
  + sizeof(final_tbl->nr_events),
  ^~
drivers/firmware/efi/tpm.c:20:43: note: expected 'void *' but argument is of 
type 'long unsigned int'
 static int tpm2_calc_event_log_size(void *data, int count, void *size_info)
 ~~^~~~
cc1: some warnings being treated as errors

Caused by commit

  b25b956d13d5 ("tpm: Reserve the TPM final events table")

I have used the tpmdd tree from next-20190604 for today.

-- 
Cheers,
Stephen Rothwell
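
The log above appears to be an include-order problem: the macro defined at tpm.c:7 is expanded inside an inline function from tpm_eventlog.h (included at line 13) before the header that declares early_memremap() is included (line 15). A single-file sketch of the working order, with hypothetical stand-in names:

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for asm/early_ioremap.h: the prototype has to come first. */
static void *fake_early_memremap(unsigned long phys, size_t size);

/* Stand-in for the TPM_MEMREMAP macro defined at the top of tpm.c. */
#define MAP(start, size) fake_early_memremap(start, size)

/*
 * Stand-in for the inline helper in tpm_eventlog.h. The macro is expanded
 * here, so the prototype must already be visible at this point; otherwise
 * the compiler falls back to an implicit declaration returning int.
 */
static inline size_t calc_size(unsigned long start)
{
	void *mapping = MAP(start, 16);

	return mapping ? 16 : 0;
}

static char backing[16];

static void *fake_early_memremap(unsigned long phys, size_t size)
{
	(void)phys;
	assert(size > 0);
	return size <= sizeof(backing) ? backing : NULL;
}
```

Moving the prototype below calc_size() reproduces the same class of implicit-declaration and conflicting-types diagnostics.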




Re: [PATCH net] tcp: avoid creating multiple req socks with the same tuples

2019-06-04 Thread maowenan



On 2019/6/4 23:24, Eric Dumazet wrote:
> On Tue, Jun 4, 2019 at 7:47 AM Mao Wenan  wrote:
>>
>> There is one issue about bonding mode BOND_MODE_BROADCAST, and
>> two slaves with different affinity, so packets will be handled
>> by different cpu. These are two pre-conditions in this case.
>>
>> When two slaves receive the same syn packets at the same time,
>> two request sock(reqsk) will be created if below situation happens:
>> 1. syn1 arrived tcp_conn_request, create reqsk1 and have not yet called
>> inet_csk_reqsk_queue_hash_add.
>> 2. syn2 arrived tcp_v4_rcv, it goes to tcp_conn_request and create reqsk2
>> because it can't find reqsk1 in the __inet_lookup_skb.
>>
>> Then reqsk1 and reqsk2 are both added to the established hash table, and two
>> synacks with different seq (seq1 and seq2) are sent to the client. When the
>> tcp ack arrives, it is processed in tcp_v4_rcv and tcp_check_req. If
>> __inet_lookup_skb finds reqsk2 while the ack packet's ack_seq corresponds
>> to seq1, it fails the check:
>> TCP_SKB_CB(skb)->ack_seq != tcp_rsk(req)->snt_isn + 1)
>> and then a tcp rst is sent to the client and the connection is closed.
>>
>> To fix this, do a lookup before calling inet_csk_reqsk_queue_hash_add
>> to add reqsk2 to the hash table; if it finds the existing reqsk1 with the
>> same five-tuple, it drops reqsk2 and does not send a synack to the client.
>>
>> Signed-off-by: Mao Wenan 
>> ---
>>  net/ipv4/tcp_input.c | 9 +
>>  1 file changed, 9 insertions(+)
>>
>> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
>> index 08a477e74cf3..c75eeb1fe098 100644
>> --- a/net/ipv4/tcp_input.c
>> +++ b/net/ipv4/tcp_input.c
>> @@ -6569,6 +6569,15 @@ int tcp_conn_request(struct request_sock_ops *rsk_ops,
>> bh_unlock_sock(fastopen_sk);
>> sock_put(fastopen_sk);
>> } else {
>> +   struct sock *sk1 = req_to_sk(req);
>> +   struct sock *sk2 = NULL;
>> +   sk2 = __inet_lookup_established(sock_net(sk1), _hashinfo,
>> +   
>> sk1->sk_daddr, sk1->sk_dport,
>> +   
>> sk1->sk_rcv_saddr, sk1->sk_num,
>> +   
>> inet_iif(skb),inet_sdif(skb));
>> +   if (sk2 != NULL)
>> +   goto drop_and_release;
>> +
>> tcp_rsk(req)->tfo_listener = false;
>> if (!want_cookie)
>> inet_csk_reqsk_queue_hash_add(sk, req,
> 
> This issue has been discussed last year.
Could you share a pointer to that discussion?

> 
> I am afraid your patch does not solve all races.
> 
> The lookup you add is lockless, so this is racy.
You are right, the lookup is already inside the racy region.
> 
> Really the only way to solve this is to make sure that _when_ the
> bucket lock is held,
> we do not insert a request socket if the 4-tuple is already in the
> chain (probably in inet_ehash_insert())
> 

Would it be OK to put the lookup code inside the spin_lock() section of
inet_ehash_insert(), like this? Would it affect performance?

in inet_ehash_insert():
...
spin_lock(lock);
+   reqsk = __inet_lookup_established(sock_net(sk), _hashinfo,
+   sk->sk_daddr, 
sk->sk_dport,
+   sk->sk_rcv_saddr, 
sk->sk_num,
+   sk_bound_dev_if, 
sk_bound_dev_if);
+   if (reqsk) {
+   spin_unlock(lock);
+   return ret;
+   }
+
if (osk) {
WARN_ON_ONCE(sk->sk_hash != osk->sk_hash);
ret = sk_nulls_del_node_init_rcu(osk);
}
if (ret)
__sk_nulls_add_node_rcu(sk, list);
spin_unlock(lock);
...
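
The general pattern Eric describes, checking for a duplicate and inserting under one hold of the bucket lock, can be sketched in user space. This is not the kernel code: the types are simplified stand-ins, a C11 atomic_flag plays the role of the ehash bucket spinlock, and bucket_insert_unique() is a hypothetical name.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct tuple {
	uint32_t saddr, daddr;
	uint16_t sport, dport;
};

struct node {
	struct tuple key;
	struct node *next;
};

struct bucket {
	atomic_flag lock;	/* stand-in for the bucket spinlock */
	struct node *head;
};

static bool tuple_equal(const struct tuple *a, const struct tuple *b)
{
	return a->saddr == b->saddr && a->daddr == b->daddr &&
	       a->sport == b->sport && a->dport == b->dport;
}

static void bucket_lock(struct bucket *b)
{
	while (atomic_flag_test_and_set_explicit(&b->lock, memory_order_acquire))
		;	/* spin until the holder clears the flag */
}

static void bucket_unlock(struct bucket *b)
{
	atomic_flag_clear_explicit(&b->lock, memory_order_release);
}

/*
 * Insert n unless the chain already holds the same tuple. The walk and the
 * insert happen under one lock hold, so two CPUs cannot both miss.
 */
static bool bucket_insert_unique(struct bucket *b, struct node *n)
{
	struct node *cur;
	bool inserted = true;

	assert(b && n);
	bucket_lock(b);
	for (cur = b->head; cur; cur = cur->next) {
		if (tuple_equal(&cur->key, &n->key)) {
			inserted = false;
			break;
		}
	}
	if (inserted) {
		n->next = b->head;
		b->head = n;
	}
	bucket_unlock(b);
	return inserted;
}
```

The cost is a chain walk while the lock is held, which is the performance question raised above; the sketch makes no claim about how large that cost is in the kernel.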

> This needs more tricky changes than your patch.
> 
> .
> 



[PATCH] drivers/usb/host/imx21-hcd.c: fix divide-by-zero in func nonisoc_etd_done

2019-06-04 Thread Duyanlin


If usb_maxpacket(urb->dev, urb->pipe, usb_pipeout(urb->pipe)) returns 0,
the modulo operation that uses it as the divisor becomes an illegal
divide-by-zero, and unexpected results may occur.
Ensure that the denominator is non-zero before dividing.

Signed-off-by: Yanlin Du 
---
 drivers/usb/host/imx21-hcd.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/usb/host/imx21-hcd.c b/drivers/usb/host/imx21-hcd.c
index 6e3dad1..6a47f78 100644
--- a/drivers/usb/host/imx21-hcd.c
+++ b/drivers/usb/host/imx21-hcd.c
@@ -1038,6 +1038,7 @@ static void nonisoc_etd_done(struct usb_hcd *hcd, int 
etd_num)
int cc;
u32 bytes_xfrd;
int etd_done;
+   unsigned int maxp;
 
disactivate_etd(imx21, etd_num);
 
@@ -1104,13 +1105,13 @@ static void nonisoc_etd_done(struct usb_hcd *hcd, int 
etd_num)
break;
 
case PIPE_BULK:
+   maxp = usb_maxpacket(urb->dev, urb->pipe,
+   usb_pipeout(urb->pipe));
urb->actual_length += bytes_xfrd;
if ((urb_priv->state == US_BULK)
&& (urb->transfer_flags & URB_ZERO_PACKET)
&& urb->transfer_buffer_length > 0
-   && ((urb->transfer_buffer_length %
-usb_maxpacket(urb->dev, urb->pipe,
-  usb_pipeout(urb->pipe))) == 0)) {
+   && maxp && (urb->transfer_buffer_length % maxp == 0)) {
/* need a 0-packet */
urb_priv->state = US_BULK0;
} else {
--
1.8.5.6
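
The guard in the patch relies on short-circuit evaluation of &&. A stand-alone sketch of the same condition, with hypothetical helper names and plain integers in place of the URB fields:

```c
#include <assert.h>
#include <stdbool.h>

/*
 * True when a transfer whose length is a whole multiple of maxp would need
 * a trailing zero-length packet.
 */
static bool needs_zero_packet(unsigned int transfer_len, unsigned int maxp)
{
	/* && short-circuits: the modulo runs only when maxp is non-zero. */
	return transfer_len > 0 && maxp && (transfer_len % maxp == 0);
}

/* Number of full packets; callers must have ruled out maxp == 0. */
static unsigned int full_packets(unsigned int len, unsigned int maxp)
{
	assert(maxp != 0);
	return len / maxp;
}
```

With maxp equal to 0 the first helper simply reports that no zero-length packet is needed, which matches the else branch taken by the patched driver code.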



Re: [PATCH v5 5/8] KVM: VMX: Load Guest CET via VMCS when CET is enabled in Guest

2019-06-04 Thread Yang Weijiang
On Tue, Jun 04, 2019 at 01:03:36PM -0700, Sean Christopherson wrote:
> On Wed, May 22, 2019 at 03:00:58PM +0800, Yang Weijiang wrote:
> > "Load Guest CET state" bit controls whether Guest CET states
> > will be loaded at Guest entry. Before doing that, KVM needs
> > to check if CPU CET feature is available to Guest.
> > 
> > Note: SHSTK and IBT features share one control MSR:
> > MSR_IA32_{U,S}_CET, which means it's difficult to hide
> > one feature from another in the case of SHSTK != IBT.
> > After discussion in the community, it was agreed to allow the Guest
> > to control the two features independently, as this won't introduce
> > a security hole.
> > 
> > Signed-off-by: Yang Weijiang 
> > Co-developed-by: Zhang Yi Z 
> > ---
> >  arch/x86/kvm/vmx/vmx.c | 12 
> >  1 file changed, 12 insertions(+)
> > 
> > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> > index 9321da538f65..1c0d487a4037 100644
> > --- a/arch/x86/kvm/vmx/vmx.c
> > +++ b/arch/x86/kvm/vmx/vmx.c
> > @@ -47,6 +47,7 @@
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> 
> Is this include actually needed?  I haven't attempted to compile, but a
> glance everything should be in cpufeatures.h or vmx.h.
> 
  Thanks Sean!
  My original intent was to reuse the cpu_x86_cet_enabled() macro to check
  the host CET status, but somehow that check is not there. To address your
  question below I need the macro for that check, so I will keep this include
  and add the check in the next version.

> >  #include "capabilities.h"
> >  #include "cpuid.h"
> > @@ -2929,6 +2930,17 @@ int vmx_set_cr4(struct kvm_vcpu *vcpu, unsigned long 
> > cr4)
> > if (!nested_vmx_allowed(vcpu) || is_smm(vcpu))
> > return 1;
> > }
> > +   if (guest_cpuid_has(vcpu, X86_FEATURE_SHSTK) ||
> > +   guest_cpuid_has(vcpu, X86_FEATURE_IBT)) {
> > +   if (cr4 & X86_CR4_CET)
> > +   vmcs_set_bits(VM_ENTRY_CONTROLS,
> > + VM_ENTRY_LOAD_GUEST_CET_STATE);
> > +   else
> > +   vmcs_clear_bits(VM_ENTRY_CONTROLS,
> > +   VM_ENTRY_LOAD_GUEST_CET_STATE);
> > +   } else if (cr4 & X86_CR4_CET) {
> > +   return 1;
> > +   }
> 
> Don't we also need to check for host CET support prior to toggling
> VM_ENTRY_LOAD_GUEST_CET_STATE?

Yes, I need to add the check back. The v3 patch changed how the CET CPUID bits
are enumerated to the guest, and the check has been missing since then.
> 
> >  
> > if (to_vmx(vcpu)->nested.vmxon && !nested_cr4_valid(vcpu, cr4))
> > return 1;
> > -- 
> > 2.17.2
> > 


Re: [PATCH v2] ARM: configs: Remove useless UEVENT_HELPER_PATH

2019-06-04 Thread Andrew Jeffery



On Tue, 4 Jun 2019, at 17:45, Krzysztof Kozlowski wrote:
> Remove the CONFIG_UEVENT_HELPER_PATH because:
> 1. It is disabled since commit 1be01d4a5714 ("driver: base: Disable
>CONFIG_UEVENT_HELPER by default") as its dependency (UEVENT_HELPER) was
>made default to 'n',
> 2. It is not recommended (help message: "This should not be used today
>[...] creates a high system load") and was kept only for ancient
>userland,
> 3. Certain userland specifically requests it to be disabled (systemd
>README: "Legacy hotplug slows down the system and confuses udev").
> 
> Signed-off-by: Krzysztof Kozlowski 
> Acked-by: Geert Uytterhoeven 
> 
> ---
> 
> Changes since v2:
> 1. Remove unrelated files.
> 2. Add Geert's ack.
> ---
>  arch/arm/configs/acs5k_defconfig  | 1 -
>  arch/arm/configs/acs5k_tiny_defconfig | 1 -
>  arch/arm/configs/am200epdkit_defconfig| 1 -
>  arch/arm/configs/aspeed_g4_defconfig  | 1 -
>  arch/arm/configs/aspeed_g5_defconfig  | 1 -
>  arch/arm/configs/at91_dt_defconfig| 1 -
>  arch/arm/configs/axm55xx_defconfig| 1 -
>  arch/arm/configs/cm_x2xx_defconfig| 1 -
>  arch/arm/configs/cm_x300_defconfig| 1 -
>  arch/arm/configs/cns3420vb_defconfig  | 1 -
>  arch/arm/configs/colibri_pxa270_defconfig | 1 -
>  arch/arm/configs/colibri_pxa300_defconfig | 1 -
>  arch/arm/configs/corgi_defconfig  | 1 -
>  arch/arm/configs/dove_defconfig   | 1 -
>  arch/arm/configs/em_x270_defconfig| 1 -
>  arch/arm/configs/ep93xx_defconfig | 1 -
>  arch/arm/configs/eseries_pxa_defconfig| 1 -
>  arch/arm/configs/ezx_defconfig| 1 -
>  arch/arm/configs/gemini_defconfig | 1 -
>  arch/arm/configs/h3600_defconfig  | 1 -
>  arch/arm/configs/h5000_defconfig  | 1 -
>  arch/arm/configs/imote2_defconfig | 1 -
>  arch/arm/configs/imx_v4_v5_defconfig  | 1 -
>  arch/arm/configs/iop13xx_defconfig| 1 -
>  arch/arm/configs/iop32x_defconfig | 1 -
>  arch/arm/configs/iop33x_defconfig | 1 -
>  arch/arm/configs/ixp4xx_defconfig | 1 -
>  arch/arm/configs/jornada720_defconfig | 1 -
>  arch/arm/configs/keystone_defconfig   | 1 -
>  arch/arm/configs/ks8695_defconfig | 1 -
>  arch/arm/configs/lpc32xx_defconfig| 1 -
>  arch/arm/configs/magician_defconfig   | 1 -
>  arch/arm/configs/moxart_defconfig | 1 -
>  arch/arm/configs/multi_v5_defconfig   | 1 -
>  arch/arm/configs/mv78xx0_defconfig| 1 -
>  arch/arm/configs/mvebu_v5_defconfig   | 1 -
>  arch/arm/configs/mvebu_v7_defconfig   | 1 -
>  arch/arm/configs/nhk8815_defconfig| 1 -
>  arch/arm/configs/nuc910_defconfig | 1 -
>  arch/arm/configs/nuc950_defconfig | 1 -
>  arch/arm/configs/nuc960_defconfig | 1 -
>  arch/arm/configs/omap1_defconfig  | 1 -
>  arch/arm/configs/orion5x_defconfig| 1 -
>  arch/arm/configs/palmz72_defconfig| 1 -
>  arch/arm/configs/pcm027_defconfig | 1 -
>  arch/arm/configs/prima2_defconfig | 1 -
>  arch/arm/configs/pxa168_defconfig | 1 -
>  arch/arm/configs/pxa3xx_defconfig | 1 -
>  arch/arm/configs/pxa910_defconfig | 1 -
>  arch/arm/configs/pxa_defconfig| 1 -
>  arch/arm/configs/realview_defconfig   | 1 -
>  arch/arm/configs/s3c2410_defconfig| 1 -
>  arch/arm/configs/s3c6400_defconfig| 1 -
>  arch/arm/configs/s5pv210_defconfig| 1 -
>  arch/arm/configs/sama5_defconfig  | 1 -
>  arch/arm/configs/socfpga_defconfig| 1 -
>  arch/arm/configs/spear13xx_defconfig  | 1 -
>  arch/arm/configs/spear3xx_defconfig   | 1 -
>  arch/arm/configs/spear6xx_defconfig   | 1 -
>  arch/arm/configs/spitz_defconfig  | 1 -
>  arch/arm/configs/tango4_defconfig | 1 -
>  arch/arm/configs/tct_hammer_defconfig | 1 -
>  arch/arm/configs/u300_defconfig   | 1 -
>  arch/arm/configs/u8500_defconfig  | 1 -
>  arch/arm/configs/vexpress_defconfig   | 1 -
>  arch/arm/configs/viper_defconfig  | 1 -
>  arch/arm/configs/xcep_defconfig   | 1 -
>  arch/arm/configs/zeus_defconfig   | 1 -
>  arch/arm/configs/zx_defconfig | 1 -
>  69 files changed, 69 deletions(-)
> 
> diff --git a/arch/arm/configs/acs5k_defconfig 
> b/arch/arm/configs/acs5k_defconfig
> index d04ee19e5b75..bcb8bda09158 100644
> --- a/arch/arm/configs/acs5k_defconfig
> +++ b/arch/arm/configs/acs5k_defconfig
> @@ -30,7 +30,6 @@ CONFIG_INET=y
>  CONFIG_IP_PNP=y
>  CONFIG_IP_PNP_DHCP=y
>  # CONFIG_IPV6 is not set
> -CONFIG_UEVENT_HELPER_PATH="/sbin/hotplug"
>  CONFIG_MTD=y
>  CONFIG_MTD_BLOCK=y
>  CONFIG_MTD_CFI=y
> diff --git a/arch/arm/configs/acs5k_tiny_defconfig 
> b/arch/arm/configs/acs5k_tiny_defconfig
> index 25c593df41d1..e802cdebfd0b 100644
> --- a/arch/arm/configs/acs5k_tiny_defconfig
> +++ b/arch/arm/configs/acs5k_tiny_defconfig
> @@ -25,7 +25,6 @@ CONFIG_INET=y
>  # 

[PATCH] media: platform: Fix Warning of Unneeded Semicolon reported by coccicheck

2019-06-04 Thread Shobhit Kukreti
Fix the warnings reported by coccicheck in the files below:

drivers/media/platform/pxa_camera.c:1391:2-3: Unneeded semicolon
drivers/media/platform/qcom/venus/vdec_ctrls.c:78:2-3: Unneeded semicolon
drivers/media/platform/sti/c8sectpfe/c8sectpfe-dvb.c:146:3-4: Unneeded semicolon

Signed-off-by: Shobhit Kukreti 
---
 drivers/media/platform/pxa_camera.c  | 2 +-
 drivers/media/platform/qcom/venus/vdec_ctrls.c   | 2 +-
 drivers/media/platform/sti/c8sectpfe/c8sectpfe-dvb.c | 4 ++--
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/media/platform/pxa_camera.c 
b/drivers/media/platform/pxa_camera.c
index 6addc5e..1c9bfaa 100644
--- a/drivers/media/platform/pxa_camera.c
+++ b/drivers/media/platform/pxa_camera.c
@@ -1388,7 +1388,7 @@ static int pxa_buffer_init(struct pxa_camera_dev *pcdev,
break;
default:
return -EINVAL;
-   };
+   }
buf->nb_planes = nb_channels;
 
ret = sg_split(sgt->sgl, sgt->nents, 0, nb_channels,
diff --git a/drivers/media/platform/qcom/venus/vdec_ctrls.c 
b/drivers/media/platform/qcom/venus/vdec_ctrls.c
index f4604b0..90f7620 100644
--- a/drivers/media/platform/qcom/venus/vdec_ctrls.c
+++ b/drivers/media/platform/qcom/venus/vdec_ctrls.c
@@ -75,7 +75,7 @@ static int vdec_op_g_volatile_ctrl(struct v4l2_ctrl *ctrl)
break;
default:
return -EINVAL;
-   };
+   }
 
return 0;
 }
diff --git a/drivers/media/platform/sti/c8sectpfe/c8sectpfe-dvb.c 
b/drivers/media/platform/sti/c8sectpfe/c8sectpfe-dvb.c
index 075d469..a79250a 100644
--- a/drivers/media/platform/sti/c8sectpfe/c8sectpfe-dvb.c
+++ b/drivers/media/platform/sti/c8sectpfe/c8sectpfe-dvb.c
@@ -143,7 +143,7 @@ int c8sectpfe_frontend_attach(struct dvb_frontend **fe,
"%s: stv0367ter_attach failed for NIM card %s\n"
, __func__, dvb_card_str(tsin->dvb_card));
return -ENODEV;
-   };
+   }
 
/*
 * init the demod so that i2c gate_ctrl
@@ -203,7 +203,7 @@ int c8sectpfe_frontend_attach(struct dvb_frontend **fe,
"%s: stv6110x_attach failed for NIM card %s\n"
, __func__, dvb_card_str(tsin->dvb_card));
return -ENODEV;
-   };
+   }
 
stv090x_config.tuner_init = fe2->tuner_init;
stv090x_config.tuner_set_mode = fe2->tuner_set_mode;
-- 
2.7.4



Re: [RFC v2] irqchip/gic-its: fix command queue pointer comparison bug

2019-06-04 Thread Guoheyi




On 2019/6/4 18:28, Marc Zyngier wrote:

Hi Heyi,

On 13/05/2019 12:42, Heyi Guo wrote:

When we run several VMs with PCI passthrough and GICv4 enabled, not
pinning vCPUs, we will occasionally see below warnings in dmesg:

ITS queue timeout (65440 65504 480)
ITS cmd its_build_vmovp_cmd failed

The reason for the above issue is that in BUILD_SINGLE_CMD_FUNC:
1. Post the write command.
2. Release the lock.
3. Start to read GITS_CREADR to get the reader pointer.
4. Compare the reader pointer to the target pointer.
5. If reader pointer does not reach the target, sleep 1us and continue
to try.

If we have several processors running the above concurrently, other
CPUs will post write commands while the 1st CPU is waiting for the
completion. So we may have the issue below:

phase 1:
---rd_idx-from_idx-to_idx--0-

wait 1us:

phase 2:
--from_idx-to_idx--0-rd_idx--

That is, rd_idx may run ahead of to_idx, and if to_idx is near the
wrap point, rd_idx will wrap around. So the condition below will not
be met even after 1s:

if (from_idx < to_idx && rd_idx >= to_idx)

There is another theoretical issue. For a slow and busy ITS, the
initial rd_idx may fall behind from_idx a lot, just as below:

---rd_idx---0--from_idx-to_idx---

This will cause the wait function to exit too early.

Actually, it does not make much sense to use from_idx to judge whether
to_idx is wrapped. What we need is an initial rd_idx sampled while the
lock is still held, which can be used to judge whether to_idx is wrapped
and whether the current rd_idx is wrapped.

That's an interesting observation. Indeed, from_idx is pretty irrelevant
here, and all we want to observe is the read pointer reaching the end of
the command set.


We switch to a method of calculating the delta of two adjacent reads
and accumulating it to get the sum, so that we can get the real rd_idx
from the wrapped value even when the queue is almost full.
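
The delta-accumulation scheme described above can be sketched in plain C
outside the kernel. This is a minimal userspace model, not the driver code:
QUEUE_SZ, linearize_target() and linearize_step() are hypothetical names, and
the sketch assumes the read pointer is sampled often enough that it never
advances by a full queue between two consecutive reads.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical queue size in bytes; stands in for ITS_CMD_QUEUE_SZ. */
#define QUEUE_SZ 65536u

/*
 * Linearize the target offset: if it lies behind the read pointer that was
 * sampled under the lock, the queue must wrap before the target is reached.
 */
static uint64_t linearize_target(uint64_t origin_rd, uint64_t to_raw)
{
	assert(origin_rd < QUEUE_SZ && to_raw < QUEUE_SZ);
	return to_raw < origin_rd ? to_raw + QUEUE_SZ : to_raw;
}

/*
 * Add the distance travelled between two raw (wrapping) reads to the
 * linearized read position.
 */
static uint64_t linearize_step(uint64_t linear, uint64_t prev_raw,
			       uint64_t cur_raw)
{
	uint64_t delta;

	assert(prev_raw < QUEUE_SZ && cur_raw < QUEUE_SZ);
	if (cur_raw >= prev_raw)
		delta = cur_raw - prev_raw;
	else
		delta = cur_raw + QUEUE_SZ - prev_raw;	/* pointer wrapped */

	return linear + delta;
}
```

With an initial read of 65400, a target of 65504 and a next raw read of 480,
the step adds 616 and yields 66016, which is past the target even though the
raw value 480 is numerically smaller than 65504.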

Cc: Thomas Gleixner 
Cc: Jason Cooper 
Cc: Marc Zyngier 

Signed-off-by: Heyi Guo 
---
  drivers/irqchip/irq-gic-v3-its.c | 30 --
  1 file changed, 20 insertions(+), 10 deletions(-)

diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index 7577755..f05acd4 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -745,32 +745,40 @@ static void its_flush_cmd(struct its_node *its, struct 
its_cmd_block *cmd)
  }
  
  static int its_wait_for_range_completion(struct its_node *its,

-struct its_cmd_block *from,
+u64 origin_rd_idx,
 struct its_cmd_block *to)
  {
-   u64 rd_idx, from_idx, to_idx;
+   u64 rd_idx, prev_idx, to_idx, sum;
+   s64 delta;
u32 count = 100;/* 1s! */
  
-	from_idx = its_cmd_ptr_to_offset(its, from);

to_idx = its_cmd_ptr_to_offset(its, to);
+   if (to_idx < origin_rd_idx)
+   to_idx += ITS_CMD_QUEUE_SZ;
+
+   prev_idx = origin_rd_idx;

I guess you could just rename origin_rd_idx to prev_idx and drop the
extra declaration (the pr_err doesn't matter much).


+   sum = origin_rd_idx;
  
  	while (1) {

rd_idx = readl_relaxed(its->base + GITS_CREADR);
  
-		/* Direct case */

-   if (from_idx < to_idx && rd_idx >= to_idx)
-   break;
+   /* Wrap around for CREADR */
+   if (rd_idx >= prev_idx)
+   delta = rd_idx - prev_idx;
+   else
+   delta = rd_idx + ITS_CMD_QUEUE_SZ - prev_idx;
  
-		/* Wrapped case */

-   if (from_idx >= to_idx && rd_idx >= to_idx && rd_idx < from_idx)
+   sum += delta;

So "sum" isn't quite saying what it represents. My understanding is that
it is the linearized version of the read pointer, right? Just like
you've linearized to_idx at the beginning of the function.


+   if (sum >= to_idx)
break;
  
  		count--;

if (!count) {
pr_err_ratelimited("ITS queue timeout (%llu %llu 
%llu)\n",
-  from_idx, to_idx, rd_idx);
+  origin_rd_idx, to_idx, sum);
return -1;
}
+   prev_idx = rd_idx;
cpu_relax();
udelay(1);
}
@@ -787,6 +795,7 @@ void name(struct its_node *its, 
\
struct its_cmd_block *cmd, *sync_cmd, *next_cmd;\
synctype *sync_obj; \
unsigned long flags;\
+   u64 rd_idx; \
\
raw_spin_lock_irqsave(&its->lock, flags);

Re: [PATCH 1/3 linux dev-5.1 v2] ARM: dts: aspeed: Add SGPM pinmux

2019-06-04 Thread Andrew Jeffery



On Wed, 5 Jun 2019, at 07:23, Hongwei Zhang wrote:
> Add SGPM pinmux to ast2500-pinctrl function and group, to prepare for
> supporting SGPIO in AST2500 SoC.
> 
> Signed-off-by: Hongwei Zhang 

Reviewed-by: Andrew Jeffery 

> ---
>  Documentation/devicetree/bindings/pinctrl/pinctrl-aspeed.txt | 2 +-
>  drivers/pinctrl/aspeed/pinctrl-aspeed-g5.c   | 4 
>  2 files changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git 
> a/Documentation/devicetree/bindings/pinctrl/pinctrl-aspeed.txt 
> b/Documentation/devicetree/bindings/pinctrl/pinctrl-aspeed.txt
> index 3b7266c..8f1c5c4 100644
> --- a/Documentation/devicetree/bindings/pinctrl/pinctrl-aspeed.txt
> +++ b/Documentation/devicetree/bindings/pinctrl/pinctrl-aspeed.txt
> @@ -84,7 +84,7 @@ NDCD2 NDCD3 NDCD4 NDSR1 NDSR2 NDSR3 NDSR4 NDTR1 NDTR2 
> NDTR3 NDTR4 NRI1 NRI2
>  NRI3 NRI4 NRTS1 NRTS2 NRTS3 NRTS4 OSCCLK PEWAKE PNOR PWM0 PWM1 PWM2 
> PWM3 PWM4
>  PWM5 PWM6 PWM7 RGMII1 RGMII2 RMII1 RMII2 RXD1 RXD2 RXD3 RXD4 SALT1 
> SALT10
>  SALT11 SALT12 SALT13 SALT14 SALT2 SALT3 SALT4 SALT5 SALT6 SALT7 SALT8 
> SALT9
> -SCL1 SCL2 SD1 SD2 SDA1 SDA2 SGPS1 SGPS2 SIOONCTRL SIOPBI SIOPBO 
> SIOPWREQ
> +SCL1 SCL2 SD1 SD2 SDA1 SDA2 SGPM SGPS1 SGPS2 SIOONCTRL SIOPBI SIOPBO 
> SIOPWREQ
>  SIOPWRGD SIOS3 SIOS5 SIOSCI SPI1 SPI1CS1 SPI1DEBUG SPI1PASSTHRU SPI2CK 
> SPI2CS0
>  SPI2CS1 SPI2MISO SPI2MOSI TIMER3 TIMER4 TIMER5 TIMER6 TIMER7 TIMER8 
> TXD1 TXD2
>  TXD3 TXD4 UART6 USB11BHID USB2AD USB2AH USB2BD USB2BH USBCKI 
> VGABIOSROM VGAHS
> diff --git a/drivers/pinctrl/aspeed/pinctrl-aspeed-g5.c 
> b/drivers/pinctrl/aspeed/pinctrl-aspeed-g5.c
> index 187abd7..0c89647 100644
> --- a/drivers/pinctrl/aspeed/pinctrl-aspeed-g5.c
> +++ b/drivers/pinctrl/aspeed/pinctrl-aspeed-g5.c
> @@ -577,6 +577,8 @@ SS_PIN_DECL(N3, GPIOJ2, SGPMO);
>  SIG_EXPR_LIST_DECL_SINGLE(SGPMI, SGPM, SIG_DESC_SET(SCU84, 11));
>  SS_PIN_DECL(N4, GPIOJ3, SGPMI);
>  
> +FUNC_GROUP_DECL(SGPM, R2, L2, N3, N4);
> +
>  #define N5 76
>  SIG_EXPR_LIST_DECL_SINGLE(VGAHS, VGAHS, SIG_DESC_SET(SCU84, 12));
>  SIG_EXPR_LIST_DECL_SINGLE(DASHN5, DASHN5, SIG_DESC_SET(SCU94, 8));
> @@ -2127,6 +2129,7 @@ static const struct aspeed_pin_group 
> aspeed_g5_groups[] = {
>   ASPEED_PINCTRL_GROUP(SD2),
>   ASPEED_PINCTRL_GROUP(SDA1),
>   ASPEED_PINCTRL_GROUP(SDA2),
> + ASPEED_PINCTRL_GROUP(SGPM),
>   ASPEED_PINCTRL_GROUP(SGPS1),
>   ASPEED_PINCTRL_GROUP(SGPS2),
>   ASPEED_PINCTRL_GROUP(SIOONCTRL),
> @@ -2296,6 +2299,7 @@ static const struct aspeed_pin_function 
> aspeed_g5_functions[] = {
>   ASPEED_PINCTRL_FUNC(SD2),
>   ASPEED_PINCTRL_FUNC(SDA1),
>   ASPEED_PINCTRL_FUNC(SDA2),
> + ASPEED_PINCTRL_FUNC(SGPM),
>   ASPEED_PINCTRL_FUNC(SGPS1),
>   ASPEED_PINCTRL_FUNC(SGPS2),
>   ASPEED_PINCTRL_FUNC(SIOONCTRL),
> -- 
> 2.7.4
> 
>


Re: [PATCH 1/3 linux dev-5.1 arm/soc v2] ARM: dts: aspeed: Add SGPM pinmux

2019-06-04 Thread Andrew Jeffery



On Wed, 5 Jun 2019, at 07:12, Hongwei Zhang wrote:
> Add SGPM pinmux to ast2500-pinctrl function and group, to prepare for
> supporting SGPIO in AST2500 SoC.
> 
> Signed-off-by: Hongwei Zhang 

Reviewed-by: Andrew Jeffery 

> ---
>  arch/arm/boot/dts/aspeed-g5.dtsi | 5 +
>  1 file changed, 5 insertions(+)
> 
> diff --git a/arch/arm/boot/dts/aspeed-g5.dtsi 
> b/arch/arm/boot/dts/aspeed-g5.dtsi
> index 85ed9db..8d30818 100644
> --- a/arch/arm/boot/dts/aspeed-g5.dtsi
> +++ b/arch/arm/boot/dts/aspeed-g5.dtsi
> @@ -1321,6 +1321,11 @@
>   groups = "SDA2";
>   };
>  
> + pinctrl_sgpm_default: sgpm_default {
> + function = "SGPM";
> + groups = "SGPM";
> + };
> +
>   pinctrl_sgps1_default: sgps1_default {
>   function = "SGPS1";
>   groups = "SGPS1";
> -- 
> 2.7.4
> 
>


Re: [RFC 4/6] workqueue: Convert for_each_wq to use built-in list check

2019-06-04 Thread Daniel Jordan
On Sat, Jun 01, 2019 at 06:27:36PM -0400, Joel Fernandes (Google) wrote:
> list_for_each_entry_rcu now has support to check for RCU reader sections
> as well as lock. Just use the support in it, instead of explicitly
> checking in the caller.
> 
> Signed-off-by: Joel Fernandes (Google) 
> ---
>  kernel/workqueue.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> index 9657315405de..91ed7aca16e5 100644
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -424,9 +424,8 @@ static void workqueue_sysfs_unregister(struct 
> workqueue_struct *wq);
>   * ignored.
>   */
>  #define for_each_pwq(pwq, wq)
> \
> - list_for_each_entry_rcu((pwq), &(wq)->pwqs, pwqs_node)  \
> - if (({ assert_rcu_or_wq_mutex(wq); false; })) { }   \
> - else
> + list_for_each_entry_rcu((pwq), &(wq)->pwqs, pwqs_node,  \
> +  lock_is_held(&(wq->mutex).dep_map))
>  

I think the definition of assert_rcu_or_wq_mutex can also be deleted.


Re: possible deadlock in get_user_pages_unlocked (2)

2019-06-04 Thread syzbot

syzbot has bisected this bug to:

commit 69d61f577d147b396be0991b2ac6f65057f7d445
Author: Mimi Zohar 
Date:   Wed Apr 3 21:47:46 2019 +

ima: verify mprotect change is consistent with mmap policy

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=1055a2f2a0
start commit:   56b697c6 Add linux-next specific files for 20190604
git tree:   linux-next
final crash:https://syzkaller.appspot.com/x/report.txt?x=1255a2f2a0
console output: https://syzkaller.appspot.com/x/log.txt?x=1455a2f2a0
kernel config:  https://syzkaller.appspot.com/x/.config?x=4248d6bc70076f7d
dashboard link: https://syzkaller.appspot.com/bug?extid=e1374b2ec8f6a25ab2e5
syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=165757eea0
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=10dd3e86a0

Reported-by: syzbot+e1374b2ec8f6a25ab...@syzkaller.appspotmail.com
Fixes: 69d61f577d14 ("ima: verify mprotect change is consistent with mmap  
policy")


For information about bisection process see: https://goo.gl/tpsmEJ#bisection


[PATCH 3/6] staging: kpc2000: kpc_spi: remove unnecessary struct member word_len

2019-06-04 Thread Geordan Neukum
The structure kp_spi_controller_state, defined in the kpc2000_spi
driver, contains a member named word_len which is never used after
initialization. Therefore, it should be removed for simplicity's sake.

Signed-off-by: Geordan Neukum 
---
 drivers/staging/kpc2000/kpc2000_spi.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/staging/kpc2000/kpc2000_spi.c 
b/drivers/staging/kpc2000/kpc2000_spi.c
index 1d89cb3b861f..61296335313b 100644
--- a/drivers/staging/kpc2000/kpc2000_spi.c
+++ b/drivers/staging/kpc2000/kpc2000_spi.c
@@ -110,7 +110,6 @@ struct kp_spi {
 struct kp_spi_controller_state {
void __iomem   *base;
unsigned char   chip_select;
-   int word_len;
s64 conf_cache;
 };
 
@@ -269,7 +268,6 @@ kp_spi_setup(struct spi_device *spidev)
}
cs->base = kpspi->base;
cs->chip_select = spidev->chip_select;
-   cs->word_len = spidev->bits_per_word;
cs->conf_cache = -1;
spidev->controller_state = cs;
}
@@ -369,7 +367,6 @@ kp_spi_transfer_one_message(struct spi_master *master, 
struct spi_message *m)
if (transfer->bits_per_word) {
word_len = transfer->bits_per_word;
}
-   cs->word_len = word_len;
sc.bitfield.wl = word_len-1;
 
/* ...chip select */
-- 
2.21.0



[PATCH 5/6] staging: kpc2000: kpc_spi: remove unnecessary ulong repr of i/o addr

2019-06-04 Thread Geordan Neukum
The kpc_spi driver stashes off an unsigned long representation of the
i/o mapping returned by devm_ioremap_nocache(). This is unnecessary, as
the only use of the unsigned long repr is to eventually be re-cast to
an (u64 __iomem *). Instead of casting the (void __iomem *) to an
(unsigned long) then a (u64 __iomem *), just remove this intermediate
step. As this intermediary is no longer used, also remove it from its
structure.

Signed-off-by: Geordan Neukum 
---
 drivers/staging/kpc2000/kpc2000_spi.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/staging/kpc2000/kpc2000_spi.c 
b/drivers/staging/kpc2000/kpc2000_spi.c
index 07b0327d8bef..4f517afc6239 100644
--- a/drivers/staging/kpc2000/kpc2000_spi.c
+++ b/drivers/staging/kpc2000/kpc2000_spi.c
@@ -103,7 +103,6 @@ static struct spi_board_info p2kr0_board_info[] = {
 struct kp_spi {
struct spi_master  *master;
u64 __iomem*base;
-   unsigned long   phys;
struct device  *dev;
 };
 
@@ -462,9 +461,8 @@ kp_spi_probe(struct platform_device *pldev)
goto free_master;
}
 
-   kpspi->phys = (unsigned long)devm_ioremap_nocache(&pldev->dev, r->start,
- resource_size(r));
-   kpspi->base = (u64 __iomem *)kpspi->phys;
+   kpspi->base = devm_ioremap_nocache(&pldev->dev, r->start,
+  resource_size(r));
 
status = spi_register_master(master);
if (status < 0) {
-- 
2.21.0



[PATCH 4/6] staging: kpc2000: kpc_spi: remove unnecessary struct member chip_select

2019-06-04 Thread Geordan Neukum
The structure kp_spi_controller_state, defined in the kpc2000_spi
driver, contains a member named chip_select which is never used after
initialization. Therefore, it should be removed for simplicity's sake.

Signed-off-by: Geordan Neukum 
---
 drivers/staging/kpc2000/kpc2000_spi.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/staging/kpc2000/kpc2000_spi.c 
b/drivers/staging/kpc2000/kpc2000_spi.c
index 61296335313b..07b0327d8bef 100644
--- a/drivers/staging/kpc2000/kpc2000_spi.c
+++ b/drivers/staging/kpc2000/kpc2000_spi.c
@@ -109,7 +109,6 @@ struct kp_spi {
 
 struct kp_spi_controller_state {
void __iomem   *base;
-   unsigned char   chip_select;
s64 conf_cache;
 };
 
@@ -267,7 +266,6 @@ kp_spi_setup(struct spi_device *spidev)
return -ENOMEM;
}
cs->base = kpspi->base;
-   cs->chip_select = spidev->chip_select;
cs->conf_cache = -1;
spidev->controller_state = cs;
}
-- 
2.21.0



[PATCH 0/6] staging: kpc2000: kpc_spi: Assorted minor fixups

2019-06-04 Thread Geordan Neukum
Primarily just a bunch of unused / unnecessarily used struct member
cleanup patches with the exception of one patch which removes an
unnecessary cast to a (void *) in a couple of functions.

Geordan Neukum (6):
  staging: kpc2000: kpc_spi: remove unnecessary struct member phys
  staging: kpc2000: kpc_spi: remove unnecessary struct member pin_dir
  staging: kpc2000: kpc_spi: remove unnecessary struct member word_len
  staging: kpc2000: kpc_spi: remove unnecessary struct member
chip_select
  staging: kpc2000: kpc_spi: remove unnecessary ulong repr of i/o addr
  staging: kpc2000: kpc_spi: remove unnecessary cast in
[read|write]_reg()

 drivers/staging/kpc2000/kpc2000_spi.c | 19 ---
 1 file changed, 4 insertions(+), 15 deletions(-)

-- 
2.21.0



[PATCH 2/6] staging: kpc2000: kpc_spi: remove unnecessary struct member pin_dir

2019-06-04 Thread Geordan Neukum
The structure kp_spi, defined in the kpc2000_spi driver, contains
a member named pin_dir which is never used after initialization.
Therefore, it should be removed for simplicity's sake.

Signed-off-by: Geordan Neukum 
---
 drivers/staging/kpc2000/kpc2000_spi.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/staging/kpc2000/kpc2000_spi.c 
b/drivers/staging/kpc2000/kpc2000_spi.c
index 20c396bcd904..1d89cb3b861f 100644
--- a/drivers/staging/kpc2000/kpc2000_spi.c
+++ b/drivers/staging/kpc2000/kpc2000_spi.c
@@ -105,7 +105,6 @@ struct kp_spi {
u64 __iomem*base;
unsigned long   phys;
struct device  *dev;
-   unsigned intpin_dir:1;
 };
 
 struct kp_spi_controller_state {
@@ -460,7 +459,6 @@ kp_spi_probe(struct platform_device *pldev)
if (pldev->id != -1) {
master->bus_num = pldev->id;
}
-   kpspi->pin_dir = 0;
 
r = platform_get_resource(pldev, IORESOURCE_MEM, 0);
if (r == NULL) {
-- 
2.21.0



[PATCH 6/6] staging: kpc2000: kpc_spi: remove unnecessary cast in [read|write]_reg()

2019-06-04 Thread Geordan Neukum
The kpc_spi driver unnecessarily casts from a (u64 __iomem *) to a (void
*) when invoking readq and writeq which both take a (void __iomem *) arg.
There is no need for this cast, and it actually harms us by discarding
the sparse cookie, __iomem. Make the driver stop performing this casting
operation.
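
To illustrate why dropping the cast helps, here is a minimal userspace sketch.
It assumes stand-in __iomem and __force macros modelled on the kernel's sparse
annotations (empty unless the sparse checker defines __CHECKER__), and
fake_readq() and read_reg() are hypothetical substitutes, not the kernel
accessors or the driver's functions.

```c
#include <stdint.h>

/* Stand-ins for the kernel annotations: only sparse (__CHECKER__) sees them. */
#ifdef __CHECKER__
# define __iomem __attribute__((noderef, address_space(2)))
# define __force __attribute__((force))
#else
# define __iomem
# define __force
#endif

/* Hypothetical accessor taking an annotated pointer, as readq() does. */
static uint64_t fake_readq(const volatile void __iomem *addr)
{
	/* The one deliberate, explicitly forced conversion lives here. */
	return *(const volatile uint64_t __force *)addr;
}

static uint64_t read_reg(uint64_t __iomem *base, int idx)
{
	uint64_t __iomem *addr = base + idx;

	/*
	 * No cast needed: the pointer converts to void __iomem * implicitly,
	 * so the annotation stays visible to the checker. Writing
	 * (void *)addr here would silently discard it.
	 */
	return fake_readq(addr);
}
```

In a normal compile the macros expand to nothing, so the sketch only shows the
shape of the call; the benefit appears when a checker that understands the
annotation is run over the code.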

Signed-off-by: Geordan Neukum 
---
 drivers/staging/kpc2000/kpc2000_spi.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/kpc2000/kpc2000_spi.c 
b/drivers/staging/kpc2000/kpc2000_spi.c
index 4f517afc6239..28132e9e260d 100644
--- a/drivers/staging/kpc2000/kpc2000_spi.c
+++ b/drivers/staging/kpc2000/kpc2000_spi.c
@@ -167,7 +167,7 @@ kp_spi_read_reg(struct kp_spi_controller_state *cs, int idx)
if ((idx == KP_SPI_REG_CONFIG) && (cs->conf_cache >= 0)){
return cs->conf_cache;
}
-   val = readq((void*)addr);
+   val = readq(addr);
return val;
 }
 
@@ -176,7 +176,7 @@ kp_spi_write_reg(struct kp_spi_controller_state *cs, int 
idx, u64 val)
 {
u64 __iomem *addr = cs->base;
addr += idx;
-   writeq(val, (void*)addr);
+   writeq(val, addr);
if (idx == KP_SPI_REG_CONFIG)
cs->conf_cache = val;
 }
-- 
2.21.0



[PATCH 1/6] staging: kpc2000: kpc_spi: remove unnecessary struct member phys

2019-06-04 Thread Geordan Neukum
The structure kp_spi_controller_state, defined in the kpc2000_spi
driver, contains a member named phys which is never used after
initialization. Therefore, it should be removed for simplicity's sake.

Signed-off-by: Geordan Neukum 
---
 drivers/staging/kpc2000/kpc2000_spi.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/staging/kpc2000/kpc2000_spi.c 
b/drivers/staging/kpc2000/kpc2000_spi.c
index 32d3ec532e26..20c396bcd904 100644
--- a/drivers/staging/kpc2000/kpc2000_spi.c
+++ b/drivers/staging/kpc2000/kpc2000_spi.c
@@ -110,7 +110,6 @@ struct kp_spi {
 
 struct kp_spi_controller_state {
void __iomem   *base;
-   unsigned long   phys;
unsigned char   chip_select;
int word_len;
s64 conf_cache;
@@ -270,7 +269,6 @@ kp_spi_setup(struct spi_device *spidev)
return -ENOMEM;
}
cs->base = kpspi->base;
-   cs->phys = kpspi->phys;
cs->chip_select = spidev->chip_select;
cs->word_len = spidev->bits_per_word;
cs->conf_cache = -1;
-- 
2.21.0



[RFC] Kernel Access to Ftrace instances.

2019-06-04 Thread Divya Indi
Hi, 
Please review the patches that follow. These include:

[PATCH 1/3] tracing: Relevant changes for kernel access to Ftrace instances.
[PATCH 2/3] tracing: Adding additional NULL checks.
[PATCH 3/3] tracing: Add 2 new funcs. for kernel access to Ftrace instances.

Let me know if you have any concerns or questions. 

A sample module demonstrating the use of the above functions will follow soon. 

Thanks,
Divya


[PATCH 2/3] tracing: Adding additional NULL checks.

2019-06-04 Thread Divya Indi
Now that we have exported certain functions providing access to Ftrace
instances from other kernel components, we are adding some additional
NULL checks to ensure safe usage by the users.

Signed-off-by: Divya Indi 
---
 kernel/trace/trace.c| 3 +++
 kernel/trace/trace_events.c | 2 ++
 2 files changed, 5 insertions(+)

diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 1c80521..a60dc13 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -3205,6 +3205,9 @@ int trace_array_printk(struct trace_array *tr,
if (!(global_trace.trace_flags & TRACE_ITER_PRINTK))
return 0;
 
+   if (!tr)
+   return -EINVAL;
+
va_start(ap, fmt);
ret = trace_array_vprintk(tr, ip, fmt, ap);
va_end(ap);
diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
index b6b4618..445b059 100644
--- a/kernel/trace/trace_events.c
+++ b/kernel/trace/trace_events.c
@@ -800,6 +800,8 @@ int ftrace_set_clr_event(struct trace_array *tr, char *buf, 
int set)
char *event = NULL, *sub = NULL, *match;
int ret;
 
+   if (!tr)
+   return -ENODEV;
/*
 * The buf format can be :
 *  *: means any event by that name.
-- 
1.8.3.1



[PATCH 1/3] tracing: Relevant changes for kernel access to Ftrace instances.

2019-06-04 Thread Divya Indi
For commit (f45d122): tracing: Kernel access to Ftrace instances.
We need the following additional changes to ensure other kernel components can
use these functions -
1) Remove the static keyword from the newly exported function ftrace_set_clr_event.
2) Add the required functions to the header file include/linux/trace_events.h.

Signed-off-by: Divya Indi 
---
 include/linux/trace_events.h | 6 ++
 kernel/trace/trace_events.c  | 2 +-
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/include/linux/trace_events.h b/include/linux/trace_events.h
index 8a62731..d7b7d85 100644
--- a/include/linux/trace_events.h
+++ b/include/linux/trace_events.h
@@ -539,6 +539,12 @@ extern int trace_define_field(struct trace_event_call 
*call, const char *type,
 
 #define is_signed_type(type)   (((type)(-1)) < (type)1)
 
+void trace_printk_init_buffers(void);
+int trace_array_printk(struct trace_array *tr, unsigned long ip,
+   const char *fmt, ...);
+struct trace_array *trace_array_create(const char *name);
+int trace_array_destroy(struct trace_array *tr);
+int ftrace_set_clr_event(struct trace_array *tr, char *buf, int set);
 int trace_set_clr_event(const char *system, const char *event, int set);
 
 /*
diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
index 0ce3db6..b6b4618 100644
--- a/kernel/trace/trace_events.c
+++ b/kernel/trace/trace_events.c
@@ -795,7 +795,7 @@ static int __ftrace_set_clr_event(struct trace_array *tr, 
const char *match,
return ret;
 }
 
-static int ftrace_set_clr_event(struct trace_array *tr, char *buf, int set)
+int ftrace_set_clr_event(struct trace_array *tr, char *buf, int set)
 {
char *event = NULL, *sub = NULL, *match;
int ret;
-- 
1.8.3.1



[PATCH 3/3] tracing: Add 2 new funcs. for kernel access to Ftrace instances.

2019-06-04 Thread Divya Indi
Adding 2 new functions -
1) trace_array_lookup : Look up and return a trace array, given its
name.
2) trace_array_set_clr_event : Enable/disable event recording to the
given trace array.
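
The lookup-then-use pattern these two functions enable can be modelled in
userspace C. This is only a sketch: struct instance, instance_lookup() and
instance_set_event() are hypothetical stand-ins for struct trace_array,
trace_array_lookup() and trace_array_set_clr_event(), and the sketch ignores
the locking and lifetime questions a real kernel caller would have to
consider.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a named tracing instance. */
struct instance {
	const char *name;	/* NULL for an unnamed top-level instance */
	int enabled;
	struct instance *next;
};

/* Walk the list and return the first instance with a matching name. */
static struct instance *instance_lookup(struct instance *head,
					const char *name)
{
	struct instance *it;

	for (it = head; it; it = it->next) {
		if (it->name && strcmp(it->name, name) == 0)
			return it;
	}
	return NULL;
}

/* Reject a failed lookup up front instead of dereferencing NULL. */
static int instance_set_event(struct instance *inst, int set)
{
	if (!inst)
		return -ENODEV;

	inst->enabled = set ? 1 : 0;
	return 0;
}
```

A caller can pass the result of the lookup straight to the setter; a name that
does not exist then surfaces as -ENODEV rather than as a crash.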

Signed-off-by: Divya Indi 
---
 include/linux/trace_events.h |  3 +++
 kernel/trace/trace.c | 11 +++
 kernel/trace/trace_events.c  | 22 ++
 3 files changed, 36 insertions(+)

diff --git a/include/linux/trace_events.h b/include/linux/trace_events.h
index d7b7d85..0cc99a8 100644
--- a/include/linux/trace_events.h
+++ b/include/linux/trace_events.h
@@ -545,7 +545,10 @@ int trace_array_printk(struct trace_array *tr, unsigned 
long ip,
 struct trace_array *trace_array_create(const char *name);
 int trace_array_destroy(struct trace_array *tr);
 int ftrace_set_clr_event(struct trace_array *tr, char *buf, int set);
+struct trace_array *trace_array_lookup(const char *name);
 int trace_set_clr_event(const char *system, const char *event, int set);
+int trace_array_set_clr_event(struct trace_array *tr, const char *system,
+   const char *event, int set);
 
 /*
  * The double __builtin_constant_p is because gcc will give us an error
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index a60dc13..1d171fd 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -8465,6 +8465,17 @@ static int instance_rmdir(const char *name)
return ret;
 }
 
+struct trace_array *trace_array_lookup(const char *name)
+{
+   struct trace_array *tr;
+   list_for_each_entry(tr, &ftrace_trace_arrays, list) {
+   if (tr->name && strcmp(tr->name, name) == 0)
+   return tr;
+   }
+   return NULL;
+}
+EXPORT_SYMBOL_GPL(trace_array_lookup);
+
 static __init void create_trace_instances(struct dentry *d_tracer)
 {
trace_instance_dir = tracefs_create_instance_dir("instances", d_tracer,
diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
index 445b059..c126d2c 100644
--- a/kernel/trace/trace_events.c
+++ b/kernel/trace/trace_events.c
@@ -859,6 +859,28 @@ int trace_set_clr_event(const char *system, const char 
*event, int set)
 }
 EXPORT_SYMBOL_GPL(trace_set_clr_event);
 
+/**
+ * trace_array_set_clr_event - enable or disable an event for a trace array
+ * @system: system name to match (NULL for any system)
+ * @event: event name to match (NULL for all events, within system)
+ * @set: 1 to enable, 0 to disable
+ *
+ * This is a way for other parts of the kernel to enable or disable
+ * event recording to instances.
+ *
+ * Returns 0 on success, -EINVAL if the parameters do not match any
+ * registered events.
+ */
+int trace_array_set_clr_event(struct trace_array *tr, const char *system,
+   const char *event, int set)
+{
+   if (!tr)
+   return -ENODEV;
+
+   return __ftrace_set_clr_event(tr, NULL, system, event, set);
+}
+EXPORT_SYMBOL_GPL(trace_array_set_clr_event);
+
 /* 128 should be much more than enough */
 #define EVENT_BUF_SIZE 127
 
-- 
1.8.3.1



Re: [PATCH 4/4] cpufreq: add driver for Raspbery Pi

2019-06-04 Thread Eric Anholt
Nicolas Saenz Julienne  writes:

> Raspberry Pi's firmware offers an interface through which to update its
> performance requirements. It allows us to request specific runtime
> frequencies, which the firmware might or might not respect, depending on
> the firmware configuration and thermals.
>
> As the maximum and minimum frequencies are configurable in the firmware
> there is no way to know in advance their values. So the Raspberry Pi
> cpufreq driver queries them, builds an opp frequency table to then
> launch cpufreq-dt.
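
A rough userspace sketch of the table-building step described above. The step
size FREQ_STEP_HZ, the function build_freq_table() and the sample limits are
assumptions made for illustration; the quoted text does not show how the
driver spaces its operating points.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical spacing between operating points. */
#define FREQ_STEP_HZ 100000000ULL

/*
 * Fill table[] with frequencies from min_hz up to max_hz (inclusive when
 * max_hz lands on a step) and return the number of entries written.
 */
static size_t build_freq_table(uint64_t min_hz, uint64_t max_hz,
			       uint64_t *table, size_t cap)
{
	size_t n = 0;
	uint64_t rate;

	if (min_hz > max_hz)
		return 0;	/* limits reported in the wrong order */

	for (rate = min_hz; rate <= max_hz && n < cap; rate += FREQ_STEP_HZ)
		table[n++] = rate;

	return n;
}
```

With limits of 600 MHz and 1.4 GHz this produces nine entries; the limits
themselves would come from the firmware query at probe time.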
>
> Signed-off-by: Nicolas Saenz Julienne 
> ---
>
> Changes since RFC:
>   - Alphabetically ordered relevant stuff
>   - Updated Kconfig to select firmware interface
>   - Correctly unref clk_dev after use
>   - Remove all opps on failure
>   - Remove use of dev_pm_opp_set_sharing_cpus()
>
>  drivers/cpufreq/Kconfig.arm   |  8 +++
>  drivers/cpufreq/Makefile  |  1 +
>  drivers/cpufreq/raspberrypi-cpufreq.c | 84 +++
>  3 files changed, 93 insertions(+)
>  create mode 100644 drivers/cpufreq/raspberrypi-cpufreq.c
>
> diff --git a/drivers/cpufreq/Kconfig.arm b/drivers/cpufreq/Kconfig.arm
> index f8129edc145e..556d432cc826 100644
> --- a/drivers/cpufreq/Kconfig.arm
> +++ b/drivers/cpufreq/Kconfig.arm
> @@ -133,6 +133,14 @@ config ARM_QCOM_CPUFREQ_HW
> The driver implements the cpufreq interface for this HW engine.
> Say Y if you want to support CPUFreq HW.
>  
> +config ARM_RASPBERRYPI_CPUFREQ
> + tristate "Raspberry Pi cpufreq support"
> + select RASPBERRYPI_FIRMWARE
> + help
> +   This adds the CPUFreq driver for Raspberry Pi
> +
> +   If in doubt, say N.
> +
>  config ARM_S3C_CPUFREQ
>   bool
>   help
> diff --git a/drivers/cpufreq/Makefile b/drivers/cpufreq/Makefile
> index 689b26c6f949..121c1acb66c0 100644
> --- a/drivers/cpufreq/Makefile
> +++ b/drivers/cpufreq/Makefile
> @@ -64,6 +64,7 @@ obj-$(CONFIG_ARM_PXA2xx_CPUFREQ)+= pxa2xx-cpufreq.o
>  obj-$(CONFIG_PXA3xx) += pxa3xx-cpufreq.o
>  obj-$(CONFIG_ARM_QCOM_CPUFREQ_HW)+= qcom-cpufreq-hw.o
>  obj-$(CONFIG_ARM_QCOM_CPUFREQ_KRYO)  += qcom-cpufreq-kryo.o
> +obj-$(CONFIG_ARM_RASPBERRYPI_CPUFREQ)+= raspberrypi-cpufreq.o
>  obj-$(CONFIG_ARM_S3C2410_CPUFREQ)+= s3c2410-cpufreq.o
>  obj-$(CONFIG_ARM_S3C2412_CPUFREQ)+= s3c2412-cpufreq.o
>  obj-$(CONFIG_ARM_S3C2416_CPUFREQ)+= s3c2416-cpufreq.o
> diff --git a/drivers/cpufreq/raspberrypi-cpufreq.c 
> b/drivers/cpufreq/raspberrypi-cpufreq.c
> new file mode 100644
> index ..2b3a195a9d37
> --- /dev/null
> +++ b/drivers/cpufreq/raspberrypi-cpufreq.c
> @@ -0,0 +1,84 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Raspberry Pi cpufreq driver
> + *
> + * Copyright (C) 2019, Nicolas Saenz Julienne 
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +static const struct of_device_id machines[] __initconst = {
> + { .compatible = "raspberrypi,3-model-b-plus" },
> + { .compatible = "raspberrypi,3-model-b" },
> + { .compatible = "raspberrypi,2-model-b" },
> + { /* sentinel */ }
> +};

I think I'd skip the compatible string check here.  The firmware's
clock-management should be well-tested by folks playing with clocking in
the downstream tree.  There aren't any firmware differences in the
processing of these clock management packets, to my recollection.

Other than that, I'm happy with the series and would give it my
acked-by.




KMSAN: uninit-value in i2c_w

2019-06-04 Thread syzbot

Hello,

syzbot found the following crash on:

HEAD commit:f75e4cfe kmsan: use kmsan_handle_urb() in urb.c
git tree:   kmsan
console output: https://syzkaller.appspot.com/x/log.txt?x=1514cdaaa0
kernel config:  https://syzkaller.appspot.com/x/.config?x=602468164ccdc30a
dashboard link: https://syzkaller.appspot.com/bug?extid=397fd082ce5143e2f67d
compiler:   clang version 9.0.0 (/home/glider/llvm/clang  
06d00afa61eef8f7f501ebdb4e8612ea43ec2d78)

syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=12e7a54aa0
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=17ab35dea0

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+397fd082ce5143e2f...@syzkaller.appspotmail.com

usb 1-1: New USB device found, idVendor=06a2, idProduct=6810,  
bcdDevice=1b.af

usb 1-1: New USB device strings: Mfr=0, Product=0, SerialNumber=0
usb 1-1: string descriptor 0 read error: -71
gspca_main: gspca_topro-2.14.0 probing 06a2:6810
gspca_topro: reg_w err -71
==
BUG: KMSAN: uninit-value in i2c_w+0xb7a/0xd70  
drivers/media/usb/gspca/topro.c:1043

CPU: 1 PID: 3338 Comm: kworker/1:2 Not tainted 5.1.0+ #1
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS  
Google 01/01/2011

Workqueue: usb_hub_wq hub_event
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x191/0x1f0 lib/dump_stack.c:113
 kmsan_report+0x130/0x2a0 mm/kmsan/kmsan.c:622
 __msan_warning+0x75/0xe0 mm/kmsan/kmsan_instr.c:310
 i2c_w+0xb7a/0xd70 drivers/media/usb/gspca/topro.c:1043
 probe_6810 drivers/media/usb/gspca/topro.c:1126 [inline]
 sd_init+0xc05/0x7ca0 drivers/media/usb/gspca/topro.c:4081
 gspca_dev_probe2+0xee0/0x2240 drivers/media/usb/gspca/gspca.c:1546
 gspca_dev_probe+0x346/0x3b0 drivers/media/usb/gspca/gspca.c:1619
 sd_probe+0x8d/0xa0 drivers/media/usb/gspca/gl860/gl860.c:523
 usb_probe_interface+0xd66/0x1320 drivers/usb/core/driver.c:361
 really_probe+0xdae/0x1d80 drivers/base/dd.c:513
 driver_probe_device+0x1b3/0x4f0 drivers/base/dd.c:671
 __device_attach_driver+0x5b8/0x790 drivers/base/dd.c:778
 bus_for_each_drv+0x28e/0x3b0 drivers/base/bus.c:454
 __device_attach+0x454/0x730 drivers/base/dd.c:844
 device_initial_probe+0x4a/0x60 drivers/base/dd.c:891
 bus_probe_device+0x137/0x390 drivers/base/bus.c:514
 device_add+0x288d/0x30e0 drivers/base/core.c:2106
 usb_set_configuration+0x30dc/0x3750 drivers/usb/core/message.c:2027
 generic_probe+0xe7/0x280 drivers/usb/core/generic.c:210
 usb_probe_device+0x14c/0x200 drivers/usb/core/driver.c:266
 really_probe+0xdae/0x1d80 drivers/base/dd.c:513
 driver_probe_device+0x1b3/0x4f0 drivers/base/dd.c:671
 __device_attach_driver+0x5b8/0x790 drivers/base/dd.c:778
 bus_for_each_drv+0x28e/0x3b0 drivers/base/bus.c:454
 __device_attach+0x454/0x730 drivers/base/dd.c:844
 device_initial_probe+0x4a/0x60 drivers/base/dd.c:891
 bus_probe_device+0x137/0x390 drivers/base/bus.c:514
 device_add+0x288d/0x30e0 drivers/base/core.c:2106
 usb_new_device+0x23e5/0x2ff0 drivers/usb/core/hub.c:2534
 hub_port_connect drivers/usb/core/hub.c:5089 [inline]
 hub_port_connect_change drivers/usb/core/hub.c:5204 [inline]
 port_event drivers/usb/core/hub.c:5350 [inline]
 hub_event+0x48d1/0x7290 drivers/usb/core/hub.c:5432
 process_one_work+0x1572/0x1f00 kernel/workqueue.c:2269
 worker_thread+0x111b/0x2460 kernel/workqueue.c:2415
 kthread+0x4b5/0x4f0 kernel/kthread.c:254
 ret_from_fork+0x35/0x40 arch/x86/entry/entry_64.S:355

Uninit was created at:
 kmsan_save_stack_with_flags mm/kmsan/kmsan.c:208 [inline]
 kmsan_internal_poison_shadow+0x92/0x150 mm/kmsan/kmsan.c:162
 kmsan_kmalloc+0xa4/0x130 mm/kmsan/kmsan_hooks.c:175
 kmem_cache_alloc_trace+0x503/0xae0 mm/slub.c:2801
 kmalloc include/linux/slab.h:547 [inline]
 gspca_dev_probe2+0x30c/0x2240 drivers/media/usb/gspca/gspca.c:1480
 gspca_dev_probe+0x346/0x3b0 drivers/media/usb/gspca/gspca.c:1619
 sd_probe+0x8d/0xa0 drivers/media/usb/gspca/gl860/gl860.c:523
 usb_probe_interface+0xd66/0x1320 drivers/usb/core/driver.c:361
 really_probe+0xdae/0x1d80 drivers/base/dd.c:513
 driver_probe_device+0x1b3/0x4f0 drivers/base/dd.c:671
 __device_attach_driver+0x5b8/0x790 drivers/base/dd.c:778
 bus_for_each_drv+0x28e/0x3b0 drivers/base/bus.c:454
 __device_attach+0x454/0x730 drivers/base/dd.c:844
 device_initial_probe+0x4a/0x60 drivers/base/dd.c:891
 bus_probe_device+0x137/0x390 drivers/base/bus.c:514
 device_add+0x288d/0x30e0 drivers/base/core.c:2106
 usb_set_configuration+0x30dc/0x3750 drivers/usb/core/message.c:2027
 generic_probe+0xe7/0x280 drivers/usb/core/generic.c:210
 usb_probe_device+0x14c/0x200 drivers/usb/core/driver.c:266
 really_probe+0xdae/0x1d80 drivers/base/dd.c:513
 driver_probe_device+0x1b3/0x4f0 drivers/base/dd.c:671
 __device_attach_driver+0x5b8/0x790 drivers/base/dd.c:778
 bus_for_each_drv+0x28e/0x3b0 drivers/base/bus.c:454
 __device_attach+0x454/0x730 drivers/base/dd.c:844
 device_initial_probe+0x4a/0x60 drivers/base/dd.c:891
 

KMSAN: uninit-value in sd_init

2019-06-04 Thread syzbot

Hello,

syzbot found the following crash on:

HEAD commit:f75e4cfe kmsan: use kmsan_handle_urb() in urb.c
git tree:   kmsan
console output: https://syzkaller.appspot.com/x/log.txt?x=17eadebaa0
kernel config:  https://syzkaller.appspot.com/x/.config?x=602468164ccdc30a
dashboard link: https://syzkaller.appspot.com/bug?extid=1a35278dd0ebfb3a038a
compiler:   clang version 9.0.0 (/home/glider/llvm/clang  
06d00afa61eef8f7f501ebdb4e8612ea43ec2d78)

syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=147f4136a0
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=17aec4f2a0

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+1a35278dd0ebfb3a0...@syzkaller.appspotmail.com

usb 1-1: config 0 has an invalid interface number: 142 but max is 0
usb 1-1: config 0 has no interface number 0
usb 1-1: New USB device found, idVendor=08ca, idProduct=2018,  
bcdDevice=95.4a

usb 1-1: New USB device strings: Mfr=0, Product=0, SerialNumber=0
usb 1-1: config 0 descriptor??
gspca_main: sunplus-2.14.0 probing 08ca:2018
gspca_sunplus: reg_w_riv err -71
==
BUG: KMSAN: uninit-value in spca504B_PollingDataReady  
drivers/media/usb/gspca/sunplus.c:409 [inline]
BUG: KMSAN: uninit-value in sd_init+0x5b6f/0x5e60  
drivers/media/usb/gspca/sunplus.c:643

CPU: 0 PID: 3902 Comm: kworker/0:2 Not tainted 5.1.0+ #1
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS  
Google 01/01/2011

Workqueue: usb_hub_wq hub_event
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x191/0x1f0 lib/dump_stack.c:113
 kmsan_report+0x130/0x2a0 mm/kmsan/kmsan.c:622
 __msan_warning+0x75/0xe0 mm/kmsan/kmsan_instr.c:310
 spca504B_PollingDataReady drivers/media/usb/gspca/sunplus.c:409 [inline]
 sd_init+0x5b6f/0x5e60 drivers/media/usb/gspca/sunplus.c:643
 gspca_dev_probe2+0xee0/0x2240 drivers/media/usb/gspca/gspca.c:1546
 gspca_dev_probe+0x346/0x3b0 drivers/media/usb/gspca/gspca.c:1619
 sd_probe+0x8d/0xa0 drivers/media/usb/gspca/gl860/gl860.c:523
 usb_probe_interface+0xd66/0x1320 drivers/usb/core/driver.c:361
 really_probe+0xdae/0x1d80 drivers/base/dd.c:513
 driver_probe_device+0x1b3/0x4f0 drivers/base/dd.c:671
 __device_attach_driver+0x5b8/0x790 drivers/base/dd.c:778
 bus_for_each_drv+0x28e/0x3b0 drivers/base/bus.c:454
 __device_attach+0x454/0x730 drivers/base/dd.c:844
 device_initial_probe+0x4a/0x60 drivers/base/dd.c:891
 bus_probe_device+0x137/0x390 drivers/base/bus.c:514
 device_add+0x288d/0x30e0 drivers/base/core.c:2106
 usb_set_configuration+0x30dc/0x3750 drivers/usb/core/message.c:2027
 generic_probe+0xe7/0x280 drivers/usb/core/generic.c:210
 usb_probe_device+0x14c/0x200 drivers/usb/core/driver.c:266
 really_probe+0xdae/0x1d80 drivers/base/dd.c:513
 driver_probe_device+0x1b3/0x4f0 drivers/base/dd.c:671
 __device_attach_driver+0x5b8/0x790 drivers/base/dd.c:778
 bus_for_each_drv+0x28e/0x3b0 drivers/base/bus.c:454
 __device_attach+0x454/0x730 drivers/base/dd.c:844
 device_initial_probe+0x4a/0x60 drivers/base/dd.c:891
 bus_probe_device+0x137/0x390 drivers/base/bus.c:514
 device_add+0x288d/0x30e0 drivers/base/core.c:2106
 usb_new_device+0x23e5/0x2ff0 drivers/usb/core/hub.c:2534
 hub_port_connect drivers/usb/core/hub.c:5089 [inline]
 hub_port_connect_change drivers/usb/core/hub.c:5204 [inline]
 port_event drivers/usb/core/hub.c:5350 [inline]
 hub_event+0x48d1/0x7290 drivers/usb/core/hub.c:5432
 process_one_work+0x1572/0x1f00 kernel/workqueue.c:2269
 worker_thread+0x111b/0x2460 kernel/workqueue.c:2415
 kthread+0x4b5/0x4f0 kernel/kthread.c:254
 ret_from_fork+0x35/0x40 arch/x86/entry/entry_64.S:355

Uninit was created at:
 kmsan_save_stack_with_flags mm/kmsan/kmsan.c:208 [inline]
 kmsan_internal_poison_shadow+0x92/0x150 mm/kmsan/kmsan.c:162
 kmsan_kmalloc+0xa4/0x130 mm/kmsan/kmsan_hooks.c:175
 kmem_cache_alloc_trace+0x503/0xae0 mm/slub.c:2801
 kmalloc include/linux/slab.h:547 [inline]
 gspca_dev_probe2+0x30c/0x2240 drivers/media/usb/gspca/gspca.c:1480
 gspca_dev_probe+0x346/0x3b0 drivers/media/usb/gspca/gspca.c:1619
 sd_probe+0x8d/0xa0 drivers/media/usb/gspca/gl860/gl860.c:523
 usb_probe_interface+0xd66/0x1320 drivers/usb/core/driver.c:361
 really_probe+0xdae/0x1d80 drivers/base/dd.c:513
 driver_probe_device+0x1b3/0x4f0 drivers/base/dd.c:671
 __device_attach_driver+0x5b8/0x790 drivers/base/dd.c:778
 bus_for_each_drv+0x28e/0x3b0 drivers/base/bus.c:454
 __device_attach+0x454/0x730 drivers/base/dd.c:844
 device_initial_probe+0x4a/0x60 drivers/base/dd.c:891
 bus_probe_device+0x137/0x390 drivers/base/bus.c:514
 device_add+0x288d/0x30e0 drivers/base/core.c:2106
 usb_set_configuration+0x30dc/0x3750 drivers/usb/core/message.c:2027
 generic_probe+0xe7/0x280 drivers/usb/core/generic.c:210
 usb_probe_device+0x14c/0x200 drivers/usb/core/driver.c:266
 really_probe+0xdae/0x1d80 drivers/base/dd.c:513
 driver_probe_device+0x1b3/0x4f0 drivers/base/dd.c:671
 __device_attach_driver+0x5b8/0x790 

Re: [PATCH 1/4] clk: bcm2835: remove pllb

2019-06-04 Thread Eric Anholt
Nicolas Saenz Julienne  writes:

> Raspberry Pi's firmware controls this pll, we should use the firmware
> interface to access it.
>
> Signed-off-by: Nicolas Saenz Julienne 

Acked-by: Eric Anholt 

If someone ever has a non-rpi 2835 to support, they can resurrect this.




Re: [PATCH 3/4] clk: bcm2835: register Raspberry Pi's firmware clk device

2019-06-04 Thread Eric Anholt
Nicolas Saenz Julienne  writes:

> Registers clk-raspberrypi as a platform device as part of the driver's
> probe sequence.

Similar to how we have VCHI register platform devices for the services
VCHI provides, shouldn't we have the firmware driver register the device
for clk_raspberrypi?  Or put the clk provider in the fw driver instead
of a separate driver (no opinion on my part).




Re: [RFC 1/6] rcu: Add support for consolidated-RCU reader checking

2019-06-04 Thread Joel Fernandes
On Tue, Jun 04, 2019 at 04:01:00PM +0200, Rasmus Villemoes wrote:
> On 02/06/2019 00.27, Joel Fernandes (Google) wrote:
> > This patch adds support for checking RCU reader sections in list
> > traversal macros. Optionally, if the list macro is called under SRCU or
> > other lock/mutex protection, then appropriate lockdep expressions can be
> > passed to make the checks pass.
> > 
> > Existing list_for_each_entry_rcu() invocations don't need to pass the
> > optional fourth argument (cond) unless they are under some non-RCU
> > protection and need to make the lockdep check pass.
> > 
> > Signed-off-by: Joel Fernandes (Google) 
> > ---
> >  include/linux/rculist.h  | 40 
> >  include/linux/rcupdate.h |  7 +++
> >  kernel/rcu/update.c  | 26 ++
> >  3 files changed, 69 insertions(+), 4 deletions(-)
> > 
> > diff --git a/include/linux/rculist.h b/include/linux/rculist.h
> > index e91ec9ddcd30..b641fdd9f1a2 100644
> > --- a/include/linux/rculist.h
> > +++ b/include/linux/rculist.h
> > @@ -40,6 +40,25 @@ static inline void INIT_LIST_HEAD_RCU(struct list_head 
> > *list)
> >   */
> >  #define list_next_rcu(list)(*((struct list_head __rcu 
> > **)(&(list)->next)))
> >  
> > +/*
> > + * Check during list traversal that we are within an RCU reader
> > + */
> > +#define __list_check_rcu() \
> > +   RCU_LOCKDEP_WARN(!rcu_read_lock_any_held(), \
> > +"RCU-list traversed in non-reader section!")
> > +
> > +static inline void __list_check_rcu_cond(int dummy, ...)
> > +{
> > +   va_list ap;
> > +   int cond;
> > +
> > +   va_start(ap, dummy);
> > +   cond = va_arg(ap, int);
> > +   va_end(ap);
> > +
> > +   RCU_LOCKDEP_WARN(!cond && !rcu_read_lock_any_held(),
> > +"RCU-list traversed in non-reader section!");
> > +}
> >  /*
> >   * Insert a new entry between two known consecutive entries.
> >   *
> > @@ -338,6 +357,9 @@ static inline void list_splice_tail_init_rcu(struct 
> > list_head *list,
> >   member) : NULL; \
> >  })
> >  
> > +#define SIXTH_ARG(a1, a2, a3, a4, a5, a6, ...) a6
> > +#define COUNT_VARGS(...) SIXTH_ARG(dummy, ## __VA_ARGS__, 4, 3, 2, 1, 0)
> > +
> >  /**
> >   * list_for_each_entry_rcu -   iterate over rcu list of given type
> >   * @pos:   the type * to use as a loop cursor.
> > @@ -348,9 +370,14 @@ static inline void list_splice_tail_init_rcu(struct 
> > list_head *list,
> >   * the _rcu list-mutation primitives such as list_add_rcu()
> >   * as long as the traversal is guarded by rcu_read_lock().
> >   */
> > -#define list_for_each_entry_rcu(pos, head, member) \
> > -   for (pos = list_entry_rcu((head)->next, typeof(*pos), member); \
> > -   &pos->member != (head); \
> > +#define list_for_each_entry_rcu(pos, head, member, cond...)   \
> > +   if (COUNT_VARGS(cond) != 0) {   \
> > +   __list_check_rcu_cond(0, ## cond);  \
> > +   } else {\
> > +   __list_check_rcu(); \
> > +   }   \
> > +   for (pos = list_entry_rcu((head)->next, typeof(*pos), member);  \
> > +   &pos->member != (head); \
> > pos = list_entry_rcu(pos->member.next, typeof(*pos), member))
> 
> Wouldn't something as simple as
> 
> #define __list_check_rcu(dummy, cond, ...) \
>RCU_LOCKDEP_WARN(!cond && !rcu_read_lock_any_held(), \
>"RCU-list traversed in non-reader section!");
> 
> for ( ({ __list_check_rcu(junk, ##cond, 0); }), pos = ... )
> 
> work just as well (i.e., no need for two list_check_rcu and
> list_check_rcu_cond variants)? If there's an optional cond, we use that,
> if not, we pick the trailing 0, so !cond disappears and it reduces to
> your __list_check_rcu(). Moreover, this ensures the RCU_LOCKDEP_WARN
> expansion actually picks up the __LINE__ and __FILE__ where the for loop
> is used, and not the __FILE__ and __LINE__ of the static inline function
> from the header file. It also makes it a bit more type safe/type generic
> (if the cond expression happened to have type long or u64 something
> rather odd could happen with the inline vararg function).

This is much better. I will do it this way. Thank you!

 - Joel
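
For anyone following the thread, the trailing-0 trick Rasmus describes can
be tried in userspace. The following is only a sketch under stated
assumptions: FAKE_LOCKDEP_WARN, in_reader, warnings, list_check and
traverse_checked are hypothetical stand-ins (not kernel API), and it relies
on the same GNU extensions as the quoted macro (named variadic parameters
and ", ##" comma deletion).

```c
#include <stdio.h>

/* Hypothetical stand-in for rcu_read_lock_any_held(). */
static int in_reader;
/* Counts how often the fake lockdep warning fired. */
static int warnings;

/* Hypothetical stand-in for RCU_LOCKDEP_WARN(): being a macro, it
 * reports the __FILE__/__LINE__ of the place where it is expanded. */
#define FAKE_LOCKDEP_WARN(c, msg)					\
	do {								\
		if (c) {						\
			warnings++;					\
			fprintf(stderr, "%s:%d: %s\n",			\
				__FILE__, __LINE__, msg);		\
		}							\
	} while (0)

/* The second argument is either the caller's cond or the trailing 0. */
#define list_check(dummy, cond, ...)					\
	FAKE_LOCKDEP_WARN(!(cond) && !in_reader,			\
			  "list traversed in non-reader section")

/* With no cond, ", ##cond" drops its comma and the trailing 0 becomes
 * the condition; with a cond, the 0 lands in the ignored "..." slot. */
#define traverse_checked(name, cond...)					\
	do {								\
		(void)(name);						\
		list_check(junk, ##cond, 0);				\
	} while (0)
```

Calling traverse_checked("x") outside a reader section bumps the warning
counter, while traverse_checked("x", 1) does not, because the caller's
condition takes the place of the default 0.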



[mm/vmalloc.c] 728e0fbf26: kernel_BUG_at_mm/vmalloc.c

2019-06-04 Thread kernel test robot

FYI, we noticed the following commit (built with gcc-7):

commit: 728e0fbf263e3ed359c10cb13623390564102881 ("mm/vmalloc.c: get rid of one 
single unlink_va() when merge")
https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master

in testcase: boot

on test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 2G

caused below changes (please refer to attached dmesg/kmsg for entire 
log/backtrace):


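+-------------------------------------------------+------------+------------+
|                                                 | 1ed20f4bc2 | 728e0fbf26 |
+-------------------------------------------------+------------+------------+
| boot_successes                                  | 0          | 0          |
| boot_failures                                   | 6          | 14         |
| BUG:kernel_reboot-without-warning_in_test_stage | 6          |            |
| kernel_BUG_at_mm/vmalloc.c                      | 0          | 14         |
| invalid_opcode:#[##]                            | 0          | 14         |
| RIP:__free_vmap_area                            | 0          | 14         |
| Kernel_panic-not_syncing:Fatal_exception        | 0          | 14         |
+-------------------------------------------------+------------+------------+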


If you fix the issue, kindly add following tag
Reported-by: kernel test robot 


[2.860248] kernel BUG at mm/vmalloc.c:470!
[2.863532] invalid opcode:  [#1] SMP PTI
[2.865038] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 
5.2.0-rc2-00418-g728e0fbf263e3 #2
[2.867517] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
1.10.2-1 04/01/2014
[2.869603] RIP: 0010:__free_vmap_area+0xab/0x314
[2.869603] Code: 4d e0 48 39 f0 73 0f 48 39 d1 72 0a 4c 8d 75 10 48 8b 4d 
10 eb 16 48 39 f0 72 0f 48 39 d1 73 0a 4c 8d 75 08 48 8b 4d 08 eb 02 <0f> 0b 48 
85 c9 75 c6 48 85 ed 49 89 ef 0f 84 27 02 00 00 48 8d 4d
[2.876280] RSP: :c9327d00 EFLAGS: 00010287
[2.876280] RAX: c900019e8000 RBX: 88806dbc9790 RCX: 88806dbc98f0
[2.876280] RDX: c900019ed000 RSI: c90001a0 RDI: 88806d426d88
[2.876280] RBP: 88806dbc9a18 R08: 0001 R09: 8129d4c2
[2.884274] R10: ea0001b47880 R11: f080 R12: 8000
[2.884274] R13: 88806dbc9630 R14: 88806dbc9760 R15: 
[2.884274] FS:  () GS:88807cd0() 
knlGS:
[2.884274] CS:  0010 DS:  ES:  CR0: 80050033
[2.892282] CR2: c93bc000 CR3: 0260a000 CR4: 000406e0
[2.892282] Call Trace:
[2.892282]  ? kmem_cache_free+0x140/0x1f5
[2.892282]  __purge_vmap_area_lazy+0x8f/0xdf
[2.892282]  _vm_unmap_aliases+0x110/0x13d
[2.900279]  change_page_attr_set_clr+0xc7/0x253
[2.900279]  ? set_debug_rodata+0x11/0x11
[2.900279]  set_memory_nx+0x35/0x38
[2.900279]  free_init_pages+0x54/0x7f
[2.900279]  ? do_name+0x2b1/0x2b1
[2.900279]  populate_rootfs+0xe2/0x101
[2.908291]  do_one_initcall+0x97/0x1b4
[2.908291]  kernel_init_freeable+0x23b/0x2d4
[2.908291]  ? rest_init+0xc6/0xc6
[2.908291]  kernel_init+0xa/0xff
[2.908291]  ret_from_fork+0x3a/0x50
[2.908291] Modules linked in:
[2.917205] ---[ end trace 1a2925ea0cc5d2c3 ]---


To reproduce:

# build kernel
cd linux
cp config-5.2.0-rc2-00418-g728e0fbf263e3 .config
make HOSTCC=gcc-7 CC=gcc-7 ARCH=x86_64 olddefconfig
make HOSTCC=gcc-7 CC=gcc-7 ARCH=x86_64 prepare
make HOSTCC=gcc-7 CC=gcc-7 ARCH=x86_64 modules_prepare
make HOSTCC=gcc-7 CC=gcc-7 ARCH=x86_64 SHELL=/bin/bash
make HOSTCC=gcc-7 CC=gcc-7 ARCH=x86_64 bzImage


git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp qemu -k  job-script # job-script is attached in this email



Thanks,
lkp

#
# Automatically generated file; DO NOT EDIT.
# Linux/x86_64 5.2.0-rc2 Kernel Configuration
#

#
# Compiler: gcc-7 (Debian 7.4.0-6) 7.4.0
#
CONFIG_CC_IS_GCC=y
CONFIG_GCC_VERSION=70400
CONFIG_CLANG_VERSION=0
CONFIG_CC_HAS_ASM_GOTO=y
CONFIG_CC_HAS_WARN_MAYBE_UNINITIALIZED=y
CONFIG_CC_DISABLE_WARN_MAYBE_UNINITIALIZED=y
CONFIG_IRQ_WORK=y
CONFIG_BUILDTIME_EXTABLE_SORT=y
CONFIG_THREAD_INFO_IN_TASK=y

#
# General setup
#
CONFIG_INIT_ENV_ARG_LIMIT=32
# CONFIG_COMPILE_TEST is not set
CONFIG_LOCALVERSION=""
CONFIG_LOCALVERSION_AUTO=y
CONFIG_BUILD_SALT=""
CONFIG_HAVE_KERNEL_GZIP=y
CONFIG_HAVE_KERNEL_BZIP2=y
CONFIG_HAVE_KERNEL_LZMA=y
CONFIG_HAVE_KERNEL_XZ=y
CONFIG_HAVE_KERNEL_LZO=y
CONFIG_HAVE_KERNEL_LZ4=y
CONFIG_KERNEL_GZIP=y
# CONFIG_KERNEL_BZIP2 is not set
# CONFIG_KERNEL_LZMA is not set
# CONFIG_KERNEL_XZ is not set
# CONFIG_KERNEL_LZO is not set
# CONFIG_KERNEL_LZ4 is not set
CONFIG_DEFAULT_HOSTNAME="(none)"
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
CONFIG_SYSVIPC_SYSCTL=y
CONFIG_POSIX_MQUEUE=y
CONFIG_POSIX_MQUEUE_SYSCTL=y
CONFIG_CROSS_MEMORY_ATTACH=y
CONFIG_USELIB=y

Re: [PATCH] signal: remove the wrong signal_pending() check in restore_user_sigmask()

2019-06-04 Thread Eric W. Biederman
Linus Torvalds  writes:

> On Tue, Jun 4, 2019 at 6:41 AM Oleg Nesterov  wrote:
>>
>> This is the minimal fix for stable, I'll send cleanups later.
>
> Ugh. I think this is correct, but I wish we had a better and more
> intuitive interface.
>
> In particular, since restore_user_sigmask() basically wants to check
> for "signal_pending()" anyway (to decide if the mask should be
> restored by signal handling or by that function), I really get the
> feeling that a lot of these patterns like

Linus, that check for signal_pending() in restore_user_sigmask() is the
bug that caused the regression.

>> -   restore_user_sigmask(ksig.sigmask, );
>> -   if (signal_pending(current) && !ret)
>> +
>> +   interrupted = signal_pending(current);
>> +   restore_user_sigmask(ksig.sigmask, , interrupted);
>> +   if (interrupted && !ret)
>> ret = -ERESTARTNOHAND;
>
> are wrong to begin with, and we really should aim for an interface
> which says "tell me whether you completed the system call, and I'll
> give you an error return if not".

The pattern you are pointing out is specific to io_pgetevents and its
variations.  It does look buggy to me but not for the reason you point
out, but instead because it does not appear to let a pending signal
cause io_pgetevents to return early.

I suspect we should fix that and have do_io_getevents return
-EINTR or -ERESTARTNOHAND like everyone else.

The concept of interrupted (aka return -EINTR to userspace) is truly
fundamental to the current semantics.  We effectively put a normally
blocked signal that was triggered back if we won't be returning -EINTR
to userspace.

> How about we make restore_user_sigmask() take two return codes: the
> 'ret' we already have, and the return we would get if there is a
> signal pending and we're currently returning zero.
>
> IOW, I think the above could become
>
> ret = restore_user_sigmask(ksig.sigmask, , ret, 
> -ERESTARTHAND);
>
> instead if we just made the right interface decision.
>
> Hmm?

At best I think that is a cleanup that will complicate creating a simple,
straightforward regression fix.

Unless I am misreading things that is optimizing the interface for
dealing with broken code.

So can we please get this fix in and then look at cleaning up and
simplifying this code?

Eric

p.s. A rather compelling cleanup is to:

- Leave the signal mask alone.
- Register with signalfd_wqh for wake ups.
- Have a helper

   int signal_pending_sigmask(sigset_t *blocked)
   {
	struct task_struct *tsk = current;
	int ret = 0;
	spin_lock_irq(&tsk->sighand->siglock);
	if (next_signal(&tsk->pending, blocked) ||
	    next_signal(&tsk->signal->pending, blocked)) {
		ret = -ERESTARTHAND;
		if (!sigequalsets(&tsk->blocked, blocked)) {
			tsk->saved_sigmask = tsk->blocked;
			__set_task_blocked(tsk, blocked);
			set_restore_sigmask();
		}
	}
	spin_unlock_irq(&tsk->sighand->siglock);
	return ret;
   }
  
- Use that helper instead of signal_pending() in the various
  sleep functions.
- Possibly get the signal mask from tsk instead of passing it into
  all of the helpers.

Eric


[PATCH v2 2/2] x86/asm: Pin sensitive CR0 bits

2019-06-04 Thread Kees Cook
With sensitive CR4 bits pinned now, it's possible that the WP bit for
CR0 might become a target as well. Following the same reasoning for
the CR4 pinning, this pins CR0's WP bit (but this can be done with a
static value).

Suggested-by: Peter Zijlstra 
Signed-off-by: Kees Cook 
---
 arch/x86/include/asm/special_insns.h | 17 -
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/special_insns.h 
b/arch/x86/include/asm/special_insns.h
index 284a77d52fea..9c9fd3760079 100644
--- a/arch/x86/include/asm/special_insns.h
+++ b/arch/x86/include/asm/special_insns.h
@@ -31,7 +31,22 @@ static inline unsigned long native_read_cr0(void)
 
 static inline void native_write_cr0(unsigned long val)
 {
-   asm volatile("mov %0,%%cr0": : "r" (val), "m" (__force_order));
+   unsigned long bits_missing = 0;
+
+set_register:
+   if (static_branch_likely(&cr_pinning))
+   val |= X86_CR0_WP;
+
+   asm volatile("mov %0,%%cr0": "+r" (val), "+m" (__force_order));
+
+   if (static_branch_likely(&cr_pinning)) {
+   if (unlikely((val & X86_CR0_WP) != X86_CR0_WP)) {
+   bits_missing = X86_CR0_WP;
+   goto set_register;
+   }
+   /* Warn after we've set the missing bits. */
+   WARN_ONCE(bits_missing, "CR0 WP bit went missing!?\n");
+   }
 }
 
 static inline unsigned long native_read_cr2(void)
-- 
2.17.1



[PATCH v2 0/2] x86/asm: Pin sensitive CR4 and CR0 bits

2019-06-04 Thread Kees Cook
Hi,

Here's a v2 that hopefully addresses the concerns from the v1 thread[1] on
CR4 pinning. Now it's using static branches to avoid potential atomicity
problems (though perhaps that is overkill), and it has dropped the
needless volatile marking in favor of proper asm constraint flags. The
one piece that eluded me, but which I think is okay, is delaying bit
setting on a per-CPU basis. But since the bits are global state and we
don't have read-only per-CPU data, it seemed safe as I've got it here.

Full patch 1 commit log follows, just in case it's useful to have it
in this cover letter...

[1] 
https://lkml.kernel.org/r/CAHk-=wjnes0wn0kummy6dok_sn69z2tigpdz2cyzyf07s64...@mail.gmail.com

-Kees


Several recent exploits have used direct calls to the native_write_cr4()
function to disable SMEP and SMAP before then continuing their exploits
using userspace memory access. This pins bits of CR4 so that they cannot
be changed through a common function. This is not intended to be general
ROP protection (which would require CFI to defend against properly), but
rather a way to avoid trivial direct function calling (or CFI bypasses
via a matching function prototype) as seen in:

https://googleprojectzero.blogspot.com/2017/05/exploiting-linux-kernel-via-packet.html
(https://github.com/xairy/kernel-exploits/tree/master/CVE-2017-7308)

The goals of this change:
 - pin specific bits (SMEP, SMAP, and UMIP) when writing CR4.
 - avoid setting the bits too early (they must become pinned only after
   CPU feature detection and selection has finished).
 - pinning mask needs to be read-only during normal runtime.
 - pinning needs to be checked after write to avoid jumps past the
   preceding "or".

Using __ro_after_init on the mask is done so it can't be first disabled
with a malicious write.

Since these bits are global state (once established by the boot CPU
and kernel boot parameters), they are safe to write to secondary CPUs
before those CPUs have finished feature detection. As such, the bits are
written with an "or" performed before the register write as that is both
easier and uses a few bytes less storage of a location we don't have:
read-only per-CPU data. (Note that initialization via cr4_init_shadow()
isn't early enough to avoid early native_write_cr4() calls.)

A check is performed after the register write because an attack could
just skip over the "or" before the register write. Such a direct jump
is possible because of how this function may be built by the compiler
(especially due to the removal of frame pointers) where it doesn't add
a stack frame (function exit may only be a retq without pops) which
is sufficient for trivial exploitation like in the timer overwrites
mentioned above.

The asm argument constraints gain the "+" modifier to convince the
compiler that it shouldn't make ordering assumptions about the arguments
or memory, and treat them as changed.

---
v2:
- move setup until after CPU feature detection and selection.
- refactor to use static branches to have atomic enabling.
- only perform the "or" after a failed check.
---

Kees Cook (2):
  x86/asm: Pin sensitive CR4 bits
  x86/asm: Pin sensitive CR0 bits

 arch/x86/include/asm/special_insns.h | 41 ++--
 arch/x86/kernel/cpu/common.c | 18 
 2 files changed, 57 insertions(+), 2 deletions(-)

-- 
2.17.1



[PATCH v2 1/2] x86/asm: Pin sensitive CR4 bits

2019-06-04 Thread Kees Cook
Several recent exploits have used direct calls to the native_write_cr4()
function to disable SMEP and SMAP before then continuing their exploits
using userspace memory access. This pins bits of CR4 so that they cannot
be changed through a common function. This is not intended to be general
ROP protection (which would require CFI to defend against properly), but
rather a way to avoid trivial direct function calling (or CFI bypasses
via a matching function prototype) as seen in:

https://googleprojectzero.blogspot.com/2017/05/exploiting-linux-kernel-via-packet.html
(https://github.com/xairy/kernel-exploits/tree/master/CVE-2017-7308)

The goals of this change:
 - pin specific bits (SMEP, SMAP, and UMIP) when writing CR4.
 - avoid setting the bits too early (they must become pinned only after
   CPU feature detection and selection has finished).
 - pinning mask needs to be read-only during normal runtime.
 - pinning needs to be checked after write to avoid jumps past the
   preceding "or".

Using __ro_after_init on the mask is done so it can't be first disabled
with a malicious write.

Since these bits are global state (once established by the boot CPU
and kernel boot parameters), they are safe to write to secondary CPUs
before those CPUs have finished feature detection. As such, the bits are
written with an "or" performed before the register write as that is both
easier and uses a few bytes less storage of a location we don't have:
read-only per-CPU data. (Note that initialization via cr4_init_shadow()
isn't early enough to avoid early native_write_cr4() calls.)

A check is performed after the register write because an attack could
just skip over the "or" before the register write. Such a direct jump
is possible because of how this function may be built by the compiler
(especially due to the removal of frame pointers) where it doesn't add
a stack frame (function exit may only be a retq without pops) which
is sufficient for trivial exploitation like in the timer overwrites
mentioned above.

The asm argument constraints gain the "+" modifier to convince the
compiler that it shouldn't make ordering assumptions about the arguments
or memory, and treat them as changed.
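
The write-then-verify flow described above can be modelled in userspace.
What follows is a sketch under stated assumptions, not the patch itself:
fake_cr, PIN_MASK, pinning_enabled, raw_write and the skip_or_once
parameter are all hypothetical. skip_or_once simulates an attacker who
jumps past the "or" on the first pass.

```c
#include <stdbool.h>

#define PIN_MASK 0x7UL		/* hypothetical set of pinned bits */

static unsigned long fake_cr;	/* hypothetical simulated control register */
static bool pinning_enabled;	/* stand-in for the cr_pinning static key */

/* Stand-in for the raw "mov %0,%%crN" register write. */
static void raw_write(unsigned long val)
{
	fake_cr = val;
}

/*
 * Write val, OR-ing in the pinned bits first. After the write, check
 * that the pinned bits are present; if not, record what was missing
 * and redo the write. Returns the bits missing on the first attempt
 * (0 if nothing was missing).
 */
static unsigned long pinned_write(unsigned long val, bool skip_or_once)
{
	unsigned long bits_missing = 0;

	for (;;) {
		if (pinning_enabled && !skip_or_once)
			val |= PIN_MASK;
		skip_or_once = false;

		raw_write(val);

		if (!pinning_enabled || (val & PIN_MASK) == PIN_MASK)
			break;

		bits_missing = ~val & PIN_MASK;
	}
	return bits_missing;
}
```

Even when the first "or" is skipped, the post-write check notices the
missing bits, the second pass restores them, and the non-zero return value
is what the kernel version would feed to WARN_ONCE().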

Signed-off-by: Kees Cook 
---
v2:
- move setup until after CPU feature detection and selection.
- refactor to use static branches to have atomic enabling.
- only perform the "or" after a failed check.
---
 arch/x86/include/asm/special_insns.h | 24 +++-
 arch/x86/kernel/cpu/common.c | 18 ++
 2 files changed, 41 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/special_insns.h 
b/arch/x86/include/asm/special_insns.h
index 0a3c4cab39db..284a77d52fea 100644
--- a/arch/x86/include/asm/special_insns.h
+++ b/arch/x86/include/asm/special_insns.h
@@ -6,6 +6,8 @@
 #ifdef __KERNEL__
 
 #include 
+#include 
+#include 
 
 /*
  * Volatile isn't enough to prevent the compiler from reordering the
@@ -16,6 +18,10 @@
  */
 extern unsigned long __force_order;
 
+/* Starts false and gets enabled once CPU feature detection is done. */
+DECLARE_STATIC_KEY_FALSE(cr_pinning);
+extern unsigned long cr4_pinned_bits;
+
 static inline unsigned long native_read_cr0(void)
 {
unsigned long val;
@@ -74,7 +80,23 @@ static inline unsigned long native_read_cr4(void)
 
 static inline void native_write_cr4(unsigned long val)
 {
-   asm volatile("mov %0,%%cr4": : "r" (val), "m" (__force_order));
+   unsigned long bits_missing = 0;
+
+set_register:
+   if (static_branch_likely(&cr_pinning))
+   val |= cr4_pinned_bits;
+
+   asm volatile("mov %0,%%cr4": "+r" (val), "+m" (cr4_pinned_bits));
+
+   if (static_branch_likely(&cr_pinning)) {
+   if (unlikely((val & cr4_pinned_bits) != cr4_pinned_bits)) {
+   bits_missing = ~val & cr4_pinned_bits;
+   goto set_register;
+   }
+   /* Warn after we've set the missing bits. */
+   WARN_ONCE(bits_missing, "CR4 bits went missing: %lx!?\n",
+ bits_missing);
+   }
 }
 
 #ifdef CONFIG_X86_64
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 2c57fffebf9b..6b210be12734 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -366,6 +366,23 @@ static __always_inline void setup_umip(struct cpuinfo_x86 
*c)
cr4_clear_bits(X86_CR4_UMIP);
 }
 
+DEFINE_STATIC_KEY_FALSE_RO(cr_pinning);
+unsigned long cr4_pinned_bits __ro_after_init;
+
+/*
+ * Once CPU feature detection is finished (and boot params have been
+ * parsed), record any of the sensitive CR bits that are set, and
+ * enable CR pinning.
+ */
+static void __init setup_cr_pinning(void)
+{
+   unsigned long mask;
+
+   mask = (X86_CR4_SMEP | X86_CR4_SMAP | X86_CR4_UMIP);
+   cr4_pinned_bits = this_cpu_read(cpu_tlbstate.cr4) & mask;
+   static_key_enable(&cr_pinning.key);
+}
+
 /*
  * Protection Keys are not available in 32-bit 

[PATCH] irqchip/qcom: Use struct_size() in devm_kzalloc()

2019-06-04 Thread Gustavo A. R. Silva
One of the more common cases of allocation size calculations is finding
the size of a structure that has a zero-sized array at the end, along
with memory for some number of elements for that array. For example:

struct foo {
int stuff;
struct boo entry[];
};

size = sizeof(struct foo) + count * sizeof(struct boo);
instance = devm_kzalloc(dev, size, GFP_KERNEL);

Instead of leaving these open-coded and prone to type mistakes, we can
now use the new struct_size() helper:

instance = devm_kzalloc(dev, struct_size(instance, entry, count), GFP_KERNEL);

Notice that, in this case, variable alloc_sz is not necessary, hence it
is removed.
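
To illustrate why the open-coded form is risky, here is a userspace sketch
of the same size calculation with an explicit overflow check. flex_size and
foo_alloc are hypothetical names, and the layout of struct boo is made up
for the example. The sketch saturates to SIZE_MAX on overflow, which is
also how the kernel's struct_size() reports overflow, so the allocation
fails instead of coming back too small.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct boo {
	int a;
	int b;
};

struct foo {
	int stuff;
	struct boo entry[];
};

/*
 * Hypothetical helper: header + elem * count, saturating to SIZE_MAX
 * if the multiplication or addition would overflow size_t.
 */
static size_t flex_size(size_t header, size_t elem, size_t count)
{
	if (header > SIZE_MAX ||
	    (elem != 0 && count > (SIZE_MAX - header) / elem))
		return SIZE_MAX;
	return header + elem * count;
}

/* Allocate a zeroed struct foo with room for 'count' entries. */
static struct foo *foo_alloc(size_t count)
{
	size_t size = flex_size(sizeof(struct foo), sizeof(struct boo), count);

	if (size == SIZE_MAX)
		return NULL;	/* overflow: refuse rather than under-allocate */
	return calloc(1, size);
}
```

With the open-coded "sizeof(struct foo) + count * sizeof(struct boo)", a
huge count silently wraps around; here the same request yields NULL.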

This code was detected with the help of Coccinelle.

Signed-off-by: Gustavo A. R. Silva 
---
 drivers/irqchip/qcom-irq-combiner.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/irqchip/qcom-irq-combiner.c 
b/drivers/irqchip/qcom-irq-combiner.c
index 7f0c0be322e0..d269a7722032 100644
--- a/drivers/irqchip/qcom-irq-combiner.c
+++ b/drivers/irqchip/qcom-irq-combiner.c
@@ -237,7 +237,6 @@ static int get_registers(struct platform_device *pdev, 
struct combiner *comb)
 static int __init combiner_probe(struct platform_device *pdev)
 {
struct combiner *combiner;
-   size_t alloc_sz;
int nregs;
int err;
 
@@ -247,8 +246,8 @@ static int __init combiner_probe(struct platform_device *pdev)
return -EINVAL;
}
 
-   alloc_sz = sizeof(*combiner) + sizeof(struct combiner_reg) * nregs;
-   combiner = devm_kzalloc(&pdev->dev, alloc_sz, GFP_KERNEL);
+   combiner = devm_kzalloc(&pdev->dev, struct_size(combiner, regs, nregs),
+   GFP_KERNEL);
if (!combiner)
return -ENOMEM;
 
-- 
2.21.0
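
To illustrate why the open-coded size calculation in the commit message is fragile, the following user-space sketch computes the size of a structure with a trailing flexible array and saturates to SIZE_MAX on overflow, which is the behaviour the kernel's struct_size() helper documents. The names flex_size() and foo_alloc() are hypothetical stand-ins, calloc() replaces devm_kzalloc(), and the fields of struct boo are invented for the example.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct boo {
    int a; /* hypothetical payload fields */
    int b;
};

struct foo {
    int stuff;
    struct boo entry[]; /* flexible array member */
};

/*
 * Hypothetical stand-in for struct_size(): returns base + elem * count,
 * or SIZE_MAX if that sum would overflow size_t.
 */
static size_t flex_size(size_t base, size_t elem, size_t count)
{
    if (elem != 0 && count > (SIZE_MAX - base) / elem)
        return SIZE_MAX;
    return base + elem * count;
}

/* Allocate a zeroed struct foo with room for count entries. */
static struct foo *foo_alloc(size_t count)
{
    size_t size = flex_size(sizeof(struct foo), sizeof(struct boo), count);

    if (size == SIZE_MAX)
        return NULL; /* refuse instead of allocating a truncated buffer */
    return calloc(1, size);
}
```

With the open-coded form, a large count can wrap the multiplication and yield a buffer that is too small; in this sketch the saturated result makes the allocation fail instead.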



Re: [PATCH] phy: qcom-qmp: Correct READY_STATUS poll break condition

2019-06-04 Thread Evan Green
On Tue, Jun 4, 2019 at 4:24 PM Bjorn Andersson wrote:
>
> After issuing a PHY_START request to the QMP, the hardware documentation
> states that the software should wait for the PCS_READY_STATUS to become
> 1.
>
> With the introduction of c9b589791fc1 ("phy: qcom: Utilize UFS reset
> controller") an additional 1ms delay was introduced between the start
> request and the check of the status bit. This greatly increases the
> chances of the hardware actually becoming ready before the status
> bit is read.
>
> The result can be seen in that UFS PHY enabling is now reported as a
> failure in 10% of the boots on SDM845, which is a clear regression from
> the previous rare/occasional failure.
>
> This patch fixes the "break condition" of the poll to check for the
> correct state of the status bit.
>
> Unfortunately PCIe on 8996 and 8998 does not specify the mask_pcs_ready
> register, which means that the code checks a bit that's always 0. So the
> patch also fixes these, in order to not regress these targets.
>
> Cc: sta...@vger.kernel.org
> Cc: Evan Green 
> Cc: Marc Gonzalez 
> Cc: Vivek Gautam 
> Fixes: 73d7ec899bd8 ("phy: qcom-qmp: Add msm8998 PCIe QMP PHY support")
> Fixes: e78f3d15e115 ("phy: qcom-qmp: new qmp phy driver for qcom-chipsets")
> Signed-off-by: Bjorn Andersson 

Nice find.

Reviewed-by: Evan Green 
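
As a rough illustration of the break condition discussed in the quoted patch description, the sketch below polls a simulated status register and stops as soon as the ready bit reads as 1, so it also succeeds when the hardware is already ready at the first read. Everything here is hypothetical: struct fake_phy, fake_read_status(), wait_for_ready() and the bit position of PCS_READY_MASK are invented for the example and are not taken from the qcom-qmp driver.

```c
#include <stdint.h>

#define PCS_READY_MASK (1u << 0) /* hypothetical bit position */

/* Simulated PHY: the ready bit becomes set after a number of reads. */
struct fake_phy {
    uint32_t status;
    unsigned int reads_until_ready; /* 0 means the status never changes */
};

/* Simulated register read that counts down towards readiness. */
static uint32_t fake_read_status(struct fake_phy *phy)
{
    if (phy->reads_until_ready > 0 && --phy->reads_until_ready == 0)
        phy->status |= PCS_READY_MASK;
    return phy->status;
}

/*
 * Poll until the ready bit is set or max_polls reads have been made.
 * Returns 0 when ready, -1 on timeout.
 */
static int wait_for_ready(struct fake_phy *phy, unsigned int max_polls)
{
    while (max_polls--) {
        if (fake_read_status(phy) & PCS_READY_MASK)
            return 0;
    }
    return -1;
}
```

A loop that instead waited for the bit to change state would time out whenever the bit is already set on entry, which matches the failure mode the description attributes to the added delay.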


mmotm 2019-06-04-16-33 uploaded

2019-06-04 Thread akpm
The mm-of-the-moment snapshot 2019-06-04-16-33 has been uploaded to

   http://www.ozlabs.org/~akpm/mmotm/

mmotm-readme.txt says

README for mm-of-the-moment:

http://www.ozlabs.org/~akpm/mmotm/

This is a snapshot of my -mm patch queue.  Uploaded at random hopefully
more than once a week.

You will need quilt to apply these patches to the latest Linus release (5.x
or 5.x-rcY).  The series file is in broken-out.tar.gz and is duplicated in
http://ozlabs.org/~akpm/mmotm/series

The file broken-out.tar.gz contains two datestamp files: .DATE and
.DATE-yyyy-mm-dd-hh-mm-ss.  Both contain the string yyyy-mm-dd-hh-mm-ss,
followed by the base kernel version against which this patch series is to
be applied.

This tree is partially included in linux-next.  To see which patches are
included in linux-next, consult the `series' file.  Only the patches
within the #NEXT_PATCHES_START/#NEXT_PATCHES_END markers are included in
linux-next.


A full copy of the kernel tree with the linux-next and mmotm patches
already applied is available through git within an hour of the mmotm
release.  Individual mmotm releases are tagged.  The master branch always
points to the latest release, so it's constantly rebasing.

http://git.cmpxchg.org/cgit.cgi/linux-mmotm.git/



The directory http://www.ozlabs.org/~akpm/mmots/ (mm-of-the-second)
contains daily snapshots of the -mm tree.  It is updated more frequently
than mmotm, and is untested.

A git copy of this tree is available at

http://git.cmpxchg.org/cgit.cgi/linux-mmots.git/

and use of this tree is similar to
http://git.cmpxchg.org/cgit.cgi/linux-mmotm.git/, described above.


This mmotm tree contains the following patches against 5.2-rc3:
(patches marked "*" will be included in linux-next)

  origin.patch
* convert-struct-pid-count-to-refcount_t.patch
* mm-memcontrol-dont-batch-updates-of-local-vm-stats-and-events.patch
* list_lru-fix-memory-leak-in-__memcg_init_list_lru_node.patch
* scripts-decode_stacktracesh-prefix-addr2line-with-cross_compile.patch
* mm-mlockall-error-for-flag-mcl_onfault.patch
* mm-fix-recent_rotated-history.patch
* fs-ocfs2-fix-race-in-ocfs2_dentry_attach_lock.patch
* mm-mmu_gather-remove-__tlb_reset_range-for-force-flush.patch
* mm-mmu_gather-remove-__tlb_reset_range-for-force-flush-checkpatch-fixes.patch
* mm-dev_pfn-exclude-memory_device_private-while-computing-virtual-address.patch
* fs-proc-allow-reporting-eip-esp-for-all-coredumping-threads.patch
* mm-mempolicy-fix-an-incorrect-rebind-node-in-mpol_rebind_nodemask.patch
* binfmt_flat-make-load_flat_shared_library-work.patch
* mm-fix-trying-to-reclaim-unevicable-lru-page.patch
* zstd-pass-pointer-rathen-than-structure-to-functions.patch
* zstd-pass-pointer-rathen-than-structure-to-functions-fix.patch
* zstd-use-u16-data-type-for-rankpos.patch
* zstd-move-params-structure-to-global-variable-to-reduce-stack-usage.patch
* zstd-change-structure-variable-from-int-to-char.patch
* mm-change-count_mm_mlocked_page_nr-return-type.patch
* signal-remove-the-wrong-signal_pending-check-in-restore_user_sigmask.patch
* iommu-replace-single-char-identifiers-in-macros.patch
* scripts-decode_stacktrace-match-basepath-using-shell-prefix-operator-not-regex.patch
* scripts-decode_stacktrace-look-for-modules-with-kodebug-extension.patch
* scripts-decode_stacktrace-look-for-modules-with-kodebug-extension-v2.patch
* scripts-spellingtxt-drop-sepc-from-the-misspelling-list.patch
* scripts-spellingtxt-drop-sepc-from-the-misspelling-list-fix.patch
* scripts-spellingtxt-add-spelling-fix-for-prohibited.patch
* scripts-decode_stacktrace-accept-dash-underscore-in-modules.patch
* scripts-checkstackpl-fix-arm64-wrong-or-unknown-architecture.patch
* sh-configs-remove-config_logfs-from-defconfig.patch
* sh-config-remove-left-over-backlight_lcd_support.patch
* debugobjects-move-printk-out-of-db-lock-critical-sections.patch
* ocfs2-add-last-unlock-times-in-locking_state.patch
* ocfs2-add-locking-filter-debugfs-file.patch
* fs-ocfs-fix-spelling-mistake-hearbeating-heartbeat.patch
* ocfs2-clear-zero-in-unaligned-direct-io.patch
* ocfs2-clear-zero-in-unaligned-direct-io-checkpatch-fixes.patch
* ocfs2-wait-for-recovering-done-after-direct-unlock-request.patch
* ocfs2-checkpoint-appending-truncate-log-transaction-before-flushing.patch
* ramfs-support-o_tmpfile.patch
  mm.patch
* mm-slab-validate-cache-membership-under-freelist-hardening.patch
* mm-slab-sanity-check-page-type-when-looking-up-cache.patch
* lkdtm-heap-add-tests-for-freelist-hardening.patch
* mm-slub-avoid-double-string-traverse-in-kmem_cache_flags.patch
* kmemleak-fix-check-for-softirq-context.patch
* mm-kasan-print-frame-description-for-stack-bugs.patch
* device-dax-fix-memory-and-resource-leak-if-hotplug-fails.patch
* mm-hotplug-make-remove_memory-interface-useable.patch
* device-dax-hotremove-persistent-memory-that-is-used-like-normal-ram.patch
* mm-move-map_sync-to-asm-generic-mman-commonh.patch
* include-linux-pfn_th-remove-pfn_t_to_virt.patch
* 
