[PATCH] clk: scmi: Fix the rounding of clock rate

2018-07-30 Thread Amit Daniel Kachhap
This fix rounds the clock rate properly by using the quotient, not the
remainder, in the calculation: do_div() returns the remainder and leaves
the quotient in its first argument. This issue was found while testing
HDMI on the Juno platform.

Fixes: 6d6a1d82eaef7 ("clk: add support for clocks provided by SCMI")
Acked-by: Sudeep Holla 
Signed-off-by: Amit Daniel Kachhap 
---
 drivers/clk/clk-scmi.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/clk/clk-scmi.c b/drivers/clk/clk-scmi.c
index bb2a6f2..a985bf5 100644
--- a/drivers/clk/clk-scmi.c
+++ b/drivers/clk/clk-scmi.c
@@ -38,7 +38,6 @@ static unsigned long scmi_clk_recalc_rate(struct clk_hw *hw,
 static long scmi_clk_round_rate(struct clk_hw *hw, unsigned long rate,
unsigned long *parent_rate)
 {
-   int step;
u64 fmin, fmax, ftmp;
struct scmi_clk *clk = to_scmi_clk(hw);
 
@@ -60,9 +59,9 @@ static long scmi_clk_round_rate(struct clk_hw *hw, unsigned long rate,
 
ftmp = rate - fmin;
ftmp += clk->info->range.step_size - 1; /* to round up */
-   step = do_div(ftmp, clk->info->range.step_size);
+   do_div(ftmp, clk->info->range.step_size);
 
-   return step * clk->info->range.step_size + fmin;
+   return ftmp * clk->info->range.step_size + fmin;
 }
 
 static int scmi_clk_set_rate(struct clk_hw *hw, unsigned long rate,
-- 
2.7.4
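
To make the bug concrete: the kernel's do_div() divides its first argument in place and returns the remainder, so the old code multiplied the remainder by the step size. The userspace sketch below models that behaviour under stated assumptions. model_do_div(), round_rate_buggy() and round_rate_fixed() are illustrative names that do not exist in the kernel, and the clamping of rate to [fmin, fmax] that the real function performs earlier is assumed to have happened already.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Userspace stand-in for the kernel's do_div(): divides *n in place
 * (the quotient is left in *n) and returns the remainder.
 */
static uint32_t model_do_div(uint64_t *n, uint32_t base)
{
	uint32_t rem = (uint32_t)(*n % base);

	*n /= base;
	return rem;
}

/* Old behaviour: the return value (the remainder) is used as the step count. */
static uint64_t round_rate_buggy(uint64_t rate, uint64_t fmin, uint32_t step_size)
{
	uint64_t ftmp = rate - fmin;
	uint32_t step;

	assert(step_size != 0);
	ftmp += step_size - 1; /* to round up */
	step = model_do_div(&ftmp, step_size);

	return (uint64_t)step * step_size + fmin;
}

/* Patched behaviour: the quotient left in ftmp is used as the step count. */
static uint64_t round_rate_fixed(uint64_t rate, uint64_t fmin, uint32_t step_size)
{
	uint64_t ftmp = rate - fmin;

	assert(step_size != 0);
	ftmp += step_size - 1; /* to round up */
	(void)model_do_div(&ftmp, step_size);

	return ftmp * step_size + fmin;
}
```

With fmin = 100, step_size = 25 and rate = 130, the fixed variant returns 150 (the next step at or above the request), while the buggy variant returns 200 because the remainder 4 is treated as the number of steps.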



Re: [PATCH mmc-next v2 3/3] mmc: sdhci-of-dwcmshc: solve 128MB DMA boundary limitation

2018-07-30 Thread Jisheng Zhang
On Tue, 31 Jul 2018 11:29:24 +0800
Jisheng Zhang  wrote:

> Hi Robin,
> 
> On Mon, 30 Jul 2018 12:06:08 +0100 Robin Murphy wrote:
> 
> > Hi Jisheng,
> > 
> > On 26/07/18 08:14, Jisheng Zhang wrote:  
> > > When using DMA, if the DMA addr spans 128MB boundary, we have to split
> > > the DMA transfer into two so that each one doesn't exceed the boundary.   
> > >  
> > 
> > Out of interest, is the driver already setting its segment boundary mask 
> > appropriately? This sounds like the exact kind of hardware restriction 
> > that dma_parms is intended to describe, which scatterlist-generating 
> > code is *supposed* to already respect.  
> 
> Thanks for the nice input. It may provide an elegant solution for this
> limitation. 
> 
> To simplify the situation, let's assume no iommu, only swiotlb. And
> the DDR is less than 4GB so swiotlb on arm64 doesn't init.
> 
> There's no dma range limitation with the HW, the only limitation
> is the boundary, while dma_capable() doesn't check the boundary mask, so if
> we take this solution, we need to teach dma_capable() about the boundary
> mask. I'm not sure whether this is acceptable.
> 
> Another problem is swiotlb initialization. When swiotlb is initialized, we
> don't yet know that there's HW with such a boundary limitation. Is there
> any elegant solution for this problem?
> 

One more problem: swiotlb isn't available on all platforms, e.g. arm.
How do we solve this SDHCI HW limitation on an arm SoC w/o an iommu?

Thanks
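
The splitting described in the quoted patch text (a transfer that spans a 128MB boundary becomes two transfers) can be sketched in plain userspace C. This is only an illustration of the arithmetic, not the driver's code: struct dma_seg and split_at_boundary() are hypothetical names, and the sketch assumes a single transfer is never longer than 128MB, so it crosses at most one boundary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BOUNDARY_SIZE (128ULL * 1024 * 1024) /* 128MB window */

struct dma_seg {
	uint64_t addr;
	uint64_t len;
};

/*
 * Split [addr, addr + len) so that no piece crosses a 128MB boundary.
 * Returns the number of pieces written to out[] (1 or 2).
 * Assumes 0 < len <= BOUNDARY_SIZE.
 */
static size_t split_at_boundary(uint64_t addr, uint64_t len,
				struct dma_seg out[2])
{
	/* First address of the next 128MB window. */
	uint64_t next = (addr | (BOUNDARY_SIZE - 1)) + 1;
	uint64_t room = next - addr; /* bytes left in the current window */

	assert(len > 0 && len <= BOUNDARY_SIZE);

	if (len <= room) {
		out[0] = (struct dma_seg){ addr, len };
		return 1;
	}

	out[0] = (struct dma_seg){ addr, room };
	out[1] = (struct dma_seg){ next, len - room };
	return 2;
}
```

For example, a 0x2000-byte transfer starting at 0x07FFF000 is split into two 0x1000-byte pieces, the second starting at 0x08000000. Whether this is done in the driver or expressed through the segment boundary mask that Robin mentions is exactly the open question in this thread.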


Re: [QUESTION] llist: Comment releasing 'must delete' restriction before traversing

2018-07-30 Thread Huang, Ying
Byungchul Park  writes:

> On Tue, Jul 31, 2018 at 09:37:50AM +0800, Huang, Ying wrote:
>> Byungchul Park  writes:
>> 
>> > Hello folks,
>> >
>> > I'm careful in saying.. and curious about..
>> >
>> > In restrictive cases like only additions happen but never deletion, can't
>> > we safely traverse a llist? I believe llist can be more useful if we can
>> > release the restriction. Can't we?
>> >
>> > If yes, we may add another function traversing starting from a head. Or
>> > just use existing function with head->first.
>> >
>> > Thanks a lot for your answers in advance :)
>> 
>> What's the use case?  I don't know how it is useful that items are never
>> deleted from the llist.
>> 
>> Some other locks could be used to provide mutual exclusion between
>> 
>> - llist add, llist traverse
>
> Hello Huang,

Hello Byungchul,

> In my use case, I only do adding and traversing on a llist.

Can you provide more details about your use case?

Best Regards,
Huang, Ying

>> 
>> and
>> 
>> - llist delete
>
> Of course, I will use a lock when deletion is needed.
>
> So.. in the case only adding into and traversing a llist is needed,
> can't we safely traverse a llist in the way I thought? Or am I missing
> something?
>
> Thank you.
>
>> Is this your use case?
>> 
>> Best Regards,
>> Huang, Ying
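
The add-and-traverse-only pattern being asked about can be modelled in userspace with C11 atomics. The sketch below is not the kernel's llist implementation and does not settle the question raised in the thread; struct node, struct list_head_model, model_add() and model_sum() are hypothetical names. It only illustrates the reasoning: if nodes are never removed, a node's next pointer is written before the node is published and never changes afterwards, so a reader that loads the head once sees a consistent snapshot.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct node {
	struct node *next;
	int value;
};

struct list_head_model {
	_Atomic(struct node *) first;
};

/* Lock-free push: publish n as the new head with a compare-and-swap loop. */
static void model_add(struct node *n, struct list_head_model *h)
{
	struct node *first;

	assert(n != NULL);
	first = atomic_load_explicit(&h->first, memory_order_relaxed);
	do {
		n->next = first; /* set before the node becomes visible */
	} while (!atomic_compare_exchange_weak_explicit(&h->first, &first, n,
							memory_order_release,
							memory_order_relaxed));
}

/*
 * Traverse from a snapshot of the head. This relies on the assumption
 * that no node is ever deleted or reused while readers may be running.
 */
static int model_sum(struct list_head_model *h)
{
	int sum = 0;

	for (struct node *p = atomic_load_explicit(&h->first,
						   memory_order_acquire);
	     p != NULL; p = p->next)
		sum += p->value;

	return sum;
}
```

Nodes added after the reader has loaded the head are simply not visited in that pass. As soon as deletion enters the picture, this model no longer holds, which matches the point above about using a lock when deletion is needed.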


linux-next: manual merge of the pinctrl tree with the devicetree tree

2018-07-30 Thread Stephen Rothwell
Hi all,

Today's linux-next merge of the pinctrl tree got a conflict in:

  Documentation/devicetree/bindings/pinctrl/st,stm32-pinctrl.txt

between commit:

  791d3ef2e111 ("dt-bindings: remove 'interrupt-parent' from bindings")

from the devicetree tree and commit:

  de1d08b22974 ("dt-bindings: pinctrl: add syscfg mask parameter")

from the pinctrl tree.

I fixed it up (see below) and can carry the fix as necessary. This
is now fixed as far as linux-next is concerned, but any non trivial
conflicts should be mentioned to your upstream maintainer when your tree
is submitted for merging.  You may also want to consider cooperating
with the maintainer of the conflicting tree to minimise any particularly
complex conflicts.

-- 
Cheers,
Stephen Rothwell

diff --cc Documentation/devicetree/bindings/pinctrl/st,stm32-pinctrl.txt
index a8bb36b4f9fd,046a3de026d4..
--- a/Documentation/devicetree/bindings/pinctrl/st,stm32-pinctrl.txt
+++ b/Documentation/devicetree/bindings/pinctrl/st,stm32-pinctrl.txt
@@@ -37,9 -37,12 +37,10 @@@ Required properties
  
  Optional properties:
   - reset:   : Reference to the reset controller
-  - st,syscfg: Should be phandle/offset pair. The phandle to the syscon node
-which includes IRQ mux selection register, and the offset of the IRQ mux
-selection register.
 - - interrupt-parent: phandle of the interrupt parent to which the external
 -   GPIO interrupts are forwarded to.
+  - st,syscfg: Should be phandle/offset/mask.
+   -The phandle to the syscon node which includes IRQ mux selection register.
+   -The offset of the IRQ mux selection register
+   -The field mask of IRQ mux, needed if different of 0xf.
   - gpio-ranges: Define a dedicated mapping between a pin-controller and
 a gpio controller. Format is < a b c> with:
-(phandle): phandle of pin-controller.




[PATCH v4 03/10] arm64: dts: Add Mediatek SoC MT8183 and evaluation board dts and Makefile

2018-07-30 Thread Erin Lo
From: Ben Ho 

Add basic chip support for Mediatek 8183

Signed-off-by: Ben Ho 
Signed-off-by: Erin Lo 
---
 arch/arm64/boot/dts/mediatek/Makefile   |   1 +
 arch/arm64/boot/dts/mediatek/mt8183-evb.dts |  23 +
 arch/arm64/boot/dts/mediatek/mt8183.dtsi| 146 
 3 files changed, 170 insertions(+)
 create mode 100644 arch/arm64/boot/dts/mediatek/mt8183-evb.dts
 create mode 100644 arch/arm64/boot/dts/mediatek/mt8183.dtsi

diff --git a/arch/arm64/boot/dts/mediatek/Makefile b/arch/arm64/boot/dts/mediatek/Makefile
index 7506b0d..a91d462 100644
--- a/arch/arm64/boot/dts/mediatek/Makefile
+++ b/arch/arm64/boot/dts/mediatek/Makefile
@@ -6,3 +6,4 @@ dtb-$(CONFIG_ARCH_MEDIATEK) += mt6795-evb.dtb
 dtb-$(CONFIG_ARCH_MEDIATEK) += mt6797-evb.dtb
 dtb-$(CONFIG_ARCH_MEDIATEK) += mt7622-rfb1.dtb
 dtb-$(CONFIG_ARCH_MEDIATEK) += mt8173-evb.dtb
+dtb-$(CONFIG_ARCH_MEDIATEK) += mt8183-evb.dtb
diff --git a/arch/arm64/boot/dts/mediatek/mt8183-evb.dts b/arch/arm64/boot/dts/mediatek/mt8183-evb.dts
new file mode 100644
index 000..2a3dd5a
--- /dev/null
+++ b/arch/arm64/boot/dts/mediatek/mt8183-evb.dts
@@ -0,0 +1,23 @@
+// SPDX-License-Identifier: (GPL-2.0 OR MIT)
+/*
+ * Copyright (c) 2018 MediaTek Inc.
+ * Author: Ben Ho 
+ *Erin Lo 
+ */
+
+/dts-v1/;
+#include "mt8183.dtsi"
+
+/ {
+   model = "MediaTek MT8183 evaluation board";
+   compatible = "mediatek,mt8183-evb", "mediatek,mt8183";
+
+   memory@4000 {
+   device_type = "memory";
+   reg = <0 0x4000 0 0x8000>;
+   };
+
+   chosen {
+   stdout-path = "serial0:921600n8";
+   };
+};
diff --git a/arch/arm64/boot/dts/mediatek/mt8183.dtsi b/arch/arm64/boot/dts/mediatek/mt8183.dtsi
new file mode 100644
index 000..1553265
--- /dev/null
+++ b/arch/arm64/boot/dts/mediatek/mt8183.dtsi
@@ -0,0 +1,146 @@
+// SPDX-License-Identifier: (GPL-2.0 OR MIT)
+/*
+ * Copyright (c) 2018 MediaTek Inc.
+ * Author: Ben Ho 
+ *Erin Lo 
+ */
+
+#include 
+#include 
+
+/ {
+   compatible = "mediatek,mt8183";
+   interrupt-parent = <>;
+   #address-cells = <2>;
+   #size-cells = <2>;
+
+   cpus {
+   #address-cells = <1>;
+   #size-cells = <0>;
+
+   cpu-map {
+   cluster0 {
+   core0 {
+   cpu = <>;
+   };
+   core1 {
+   cpu = <>;
+   };
+   core2 {
+   cpu = <>;
+   };
+   core3 {
+   cpu = <>;
+   };
+   };
+
+   cluster1 {
+   core0 {
+   cpu = <>;
+   };
+   core1 {
+   cpu = <>;
+   };
+   core2 {
+   cpu = <>;
+   };
+   core3 {
+   cpu = <>;
+   };
+   };
+   };
+
+   cpu0: cpu@000 {
+   device_type = "cpu";
+   compatible = "arm,cortex-a53";
+   reg = <0x000>;
+   enable-method = "psci";
+   };
+
+   cpu1: cpu@001 {
+   device_type = "cpu";
+   compatible = "arm,cortex-a53";
+   reg = <0x001>;
+   enable-method = "psci";
+   };
+
+   cpu2: cpu@002 {
+   device_type = "cpu";
+   compatible = "arm,cortex-a53";
+   reg = <0x002>;
+   enable-method = "psci";
+   };
+
+   cpu3: cpu@003 {
+   device_type = "cpu";
+   compatible = "arm,cortex-a53";
+   reg = <0x003>;
+   enable-method = "psci";
+   };
+
+   cpu4: cpu@100 {
+   device_type = "cpu";
+   compatible = "arm,cortex-a73";
+   reg = <0x100>;
+   enable-method = "psci";
+   };
+
+   cpu5: cpu@101 {
+   device_type = "cpu";
+   compatible = "arm,cortex-a73";
+   reg = <0x101>;
+   enable-method = "psci";
+   };
+
+   cpu6: cpu@102 {
+   device_type = "cpu";
+   compatible = 

[PATCH v4 09/10] dt-bindings: serial: Add compatible for Mediatek MT8183

2018-07-30 Thread Erin Lo
This adds dt-binding documentation of uart for Mediatek MT8183 SoC
Platform.

Signed-off-by: Erin Lo 
Acked-by: Rob Herring 
---
 Documentation/devicetree/bindings/serial/mtk-uart.txt | 1 +
 1 file changed, 1 insertion(+)

diff --git a/Documentation/devicetree/bindings/serial/mtk-uart.txt b/Documentation/devicetree/bindings/serial/mtk-uart.txt
index f73abff..4783336 100644
--- a/Documentation/devicetree/bindings/serial/mtk-uart.txt
+++ b/Documentation/devicetree/bindings/serial/mtk-uart.txt
@@ -15,6 +15,7 @@ Required properties:
   * "mediatek,mt8127-uart" for MT8127 compatible UARTS
   * "mediatek,mt8135-uart" for MT8135 compatible UARTS
   * "mediatek,mt8173-uart" for MT8173 compatible UARTS
+  * "mediatek,mt8183-uart", "mediatek,mt6577-uart" for MT8183 compatible UARTS
   * "mediatek,mt6577-uart" for MT6577 and all of the above
 
 - reg: The base address of the UART register bank.
-- 
1.9.1



[PATCH v4 05/10] clk: mediatek: Add dt-bindings for MT8183 clocks

2018-07-30 Thread Erin Lo
From: Weiyi Lu 

Add MT8183 clock dt-bindings, including topckgen, apmixedsys,
infracfg and subsystem clocks.

Signed-off-by: Weiyi Lu 
Signed-off-by: Erin Lo 
---
 include/dt-bindings/clock/mt8183-clk.h | 413 +
 1 file changed, 413 insertions(+)
 create mode 100644 include/dt-bindings/clock/mt8183-clk.h

diff --git a/include/dt-bindings/clock/mt8183-clk.h b/include/dt-bindings/clock/mt8183-clk.h
new file mode 100644
index 000..bacad53
--- /dev/null
+++ b/include/dt-bindings/clock/mt8183-clk.h
@@ -0,0 +1,413 @@
+/* SPDX-License-Identifier: GPL-2.0
+ *
+ * Copyright (c) 2018 MediaTek Inc.
+ * Author: Weiyi Lu 
+ */
+
+#ifndef _DT_BINDINGS_CLK_MT8183_H
+#define _DT_BINDINGS_CLK_MT8183_H
+
+/* APMIXED */
+#define CLK_APMIXED_ARMPLL_LL       0
+#define CLK_APMIXED_ARMPLL_L        1
+#define CLK_APMIXED_CCIPLL          2
+#define CLK_APMIXED_MAINPLL         3
+#define CLK_APMIXED_UNIV2PLL        4
+#define CLK_APMIXED_MSDCPLL         5
+#define CLK_APMIXED_MMPLL           6
+#define CLK_APMIXED_MFGPLL          7
+#define CLK_APMIXED_TVDPLL          8
+#define CLK_APMIXED_APLL1           9
+#define CLK_APMIXED_APLL2           10
+#define CLK_APMIXED_SSUSB_26M       11
+#define CLK_APMIXED_APPLL_26M       12
+#define CLK_APMIXED_MIPIC0_26M      13
+#define CLK_APMIXED_MDPLLGP_26M     14
+#define CLK_APMIXED_MMSYS_26M       15
+#define CLK_APMIXED_UFS_26M         16
+#define CLK_APMIXED_MIPIC1_26M      17
+#define CLK_APMIXED_MEMPLL_26M      18
+#define CLK_APMIXED_CLKSQ_LVPLL_26M 19
+#define CLK_APMIXED_MIPID0_26M      20
+#define CLK_APMIXED_MIPID1_26M      21
+#define CLK_APMIXED_NR_CLK          22
+
+/* TOPCKGEN */
+#define CLK_TOP_MUX_AXI             0
+#define CLK_TOP_MUX_MM              1
+#define CLK_TOP_MUX_CAM             2
+#define CLK_TOP_MUX_MFG             3
+#define CLK_TOP_MUX_CAMTG           4
+#define CLK_TOP_MUX_UART            5
+#define CLK_TOP_MUX_SPI             6
+#define CLK_TOP_MUX_MSDC50_0_HCLK   7
+#define CLK_TOP_MUX_MSDC50_0        8
+#define CLK_TOP_MUX_MSDC30_1        9
+#define CLK_TOP_MUX_MSDC30_2        10
+#define CLK_TOP_MUX_AUDIO           11
+#define CLK_TOP_MUX_AUD_INTBUS      12
+#define CLK_TOP_MUX_FPWRAP_ULPOSC   13
+#define CLK_TOP_MUX_SCP             14
+#define CLK_TOP_MUX_ATB             15
+#define CLK_TOP_MUX_SSPM            16
+#define CLK_TOP_MUX_DPI0            17
+#define CLK_TOP_MUX_SCAM            18
+#define CLK_TOP_MUX_AUD_1           19
+#define CLK_TOP_MUX_AUD_2           20
+#define CLK_TOP_MUX_DISP_PWM        21
+#define CLK_TOP_MUX_SSUSB_TOP_XHCI  22
+#define CLK_TOP_MUX_USB_TOP         23
+#define CLK_TOP_MUX_SPM             24
+#define CLK_TOP_MUX_I2C             25
+#define CLK_TOP_MUX_F52M_MFG        26
+#define CLK_TOP_MUX_SENINF          27
+#define CLK_TOP_MUX_DXCC            28
+#define CLK_TOP_MUX_CAMTG2          29
+#define CLK_TOP_MUX_AUD_ENG1        30
+#define CLK_TOP_MUX_AUD_ENG2        31
+#define CLK_TOP_MUX_FAES_UFSFDE     32
+#define CLK_TOP_MUX_FUFS            33
+#define CLK_TOP_MUX_IMG             34
+#define CLK_TOP_MUX_DSP             35
+#define CLK_TOP_MUX_DSP1            36
+#define CLK_TOP_MUX_DSP2            37
+#define CLK_TOP_MUX_IPU_IF          38
+#define CLK_TOP_MUX_CAMTG3          39
+#define CLK_TOP_MUX_CAMTG4          40
+#define CLK_TOP_MUX_PMICSPI         41
+#define CLK_TOP_SYSPLL_CK           42
+#define CLK_TOP_SYSPLL_D2           43
+#define CLK_TOP_SYSPLL_D3           44
+#define CLK_TOP_SYSPLL_D5           45
+#define CLK_TOP_SYSPLL_D7           46
+#define CLK_TOP_SYSPLL_D2_D2        47
+#define CLK_TOP_SYSPLL_D2_D4        48
+#define CLK_TOP_SYSPLL_D2_D8        49
+#define CLK_TOP_SYSPLL_D2_D16       50
+#define CLK_TOP_SYSPLL_D3_D2        51
+#define CLK_TOP_SYSPLL_D3_D4        52
+#define CLK_TOP_SYSPLL_D3_D8        53
+#define CLK_TOP_SYSPLL_D5_D2        54
+#define CLK_TOP_SYSPLL_D5_D4        55
+#define CLK_TOP_SYSPLL_D7_D2        56
+#define CLK_TOP_SYSPLL_D7_D4        57
+#define CLK_TOP_UNIVPLL_CK          58
+#define CLK_TOP_UNIVPLL_D2          59
+#define CLK_TOP_UNIVPLL_D3          60
+#define CLK_TOP_UNIVPLL_D5          61
+#define CLK_TOP_UNIVPLL_D7          62
+#define CLK_TOP_UNIVPLL_D2_D2       63
+#define CLK_TOP_UNIVPLL_D2_D4       64
+#define CLK_TOP_UNIVPLL_D2_D8       65
+#define CLK_TOP_UNIVPLL_D3_D2       66
+#define CLK_TOP_UNIVPLL_D3_D4       67
+#define CLK_TOP_UNIVPLL_D3_D8       68
+#define CLK_TOP_UNIVPLL_D5_D2       69
+#define CLK_TOP_UNIVPLL_D5_D4       70
+#define CLK_TOP_UNIVPLL_D5_D8       71
+#define CLK_TOP_APLL1_CK            72
+#define CLK_TOP_APLL1_D2  

[PATCH v4 08/10] arm64: dts: mt8183: Add clock controller device nodes

2018-07-30 Thread Erin Lo
From: Weiyi Lu 

Add clock controller nodes for MT8183, including topckgen, infracfg,
apmixedsys and subsystem.

Signed-off-by: Weiyi Lu 
Signed-off-by: Erin Lo 
---
 arch/arm64/boot/dts/mediatek/mt8183.dtsi | 92 
 1 file changed, 92 insertions(+)

diff --git a/arch/arm64/boot/dts/mediatek/mt8183.dtsi b/arch/arm64/boot/dts/mediatek/mt8183.dtsi
index 1553265..6b87a24 100644
--- a/arch/arm64/boot/dts/mediatek/mt8183.dtsi
+++ b/arch/arm64/boot/dts/mediatek/mt8183.dtsi
@@ -5,6 +5,7 @@
  *Erin Lo 
  */
 
+#include 
 #include 
 #include 
 
@@ -112,6 +113,13 @@
method  = "smc";
};
 
+   clk26m: oscillator@0 {
+   compatible = "fixed-clock";
+   #clock-cells = <0>;
+   clock-frequency = <2600>;
+   clock-output-names = "clk26m";
+   };
+
timer {
compatible = "arm,armv8-timer";
interrupt-parent = <>;
@@ -143,4 +151,88 @@
interrupt-parent = <>;
reg = <0 0x0c530a80 0 0x50>;
};
+
+   topckgen: syscon@1000 {
+   compatible = "mediatek,mt8183-topckgen", "syscon";
+   reg = <0 0x1000 0 0x1000>;
+   #clock-cells = <1>;
+   };
+
+   infracfg: syscon@10001000 {
+   compatible = "mediatek,mt8183-infracfg", "syscon";
+   reg = <0 0x10001000 0 0x1000>;
+   #clock-cells = <1>;
+   };
+
+   apmixedsys: syscon@1000c000 {
+   compatible = "mediatek,mt8183-apmixedsys", "syscon";
+   reg = <0 0x1000c000 0 0x1000>;
+   #clock-cells = <1>;
+   };
+
+   audiosys: syscon@1122 {
+   compatible = "mediatek,mt8183-audiosys", "syscon";
+   reg = <0 0x1122 0 0x1000>;
+   #clock-cells = <1>;
+   };
+
+   mfgcfg: syscon@1300 {
+   compatible = "mediatek,mt8183-mfgcfg", "syscon";
+   reg = <0 0x1300 0 0x1000>;
+   #clock-cells = <1>;
+   };
+
+   mmsys: syscon@1400 {
+   compatible = "mediatek,mt8183-mmsys", "syscon";
+   reg = <0 0x1400 0 0x1000>;
+   #clock-cells = <1>;
+   };
+
+   imgsys: syscon@1502 {
+   compatible = "mediatek,mt8183-imgsys", "syscon";
+   reg = <0 0x1502 0 0x1000>;
+   #clock-cells = <1>;
+   };
+
+   vdecsys: syscon@1600 {
+   compatible = "mediatek,mt8183-vdecsys", "syscon";
+   reg = <0 0x1600 0 0x1000>;
+   #clock-cells = <1>;
+   };
+
+   vencsys: syscon@1700 {
+   compatible = "mediatek,mt8183-vencsys", "syscon";
+   reg = <0 0x1700 0 0x1000>;
+   #clock-cells = <1>;
+   };
+
+   ipu_conn: syscon@1900 {
+   compatible = "mediatek,mt8183-ipu_conn", "syscon";
+   reg = <0 0x1900 0 0x1000>;
+   #clock-cells = <1>;
+   };
+
+   ipu_adl: syscon@1901 {
+   compatible = "mediatek,mt8183-ipu_adl", "syscon";
+   reg = <0 0x1901 0 0x1000>;
+   #clock-cells = <1>;
+   };
+
+   ipu_core0: syscon@1918 {
+   compatible = "mediatek,mt8183-ipu_core0", "syscon";
+   reg = <0 0x1918 0 0x1000>;
+   #clock-cells = <1>;
+   };
+
+   ipu_core1: syscon@1928 {
+   compatible = "mediatek,mt8183-ipu_core1", "syscon";
+   reg = <0 0x1928 0 0x1000>;
+   #clock-cells = <1>;
+   };
+
+   camsys: syscon@1a00 {
+   compatible = "mediatek,mt8183-camsys", "syscon";
+   reg = <0 0x1a00 0 0x1000>;
+   #clock-cells = <1>;
+   };
 };
-- 
1.9.1



[PATCH v4 06/10] clk: mediatek: Add flags support for mtk_gate data

2018-07-30 Thread Erin Lo
From: Weiyi Lu 

On some Mediatek platforms, some critical clocks are of the clock gate
type. To register a clock gate with the CLK_IS_CRITICAL flag, we need to
add a flags field to the mtk_gate data and to the registration APIs.

Signed-off-by: Weiyi Lu 
Signed-off-by: Erin Lo 
---
 drivers/clk/mediatek/clk-gate.c | 5 +++--
 drivers/clk/mediatek/clk-gate.h | 3 ++-
 drivers/clk/mediatek/clk-mtk.c  | 3 ++-
 drivers/clk/mediatek/clk-mtk.h  | 1 +
 4 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/drivers/clk/mediatek/clk-gate.c b/drivers/clk/mediatek/clk-gate.c
index 934bf0e..25d25c3 100644
--- a/drivers/clk/mediatek/clk-gate.c
+++ b/drivers/clk/mediatek/clk-gate.c
@@ -157,7 +157,8 @@ struct clk *mtk_clk_register_gate(
int clr_ofs,
int sta_ofs,
u8 bit,
-   const struct clk_ops *ops)
+   const struct clk_ops *ops,
+   unsigned int flags)
 {
struct mtk_clk_gate *cg;
struct clk *clk;
@@ -168,7 +169,7 @@ struct clk *mtk_clk_register_gate(
return ERR_PTR(-ENOMEM);
 
init.name = name;
-   init.flags = CLK_SET_RATE_PARENT;
+   init.flags = flags | CLK_SET_RATE_PARENT;
init.parent_names = parent_name ? &parent_name : NULL;
init.num_parents = parent_name ? 1 : 0;
init.ops = ops;
diff --git a/drivers/clk/mediatek/clk-gate.h b/drivers/clk/mediatek/clk-gate.h
index 72ef89b..631cd3a 100644
--- a/drivers/clk/mediatek/clk-gate.h
+++ b/drivers/clk/mediatek/clk-gate.h
@@ -47,6 +47,7 @@ struct clk *mtk_clk_register_gate(
int clr_ofs,
int sta_ofs,
u8 bit,
-   const struct clk_ops *ops);
+   const struct clk_ops *ops,
+   unsigned int flags);
 
 #endif /* __DRV_CLK_GATE_H */
diff --git a/drivers/clk/mediatek/clk-mtk.c b/drivers/clk/mediatek/clk-mtk.c
index 50becd0..15310f8 100644
--- a/drivers/clk/mediatek/clk-mtk.c
+++ b/drivers/clk/mediatek/clk-mtk.c
@@ -131,7 +131,8 @@ int mtk_clk_register_gates(struct device_node *node,
gate->regs->set_ofs,
gate->regs->clr_ofs,
gate->regs->sta_ofs,
-   gate->shift, gate->ops);
+   gate->shift, gate->ops,
+   gate->flags);
 
if (IS_ERR(clk)) {
pr_err("Failed to register clk %s: %ld\n",
diff --git a/drivers/clk/mediatek/clk-mtk.h b/drivers/clk/mediatek/clk-mtk.h
index 61693f6..c3285ff 100644
--- a/drivers/clk/mediatek/clk-mtk.h
+++ b/drivers/clk/mediatek/clk-mtk.h
@@ -217,6 +217,7 @@ struct mtk_gate {
const struct mtk_gate_regs *regs;
int shift;
const struct clk_ops *ops;
+   unsigned int flags;
 };
 
 int mtk_clk_register_gates(struct device_node *node,
-- 
1.9.1
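
The effect of the new flags field can be sketched outside the kernel. In the snippet below, struct model_gate and model_gate_init_flags() are hypothetical stand-ins for struct mtk_gate and the flag handling inside mtk_clk_register_gate(), and the MODEL_* bit values are local to this sketch; real code takes CLK_SET_RATE_PARENT and CLK_IS_CRITICAL from <linux/clk-provider.h>.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in flag bits for this sketch only. */
#define MODEL_CLK_SET_RATE_PARENT (1UL << 2)
#define MODEL_CLK_IS_CRITICAL     (1UL << 11)

/* Reduced model of the per-gate data, with the newly added flags field. */
struct model_gate {
	const char *name;
	int shift;
	unsigned int flags;
};

/*
 * Mirrors the patched assignment "init.flags = flags | CLK_SET_RATE_PARENT":
 * per-gate flags are OR-ed in, and CLK_SET_RATE_PARENT is always kept.
 */
static unsigned long model_gate_init_flags(const struct model_gate *gate)
{
	assert(gate != NULL);
	return gate->flags | MODEL_CLK_SET_RATE_PARENT;
}
```

A gate described with flags = 0 keeps the previous behaviour (only the set-rate-parent bit), while a gate described with the critical flag ends up with both bits set. Existing gate tables that do not initialise the new field get 0 implicitly, so they are unaffected.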



[PATCH v4 09/10] dt-bindings: serial: Add compatible for Mediatek MT8183

2018-07-30 Thread Erin Lo
This adds dt-binding documentation of uart for Mediatek MT8183 SoC
Platform.

Signed-off-by: Erin Lo 
Acked-by: Rob Herring 
---
 Documentation/devicetree/bindings/serial/mtk-uart.txt | 1 +
 1 file changed, 1 insertion(+)

diff --git a/Documentation/devicetree/bindings/serial/mtk-uart.txt 
b/Documentation/devicetree/bindings/serial/mtk-uart.txt
index f73abff..4783336 100644
--- a/Documentation/devicetree/bindings/serial/mtk-uart.txt
+++ b/Documentation/devicetree/bindings/serial/mtk-uart.txt
@@ -15,6 +15,7 @@ Required properties:
   * "mediatek,mt8127-uart" for MT8127 compatible UARTS
   * "mediatek,mt8135-uart" for MT8135 compatible UARTS
   * "mediatek,mt8173-uart" for MT8173 compatible UARTS
+  * "mediatek,mt8183-uart", "mediatek,mt6577-uart" for MT8183 compatible UARTS
   * "mediatek,mt6577-uart" for MT6577 and all of the above
 
 - reg: The base address of the UART register bank.
-- 
1.9.1



[PATCH v4 05/10] clk: mediatek: Add dt-bindings for MT8183 clocks

2018-07-30 Thread Erin Lo
From: Weiyi Lu 

Add MT8183 clock dt-bindings, include topckgen, apmixedsys,
infracfg and subsystem clocks.

Signed-off-by: Weiyi Lu 
Signed-off-by: Erin Lo 
---
 include/dt-bindings/clock/mt8183-clk.h | 413 +
 1 file changed, 413 insertions(+)
 create mode 100644 include/dt-bindings/clock/mt8183-clk.h

diff --git a/include/dt-bindings/clock/mt8183-clk.h 
b/include/dt-bindings/clock/mt8183-clk.h
new file mode 100644
index 000..bacad53
--- /dev/null
+++ b/include/dt-bindings/clock/mt8183-clk.h
@@ -0,0 +1,413 @@
+/* SPDX-License-Identifier: GPL-2.0
+ *
+ * Copyright (c) 2018 MediaTek Inc.
+ * Author: Weiyi Lu 
+ */
+
+#ifndef _DT_BINDINGS_CLK_MT8183_H
+#define _DT_BINDINGS_CLK_MT8183_H
+
+/* APMIXED */
+#define CLK_APMIXED_ARMPLL_LL  0
+#define CLK_APMIXED_ARMPLL_L  1
+#define CLK_APMIXED_CCIPLL  2
+#define CLK_APMIXED_MAINPLL  3
+#define CLK_APMIXED_UNIV2PLL  4
+#define CLK_APMIXED_MSDCPLL  5
+#define CLK_APMIXED_MMPLL  6
+#define CLK_APMIXED_MFGPLL  7
+#define CLK_APMIXED_TVDPLL  8
+#define CLK_APMIXED_APLL1  9
+#define CLK_APMIXED_APLL2  10
+#define CLK_APMIXED_SSUSB_26M  11
+#define CLK_APMIXED_APPLL_26M  12
+#define CLK_APMIXED_MIPIC0_26M  13
+#define CLK_APMIXED_MDPLLGP_26M  14
+#define CLK_APMIXED_MMSYS_26M  15
+#define CLK_APMIXED_UFS_26M  16
+#define CLK_APMIXED_MIPIC1_26M  17
+#define CLK_APMIXED_MEMPLL_26M  18
+#define CLK_APMIXED_CLKSQ_LVPLL_26M  19
+#define CLK_APMIXED_MIPID0_26M  20
+#define CLK_APMIXED_MIPID1_26M  21
+#define CLK_APMIXED_NR_CLK  22
+
+/* TOPCKGEN */
+#define CLK_TOP_MUX_AXI  0
+#define CLK_TOP_MUX_MM  1
+#define CLK_TOP_MUX_CAM  2
+#define CLK_TOP_MUX_MFG  3
+#define CLK_TOP_MUX_CAMTG  4
+#define CLK_TOP_MUX_UART  5
+#define CLK_TOP_MUX_SPI  6
+#define CLK_TOP_MUX_MSDC50_0_HCLK  7
+#define CLK_TOP_MUX_MSDC50_0  8
+#define CLK_TOP_MUX_MSDC30_1  9
+#define CLK_TOP_MUX_MSDC30_2  10
+#define CLK_TOP_MUX_AUDIO  11
+#define CLK_TOP_MUX_AUD_INTBUS  12
+#define CLK_TOP_MUX_FPWRAP_ULPOSC  13
+#define CLK_TOP_MUX_SCP  14
+#define CLK_TOP_MUX_ATB  15
+#define CLK_TOP_MUX_SSPM  16
+#define CLK_TOP_MUX_DPI0  17
+#define CLK_TOP_MUX_SCAM  18
+#define CLK_TOP_MUX_AUD_1  19
+#define CLK_TOP_MUX_AUD_2  20
+#define CLK_TOP_MUX_DISP_PWM  21
+#define CLK_TOP_MUX_SSUSB_TOP_XHCI  22
+#define CLK_TOP_MUX_USB_TOP  23
+#define CLK_TOP_MUX_SPM  24
+#define CLK_TOP_MUX_I2C  25
+#define CLK_TOP_MUX_F52M_MFG  26
+#define CLK_TOP_MUX_SENINF  27
+#define CLK_TOP_MUX_DXCC  28
+#define CLK_TOP_MUX_CAMTG2  29
+#define CLK_TOP_MUX_AUD_ENG1  30
+#define CLK_TOP_MUX_AUD_ENG2  31
+#define CLK_TOP_MUX_FAES_UFSFDE  32
+#define CLK_TOP_MUX_FUFS  33
+#define CLK_TOP_MUX_IMG  34
+#define CLK_TOP_MUX_DSP  35
+#define CLK_TOP_MUX_DSP1  36
+#define CLK_TOP_MUX_DSP2  37
+#define CLK_TOP_MUX_IPU_IF  38
+#define CLK_TOP_MUX_CAMTG3  39
+#define CLK_TOP_MUX_CAMTG4  40
+#define CLK_TOP_MUX_PMICSPI  41
+#define CLK_TOP_SYSPLL_CK  42
+#define CLK_TOP_SYSPLL_D2  43
+#define CLK_TOP_SYSPLL_D3  44
+#define CLK_TOP_SYSPLL_D5  45
+#define CLK_TOP_SYSPLL_D7  46
+#define CLK_TOP_SYSPLL_D2_D2  47
+#define CLK_TOP_SYSPLL_D2_D4  48
+#define CLK_TOP_SYSPLL_D2_D8  49
+#define CLK_TOP_SYSPLL_D2_D16  50
+#define CLK_TOP_SYSPLL_D3_D2  51
+#define CLK_TOP_SYSPLL_D3_D4  52
+#define CLK_TOP_SYSPLL_D3_D8  53
+#define CLK_TOP_SYSPLL_D5_D2  54
+#define CLK_TOP_SYSPLL_D5_D4  55
+#define CLK_TOP_SYSPLL_D7_D2  56
+#define CLK_TOP_SYSPLL_D7_D4  57
+#define CLK_TOP_UNIVPLL_CK  58
+#define CLK_TOP_UNIVPLL_D2  59
+#define CLK_TOP_UNIVPLL_D3  60
+#define CLK_TOP_UNIVPLL_D5  61
+#define CLK_TOP_UNIVPLL_D7  62
+#define CLK_TOP_UNIVPLL_D2_D2  63
+#define CLK_TOP_UNIVPLL_D2_D4  64
+#define CLK_TOP_UNIVPLL_D2_D8  65
+#define CLK_TOP_UNIVPLL_D3_D2  66
+#define CLK_TOP_UNIVPLL_D3_D4  67
+#define CLK_TOP_UNIVPLL_D3_D8  68
+#define CLK_TOP_UNIVPLL_D5_D2  69
+#define CLK_TOP_UNIVPLL_D5_D4  70
+#define CLK_TOP_UNIVPLL_D5_D8  71
+#define CLK_TOP_APLL1_CK  72
+#define CLK_TOP_APLL1_D2  

[PATCH v4 08/10] arm64: dts: mt8183: Add clock controller device nodes

2018-07-30 Thread Erin Lo
From: Weiyi Lu 

Add clock controller nodes for MT8183, including topckgen, infracfg,
apmixedsys and the subsystem controllers.

Signed-off-by: Weiyi Lu 
Signed-off-by: Erin Lo 
---
 arch/arm64/boot/dts/mediatek/mt8183.dtsi | 92 
 1 file changed, 92 insertions(+)

diff --git a/arch/arm64/boot/dts/mediatek/mt8183.dtsi b/arch/arm64/boot/dts/mediatek/mt8183.dtsi
index 1553265..6b87a24 100644
--- a/arch/arm64/boot/dts/mediatek/mt8183.dtsi
+++ b/arch/arm64/boot/dts/mediatek/mt8183.dtsi
@@ -5,6 +5,7 @@
  *Erin Lo 
  */
 
+#include 
 #include 
 #include 
 
@@ -112,6 +113,13 @@
method  = "smc";
};
 
+   clk26m: oscillator@0 {
+   compatible = "fixed-clock";
+   #clock-cells = <0>;
+   clock-frequency = <26000000>;
+   clock-output-names = "clk26m";
+   };
+
timer {
compatible = "arm,armv8-timer";
interrupt-parent = <>;
@@ -143,4 +151,88 @@
interrupt-parent = <>;
reg = <0 0x0c530a80 0 0x50>;
};
+
+   topckgen: syscon@1000 {
+   compatible = "mediatek,mt8183-topckgen", "syscon";
+   reg = <0 0x1000 0 0x1000>;
+   #clock-cells = <1>;
+   };
+
+   infracfg: syscon@10001000 {
+   compatible = "mediatek,mt8183-infracfg", "syscon";
+   reg = <0 0x10001000 0 0x1000>;
+   #clock-cells = <1>;
+   };
+
+   apmixedsys: syscon@1000c000 {
+   compatible = "mediatek,mt8183-apmixedsys", "syscon";
+   reg = <0 0x1000c000 0 0x1000>;
+   #clock-cells = <1>;
+   };
+
+   audiosys: syscon@1122 {
+   compatible = "mediatek,mt8183-audiosys", "syscon";
+   reg = <0 0x1122 0 0x1000>;
+   #clock-cells = <1>;
+   };
+
+   mfgcfg: syscon@1300 {
+   compatible = "mediatek,mt8183-mfgcfg", "syscon";
+   reg = <0 0x1300 0 0x1000>;
+   #clock-cells = <1>;
+   };
+
+   mmsys: syscon@1400 {
+   compatible = "mediatek,mt8183-mmsys", "syscon";
+   reg = <0 0x1400 0 0x1000>;
+   #clock-cells = <1>;
+   };
+
+   imgsys: syscon@1502 {
+   compatible = "mediatek,mt8183-imgsys", "syscon";
+   reg = <0 0x1502 0 0x1000>;
+   #clock-cells = <1>;
+   };
+
+   vdecsys: syscon@1600 {
+   compatible = "mediatek,mt8183-vdecsys", "syscon";
+   reg = <0 0x1600 0 0x1000>;
+   #clock-cells = <1>;
+   };
+
+   vencsys: syscon@1700 {
+   compatible = "mediatek,mt8183-vencsys", "syscon";
+   reg = <0 0x1700 0 0x1000>;
+   #clock-cells = <1>;
+   };
+
+   ipu_conn: syscon@1900 {
+   compatible = "mediatek,mt8183-ipu_conn", "syscon";
+   reg = <0 0x1900 0 0x1000>;
+   #clock-cells = <1>;
+   };
+
+   ipu_adl: syscon@1901 {
+   compatible = "mediatek,mt8183-ipu_adl", "syscon";
+   reg = <0 0x1901 0 0x1000>;
+   #clock-cells = <1>;
+   };
+
+   ipu_core0: syscon@1918 {
+   compatible = "mediatek,mt8183-ipu_core0", "syscon";
+   reg = <0 0x1918 0 0x1000>;
+   #clock-cells = <1>;
+   };
+
+   ipu_core1: syscon@1928 {
+   compatible = "mediatek,mt8183-ipu_core1", "syscon";
+   reg = <0 0x1928 0 0x1000>;
+   #clock-cells = <1>;
+   };
+
+   camsys: syscon@1a00 {
+   compatible = "mediatek,mt8183-camsys", "syscon";
+   reg = <0 0x1a00 0 0x1000>;
+   #clock-cells = <1>;
+   };
 };
-- 
1.9.1
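
Each controller node above sets #clock-cells = <1>, so a consumer's clock specifier carries a single cell: an index taken from mt8183-clk.h. A minimal sketch of that lookup in plain user-space C follows, assuming a simple name table. The three constants are copied from the header hunk in patch 05/10; `struct demo_clk_table`, `demo_clk_get()` and the clock name strings are hypothetical and are not the kernel's provider API.

```c
#include <stddef.h>

/* Values copied from the mt8183-clk.h hunk in patch 05/10. */
#define CLK_APMIXED_ARMPLL_LL	0
#define CLK_APMIXED_MMPLL	6
#define CLK_APMIXED_NR_CLK	22

/* Hypothetical stand-in for a provider's per-controller clock table. */
struct demo_clk_table {
	const char *names[CLK_APMIXED_NR_CLK];
};

/*
 * Resolve a one-cell clock specifier: the cell is an index into the table.
 * Returns NULL for an out-of-range or unpopulated index.
 */
static const char *demo_clk_get(const struct demo_clk_table *t,
				unsigned int cell)
{
	if (cell >= CLK_APMIXED_NR_CLK)
		return NULL;
	return t->names[cell];
}
```

The NR_CLK constant doubles as the table size and the bounds check, which is why each controller's ID list in the header ends with one.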



[PATCH v4 06/10] clk: mediatek: Add flags support for mtk_gate data

2018-07-30 Thread Erin Lo
From: Weiyi Lu 

On some Mediatek platforms, certain critical clocks are implemented as
clock gates.
To register such a gate with the CLK_IS_CRITICAL flag, add a flags
field to the mtk_gate data and to the gate registration APIs.

Signed-off-by: Weiyi Lu 
Signed-off-by: Erin Lo 
---
 drivers/clk/mediatek/clk-gate.c | 5 +++--
 drivers/clk/mediatek/clk-gate.h | 3 ++-
 drivers/clk/mediatek/clk-mtk.c  | 3 ++-
 drivers/clk/mediatek/clk-mtk.h  | 1 +
 4 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/drivers/clk/mediatek/clk-gate.c b/drivers/clk/mediatek/clk-gate.c
index 934bf0e..25d25c3 100644
--- a/drivers/clk/mediatek/clk-gate.c
+++ b/drivers/clk/mediatek/clk-gate.c
@@ -157,7 +157,8 @@ struct clk *mtk_clk_register_gate(
int clr_ofs,
int sta_ofs,
u8 bit,
-   const struct clk_ops *ops)
+   const struct clk_ops *ops,
+   unsigned int flags)
 {
struct mtk_clk_gate *cg;
struct clk *clk;
@@ -168,7 +169,7 @@ struct clk *mtk_clk_register_gate(
return ERR_PTR(-ENOMEM);
 
init.name = name;
-   init.flags = CLK_SET_RATE_PARENT;
+   init.flags = flags | CLK_SET_RATE_PARENT;
init.parent_names = parent_name ? &parent_name : NULL;
init.num_parents = parent_name ? 1 : 0;
init.ops = ops;
diff --git a/drivers/clk/mediatek/clk-gate.h b/drivers/clk/mediatek/clk-gate.h
index 72ef89b..631cd3a 100644
--- a/drivers/clk/mediatek/clk-gate.h
+++ b/drivers/clk/mediatek/clk-gate.h
@@ -47,6 +47,7 @@ struct clk *mtk_clk_register_gate(
int clr_ofs,
int sta_ofs,
u8 bit,
-   const struct clk_ops *ops);
+   const struct clk_ops *ops,
+   unsigned int flags);
 
 #endif /* __DRV_CLK_GATE_H */
diff --git a/drivers/clk/mediatek/clk-mtk.c b/drivers/clk/mediatek/clk-mtk.c
index 50becd0..15310f8 100644
--- a/drivers/clk/mediatek/clk-mtk.c
+++ b/drivers/clk/mediatek/clk-mtk.c
@@ -131,7 +131,8 @@ int mtk_clk_register_gates(struct device_node *node,
gate->regs->set_ofs,
gate->regs->clr_ofs,
gate->regs->sta_ofs,
-   gate->shift, gate->ops);
+   gate->shift, gate->ops,
+   gate->flags);
 
if (IS_ERR(clk)) {
pr_err("Failed to register clk %s: %ld\n",
diff --git a/drivers/clk/mediatek/clk-mtk.h b/drivers/clk/mediatek/clk-mtk.h
index 61693f6..c3285ff 100644
--- a/drivers/clk/mediatek/clk-mtk.h
+++ b/drivers/clk/mediatek/clk-mtk.h
@@ -217,6 +217,7 @@ struct mtk_gate {
const struct mtk_gate_regs *regs;
int shift;
const struct clk_ops *ops;
+   unsigned int flags;
 };
 
 int mtk_clk_register_gates(struct device_node *node,
-- 
1.9.1
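
To illustrate the effect of the patch above, here is a minimal user-space C sketch of how the per-gate flags combine with the flag the gate code always sets. It is a reduced model, not the driver itself: `struct demo_gate` and `demo_gate_init_flags()` are hypothetical, and the DEMO_ flag bits are stand-ins for the real definitions in include/linux/clk-provider.h.

```c
/*
 * Stand-in flag bits for illustration; the kernel defines the real
 * CLK_SET_RATE_PARENT and CLK_IS_CRITICAL in include/linux/clk-provider.h.
 */
#define DEMO_CLK_SET_RATE_PARENT	(1u << 2)
#define DEMO_CLK_IS_CRITICAL		(1u << 11)

/* Reduced model of struct mtk_gate: only the fields this sketch needs. */
struct demo_gate {
	const char *name;
	unsigned int flags;	/* the per-gate flags field added by the patch */
};

/* Mirrors the patched assignment: init.flags = flags | CLK_SET_RATE_PARENT. */
static unsigned int demo_gate_init_flags(const struct demo_gate *gate)
{
	return gate->flags | DEMO_CLK_SET_RATE_PARENT;
}
```

A gate whose flags field is left at zero ends up with the same flags as before the patch, so existing gate tables keep their behaviour and only gates that opt in gain the critical flag.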



[PATCH v4 02/10] dt-bindings: mtk-sysirq: Add compatible for Mediatek MT8183

2018-07-30 Thread Erin Lo
Add dt-binding documentation for SYSIRQ on the Mediatek MT8183 SoC
platform.

Signed-off-by: Erin Lo 
Acked-by: Rob Herring 
---
 .../devicetree/bindings/interrupt-controller/mediatek,sysirq.txt | 1 +
 1 file changed, 1 insertion(+)

diff --git a/Documentation/devicetree/bindings/interrupt-controller/mediatek,sysirq.txt b/Documentation/devicetree/bindings/interrupt-controller/mediatek,sysirq.txt
index 07bf0b9..5ff48a8 100644
--- a/Documentation/devicetree/bindings/interrupt-controller/mediatek,sysirq.txt
+++ b/Documentation/devicetree/bindings/interrupt-controller/mediatek,sysirq.txt
@@ -5,6 +5,7 @@ interrupt.
 
 Required properties:
 - compatible: should be
+   "mediatek,mt8183-sysirq", "mediatek,mt6577-sysirq": for MT8183
"mediatek,mt8173-sysirq", "mediatek,mt6577-sysirq": for MT8173
"mediatek,mt8135-sysirq", "mediatek,mt6577-sysirq": for MT8135
"mediatek,mt8127-sysirq", "mediatek,mt6577-sysirq": for MT8127
-- 
1.9.1



[PATCH v4 07/10] clk: mediatek: Add MT8183 clock support

2018-07-30 Thread Erin Lo
From: Weiyi Lu 

Add MT8183 clock support, including topckgen, apmixedsys,
infracfg and subsystem clocks.

Signed-off-by: Weiyi Lu 
Signed-off-by: Erin Lo 
---
 drivers/clk/mediatek/Kconfig   |   74 ++
 drivers/clk/mediatek/Makefile  |   12 +
 drivers/clk/mediatek/clk-mt8183-audio.c|  112 +++
 drivers/clk/mediatek/clk-mt8183-cam.c  |   75 ++
 drivers/clk/mediatek/clk-mt8183-img.c  |   75 ++
 drivers/clk/mediatek/clk-mt8183-ipu0.c |   68 ++
 drivers/clk/mediatek/clk-mt8183-ipu1.c |   68 ++
 drivers/clk/mediatek/clk-mt8183-ipu_adl.c  |   66 ++
 drivers/clk/mediatek/clk-mt8183-ipu_conn.c |  155 
 drivers/clk/mediatek/clk-mt8183-mfgcfg.c   |   66 ++
 drivers/clk/mediatek/clk-mt8183-mm.c   |  128 +++
 drivers/clk/mediatek/clk-mt8183-vdec.c |   84 ++
 drivers/clk/mediatek/clk-mt8183-venc.c |   71 ++
 drivers/clk/mediatek/clk-mt8183.c  | 1230 
 14 files changed, 2284 insertions(+)
 create mode 100644 drivers/clk/mediatek/clk-mt8183-audio.c
 create mode 100644 drivers/clk/mediatek/clk-mt8183-cam.c
 create mode 100644 drivers/clk/mediatek/clk-mt8183-img.c
 create mode 100644 drivers/clk/mediatek/clk-mt8183-ipu0.c
 create mode 100644 drivers/clk/mediatek/clk-mt8183-ipu1.c
 create mode 100644 drivers/clk/mediatek/clk-mt8183-ipu_adl.c
 create mode 100644 drivers/clk/mediatek/clk-mt8183-ipu_conn.c
 create mode 100644 drivers/clk/mediatek/clk-mt8183-mfgcfg.c
 create mode 100644 drivers/clk/mediatek/clk-mt8183-mm.c
 create mode 100644 drivers/clk/mediatek/clk-mt8183-vdec.c
 create mode 100644 drivers/clk/mediatek/clk-mt8183-venc.c
 create mode 100644 drivers/clk/mediatek/clk-mt8183.c

diff --git a/drivers/clk/mediatek/Kconfig b/drivers/clk/mediatek/Kconfig
index 95e5e52..e70c164 100644
--- a/drivers/clk/mediatek/Kconfig
+++ b/drivers/clk/mediatek/Kconfig
@@ -194,6 +194,80 @@ config COMMON_CLK_MT8173
---help---
  This driver supports MediaTek MT8173 clocks.
 
+config COMMON_CLK_MT8183
+   bool "Clock driver for MediaTek MT8183"
+   depends on (ARCH_MEDIATEK && ARM64) || COMPILE_TEST
+   select COMMON_CLK_MEDIATEK
+   default ARCH_MEDIATEK && ARM64
+   help
+ This driver supports MediaTek MT8183 basic clocks.
+
+config COMMON_CLK_MT8183_AUDIOSYS
+   bool "Clock driver for MediaTek MT8183 audiosys"
+   depends on COMMON_CLK_MT8183
+   help
+ This driver supports MediaTek MT8183 audiosys clocks.
+
+config COMMON_CLK_MT8183_CAMSYS
+   bool "Clock driver for MediaTek MT8183 camsys"
+   depends on COMMON_CLK_MT8183
+   help
+ This driver supports MediaTek MT8183 camsys clocks.
+
+config COMMON_CLK_MT8183_IMGSYS
+   bool "Clock driver for MediaTek MT8183 imgsys"
+   depends on COMMON_CLK_MT8183
+   help
+ This driver supports MediaTek MT8183 imgsys clocks.
+
+config COMMON_CLK_MT8183_IPU_CORE0
+   bool "Clock driver for MediaTek MT8183 ipu_core0"
+   depends on COMMON_CLK_MT8183
+   help
+ This driver supports MediaTek MT8183 ipu_core0 clocks.
+
+config COMMON_CLK_MT8183_IPU_CORE1
+   bool "Clock driver for MediaTek MT8183 ipu_core1"
+   depends on COMMON_CLK_MT8183
+   help
+ This driver supports MediaTek MT8183 ipu_core1 clocks.
+
+config COMMON_CLK_MT8183_IPU_ADL
+   bool "Clock driver for MediaTek MT8183 ipu_adl"
+   depends on COMMON_CLK_MT8183
+   help
+ This driver supports MediaTek MT8183 ipu_adl clocks.
+
+config COMMON_CLK_MT8183_IPU_CONN
+   bool "Clock driver for MediaTek MT8183 ipu_conn"
+   depends on COMMON_CLK_MT8183
+   help
+ This driver supports MediaTek MT8183 ipu_conn clocks.
+
+config COMMON_CLK_MT8183_MFGCFG
+   bool "Clock driver for MediaTek MT8183 mfgcfg"
+   depends on COMMON_CLK_MT8183
+   help
+ This driver supports MediaTek MT8183 mfgcfg clocks.
+
+config COMMON_CLK_MT8183_MMSYS
+   bool "Clock driver for MediaTek MT8183 mmsys"
+   depends on COMMON_CLK_MT8183
+   help
+ This driver supports MediaTek MT8183 mmsys clocks.
+
+config COMMON_CLK_MT8183_VDECSYS
+   bool "Clock driver for MediaTek MT8183 vdecsys"
+   depends on COMMON_CLK_MT8183
+   help
+ This driver supports MediaTek MT8183 vdecsys clocks.
+
+config COMMON_CLK_MT8183_VENCSYS
+   bool "Clock driver for MediaTek MT8183 vencsys"
+   depends on COMMON_CLK_MT8183
+   help
+ This driver supports MediaTek MT8183 vencsys clocks.
+
 config COMMON_CLK_MT6765
bool "Clock driver for MediaTek MT6765"
depends on (ARCH_MEDIATEK && ARM64) || COMPILE_TEST
diff --git a/drivers/clk/mediatek/Makefile b/drivers/clk/mediatek/Makefile
index b455a8e..13e6919 100644
--- a/drivers/clk/mediatek/Makefile
+++ b/drivers/clk/mediatek/Makefile
@@ -35,3 +35,15 @@ obj-$(CONFIG_COMMON_CLK_MT7622_HIFSYS) += clk-mt7622-hif.o
 obj-$(CONFIG_COMMON_CLK_MT7622_AUDSYS) += clk-mt7622-aud.o
 

[PATCH v4 10/10] dts: arm64: mt8183: add uart node

2018-07-30 Thread Erin Lo
From: Weiyi Lu 

Add uart node with correct uart clocks.

Signed-off-by: Erin Lo 
Signed-off-by: Weiyi Lu 
---
 arch/arm64/boot/dts/mediatek/mt8183-evb.dts |  8 
 arch/arm64/boot/dts/mediatek/mt8183.dtsi| 30 +
 2 files changed, 38 insertions(+)

diff --git a/arch/arm64/boot/dts/mediatek/mt8183-evb.dts b/arch/arm64/boot/dts/mediatek/mt8183-evb.dts
index 2a3dd5a..9b52559 100644
--- a/arch/arm64/boot/dts/mediatek/mt8183-evb.dts
+++ b/arch/arm64/boot/dts/mediatek/mt8183-evb.dts
@@ -12,6 +12,10 @@
model = "MediaTek MT8183 evaluation board";
compatible = "mediatek,mt8183-evb", "mediatek,mt8183";
 
+   aliases {
+   serial0 = 
+   };
+
memory@4000 {
device_type = "memory";
reg = <0 0x4000 0 0x8000>;
@@ -21,3 +25,7 @@
stdout-path = "serial0:921600n8";
};
 };
+
+ {
+   status = "okay";
+};
diff --git a/arch/arm64/boot/dts/mediatek/mt8183.dtsi b/arch/arm64/boot/dts/mediatek/mt8183.dtsi
index 6b87a24..c22a2dc 100644
--- a/arch/arm64/boot/dts/mediatek/mt8183.dtsi
+++ b/arch/arm64/boot/dts/mediatek/mt8183.dtsi
@@ -170,6 +170,36 @@
#clock-cells = <1>;
};
 
+   uart0: serial@11002000 {
+   compatible = "mediatek,mt8183-uart",
+"mediatek,mt6577-uart";
+   reg = <0 0x11002000 0 0x1000>;
+   interrupts = ;
+   clocks = <>, < CLK_INFRA_UART0>;
+   clock-names = "baud", "bus";
+   status = "disabled";
+   };
+
+   uart1: serial@11003000 {
+   compatible = "mediatek,mt8183-uart",
+"mediatek,mt6577-uart";
+   reg = <0 0x11003000 0 0x1000>;
+   interrupts = ;
+   clocks = <>, < CLK_INFRA_UART1>;
+   clock-names = "baud", "bus";
+   status = "disabled";
+   };
+
+   uart2: serial@11004000 {
+   compatible = "mediatek,mt8183-uart",
+"mediatek,mt6577-uart";
+   reg = <0 0x11004000 0 0x1000>;
+   interrupts = ;
+   clocks = <>, < CLK_INFRA_UART2>;
+   clock-names = "baud", "bus";
+   status = "disabled";
+   };
+
audiosys: syscon@1122 {
compatible = "mediatek,mt8183-audiosys", "syscon";
reg = <0 0x1122 0 0x1000>;
-- 
1.9.1



[PATCH v4 00/10] Add basic and clock support for Mediatek MT8183 SoC

2018-07-30 Thread Erin Lo
MT8183 is an SoC based on the 64-bit ARMv8 architecture.
It contains 4 CA53 and 4 CA73 cores.
MT8183 shares many HW IPs with the MT65xx series.
This patchset was tested on the MT8183 evaluation board, booting to a
shell with the correct clocks.

This series contains binding documentation and device tree sources
covering interrupts, UART and clocks.

Based on v4.18-rc1 and https://patchwork.kernel.org/patch/10528515/
Composed of clock control (PATCH 5-8) and device tree (PATCH 9-10)

Change in v4:
1. Correct syntax error in dtsi
2. Add MT8183 clock support

Change in v3:
1. Fill out GICC, GICH, GICV regions
2. Update Copyright to 2018

Change in v2:
1. Split dt-bindings into different patches
2. Correct bindings for supported SoCs (mtk-uart.txt)

Ben Ho (1):
  arm64: dts: Add Mediatek SoC MT8183 and evaluation board dts and
Makefile

Erin Lo (3):
  dt-bindings: arm: Add bindings for Mediatek MT8183 SoC Platform
  dt-bindings: mtk-sysirq: Add compatible for Mediatek MT8183
  dt-bindings: serial: Add compatible for Mediatek MT8183

Weiyi Lu (6):
  dt-bindings: ARM: Mediatek: Document bindings for MT8183
  clk: mediatek: Add dt-bindings for MT8183 clocks
  clk: mediatek: Add flags support for mtk_gate data
  clk: mediatek: Add MT8183 clock support
  arm64: dts: mt8183: Add clock controller device nodes
  dts: arm64: mt8183: add uart node

 Documentation/devicetree/bindings/arm/mediatek.txt |4 +
 .../bindings/arm/mediatek/mediatek,apmixedsys.txt  |1 +
 .../bindings/arm/mediatek/mediatek,audsys.txt  |1 +
 .../bindings/arm/mediatek/mediatek,camsys.txt  |1 +
 .../bindings/arm/mediatek/mediatek,imgsys.txt  |1 +
 .../bindings/arm/mediatek/mediatek,infracfg.txt|1 +
 .../bindings/arm/mediatek/mediatek,ipu.txt |   43 +
 .../bindings/arm/mediatek/mediatek,mfgcfg.txt  |1 +
 .../bindings/arm/mediatek/mediatek,mmsys.txt   |1 +
 .../bindings/arm/mediatek/mediatek,topckgen.txt|1 +
 .../bindings/arm/mediatek/mediatek,vdecsys.txt |1 +
 .../bindings/arm/mediatek/mediatek,vencsys.txt |1 +
 .../interrupt-controller/mediatek,sysirq.txt   |1 +
 .../devicetree/bindings/serial/mtk-uart.txt|1 +
 arch/arm64/boot/dts/mediatek/Makefile  |1 +
 arch/arm64/boot/dts/mediatek/mt8183-evb.dts|   31 +
 arch/arm64/boot/dts/mediatek/mt8183.dtsi   |  268 +
 drivers/clk/mediatek/Kconfig   |   74 ++
 drivers/clk/mediatek/Makefile  |   12 +
 drivers/clk/mediatek/clk-gate.c|5 +-
 drivers/clk/mediatek/clk-gate.h|3 +-
 drivers/clk/mediatek/clk-mt8183-audio.c|  112 ++
 drivers/clk/mediatek/clk-mt8183-cam.c  |   75 ++
 drivers/clk/mediatek/clk-mt8183-img.c  |   75 ++
 drivers/clk/mediatek/clk-mt8183-ipu0.c |   68 ++
 drivers/clk/mediatek/clk-mt8183-ipu1.c |   68 ++
 drivers/clk/mediatek/clk-mt8183-ipu_adl.c  |   66 ++
 drivers/clk/mediatek/clk-mt8183-ipu_conn.c |  155 +++
 drivers/clk/mediatek/clk-mt8183-mfgcfg.c   |   66 ++
 drivers/clk/mediatek/clk-mt8183-mm.c   |  128 ++
 drivers/clk/mediatek/clk-mt8183-vdec.c |   84 ++
 drivers/clk/mediatek/clk-mt8183-venc.c |   71 ++
 drivers/clk/mediatek/clk-mt8183.c  | 1230 
 drivers/clk/mediatek/clk-mtk.c |3 +-
 drivers/clk/mediatek/clk-mtk.h |1 +
 include/dt-bindings/clock/mt8183-clk.h |  413 +++
 36 files changed, 3064 insertions(+), 4 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/arm/mediatek/mediatek,ipu.txt
 create mode 100644 arch/arm64/boot/dts/mediatek/mt8183-evb.dts
 create mode 100644 arch/arm64/boot/dts/mediatek/mt8183.dtsi
 create mode 100644 drivers/clk/mediatek/clk-mt8183-audio.c
 create mode 100644 drivers/clk/mediatek/clk-mt8183-cam.c
 create mode 100644 drivers/clk/mediatek/clk-mt8183-img.c
 create mode 100644 drivers/clk/mediatek/clk-mt8183-ipu0.c
 create mode 100644 drivers/clk/mediatek/clk-mt8183-ipu1.c
 create mode 100644 drivers/clk/mediatek/clk-mt8183-ipu_adl.c
 create mode 100644 drivers/clk/mediatek/clk-mt8183-ipu_conn.c
 create mode 100644 drivers/clk/mediatek/clk-mt8183-mfgcfg.c
 create mode 100644 drivers/clk/mediatek/clk-mt8183-mm.c
 create mode 100644 drivers/clk/mediatek/clk-mt8183-vdec.c
 create mode 100644 drivers/clk/mediatek/clk-mt8183-venc.c
 create mode 100644 drivers/clk/mediatek/clk-mt8183.c
 create mode 100644 include/dt-bindings/clock/mt8183-clk.h

--
1.9.1



[PATCH v4 04/10] dt-bindings: ARM: Mediatek: Document bindings for MT8183

2018-07-30 Thread Erin Lo
From: Weiyi Lu 

This patch adds the binding documentation for apmixedsys, audiosys,
camsys, imgsys, infracfg, mfgcfg, mmsys, topckgen, vdecsys, vencsys
and ipu for Mediatek MT8183.

Signed-off-by: Weiyi Lu 
Signed-off-by: Erin Lo 
---
 .../bindings/arm/mediatek/mediatek,apmixedsys.txt  |  1 +
 .../bindings/arm/mediatek/mediatek,audsys.txt  |  1 +
 .../bindings/arm/mediatek/mediatek,camsys.txt  |  1 +
 .../bindings/arm/mediatek/mediatek,imgsys.txt  |  1 +
 .../bindings/arm/mediatek/mediatek,infracfg.txt|  1 +
 .../bindings/arm/mediatek/mediatek,ipu.txt | 43 ++
 .../bindings/arm/mediatek/mediatek,mfgcfg.txt  |  1 +
 .../bindings/arm/mediatek/mediatek,mmsys.txt   |  1 +
 .../bindings/arm/mediatek/mediatek,topckgen.txt|  1 +
 .../bindings/arm/mediatek/mediatek,vdecsys.txt |  1 +
 .../bindings/arm/mediatek/mediatek,vencsys.txt |  1 +
 11 files changed, 53 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/arm/mediatek/mediatek,ipu.txt

diff --git a/Documentation/devicetree/bindings/arm/mediatek/mediatek,apmixedsys.txt b/Documentation/devicetree/bindings/arm/mediatek/mediatek,apmixedsys.txt
index 44eaeac..fddcec8 100644
--- a/Documentation/devicetree/bindings/arm/mediatek/mediatek,apmixedsys.txt
+++ b/Documentation/devicetree/bindings/arm/mediatek/mediatek,apmixedsys.txt
@@ -13,6 +13,7 @@ Required Properties:
- "mediatek,mt7622-apmixedsys"
- "mediatek,mt8135-apmixedsys"
- "mediatek,mt8173-apmixedsys"
+   - "mediatek,mt8183-apmixedsys", "syscon"
 - #clock-cells: Must be 1
 
 The apmixedsys controller uses the common clk binding from
diff --git a/Documentation/devicetree/bindings/arm/mediatek/mediatek,audsys.txt b/Documentation/devicetree/bindings/arm/mediatek/mediatek,audsys.txt
index 9a8672a..63dcc82 100644
--- a/Documentation/devicetree/bindings/arm/mediatek/mediatek,audsys.txt
+++ b/Documentation/devicetree/bindings/arm/mediatek/mediatek,audsys.txt
@@ -9,6 +9,7 @@ Required Properties:
- "mediatek,mt2701-audsys", "syscon"
- "mediatek,mt6765-audsys", "syscon"
- "mediatek,mt7622-audsys", "syscon"
+   - "mediatek,mt8183-audiosys", "syscon"
 - #clock-cells: Must be 1
 
 The AUDSYS controller uses the common clk binding from
diff --git a/Documentation/devicetree/bindings/arm/mediatek/mediatek,camsys.txt b/Documentation/devicetree/bindings/arm/mediatek/mediatek,camsys.txt
index dc75783..918ccb6 100644
--- a/Documentation/devicetree/bindings/arm/mediatek/mediatek,camsys.txt
+++ b/Documentation/devicetree/bindings/arm/mediatek/mediatek,camsys.txt
@@ -7,6 +7,7 @@ Required Properties:
 
 - compatible: Should be one of:
- "mediatek,mt6765-camsys", "syscon"
+   - "mediatek,mt8183-camsys", "syscon"
 - #clock-cells: Must be 1
 
 The AUDSYS controller uses the common clk binding from
diff --git a/Documentation/devicetree/bindings/arm/mediatek/mediatek,imgsys.txt b/Documentation/devicetree/bindings/arm/mediatek/mediatek,imgsys.txt
index c7057d0..aeee5c8 100644
--- a/Documentation/devicetree/bindings/arm/mediatek/mediatek,imgsys.txt
+++ b/Documentation/devicetree/bindings/arm/mediatek/mediatek,imgsys.txt
@@ -11,6 +11,7 @@ Required Properties:
- "mediatek,mt6765-imgsys", "syscon"
- "mediatek,mt6797-imgsys", "syscon"
- "mediatek,mt8173-imgsys", "syscon"
+   - "mediatek,mt8183-imgsys", "syscon"
 - #clock-cells: Must be 1
 
 The imgsys controller uses the common clk binding from
diff --git a/Documentation/devicetree/bindings/arm/mediatek/mediatek,infracfg.txt b/Documentation/devicetree/bindings/arm/mediatek/mediatek,infracfg.txt
index ac6aae5..1b292ec 100644
--- a/Documentation/devicetree/bindings/arm/mediatek/mediatek,infracfg.txt
+++ b/Documentation/devicetree/bindings/arm/mediatek/mediatek,infracfg.txt
@@ -14,6 +14,7 @@ Required Properties:
- "mediatek,mt7622-infracfg", "syscon"
- "mediatek,mt8135-infracfg", "syscon"
- "mediatek,mt8173-infracfg", "syscon"
+   - "mediatek,mt8183-infracfg", "syscon"
 - #clock-cells: Must be 1
 - #reset-cells: Must be 1
 
diff --git a/Documentation/devicetree/bindings/arm/mediatek/mediatek,ipu.txt b/Documentation/devicetree/bindings/arm/mediatek/mediatek,ipu.txt
new file mode 100644
index 000..aabc8c5
--- /dev/null
+++ b/Documentation/devicetree/bindings/arm/mediatek/mediatek,ipu.txt
@@ -0,0 +1,43 @@
+Mediatek IPU controller
+
+
+The Mediatek ipu controller provides various clocks to the system.
+
+Required Properties:
+
+- compatible: Should be one of:
+   - "mediatek,mt8183-ipu_conn", "syscon"
+   - "mediatek,mt8183-ipu_adl", "syscon"
+   - "mediatek,mt8183-ipu_core0", "syscon"
+   - "mediatek,mt8183-ipu_core1", "syscon"
+- #clock-cells: Must be 1
+
+The ipu controller uses the common clk binding from
+Documentation/devicetree/bindings/clock/clock-bindings.txt
+The available clocks are defined in dt-bindings/clock/mt*-clk.h.
+
+Example:

[PATCH v4 01/10] dt-bindings: arm: Add bindings for Mediatek MT8183 SoC Platform

2018-07-30 Thread Erin Lo
This adds dt-binding documentation of cpu for Mediatek MT8183.

Signed-off-by: Erin Lo 
Reviewed-by: Rob Herring 
---
 Documentation/devicetree/bindings/arm/mediatek.txt | 4 
 1 file changed, 4 insertions(+)

diff --git a/Documentation/devicetree/bindings/arm/mediatek.txt b/Documentation/devicetree/bindings/arm/mediatek.txt
index 7d21ab3..2754535 100644
--- a/Documentation/devicetree/bindings/arm/mediatek.txt
+++ b/Documentation/devicetree/bindings/arm/mediatek.txt
@@ -19,6 +19,7 @@ compatible: Must contain one of
"mediatek,mt8127"
"mediatek,mt8135"
"mediatek,mt8173"
+   "mediatek,mt8183"
 
 
 Supported boards:
@@ -73,3 +74,6 @@ Supported boards:
 - MTK mt8173 tablet EVB:
 Required root node properties:
   - compatible = "mediatek,mt8173-evb", "mediatek,mt8173";
+- Evaluation board for MT8183:
+Required root node properties:
+  - compatible = "mediatek,mt8183-evb", "mediatek,mt8183";
-- 
1.9.1



[PATCH v4 07/10] clk: mediatek: Add MT8183 clock support

2018-07-30 Thread Erin Lo
From: Weiyi Lu 

Add MT8183 clock support, including topckgen, apmixedsys,
infracfg and subsystem clocks.

Signed-off-by: Weiyi Lu 
Signed-off-by: Erin Lo 
---
 drivers/clk/mediatek/Kconfig   |   74 ++
 drivers/clk/mediatek/Makefile  |   12 +
 drivers/clk/mediatek/clk-mt8183-audio.c|  112 +++
 drivers/clk/mediatek/clk-mt8183-cam.c  |   75 ++
 drivers/clk/mediatek/clk-mt8183-img.c  |   75 ++
 drivers/clk/mediatek/clk-mt8183-ipu0.c |   68 ++
 drivers/clk/mediatek/clk-mt8183-ipu1.c |   68 ++
 drivers/clk/mediatek/clk-mt8183-ipu_adl.c  |   66 ++
 drivers/clk/mediatek/clk-mt8183-ipu_conn.c |  155 
 drivers/clk/mediatek/clk-mt8183-mfgcfg.c   |   66 ++
 drivers/clk/mediatek/clk-mt8183-mm.c   |  128 +++
 drivers/clk/mediatek/clk-mt8183-vdec.c |   84 ++
 drivers/clk/mediatek/clk-mt8183-venc.c |   71 ++
 drivers/clk/mediatek/clk-mt8183.c  | 1230 
 14 files changed, 2284 insertions(+)
 create mode 100644 drivers/clk/mediatek/clk-mt8183-audio.c
 create mode 100644 drivers/clk/mediatek/clk-mt8183-cam.c
 create mode 100644 drivers/clk/mediatek/clk-mt8183-img.c
 create mode 100644 drivers/clk/mediatek/clk-mt8183-ipu0.c
 create mode 100644 drivers/clk/mediatek/clk-mt8183-ipu1.c
 create mode 100644 drivers/clk/mediatek/clk-mt8183-ipu_adl.c
 create mode 100644 drivers/clk/mediatek/clk-mt8183-ipu_conn.c
 create mode 100644 drivers/clk/mediatek/clk-mt8183-mfgcfg.c
 create mode 100644 drivers/clk/mediatek/clk-mt8183-mm.c
 create mode 100644 drivers/clk/mediatek/clk-mt8183-vdec.c
 create mode 100644 drivers/clk/mediatek/clk-mt8183-venc.c
 create mode 100644 drivers/clk/mediatek/clk-mt8183.c

diff --git a/drivers/clk/mediatek/Kconfig b/drivers/clk/mediatek/Kconfig
index 95e5e52..e70c164 100644
--- a/drivers/clk/mediatek/Kconfig
+++ b/drivers/clk/mediatek/Kconfig
@@ -194,6 +194,80 @@ config COMMON_CLK_MT8173
---help---
  This driver supports MediaTek MT8173 clocks.
 
+config COMMON_CLK_MT8183
+   bool "Clock driver for MediaTek MT8183"
+   depends on (ARCH_MEDIATEK && ARM64) || COMPILE_TEST
+   select COMMON_CLK_MEDIATEK
+   default ARCH_MEDIATEK && ARM64
+   help
+ This driver supports MediaTek MT8183 basic clocks.
+
+config COMMON_CLK_MT8183_AUDIOSYS
+   bool "Clock driver for MediaTek MT8183 audiosys"
+   depends on COMMON_CLK_MT8183
+   help
+ This driver supports MediaTek MT8183 audiosys clocks.
+
+config COMMON_CLK_MT8183_CAMSYS
+   bool "Clock driver for MediaTek MT8183 camsys"
+   depends on COMMON_CLK_MT8183
+   help
+ This driver supports MediaTek MT8183 camsys clocks.
+
+config COMMON_CLK_MT8183_IMGSYS
+   bool "Clock driver for MediaTek MT8183 imgsys"
+   depends on COMMON_CLK_MT8183
+   help
+ This driver supports MediaTek MT8183 imgsys clocks.
+
+config COMMON_CLK_MT8183_IPU_CORE0
+   bool "Clock driver for MediaTek MT8183 ipu_core0"
+   depends on COMMON_CLK_MT8183
+   help
+ This driver supports MediaTek MT8183 ipu_core0 clocks.
+
+config COMMON_CLK_MT8183_IPU_CORE1
+   bool "Clock driver for MediaTek MT8183 ipu_core1"
+   depends on COMMON_CLK_MT8183
+   help
+ This driver supports MediaTek MT8183 ipu_core1 clocks.
+
+config COMMON_CLK_MT8183_IPU_ADL
+   bool "Clock driver for MediaTek MT8183 ipu_adl"
+   depends on COMMON_CLK_MT8183
+   help
+ This driver supports MediaTek MT8183 ipu_adl clocks.
+
+config COMMON_CLK_MT8183_IPU_CONN
+   bool "Clock driver for MediaTek MT8183 ipu_conn"
+   depends on COMMON_CLK_MT8183
+   help
+ This driver supports MediaTek MT8183 ipu_conn clocks.
+
+config COMMON_CLK_MT8183_MFGCFG
+   bool "Clock driver for MediaTek MT8183 mfgcfg"
+   depends on COMMON_CLK_MT8183
+   help
+ This driver supports MediaTek MT8183 mfgcfg clocks.
+
+config COMMON_CLK_MT8183_MMSYS
+   bool "Clock driver for MediaTek MT8183 mmsys"
+   depends on COMMON_CLK_MT8183
+   help
+ This driver supports MediaTek MT8183 mmsys clocks.
+
+config COMMON_CLK_MT8183_VDECSYS
+   bool "Clock driver for MediaTek MT8183 vdecsys"
+   depends on COMMON_CLK_MT8183
+   help
+ This driver supports MediaTek MT8183 vdecsys clocks.
+
+config COMMON_CLK_MT8183_VENCSYS
+   bool "Clock driver for MediaTek MT8183 vencsys"
+   depends on COMMON_CLK_MT8183
+   help
+ This driver supports MediaTek MT8183 vencsys clocks.
+
 config COMMON_CLK_MT6765
bool "Clock driver for MediaTek MT6765"
depends on (ARCH_MEDIATEK && ARM64) || COMPILE_TEST
diff --git a/drivers/clk/mediatek/Makefile b/drivers/clk/mediatek/Makefile
index b455a8e..13e6919 100644
--- a/drivers/clk/mediatek/Makefile
+++ b/drivers/clk/mediatek/Makefile
@@ -35,3 +35,15 @@ obj-$(CONFIG_COMMON_CLK_MT7622_HIFSYS) += clk-mt7622-hif.o
 obj-$(CONFIG_COMMON_CLK_MT7622_AUDSYS) += clk-mt7622-aud.o
 

[PATCH v4 10/10] dts: arm64: mt8183: add uart node

2018-07-30 Thread Erin Lo
From: Weiyi Lu 

Add uart node with correct uart clocks.

Signed-off-by: Erin Lo 
Signed-off-by: Weiyi Lu 
---
 arch/arm64/boot/dts/mediatek/mt8183-evb.dts |  8 
 arch/arm64/boot/dts/mediatek/mt8183.dtsi| 30 +
 2 files changed, 38 insertions(+)

diff --git a/arch/arm64/boot/dts/mediatek/mt8183-evb.dts 
b/arch/arm64/boot/dts/mediatek/mt8183-evb.dts
index 2a3dd5a..9b52559 100644
--- a/arch/arm64/boot/dts/mediatek/mt8183-evb.dts
+++ b/arch/arm64/boot/dts/mediatek/mt8183-evb.dts
@@ -12,6 +12,10 @@
model = "MediaTek MT8183 evaluation board";
compatible = "mediatek,mt8183-evb", "mediatek,mt8183";
 
+   aliases {
+   serial0 = 
+   };
+
memory@4000 {
device_type = "memory";
reg = <0 0x4000 0 0x8000>;
@@ -21,3 +25,7 @@
stdout-path = "serial0:921600n8";
};
 };
+
+ {
+   status = "okay";
+};
diff --git a/arch/arm64/boot/dts/mediatek/mt8183.dtsi 
b/arch/arm64/boot/dts/mediatek/mt8183.dtsi
index 6b87a24..c22a2dc 100644
--- a/arch/arm64/boot/dts/mediatek/mt8183.dtsi
+++ b/arch/arm64/boot/dts/mediatek/mt8183.dtsi
@@ -170,6 +170,36 @@
#clock-cells = <1>;
};
 
+   uart0: serial@11002000 {
+   compatible = "mediatek,mt8183-uart",
+"mediatek,mt6577-uart";
+   reg = <0 0x11002000 0 0x1000>;
+   interrupts = ;
+   clocks = <>, < CLK_INFRA_UART0>;
+   clock-names = "baud", "bus";
+   status = "disabled";
+   };
+
+   uart1: serial@11003000 {
+   compatible = "mediatek,mt8183-uart",
+"mediatek,mt6577-uart";
+   reg = <0 0x11003000 0 0x1000>;
+   interrupts = ;
+   clocks = <>, < CLK_INFRA_UART1>;
+   clock-names = "baud", "bus";
+   status = "disabled";
+   };
+
+   uart2: serial@11004000 {
+   compatible = "mediatek,mt8183-uart",
+"mediatek,mt6577-uart";
+   reg = <0 0x11004000 0 0x1000>;
+   interrupts = ;
+   clocks = <>, < CLK_INFRA_UART2>;
+   clock-names = "baud", "bus";
+   status = "disabled";
+   };
+
audiosys: syscon@1122 {
compatible = "mediatek,mt8183-audiosys", "syscon";
reg = <0 0x1122 0 0x1000>;
-- 
1.9.1



[PATCH v4 00/10] Add basic and clock support for Mediatek MT8183 SoC

2018-07-30 Thread Erin Lo
MT8183 is a SoC based on 64bit ARMv8 architecture.
It contains 4 CA53 and 4 CA73 cores.
MT8183 shares many HW IPs with the MT65xx series.
This patchset was tested on the MT8183 evaluation board, which boots to a
shell with the correct clocks.

This series contains document bindings, device tree including interrupt, uart, 
clock.

Based on v4.18-rc1 and https://patchwork.kernel.org/patch/10528515/
Composed of clock control (PATCH 5-8) and device tree (PATCH 9-10)

Change in v4:
1. Correct syntax error in dtsi
2. Add MT8183 clock support

Change in v3:
1. Fill out GICC, GICH, GICV regions
2. Update Copyright to 2018

Change in v2:
1. Split dt-bindings into different patches
2. Correct bindings for supported SoCs (mtk-uart.txt)

Ben Ho (1):
  arm64: dts: Add Mediatek SoC MT8183 and evaluation board dts and
Makefile

Erin Lo (3):
  dt-bindings: arm: Add bindings for Mediatek MT8183 SoC Platform
  dt-bindings: mtk-sysirq: Add compatible for Mediatek MT8183
  dt-bindings: serial: Add compatible for Mediatek MT8183

Weiyi Lu (6):
  dt-bindings: ARM: Mediatek: Document bindings for MT8183
  clk: mediatek: Add dt-bindings for MT8183 clocks
  clk: mediatek: Add flags support for mtk_gate data
  clk: mediatek: Add MT8183 clock support
  arm64: dts: mt8183: Add clock controller device nodes
  dts: arm64: mt8183: add uart node

 Documentation/devicetree/bindings/arm/mediatek.txt |4 +
 .../bindings/arm/mediatek/mediatek,apmixedsys.txt  |1 +
 .../bindings/arm/mediatek/mediatek,audsys.txt  |1 +
 .../bindings/arm/mediatek/mediatek,camsys.txt  |1 +
 .../bindings/arm/mediatek/mediatek,imgsys.txt  |1 +
 .../bindings/arm/mediatek/mediatek,infracfg.txt|1 +
 .../bindings/arm/mediatek/mediatek,ipu.txt |   43 +
 .../bindings/arm/mediatek/mediatek,mfgcfg.txt  |1 +
 .../bindings/arm/mediatek/mediatek,mmsys.txt   |1 +
 .../bindings/arm/mediatek/mediatek,topckgen.txt|1 +
 .../bindings/arm/mediatek/mediatek,vdecsys.txt |1 +
 .../bindings/arm/mediatek/mediatek,vencsys.txt |1 +
 .../interrupt-controller/mediatek,sysirq.txt   |1 +
 .../devicetree/bindings/serial/mtk-uart.txt|1 +
 arch/arm64/boot/dts/mediatek/Makefile  |1 +
 arch/arm64/boot/dts/mediatek/mt8183-evb.dts|   31 +
 arch/arm64/boot/dts/mediatek/mt8183.dtsi   |  268 +
 drivers/clk/mediatek/Kconfig   |   74 ++
 drivers/clk/mediatek/Makefile  |   12 +
 drivers/clk/mediatek/clk-gate.c|5 +-
 drivers/clk/mediatek/clk-gate.h|3 +-
 drivers/clk/mediatek/clk-mt8183-audio.c|  112 ++
 drivers/clk/mediatek/clk-mt8183-cam.c  |   75 ++
 drivers/clk/mediatek/clk-mt8183-img.c  |   75 ++
 drivers/clk/mediatek/clk-mt8183-ipu0.c |   68 ++
 drivers/clk/mediatek/clk-mt8183-ipu1.c |   68 ++
 drivers/clk/mediatek/clk-mt8183-ipu_adl.c  |   66 ++
 drivers/clk/mediatek/clk-mt8183-ipu_conn.c |  155 +++
 drivers/clk/mediatek/clk-mt8183-mfgcfg.c   |   66 ++
 drivers/clk/mediatek/clk-mt8183-mm.c   |  128 ++
 drivers/clk/mediatek/clk-mt8183-vdec.c |   84 ++
 drivers/clk/mediatek/clk-mt8183-venc.c |   71 ++
 drivers/clk/mediatek/clk-mt8183.c  | 1230 
 drivers/clk/mediatek/clk-mtk.c |3 +-
 drivers/clk/mediatek/clk-mtk.h |1 +
 include/dt-bindings/clock/mt8183-clk.h |  413 +++
 36 files changed, 3064 insertions(+), 4 deletions(-)
 create mode 100644 
Documentation/devicetree/bindings/arm/mediatek/mediatek,ipu.txt
 create mode 100644 arch/arm64/boot/dts/mediatek/mt8183-evb.dts
 create mode 100644 arch/arm64/boot/dts/mediatek/mt8183.dtsi
 create mode 100644 drivers/clk/mediatek/clk-mt8183-audio.c
 create mode 100644 drivers/clk/mediatek/clk-mt8183-cam.c
 create mode 100644 drivers/clk/mediatek/clk-mt8183-img.c
 create mode 100644 drivers/clk/mediatek/clk-mt8183-ipu0.c
 create mode 100644 drivers/clk/mediatek/clk-mt8183-ipu1.c
 create mode 100644 drivers/clk/mediatek/clk-mt8183-ipu_adl.c
 create mode 100644 drivers/clk/mediatek/clk-mt8183-ipu_conn.c
 create mode 100644 drivers/clk/mediatek/clk-mt8183-mfgcfg.c
 create mode 100644 drivers/clk/mediatek/clk-mt8183-mm.c
 create mode 100644 drivers/clk/mediatek/clk-mt8183-vdec.c
 create mode 100644 drivers/clk/mediatek/clk-mt8183-venc.c
 create mode 100644 drivers/clk/mediatek/clk-mt8183.c
 create mode 100644 include/dt-bindings/clock/mt8183-clk.h

--
1.9.1



[PATCH v4 01/10] dt-bindings: arm: Add bindings for Mediatek MT8183 SoC Platform

2018-07-30 Thread Erin Lo
This adds dt-binding documentation of cpu for Mediatek MT8183.

Signed-off-by: Erin Lo 
Reviewed-by: Rob Herring 
---
 Documentation/devicetree/bindings/arm/mediatek.txt | 4 
 1 file changed, 4 insertions(+)

diff --git a/Documentation/devicetree/bindings/arm/mediatek.txt 
b/Documentation/devicetree/bindings/arm/mediatek.txt
index 7d21ab3..2754535 100644
--- a/Documentation/devicetree/bindings/arm/mediatek.txt
+++ b/Documentation/devicetree/bindings/arm/mediatek.txt
@@ -19,6 +19,7 @@ compatible: Must contain one of
"mediatek,mt8127"
"mediatek,mt8135"
"mediatek,mt8173"
+   "mediatek,mt8183"
 
 
 Supported boards:
@@ -73,3 +74,6 @@ Supported boards:
 - MTK mt8173 tablet EVB:
 Required root node properties:
   - compatible = "mediatek,mt8173-evb", "mediatek,mt8173";
+- Evaluation board for MT8183:
+Required root node properties:
+  - compatible = "mediatek,mt8183-evb", "mediatek,mt8183";
-- 
1.9.1



Re: [alsa-devel] [PATCH] ASoC: soc-pcm: Use delay set in pointer function

2018-07-30 Thread Takashi Iwai
On Tue, 31 Jul 2018 03:25:06 +0200,
Agrawal, Akshu wrote:
> 
> 
> 
> On 7/30/2018 9:20 PM, Mark Brown wrote:
> > On Mon, Jul 30, 2018 at 05:32:21PM +0200, Takashi Iwai wrote:
> > 
> >> That said, if delay callback of CPU dai provides the additional delay,
> >> the patch does the correct thing.  OTOH, if CPU dai provides the base
> >> delay instead, we need to clarify that it's rather a must; the delay
> >> calculation in pointer callback becomes bogus in this scenario.
> > 
> > Part of the theory here is that every component might have a delay
> > independently of the rest and we need to add them all together to figure
> > out what the system as a whole will see.  Personally I'd rather just
> > have everything use a callback consistently to avoid confusion.
> > 
> 
> For consistency we can add a delay callback in snd_pcm_ops and modify
> the drivers which directly assign runtime->delay to use the callback.

No, ALSA PCM ops definition is fine.  The delay calculation is
basically tied with the position, hence it has to be set together, and
that's the pointer callback.

Judging from the call pattern, the current design of ASoC delay
callback implies that the return value is more or less constant, which
can be accumulated on top of the base value.  So your patch is natural
from that POV.

OTOH, if the CPU dai can really provide a dynamic value that is
strictly tied with pointer, CPU dai itself should provide the pointer
callback that covers both the pointer and the base delay, and it
should be used instead of component pointer callback.

> Apart from the 2 drivers mentioned in commit message I also found
> sound/usb to be doing the same and its delay getting lost.

The USB driver hasn't been used in ASoC, no?


thanks,

Takashi
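
The two delay models discussed above can be sketched in plain C. This is only an illustration under assumptions: `struct pcm_runtime_sketch`, `pointer_cb()` and `total_delay()` are hypothetical names, not ALSA or ASoC APIs. The pointer callback sets the position and the base delay together, and roughly constant per-component delays are then accumulated on top of that base.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the runtime state discussed in the thread. */
struct pcm_runtime_sketch {
	long pos;	/* position reported by the pointer callback */
	long delay;	/* base delay, updated together with pos */
};

/* Pointer callback: position and base delay are set in the same place. */
static long pointer_cb(struct pcm_runtime_sketch *rt, long hw_pos,
		       long fifo_frames)
{
	assert(rt != NULL);
	rt->pos = hw_pos;
	rt->delay = fifo_frames;
	return rt->pos;
}

/* Add each component's (roughly constant) extra delay to the base delay. */
static long total_delay(const struct pcm_runtime_sketch *rt,
			const long *extra, size_t n)
{
	long total;
	size_t i;

	assert(rt != NULL);
	total = rt->delay;
	for (i = 0; i < n; i++) {
		assert(extra[i] >= 0);
		total += extra[i];
	}
	return total;
}
```

In this model a component whose delay is tied strictly to the position would have to report it through `pointer_cb()` rather than through the `extra` array, which is the distinction the reply draws.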


Re: [QUESTION] llist: Comment releasing 'must delete' restriction before traversing

2018-07-30 Thread Byungchul Park
On Tue, Jul 31, 2018 at 09:37:50AM +0800, Huang, Ying wrote:
> Byungchul Park  writes:
> 
> > Hello folks,
> >
> > I'm careful in saying.. and curious about..
> >
> > In restrictive cases where only additions happen but never deletions, can't
> > we safely traverse a llist? I believe llist can be more useful if we can
> > release the restriction. Can't we?
> >
> > If yes, we may add another function traversing starting from a head. Or
> > just use the existing function with head->first.
> >
> > Thanks a lot for your answers in advance :)
> 
> What's the use case?  I don't know how it is useful that items are never
> deleted from the llist.
> 
> Some other locks could be used to provide mutual exclusion between
> 
> - llist add, llist traverse

Hello Huang,

In my use case, I only do adding and traversing on a llist.

> 
> and
> 
> - llist delete

Of course, I will use a lock when deletion is needed.

So.. in the case only adding into and traversing a llist is needed,
can't we safely traverse a llist in the way I thought? Or am I missing
something?

Thank you.

> Is this your use case?
> 
> Best Regards,
> Huang, Ying
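
The add-and-traverse-only pattern being asked about can be sketched with C11 atomics. This is a userspace illustration under assumptions, not the kernel's llist API and not a statement about what llist guarantees: `struct list_head_lf`, `lf_add()` and `lf_sum()` are hypothetical names, and the sketch relies on nodes never being unlinked or freed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct node {
	struct node *next;
	int value;
};

struct list_head_lf {
	_Atomic(struct node *) first;
};

/* Lock-free push: link the new node in front of the current first node. */
static void lf_add(struct list_head_lf *head, struct node *n)
{
	struct node *first;

	assert(head != NULL && n != NULL);
	first = atomic_load_explicit(&head->first, memory_order_relaxed);
	do {
		n->next = first;	/* written before the node is published */
	} while (!atomic_compare_exchange_weak_explicit(&head->first, &first, n,
							 memory_order_release,
							 memory_order_relaxed));
}

/*
 * Traverse from a snapshot of head->first. In this add-only scenario the
 * nodes reachable from the snapshot are never removed, and their next
 * pointers do not change after publication.
 */
static int lf_sum(struct list_head_lf *head)
{
	struct node *pos;
	int sum = 0;

	assert(head != NULL);
	for (pos = atomic_load_explicit(&head->first, memory_order_acquire);
	     pos != NULL; pos = pos->next)
		sum += pos->value;
	return sum;
}
```

A traversal that starts from an older snapshot simply does not see nodes added afterwards. As soon as deletion enters the picture, the assumption that reachable nodes stay valid no longer holds and some exclusion against the deleter is needed.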


Re: [PATCH] clk: scmi: Fix the rounding of clock rate

2018-07-30 Thread Amit Daniel Kachhap
On Mon, Jul 30, 2018 at 5:10 PM, Sudeep Holla  wrote:
> On Mon, Jul 30, 2018 at 11:03:51AM +0530, Amit Daniel Kachhap wrote:
>> Hi,
>>
>> On Fri, Jul 27, 2018 at 10:07 PM, Stephen Boyd  wrote:
>> > Quoting Amit Daniel Kachhap (2018-07-27 07:01:52)
>> >> This fix rounds the clock rate properly by using quotient and not
>> >> remainder in the calculation. This issue was found while testing HDMI
>> >> in the Juno platform.
>> >>
>> >> Signed-off-by: Amit Daniel Kachhap 
>> >
>> > Any Fixes: tag here?
>> Yes, This patch is tested with Linux v4.18-rc6 tag.
>>
>
> No Stephen meant the commit that this fixes, something like below:
>
> Fixes: 6d6a1d82eaef ("clk: add support for clocks provided by SCMI")
>
> so that it can get backported if required.

ok my mistake. Thanks for the clarification.

>
> --
> Regards,
> Sudeep
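
The quotient-versus-remainder mix-up described in the patch can be reproduced in userspace C. In this sketch `div_inplace()` is a hypothetical stand-in that mimics the calling convention of the kernel's `do_div()` (the dividend is replaced by the quotient and the remainder is returned); the two rounding helpers and the sample rates are illustrative assumptions, not driver code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for do_div(): divides *n in place, leaving the
 * quotient in *n, and returns the remainder.
 */
static uint32_t div_inplace(uint64_t *n, uint32_t base)
{
	uint32_t rem;

	assert(base != 0);
	rem = (uint32_t)(*n % base);
	*n /= base;
	return rem;
}

/* Fixed behaviour: round up to fmin + k * step_size using the quotient. */
static uint64_t round_rate_quotient(uint64_t rate, uint64_t fmin,
				    uint32_t step_size)
{
	uint64_t ftmp;

	assert(rate >= fmin);
	ftmp = rate - fmin;
	ftmp += step_size - 1;			/* to round up */
	(void)div_inplace(&ftmp, step_size);	/* ftmp now holds the quotient */
	return ftmp * step_size + fmin;
}

/* Pre-fix behaviour: the returned remainder is used as the step count. */
static uint64_t round_rate_remainder(uint64_t rate, uint64_t fmin,
				     uint32_t step_size)
{
	uint64_t ftmp;
	uint32_t step;

	assert(rate >= fmin);
	ftmp = rate - fmin;
	ftmp += step_size - 1;
	step = div_inplace(&ftmp, step_size);	/* remainder, not quotient */
	return (uint64_t)step * step_size + fmin;
}
```

With `fmin = 1000` and `step_size = 100`, a request for 1050 rounds to 1100 with the quotient, while the remainder variant returns 5900, a rate that has nothing to do with the request.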


linux-next: manual merge of the mux tree with the battery tree

2018-07-30 Thread Stephen Rothwell
Hi all,

Today's linux-next merge of the mux tree got a conflict in:

  MAINTAINERS

between commit:

  fe8e81b7e899 ("adp5061: New driver for ADP5061 I2C battery charger")

from the battery tree and commit:

  703160ff3e50 ("dt-bindings: mux: add adi,adgs1408")

from the mux tree.

I fixed it up (see below) and can carry the fix as necessary. This
is now fixed as far as linux-next is concerned, but any non trivial
conflicts should be mentioned to your upstream maintainer when your tree
is submitted for merging.  You may also want to consider cooperating
with the maintainer of the conflicting tree to minimise any particularly
complex conflicts.

-- 
Cheers,
Stephen Rothwell

diff --cc MAINTAINERS
index db24b8939ed4,eaa2b55a0e9b..
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@@ -829,13 -810,12 +829,19 @@@ L:  linux-me...@vger.kernel.or
  S:Maintained
  F:drivers/media/i2c/ad9389b*
  
+ ANALOG DEVICES INC ADGS1408 DRIVER
+ M:Mircea Caprioru 
+ S:Supported
+ F:drivers/mux/adgs1408.c
+ F:Documentation/devicetree/bindings/mux/adgs1408.txt
+ 
 +ANALOG DEVICES INC ADP5061 DRIVER
 +M:Stefan Popa 
 +L:linux...@vger.kernel.org
 +W:http://ez.analog.com/community/linux-device-drivers
 +S:Supported
 +F:drivers/power/supply/adp5061.c
 +
  ANALOG DEVICES INC ADV7180 DRIVER
  M:Lars-Peter Clausen 
  L:linux-me...@vger.kernel.org




[tip:perf/urgent] perf/x86/intel/uncore: Fix hardcoded index of Broadwell extra PCI devices

2018-07-30 Thread tip-bot for Kan Liang
Commit-ID:  99811294b063eb44185df9a58923928fccdbe122
Gitweb: https://git.kernel.org/tip/99811294b063eb44185df9a58923928fccdbe122
Author: Kan Liang 
AuthorDate: Mon, 30 Jul 2018 08:28:08 -0400
Committer:  Ingo Molnar 
CommitDate: Mon, 30 Jul 2018 20:13:58 +0200

perf/x86/intel/uncore: Fix hardcoded index of Broadwell extra PCI devices

Masayoshi Mizuma reported that a warning message is shown while a CPU is
hot-removed on Broadwell servers:

  WARNING: CPU: 126 PID: 6 at arch/x86/events/intel/uncore.c:988
  uncore_pci_remove+0x10b/0x150
  Call Trace:
   pci_device_remove+0x42/0xd0
   device_release_driver_internal+0x148/0x220
   pci_stop_bus_device+0x76/0xa0
   pci_stop_root_bus+0x44/0x60
   acpi_pci_root_remove+0x1f/0x80
   acpi_bus_trim+0x57/0x90
   acpi_bus_trim+0x2e/0x90
   acpi_device_hotplug+0x2bc/0x4b0
   acpi_hotplug_work_fn+0x1a/0x30
   process_one_work+0x174/0x3a0
   worker_thread+0x4c/0x3d0
   kthread+0xf8/0x130

This bug was introduced by:

  commit 15a3e845b01c ("perf/x86/intel/uncore: Fix SBOX support for Broadwell 
CPUs")

The index of "QPI Port 2 filter" was hardcoded to 2, but this conflicts with the
index of "PCU.3", which is "HSWEP_PCI_PCU_3" and also equals 2.

To fix the conflict, the hardcoded index needs to be cleaned up:

 - introduce a new enumerator "BDX_PCI_QPI_PORT2_FILTER" for "QPI Port 2
   filter" on Broadwell,
 - increase UNCORE_EXTRA_PCI_DEV_MAX by one,
 - clean up the hardcoded index.

Debugged-by: Masayoshi Mizuma 
Suggested-by: Ingo Molnar 
Reported-by: Masayoshi Mizuma 
Tested-by: Masayoshi Mizuma 
Signed-off-by: Kan Liang 
Cc: Alexander Shishkin 
Cc: Arnaldo Carvalho de Melo 
Cc: Jiri Olsa 
Cc: Linus Torvalds 
Cc: Peter Zijlstra 
Cc: Stephane Eranian 
Cc: Thomas Gleixner 
Cc: Vince Weaver 
Cc: msys.miz...@gmail.com
Cc: sta...@vger.kernel.org
Fixes: 15a3e845b01c ("perf/x86/intel/uncore: Fix SBOX support for Broadwell 
CPUs")
Link: 
http://lkml.kernel.org/r/1532953688-15008-1-git-send-email-kan.li...@linux.intel.com
Signed-off-by: Ingo Molnar 
---
 arch/x86/events/intel/uncore.h   |  2 +-
 arch/x86/events/intel/uncore_snbep.c | 10 +++---
 2 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/arch/x86/events/intel/uncore.h b/arch/x86/events/intel/uncore.h
index c9e1e0bef3c3..e17ab885b1e9 100644
--- a/arch/x86/events/intel/uncore.h
+++ b/arch/x86/events/intel/uncore.h
@@ -28,7 +28,7 @@
 #define UNCORE_PCI_DEV_TYPE(data)  ((data >> 8) & 0xff)
 #define UNCORE_PCI_DEV_IDX(data)   (data & 0xff)
 #define UNCORE_EXTRA_PCI_DEV   0xff
-#define UNCORE_EXTRA_PCI_DEV_MAX   3
+#define UNCORE_EXTRA_PCI_DEV_MAX   4
 
 #define UNCORE_EVENT_CONSTRAINT(c, n) EVENT_CONSTRAINT(c, n, 0xff)
 
diff --git a/arch/x86/events/intel/uncore_snbep.c 
b/arch/x86/events/intel/uncore_snbep.c
index 87dc0263a2e1..51d7c117e3c7 100644
--- a/arch/x86/events/intel/uncore_snbep.c
+++ b/arch/x86/events/intel/uncore_snbep.c
@@ -1029,6 +1029,7 @@ void snbep_uncore_cpu_init(void)
 enum {
SNBEP_PCI_QPI_PORT0_FILTER,
SNBEP_PCI_QPI_PORT1_FILTER,
+   BDX_PCI_QPI_PORT2_FILTER,
HSWEP_PCI_PCU_3,
 };
 
@@ -3286,15 +3287,18 @@ static const struct pci_device_id bdx_uncore_pci_ids[] 
= {
},
{ /* QPI Port 0 filter  */
PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x6f86),
-   .driver_data = UNCORE_PCI_DEV_DATA(UNCORE_EXTRA_PCI_DEV, 0),
+   .driver_data = UNCORE_PCI_DEV_DATA(UNCORE_EXTRA_PCI_DEV,
+  SNBEP_PCI_QPI_PORT0_FILTER),
},
{ /* QPI Port 1 filter  */
PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x6f96),
-   .driver_data = UNCORE_PCI_DEV_DATA(UNCORE_EXTRA_PCI_DEV, 1),
+   .driver_data = UNCORE_PCI_DEV_DATA(UNCORE_EXTRA_PCI_DEV,
+  SNBEP_PCI_QPI_PORT1_FILTER),
},
{ /* QPI Port 2 filter  */
PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x6f46),
-   .driver_data = UNCORE_PCI_DEV_DATA(UNCORE_EXTRA_PCI_DEV, 2),
+   .driver_data = UNCORE_PCI_DEV_DATA(UNCORE_EXTRA_PCI_DEV,
+  BDX_PCI_QPI_PORT2_FILTER),
},
{ /* PCU.3 (for Capability registers) */
PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x6fc0),
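
The index collision described in the commit message can be shown in a few lines of standalone C. This is a sketch under assumptions: the `OLD_*` and `NEW_*` enumerators are hypothetical mirrors of the enum before and after the patch, `same_slot()` is an illustrative helper, and `UNCORE_PCI_DEV_DATA()` is re-declared locally on the assumption that it packs the type into bits 8-15 and the index into bits 0-7, matching the accessor macros visible in the diff.

```c
#include <assert.h>

/* Local re-declarations for illustration; the packing is an assumption. */
#define UNCORE_PCI_DEV_DATA(type, idx)	(((type) << 8) | (idx))
#define UNCORE_PCI_DEV_TYPE(data)	(((data) >> 8) & 0xff)
#define UNCORE_PCI_DEV_IDX(data)	((data) & 0xff)
#define UNCORE_EXTRA_PCI_DEV		0xff

/* Before the fix: the third enumerator already owns index 2. */
enum old_extra_idx {
	OLD_QPI_PORT0_FILTER,	/* 0 */
	OLD_QPI_PORT1_FILTER,	/* 1 */
	OLD_PCU_3,		/* 2 */
	OLD_EXTRA_MAX		/* 3 */
};

/* After the fix: the port 2 filter gets its own enumerator. */
enum new_extra_idx {
	NEW_QPI_PORT0_FILTER,	/* 0 */
	NEW_QPI_PORT1_FILTER,	/* 1 */
	NEW_QPI_PORT2_FILTER,	/* 2 */
	NEW_PCU_3,		/* 3 */
	NEW_EXTRA_MAX		/* 4 */
};

/* Returns 1 when two driver_data values select the same extra-device slot. */
static int same_slot(unsigned int a, unsigned int b)
{
	/* Both values must describe extra PCI devices. */
	assert(UNCORE_PCI_DEV_TYPE(a) == UNCORE_EXTRA_PCI_DEV);
	assert(UNCORE_PCI_DEV_TYPE(b) == UNCORE_EXTRA_PCI_DEV);
	return UNCORE_PCI_DEV_IDX(a) == UNCORE_PCI_DEV_IDX(b);
}
```

A hardcoded `2` for the port 2 filter lands in the same slot as `OLD_PCU_3`, whereas with the new enumerators the two devices occupy slots 2 and 3, which is why the maximum grows from 3 to 4.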


Re: [RFC] blk-mq: clean up the hctx restart

2018-07-30 Thread jianchao.wang
Hi Ming

On 07/31/2018 12:58 PM, Ming Lei wrote:
> On Tue, Jul 31, 2018 at 12:02:15PM +0800, Jianchao Wang wrote:
>> Currently, we will always set SCHED_RESTART whenever there are
>> requests in hctx->dispatch, then when request is completed and
>> freed the hctx queues will be restarted to avoid IO hang. This
>> is unnecessary most of time. Especially when there are lots of
>> LUNs attached to one host, the RR restart loop could be very
>> expensive.
> 
> The big RR restart loop has been killed in the following commit:
> 
> commit 97889f9ac24f8d2fc8e703ea7f80c162bab10d4d
> Author: Ming Lei 
> Date:   Mon Jun 25 19:31:48 2018 +0800
> 
> blk-mq: remove synchronize_rcu() from blk_mq_del_queue_tag_set()
> 
> 

Oh, sorry, I skipped this patch because of its title when I went through the
mailing list, so I didn't realize the RR restart loop had already been killed. :)

The RR restart loop could also ensure fairness when sharing some LLDD resource,
not just avoid IO hangs. Is it OK to kill it entirely?

Thanks
Jianchao



Re: [PATCH] mm,page_alloc: PF_WQ_WORKER threads must sleep at should_reclaim_retry().

2018-07-30 Thread Michal Hocko
On Tue 31-07-18 06:01:48, Tetsuo Handa wrote:
> On 2018/07/31 4:10, Michal Hocko wrote:
> > Since should_reclaim_retry() should be a natural reschedule point,
> > let's do the short sleep for PF_WQ_WORKER threads unconditionally in
> > order to guarantee that other pending work items are started. This will
> > workaround this problem and it is less fragile than hunting down when
> > the sleep is missed. E.g. we used to have a sleeping point in the oom
> > path but this has been removed recently because it caused other issues.
> > Having a single sleeping point is more robust.
> 
> linux.git has not removed the sleeping point in the OOM path yet. Since removing the
> sleeping point in the OOM path can mitigate CVE-2016-10723, please do so immediately.

Is this an {Acked,Reviewed,Tested}-by?

I will send the patch to Andrew if the patch is OK.

> (And that change will conflict with Roman's cgroup aware OOM killer patchset. But it
> should be easy to rebase.)

That is still a WIP so I wouldn't lose sleep over it.
-- 
Michal Hocko
SUSE Labs


Re: [PATCH 0/2] dt: thermal: Fix broken cooling-maps

2018-07-30 Thread Viresh Kumar
On 05-07-18, 10:39, Viresh Kumar wrote:
> Hi,
> 
> This is an attempt to fix the broken or partially defined DT bindings
> for cooling-maps. We should list every device that participates in
> cooling down at a certain trip point, instead of just the first in the
> list as that depends on certain ordering of events to work properly.
> 
> The first patch extends the binding to allow a list of phandles in
> "cooling-device" property and the second patch fixes one of the
> platform's DT.
> 
> This will be followed up by fixing all platform DT bindings that have
> these issues after this set is accepted.
> 
> The kernel also requires some changes to handle the phandle list, but
> wouldn't break with these changes as it reads the first phandle in the
> list for now. We can update that separately.

@Zhang: Are you going to apply this for 4.19-rc1? There are a lot of patches that
I am holding back until this gets merged.

-- 
viresh


Re: [PATCH v7 0/4] ARM: davinci: complete the conversion to using the reset framework

2018-07-30 Thread Bjorn Andersson
On Mon 02 Jul 05:08 PDT 2018, Sekhar Nori wrote:

> Hi Bjorn,
> 
> On Thursday 21 June 2018 05:11 PM, Bartosz Golaszewski wrote:
> > 2018-06-21 12:52 GMT+02:00 Sekhar Nori :
> >> Hi Bartosz,
> >>
> >> On Thursday 21 June 2018 01:07 PM, Bartosz Golaszewski wrote:
> >>> From: Bartosz Golaszewski 
> >>>
> >>> These are the remaining patches that still need to be merged in order
> >>> to complete the conversion of the davinci dsp driver to using the reset
> >>> framework.
> >>>
> >>> They apply on top of v4.18-rc1 with David Lechner's remaining patches
> >>> merged.
> >>
> >> Series looks good to me.
> >>
> >> To preserve bisect, shouldn't the order of applying be patch #3, #4, #1
> >> and #2 ?
> >>
> >> Given the dependencies and to preserve bisect its easiest if I take the
> >> series with acks from remoteproc and clock maintainers.
> >>
> >> Open to other suggestions as well.
> >>
> >> Thanks,
> >> Sekhar
> > 
> > Oops you're right about the order. Do you want me to resend?
> 
> With your ack, I can queue 1/4 for v4.19 and provide an immutable commit
> to you (on top of v4.18-rc1) for you to merge any further changes you
> want to queue from your tree.
> 

I'm not sure why I didn't see your request earlier, sorry about that.

I seem to have one other davinci patch from Suman, which should be
possible to merge without any conflicts. So there's no need for an
immutable branch at this time.

Regards,
Bjorn


Re: [QUESTION] llist: Comment releasing 'must delete' restriction before traversing

2018-07-30 Thread Paul E. McKenney
On Tue, Jul 31, 2018 at 09:58:36AM +0900, Byungchul Park wrote:
> Hello folks,
> 
> I'm careful in saying.. and curious about..
> 
> In restrictive cases where only additions happen but never deletions, can't
> we safely traverse an llist? I believe llist can be more useful if we can
> release the restriction. Can't we?

Yes, but please give a thought to the people looking at your code some
time down the line.  If you are doing this, lots of comments, please.

Here are the approaches that I am aware of:

1.  Normal RCU.  Use list_add_rcu(), list_del_rcu(), and friends.

2.  Things are added but never deleted.  Use list_add_rcu() and
    friends, but since you don't ever delete anything, you never
    use list_del_rcu(), synchronize_rcu(), call_rcu(), and friends.

3.  Things are added, but deletion deletes the entire list.
    You need to use something like list_del_rcu() to handle
    this, and you need synchronize_rcu(), call_rcu(), and friends.
    So really not all that much different than #1.

4.  Things are added, but deletions happen during some sort of
    maintenance phase during which there are no readers.  This is
    really easy to get wrong -- all you have to do is let one little
    reader slip in and all is broken.  Also the maintenance phases
    often take longer than planned.  (We used a trick somewhat
    like this back when I worked on the dormitory system back at
    university the first time around, but we had the advantage of
    everyone using the system being in the same timezone and
    the system being taken down every night anyway.)

5.  Just mark the deleted elements, but leave them in the list.
    Actually remove them using one of the above techniques.

There are probably others, but those come to mind immediately.

I suggest that such special cases stay in the subsystem in question.
If a given technique gains wider use, then it might be time to
update header comments.

> If yes, we may add another function traversing starting from a head. Or
> just use the existing function with head->first.

If you start with head->first, then you need to make sure that a concurrent
add of an element at the head of the list works.  You need at least a
READ_ONCE() and preferably an rcu_dereference() or similar.
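
As a rough userspace illustration of the add-only case (approach 2 above),
the sketch below uses C11 atomics rather than the kernel's llist or RCU
primitives. The struct and function names are invented for this example; the
acquire load of the head plays the role that READ_ONCE() or
rcu_dereference() would play in kernel code. It is only valid under the
stated assumption that nodes are never removed or freed while readers run.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct node {
	struct node *next;	/* written once, before the node is published */
	int value;
};

struct add_only_list {
	_Atomic(struct node *) first;
};

/* Push at the head; safe against concurrent pushes.  Nodes are never removed. */
static void add_only_push(struct add_only_list *l, struct node *n)
{
	struct node *old;

	assert(n != NULL);
	old = atomic_load_explicit(&l->first, memory_order_relaxed);
	do {
		n->next = old;	/* on CAS failure, old holds the current head */
	} while (!atomic_compare_exchange_weak_explicit(&l->first, &old, n,
							 memory_order_release,
							 memory_order_relaxed));
}

/* Sum all values; the acquire load pairs with the release in the push. */
static long add_only_sum(struct add_only_list *l)
{
	long sum = 0;
	struct node *n = atomic_load_explicit(&l->first, memory_order_acquire);

	for (; n; n = n->next)
		sum += n->value;
	return sum;
}
```

A reader that loaded an older head simply does not see nodes pushed later,
which is the only inconsistency an add-only list can expose.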

> Thanks a lot for your answers in advance :)

You did ask!

Thanx, Paul

> ->8-
> From 1e73882799b269cd86e7a7c955021e3a18d1e6cf Mon Sep 17 00:00:00 2001
> From: Byungchul Park 
> Date: Tue, 31 Jul 2018 09:31:57 +0900
> Subject: [QUESTION] llist: Comment releasing 'must delete' restriction before
>  traversing
> 
> llist traversal can run without deletion in the restrictive case where
> items are added but never deleted, like an rculist traversal such as
> list_for_each_entry_lockless. So add the comment.
> 
> Signed-off-by: Byungchul Park 
> ---
>  include/linux/llist.h | 24 ++--
>  1 file changed, 18 insertions(+), 6 deletions(-)
> 
> diff --git a/include/linux/llist.h b/include/linux/llist.h
> index 85abc29..d012d3e 100644
> --- a/include/linux/llist.h
> +++ b/include/linux/llist.h
> @@ -32,8 +32,12 @@
>   * operation, with "-" being no lock needed, while "L" being lock is needed.
>   *
>   * The list entries deleted via llist_del_all can be traversed with
> - * traversing function such as llist_for_each etc.  But the list
> - * entries can not be traversed safely before deleted from the list.
> + * traversing function such as llist_for_each etc.  Normally the list
> + * entries cannot be traversed safely before deleted from the list
> + * except the cases items are added to the list but never deleted.  In
> + * that restrictive cases the list may be safely traversed concurrently
> + * with llist_add.
> + *
>   * The order of deleted entries is from the newest to the oldest added
>   * one.  If you want to traverse from the oldest to the newest, you
>   * must reverse the order by yourself before traversing.
> @@ -116,7 +120,9 @@ static inline void init_llist_head(struct llist_head *list)
>   *
>   * In general, some entries of the lock-less list can be traversed
>   * safely only after being deleted from list, so start with an entry
> - * instead of list head.
> + * instead of list head.  But in restrictive cases items are added to
> + * the list but never deleted, the list may be safely traversed
> + * concurrently with llist_add.
>   *
>   * If being used on entries deleted from lock-less list directly, the
>   * traverse order is from the newest to the oldest added entry.  If
> @@ -135,7 +141,9 @@ static inline void init_llist_head(struct llist_head *list)
>   *
>   * In general, some entries of the lock-less list can be traversed
>   * safely only after being deleted from list, so start with an entry
> - * instead of list head.
> + * instead of list head.  But in restrictive cases items are added to
> + * the list but never 

Re: [RFC] dmaengine: Add metadat_ops for dma_async_tx_descriptor

2018-07-30 Thread Vinod
On 30-07-18, 12:46, Peter Ujfalusi wrote:
> Vinod,
> 
> On 2018-07-24 14:14, Vinod wrote:
>  Clients must not mix the two way of handling the metadata.
>  The set_len() is intended to tell the DMA driver the client provided
>  metadata size (in MEM_TO_DEV case mostly).
> 
>  MEM_TO_DEV flow on client side:
>  get_ptr()
>  fill in the metadata to the pointer (not exceeding max_len)
>  set_len() to tell the DMA driver the amount of valid bytes written
> 
>  DEV_TO_MEM flow on client side:
>  In the completion callback, get_ptr()
>  the metadata is payload_len bytes and can be accessed in the return pointer.
> >>>
> >>> I would think to unify this..
> >>
> >> I have tried it, but the attach mode and the pointer mode is hard to
> >> handle with a generic API.
> >> I will try to find a way to unify things in a sane way.
> > 
> > Hmmm, looking from the description they will be for different methods,
> > so lets make them orthogonal and not allow driver to register both.
> 
> I would allow DMA drivers to register both, but somehow enforce that
> clients are not mixing the two distinct way of dealing with the metadata.
> 
> The reason for that is for example the attach mode is the simplest (I
> implemented it first and I have a client using it), but if the pointer
> mode is found to be more efficient and feasible for the DMA then the DMA
> driver can implement that mode and the client can move as well w/o
> breaking anything.

Sounds reasonable...

-- 
~Vinod
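
The MEM_TO_DEV pointer-mode flow quoted above (get_ptr(), fill, set_len())
might look roughly like the following from the client side. This is a
self-contained mock, not the proposed dmaengine API: struct mock_desc,
mock_get_ptr(), mock_set_len() and client_fill_metadata() are hypothetical
names, and only the get_ptr/set_len/max_len/payload_len vocabulary comes from
the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a descriptor with a driver-owned metadata area. */
struct mock_desc {
	unsigned char md[16];	/* metadata buffer owned by the "driver" */
	size_t max_len;		/* capacity of md */
	size_t payload_len;	/* valid bytes currently in md */
};

/* Pointer mode: hand out the buffer and report its current and maximum size. */
static void *mock_get_ptr(struct mock_desc *d, size_t *payload_len,
			  size_t *max_len)
{
	*payload_len = d->payload_len;
	*max_len = d->max_len;
	return d->md;
}

/* Tell the "driver" how many bytes the client wrote; reject oversize lengths. */
static int mock_set_len(struct mock_desc *d, size_t len)
{
	if (len > d->max_len)
		return -1;
	d->payload_len = len;
	return 0;
}

/* MEM_TO_DEV client flow: get_ptr(), fill without exceeding max_len, set_len(). */
static int client_fill_metadata(struct mock_desc *d, const void *src, size_t len)
{
	size_t payload_len, max_len;
	void *p = mock_get_ptr(d, &payload_len, &max_len);

	assert(p != NULL);
	if (len > max_len)
		return -1;
	memcpy(p, src, len);
	return mock_set_len(d, len);
}
```

In the DEV_TO_MEM direction the client would only call the get_ptr()
equivalent from its completion callback and read payload_len bytes, without
calling set_len().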


Re: [PATCH] timers: Clear must_forward_clk inside base lock

2018-07-30 Thread Kohli, Gaurav

Hi John, Thomas,

Can you please review below patch and update your comments:

Regards
Gaurav

On 7/26/2018 2:12 PM, Gaurav Kohli wrote:

While migrating a timer to a new base, the base clk needs to be updated by
calling forward_timer_base() to avoid a stale clock. At the same time, if
run_timer_softirq() is executing on the new core, it may set must_forward_clk
to false, and the forward-base logic may then fail as per the check below:

if (likely(!base->must_forward_clk))
 return;

Prevent this by clearing must_forward_clk inside the base lock.

Signed-off-by: Gaurav Kohli 

diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index cc2d23e..675241d 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -1657,6 +1657,19 @@ static inline void __run_timers(struct timer_base *base)
 
 	raw_spin_lock_irq(&base->lock);
 
+	/*
+	 * must_forward_clk must be cleared before running timers so that any
+	 * timer functions that call mod_timer will not try to forward the
+	 * base. idle trcking / clock forwarding logic is only used with
+	 * BASE_STD timers.
+	 *
+	 * The deferrable base does not do idle tracking at all, so we do
+	 * not forward it. This can result in very large variations in
+	 * granularity for deferrable timers, but they can be deferred for
+	 * long periods due to idle.
+	 */
+	base->must_forward_clk = false;
+
 	while (time_after_eq(jiffies, base->clk)) {
 
 		levels = collect_expired_timers(base, heads);
@@ -1676,19 +1689,6 @@ static __latent_entropy void run_timer_softirq(struct softirq_action *h)
 {
 	struct timer_base *base = this_cpu_ptr(&timer_bases[BASE_STD]);
 
-	/*
-	 * must_forward_clk must be cleared before running timers so that any
-	 * timer functions that call mod_timer will not try to forward the
-	 * base. idle trcking / clock forwarding logic is only used with
-	 * BASE_STD timers.
-	 *
-	 * The deferrable base does not do idle tracking at all, so we do
-	 * not forward it. This can result in very large variations in
-	 * granularity for deferrable timers, but they can be deferred for
-	 * long periods due to idle.
-	 */
-	base->must_forward_clk = false;
-
 	__run_timers(base);
 	if (IS_ENABLED(CONFIG_NO_HZ_COMMON))
 		__run_timers(this_cpu_ptr(&timer_bases[BASE_DEF]));



--
Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, 
Inc. is a member of the Code Aurora Forum,

a Linux Foundation Collaborative Project.
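
The intent of the patch, that the flag is only read and written while the
base lock is held, can be modelled in a small single-threaded sketch. This is
not kernel code: struct mock_base and the mock_* helpers are hypothetical,
the "lock" is just a flag with assertions, and jiffies is passed in as a
plain argument.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a timer base; "locked" stands in for base->lock. */
struct mock_base {
	bool locked;
	unsigned long clk;
	bool must_forward_clk;
};

static void mock_lock(struct mock_base *b)
{
	assert(!b->locked);
	b->locked = true;
}

static void mock_unlock(struct mock_base *b)
{
	assert(b->locked);
	b->locked = false;
}

/* Mirrors the quoted early return: only forward when the flag is set. */
static void mock_forward_base(struct mock_base *b, unsigned long now)
{
	assert(b->locked);	/* caller must hold the lock */
	if (!b->must_forward_clk)
		return;
	if (now > b->clk)
		b->clk = now;
}

/* Migration path: take the lock, then try to forward the base clock. */
static void mock_migrate_timer(struct mock_base *b, unsigned long now)
{
	mock_lock(b);
	mock_forward_base(b, now);
	mock_unlock(b);
}

/* Expiry path as in the patch: the flag is cleared with the lock held. */
static void mock_run_timers(struct mock_base *b, unsigned long now)
{
	mock_lock(b);
	b->must_forward_clk = false;
	while (b->clk <= now)
		b->clk++;
	mock_unlock(b);
}
```

Because both paths touch must_forward_clk only between mock_lock() and
mock_unlock(), the migration path can never observe the flag changing in the
middle of its forwarding decision, which is the property the patch aims for.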


Re: [PATCH v7 1/4] remoteproc/davinci: use the reset framework

2018-07-30 Thread Bjorn Andersson
On Thu 21 Jun 00:37 PDT 2018, Bartosz Golaszewski wrote:

> From: Bartosz Golaszewski 
> 
> Switch to using the reset framework instead of handcoded reset routines
> we used so far.
> 
> Signed-off-by: Bartosz Golaszewski 
> Reviewed-by: Sekhar Nori 
> Reviewed-by: Philipp Zabel 

Acked-by: Bjorn Andersson 

Regards,
Bjorn
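
The start/stop ordering in the quoted patch (enable the clock, deassert the
reset, and undo the clock if the deassert fails) can be sketched with mock
hardware state. Everything here is hypothetical: struct mock_dsp and the
mock_* functions only imitate the shape of the clk and reset calls seen in
the diff and are not the kernel APIs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical hardware state for the sketch. */
struct mock_dsp {
	int clk_enabled;
	int in_reset;
	int fail_deassert;	/* when set, the mock deassert reports an error */
};

static int mock_clk_enable(struct mock_dsp *d)
{
	d->clk_enabled = 1;
	return 0;
}

static void mock_clk_disable(struct mock_dsp *d)
{
	d->clk_enabled = 0;
}

static int mock_reset_deassert(struct mock_dsp *d)
{
	if (d->fail_deassert)
		return -5;
	d->in_reset = 0;
	return 0;
}

static int mock_reset_assert(struct mock_dsp *d)
{
	d->in_reset = 1;
	return 0;
}

/* Start: clock first, then release reset; roll back the clock on failure. */
static int mock_start(struct mock_dsp *d)
{
	int ret;

	assert(d != NULL);
	ret = mock_clk_enable(d);
	if (ret)
		return ret;

	ret = mock_reset_deassert(d);
	if (ret) {
		mock_clk_disable(d);
		return ret;
	}
	return 0;
}

/* Stop: put the DSP back into reset before gating its clock. */
static int mock_stop(struct mock_dsp *d)
{
	int ret;

	assert(d != NULL);
	ret = mock_reset_assert(d);
	if (ret)
		return ret;

	mock_clk_disable(d);
	return 0;
}
```

The point of the rollback is that a failed start leaves the clock disabled,
so the device ends up in the same state as before the call.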

> ---
>  drivers/remoteproc/da8xx_remoteproc.c | 34 +++
>  1 file changed, 29 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/remoteproc/da8xx_remoteproc.c b/drivers/remoteproc/da8xx_remoteproc.c
> index b668e32996e2..76c06b70a1c6 100644
> --- a/drivers/remoteproc/da8xx_remoteproc.c
> +++ b/drivers/remoteproc/da8xx_remoteproc.c
> @@ -10,6 +10,7 @@
>  
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -20,8 +21,6 @@
>  #include 
>  #include 
>  
> -#include/* for davinci_clk_reset_assert/deassert() */
> -
>  #include "remoteproc_internal.h"
>  
>  static char *da8xx_fw_name;
> @@ -72,6 +71,7 @@ struct da8xx_rproc {
>   struct da8xx_rproc_mem *mem;
>   int num_mems;
>   struct clk *dsp_clk;
> + struct reset_control *dsp_reset;
>   void (*ack_fxn)(struct irq_data *data);
>   struct irq_data *irq_data;
>   void __iomem *chipsig;
> @@ -138,6 +138,7 @@ static int da8xx_rproc_start(struct rproc *rproc)
>   struct device *dev = rproc->dev.parent;
>   struct da8xx_rproc *drproc = (struct da8xx_rproc *)rproc->priv;
>   struct clk *dsp_clk = drproc->dsp_clk;
> + struct reset_control *dsp_reset = drproc->dsp_reset;
>   int ret;
>  
>   /* hw requires the start (boot) address be on 1KB boundary */
> @@ -155,7 +156,12 @@ static int da8xx_rproc_start(struct rproc *rproc)
>   return ret;
>   }
>  
> - davinci_clk_reset_deassert(dsp_clk);
> + ret = reset_control_deassert(dsp_reset);
> + if (ret) {
> + dev_err(dev, "reset_control_deassert() failed: %d\n", ret);
> + clk_disable_unprepare(dsp_clk);
> + return ret;
> + }
>  
>   return 0;
>  }
> @@ -163,8 +169,15 @@ static int da8xx_rproc_start(struct rproc *rproc)
>  static int da8xx_rproc_stop(struct rproc *rproc)
>  {
>   struct da8xx_rproc *drproc = rproc->priv;
> + struct device *dev = rproc->dev.parent;
> + int ret;
> +
> + ret = reset_control_assert(drproc->dsp_reset);
> + if (ret) {
> + dev_err(dev, "reset_control_assert() failed: %d\n", ret);
> + return ret;
> + }
>  
> - davinci_clk_reset_assert(drproc->dsp_clk);
>   clk_disable_unprepare(drproc->dsp_clk);
>  
>   return 0;
> @@ -232,6 +245,7 @@ static int da8xx_rproc_probe(struct platform_device *pdev)
>   struct resource *bootreg_res;
>   struct resource *chipsig_res;
>   struct clk *dsp_clk;
> + struct reset_control *dsp_reset;
>   void __iomem *chipsig;
>   void __iomem *bootreg;
>   int irq;
> @@ -268,6 +282,15 @@ static int da8xx_rproc_probe(struct platform_device 
> *pdev)
>   return PTR_ERR(dsp_clk);
>   }
>  
> + dsp_reset = devm_reset_control_get_exclusive(dev, NULL);
> + if (IS_ERR(dsp_reset)) {
> + if (PTR_ERR(dsp_reset) != -EPROBE_DEFER)
> + dev_err(dev, "unable to get reset control: %ld\n",
> + PTR_ERR(dsp_reset));
> +
> + return PTR_ERR(dsp_reset);
> + }
> +
>   if (dev->of_node) {
>   ret = of_reserved_mem_device_init(dev);
>   if (ret) {
> @@ -287,6 +310,7 @@ static int da8xx_rproc_probe(struct platform_device *pdev)
>   drproc = rproc->priv;
>   drproc->rproc = rproc;
>   drproc->dsp_clk = dsp_clk;
> + drproc->dsp_reset = dsp_reset;
>   rproc->has_iommu = false;
>  
>   ret = da8xx_rproc_get_internal_memories(pdev, drproc);
> @@ -309,7 +333,7 @@ static int da8xx_rproc_probe(struct platform_device *pdev)
>* *not* in reset, but da8xx_rproc_start() needs the DSP to be
>* held in reset at the time it is called.
>*/
> - ret = davinci_clk_reset_assert(drproc->dsp_clk);
> + ret = reset_control_assert(dsp_reset);
>   if (ret)
>   goto free_rproc;
>  
> -- 
> 2.17.1
> 


Re: Linux 4.18-rc7

2018-07-30 Thread John Stultz
On Mon, Jul 30, 2018 at 8:26 PM, Hugh Dickins  wrote:
> On Mon, 30 Jul 2018, Linus Torvalds wrote:
>> On Mon, Jul 30, 2018 at 2:53 PM Hugh Dickins  wrote:
>> >
>> > I have no problem with reverting -rc7's vma_is_anonymous() series.
>>
>> I don't think we need to revert the whole series: I think the rest are
>> all fairly obvious cleanups, and shouldn't really have any semantic
>> changes.
>
> Okay.
>
>>
>> It's literally only that last patch in the series that then changes
>> that meaning of "vm_ops". And I don't really _mind_ that last step
>> either, but since we don't know exactly what it was that it broke, and
>> we're past rc7, I don't think we really have any option but the revert
>> it.
>
> It took me a long time to grasp what was happening, that that last
> patch bfd40eaff5ab was fixing. Not quite explained in the commit.
>
> I think it was that by mistakenly passing the vma_is_anonymous() test,
> create_huge_pmd() gave the MAP_PRIVATE kcov mapping a THP (instead of
> COWing pages from kcov); which the truncate then had to split, and in
> going to do so, again hit the mistaken vma_is_anonymous() test, BUG.
>
>>
>> And if we revert it, I think we need to just remove the
>> VM_BUG_ON_VMA() that it was supposed to fix. Because I do think that
>> it is quite likely that the real bug is that overzealous BUG_ON(),
>> since I can't see any reason why anonymous mappings should be special
>> there.
>
> Yes, that probably has to go: but it's not clear what state it leaves
> us in, with an anon THP being split by a truncate, without the expected
> locking; I don't remember offhand, probably a subtler bug than that BUG,
> which you may or may not consider an improvement.
>
> I fear that Kirill has not missed inserting a vma_set_anonymous() from
> somewhere that it should be, but rather that zygote is working with some
> special mapping which used to satisfy vma_is_anonymous(), faults supplying
> backing pages, but now comes out as !vma_is_anonymous(), so do_fault()
> finds !dummy_vm_ops.fault hence SIGBUS.

I've been only casually following this thread (mostly just glad Amit
caught it and I could avoid having to bisect the issue in my own
Android testing), but this bit is starting to shake a few old cobwebs
loose in my brain.

I'm wondering if Zygote is utilizing ashmem here, and we're somehow
traversing ashmem purged memory, or due to some setup issue the
initial traverse isn't being zero-filled as expected?

ashmem ranges are created using: shmem_file_setup() and shmem_zero_setup()
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/staging/android/ashmem.c#n377


If we purge pages, it punches it out with:
vfs_fallocate(range->asma->file,
 FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
 start, end - start);
here:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/staging/android/ashmem.c#n447

But in ashmem_pin(), we don't do anything other than returning if we
purged any page in the range.
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/staging/android/ashmem.c#n577

And I believe the future assumption is that if we traverse those pages
they will be zero filled (if purged or even during the initial
traversal after mmap).

It's been a long time, and I've not looked at the code in question, but
it sounds from Hugh's comments above that we might instead get a
SIGBUS here.

Looking more at the problematic patch..
Amit: Does adding something like (whitespace damaged, apologies):

index a1a0025..1af6915 100644
--- a/drivers/staging/android/ashmem.c
+++ b/drivers/staging/android/ashmem.c
@@ -402,7 +402,8 @@ static int ashmem_mmap(struct file *file, struct
vm_area_struct *vma)
fput(asma->file);
goto out;
}
-   }
+   } else
+   vma_set_anonymous(vma);

if (vma->vm_file)
fput(vma->vm_file);


Seem to resolve it? (Sorry, I'd test it myself, but I'm away from my
desk for the night).
thanks
-john


linux-next: manual merge of the kvms390 tree with the kvm-arm tree

2018-07-30 Thread Stephen Rothwell
Hi all,

Today's linux-next merge of the kvms390 tree got a conflict in:

  include/uapi/linux/kvm.h

between commit:

  be26b3a73413 ("arm64: KVM: export the capability to set guest SError 
syndrome")

from the kvm-arm tree and commit:

  a449938297e5 ("KVM: s390: Add huge page enablement control")

from the kvms390 tree.

I fixed it up (see below) and can carry the fix as necessary. This
is now fixed as far as linux-next is concerned, but any non trivial
conflicts should be mentioned to your upstream maintainer when your tree
is submitted for merging.  You may also want to consider cooperating
with the maintainer of the conflicting tree to minimise any particularly
complex conflicts.

-- 
Cheers,
Stephen Rothwell

diff --cc include/uapi/linux/kvm.h
index a7d9bc4e4068,b955b986b341..
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@@ -949,7 -949,7 +949,8 @@@ struct kvm_ppc_resize_hpt 
  #define KVM_CAP_GET_MSR_FEATURES 153
  #define KVM_CAP_HYPERV_EVENTFD 154
  #define KVM_CAP_HYPERV_TLBFLUSH 155
 -#define KVM_CAP_S390_HPAGE_1M 156
 +#define KVM_CAP_ARM_INJECT_SERROR_ESR 156
++#define KVM_CAP_S390_HPAGE_1M 157
  
  #ifdef KVM_CAP_IRQ_ROUTING
  


Re: [PATCH v0 3/4] drivers: edac: Add cache erp driver for Last Level Cache Controller (LLCC)

2018-07-30 Thread Borislav Petkov
On Mon, Jul 30, 2018 at 02:38:01PM -0700, vnkgu...@codeaurora.org wrote:
> Do you mean the Signed-off-by lines above? That's because
> Channagoud is the one who is the original author of this driver,
> and I'm the one who did the incremental changes (changes in llcc)
> and uploading it upstream.
> That's why the Signed-off is like that.
> Which way do you think it should be?

Then you need to figure out between you two who the author should be
because we have single authorship. When you do, commit it in git with

git commit --amend --author=...

so that that is reflected properly.

For expressing stuff like co-authorship we have

Co-Developed-by:

All explained in submitting-patches.rst.

-- 
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
--


Re: [PATCH] remoteproc: qcom: fix Q6V5_WCSS dependencies

2018-07-30 Thread Bjorn Andersson
On Wed 18 Jul 04:16 PDT 2018, Arnd Bergmann wrote:

> A new driver got added that depends on QCOM_SMD and fails to link
> as built-in with CONFIG_QCOM_SMD=m:
> 
> drivers/remoteproc/qcom_common.o: In function `smd_subdev_stop':
> qcom_common.c:(.text+0x674): undefined reference to `qcom_smd_unregister_edge'
> drivers/remoteproc/qcom_common.o: In function `smd_subdev_start':
> qcom_common.c:(.text+0x700): undefined reference to `qcom_smd_register_edge'
> 
> We've fixed the same thing several times before, so use the same
> dependency here.
> 

I have a change where I remove this inherited dependency, so I asked
Sricharan to only depend on the ones he actually uses - but I forgot
that I haven't pushed this.

Sorry about screwing this up again. Thanks for catching it!

Regards,
Bjorn

> Fixes: 3a3d4163e0bf ("remoteproc: qcom: Introduce Hexagon V5 based WCSS 
> driver")
> Signed-off-by: Arnd Bergmann 
> ---
>  drivers/remoteproc/Kconfig | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/remoteproc/Kconfig b/drivers/remoteproc/Kconfig
> index 0dde3753c3a5..052d4dd347f9 100644
> --- a/drivers/remoteproc/Kconfig
> +++ b/drivers/remoteproc/Kconfig
> @@ -127,6 +127,7 @@ config QCOM_Q6V5_WCSS
>   tristate "Qualcomm Hexagon based WCSS Peripheral Image Loader"
>   depends on OF && ARCH_QCOM
>   depends on QCOM_SMEM
> + depends on RPMSG_QCOM_SMD || (COMPILE_TEST && RPMSG_QCOM_SMD=n)
>   depends on RPMSG_QCOM_GLINK_SMEM || RPMSG_QCOM_GLINK_SMEM=n
>   depends on QCOM_SYSMON || QCOM_SYSMON=n
>   select MFD_SYSCON
> -- 
> 2.9.0
> 
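
For readers unfamiliar with the "FOO || FOO=n" idiom, here is a small sketch
of how Kconfig evaluates the line added above. Kconfig orders tristate values
as n < m < y, treats "&&" as minimum and "||" as maximum, and evaluates "X=n"
to y when X is n and to n otherwise. The enum and function names below are
invented for this illustration and do not exist in the kernel tree.

```c
#include <assert.h>

/* Kconfig tristate values, ordered n < m < y. */
enum tristate { TRI_N = 0, TRI_M = 1, TRI_Y = 2 };

/* Kconfig "&&" is the minimum of both operands. */
enum tristate tri_and(enum tristate a, enum tristate b)
{
	return a < b ? a : b;
}

/* Kconfig "||" is the maximum of both operands. */
enum tristate tri_or(enum tristate a, enum tristate b)
{
	return a > b ? a : b;
}

/* Kconfig "X=n" is y when X is n, and n otherwise. */
enum tristate tri_is_n(enum tristate a)
{
	return a == TRI_N ? TRI_Y : TRI_N;
}

/*
 * Upper bound imposed on QCOM_Q6V5_WCSS by:
 *   depends on RPMSG_QCOM_SMD || (COMPILE_TEST && RPMSG_QCOM_SMD=n)
 * (hypothetical helper, for illustration only)
 */
enum tristate wcss_smd_limit(enum tristate smd, enum tristate compile_test)
{
	assert(compile_test != TRI_M);	/* COMPILE_TEST is a bool symbol */
	return tri_or(smd, tri_and(compile_test, tri_is_n(smd)));
}
```

With RPMSG_QCOM_SMD=m the expression evaluates to m, so the driver can no
longer be built in while its dependency is a module, which is what avoids the
link failure quoted above.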


Re: [PATCH 2/2] net/9p: add a per-client fcall kmem_cache

2018-07-30 Thread Dominique Martinet
Matthew Wilcox wrote on Mon, Jul 30, 2018:
> On Mon, Jul 30, 2018 at 11:34:23AM +0200, Dominique Martinet wrote:
> > -static int p9_fcall_alloc(struct p9_fcall *fc, int alloc_msize)
> > +static int p9_fcall_alloc(struct p9_client *c, struct p9_fcall *fc,
> > + int alloc_msize)
> >  {
> > -   fc->sdata = kmalloc(alloc_msize, GFP_NOFS);
> > +   if (c->fcall_cache && alloc_msize == c->msize)
> > +   fc->sdata = kmem_cache_alloc(c->fcall_cache, GFP_NOFS);
> > +   else
> > +   fc->sdata = kmalloc(alloc_msize, GFP_NOFS);
> 
> Could you simplify this by initialising c->msize to 0 and then this
> can simply be:
> 
> > +   if (alloc_msize == c->msize)
> ...

Hmm, this is rather tricky with the current flow of things;
p9_client_version() has multiple uses for that msize field.

Basically what happens is:
 - init client struct, set clip msize to mount option/transport-specific
max
 - p9_client_version() uses current c->msize to send a suggested value
to the server
 - p9_client_rpc() uses current c->msize to allocate that first rpc,
this is pretty much hard-coded and will be quite intrusive to make an
exception for
 - p9_client_version() looks at the msize the server suggested and clips
c->msize if the reply's is smaller than c->msize
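
A rough userspace sketch of that flow, for illustration only: struct client,
client_init() and client_version() are made-up stand-ins rather than the real
net/9p code, and the server's reply is reduced to a plain integer. It shows
why the very first request buffer is sized from the pre-negotiation msize.

```c
#include <assert.h>

/* Hypothetical stand-in for struct p9_client; only msize is modelled. */
struct client {
	unsigned int msize;
};

/* Step 1: start from the mount option, clipped to the transport maximum. */
void client_init(struct client *c, unsigned int opt_msize,
		 unsigned int trans_max)
{
	c->msize = opt_msize < trans_max ? opt_msize : trans_max;
}

/*
 * Steps 2-4: the current msize is both the value suggested to the server
 * and the size of the first request buffer. Only afterwards is msize
 * clipped to the server's reply, if that is smaller.
 * Returns the size that was used for the first buffer.
 */
unsigned int client_version(struct client *c, unsigned int server_msize)
{
	unsigned int first_alloc = c->msize;

	assert(first_alloc > 0);	/* the first rpc needs a real buffer size */
	if (server_msize < c->msize)
		c->msize = server_msize;

	return first_alloc;
}
```

In this model, starting with msize set to 0 would leave the first request
without a usable buffer size, which is the difficulty described above.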


I kind of agree it'd be nice to remove that check being done all the
time for just startup, but I don't see how to do this easily with the
current code.

Making p9_client_version take an extra argument would be easy but we'd
need to actually hardcode in p9_client_rpc that "if the message type is
TVERSION then use [page size or whatever] for allocation" and that kind
of kills the point... The alternative being having p9_client_rpc take
the actual size as argument itself but this once again is pretty
intrusive even if it could be done mechanically...

I'll think about this some more

> > +void p9_fcall_free(struct p9_client *c, struct p9_fcall *fc)
> > +{
> > +   /* sdata can be NULL for interrupted requests in trans_rdma,
> > +* and kmem_cache_free does not do NULL-check for us
> > +*/
> > +   if (unlikely(!fc->sdata))
> > +   return;
> > +
> > +   if (c->fcall_cache && fc->capacity == c->msize)
> > +   kmem_cache_free(c->fcall_cache, fc->sdata);
> > +   else
> > +   kfree(fc->sdata);
> > +}
> 
> Is it possible for fcall_cache to be allocated before fcall_free is
> called?  I'm concerned we might do this:
> 
> allocate message A
> allocate message B
> receive response A
> allocate fcall_cache
> receive response B
> 
> and then we'd call kmem_cache_free() for something allocated by kmalloc(),
> which works with slab and slub, but doesn't work with slob (alas).

Bleh, I checked this would work for slab and didn't really check
others..

This cannot happen right now because we only return the client struct
from p9_client_create after the first message is done (and, right now,
freed) but when we start adding refcounting to requests it'd be possible
to free the very first response after fcall_cache is allocated with a
"bad" server, like syzkaller does, sending the version reply before the
request came in.

I can't see any workaround for this other than storing how the fcall
was allocated in the struct itself though...
I guess I might as well do that now, unless you have a better idea.
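
One way the "remember how it was allocated" idea could look, as a userspace
sketch under stated assumptions: struct pool, struct fcall, fcall_alloc() and
fcall_free() are hypothetical names, malloc()/free() stand in for both
allocators, and a counter stands in for the kmem_cache. It is not taken from
the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a kmem_cache: it only counts live objects. */
struct pool {
	size_t obj_size;
	int live;
};

/* Hypothetical stand-in for struct p9_fcall. */
struct fcall {
	size_t capacity;
	void *sdata;
	struct pool *owner;	/* pool that owns sdata, NULL for plain malloc() */
};

/* Use the pool only if it already exists and the size matches. */
int fcall_alloc(struct pool *p, struct fcall *fc, size_t size)
{
	fc->owner = NULL;
	fc->sdata = malloc(size);
	if (!fc->sdata)
		return -1;

	fc->capacity = size;
	if (p && size == p->obj_size) {
		fc->owner = p;
		p->live++;
	}
	return 0;
}

/* Free according to the recorded owner, not the pool's current state. */
void fcall_free(struct fcall *fc)
{
	if (!fc->sdata)
		return;

	if (fc->owner) {
		assert(fc->owner->live > 0);
		fc->owner->live--;
	}
	free(fc->sdata);
	fc->sdata = NULL;
	fc->owner = NULL;
}
```

Because the owner is recorded at allocation time, a buffer allocated before
the pool exists is never handed back to the pool, whatever the client's msize
happens to be by the time it is freed.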


> > @@ -980,6 +1000,9 @@ struct p9_client *p9_client_create(const char 
> > *dev_name, char *options)
> > if (err)
> > goto close_trans;
> >  
> > +   clnt->fcall_cache = kmem_cache_create("9p-fcall-cache", clnt->msize,
> > + 0, 0, NULL);
> > +
> 
> If we have slab merging turned off, or we have two mounts from servers
> with different msizes, we'll end up with two slabs called 9p-fcall-cache.
> I'm OK with that, but are you?

Yeah, the reason I didn't make it global like p9_req_cache is precisely
to get two separate caches if the msizes are different.

I actually considered adding msize to the string with snprintf or
something but someone looking at it through slabinfo or similar will
have the sizes anyway so I don't think this would bring anything, do you
know if/think that tools will choke on multiple caches with the same
name?


I'm not sure about slab merging being disabled though, from the little I
understand I do not see why anyone would do that except for debugging,
and I'm fine with that.
Please let me know if I'm missing something though!


Thanks for the review,
-- 
Dominique Martinet


Re: [PATCH 34/38] vfs: syscall: Add fsinfo() to query filesystem information [ver #10]

2018-07-30 Thread Al Viro
On Fri, Jul 27, 2018 at 06:35:10PM +0100, David Howells wrote:
> params->request indicates the attribute/attributes to be queried.  This can
> be one of:
> 
>   fsinfo_attr_statfs  - statfs-style info
>   fsinfo_attr_fsinfo  - Information about fsinfo()
>   fsinfo_attr_ids - Filesystem IDs
>   fsinfo_attr_limits  - Filesystem limits
>   fsinfo_attr_supports- What's supported in statx(), IOC flags
>   fsinfo_attr_capabilities- Filesystem capabilities
>   fsinfo_attr_timestamp_info  - Inode timestamp info
>   fsinfo_attr_volume_id   - Volume ID (string)
>   fsinfo_attr_volume_uuid - Volume UUID
>   fsinfo_attr_volume_name - Volume name (string)
>   fsinfo_attr_cell_name   - Cell name (string)
>   fsinfo_attr_domain_name - Domain name (string)
>   fsinfo_attr_realm_name  - Realm name (string)
>   fsinfo_attr_server_name - Name of the Nth server (string)
>   fsinfo_attr_server_address  - Mth address of the Nth server
>   fsinfo_attr_parameter   - Nth mount parameter (string)
>   fsinfo_attr_source  - Nth mount source name (string)
>   fsinfo_attr_name_encoding   - Filename encoding (string)
>   fsinfo_attr_name_codepage   - Filename codepage (string)
>   fsinfo_attr_io_size - I/O size hints

Umm...  What's so special about cell/volume/domain/realm?  And
what do we do when a random filesystem gets added - should its
parameters go into catch-all pile (attr_parameter), or should they
get classes of their own?

For Cthulhu sake, who's going to maintain that enum in face of
random out-of-tree filesystems, each wanting a class or two of its own?
We'd tried that with device numbers; ask hpa how well has that
worked and how much did he love the whole experience...


Re: [PATCH] remoteproc: Reset table_ptr in rproc_start() failure paths

2018-07-30 Thread Bjorn Andersson
On Thu 26 Jul 18:15 PDT 2018, Suman Anna wrote:

> Unwind the modified table_ptr and restore it to the local copy
> upon any subsequent failures in the rproc_start() function. This
> keeps the function to remain balanced on failures without the need
> to balance any modified variables elsewhere.
> 

Good catch.

> While at this, do some minor cleanup of the extra lines between
> the failure labels as well.
> 
> Signed-off-by: Suman Anna 
> ---
>  drivers/remoteproc/remoteproc_core.c | 7 ---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/remoteproc/remoteproc_core.c b/drivers/remoteproc/remoteproc_core.c
> index eadff6ce2f7f..afef2d491c5b 100644
> --- a/drivers/remoteproc/remoteproc_core.c
> +++ b/drivers/remoteproc/remoteproc_core.c
> @@ -953,7 +953,7 @@ static int rproc_start(struct rproc *rproc, const struct firmware *fw)
>   if (ret) {
>   dev_err(dev, "failed to prepare subdevices for %s: %d\n",
>   rproc->name, ret);
> - return ret;
> + goto reset_table_ptr;
>   }
>  
>   /* power up the remote processor */
> @@ -979,10 +979,11 @@ static int rproc_start(struct rproc *rproc, const struct firmware *fw)
>  
>  stop_rproc:
>   rproc->ops->stop(rproc);
> -
>  unprepare_subdevices:
>   rproc_unprepare_subdevices(rproc);
> -
> +reset_table_ptr:
> + if (loaded_table)

Regardless of us having a loaded_table it should have the same value as
cached_table when we return - which might be NULL if we don't have a
resource table.

So I applied this without the conditional, please object if I missed
something.
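
A simplified userspace sketch of that unwind path with the unconditional
reset. struct rproc_sketch, start_sketch() and the fail_at parameter are
hypothetical stand-ins, not the remoteproc API:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-in for struct rproc. */
struct rproc_sketch {
	int *table_ptr;		/* table currently in use */
	int *cached_table;	/* local copy; NULL without a resource table */
};

/* fail_at selects the failing step: 0 = none, 1 = prepare, 2 = power-up. */
static int start_sketch(struct rproc_sketch *r, int *loaded_table, int fail_at)
{
	int ret;

	assert(r != NULL);
	if (loaded_table)
		r->table_ptr = loaded_table;	/* switch to the device copy */

	if (fail_at == 1) {
		ret = -1;
		goto reset_table_ptr;
	}
	if (fail_at == 2) {
		ret = -2;
		goto unprepare_subdevices;
	}
	return 0;

unprepare_subdevices:
	/* the subdevice unprepare step would run here */
reset_table_ptr:
	/* unconditional: correct even when cached_table is NULL */
	r->table_ptr = r->cached_table;
	return ret;
}
```

Every failure path leaves table_ptr equal to cached_table, including the
case where there is no resource table and both are NULL.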

Regards,
Bjorn

> + rproc->table_ptr = rproc->cached_table;
>   return ret;
>  }
>  
> -- 
> 2.18.0
> 


RE: [PATCH V2] x86/speculation: Support Enhanced IBRS on future CPUs

2018-07-30 Thread Prakhya, Sai Praneeth
> There is no reason not to use indentation and quotation marks in a changelog.
> Squeezing it into square brackets does not really improve readability.
> 
>  From the specification [1]:
> 
>   "With enhanced IBRS, the predicted targets of indirect branches executed
>cannot be controlled by software that was executed in a less privileged
>
>to entering a sleep state such as MWAIT or HLT."
> 
> Hmm?

Yes, this looks good. I have changed commit message in V3 accordingly.

> > x86_spec_ctrl_set_guest() before entering guest and
> > x86_spec_ctrl_restore_host() after leaving guest. So, the guest view
> > of SPEC_CTRL MSR is restored before entering guest and the host view
> > of SPEC_CTRL MSR is restored before entering host and hence IBRS will
> > be set after VMEXIT.
> 
> What you are trying to say here is:
> 
>  If Enhanced IBRS is selected as mitigation mechanism the IBRS bit is set
>  once at boot time and never cleared. This also has to be ensured after a
>  VMEXIT because the guest might have cleared the bit. This is already
>  covered by the existing x86_spec_ctrl_set_guest() and
>  x86_spec_ctrl_restore_host() speculation control functions.
> 
> Hmm?

Yes, that's correct. It's simple and concise. I have updated commit message as 
suggested.

> 
> > Intel's white paper on Retpoline [2] says that "Retpoline is known to
> > be an effective branch target injection (Spectre variant 2) mitigation
> > on Intel processors belonging to family 6 (enumerated by the CPUID
> > instruction) that do not have support for enhanced IBRS. On processors
> > that support enhanced IBRS, it should be used for mitigation instead
> > of retpoline."
> >
> > This means, Intel recommends using Enhanced IBRS over Retpoline where
> > available and it also means that retpoline provides less mitigation on
> > processors with enhanced IBRS compared to those without.
> 
> The cited part of the white paper does not say that it's a broader
> mitigation than what Retpoline covers. It merely recommends to use
> enhanced IBRS without providing a reason.
> 
> But chapter 4.3 contains the real reason for using Enhanced IBRS: The
> processors which support it also support CET and CET does not work well with
> Retpoline.
> 
> Please provide facts and not interpretations.

Sorry! got it.

> If you have additional knowledge
> about a broader mitigation scope, then clearly say so:
>

Hmm.. no.. not really. I just learned it from Dave.

> > Hence, on processors that support Enhanced IBRS, this patch makes
> > Enhanced IBRS as
> 
> Please search Documentation/process/submitting-patches.rst for 'This patch' 
> 
> 

Yes, got it. Will refrain from using 'This patch' further.

>  The reason why Enhanced IBRS is the recommended mitigation on processors
>  which support it is that these processors also support CET which provides
>  a defense against ROP attacks. Retpoline is very similar to ROP techniques
>  and might trigger false positives in the CET defense.
> 
>  Enhanced IBRS still requires IBPB for full mitigation.
> 
> See? You might have noticed that aside of restructuring and enhancing the
> information I also got rid of all 'we' instances. Using 'we' is a form of
> impersonation which IMO blurs the technicality of a changelog.

Makes sense. Will stop using 'we' further.

> 
> 
> > [1] https://software.intel.com/sites/default/files/managed/c5/63/336996-Speculative-Execution-Side-Channel-Mitigations.pdf
> > [2] https://software.intel.com/sites/default/files/managed/1d/46/Retpoline-A-Branch-Target-Injection-Mitigation.pdf
> 
> These links are not really useful as sooner than later they are going to
> be invalid. We recently started to put copies of such documents into the
> kernel.org bugzilla and I'm pretty sure we have at least one of them
> already in one of the speculation mess related BZs. Could you please track
> that down and make sure that both files are available there in the latest
> version. Then provide links to the BZ entry which are more likely to
> survive than content on a corporate website.
>

Sure! Makes sense.
I have updated Bugzilla link that has these documents and also updated commit 
message in V3.

> > @@ -219,6 +219,7 @@
> >  #define X86_FEATURE_IBPB   ( 7*32+26) /* Indirect Branch Prediction Barrier */
> >  #define X86_FEATURE_STIBP  ( 7*32+27) /* Single Thread Indirect Branch Predictors */
> >  #define X86_FEATURE_ZEN    ( 7*32+28) /* "" CPU is AMD family 0x17 (Zen) */
> > +#define X86_FEATURE_IBRS_ENHANCED  ( 7*32+29) /* "ibrs_enhanced" Use Enhanced IBRS in kernel */
> 
> That "ibrs_enhanced" part is not needed.

Just wanted to confirm with you before removing it. Presently, on
platforms that support features like arch_capabilities, stibp, ibrs and
ibpb, /proc/cpuinfo does show them. Do you think we should really skip
showing the Enhanced IBRS capability?

> And 'Use' is also wrong. The feature
> merely reflects the 

Re: [PATCH] arch/x86: Fix boot_cpu_data.microcode version output

2018-07-30 Thread Borislav Petkov
On Mon, Jul 30, 2018 at 01:53:50PM -0400, Prarit Bhargava wrote:
> I think this has to be
> 
>   boot_cpu_data.microcode = mc_amd->hdr.patch_id;

Yes, it does.

-- 
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
--


Re: [PATCH] hwspinlock: Fix incorrect return pointers

2018-07-30 Thread Bjorn Andersson
On Mon 30 Jul 04:34 PDT 2018, Baolin Wang wrote:

> Hi Bjorn,
> 
> On 28 June 2018 at 10:32, Baolin Wang  wrote:
> > The commit 4f1acd758b08 ("hwspinlock: Add devm_xxx() APIs to request/free
> > hwlock") introduces one bug, that will return one error pointer if failed
> > to request one hwlock, but we expect NULL pointer on error for consumers.
> > This patch will fix this issue.
> >
> > Reported-by: Dan Carpenter 
> > Signed-off-by: Baolin Wang 
> 
> Could you pick up this patch which fixes the incorrect return value
> issue? Thanks.
> 

I thought I had picked this already, it's applied now. Sorry about the
delay.

Regards,
Bjorn

> > ---
> >  drivers/hwspinlock/hwspinlock_core.c |8 
> >  1 file changed, 4 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/hwspinlock/hwspinlock_core.c b/drivers/hwspinlock/hwspinlock_core.c
> > index e16d648..2bad40d 100644
> > --- a/drivers/hwspinlock/hwspinlock_core.c
> > +++ b/drivers/hwspinlock/hwspinlock_core.c
> > @@ -877,10 +877,10 @@ struct hwspinlock *devm_hwspin_lock_request(struct device *dev)
> >
> > ptr = devres_alloc(devm_hwspin_lock_release, sizeof(*ptr), GFP_KERNEL);
> > if (!ptr)
> > -   return ERR_PTR(-ENOMEM);
> > +   return NULL;
> >
> > hwlock = hwspin_lock_request();
> > -   if (!IS_ERR(hwlock)) {
> > +   if (hwlock) {
> > *ptr = hwlock;
> > devres_add(dev, ptr);
> > } else {
> > @@ -913,10 +913,10 @@ struct hwspinlock *devm_hwspin_lock_request_specific(struct device *dev,
> >
> > ptr = devres_alloc(devm_hwspin_lock_release, sizeof(*ptr), GFP_KERNEL);
> > if (!ptr)
> > -   return ERR_PTR(-ENOMEM);
> > +   return NULL;
> >
> > hwlock = hwspin_lock_request_specific(id);
> > -   if (!IS_ERR(hwlock)) {
> > +   if (hwlock) {
> > *ptr = hwlock;
> > devres_add(dev, ptr);
> > } else {
> > --
> > 1.7.9.5
> >
> 
> 
> 
> -- 
> Baolin Wang
> Best Regards
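
The mismatch the patch fixes can be illustrated with a userspace sketch.
err_ptr_sketch() and is_err_sketch() are local re-implementations that
only mirror the general idea of the kernel's ERR_PTR()/IS_ERR() helpers,
and consumer_accepts() is a hypothetical caller written against the
"NULL on failure" convention:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO_SKETCH 4095

/* Encode a negative errno value in a pointer. */
static void *err_ptr_sketch(long error)
{
	assert(error < 0 && error >= -MAX_ERRNO_SKETCH);
	return (void *)(intptr_t)error;
}

/* True only for pointers in the top MAX_ERRNO_SKETCH bytes of the range. */
static int is_err_sketch(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO_SKETCH;
}

/* Caller that treats any non-NULL return as a valid lock. */
static int consumer_accepts(const void *hwlock)
{
	return hwlock != NULL;
}
```

An error pointer is non-NULL, so such a consumer would go on to use it as
a lock; conversely, NULL is not an error pointer, so an IS_ERR()-style
check on a NULL-returning request function never takes the failure path.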


[PATCH v7 2/6] Uprobe: Additional argument arch_uprobe to uprobe_write_opcode()

2018-07-30 Thread Ravi Bangoria
Add an additional argument 'arch_uprobe' to uprobe_write_opcode().
This is needed by later patches in the series.

Signed-off-by: Ravi Bangoria 
---
 arch/arm/probes/uprobes/core.c | 2 +-
 arch/mips/kernel/uprobes.c | 2 +-
 include/linux/uprobes.h| 2 +-
 kernel/events/uprobes.c| 9 +
 4 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/arch/arm/probes/uprobes/core.c b/arch/arm/probes/uprobes/core.c
index d1329f1ba4e4..bf992264060e 100644
--- a/arch/arm/probes/uprobes/core.c
+++ b/arch/arm/probes/uprobes/core.c
@@ -32,7 +32,7 @@ bool is_swbp_insn(uprobe_opcode_t *insn)
 int set_swbp(struct arch_uprobe *auprobe, struct mm_struct *mm,
 unsigned long vaddr)
 {
-   return uprobe_write_opcode(mm, vaddr,
+   return uprobe_write_opcode(auprobe, mm, vaddr,
   __opcode_to_mem_arm(auprobe->bpinsn));
 }
 
diff --git a/arch/mips/kernel/uprobes.c b/arch/mips/kernel/uprobes.c
index f7a0645ccb82..4aaff3b3175c 100644
--- a/arch/mips/kernel/uprobes.c
+++ b/arch/mips/kernel/uprobes.c
@@ -224,7 +224,7 @@ unsigned long arch_uretprobe_hijack_return_addr(
 int __weak set_swbp(struct arch_uprobe *auprobe, struct mm_struct *mm,
unsigned long vaddr)
 {
-   return uprobe_write_opcode(mm, vaddr, UPROBE_SWBP_INSN);
+   return uprobe_write_opcode(auprobe, mm, vaddr, UPROBE_SWBP_INSN);
 }
 
 void arch_uprobe_copy_ixol(struct page *page, unsigned long vaddr,
diff --git a/include/linux/uprobes.h b/include/linux/uprobes.h
index 0a294e950df8..bb9d2084af03 100644
--- a/include/linux/uprobes.h
+++ b/include/linux/uprobes.h
@@ -121,7 +121,7 @@ extern bool is_swbp_insn(uprobe_opcode_t *insn);
 extern bool is_trap_insn(uprobe_opcode_t *insn);
 extern unsigned long uprobe_get_swbp_addr(struct pt_regs *regs);
 extern unsigned long uprobe_get_trap_addr(struct pt_regs *regs);
-extern int uprobe_write_opcode(struct mm_struct *mm, unsigned long vaddr, uprobe_opcode_t);
+extern int uprobe_write_opcode(struct arch_uprobe *auprobe, struct mm_struct *mm, unsigned long vaddr, uprobe_opcode_t);
 extern int uprobe_register(struct inode *inode, loff_t offset, struct uprobe_consumer *uc);
 extern int uprobe_apply(struct inode *inode, loff_t offset, struct uprobe_consumer *uc, bool);
 extern void uprobe_unregister(struct inode *inode, loff_t offset, struct uprobe_consumer *uc);
diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index 471eac896635..c0418ba52ba8 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -299,8 +299,8 @@ static int verify_opcode(struct page *page, unsigned long vaddr, uprobe_opcode_t
  * Called with mm->mmap_sem held for write.
  * Return 0 (success) or a negative errno.
  */
-int uprobe_write_opcode(struct mm_struct *mm, unsigned long vaddr,
-   uprobe_opcode_t opcode)
+int uprobe_write_opcode(struct arch_uprobe *auprobe, struct mm_struct *mm,
+   unsigned long vaddr, uprobe_opcode_t opcode)
 {
struct page *old_page, *new_page;
struct vm_area_struct *vma;
@@ -351,7 +351,7 @@ int uprobe_write_opcode(struct mm_struct *mm, unsigned long vaddr,
  */
 int __weak set_swbp(struct arch_uprobe *auprobe, struct mm_struct *mm, unsigned long vaddr)
 {
-   return uprobe_write_opcode(mm, vaddr, UPROBE_SWBP_INSN);
+   return uprobe_write_opcode(auprobe, mm, vaddr, UPROBE_SWBP_INSN);
 }
 
 /**
@@ -366,7 +366,8 @@ int __weak set_swbp(struct arch_uprobe *auprobe, struct mm_struct *mm, unsigned
 int __weak
 set_orig_insn(struct arch_uprobe *auprobe, struct mm_struct *mm, unsigned long vaddr)
 {
-   return uprobe_write_opcode(mm, vaddr, *(uprobe_opcode_t *)&auprobe->insn);
+   return uprobe_write_opcode(auprobe, mm, vaddr,
+   *(uprobe_opcode_t *)&auprobe->insn);
 }
 
 static struct uprobe *get_uprobe(struct uprobe *uprobe)
-- 
2.14.4



[PATCH v7 0/6] Uprobes: Support SDT markers having reference count (semaphore)

2018-07-30 Thread Ravi Bangoria
v7 changes:
 - Don't allow both zero and non-zero reference counter offset at
   the same time. It's painful and error prone.
 - Don't call delayed_uprobe_install() if vma->vm_mm does not have
   any breakpoint installed.

v6: https://lkml.org/lkml/2018/7/16/353


Description:
Userspace Statically Defined Tracepoints[1] are dtrace style markers
inside userspace applications. Applications like PostgreSQL, MySQL,
Pthread, Perl, Python, Java, Ruby, Node.js, libvirt, QEMU, glib etc
have these markers embedded in them. These markers are added by developers
at important places in the code. Each marker source expands to a single
nop instruction in the compiled code, but there may be additional
overhead for computing the marker arguments, which expands to a couple of
instructions. If that overhead is significant, its execution can be
skipped with a runtime if() condition when no one is tracing on the marker:

    if (reference_counter > 0) {
        Execute marker instructions;
    }

Default value of reference counter is 0. Tracer has to increment the 
reference counter before tracing on a marker and decrement it when
done with the tracing.
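
The protocol described above can be modelled in plain C. This is only an
illustration: ref_ctr, tracer_attach(), tracer_detach() and loop_body()
are hypothetical names, and marker_hits stands in for the marker firing:

```c
#include <assert.h>

static unsigned short ref_ctr;	/* default 0: nobody is tracing */
static int marker_hits;		/* counts executions of the guarded marker */

/* A tracer bumps the counter before it starts tracing the marker... */
static void tracer_attach(void)
{
	ref_ctr++;
}

/* ...and drops it again when it is done. */
static void tracer_detach(void)
{
	assert(ref_ctr > 0);
	ref_ctr--;
}

/* Application side: argument setup and marker run only when traced. */
static void loop_body(int i)
{
	if (ref_ctr > 0) {
		/* argument computation and the marker nop would live here */
		marker_hits++;
	}
	(void)i;
}
```

While the counter stays at 0 the guarded marker is never reached, which
is why a tracer that does not increment it sees no events.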

Currently, the perf tool has limited support for SDT markers, i.e. it
cannot trace markers surrounded by a reference counter. Also, it's
not easy to add reference counter logic in a userspace tool like perf,
so the basic idea of this patchset is to add reference counter logic in
the uprobe infrastructure. Ex,[2]

  # cat tick.c
    ...
    for (i = 0; i < 100; i++) {
        DTRACE_PROBE1(tick, loop1, i);
        if (TICK_LOOP2_ENABLED()) {
            DTRACE_PROBE1(tick, loop2, i);
        }
        printf("hi: %d\n", i);
        sleep(1);
    }
    ...

Here tick:loop1 is a marker without a reference counter, whereas tick:loop2
is surrounded by a reference counter condition.

  # perf buildid-cache --add /tmp/tick
  # perf probe sdt_tick:loop1
  # perf probe sdt_tick:loop2

  # perf stat -e sdt_tick:loop1,sdt_tick:loop2 -- /tmp/tick
  hi: 0
  hi: 1
  hi: 2
  ^C
  Performance counter stats for '/tmp/tick':
 3  sdt_tick:loop1
 0  sdt_tick:loop2
 2.747086086 seconds time elapsed

Perf failed to record data for tick:loop2. Same experiment with this
patch series:

  # ./perf buildid-cache --add /tmp/tick
  # ./perf probe sdt_tick:loop2
  # ./perf stat -e sdt_tick:loop2 /tmp/tick
hi: 0
hi: 1
hi: 2
^C  
 Performance counter stats for '/tmp/tick':
 3  sdt_tick:loop2
   2.561851452 seconds time elapsed


Ravi Bangoria (6):
  Uprobes: Simplify uprobe_register() body
  Uprobe: Additional argument arch_uprobe to uprobe_write_opcode()
  Uprobes: Support SDT markers having reference count (semaphore)
  Uprobes/sdt: Prevent multiple reference counter for same uprobe
  trace_uprobe/sdt: Prevent multiple reference counter for same uprobe
  perf probe: Support SDT markers having reference counter (semaphore)

 arch/arm/probes/uprobes/core.c |   2 +-
 arch/mips/kernel/uprobes.c |   2 +-
 include/linux/uprobes.h|   7 +-
 kernel/events/uprobes.c| 315 +++--
 kernel/trace/trace.c   |   2 +-
 kernel/trace/trace_uprobe.c|  75 +-
 tools/perf/util/probe-event.c  |  39 -
 tools/perf/util/probe-event.h  |   1 +
 tools/perf/util/probe-file.c   |  34 -
 tools/perf/util/probe-file.h   |   1 +
 tools/perf/util/symbol-elf.c   |  46 --
 tools/perf/util/symbol.h   |   7 +
 12 files changed, 459 insertions(+), 72 deletions(-)

-- 
2.14.4



[PATCH v7 5/6] trace_uprobe/sdt: Prevent multiple reference counter for same uprobe

2018-07-30 Thread Ravi Bangoria
We assume there is only one reference counter per uprobe.
Don't allow the user to add multiple trace_uprobe entries having
the same inode+offset but a different reference counter.

Ex,
  # echo "p:sdt_tick/loop2 /home/ravi/tick:0x6e4(0x10036)" > uprobe_events
  # echo "p:sdt_tick/loop2_1 /home/ravi/tick:0x6e4(0xf)" >> uprobe_events
  bash: echo: write error: Invalid argument

  # dmesg
  trace_kprobe: Reference counter offset mismatch.

There is one exception: when the user replaces an old entry with a
new one (same group/event), we allow it as long as the new entry
does not conflict with any other existing entry.
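
The rule above can be sketched in plain userspace C. This is only an
illustration, not the kernel code: struct probe_entry, its fields and
check_ref_ctr_conflict() are hypothetical names, and a flat array
stands in for the kernel's list of trace_uprobe entries.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified view of one uprobe_events entry. */
struct probe_entry {
	const char *event;		/* "group/event" name */
	unsigned long inode;		/* stand-in for the file's inode */
	unsigned long offset;		/* probe offset in the file */
	unsigned long ref_ctr_offset;	/* reference counter offset */
};

/*
 * Return 0 if 'cand' may be added next to table[0..n), -1 if another
 * entry has the same inode+offset but a different reference counter
 * offset. An entry with the same name is the one being replaced, so
 * it is skipped.
 */
static int check_ref_ctr_conflict(const struct probe_entry *table, size_t n,
				  const struct probe_entry *cand)
{
	size_t i;

	assert(cand != NULL && (table != NULL || n == 0));

	for (i = 0; i < n; i++) {
		const struct probe_entry *cur = &table[i];

		if (strcmp(cur->event, cand->event) == 0)
			continue;	/* old entry that gets replaced */

		if (cur->inode == cand->inode &&
		    cur->offset == cand->offset &&
		    cur->ref_ctr_offset != cand->ref_ctr_offset)
			return -1;	/* reference counter mismatch */
	}
	return 0;
}
```

With the two entries from the example, adding sdt_tick/loop2_1 with
0xf is rejected, while re-adding sdt_tick/loop2 itself with a new
reference counter offset is accepted.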

Signed-off-by: Ravi Bangoria 
---
 kernel/trace/trace_uprobe.c | 37 +++--
 1 file changed, 35 insertions(+), 2 deletions(-)

diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c
index bf2be098eb08..be64d943d7ea 100644
--- a/kernel/trace/trace_uprobe.c
+++ b/kernel/trace/trace_uprobe.c
@@ -324,6 +324,35 @@ static int unregister_trace_uprobe(struct trace_uprobe *tu)
return 0;
 }
 
+/*
+ * Uprobe with multiple reference counter is not allowed. i.e.
+ * If inode and offset matches, reference counter offset *must*
+ * match as well. Though, there is one exception: If user is
+ * replacing old trace_uprobe with new one(same group/event),
+ * then we allow same uprobe with new reference counter as far
+ * as the new one does not conflict with any other existing
+ * ones.
+ */
+static struct trace_uprobe *find_old_trace_uprobe(struct trace_uprobe *new)
+{
+   struct trace_uprobe *tmp, *old = NULL;
+   struct inode *new_inode = d_real_inode(new->path.dentry);
+
+   old = find_probe_event(trace_event_name(&new->tp.call),
+   new->tp.call.class->system);
+
+   list_for_each_entry(tmp, &uprobe_list, list) {
+   if ((old ? old != tmp : true) &&
+   new_inode == d_real_inode(tmp->path.dentry) &&
+   new->offset == tmp->offset &&
+   new->ref_ctr_offset != tmp->ref_ctr_offset) {
+   pr_warn("Reference counter offset mismatch.");
+   return ERR_PTR(-EINVAL);
+   }
+   }
+   return old;
+}
+
 /* Register a trace_uprobe and probe_event */
 static int register_trace_uprobe(struct trace_uprobe *tu)
 {
@@ -333,8 +362,12 @@ static int register_trace_uprobe(struct trace_uprobe *tu)
mutex_lock(&uprobe_lock);
 
/* register as an event */
-   old_tu = find_probe_event(trace_event_name(&tu->tp.call),
-   tu->tp.call.class->system);
+   old_tu = find_old_trace_uprobe(tu);
+   if (IS_ERR(old_tu)) {
+   ret = PTR_ERR(old_tu);
+   goto end;
+   }
+
if (old_tu) {
/* delete old event */
ret = unregister_trace_uprobe(old_tu);
-- 
2.14.4



[PATCH v7 1/6] Uprobes: Simplify uprobe_register() body

2018-07-30 Thread Ravi Bangoria
Simplify uprobe_register() function body and let __uprobe_register()
handle everything. Also move dependency functions around to fix build
failures.

Signed-off-by: Ravi Bangoria 
---
 kernel/events/uprobes.c | 69 ++---
 1 file changed, 36 insertions(+), 33 deletions(-)

diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index ccc579a7d32e..471eac896635 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -840,13 +840,8 @@ register_for_each_vma(struct uprobe *uprobe, struct 
uprobe_consumer *new)
return err;
 }
 
-static int __uprobe_register(struct uprobe *uprobe, struct uprobe_consumer *uc)
-{
-   consumer_add(uprobe, uc);
-   return register_for_each_vma(uprobe, uc);
-}
-
-static void __uprobe_unregister(struct uprobe *uprobe, struct uprobe_consumer 
*uc)
+static void
+__uprobe_unregister(struct uprobe *uprobe, struct uprobe_consumer *uc)
 {
int err;
 
@@ -860,24 +855,46 @@ static void __uprobe_unregister(struct uprobe *uprobe, 
struct uprobe_consumer *u
 }
 
 /*
- * uprobe_register - register a probe
+ * uprobe_unregister - unregister a already registered probe.
+ * @inode: the file in which the probe has to be removed.
+ * @offset: offset from the start of the file.
+ * @uc: identify which probe if multiple probes are colocated.
+ */
+void uprobe_unregister(struct inode *inode, loff_t offset, struct 
uprobe_consumer *uc)
+{
+   struct uprobe *uprobe;
+
+   uprobe = find_uprobe(inode, offset);
+   if (WARN_ON(!uprobe))
+   return;
+
+   down_write(&uprobe->register_rwsem);
+   __uprobe_unregister(uprobe, uc);
+   up_write(&uprobe->register_rwsem);
+   put_uprobe(uprobe);
+}
+EXPORT_SYMBOL_GPL(uprobe_unregister);
+
+/*
+ * __uprobe_register - register a probe
  * @inode: the file in which the probe has to be placed.
  * @offset: offset from the start of the file.
  * @uc: information on howto handle the probe..
  *
- * Apart from the access refcount, uprobe_register() takes a creation
+ * Apart from the access refcount, __uprobe_register() takes a creation
  * refcount (thro alloc_uprobe) if and only if this @uprobe is getting
  * inserted into the rbtree (i.e first consumer for a @inode:@offset
  * tuple).  Creation refcount stops uprobe_unregister from freeing the
  * @uprobe even before the register operation is complete. Creation
  * refcount is released when the last @uc for the @uprobe
- * unregisters. Caller of uprobe_register() is required to keep @inode
+ * unregisters. Caller of __uprobe_register() is required to keep @inode
  * (and the containing mount) referenced.
  *
  * Return errno if it cannot successully install probes
  * else return 0 (success)
  */
-int uprobe_register(struct inode *inode, loff_t offset, struct uprobe_consumer 
*uc)
+static int __uprobe_register(struct inode *inode, loff_t offset,
+struct uprobe_consumer *uc)
 {
struct uprobe *uprobe;
int ret;
@@ -904,7 +921,8 @@ int uprobe_register(struct inode *inode, loff_t offset, 
struct uprobe_consumer *
down_write(&uprobe->register_rwsem);
ret = -EAGAIN;
if (likely(uprobe_is_active(uprobe))) {
-   ret = __uprobe_register(uprobe, uc);
+   consumer_add(uprobe, uc);
+   ret = register_for_each_vma(uprobe, uc);
if (ret)
__uprobe_unregister(uprobe, uc);
}
@@ -915,6 +933,12 @@ int uprobe_register(struct inode *inode, loff_t offset, 
struct uprobe_consumer *
goto retry;
return ret;
 }
+
+int uprobe_register(struct inode *inode, loff_t offset,
+   struct uprobe_consumer *uc)
+{
+   return __uprobe_register(inode, offset, uc);
+}
 EXPORT_SYMBOL_GPL(uprobe_register);
 
 /*
@@ -946,27 +970,6 @@ int uprobe_apply(struct inode *inode, loff_t offset,
return ret;
 }
 
-/*
- * uprobe_unregister - unregister a already registered probe.
- * @inode: the file in which the probe has to be removed.
- * @offset: offset from the start of the file.
- * @uc: identify which probe if multiple probes are colocated.
- */
-void uprobe_unregister(struct inode *inode, loff_t offset, struct 
uprobe_consumer *uc)
-{
-   struct uprobe *uprobe;
-
-   uprobe = find_uprobe(inode, offset);
-   if (WARN_ON(!uprobe))
-   return;
-
-   down_write(&uprobe->register_rwsem);
-   __uprobe_unregister(uprobe, uc);
-   up_write(&uprobe->register_rwsem);
-   put_uprobe(uprobe);
-}
-EXPORT_SYMBOL_GPL(uprobe_unregister);
-
 static int unapply_uprobe(struct uprobe *uprobe, struct mm_struct *mm)
 {
struct vm_area_struct *vma;
-- 
2.14.4



[PATCH v7 2/6] Uprobe: Additional argument arch_uprobe to uprobe_write_opcode()

2018-07-30 Thread Ravi Bangoria
Add an additional argument 'arch_uprobe' to uprobe_write_opcode().
We need this in a later set of patches.

Signed-off-by: Ravi Bangoria 
---
 arch/arm/probes/uprobes/core.c | 2 +-
 arch/mips/kernel/uprobes.c | 2 +-
 include/linux/uprobes.h| 2 +-
 kernel/events/uprobes.c| 9 +
 4 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/arch/arm/probes/uprobes/core.c b/arch/arm/probes/uprobes/core.c
index d1329f1ba4e4..bf992264060e 100644
--- a/arch/arm/probes/uprobes/core.c
+++ b/arch/arm/probes/uprobes/core.c
@@ -32,7 +32,7 @@ bool is_swbp_insn(uprobe_opcode_t *insn)
 int set_swbp(struct arch_uprobe *auprobe, struct mm_struct *mm,
 unsigned long vaddr)
 {
-   return uprobe_write_opcode(mm, vaddr,
+   return uprobe_write_opcode(auprobe, mm, vaddr,
   __opcode_to_mem_arm(auprobe->bpinsn));
 }
 
diff --git a/arch/mips/kernel/uprobes.c b/arch/mips/kernel/uprobes.c
index f7a0645ccb82..4aaff3b3175c 100644
--- a/arch/mips/kernel/uprobes.c
+++ b/arch/mips/kernel/uprobes.c
@@ -224,7 +224,7 @@ unsigned long arch_uretprobe_hijack_return_addr(
 int __weak set_swbp(struct arch_uprobe *auprobe, struct mm_struct *mm,
unsigned long vaddr)
 {
-   return uprobe_write_opcode(mm, vaddr, UPROBE_SWBP_INSN);
+   return uprobe_write_opcode(auprobe, mm, vaddr, UPROBE_SWBP_INSN);
 }
 
 void arch_uprobe_copy_ixol(struct page *page, unsigned long vaddr,
diff --git a/include/linux/uprobes.h b/include/linux/uprobes.h
index 0a294e950df8..bb9d2084af03 100644
--- a/include/linux/uprobes.h
+++ b/include/linux/uprobes.h
@@ -121,7 +121,7 @@ extern bool is_swbp_insn(uprobe_opcode_t *insn);
 extern bool is_trap_insn(uprobe_opcode_t *insn);
 extern unsigned long uprobe_get_swbp_addr(struct pt_regs *regs);
 extern unsigned long uprobe_get_trap_addr(struct pt_regs *regs);
-extern int uprobe_write_opcode(struct mm_struct *mm, unsigned long vaddr, 
uprobe_opcode_t);
+extern int uprobe_write_opcode(struct arch_uprobe *auprobe, struct mm_struct 
*mm, unsigned long vaddr, uprobe_opcode_t);
 extern int uprobe_register(struct inode *inode, loff_t offset, struct 
uprobe_consumer *uc);
 extern int uprobe_apply(struct inode *inode, loff_t offset, struct 
uprobe_consumer *uc, bool);
 extern void uprobe_unregister(struct inode *inode, loff_t offset, struct 
uprobe_consumer *uc);
diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index 471eac896635..c0418ba52ba8 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -299,8 +299,8 @@ static int verify_opcode(struct page *page, unsigned long 
vaddr, uprobe_opcode_t
  * Called with mm->mmap_sem held for write.
  * Return 0 (success) or a negative errno.
  */
-int uprobe_write_opcode(struct mm_struct *mm, unsigned long vaddr,
-   uprobe_opcode_t opcode)
+int uprobe_write_opcode(struct arch_uprobe *auprobe, struct mm_struct *mm,
+   unsigned long vaddr, uprobe_opcode_t opcode)
 {
struct page *old_page, *new_page;
struct vm_area_struct *vma;
@@ -351,7 +351,7 @@ int uprobe_write_opcode(struct mm_struct *mm, unsigned long 
vaddr,
  */
 int __weak set_swbp(struct arch_uprobe *auprobe, struct mm_struct *mm, 
unsigned long vaddr)
 {
-   return uprobe_write_opcode(mm, vaddr, UPROBE_SWBP_INSN);
+   return uprobe_write_opcode(auprobe, mm, vaddr, UPROBE_SWBP_INSN);
 }
 
 /**
@@ -366,7 +366,8 @@ int __weak set_swbp(struct arch_uprobe *auprobe, struct 
mm_struct *mm, unsigned
 int __weak
 set_orig_insn(struct arch_uprobe *auprobe, struct mm_struct *mm, unsigned long 
vaddr)
 {
-   return uprobe_write_opcode(mm, vaddr, *(uprobe_opcode_t *)&auprobe->insn);
+   return uprobe_write_opcode(auprobe, mm, vaddr,
+   *(uprobe_opcode_t *)&auprobe->insn);
 }
 
 static struct uprobe *get_uprobe(struct uprobe *uprobe)
-- 
2.14.4



[PATCH v7 0/6] Uprobes: Support SDT markers having reference count (semaphore)

2018-07-30 Thread Ravi Bangoria
v7 changes:
 - Don't allow both zero and non-zero reference counter offset at
   the same time. It's painful and error prone.
 - Don't call delayed_uprobe_install() if vma->vm_mm does not have
   any breakpoint installed.

v6: https://lkml.org/lkml/2018/7/16/353


Description:
Userspace Statically Defined Tracepoints[1] are DTrace-style markers
inside userspace applications. Applications such as PostgreSQL, MySQL,
Pthread, Perl, Python, Java, Ruby, Node.js, libvirt, QEMU, glib, etc.
have these markers embedded in them. Developers add these markers
at important places in the code. Each marker expands to a single
nop instruction in the compiled code, but computing the marker
arguments may add an overhead of a couple of instructions. When that
overhead is significant, a runtime if() condition can skip it
while no one is tracing the marker:

if (reference_counter > 0) {
Execute marker instructions;
}

The default value of the reference counter is 0. A tracer has to
increment the reference counter before tracing a marker and decrement
it when done tracing.
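
The gating described above can be modelled with a small userspace C
sketch. It is an illustration only: loop2_ref_ctr, tracer_attach(),
tracer_detach(), marker_loop2() and run_loop() are hypothetical names,
and an ordinary static variable stands in for the counter that real
SDT markers keep in the probed binary.

```c
#include <assert.h>

/*
 * Hypothetical stand-in for the per-marker reference counter
 * ("semaphore"). Default value 0 means nobody is tracing.
 */
static unsigned int loop2_ref_ctr;
static int loop2_hits;		/* how often the marker body ran */

/* A tracer increments the counter before it starts tracing ... */
static void tracer_attach(void)
{
	loop2_ref_ctr++;
}

/* ... and decrements it again when it is done. */
static void tracer_detach(void)
{
	assert(loop2_ref_ctr > 0);
	loop2_ref_ctr--;
}

/* Marker guarded by the reference counter, like tick:loop2. */
static void marker_loop2(int i)
{
	if (loop2_ref_ctr > 0) {
		(void)i;	/* argument setup would happen here */
		loop2_hits++;
	}
}

/* Run n loop iterations and return how often the marker body ran. */
static int run_loop(int n)
{
	int i;

	loop2_hits = 0;
	for (i = 0; i < n; i++)
		marker_loop2(i);
	return loop2_hits;
}
```

Without a tracer attached the marker body never runs, which is the
situation a tracer ends up in when it does not update the counter.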

Currently, the perf tool has limited support for SDT markers, i.e. it
cannot trace markers guarded by a reference counter. Also, it's
not easy to add reference counter logic to a userspace tool like perf,
so the basic idea of this patchset is to add the reference counter
logic to the uprobe infrastructure. Ex,[2]

  # cat tick.c
... 
for (i = 0; i < 100; i++) {
DTRACE_PROBE1(tick, loop1, i);
if (TICK_LOOP2_ENABLED()) {
DTRACE_PROBE1(tick, loop2, i); 
}
printf("hi: %d\n", i); 
sleep(1);
}   
... 

Here tick:loop1 is a marker without a reference counter, whereas
tick:loop2 is guarded by a reference counter condition.

  # perf buildid-cache --add /tmp/tick
  # perf probe sdt_tick:loop1
  # perf probe sdt_tick:loop2

  # perf stat -e sdt_tick:loop1,sdt_tick:loop2 -- /tmp/tick
  hi: 0
  hi: 1
  hi: 2
  ^C
  Performance counter stats for '/tmp/tick':
 3  sdt_tick:loop1
 0  sdt_tick:loop2
 2.747086086 seconds time elapsed

Perf failed to record data for tick:loop2. Same experiment with this
patch series:

  # ./perf buildid-cache --add /tmp/tick
  # ./perf probe sdt_tick:loop2
  # ./perf stat -e sdt_tick:loop2 /tmp/tick
hi: 0
hi: 1
hi: 2
^C  
 Performance counter stats for '/tmp/tick':
 3  sdt_tick:loop2
   2.561851452 seconds time elapsed


Ravi Bangoria (6):
  Uprobes: Simplify uprobe_register() body
  Uprobe: Additional argument arch_uprobe to uprobe_write_opcode()
  Uprobes: Support SDT markers having reference count (semaphore)
  Uprobes/sdt: Prevent multiple reference counter for same uprobe
  trace_uprobe/sdt: Prevent multiple reference counter for same uprobe
  perf probe: Support SDT markers having reference counter (semaphore)

 arch/arm/probes/uprobes/core.c |   2 +-
 arch/mips/kernel/uprobes.c |   2 +-
 include/linux/uprobes.h|   7 +-
 kernel/events/uprobes.c| 315 +++--
 kernel/trace/trace.c   |   2 +-
 kernel/trace/trace_uprobe.c|  75 +-
 tools/perf/util/probe-event.c  |  39 -
 tools/perf/util/probe-event.h  |   1 +
 tools/perf/util/probe-file.c   |  34 -
 tools/perf/util/probe-file.h   |   1 +
 tools/perf/util/symbol-elf.c   |  46 --
 tools/perf/util/symbol.h   |   7 +
 12 files changed, 459 insertions(+), 72 deletions(-)

-- 
2.14.4



[PATCH v7 3/6] Uprobes: Support SDT markers having reference count (semaphore)

2018-07-30 Thread Ravi Bangoria
Userspace Statically Defined Tracepoints[1] are DTrace-style markers
inside userspace applications. Applications such as PostgreSQL, MySQL,
Pthread, Perl, Python, Java, Ruby, Node.js, libvirt, QEMU, glib, etc.
have these markers embedded in them. Developers add these markers
at important places in the code. Each marker expands to a single
nop instruction in the compiled code, but computing the marker
arguments may add an overhead of a couple of instructions. When that
overhead is significant, a runtime if() condition can skip it
while no one is tracing the marker:

if (reference_counter > 0) {
Execute marker instructions;
}

The default value of the reference counter is 0. A tracer has to
increment the reference counter before tracing a marker and decrement
it when done tracing.

Implement the reference counter logic in core uprobe. Users will be
able to use it from trace_uprobe as well as from a kernel module. The
new trace_uprobe definition with a reference counter will now be:

<path>:<offset>[(ref_ctr_offset)]

where ref_ctr_offset is an optional field. For kernel modules, a new
variant of uprobe_register() has been introduced:

uprobe_register_refctr(inode, offset, ref_ctr_offset, consumer)

There is no new variant of uprobe_unregister() because a uprobe is
assumed to have only one reference counter.
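
To make the definition syntax concrete, here is a minimal userspace
sketch of splitting such a string into its parts. It is not the
trace_uprobe parser: struct uprobe_def and parse_uprobe_def() are
hypothetical names, and the sketch assumes the offsets are written in
a form strtoul() accepts with base 0 (e.g. 0x6e4).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for a parsed definition. */
struct uprobe_def {
	char path[256];
	unsigned long offset;
	unsigned long ref_ctr_offset;	/* 0 if not given */
};

/* Parse "path:offset[(ref_ctr_offset)]". Return 0 on success, -1 on error. */
static int parse_uprobe_def(const char *def, struct uprobe_def *out)
{
	const char *colon, *start;
	char *end;
	size_t len;

	assert(def != NULL && out != NULL);

	/* The last ':' separates the path from the offset. */
	colon = strrchr(def, ':');
	if (!colon || colon == def)
		return -1;

	len = (size_t)(colon - def);
	if (len >= sizeof(out->path))
		return -1;
	memcpy(out->path, def, len);
	out->path[len] = '\0';

	start = colon + 1;
	out->offset = strtoul(start, &end, 0);
	if (end == start)
		return -1;	/* no digits after ':' */

	/* Optional "(ref_ctr_offset)" suffix. */
	out->ref_ctr_offset = 0;
	if (*end == '(') {
		start = end + 1;
		out->ref_ctr_offset = strtoul(start, &end, 0);
		if (end == start || *end != ')')
			return -1;
		end++;
	}

	return *end == '\0' ? 0 : -1;
}
```

For example, "/home/ravi/tick:0x6e4(0x10036)" yields offset 0x6e4 and
ref_ctr_offset 0x10036, while "/home/ravi/tick:0x6e4" leaves
ref_ctr_offset at 0.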

[1] https://sourceware.org/systemtap/wiki/UserSpaceProbeImplementation

Note: the 'reference counter' is called a 'semaphore' in the original
DTrace (as well as SystemTap, bcc and even ELF) documentation and code.
But the term 'semaphore' is misleading in this context: it is just a
counter holding the number of tracers tracing a marker, and it is not
used for any synchronization. So we refer to it as a 'reference
counter' in kernel / perf code.

Signed-off-by: Ravi Bangoria 
Reviewed-by: Masami Hiramatsu 
[Only trace_uprobe.c]
---
 include/linux/uprobes.h |   5 +
 kernel/events/uprobes.c | 232 ++--
 kernel/trace/trace.c|   2 +-
 kernel/trace/trace_uprobe.c |  38 +++-
 4 files changed, 267 insertions(+), 10 deletions(-)

diff --git a/include/linux/uprobes.h b/include/linux/uprobes.h
index bb9d2084af03..103a48a48872 100644
--- a/include/linux/uprobes.h
+++ b/include/linux/uprobes.h
@@ -123,6 +123,7 @@ extern unsigned long uprobe_get_swbp_addr(struct pt_regs 
*regs);
 extern unsigned long uprobe_get_trap_addr(struct pt_regs *regs);
 extern int uprobe_write_opcode(struct arch_uprobe *auprobe, struct mm_struct 
*mm, unsigned long vaddr, uprobe_opcode_t);
 extern int uprobe_register(struct inode *inode, loff_t offset, struct 
uprobe_consumer *uc);
+extern int uprobe_register_refctr(struct inode *inode, loff_t offset, loff_t 
ref_ctr_offset, struct uprobe_consumer *uc);
 extern int uprobe_apply(struct inode *inode, loff_t offset, struct 
uprobe_consumer *uc, bool);
 extern void uprobe_unregister(struct inode *inode, loff_t offset, struct 
uprobe_consumer *uc);
 extern int uprobe_mmap(struct vm_area_struct *vma);
@@ -160,6 +161,10 @@ uprobe_register(struct inode *inode, loff_t offset, struct 
uprobe_consumer *uc)
 {
return -ENOSYS;
 }
+static inline int uprobe_register_refctr(struct inode *inode, loff_t offset, 
loff_t ref_ctr_offset, struct uprobe_consumer *uc)
+{
+   return -ENOSYS;
+}
 static inline int
 uprobe_apply(struct inode *inode, loff_t offset, struct uprobe_consumer *uc, 
bool add)
 {
diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index c0418ba52ba8..ad92fed11526 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -73,6 +73,7 @@ struct uprobe {
struct uprobe_consumer  *consumers;
struct inode*inode; /* Also hold a ref to inode */
loff_t  offset;
+   loff_t  ref_ctr_offset;
unsigned long   flags;
 
/*
@@ -88,6 +89,15 @@ struct uprobe {
struct arch_uprobe  arch;
 };
 
+struct delayed_uprobe {
+   struct list_head list;
+   struct uprobe *uprobe;
+   struct mm_struct *mm;
+};
+
+static DEFINE_MUTEX(delayed_uprobe_lock);
+static LIST_HEAD(delayed_uprobe_list);
+
 /*
  * Execute out of line area: anonymous executable mapping installed
  * by the probed task to execute the copy of the original instruction
@@ -282,6 +292,154 @@ static int verify_opcode(struct page *page, unsigned long 
vaddr, uprobe_opcode_t
return 1;
 }
 
+static struct delayed_uprobe *
+delayed_uprobe_check(struct uprobe *uprobe, struct mm_struct *mm)
+{
+   struct delayed_uprobe *du;
+
+   list_for_each_entry(du, &delayed_uprobe_list, list)
+   if (du->uprobe == uprobe && du->mm == mm)
+   return du;
+   return NULL;
+}
+
+static int delayed_uprobe_add(struct uprobe *uprobe, struct mm_struct *mm)
+{
+   struct delayed_uprobe *du;
+
+   if (delayed_uprobe_check(uprobe, mm))
+   return 0;
+

[PATCH v7 4/6] Uprobes/sdt: Prevent multiple reference counter for same uprobe

2018-07-30 Thread Ravi Bangoria
We assume there is only one reference counter per uprobe.
Don't allow the user to register multiple uprobes with the same
inode+offset but different reference counters.

Signed-off-by: Ravi Bangoria 
---
 kernel/events/uprobes.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index ad92fed11526..c27546929ae7 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -679,6 +679,12 @@ static struct uprobe *alloc_uprobe(struct inode *inode, 
loff_t offset,
cur_uprobe = insert_uprobe(uprobe);
/* a uprobe exists for this inode:offset combination */
if (cur_uprobe) {
+   if (cur_uprobe->ref_ctr_offset != uprobe->ref_ctr_offset) {
+   pr_warn("Reference counter offset mismatch.\n");
+   put_uprobe(cur_uprobe);
+   kfree(uprobe);
+   return ERR_PTR(-EINVAL);
+   }
kfree(uprobe);
uprobe = cur_uprobe;
}
@@ -1093,6 +1099,9 @@ static int __uprobe_register(struct inode *inode, loff_t 
offset,
uprobe = alloc_uprobe(inode, offset, ref_ctr_offset);
if (!uprobe)
return -ENOMEM;
+   if (IS_ERR(uprobe))
+   return PTR_ERR(uprobe);
+
/*
 * We can race with uprobe_unregister()->delete_uprobe().
 * Check uprobe_is_active() and retry if it is false.
-- 
2.14.4



[PATCH v7 6/6] perf probe: Support SDT markers having reference counter (semaphore)

2018-07-30 Thread Ravi Bangoria
With this, perf buildid-cache saves SDT markers that have a reference
counter in the probe cache, and perf probe is able to probe markers
that have a reference counter. Ex,

  # readelf -n /tmp/tick | grep -A1 loop2
Name: loop2
... Semaphore: 0x10020036

  # ./perf buildid-cache --add /tmp/tick
  # ./perf probe sdt_tick:loop2
  # ./perf stat -e sdt_tick:loop2 /tmp/tick
hi: 0
hi: 1
hi: 2
^C
 Performance counter stats for '/tmp/tick':
 3  sdt_tick:loop2
   2.561851452 seconds time elapsed

Signed-off-by: Ravi Bangoria 
Acked-by: Masami Hiramatsu 
Acked-by: Srikar Dronamraju 
---
 tools/perf/util/probe-event.c | 39 
 tools/perf/util/probe-event.h |  1 +
 tools/perf/util/probe-file.c  | 34 ++--
 tools/perf/util/probe-file.h  |  1 +
 tools/perf/util/symbol-elf.c  | 46 ---
 tools/perf/util/symbol.h  |  7 +++
 6 files changed, 106 insertions(+), 22 deletions(-)

diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index f119eb628dbb..e86f8be89157 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -1819,6 +1819,12 @@ int parse_probe_trace_command(const char *cmd, struct 
probe_trace_event *tev)
tp->offset = strtoul(fmt2_str, NULL, 10);
}
 
+   if (tev->uprobes) {
+   fmt2_str = strchr(p, '(');
+   if (fmt2_str)
+   tp->ref_ctr_offset = strtoul(fmt2_str + 1, NULL, 0);
+   }
+
tev->nargs = argc - 2;
tev->args = zalloc(sizeof(struct probe_trace_arg) * tev->nargs);
if (tev->args == NULL) {
@@ -2012,6 +2018,22 @@ static int synthesize_probe_trace_arg(struct 
probe_trace_arg *arg,
return err;
 }
 
+static int
+synthesize_uprobe_trace_def(struct probe_trace_event *tev, struct strbuf *buf)
+{
+   struct probe_trace_point *tp = &tev->point;
+   int err;
+
+   err = strbuf_addf(buf, "%s:0x%lx", tp->module, tp->address);
+
+   if (err >= 0 && tp->ref_ctr_offset) {
+   if (!uprobe_ref_ctr_is_supported())
+   return -1;
+   err = strbuf_addf(buf, "(0x%lx)", tp->ref_ctr_offset);
+   }
+   return err >= 0 ? 0 : -1;
+}
+
 char *synthesize_probe_trace_command(struct probe_trace_event *tev)
 {
struct probe_trace_point *tp = &tev->point;
@@ -2041,15 +2063,17 @@ char *synthesize_probe_trace_command(struct 
probe_trace_event *tev)
}
 
/* Use the tp->address for uprobes */
-   if (tev->uprobes)
-   err = strbuf_addf(&buf, "%s:0x%lx", tp->module, tp->address);
-   else if (!strncmp(tp->symbol, "0x", 2))
+   if (tev->uprobes) {
+   err = synthesize_uprobe_trace_def(tev, &buf);
+   } else if (!strncmp(tp->symbol, "0x", 2)) {
/* Absolute address. See try_to_find_absolute_address() */
err = strbuf_addf(&buf, "%s%s0x%lx", tp->module ?: "",
  tp->module ? ":" : "", tp->address);
-   else
+   } else {
err = strbuf_addf(&buf, "%s%s%s+%lu", tp->module ?: "",
tp->module ? ":" : "", tp->symbol, tp->offset);
+   }
+
if (err)
goto error;
 
@@ -2633,6 +2657,13 @@ static void warn_uprobe_event_compat(struct 
probe_trace_event *tev)
 {
int i;
char *buf = synthesize_probe_trace_command(tev);
+   struct probe_trace_point *tp = &tev->point;
+
+   if (tp->ref_ctr_offset && !uprobe_ref_ctr_is_supported()) {
+   pr_warning("A semaphore is associated with %s:%s and "
+  "seems your kernel doesn't support it.\n",
+  tev->group, tev->event);
+   }
 
/* Old uprobe event doesn't support memory dereference */
if (!tev->uprobes || tev->nargs == 0 || !buf)
diff --git a/tools/perf/util/probe-event.h b/tools/perf/util/probe-event.h
index 45b14f020558..15a98c3a2a2f 100644
--- a/tools/perf/util/probe-event.h
+++ b/tools/perf/util/probe-event.h
@@ -27,6 +27,7 @@ struct probe_trace_point {
char*symbol;/* Base symbol */
char*module;/* Module name */
unsigned long   offset; /* Offset from symbol */
+   unsigned long   ref_ctr_offset; /* SDT reference counter offset */
unsigned long   address;/* Actual address of the trace point */
boolretprobe;   /* Return probe flag */
 };
diff --git a/tools/perf/util/probe-file.c b/tools/perf/util/probe-file.c
index b76088fadf3d..aac7817d9e14 100644
--- a/tools/perf/util/probe-file.c
+++ b/tools/perf/util/probe-file.c
@@ -696,8 +696,16 @@ int probe_cache__add_entry(struct probe_cache *pcache,
 #ifdef HAVE_GELF_GETNOTE_SUPPORT
 static unsigned long long sdt_note__get_addr(struct sdt_note *note)
 {
-   return note->bit32 ? (unsigned long 

[PATCH v7 4/6] Uprobes/sdt: Prevent multiple reference counter for same uprobe

2018-07-30 Thread Ravi Bangoria
We assume to have only one reference counter for one uprobe.
Don't allow user to register multiple uprobes having same
inode+offset but different reference counter.

Signed-off-by: Ravi Bangoria 
---
 kernel/events/uprobes.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index ad92fed11526..c27546929ae7 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -679,6 +679,12 @@ static struct uprobe *alloc_uprobe(struct inode *inode, 
loff_t offset,
cur_uprobe = insert_uprobe(uprobe);
/* a uprobe exists for this inode:offset combination */
if (cur_uprobe) {
+   if (cur_uprobe->ref_ctr_offset != uprobe->ref_ctr_offset) {
+   pr_warn("Reference counter offset mismatch.\n");
+   put_uprobe(cur_uprobe);
+   kfree(uprobe);
+   return ERR_PTR(-EINVAL);
+   }
kfree(uprobe);
uprobe = cur_uprobe;
}
@@ -1093,6 +1099,9 @@ static int __uprobe_register(struct inode *inode, loff_t 
offset,
uprobe = alloc_uprobe(inode, offset, ref_ctr_offset);
if (!uprobe)
return -ENOMEM;
+   if (IS_ERR(uprobe))
+   return PTR_ERR(uprobe);
+
/*
 * We can race with uprobe_unregister()->delete_uprobe().
 * Check uprobe_is_active() and retry if it is false.
-- 
2.14.4
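
For illustration only, the rule this patch enforces can be sketched in
plain userspace C. Everything below is hypothetical (the probe table,
probe_register() and the integer stand-in for the inode are not kernel
APIs); it only mirrors the idea that a second registration for the same
inode+offset must carry the same reference counter offset:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a registered uprobe. */
struct probe {
	unsigned long inode;		/* stand-in for struct inode * */
	long long offset;
	long long ref_ctr_offset;
	int in_use;
};

#define MAX_PROBES 16
static struct probe probes[MAX_PROBES];

/*
 * Returns 0 when the probe is registered (or already present with the
 * same reference counter offset), -EINVAL on a reference counter
 * mismatch, and -ENOMEM when the table is full.
 */
int probe_register(unsigned long inode, long long offset,
		   long long ref_ctr_offset)
{
	struct probe *free_slot = NULL;

	for (size_t i = 0; i < MAX_PROBES; i++) {
		struct probe *p = &probes[i];

		if (!p->in_use) {
			if (!free_slot)
				free_slot = p;
			continue;
		}
		/* A probe already exists for this inode:offset pair. */
		if (p->inode == inode && p->offset == offset)
			return p->ref_ctr_offset == ref_ctr_offset ? 0 : -EINVAL;
	}
	if (!free_slot)
		return -ENOMEM;

	*free_slot = (struct probe){ inode, offset, ref_ctr_offset, 1 };
	return 0;
}
```

Registering the same inode+offset twice with an identical counter offset
succeeds both times; changing only the counter offset on the second call
yields -EINVAL, which is the behaviour __uprobe_register() propagates
through PTR_ERR() in the patch.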



[PATCH v7 6/6] perf probe: Support SDT markers having reference counter (semaphore)

2018-07-30 Thread Ravi Bangoria
With this, perf buildid-cache will save SDT markers that have a
reference counter in the probe cache, and perf probe will be able to
probe such markers. For example:

  # readelf -n /tmp/tick | grep -A1 loop2
Name: loop2
... Semaphore: 0x10020036

  # ./perf buildid-cache --add /tmp/tick
  # ./perf probe sdt_tick:loop2
  # ./perf stat -e sdt_tick:loop2 /tmp/tick
hi: 0
hi: 1
hi: 2
^C
 Performance counter stats for '/tmp/tick':
 3  sdt_tick:loop2
   2.561851452 seconds time elapsed

Signed-off-by: Ravi Bangoria 
Acked-by: Masami Hiramatsu 
Acked-by: Srikar Dronamraju 
---
 tools/perf/util/probe-event.c | 39 
 tools/perf/util/probe-event.h |  1 +
 tools/perf/util/probe-file.c  | 34 ++--
 tools/perf/util/probe-file.h  |  1 +
 tools/perf/util/symbol-elf.c  | 46 ---
 tools/perf/util/symbol.h  |  7 +++
 6 files changed, 106 insertions(+), 22 deletions(-)

diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index f119eb628dbb..e86f8be89157 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -1819,6 +1819,12 @@ int parse_probe_trace_command(const char *cmd, struct 
probe_trace_event *tev)
tp->offset = strtoul(fmt2_str, NULL, 10);
}
 
+   if (tev->uprobes) {
+   fmt2_str = strchr(p, '(');
+   if (fmt2_str)
+   tp->ref_ctr_offset = strtoul(fmt2_str + 1, NULL, 0);
+   }
+
tev->nargs = argc - 2;
tev->args = zalloc(sizeof(struct probe_trace_arg) * tev->nargs);
if (tev->args == NULL) {
@@ -2012,6 +2018,22 @@ static int synthesize_probe_trace_arg(struct 
probe_trace_arg *arg,
return err;
 }
 
+static int
+synthesize_uprobe_trace_def(struct probe_trace_event *tev, struct strbuf *buf)
+{
+   struct probe_trace_point *tp = &tev->point;
+   int err;
+
+   err = strbuf_addf(buf, "%s:0x%lx", tp->module, tp->address);
+
+   if (err >= 0 && tp->ref_ctr_offset) {
+   if (!uprobe_ref_ctr_is_supported())
+   return -1;
+   err = strbuf_addf(buf, "(0x%lx)", tp->ref_ctr_offset);
+   }
+   return err >= 0 ? 0 : -1;
+}
+
 char *synthesize_probe_trace_command(struct probe_trace_event *tev)
 {
struct probe_trace_point *tp = &tev->point;
@@ -2041,15 +2063,17 @@ char *synthesize_probe_trace_command(struct 
probe_trace_event *tev)
}
 
/* Use the tp->address for uprobes */
-   if (tev->uprobes)
-   err = strbuf_addf(&buf, "%s:0x%lx", tp->module, tp->address);
-   else if (!strncmp(tp->symbol, "0x", 2))
+   if (tev->uprobes) {
+   err = synthesize_uprobe_trace_def(tev, &buf);
+   } else if (!strncmp(tp->symbol, "0x", 2)) {
/* Absolute address. See try_to_find_absolute_address() */
err = strbuf_addf(&buf, "%s%s0x%lx", tp->module ?: "",
  tp->module ? ":" : "", tp->address);
-   else
+   } else {
err = strbuf_addf(&buf, "%s%s%s+%lu", tp->module ?: "",
tp->module ? ":" : "", tp->symbol, tp->offset);
+   }
+
if (err)
goto error;
 
@@ -2633,6 +2657,13 @@ static void warn_uprobe_event_compat(struct 
probe_trace_event *tev)
 {
int i;
char *buf = synthesize_probe_trace_command(tev);
+   struct probe_trace_point *tp = &tev->point;
+
+   if (tp->ref_ctr_offset && !uprobe_ref_ctr_is_supported()) {
+   pr_warning("A semaphore is associated with %s:%s and "
+  "seems your kernel doesn't support it.\n",
+  tev->group, tev->event);
+   }
 
/* Old uprobe event doesn't support memory dereference */
if (!tev->uprobes || tev->nargs == 0 || !buf)
diff --git a/tools/perf/util/probe-event.h b/tools/perf/util/probe-event.h
index 45b14f020558..15a98c3a2a2f 100644
--- a/tools/perf/util/probe-event.h
+++ b/tools/perf/util/probe-event.h
@@ -27,6 +27,7 @@ struct probe_trace_point {
char*symbol;/* Base symbol */
char*module;/* Module name */
unsigned long   offset; /* Offset from symbol */
+   unsigned long   ref_ctr_offset; /* SDT reference counter offset */
unsigned long   address;/* Actual address of the trace point */
boolretprobe;   /* Return probe flag */
 };
diff --git a/tools/perf/util/probe-file.c b/tools/perf/util/probe-file.c
index b76088fadf3d..aac7817d9e14 100644
--- a/tools/perf/util/probe-file.c
+++ b/tools/perf/util/probe-file.c
@@ -696,8 +696,16 @@ int probe_cache__add_entry(struct probe_cache *pcache,
 #ifdef HAVE_GELF_GETNOTE_SUPPORT
 static unsigned long long sdt_note__get_addr(struct sdt_note *note)
 {
-   return note->bit32 ? (unsigned long 

[PATCH v7 3/6] Uprobes: Support SDT markers having reference count (semaphore)

2018-07-30 Thread Ravi Bangoria
Userspace Statically Defined Tracepoints[1] are dtrace style markers
inside userspace applications. Applications like PostgreSQL, MySQL,
Pthread, Perl, Python, Java, Ruby, Node.js, libvirt, QEMU, glib etc
have these markers embedded in them. These markers are added by
developers at important places in the code. Each marker source expands
to a single nop instruction in the compiled code, but there may be
additional overhead for computing the marker arguments, which expands to
a couple of instructions. If that overhead is significant, its execution
can be skipped by a runtime if() condition when no one is tracing the
marker:

if (reference_counter > 0) {
Execute marker instructions;
}

The default value of the reference counter is 0. A tracer has to
increment the reference counter before tracing a marker and decrement
it when done tracing.
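
As a rough userspace illustration of that protocol (not the kernel
implementation), the sketch below gates a marker on a counter. The names
loop2_ref_ctr, tracer_attach(), tracer_detach() and run_loop() are
hypothetical; in a real SDT binary the counter lives in the ELF image
and the tracer updates it from outside the process:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical reference counter ("semaphore") for one marker; starts at 0. */
static unsigned short loop2_ref_ctr;

/* Tracer side: take and drop a reference around a tracing session. */
void tracer_attach(void)
{
	loop2_ref_ctr++;
}

void tracer_detach(void)
{
	assert(loop2_ref_ctr > 0);
	loop2_ref_ctr--;
}

/* Application side: run the marker only while someone is tracing. */
int run_loop(int iterations)
{
	int fired = 0;

	for (int i = 0; i < iterations; i++) {
		if (loop2_ref_ctr > 0) {
			/* Stand-in for argument setup plus the marker nop. */
			printf("loop2 marker: i=%d\n", i);
			fired++;
		}
	}
	return fired;
}
```

With no tracer attached, run_loop() never enters the marker body; after
tracer_attach() it fires on every iteration until tracer_detach() drops
the count back to 0.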

Implement the reference counter logic in core uprobe. Users will be
able to use it from trace_uprobe as well as from a kernel module. The
new trace_uprobe definition with a reference counter will now be:

<path>:<offset>[(ref_ctr_offset)]

where ref_ctr_offset is an optional field. For kernel modules, a new
variant of uprobe_register() has been introduced:

uprobe_register_refctr(inode, offset, ref_ctr_offset, consumer)

There is no new variant of uprobe_unregister() because a uprobe is
assumed to have only one reference counter.
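
A minimal sketch of how a definition with the optional "(ref_ctr_offset)"
suffix could be parsed in userspace C follows. It assumes the string has
the shape path:offset or path:offset(ref_ctr_offset); struct uprobe_spec
and parse_uprobe_spec() are made-up names, not the trace_uprobe parser:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical parsed form of "path:offset[(ref_ctr_offset)]". */
struct uprobe_spec {
	char path[256];
	unsigned long offset;
	unsigned long ref_ctr_offset;	/* 0 when the field is absent */
};

/* Returns 0 on success, -1 on malformed input. */
int parse_uprobe_spec(const char *s, struct uprobe_spec *out)
{
	const char *colon = strrchr(s, ':');
	char *end;
	size_t len;

	if (!colon || colon == s)
		return -1;
	len = (size_t)(colon - s);
	if (len >= sizeof(out->path))
		return -1;
	memcpy(out->path, s, len);
	out->path[len] = '\0';

	/* Base 0 accepts both decimal and 0x-prefixed offsets. */
	out->offset = strtoul(colon + 1, &end, 0);
	if (end == colon + 1)
		return -1;

	out->ref_ctr_offset = 0;
	if (*end == '(') {
		const char *start = end + 1;

		out->ref_ctr_offset = strtoul(start, &end, 0);
		if (end == start || *end != ')')
			return -1;
		end++;
	}
	/* Anything left after the optional suffix is an error. */
	return *end == '\0' ? 0 : -1;
}
```

The optional field defaults to 0, matching the convention in this series
that a zero ref_ctr_offset means "no reference counter".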

[1] https://sourceware.org/systemtap/wiki/UserSpaceProbeImplementation

Note: the 'reference counter' is called a 'semaphore' in the original
Dtrace (or Systemtap, bcc and even ELF) documentation and code. But the
term 'semaphore' is misleading in this context. It is just a counter
that holds the number of tracers tracing a marker. It is not really
used for any synchronization. So we refer to it as a 'reference
counter' in kernel / perf code.

Signed-off-by: Ravi Bangoria 
Reviewed-by: Masami Hiramatsu 
[Only trace_uprobe.c]
---
 include/linux/uprobes.h |   5 +
 kernel/events/uprobes.c | 232 ++--
 kernel/trace/trace.c|   2 +-
 kernel/trace/trace_uprobe.c |  38 +++-
 4 files changed, 267 insertions(+), 10 deletions(-)

diff --git a/include/linux/uprobes.h b/include/linux/uprobes.h
index bb9d2084af03..103a48a48872 100644
--- a/include/linux/uprobes.h
+++ b/include/linux/uprobes.h
@@ -123,6 +123,7 @@ extern unsigned long uprobe_get_swbp_addr(struct pt_regs 
*regs);
 extern unsigned long uprobe_get_trap_addr(struct pt_regs *regs);
 extern int uprobe_write_opcode(struct arch_uprobe *auprobe, struct mm_struct 
*mm, unsigned long vaddr, uprobe_opcode_t);
 extern int uprobe_register(struct inode *inode, loff_t offset, struct 
uprobe_consumer *uc);
+extern int uprobe_register_refctr(struct inode *inode, loff_t offset, loff_t 
ref_ctr_offset, struct uprobe_consumer *uc);
 extern int uprobe_apply(struct inode *inode, loff_t offset, struct 
uprobe_consumer *uc, bool);
 extern void uprobe_unregister(struct inode *inode, loff_t offset, struct 
uprobe_consumer *uc);
 extern int uprobe_mmap(struct vm_area_struct *vma);
@@ -160,6 +161,10 @@ uprobe_register(struct inode *inode, loff_t offset, struct 
uprobe_consumer *uc)
 {
return -ENOSYS;
 }
+static inline int uprobe_register_refctr(struct inode *inode, loff_t offset, 
loff_t ref_ctr_offset, struct uprobe_consumer *uc)
+{
+   return -ENOSYS;
+}
 static inline int
 uprobe_apply(struct inode *inode, loff_t offset, struct uprobe_consumer *uc, 
bool add)
 {
diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index c0418ba52ba8..ad92fed11526 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -73,6 +73,7 @@ struct uprobe {
struct uprobe_consumer  *consumers;
struct inode*inode; /* Also hold a ref to inode */
loff_t  offset;
+   loff_t  ref_ctr_offset;
unsigned long   flags;
 
/*
@@ -88,6 +89,15 @@ struct uprobe {
struct arch_uprobe  arch;
 };
 
+struct delayed_uprobe {
+   struct list_head list;
+   struct uprobe *uprobe;
+   struct mm_struct *mm;
+};
+
+static DEFINE_MUTEX(delayed_uprobe_lock);
+static LIST_HEAD(delayed_uprobe_list);
+
 /*
  * Execute out of line area: anonymous executable mapping installed
  * by the probed task to execute the copy of the original instruction
@@ -282,6 +292,154 @@ static int verify_opcode(struct page *page, unsigned long 
vaddr, uprobe_opcode_t
return 1;
 }
 
+static struct delayed_uprobe *
+delayed_uprobe_check(struct uprobe *uprobe, struct mm_struct *mm)
+{
+   struct delayed_uprobe *du;
+
+   list_for_each_entry(du, &delayed_uprobe_list, list)
+   if (du->uprobe == uprobe && du->mm == mm)
+   return du;
+   return NULL;
+}
+
+static int delayed_uprobe_add(struct uprobe *uprobe, struct mm_struct *mm)
+{
+   struct delayed_uprobe *du;
+
+   if (delayed_uprobe_check(uprobe, mm))
+   return 0;
+

Re: [PATCH 1/2] leds: core: Introduce LED pattern trigger

2018-07-30 Thread Bjorn Andersson
On Mon 30 Jul 05:29 PDT 2018, Baolin Wang wrote:

> Some LED controllers have support for autonomously controlling
> brightness over time, according to some preprogrammed pattern or
> function.
> 
> This patch adds pattern trigger that LED device can configure the
> pattern and trigger it.
> 
> Signed-off-by: Raphael Teysseyre 
> Signed-off-by: Baolin Wang 
> ---
>  .../ABI/testing/sysfs-class-led-trigger-pattern|   21 ++
>  drivers/leds/trigger/Kconfig   |   10 +
>  drivers/leds/trigger/Makefile  |1 +
>  drivers/leds/trigger/ledtrig-pattern.c |  349 
> 
>  include/linux/leds.h   |   19 ++
>  5 files changed, 400 insertions(+)
>  create mode 100644 Documentation/ABI/testing/sysfs-class-led-trigger-pattern
>  create mode 100644 drivers/leds/trigger/ledtrig-pattern.c
> 
> diff --git a/Documentation/ABI/testing/sysfs-class-led-trigger-pattern 
> b/Documentation/ABI/testing/sysfs-class-led-trigger-pattern
> new file mode 100644
> index 000..c52da34
> --- /dev/null
> +++ b/Documentation/ABI/testing/sysfs-class-led-trigger-pattern
> @@ -0,0 +1,21 @@
> +What:		/sys/class/leds/<led>/pattern
> +Date:August 2018
> +KernelVersion:   4.19
> +Description:
> + Specify a pattern for the LED, for LED hardware that support
> + altering the brightness as a function of time.
> +
> + The pattern is given by a series of tuples, of brightness and
> + duration (ms). The LED is expected to traverse the series and
> + each brightness value for the specified duration. Duration of
> + 0 means brightness should immediately change to new value.
> +
> + The format of the pattern values should be:
> + "brightness_1 duration_1, brightness_2 duration_2, brightness_3
> + duration_3, ...".
> +
> +What:		/sys/class/leds/<led>/repeat
> +Date:August 2018
> +KernelVersion:   4.19
> +Description:
> + Specify a pattern repeat number. 0 means repeat indefinitely.

If 0 means infinity, does 1 mean "repeat 1 time"? If so, how would I
specify that I want the pattern to run one time (i.e. 0 repetitions)?
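
To make the quoted tuple format ("brightness duration, brightness
duration, ...") concrete, here is a small userspace sketch of a parser.
It is not the code from the patch; struct led_step and
parse_led_pattern() are hypothetical names, and it assumes tuples are
separated by commas with optional surrounding whitespace:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical in-memory form of one "brightness duration" tuple. */
struct led_step {
	unsigned int brightness;
	unsigned int duration_ms;
};

/*
 * Returns the number of tuples parsed, or -1 on malformed input or
 * when more than `max` tuples are present.
 */
int parse_led_pattern(const char *s, struct led_step *steps, size_t max)
{
	size_t n = 0;
	int used;

	while (*s) {
		if (n == max)
			return -1;
		used = 0;
		/* %n records how many characters this tuple consumed. */
		if (sscanf(s, " %u %u %n", &steps[n].brightness,
			   &steps[n].duration_ms, &used) != 2 || used == 0)
			return -1;
		s += used;
		n++;
		if (*s == ',')
			s++;
		else if (*s != '\0')
			return -1;
	}
	return (int)n;
}
```

For example, "0 1000, 255 500, 0 0" would parse into three steps, the
last one having a duration of 0, which the quoted description defines
as an immediate change to the new brightness.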

> diff --git a/drivers/leds/trigger/Kconfig b/drivers/leds/trigger/Kconfig
> index a2559b4..a03afcd 100644
> --- a/drivers/leds/trigger/Kconfig
> +++ b/drivers/leds/trigger/Kconfig
> @@ -125,6 +125,16 @@ config LEDS_TRIGGER_CAMERA
> This enables direct flash/torch on/off by the driver, kernel space.
> If unsure, say Y.
>  
> +config LEDS_TRIGGER_PATTERN
> +   tristate "LED Pattern Trigger"
> +   depends on LEDS_TRIGGERS
> +   help
> + This allows LEDs blinking with an arbitrary pattern. Can be useful
> + on embedded systems with no screen to give out a status code to
> + a human.

While the pattern mechanism could be used to communicate some message,
the use cases we've seen so far are all about enabling hardware to pulse
LEDs instead of blinking them...

> +
> + If unsure, say N
> +
>  config LEDS_TRIGGER_PANIC
>   bool "LED Panic Trigger"
>   depends on LEDS_TRIGGERS
> diff --git a/drivers/leds/trigger/Makefile b/drivers/leds/trigger/Makefile
> index f3cfe19..c5d180e 100644
> --- a/drivers/leds/trigger/Makefile
> +++ b/drivers/leds/trigger/Makefile
> @@ -13,3 +13,4 @@ obj-$(CONFIG_LEDS_TRIGGER_TRANSIENT)+= 
> ledtrig-transient.o
>  obj-$(CONFIG_LEDS_TRIGGER_CAMERA)+= ledtrig-camera.o
>  obj-$(CONFIG_LEDS_TRIGGER_PANIC) += ledtrig-panic.o
>  obj-$(CONFIG_LEDS_TRIGGER_NETDEV)+= ledtrig-netdev.o
> +obj-$(CONFIG_LEDS_TRIGGER_PATTERN) += ledtrig-pattern.o
> diff --git a/drivers/leds/trigger/ledtrig-pattern.c 
> b/drivers/leds/trigger/ledtrig-pattern.c
> new file mode 100644
> index 000..b709aa1
> --- /dev/null
> +++ b/drivers/leds/trigger/ledtrig-pattern.c
> @@ -0,0 +1,349 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +/*
> + * LED pattern trigger
> + *
> + * Idea discussed with Pavel Machek. Raphael Teysseyre implemented
> + * the first version, Baolin Wang simplified and improved the approach.

Might be a coincidence, but parts of this patch look pretty close to
mine...

> + */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +/*
> + * The "pattern" attribute contains at most PAGE_SIZE characters. Each line
> + * in this attribute is at least 4 characters long (a 1-digit number for the
> + * led brightness, a space, a 1-digit number for the time, a semi-colon).
> + * Therefore, there is at most PAGE_SIZE/4 patterns.
> + */

The brightness is a number between 0 and LED_FULL (or max_brightness)
and delta_t is measured in ms. So neither of these is likely to be a
single digit very often.

> +#define MAX_PATTERNS (PAGE_SIZE / 4)
> +#define PATTERN_SEPARATOR","
> +
> +struct pattern_trig_data {
> + struct led_classdev 

Re: [PATCH 06/11] sched/irq: add irq utilization tracking

2018-07-30 Thread Wanpeng Li
On Tue, 31 Jul 2018 at 00:43, Vincent Guittot
 wrote:
>
> Hi Wanpeng,
>
> On Thu, 26 Jul 2018 at 05:09, Wanpeng Li  wrote:
> >
> > Hi Vincent,
> > On Fri, 29 Jun 2018 at 03:07, Vincent Guittot
> >  wrote:
> > >
> > > interrupt and steal time are the only remaining activities tracked by
> > > rt_avg. Like for sched classes, we can use PELT to track their average
> > > utilization of the CPU. But unlike sched class, we don't track when
> > > entering/leaving interrupt; Instead, we take into account the time spent
> > > under interrupt context when we update rqs' clock (rq_clock_task).
> > > This also means that we have to decay the normal context time and account
> > > for interrupt time during the update.
> > >
> > > That's also important to note that because
> > >   rq_clock == rq_clock_task + interrupt time
> > > and rq_clock_task is used by a sched class to compute its utilization, the
> > > util_avg of a sched class only reflects the utilization of the time spent
> > > in normal context and not of the whole time of the CPU. The utilization of
> > > > interrupt gives a more accurate level of utilization of CPU.
> > > The CPU utilization is :
> > >   avg_irq + (1 - avg_irq / max capacity) * /Sum avg_rq
> > >
> > > > Most of the time, avg_irq is small and negligible so the use of the
> > > approximation CPU utilization = /Sum avg_rq was enough
> > >
> > > Cc: Ingo Molnar 
> > > Cc: Peter Zijlstra 
> > > Signed-off-by: Vincent Guittot 
> > > ---
> > >  kernel/sched/core.c  |  4 +++-
> > >  kernel/sched/fair.c  | 13 ++---
> > >  kernel/sched/pelt.c  | 40 
> > >  kernel/sched/pelt.h  | 16 
> > >  kernel/sched/sched.h |  3 +++
> > >  5 files changed, 72 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > > index 78d8fac..e5263a4 100644
> > > --- a/kernel/sched/core.c
> > > +++ b/kernel/sched/core.c
> > > @@ -18,6 +18,8 @@
> > >  #include "../workqueue_internal.h"
> > >  #include "../smpboot.h"
> > >
> > > +#include "pelt.h"
> > > +
> > >  #define CREATE_TRACE_POINTS
> > >  #include 
> > >
> > > @@ -186,7 +188,7 @@ static void update_rq_clock_task(struct rq *rq, s64 
> > > delta)
> > >
> > >  #if defined(CONFIG_IRQ_TIME_ACCOUNTING) || 
> > > defined(CONFIG_PARAVIRT_TIME_ACCOUNTING)
> > > if ((irq_delta + steal) && sched_feat(NONTASK_CAPACITY))
> > > -   sched_rt_avg_update(rq, irq_delta + steal);
> > > +   update_irq_load_avg(rq, irq_delta + steal);
> >
> > I think we should not add steal time into irq load tracking, steal
> > time is always 0 on native kernel which doesn't matter, what will
> > happen when guest disables IRQ_TIME_ACCOUNTING and enables
> > PARAVIRT_TIME_ACCOUNTING? Steal time is not the real irq util_avg. In
> > addition, we haven't exposed power management for performance which
> > means that e.g. schedutil governor can not cooperate with passive mode
> > intel_pstate driver to tune the OPP. To decay the old steal time avg
> > and add the new one just wastes cpu cycles.
>
> In fact, I have kept the same behavior as with rt_avg, which was
> already adding steal time when computing scale_rt_capacity, which is
> used to reflect the remaining capacity for FAIR tasks and is used in
> load balance. I'm not sure that it's worth using different variables
> for irq and steal.
> That being said, I see a possible optimization in schedutil when
> PARAVIRT_TIME_ACCOUNTING is enabled and IRQ_TIME_ACCOUNTING is disabled.
> With this kind of config, scale_irq_capacity can be a nop for
> schedutil but scales the utilization for scale_rt_capacity

Yeah, this is what I had in mind before; you can make a patch for that. :)

Regards,
Wanpeng Li
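
As a worked illustration of the utilization formula quoted above,
avg_irq + (1 - avg_irq / max capacity) * Sum avg_rq, here is a small
integer-arithmetic sketch. The function name cpu_util_with_irq() and the
capacity scale of 1024 used in the example values are assumptions for
illustration, not code from the patch:

```c
#include <assert.h>

/*
 * Hypothetical helper: combine irq utilization with the summed
 * utilization of the sched classes. All values share one scale, where
 * max_capacity means "fully busy".
 */
unsigned long cpu_util_with_irq(unsigned long avg_irq,
				unsigned long sum_avg_rq,
				unsigned long max_capacity)
{
	unsigned long util;

	assert(max_capacity > 0);
	if (avg_irq >= max_capacity)
		return max_capacity;

	/* (1 - avg_irq / max) * sum == sum * (max - avg_irq) / max */
	util = sum_avg_rq * (max_capacity - avg_irq) / max_capacity;
	util += avg_irq;

	/* Clamp so rounding never reports more than full capacity. */
	return util < max_capacity ? util : max_capacity;
}
```

With max_capacity = 1024, avg_irq = 256 and Sum avg_rq = 512, the scaled
class utilization is 512 * 768 / 1024 = 384, so the total is 640. With
avg_irq = 0 the result collapses to Sum avg_rq, which is the
approximation the commit message says was used before.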

