Re: kernfs: can read/write method grow buffer size?

2019-03-28 Thread Greg Kroah-Hartman
On Fri, Mar 29, 2019 at 07:51:07AM +0100, Marek Behun wrote:
> > If this is just for kernfs, and you have your own filesystem, sure, we
> > can probably do something here.  But if this is for sysfs, no, you all
> > need to keep to the "one value per file" rule that we have there
> > please.
> 
> Greg, the file satisfies the "one value per file" rule. That value is
> the currently selected trigger. Writing a trigger name changes this
> value. It is just that reading also prints all the supported triggers.

"all supported triggers" is not a "single value" :)


Re: kernfs: can read/write method grow buffer size?

2019-03-28 Thread Marek Behun
> If this is just for kernfs, and you have your own filesystem, sure, we
> can probably do something here.  But if this is for sysfs, no, you all
> need to keep to the "one value per file" rule that we have there
> please.

Greg, the file satisfies the "one value per file" rule. That value is
the currently selected trigger. Writing a trigger name changes this
value. It is just that reading also prints all the supported triggers.


Re: [PATCH] RISC-V: Fix FIXMAP_TOP to avoid overlap with VMALLOC area

2019-03-28 Thread Anup Patel
Hi Palmer,

On Fri, Mar 29, 2019 at 11:48 AM Palmer Dabbelt  wrote:
>
> On Fri, 22 Mar 2019 06:25:09 PDT (-0700), Christoph Hellwig wrote:
> > Looks good,
> >
> > Reviewed-by: Christoph Hellwig 
>
> Thanks.  I've added this to my fixes list for the next RC.

I have another RC fix as well:

"[PATCH v4] RISC-V: Always compile mm/init.c with cmodel=medany and notrace"

https://patchwork.kernel.org/patch/10870605/

Regards,
Anup


[PATCH v3 7/9] ARM: dts: imx53: Specify IMX5_CLK_AHB as "ahb" clock to SDMA

2019-03-28 Thread Andrey Smirnov
Since 25aaa75df1e6 SDMA driver uses clock rates of "ipg" and "ahb"
clock to determine if it needs to configure the IP block as operating
at 1:1 or 1:2 clock ratio (ACR bit in SDMAARM_CONFIG). Specifying both
clocks as IMX5_CLK_SDMA results in driver incorrectly thinking that
ratio is 1:1 which results in broken SDMA functionality. Fix the code
to specify IMX5_CLK_AHB as "ahb" clock for SDMA, to avoid detecting
incorrect clock ratio.

Fixes: 25aaa75df1e6 ("dmaengine: imx-sdma: add clock ratio 1:1 check")
Signed-off-by: Andrey Smirnov 
Cc: Angus Ainslie (Purism) 
Cc: Chris Healy 
Cc: Lucas Stach 
Cc: Fabio Estevam 
Cc: Shawn Guo 
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
---
 arch/arm/boot/dts/imx53.dtsi | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm/boot/dts/imx53.dtsi b/arch/arm/boot/dts/imx53.dtsi
index b3300300aabe..9b672ed2486d 100644
--- a/arch/arm/boot/dts/imx53.dtsi
+++ b/arch/arm/boot/dts/imx53.dtsi
@@ -702,7 +702,7 @@
reg = <0x63fb 0x4000>;
interrupts = <6>;
clocks = <&clks IMX5_CLK_SDMA_GATE>,
-<&clks IMX5_CLK_SDMA_GATE>;
+<&clks IMX5_CLK_AHB>;
clock-names = "ipg", "ahb";
#dma-cells = <3>;
fsl,sdma-ram-script-name = 
"imx/sdma/sdma-imx53.bin";
-- 
2.20.1
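
As a rough illustration of why duplicate clock entries matter, the check
described in the commit message can be modelled in user space as below.
This is a sketch only: sdma_ratio_is_1_1() and any rates passed to it are
hypothetical, and the real driver obtains the rates through the clock
framework rather than taking them as arguments.

```c
#include <stdbool.h>

/*
 * Hypothetical helper modelling the idea behind the check added by
 * 25aaa75df1e6: the block is treated as running at a 1:1 ratio only
 * when the "ahb" and "ipg" clock rates are equal; otherwise 1:2 is
 * assumed.
 */
bool sdma_ratio_is_1_1(unsigned long ahb_rate_hz, unsigned long ipg_rate_hz)
{
    return ahb_rate_hz == ipg_rate_hz;
}
```

When "ipg" and "ahb" both point at the same clock, the two rates are
always equal, so a check of this shape reports 1:1 regardless of how the
buses are actually clocked.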



[PATCH v3 8/9] ARM: dts: imx51: Specify IMX5_CLK_AHB as "ahb" clock to SDMA

2019-03-28 Thread Andrey Smirnov
Since 25aaa75df1e6 SDMA driver uses clock rates of "ipg" and "ahb"
clock to determine if it needs to configure the IP block as operating
at 1:1 or 1:2 clock ratio (ACR bit in SDMAARM_CONFIG). Specifying both
clocks as IMX5_CLK_SDMA results in driver incorrectly thinking that
ratio is 1:1 which results in broken SDMA functionality. Fix the code
to specify IMX5_CLK_AHB as "ahb" clock for SDMA, to avoid detecting
incorrect clock ratio.

Fixes: 25aaa75df1e6 ("dmaengine: imx-sdma: add clock ratio 1:1 check")
Signed-off-by: Andrey Smirnov 
Cc: Angus Ainslie (Purism) 
Cc: Chris Healy 
Cc: Lucas Stach 
Cc: Fabio Estevam 
Cc: Shawn Guo 
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
---
 arch/arm/boot/dts/imx51.dtsi | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm/boot/dts/imx51.dtsi b/arch/arm/boot/dts/imx51.dtsi
index 0b28b68a17bb..e3fa3d213cd8 100644
--- a/arch/arm/boot/dts/imx51.dtsi
+++ b/arch/arm/boot/dts/imx51.dtsi
@@ -502,7 +502,7 @@
reg = <0x83fb 0x4000>;
interrupts = <6>;
clocks = <&clks IMX5_CLK_SDMA_GATE>,
-<&clks IMX5_CLK_SDMA_GATE>;
+<&clks IMX5_CLK_AHB>;
clock-names = "ipg", "ahb";
#dma-cells = <3>;
fsl,sdma-ram-script-name = 
"imx/sdma/sdma-imx51.bin";
-- 
2.20.1



[PATCH v3 4/9] ARM: dts: imx6ul: Specify IMX6UL_CLK_IPG as "ipg" clock to SDMA

2019-03-28 Thread Andrey Smirnov
Since 25aaa75df1e6 SDMA driver uses clock rates of "ipg" and "ahb"
clock to determine if it needs to configure the IP block as operating
at 1:1 or 1:2 clock ratio (ACR bit in SDMAARM_CONFIG). Specifying both
clocks as IMX6UL_CLK_SDMA results in driver incorrectly thinking that
ratio is 1:1 which results in broken SDMA functionality. Fix the code
to specify IMX6UL_CLK_IPG as "ipg" clock for SDMA, to avoid detecting
incorrect clock ratio.

Fixes: 25aaa75df1e6 ("dmaengine: imx-sdma: add clock ratio 1:1 check")
Signed-off-by: Andrey Smirnov 
Cc: Angus Ainslie (Purism) 
Cc: Chris Healy 
Cc: Lucas Stach 
Cc: Fabio Estevam 
Cc: Shawn Guo 
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
---
 arch/arm/boot/dts/imx6ul.dtsi | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm/boot/dts/imx6ul.dtsi b/arch/arm/boot/dts/imx6ul.dtsi
index a77bbcae4571..bbf010c73336 100644
--- a/arch/arm/boot/dts/imx6ul.dtsi
+++ b/arch/arm/boot/dts/imx6ul.dtsi
@@ -708,7 +708,7 @@
 "fsl,imx35-sdma";
reg = <0x020ec000 0x4000>;
interrupts = ;
-   clocks = <&clks IMX6UL_CLK_SDMA>,
+   clocks = <&clks IMX6UL_CLK_IPG>,
 <&clks IMX6UL_CLK_SDMA>;
clock-names = "ipg", "ahb";
#dma-cells = <3>;
-- 
2.20.1



[PATCH v3 9/9] ARM: dts: imx50: Specify IMX5_CLK_AHB as "ahb" clock to SDMA

2019-03-28 Thread Andrey Smirnov
Since 25aaa75df1e6 SDMA driver uses clock rates of "ipg" and "ahb"
clock to determine if it needs to configure the IP block as operating
at 1:1 or 1:2 clock ratio (ACR bit in SDMAARM_CONFIG). Specifying both
clocks as IMX5_CLK_SDMA results in driver incorrectly thinking that
ratio is 1:1 which results in broken SDMA functionality. Fix the code
to specify IMX5_CLK_AHB as "ahb" clock for SDMA, to avoid detecting
incorrect clock ratio.

Fixes: 25aaa75df1e6 ("dmaengine: imx-sdma: add clock ratio 1:1 check")
Signed-off-by: Andrey Smirnov 
Cc: Angus Ainslie (Purism) 
Cc: Chris Healy 
Cc: Lucas Stach 
Cc: Fabio Estevam 
Cc: Shawn Guo 
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
---
 arch/arm/boot/dts/imx50.dtsi | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm/boot/dts/imx50.dtsi b/arch/arm/boot/dts/imx50.dtsi
index 5dd61bff3b76..0bfe7c91d0eb 100644
--- a/arch/arm/boot/dts/imx50.dtsi
+++ b/arch/arm/boot/dts/imx50.dtsi
@@ -430,7 +430,7 @@
reg = <0x63fb 0x4000>;
interrupts = <6>;
clocks = <&clks IMX5_CLK_SDMA_GATE>,
-<&clks IMX5_CLK_SDMA_GATE>;
+<&clks IMX5_CLK_AHB>;
clock-names = "ipg", "ahb";
#dma-cells = <3>;
fsl,sdma-ram-script-name = 
"imx/sdma/sdma-imx50.bin";
-- 
2.20.1



[PATCH v3 2/9] ARM: dts: imx6sx: Specify IMX6SX_CLK_IPG as "ipg" clock to SDMA

2019-03-28 Thread Andrey Smirnov
Since 25aaa75df1e6 SDMA driver uses clock rates of "ipg" and "ahb"
clock to determine if it needs to configure the IP block as operating
at 1:1 or 1:2 clock ratio (ACR bit in SDMAARM_CONFIG). Specifying both
clocks as IMX6SX_CLK_SDMA results in driver incorrectly thinking that
ratio is 1:1 which results in broken SDMA functionality. Fix the code
to specify IMX6SX_CLK_IPG as "ipg" clock for SDMA, to avoid detecting
incorrect clock ratio.

Fixes: 25aaa75df1e6 ("dmaengine: imx-sdma: add clock ratio 1:1 check")
Signed-off-by: Andrey Smirnov 
Cc: Angus Ainslie (Purism) 
Cc: Chris Healy 
Cc: Lucas Stach 
Cc: Fabio Estevam 
Cc: Shawn Guo 
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
---
 arch/arm/boot/dts/imx6sx.dtsi | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm/boot/dts/imx6sx.dtsi b/arch/arm/boot/dts/imx6sx.dtsi
index df0c59519886..b16a123990a2 100644
--- a/arch/arm/boot/dts/imx6sx.dtsi
+++ b/arch/arm/boot/dts/imx6sx.dtsi
@@ -820,7 +820,7 @@
compatible = "fsl,imx6sx-sdma", 
"fsl,imx6q-sdma";
reg = <0x020ec000 0x4000>;
interrupts = ;
-   clocks = <&clks IMX6SX_CLK_SDMA>,
+   clocks = <&clks IMX6SX_CLK_IPG>,
 <&clks IMX6SX_CLK_SDMA>;
clock-names = "ipg", "ahb";
#dma-cells = <3>;
-- 
2.20.1



[PATCH v3 3/9] ARM: dts: imx7d: Specify IMX7D_CLK_IPG as "ipg" clock to SDMA

2019-03-28 Thread Andrey Smirnov
Since 25aaa75df1e6 SDMA driver uses clock rates of "ipg" and "ahb"
clock to determine if it needs to configure the IP block as operating
at 1:1 or 1:2 clock ratio (ACR bit in SDMAARM_CONFIG). Specifying both
clocks as IMX7D_CLK_SDMA results in driver incorrectly thinking that
ratio is 1:1 which results in broken SDMA functionality. Fix the code
to specify IMX7D_CLK_IPG as "ipg" clock for SDMA, to avoid detecting
incorrect clock ratio.

Fixes: 25aaa75df1e6 ("dmaengine: imx-sdma: add clock ratio 1:1 check")
Signed-off-by: Andrey Smirnov 
Cc: Angus Ainslie (Purism) 
Cc: Chris Healy 
Cc: Lucas Stach 
Cc: Fabio Estevam 
Cc: Shawn Guo 
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
---
 arch/arm/boot/dts/imx7s.dtsi | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm/boot/dts/imx7s.dtsi b/arch/arm/boot/dts/imx7s.dtsi
index e67c8dd4bdab..31c610e6c788 100644
--- a/arch/arm/boot/dts/imx7s.dtsi
+++ b/arch/arm/boot/dts/imx7s.dtsi
@@ -1071,8 +1071,8 @@
compatible = "fsl,imx7d-sdma", "fsl,imx35-sdma";
reg = <0x30bd 0x1>;
interrupts = ;
-   clocks = <&clks IMX7D_SDMA_CORE_CLK>,
-<&clks IMX7D_AHB_CHANNEL_ROOT_CLK>;
+   clocks = <&clks IMX7D_IPG_ROOT_CLK>,
+<&clks IMX7D_SDMA_CORE_CLK>;
clock-names = "ipg", "ahb";
#dma-cells = <3>;
fsl,sdma-ram-script-name = 
"imx/sdma/sdma-imx7d.bin";
-- 
2.20.1



[PATCH v3 5/9] ARM: dts: imx6sll: Specify IMX6SLL_CLK_IPG as "ipg" clock to SDMA

2019-03-28 Thread Andrey Smirnov
Since 25aaa75df1e6 SDMA driver uses clock rates of "ipg" and "ahb"
clock to determine if it needs to configure the IP block as operating
at 1:1 or 1:2 clock ratio (ACR bit in SDMAARM_CONFIG). Specifying both
clocks as IMX6SLL_CLK_SDMA results in driver incorrectly thinking that
ratio is 1:1 which results in broken SDMA functionality. Fix the code
to specify IMX6SLL_CLK_IPG as "ipg" clock for SDMA, to avoid detecting
incorrect clock ratio.

Fixes: 25aaa75df1e6 ("dmaengine: imx-sdma: add clock ratio 1:1 check")
Signed-off-by: Andrey Smirnov 
Cc: Angus Ainslie (Purism) 
Cc: Chris Healy 
Cc: Lucas Stach 
Cc: Fabio Estevam 
Cc: Shawn Guo 
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
---
 arch/arm/boot/dts/imx6sll.dtsi | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm/boot/dts/imx6sll.dtsi b/arch/arm/boot/dts/imx6sll.dtsi
index 62847c68330b..ed598d72038c 100644
--- a/arch/arm/boot/dts/imx6sll.dtsi
+++ b/arch/arm/boot/dts/imx6sll.dtsi
@@ -621,7 +621,7 @@
compatible = "fsl,imx6sll-sdma", 
"fsl,imx35-sdma";
reg = <0x020ec000 0x4000>;
interrupts = ;
-   clocks = <&clks IMX6SLL_CLK_SDMA>,
+   clocks = <&clks IMX6SLL_CLK_IPG>,
 <&clks IMX6SLL_CLK_SDMA>;
clock-names = "ipg", "ahb";
#dma-cells = <3>;
-- 
2.20.1



[PATCH v3 6/9] ARM: dts: imx6sl: Specify IMX6SL_CLK_AHB as "ahb" clock to SDMA

2019-03-28 Thread Andrey Smirnov
Since 25aaa75df1e6 SDMA driver uses clock rates of "ipg" and "ahb"
clock to determine if it needs to configure the IP block as operating
at 1:1 or 1:2 clock ratio (ACR bit in SDMAARM_CONFIG). Specifying both
clocks as IMX6SL_CLK_SDMA results in driver incorrectly thinking that
ratio is 1:1 which results in broken SDMA functionality. Fix the code
to specify IMX6SL_CLK_AHB as "ahb" clock for SDMA, to avoid detecting
incorrect clock ratio.

Fixes: 25aaa75df1e6 ("dmaengine: imx-sdma: add clock ratio 1:1 check")
Signed-off-by: Andrey Smirnov 
Cc: Angus Ainslie (Purism) 
Cc: Chris Healy 
Cc: Lucas Stach 
Cc: Fabio Estevam 
Cc: Shawn Guo 
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
---
 arch/arm/boot/dts/imx6sl.dtsi | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm/boot/dts/imx6sl.dtsi b/arch/arm/boot/dts/imx6sl.dtsi
index 0ad5d507abec..9ddbeea64b72 100644
--- a/arch/arm/boot/dts/imx6sl.dtsi
+++ b/arch/arm/boot/dts/imx6sl.dtsi
@@ -748,7 +748,7 @@
reg = <0x020ec000 0x4000>;
interrupts = <0 2 IRQ_TYPE_LEVEL_HIGH>;
clocks = <&clks IMX6SL_CLK_SDMA>,
-<&clks IMX6SL_CLK_SDMA>;
+<&clks IMX6SL_CLK_AHB>;
clock-names = "ipg", "ahb";
#dma-cells = <3>;
/* imx6sl reuses imx6q sdma firmware */
-- 
2.20.1



[PATCH v3 1/9] ARM: dts: imx6qdl: Specify IMX6QDL_CLK_IPG as "ipg" clock to SDMA

2019-03-28 Thread Andrey Smirnov
Since 25aaa75df1e6 SDMA driver uses clock rates of "ipg" and "ahb"
clock to determine if it needs to configure the IP block as operating
at 1:1 or 1:2 clock ratio (ACR bit in SDMAARM_CONFIG). Specifying both
clocks as IMX6QDL_CLK_SDMA results in driver incorrectly thinking that
ratio is 1:1 which results in broken SDMA functionality (this at least
breaks RAVE SP serdev driver on RDU2). Fix the code to specify
IMX6QDL_CLK_IPG as "ipg" clock for SDMA, to avoid detecting incorrect
clock ratio.

Fixes: 25aaa75df1e6 ("dmaengine: imx-sdma: add clock ratio 1:1 check")
Signed-off-by: Andrey Smirnov 
Reviewed-by: Lucas Stach 
Cc: Angus Ainslie (Purism) 
Cc: Chris Healy 
Cc: Lucas Stach 
Cc: Fabio Estevam 
Cc: Shawn Guo 
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
---
 arch/arm/boot/dts/imx6qdl.dtsi | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm/boot/dts/imx6qdl.dtsi b/arch/arm/boot/dts/imx6qdl.dtsi
index 9f9aa6e7ed0e..354feba077b2 100644
--- a/arch/arm/boot/dts/imx6qdl.dtsi
+++ b/arch/arm/boot/dts/imx6qdl.dtsi
@@ -949,7 +949,7 @@
compatible = "fsl,imx6q-sdma", "fsl,imx35-sdma";
reg = <0x020ec000 0x4000>;
interrupts = <0 2 IRQ_TYPE_LEVEL_HIGH>;
-   clocks = <&clks IMX6QDL_CLK_SDMA>,
+   clocks = <&clks IMX6QDL_CLK_IPG>,
 <&clks IMX6QDL_CLK_SDMA>;
clock-names = "ipg", "ahb";
#dma-cells = <3>;
-- 
2.20.1



[tip:efi/core] efi/libstub/arm: Omit unneeded stripping of ksymtab/kcrctab sections

2019-03-28 Thread tip-bot for Ard Biesheuvel
Commit-ID:  02562d0ca1084a688ac5c92e0e92947f62f13093
Gitweb: https://git.kernel.org/tip/02562d0ca1084a688ac5c92e0e92947f62f13093
Author: Ard Biesheuvel 
AuthorDate: Thu, 28 Mar 2019 20:34:29 +0100
Committer:  Ingo Molnar 
CommitDate: Fri, 29 Mar 2019 07:35:00 +0100

efi/libstub/arm: Omit unneeded stripping of ksymtab/kcrctab sections

Commit f922c4abdf764 ("module: allow symbol exports to be disabled")
introduced a way to inhibit generation of kcrctab/ksymtab sections
when building ordinary kernel code to be used in a different execution
context (decompressor, EFI stub, etc.).

That means we no longer have to strip those sections explicitly when
building the EFI libstub objects, so drop this from the Makefile.

Signed-off-by: Ard Biesheuvel 
Cc: Linus Torvalds 
Cc: Matt Fleming 
Cc: Nick Desaulniers 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: linux-...@vger.kernel.org
Link: http://lkml.kernel.org/r/20190328193429.21373-6-ard.biesheu...@linaro.org
Signed-off-by: Ingo Molnar 
---
 drivers/firmware/efi/libstub/Makefile | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/firmware/efi/libstub/Makefile 
b/drivers/firmware/efi/libstub/Makefile
index ae9081988c88..b1f7b64652db 100644
--- a/drivers/firmware/efi/libstub/Makefile
+++ b/drivers/firmware/efi/libstub/Makefile
@@ -71,7 +71,6 @@ CFLAGS_arm64-stub.o   := -DTEXT_OFFSET=$(TEXT_OFFSET)
 extra-$(CONFIG_EFI_ARMSTUB):= $(lib-y)
 lib-$(CONFIG_EFI_ARMSTUB)  := $(patsubst %.o,%.stub.o,$(lib-y))
 
-STUBCOPY_RM-y  := -R *ksymtab* -R *kcrctab*
 STUBCOPY_FLAGS-$(CONFIG_ARM64) += --prefix-alloc-sections=.init \
   --prefix-symbols=__efistub_
 STUBCOPY_RELOC-$(CONFIG_ARM64) := R_AARCH64_ABS
@@ -87,7 +86,7 @@ $(obj)/%.stub.o: $(obj)/%.o FORCE
 #
 quiet_cmd_stubcopy = STUBCPY $@
   cmd_stubcopy =   \
-   $(STRIP) --strip-debug $(STUBCOPY_RM-y) -o $@ $<;   \
+   $(STRIP) --strip-debug -o $@ $<;\
if $(OBJDUMP) -r $@ | grep $(STUBCOPY_RELOC-y); then\
echo "$@: absolute symbol references not allowed in the EFI 
stub" >&2; \
/bin/false; \


[tip:efi/core] efi: Unify DMI setup code over the arm/arm64, ia64 and x86 architectures

2019-03-28 Thread tip-bot for Robert Richter
Commit-ID:  0fca08122eaf5c956a2cbe12775245d747f8b1ac
Gitweb: https://git.kernel.org/tip/0fca08122eaf5c956a2cbe12775245d747f8b1ac
Author: Robert Richter 
AuthorDate: Thu, 28 Mar 2019 20:34:28 +0100
Committer:  Ingo Molnar 
CommitDate: Fri, 29 Mar 2019 07:35:00 +0100

efi: Unify DMI setup code over the arm/arm64, ia64 and x86 architectures

All architectures (arm/arm64, ia64 and x86) do the same here, so unify
the code.

Note: We do not need to call dump_stack_set_arch_desc() in case of
!dmi_available. Both strings, dmi_ids_string and
dump_stack_arch_desc_str, are initialized to zero and thus nothing
would change.

Signed-off-by: Robert Richter 
Signed-off-by: Ard Biesheuvel 
Reviewed-by: Jean Delvare 
Cc: Linus Torvalds 
Cc: Matt Fleming 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: linux-...@vger.kernel.org
Link: http://lkml.kernel.org/r/20190328193429.21373-5-ard.biesheu...@linaro.org
Signed-off-by: Ingo Molnar 
---
 arch/ia64/kernel/setup.c   |  4 +---
 arch/x86/kernel/setup.c|  6 ++
 drivers/firmware/dmi_scan.c| 28 +++-
 drivers/firmware/efi/arm-runtime.c |  7 ++-
 include/linux/dmi.h|  8 ++--
 5 files changed, 22 insertions(+), 31 deletions(-)

diff --git a/arch/ia64/kernel/setup.c b/arch/ia64/kernel/setup.c
index 583a3746d70b..c9cfa760cd57 100644
--- a/arch/ia64/kernel/setup.c
+++ b/arch/ia64/kernel/setup.c
@@ -1058,9 +1058,7 @@ check_bugs (void)
 
 static int __init run_dmi_scan(void)
 {
-   dmi_scan_machine();
-   dmi_memdev_walk();
-   dmi_set_dump_stack_arch_desc();
+   dmi_setup();
return 0;
 }
 core_initcall(run_dmi_scan);
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 3d872a527cd9..3773905cd2c1 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -1005,13 +1005,11 @@ void __init setup_arch(char **cmdline_p)
if (efi_enabled(EFI_BOOT))
efi_init();
 
-   dmi_scan_machine();
-   dmi_memdev_walk();
-   dmi_set_dump_stack_arch_desc();
+   dmi_setup();
 
/*
 * VMware detection requires dmi to be available, so this
-* needs to be done after dmi_scan_machine(), for the boot CPU.
+* needs to be done after dmi_setup(), for the boot CPU.
 */
init_hypervisor_platform();
 
diff --git a/drivers/firmware/dmi_scan.c b/drivers/firmware/dmi_scan.c
index 099d83e4e910..fae2d5c43314 100644
--- a/drivers/firmware/dmi_scan.c
+++ b/drivers/firmware/dmi_scan.c
@@ -416,11 +416,8 @@ static void __init save_mem_devices(const struct 
dmi_header *dm, void *v)
nr++;
 }
 
-void __init dmi_memdev_walk(void)
+static void __init dmi_memdev_walk(void)
 {
-   if (!dmi_available)
-   return;
-
if (dmi_walk_early(count_mem_devices) == 0 && dmi_memdev_nr) {
dmi_memdev = dmi_alloc(sizeof(*dmi_memdev) * dmi_memdev_nr);
if (dmi_memdev)
@@ -614,7 +611,7 @@ static int __init dmi_smbios3_present(const u8 *buf)
return 1;
 }
 
-void __init dmi_scan_machine(void)
+static void __init dmi_scan_machine(void)
 {
char __iomem *p, *q;
char buf[32];
@@ -769,15 +766,20 @@ static int __init dmi_init(void)
 subsys_initcall(dmi_init);
 
 /**
- * dmi_set_dump_stack_arch_desc - set arch description for dump_stack()
+ * dmi_setup - scan and setup DMI system information
  *
- * Invoke dump_stack_set_arch_desc() with DMI system information so that
- * DMI identifiers are printed out on task dumps.  Arch boot code should
- * call this function after dmi_scan_machine() if it wants to print out DMI
- * identifiers on task dumps.
+ * Scan the DMI system information. This setups DMI identifiers
+ * (dmi_system_id) for printing it out on task dumps and prepares
+ * DIMM entry information (dmi_memdev_info) from the SMBIOS table
+ * for using this when reporting memory errors.
  */
-void __init dmi_set_dump_stack_arch_desc(void)
+void __init dmi_setup(void)
 {
+   dmi_scan_machine();
+   if (!dmi_available)
+   return;
+
+   dmi_memdev_walk();
dump_stack_set_arch_desc("%s", dmi_ids_string);
 }
 
@@ -841,7 +843,7 @@ static bool dmi_is_end_of_table(const struct dmi_system_id 
*dmi)
  * returns non zero or we hit the end. Callback function is called for
  * each successful match. Returns the number of matches.
  *
- * dmi_scan_machine must be called before this function is called.
+ * dmi_setup must be called before this function is called.
  */
 int dmi_check_system(const struct dmi_system_id *list)
 {
@@ -871,7 +873,7 @@ EXPORT_SYMBOL(dmi_check_system);
  * Walk the blacklist table until the first match is found.  Return the
  * pointer to the matching entry or NULL if there's no match.
  *
- * dmi_scan_machine must be called before this function is called.
+ * dmi_setup must be called before this function is called.
  */
 const struct dmi_system_id *dmi_first_match(c

[tip:efi/core] efi/arm: Show SMBIOS bank/device location in CPER and GHES error logs

2019-03-28 Thread tip-bot for Marcin Benka
Commit-ID:  5e83cfe947444c7f201f8c39ce0189922ec9f578
Gitweb: https://git.kernel.org/tip/5e83cfe947444c7f201f8c39ce0189922ec9f578
Author: Marcin Benka 
AuthorDate: Thu, 28 Mar 2019 20:34:27 +0100
Committer:  Ingo Molnar 
CommitDate: Fri, 29 Mar 2019 07:35:00 +0100

efi/arm: Show SMBIOS bank/device location in CPER and GHES error logs

Run dmi_memdev_walk() for arch arm* as other architectures do.
This improves error logging as the memory device handle is
translated now to the DIMM entry's name provided by the DMI handle.

Before:

 {1}[Hardware Error]:   DIMM location: not present. DMI handle: 0x0038

After:

 {1}[Hardware Error]:   DIMM location: N0 DIMM_A0

Signed-off-by: Marcin Benka 
Signed-off-by: Robert Richter 
Signed-off-by: Ard Biesheuvel 
Cc: Linus Torvalds 
Cc: Matt Fleming 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: linux-...@vger.kernel.org
Link: http://lkml.kernel.org/r/20190328193429.21373-4-ard.biesheu...@linaro.org
Signed-off-by: Ingo Molnar 
---
 drivers/firmware/efi/arm-runtime.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/firmware/efi/arm-runtime.c 
b/drivers/firmware/efi/arm-runtime.c
index 0c1af675c338..4a0dfe4ab829 100644
--- a/drivers/firmware/efi/arm-runtime.c
+++ b/drivers/firmware/efi/arm-runtime.c
@@ -167,6 +167,7 @@ static int __init arm_dmi_init(void)
 * itself, depends on dmi_scan_machine() having been called already.
 */
dmi_scan_machine();
+   dmi_memdev_walk();
if (dmi_available)
dmi_set_dump_stack_arch_desc();
return 0;


Re: [PATCH v2 0/4] Clean up comments and codes in sparse_add_one_section()

2019-03-28 Thread Baoquan He
I talked to Michal: the local code refactoring may impact the big
feature or improvement patchsets, e.g. patches 2/4 and 3/4 will
conflict with Dan's patchset:
[PATCH v5 00/10] mm: Sub-section memory hotplug support

So I would like to discard them and only repost patches 1/4 and 4/4 after
addressing the reviewers' concerns. Sorry for the noise.

On 03/26/19 at 05:02pm, Baoquan He wrote:
> This is v2 post. V1 is here:
> http://lkml.kernel.org/r/20190320073540.12866-1-...@redhat.com
> 
> This patchset includes 4 patches. The first three patches are around
> sparse_add_one_section(). The last one is a simple clean up patch when
> review codes in hotplug path, carry it in this patchset.
> 
> Baoquan He (4):
>   mm/sparse: Clean up the obsolete code comment
>   mm/sparse: Optimize sparse_add_one_section()
>   mm/sparse: Rename function related to section memmap allocation/free
>   drivers/base/memory.c: Rename the misleading parameter
> 
>  drivers/base/memory.c |  6 ++---
>  mm/sparse.c   | 58 ++-
>  2 files changed, 33 insertions(+), 31 deletions(-)
> 
> -- 
> 2.17.2
> 


[tip:efi/core] efifb: Omit memory map check on legacy boot

2019-03-28 Thread tip-bot for Ard Biesheuvel
Commit-ID:  c2999c281ea2d2ebbdfce96cecc7b52e2ae7c406
Gitweb: https://git.kernel.org/tip/c2999c281ea2d2ebbdfce96cecc7b52e2ae7c406
Author: Ard Biesheuvel 
AuthorDate: Thu, 28 Mar 2019 20:34:26 +0100
Committer:  Ingo Molnar 
CommitDate: Fri, 29 Mar 2019 07:34:59 +0100

efifb: Omit memory map check on legacy boot

Since the following commit:

  38ac0287b7f4 ("fbdev/efifb: Honour UEFI memory map attributes when mapping the FB")

efifb_probe() checks its memory range via efi_mem_desc_lookup(),
and this leads to a spurious error message:

   EFI_MEMMAP is not enabled

at every boot on KVM.  This is quite annoying since the error message
appears even if you set the "quiet" boot option.

Since this happens on legacy boot, which strangely enough exposes
an EFI framebuffer via screen_info, let's double check that we are
doing an EFI boot before attempting to access the EFI memory map.

Reported-by: Takashi Iwai 
Tested-by: Takashi Iwai 
Signed-off-by: Ard Biesheuvel 
Cc: Linus Torvalds 
Cc: Matt Fleming 
Cc: Peter Jones 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: linux-...@vger.kernel.org
Link: http://lkml.kernel.org/r/20190328193429.21373-3-ard.biesheu...@linaro.org
Signed-off-by: Ingo Molnar 
---
 drivers/video/fbdev/efifb.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/video/fbdev/efifb.c b/drivers/video/fbdev/efifb.c
index ba906876cc45..9e529cc2b4ff 100644
--- a/drivers/video/fbdev/efifb.c
+++ b/drivers/video/fbdev/efifb.c
@@ -464,7 +464,8 @@ static int efifb_probe(struct platform_device *dev)
info->apertures->ranges[0].base = efifb_fix.smem_start;
info->apertures->ranges[0].size = size_remap;
 
-   if (!efi_mem_desc_lookup(efifb_fix.smem_start, &md)) {
+   if (efi_enabled(EFI_BOOT) &&
+   !efi_mem_desc_lookup(efifb_fix.smem_start, &md)) {
if ((efifb_fix.smem_start + efifb_fix.smem_len) >
(md.phys_addr + (md.num_pages << EFI_PAGE_SHIFT))) {
pr_err("efifb: video memory @ 0x%lx spans multiple EFI 
memory regions\n",
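
The guard this patch adds, together with the existing range check, can be
modelled in user space roughly as follows. The struct and function names
are hypothetical stand-ins, and lookup_found stands for a successful
descriptor lookup; only the arithmetic is taken from the hunk above.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_EFI_PAGE_SHIFT 12 /* EFI pages are 4 KiB */

/* Hypothetical stand-in for the fields of the EFI memory descriptor used here. */
struct sketch_mem_desc {
    uint64_t phys_addr;
    uint64_t num_pages;
};

/*
 * Returns true when the framebuffer [start, start + len) extends past the
 * end of the region described by md. The memory map is consulted only on
 * an EFI boot and only if a descriptor was found, which is the guard the
 * patch introduces.
 */
bool fb_spans_regions(bool efi_boot, bool lookup_found,
                      uint64_t start, uint64_t len,
                      const struct sketch_mem_desc *md)
{
    if (!efi_boot || !lookup_found)
        return false; /* legacy boot: skip the memory map check entirely */

    return (start + len) >
           (md->phys_addr + (md->num_pages << SKETCH_EFI_PAGE_SHIFT));
}
```

On a legacy boot the function returns early, so nothing ever touches the
(unavailable) memory map and no error message is produced.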


[RFC PATCH] drm/komeda: Creates plane alpha and blend mode properties

2019-03-28 Thread Lowry Li (Arm Technology China)
Creates plane alpha and blend mode properties attached to plane.

Signed-off-by: Lowry Li 
---
 drivers/gpu/drm/arm/display/komeda/komeda_plane.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/drivers/gpu/drm/arm/display/komeda/komeda_plane.c 
b/drivers/gpu/drm/arm/display/komeda/komeda_plane.c
index af51f0c..0ebec39 100644
--- a/drivers/gpu/drm/arm/display/komeda/komeda_plane.c
+++ b/drivers/gpu/drm/arm/display/komeda/komeda_plane.c
@@ -212,6 +212,17 @@ static int komeda_plane_add(struct komeda_kms_dev *kms,
 
drm_plane_helper_add(plane, &komeda_plane_helper_funcs);
 
+   err = drm_plane_create_alpha_property(plane);
+   if (err)
+   goto cleanup;
+
+   err = drm_plane_create_blend_mode_property(plane,
+   BIT(DRM_MODE_BLEND_PIXEL_NONE) |
+   BIT(DRM_MODE_BLEND_PREMULTI)   |
+   BIT(DRM_MODE_BLEND_COVERAGE));
+   if (err)
+   goto cleanup;
+
return 0;
 cleanup:
komeda_plane_destroy(plane);
-- 
1.9.1
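
For readers unfamiliar with the bitmask argument, the following user-space
sketch shows how a "supported blend modes" mask of this shape is built and
queried. Everything here is illustrative: the SKETCH_* names are
hypothetical and the numeric values of the enumerators are an assumption
of this sketch, not copied from the DRM headers.

```c
#include <stdbool.h>

#define SKETCH_BIT(n) (1U << (n))

/* Illustrative stand-ins for the blend mode identifiers (values assumed). */
enum sketch_blend_mode {
    SKETCH_BLEND_PREMULTI,
    SKETCH_BLEND_COVERAGE,
    SKETCH_BLEND_PIXEL_NONE
};

/* Build the mask the same way the patch does: one bit per supported mode. */
unsigned int sketch_supported_modes(void)
{
    return SKETCH_BIT(SKETCH_BLEND_PIXEL_NONE) |
           SKETCH_BIT(SKETCH_BLEND_PREMULTI)   |
           SKETCH_BIT(SKETCH_BLEND_COVERAGE);
}

/* A mode is supported when its bit is set in the mask. */
bool sketch_mode_supported(unsigned int mask, enum sketch_blend_mode mode)
{
    return (mask & SKETCH_BIT(mode)) != 0;
}
```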



[tip:efi/core] efi/libstub: Refactor the cmd_stubcopy Makefile command

2019-03-28 Thread tip-bot for Masahiro Yamada
Commit-ID:  e8d368ad20f514dce86a64931fe4a6f06a0a6703
Gitweb: https://git.kernel.org/tip/e8d368ad20f514dce86a64931fe4a6f06a0a6703
Author: Masahiro Yamada 
AuthorDate: Thu, 28 Mar 2019 20:34:25 +0100
Committer:  Ingo Molnar 
CommitDate: Fri, 29 Mar 2019 07:34:59 +0100

efi/libstub: Refactor the cmd_stubcopy Makefile command

It took me a while to understand what is going on in the nested
if-blocks.

Simplify it by removing unneeded code.

  - if_changed automatically adds 'set -e', so any failure in the
series of commands makes it immediately fail as a whole.
So, the outer if block is entirely redundant.

  - Since commit 9c2af1c7377a ("kbuild: add .DELETE_ON_ERROR special target"),
GNU Make automatically deletes the target on any failure
in its recipe. The explicit 'rm -f $@' is redundant.

  - Surrounding commands with ( ) will spawn a subshell to execute them
in it, but it is rarely useful to do so.

Signed-off-by: Masahiro Yamada 
Signed-off-by: Ard Biesheuvel 
Cc: Linus Torvalds 
Cc: Matt Fleming 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: linux-...@vger.kernel.org
Link: http://lkml.kernel.org/r/20190328193429.21373-2-ard.biesheu...@linaro.org
Signed-off-by: Ingo Molnar 
---
 drivers/firmware/efi/libstub/Makefile | 13 +++--
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/drivers/firmware/efi/libstub/Makefile 
b/drivers/firmware/efi/libstub/Makefile
index b0103e16fc1b..ae9081988c88 100644
--- a/drivers/firmware/efi/libstub/Makefile
+++ b/drivers/firmware/efi/libstub/Makefile
@@ -86,12 +86,13 @@ $(obj)/%.stub.o: $(obj)/%.o FORCE
 # this time, use objcopy and leave all sections in place.
 #
 quiet_cmd_stubcopy = STUBCPY $@
-  cmd_stubcopy = if $(STRIP) --strip-debug $(STUBCOPY_RM-y) -o $@ $<; \
-then if $(OBJDUMP) -r $@ | grep $(STUBCOPY_RELOC-y); \
-then (echo >&2 "$@: absolute symbol references not allowed 
in the EFI stub"; \
-  rm -f $@; /bin/false); \
-else $(OBJCOPY) $(STUBCOPY_FLAGS-y) $< $@; fi\
-else /bin/false; fi
+  cmd_stubcopy =   \
+   $(STRIP) --strip-debug $(STUBCOPY_RM-y) -o $@ $<;   \
+   if $(OBJDUMP) -r $@ | grep $(STUBCOPY_RELOC-y); then\
+   echo "$@: absolute symbol references not allowed in the EFI 
stub" >&2; \
+   /bin/false; \
+   fi; \
+   $(OBJCOPY) $(STUBCOPY_FLAGS-y) $< $@
 
 #
 # ARM discards the .data section because it disallows r/w data in the


Re: [PATCH] ARM: dts: imx6qdl: Specify IMX6QDL_CLK_IPG as "ipg" clock for SDMA

2019-03-28 Thread Andrey Smirnov
On Wed, Mar 27, 2019 at 4:17 AM Fabio Estevam  wrote:
>
> Hi Andrey,
>
> On Tue, Mar 26, 2019 at 11:52 PM Andrey Smirnov
>  wrote:
>
> > I think it is worth fixing in DT, regardless if we do anything about
> > SDMA driver or not. I'll add fixes for 6SX and 7D in v2.
>
> What about imx25, imx31, imx50, imx51, imx53, imx6sl, imx6ul, imx6sll,
> imx6ull, etc?
>

I believe imx25, imx31, imx35 already have proper clocks configured.
I'll add the rest in v3.

Thanks,
Andrey Smirnov


Re: [PATCHv2] x86/boot/KASLR: skip the specified crashkernel reserved region

2019-03-28 Thread Baoquan He
On 03/29/19 at 01:45pm, Pingfan Liu wrote:
> On Fri, Mar 22, 2019 at 4:34 PM Baoquan He  wrote:
> >
> > On 03/22/19 at 03:52pm, Baoquan He wrote:
> > > On 03/22/19 at 03:43pm, Pingfan Liu wrote:
> > > > > > +/* parse crashkernel=x@y option */
> > > > > > +static void mem_avoid_crashkernel_simple(char *option)
> > > > >
> > > > > Chao mentioned this before, but I want to ask again: why does it have to be
> > > > > xxx_simple()?
> > > > >
> > > > Seems that I had replied to Chao's question in another email. The naming
> > > > follows the function parse_crashkernel_simple(), as the notes above
> > >
> > >
> > > Sorry, I don't get.  typo?
> >
> > OK, I misunderstood it. We do have parse_crashkernel_simple() to handle
> > the crashkernel=size[@offset] case, to differentiate it from other
> > complicated cases, like crashkernel=size,[high|low],
> >
> > Then I am fine with this naming. Sorry about the noise.
> >
> > By the way, do you think we should take care of this case:
> > crashkernel=<range1>:<size1>[,<range2>:<size2>,...][@offset]
> >
> > It can also specify @offset. Not sure if it's too complicated, you may
> > want to investigate.
> >
> In this case, kernel should get the total memory size info. So
> process_e820_entries() or process_efi_entries() should be called
> twice. One before handle_mem_options(), so crashkernel can evaluate
> the reserved size. It is doable, and what is your opinion about the

You mean calling process_e820_entries() to calculate the RAM size in the
system? I would not do it like that; please check what __find_max_addr()
is doing. Did I understand you correctly?
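
For reference, the simple crashkernel=size[@offset] form discussed in this
thread can be parsed along the following lines. This is a user-space
sketch under stated assumptions: sketch_memparse() is a simplified,
hypothetical stand-in for the kernel's suffix-aware number parser,
parse_crashkernel_fixed() is an invented name, and the sketch accepts only
the form with an explicit @offset terminated by end of string.

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Parse a number with an optional K/M/G suffix. Simplified stand-in for
 * the kernel's memparse(); *end is left pointing past what was consumed.
 */
static uint64_t sketch_memparse(const char *s, char **end)
{
    uint64_t v = strtoull(s, end, 0);

    switch (**end) {
    case 'G': case 'g':
        v <<= 10;
        /* fall through */
    case 'M': case 'm':
        v <<= 10;
        /* fall through */
    case 'K': case 'k':
        v <<= 10;
        (*end)++;
        break;
    default:
        break;
    }
    return v;
}

/*
 * Returns 0 and fills *size and *offset when opt has the form
 * "size@offset"; returns -1 otherwise (including a bare "size").
 */
int parse_crashkernel_fixed(const char *opt, uint64_t *size, uint64_t *offset)
{
    char *cur;
    const char *off_str;

    *size = sketch_memparse(opt, &cur);
    if (cur == opt || *cur != '@')
        return -1;

    off_str = cur + 1;
    *offset = sketch_memparse(off_str, &cur);
    if (cur == off_str || *cur != '\0')
        return -1;

    return 0;
}
```

Only this fixed form names a concrete physical range without any
knowledge of total RAM, which is why the range-based form raised above
needs extra work.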


Re: kernfs: can read/write method grow buffer size?

2019-03-28 Thread Greg Kroah-Hartman
On Fri, Mar 29, 2019 at 04:09:22AM +0100, Marek Behun wrote:
> Hello Tejun and Greg,
> 
> kernfs_fop_open/read/write allocates a buffer for the ->read, ->write,
> or ->seq_read methods. This buffer is either preallocated or allocated
> on the spot, with minimum size being PAGE_SIZE, if ->atomic_write_len
> is not given.
> 
> There is a question/problem currently in the led-trigger API, that the
> PAGE_SIZE buffer can in some specific scenarios be too short.

And that file is in sysfs?  That's a huge abuse of the sysfs api then :(

> (The trigger file on read returns space separated list of all supported
> triggers, and the currently chosen one is marked specially. The cpu
> activity trigger lists "cpu%i" for all CPU cores, which actually broke
> on some machines with very large number of CPUs. Granted, this could
> have been solved another way (and maybe will be), but we are now
> discussing API for HW LED triggers, which can raise the problem anyway,
> if a specific LED controller supports too many HW LED triggers.)
> 
> Is it allowed to grow this buffer if needed, either via krealloc or by
> creating a special function in kernfs API which does this so that
> led-trigger could use it?
> 
> Or is this completely forbidden?

If this is just for kernfs, and you have your own filesystem, sure, we
can probably do something here.  But if this is for sysfs, no, you all
need to keep to the "one value per file" rule that we have there please.

thanks,

greg k-h
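
The overflow Marek describes can be reproduced with simple arithmetic. The
userspace C sketch below is not kernel code: it only counts the bytes needed
for a space-separated "cpu%u" list and compares the result with an assumed
4096-byte page. SKETCH_PAGE_SIZE, cpu_trigger_list_len() and fits_in_page()
are hypothetical names, and the sketch ignores the other triggers and the
marker around the selected one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Assumed page size for this sketch; the real PAGE_SIZE depends on the arch. */
#define SKETCH_PAGE_SIZE 4096

/* Bytes needed to print "cpu0 cpu1 ... cpu<ncpus-1> " (no terminating NUL). */
size_t cpu_trigger_list_len(unsigned int ncpus)
{
	size_t total = 0;

	for (unsigned int i = 0; i < ncpus; i++) {
		/* snprintf() with a NULL buffer only reports the needed length. */
		int n = snprintf(NULL, 0, "cpu%u ", i);

		assert(n > 0);
		total += (size_t)n;
	}
	return total;
}

/* Non-zero if the list plus a terminating NUL fits in one assumed page. */
int fits_in_page(unsigned int ncpus)
{
	return cpu_trigger_list_len(ncpus) < SKETCH_PAGE_SIZE;
}
```

Under these assumptions, 64 CPUs need 374 bytes, while 1000 CPUs need 6890
bytes and no longer fit in a single page.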


[PATCH v2] irqchip/irq-ls1x: Missing error code in ls1x_intc_of_init()

2019-03-28 Thread Dan Carpenter
Currently, when irq_domain_add_linear() fails, the error code does not
get set, so the function returns zero, which is wrong.  Fix it by setting
an appropriate error code.

Fixes: 9e543e22e204 ("irqchip: Add driver for Loongson-1 interrupt controller")
Signed-off-by: Dan Carpenter 
Reviewed-by: Mukesh Ojha 
---
V2: Improve the commit message

 drivers/irqchip/irq-ls1x.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/irqchip/irq-ls1x.c b/drivers/irqchip/irq-ls1x.c
index 86b72fbd3b45..353111a10413 100644
--- a/drivers/irqchip/irq-ls1x.c
+++ b/drivers/irqchip/irq-ls1x.c
@@ -130,6 +130,7 @@ static int __init ls1x_intc_of_init(struct device_node *node,
 					     NULL);
 	if (!priv->domain) {
 		pr_err("ls1x-irq: cannot add IRQ domain\n");
+		err = -ENOMEM;
 		goto out_iounmap;
 	}
 
-- 
2.17.1
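
The pattern being fixed can be shown outside the kernel. The sketch below
uses hypothetical sketch_* names and malloc() in place of the real register
mapping and IRQ domain calls, so it is an illustration rather than the driver
code. The point is that the failure branch must set err before jumping to the
cleanup label; otherwise the function would return the stale value 0.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for a setup step that returns NULL on failure. */
void *sketch_add_domain(int should_fail)
{
	return should_fail ? NULL : malloc(1);
}

/* Returns 0 on success or a negative errno value on failure. */
int sketch_init(int fail_domain)
{
	int err = 0;	/* earlier steps succeeded, so err still holds 0 */
	void *regs, *domain;

	regs = malloc(16);	/* stands in for the mapped registers */
	if (!regs)
		return -ENOMEM;

	domain = sketch_add_domain(fail_domain);
	if (!domain) {
		err = -ENOMEM;	/* without this line, 0 would be returned */
		goto out_unmap;
	}

	assert(err == 0);
	free(domain);
	free(regs);
	return 0;

out_unmap:
	free(regs);
	return err;
}
```

Here sketch_init(1) reports -ENOMEM to its caller, which is the behaviour the
one-line patch restores for ls1x_intc_of_init().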


Re: [PATCH] RISC-V: Fix FIXMAP_TOP to avoid overlap with VMALLOC area

2019-03-28 Thread Palmer Dabbelt

On Fri, 22 Mar 2019 06:25:09 PDT (-0700), Christoph Hellwig wrote:
> Looks good,
>
> Reviewed-by: Christoph Hellwig 

Thanks.  I've added this to my fixes list for the next RC.



Btw, what is the 32-bit test vehicle of choice?  qemu with the
virt machine?


Re: [PATCH v3 3/5] dt-bindings: phy: ti: Add dt binding documentation for SERDES in AM654x SoC

2019-03-28 Thread Kishon Vijay Abraham I
Hi Rob,

On 28/03/19 11:37 PM, Rob Herring wrote:
> On Mon, Mar 25, 2019 at 01:38:13PM +0530, Kishon Vijay Abraham I wrote:
>> AM654x has two SERDES instances. Each instance has three input clocks
>> (left input, external reference clock and right input) and two output
>> clocks (left output and right output) in addition to a PLL mux clock
>> which the SERDES uses for Clock Multiplier Unit (CMU refclock).
>> The PLL mux clock can select from one of the three input clocks.
>> The right output can select between left input and external reference
>> clock while the left output can select between the right input and
>> external reference clock.
>>
>> The left and right input reference clock of SERDES0 and SERDES1
>> respectively are connected to the SoC clock. In the case of two lane
>> SERDES personality card, the left input of SERDES1 is connected to
>> the right output of SERDES0 in a chained fashion.
>>
>> See section "Reference Clock Distribution" of AM65x Sitara Processors
>> TRM (SPRUID7 – April 2018) for more details.
>>
>> Add dt-binding documentation in order to represent all these different
>> configurations in device tree.
>>
>> Signed-off-by: Kishon Vijay Abraham I 
>> ---
>>  .../bindings/phy/ti,phy-am654-serdes.txt  | 81 +++
>>  include/dt-bindings/phy/phy-am654-serdes.h| 13 +++
>>  2 files changed, 94 insertions(+)
>>  create mode 100644 
>> Documentation/devicetree/bindings/phy/ti,phy-am654-serdes.txt
>>  create mode 100644 include/dt-bindings/phy/phy-am654-serdes.h
>>
>> diff --git a/Documentation/devicetree/bindings/phy/ti,phy-am654-serdes.txt 
>> b/Documentation/devicetree/bindings/phy/ti,phy-am654-serdes.txt
>> new file mode 100644
>> index ..25a9206147ad
>> --- /dev/null
>> +++ b/Documentation/devicetree/bindings/phy/ti,phy-am654-serdes.txt
>> @@ -0,0 +1,81 @@
>> +TI AM654 SERDES
>> +
>> +Required properties:
>> + - compatible: Should be "ti,phy-am654-serdes"
>> + - reg : Address and length of the register set for the device.
>> + - reg-names: Should be "serdes" which corresponds to the register space
>> +populated in "reg".
> 
> *-names is kind of pointless with only 1 entry.

okay. I'll drop this.
> 
>> + - #phy-cells: determine the number of cells that should be given in the
>> +phandle while referencing this phy. Should be "2". The 1st cell
>> +corresponds to the phy type (should be one of the types specified in
>> +include/dt-bindings/phy/phy.h) and the 2nd cell should be the serdes
>> +lane function.
>> +If SERDES0 is referenced 2nd cell should be:
>> +0 - USB3
>> +1 - PCIe0 Lane0
>> +2 - ICSS2 SGMII Lane0
>> +If SERDES1 is referenced 2nd cell should be:
>> +0 - PCIe1 Lane0
>> +1 - PCIe0 Lane1
>> +2 - ICSS2 SGMII Lane1
>> + - clocks: List of clock-specifiers representing the input to the SERDES.
>> +Should have 3 items representing the left input clock, external
>> +reference clock and right input clock in that order.
>> + - clock-output-names: List of clock names for each of the clock outputs of
>> +SERDES. Should have 3 items for CMU reference clock,
>> +left output clock and right output clock in that order.
>> + - assigned-clocks: As defined in
>> +Documentation/devicetree/bindings/clock/clock-bindings.txt
>> + - assigned-clock-parents: As defined in
>> +Documentation/devicetree/bindings/clock/clock-bindings.txt
>> + - #clock-cells: Should be <1> to choose between the 3 output clocks.
>> +Defined in Documentation/devicetree/bindings/clock/clock-bindings.txt
>> +
>> +   The following macros are defined in dt-bindings/phy/phy-am654-serdes.h
>> +   for selecting the correct reference clock. This can be used while
>> +   specifying the clocks created by SERDES.
>> +=> AM654_SERDES_CMU_REFCLK
>> +=> AM654_SERDES_LO_REFCLK
>> +=> AM654_SERDES_RO_REFCLK
>> +
>> + - mux-controls: phandle to the multiplexer
> 
> What does this mux?

The serdes is muxed between USB, PCIe and SGMII. The mux-controls will help to
select the SERDES function based on the board configuration.
> 
>> +
>> +Example:
>> +
>> +Example for SERDES0 is given below. It has 3 clock inputs;
>> +left input reference clock as indicated by <&k3_clks 153 4>, external
>> +reference clock as indicated by <&k3_clks 153 1> and right input
>> +reference clock as indicated by <&serdes1 AM654_SERDES_LO_REFCLK>. (The
>> +right input of SERDES0 is connected to the left output of SERDES1).
>> +
>> +SERDES0 registers 3 clock outputs as indicated in clock-output-names. The
>> +first refers to the CMU reference clock, second refers to the left output
>> +reference clock and the third refers to the right output reference clock.
>> +
>> +The assigned-clocks and assigned-clock-parents is used here to set the
>> +parent of left input reference clock to MAINHSDIV_CLKOUT4 and parent of
>> +CMU reference clock to left input reference clock.
>> +
>> +serdes0: serdes@90

RE: [PATCHv4 16/28] PCI: mobiveil: refactor Mobiveil PCIe Host Bridge IP driver

2019-03-28 Thread Z.q. Hou
Hi Lorenzo,

Thanks a lot for your comments!

> -Original Message-
> From: Lorenzo Pieralisi 
> Sent: 2019年3月29日 0:09
> To: Z.q. Hou 
> Cc: linux-...@vger.kernel.org; linux-arm-ker...@lists.infradead.org;
> devicet...@vger.kernel.org; linux-kernel@vger.kernel.org;
> bhelg...@google.com; robh...@kernel.org; mark.rutl...@arm.com;
> l.subrahma...@mobiveil.co.in; shawn...@kernel.org; Leo Li
> ; catalin.mari...@arm.com; will.dea...@arm.com;
> Mingkai Hu ; M.h. Lian ;
> Xiaowei Bao 
> Subject: Re: [PATCHv4 16/28] PCI: mobiveil: refactor Mobiveil PCIe Host
> Bridge IP driver
> 
> On Thu, Mar 28, 2019 at 02:09:55AM +, Z.q. Hou wrote:
> 
> [...]
> 
> > > > > On Mon, Mar 11, 2019 at 09:32:04AM +, Z.q. Hou wrote:
> > > > > > From: Hou Zhiqiang 
> > > > > >
> > > > > > > As the Mobiveil PCIe controller supports RC & EP dual mode, and to
> > > > > > > make it easier for platforms that integrate the Mobiveil PCIe IP
> > > > > > > to add their drivers, this patch moves the Mobiveil driver to a
> > > > > > > new directory 'drivers/pci/controller/mobiveil' and refactors it
> > > > > > > according to the abstraction of RC & EP (the EP driver will be
> > > > > > > added later).
> > > > >
> > > > > > I do not want to create a subdirectory for every controller that can
> > > > > > work in RC&EP so drop this patch, more so given that it will be
> > > > > > required "later", we will create a directory when and if we actually
> > > > > > have to.
> > > >
> > > > > Please don't drop this patch, Xiaowei Bao has sent v1 of the EP mode
> > > > > driver, which depends on this patch.
> > >
> > > I understand but he will have to rebase that code anyway, I do
> > > not want to create a directory for mobiveil before I see other
> > > IPs built on it, we did not do it for the cadence and rockchip
> > > controllers either. So you will have to drop this patch and
> > > rework it.
> >
> > It's different from the cadence and rockchip cases: they only have 1
> > driver, so they do not need a directory, while we have NXP's
> > LX2160 platform and Mobiveil's FPGA platform, which both use Mobiveil's
> > GPEX IP. So we already have 2 drivers, and I think it is reasonable to
> > collect them into a new directory.
> 
> Wrap your messages to 80 characters per line. OK, please follow
> Bjorn's guidelines when reposting and split the series as
> discussed.

OK, I will.

Thanks,
Zhiqiang

> Thanks,
> Lorenzo
> 
> > Thanks,
> > Zhiqiang
> >
> > > Thanks,
> > > Lorenzo
> > >
> > > > > Thanks,
> > > > > Lorenzo
> > > > >
> > > > > > Signed-off-by: Hou Zhiqiang 
> > > > > > Reviewed-by: Minghuan Lian 
> > > > > > Reviewed-by: Subrahmanya Lingappa
> 
> > > > > > ---
> > > > > > V4:
> > > > > >  - no change
> > > > > >
> > > > > >  MAINTAINERS   |   2 +-
> > > > > >  drivers/pci/controller/Kconfig|  11 +-
> > > > > >  drivers/pci/controller/Makefile   |   2 +-
> > > > > >  drivers/pci/controller/mobiveil/Kconfig   |  24 +
> > > > > >  drivers/pci/controller/mobiveil/Makefile  |   4 +
> > > > > > >  .../pcie-mobiveil-host.c} | 528 +++---
> > > > > > >  .../controller/mobiveil/pcie-mobiveil-plat.c  |  54 ++
> > > > > > >  .../pci/controller/mobiveil/pcie-mobiveil.c   | 228 
> > > > > > >  .../pci/controller/mobiveil/pcie-mobiveil.h   | 187 +++
> > > > > > >  9 files changed, 587 insertions(+), 453 deletions(-)
> > > > > > >  create mode 100644 drivers/pci/controller/mobiveil/Kconfig
> > > > > > >  create mode 100644 drivers/pci/controller/mobiveil/Makefile
> > > > > > >  rename drivers/pci/controller/{pcie-mobiveil.c => mobiveil/pcie-mobiveil-host.c} (55%)
> > > > > > >  create mode 100644 drivers/pci/controller/mobiveil/pcie-mobiveil-plat.c
> > > > > > >  create mode 100644 drivers/pci/controller/mobiveil/pcie-mobiveil.c
> > > > > > >  create mode 100644 drivers/pci/controller/mobiveil/pcie-mobiveil.h
> > > > > >
> > > > > > > diff --git a/MAINTAINERS b/MAINTAINERS
> > > > > > > index 1e64279f338a..1013e74b14f2 100644
> > > > > > > --- a/MAINTAINERS
> > > > > > > +++ b/MAINTAINERS
> > > > > > > @@ -11877,7 +11877,7 @@ M:  Subrahmanya Lingappa
> > > > > > >  L: linux-...@vger.kernel.org
> > > > > > >  S: Supported
> > > > > > >  F: Documentation/devicetree/bindings/pci/mobiveil-pcie.txt
> > > > > > > -F: drivers/pci/controller/pcie-mobiveil.c
> > > > > > > +F: drivers/pci/controller/mobiveil/pcie-mobiveil*
> > > > > > >
> > > > > > >  PCI DRIVER FOR MVEBU (Marvell Armada 370 and Armada XP SOC support)
> > > > > > >  M: Thomas Petazzoni 
> > > > > > > diff --git a/drivers/pci/controller/Kconfig b/drivers/pci/controller/Kconfig
> > > > > > > index 6671946dbf66..0e981ed00a75 100644
> > > > > > > --- a/drivers/pci/controller/Kconfig
> > > > > > > +++ b/drivers/pci/controller/Kconfig
> > > > > > @@ -241,16 +241,6 @@ config PCIE_MEDIATEK
> > > > > >   Say Y here if you want to enable PCIe controller support on
> > > > > >   MediaTek SoCs.
> > > > > >
> > > > > > -config PCIE_MOBI

Re: [PATCH v2] HID: intel-ish-hid: ISH firmware loader client driver

2019-03-28 Thread Nick Crews
This is so close! There are just one or two tiny things.

On Thu, Mar 28, 2019 at 1:20 PM Rushikesh S Kadam
 wrote:
>
> This driver adds support for loading Intel Integrated
> Sensor Hub (ISH) firmware from host file system to ISH
> SRAM and start execution.
>
> At power-on, the ISH subsystem shall boot to an interim
> Shim loader-firmware, which shall expose an ISHTP loader
> device.
>
> The driver implements an ISHTP client that communicates
> with the Shim ISHTP loader device over the intel-ish-hid
> stack, to download the main ISH firmware.
>
> Signed-off-by: Rushikesh S Kadam 
> ---
> The patches are baselined to hid git tree, branch for-5.2/ish
> https://git.kernel.org/pub/scm/linux/kernel/git/hid/hid.git/log/?h=for-5.2/ish
>
> The v2 revision primarily address review comments received on
> the v1 patch.
>
> v2
>  - Change loader_cl_send() so that the calling function
>    shall allocate and pass the buffer to be used for
>    receiving firmware response data. Corresponding changes
>    in calling function and process_recv().
>  - Introduced struct response_info to encapsulate and pass
>    data from the process_recv() callback to the
>    calling function loader_cl_send().
>  - Keep count of host firmware load retries, and fail after
>    3 unsuccessful attempts.
>  - Dropped report_bad_packets() function previously used for
>    keeping count of bad packets.
>  - Inlined loader_ish_hw_reset()'s functionality
>
> v1
>  - Initial version.
>
>  drivers/hid/Makefile|1 +
>  drivers/hid/intel-ish-hid/Kconfig   |   15 +
>  drivers/hid/intel-ish-hid/Makefile  |3 +
>  drivers/hid/intel-ish-hid/ishtp-fw-loader.c | 1084 +++
>  4 files changed, 1103 insertions(+)
>  create mode 100644 drivers/hid/intel-ish-hid/ishtp-fw-loader.c
>
> diff --git a/drivers/hid/Makefile b/drivers/hid/Makefile
> index 170163b..d8d393e 100644
> --- a/drivers/hid/Makefile
> +++ b/drivers/hid/Makefile
> @@ -134,3 +134,4 @@ obj-$(CONFIG_USB_KBD)   += usbhid/
>  obj-$(CONFIG_I2C_HID)  += i2c-hid/
>
>  obj-$(CONFIG_INTEL_ISH_HID)+= intel-ish-hid/
> +obj-$(INTEL_ISH_FIRMWARE_DOWNLOADER)   += intel-ish-hid/
> diff --git a/drivers/hid/intel-ish-hid/Kconfig b/drivers/hid/intel-ish-hid/Kconfig
> index 519e4c8..786adbc 100644
> --- a/drivers/hid/intel-ish-hid/Kconfig
> +++ b/drivers/hid/intel-ish-hid/Kconfig
> @@ -14,4 +14,19 @@ config INTEL_ISH_HID
>   Broxton and Kaby Lake.
>
>   Say Y here if you want to support Intel ISH. If unsure, say N.
> +
> +config INTEL_ISH_FIRMWARE_DOWNLOADER
> +   tristate "Host Firmware Load feature for Intel ISH"
> +   depends on INTEL_ISH_HID
> +   depends on X86
> +   help
> + The Integrated Sensor Hub (ISH) enables the kernel to offload
> + sensor polling and algorithm processing to a dedicated low power
> + processor in the chipset.
> +
> + The Host Firmware Load feature adds support to load the ISH
> + firmware from host file system at boot.
> +
> + Say M here if you want to support Host Firmware Loading feature
> + for Intel ISH. If unsure, say N.
>  endmenu
> diff --git a/drivers/hid/intel-ish-hid/Makefile b/drivers/hid/intel-ish-hid/Makefile
> index 825b70a..2de97e4 100644
> --- a/drivers/hid/intel-ish-hid/Makefile
> +++ b/drivers/hid/intel-ish-hid/Makefile
> @@ -20,4 +20,7 @@ obj-$(CONFIG_INTEL_ISH_HID) += intel-ishtp-hid.o
>  intel-ishtp-hid-objs := ishtp-hid.o
>  intel-ishtp-hid-objs += ishtp-hid-client.o
>
> +obj-$(CONFIG_INTEL_ISH_FIRMWARE_DOWNLOADER) += intel-ishtp-loader.o
> +intel-ishtp-loader-objs += ishtp-fw-loader.o
> +
>  ccflags-y += -Idrivers/hid/intel-ish-hid/ishtp
> diff --git a/drivers/hid/intel-ish-hid/ishtp-fw-loader.c b/drivers/hid/intel-ish-hid/ishtp-fw-loader.c
> new file mode 100644
> index 000..8685fa6
> --- /dev/null
> +++ b/drivers/hid/intel-ish-hid/ishtp-fw-loader.c
> @@ -0,0 +1,1084 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * ISH-TP client driver for ISH firmware loading
> + *
> + * Copyright (c) 2019, Intel Corporation.
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +/* ISH TX/RX ring buffer pool size */
> +#define LOADER_CL_RX_RING_SIZE 1
> +#define LOADER_CL_TX_RING_SIZE 1
> +
> +/*
> + * ISH Shim firmware loader reserves 4 Kb buffer in SRAM. The buffer is
> + * used to temporarily hold the data transferred from host to Shim
> + * firmware loader. Reason for the odd size of 3968 bytes? Each IPC
> + * transfer is 128 bytes (= 4 bytes header + 124 bytes payload). So the
> + * 4 Kb buffer can hold maximum of 32 IPC transfers, which means we can
> + * have a max payload of 3968 bytes (= 32 x 124 payload).
> + */
> +#define LOADER_SHIM_IPC_BUF_SIZE   3968
> +
> +/**
> + * enum ish_loader_commands -  ISH loader host commands.
> + * LOADER_CMD_XFER_QUERY   Query the Shim firmwar
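
The buffer size arithmetic in the patch comment above can be checked with a
few lines of C. This is a sketch with hypothetical SKETCH_* names; only the
numbers 4096, 128 and 4 are taken from the comment, and the code is not part
of the driver.

```c
#include <assert.h>

enum {
	SKETCH_SHIM_BUF_SIZE = 4096,	/* SRAM buffer reserved by the Shim loader */
	SKETCH_IPC_XFER_SIZE = 128,	/* bytes per IPC transfer */
	SKETCH_IPC_HDR_SIZE  = 4,	/* header bytes per IPC transfer */
};

/* Largest payload that fits when the buffer is filled with whole transfers. */
unsigned int sketch_max_payload(void)
{
	unsigned int payload_per_xfer = SKETCH_IPC_XFER_SIZE - SKETCH_IPC_HDR_SIZE;
	unsigned int xfers = SKETCH_SHIM_BUF_SIZE / SKETCH_IPC_XFER_SIZE;

	/* The buffer is assumed to hold a whole number of transfers. */
	assert(SKETCH_SHIM_BUF_SIZE % SKETCH_IPC_XFER_SIZE == 0);
	return xfers * payload_per_xfer;
}
```

With 32 transfers of 124 payload bytes each, the function returns 3968, which
matches LOADER_SHIM_IPC_BUF_SIZE in the patch.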

Re: [PATCH 1/2] kernel.h: use parentheses around argument in u64_to_user_ptr()

2019-03-28 Thread Mukesh Ojha



On 3/29/2019 2:53 AM, Jann Horn wrote:

Use parentheses around uses of the argument in u64_to_user_ptr() to ensure
that the cast doesn't apply to part of the argument.

There are existing uses of the macro of the form `u64_to_user_ptr(A + B)`,
which expands to `(void __user *)(uintptr_t)A + B` (the cast applies to the
first operand of the addition, the addition is a pointer addition). This
happens to still work as intended, the semantic difference doesn't cause a
difference in behavior.
But I want to use u64_to_user_ptr() with a ternary operator in the
argument, like so: `u64_to_user_ptr(A ? B : C)`. This currently doesn't
work as intended.

Fixes: f09174c501f8 ("x86: add user_atomic_cmpxchg_inatomic at uaccess.h")
Signed-off-by: Jann Horn 



Looks good to me.
Reviewed-by: Mukesh Ojha 

-Mukesh


---
Can we take this patch through the x86 tree with the following one, or
do we need to get this one through akpm's tree first?

  include/linux/kernel.h | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/linux/kernel.h b/include/linux/kernel.h
index 34a5036debd3..2d14e21c16c0 100644
--- a/include/linux/kernel.h
+++ b/include/linux/kernel.h
@@ -47,8 +47,8 @@
  
  #define u64_to_user_ptr(x) (		\
  {					\
-	typecheck(u64, x);		\
-	(void __user *)(uintptr_t)x;	\
+	typecheck(u64, (x));		\
+	(void __user *)(uintptr_t)(x);	\
  }					\
  )
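
The precedence problem described in the commit message can be demonstrated in
plain userspace C. The sketch below is an analogy, not the kernel macro: it
uses hypothetical SKETCH_CAST_* macros with a narrowing uint8_t cast instead
of the pointer cast, so that the effect of the missing parentheses shows up
in the computed value. In the kernel macro the A + B case happens to give the
same result, as the commit message notes; the ternary case does not.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins: _OLD leaves the argument bare, _NEW wraps it. */
#define SKETCH_CAST_OLD(x) (uint8_t)x
#define SKETCH_CAST_NEW(x) (uint8_t)(x)

/* Expands to (uint8_t)a + b: only a is cast. */
int sum_old(int a, int b) { return SKETCH_CAST_OLD(a + b); }

/* Expands to (uint8_t)(a + b): the whole sum is cast. */
int sum_new(int a, int b) { return SKETCH_CAST_NEW(a + b); }

/* Expands to (uint8_t)c ? b : d: the cast applies to the condition only. */
int pick_old(int c, int b, int d) { return SKETCH_CAST_OLD(c ? b : d); }

/* Expands to (uint8_t)(c ? b : d): the cast applies to the selected value. */
int pick_new(int c, int b, int d) { return SKETCH_CAST_NEW(c ? b : d); }

void sketch_demo(void)
{
	assert(sum_old(200, 100) == 300);	/* cast hit only the first operand */
	assert(sum_new(200, 100) == 44);	/* 300 truncated to 8 bits */
	assert(pick_old(256, 1, 2) == 2);	/* condition truncated to 0 */
	assert(pick_new(256, 1, 2) == 1);	/* condition evaluated as written */
}
```

Calling sketch_demo() passes all four checks, which shows that the two macro
forms only agree when the argument is a single primary expression.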
  


Re: [PATCHv2] x86/boot/KASLR: skip the specified crashkernel reserved region

2019-03-28 Thread Pingfan Liu
On Fri, Mar 22, 2019 at 4:34 PM Baoquan He  wrote:
>
> On 03/22/19 at 03:52pm, Baoquan He wrote:
> > On 03/22/19 at 03:43pm, Pingfan Liu wrote:
> > > > > +/* parse crashkernel=x@y option */
> > > > > +static void mem_avoid_crashkernel_simple(char *option)
> > > >
> > > > Chao mentioned this before, but I want to ask again: why does it
> > > > have to be xxx_simple()?
> > > >
> > > It seems I already replied to Chao's question in another email. The
> > > naming follows the function parse_crashkernel_simple(), as the notes above
> >
> >
> > Sorry, I don't get.  typo?
>
> OK, I misunderstood it. We do have parse_crashkernel_simple() to handle
> the crashkernel=size[@offset] case, to differentiate it from other, more
> complicated cases, like crashkernel=size,[high|low].
>
> Then I am fine with this naming. Sorry about the noise.
>
> By the way, do you think we should take care of this case:
> crashkernel=<range1>:<size1>[,<range2>:<size2>,...][@offset]
>
> It can also specify @offset. Not sure if it's too complicated; you may
> want to investigate.
>
In this case, the kernel needs the total memory size info. So
process_e820_entries() or process_efi_entries() would have to be called
twice, once before handle_mem_options(), so that crashkernel can evaluate
the reserved size. It is doable, but what is your opinion about the
extra complexity?

Thanks,
Pingfan
[...]
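
For reference, the crashkernel=size[@offset] form that
parse_crashkernel_simple() handles can be sketched in userspace C. This is
only an illustration under stated assumptions, not the kernel code:
parse_size() and parse_simple() are hypothetical names, and parse_size()
merely approximates the kernel's memparse() for the K, M and G suffixes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for memparse(): parse a number with an optional
 * K/M/G suffix and leave *end pointing just past what was consumed.
 */
unsigned long long parse_size(const char *s, char **end)
{
	unsigned long long v = strtoull(s, end, 0);

	switch (**end) {
	case 'G': case 'g':
		v <<= 10;
		/* fall through */
	case 'M': case 'm':
		v <<= 10;
		/* fall through */
	case 'K': case 'k':
		v <<= 10;
		(*end)++;
		break;
	default:
		break;
	}
	return v;
}

/*
 * Parse "size[@offset]". Returns 0 on success and -1 on malformed input.
 * *offset is set to 0 when no @offset part is present.
 */
int parse_simple(const char *arg, unsigned long long *size,
		 unsigned long long *offset)
{
	char *cur;

	assert(arg != NULL && size != NULL && offset != NULL);

	*size = parse_size(arg, &cur);
	if (cur == arg)
		return -1;	/* no digits at all */

	*offset = 0;
	if (*cur == '@') {
		const char *start = cur + 1;

		*offset = parse_size(start, &cur);
		if (cur == start)
			return -1;	/* '@' without an offset */
	}

	/* Anything left over (for example ",high") is not the simple form. */
	return *cur == '\0' ? 0 : -1;
}
```

With this sketch, "256M@16M" yields a size of 256 MiB and an offset of
16 MiB, while "256M,high" is rejected because it belongs to the more
complicated syntax discussed in the thread.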


Re: Page-allocation-failure

2019-03-28 Thread Pankaj Suryawanshi



From: Pankaj Suryawanshi
Sent: 28 March 2019 13:17
To: linux-kernel@vger.kernel.org; linux...@kvack.org
Subject: Re: Page-allocation-failure



From: Pankaj Suryawanshi
Sent: 28 March 2019 13:12
To: linux-kernel@vger.kernel.org; linux...@kvack.org
Subject: Page-allocation-failure

Hello ,

I am facing issue related to page allocation failure.

If anyone is familiar with this issue, please let me know what causes it
and how to solve or debug it.

Failure logs -:

-
[   45.073877] kswapd0: page allocation failure: order:0, mode:0x1080020(GFP_ATOMIC), nodemask=(null)
[   45.073897] CPU: 1 PID: 716 Comm: kswapd0 Tainted: P   O 4.14.65 #3
[   45.073899] Hardware name: Android (Flattened Device Tree)
[   45.073901] Backtrace:
[   45.073915] [<8020dbec>] (dump_backtrace) from [<8020ded0>] (show_stack+0x18/0x1c)
[   45.073920]  r6:600f0093 r5:8141bd5c r4: r3:3abdc664
[   45.073928] [<8020deb8>] (show_stack) from [<80ba5e30>] (dump_stack+0x94/0xa8)
[   45.073936] [<80ba5d9c>] (dump_stack) from [<80350610>] (warn_alloc+0xe0/0x194)
[   45.073940]  r6:80e090cc r5: r4:81216588 r3:3abdc664
[   45.073946] [<80350534>] (warn_alloc) from [<803514e0>] (__alloc_pages_nodemask+0xd70/0x124c)
[   45.073949]  r3: r2:80e090cc
[   45.073952]  r6:0001 r5: r4:8121696c
[   45.073959] [<80350770>] (__alloc_pages_nodemask) from [<803a6c20>] (allocate_slab+0x364/0x3e4)
[   45.073964]  r10:0080 r9: r8:01081220 r7: r6: r5:01080020
[   45.073966]  r4:bd00d180
[   45.073971] [<803a68bc>] (allocate_slab) from [<803a8c98>] (___slab_alloc.constprop.6+0x420/0x4b8)
[   45.073977]  r10: r9: r8:bd00d180 r7:01080020 r6:81216588 r5:be586360
[   45.073978]  r4:
[   45.073984] [<803a8878>] (___slab_alloc.constprop.6) from [<803a8d54>] (__slab_alloc.constprop.5+0x24/0x2c)
[   45.073989]  r10:0004e299 r9:bd00d180 r8:01080020 r7:8147b954 r6:bd6e5a68 r5:
[   45.073991]  r4:600f0093
[   45.073996] [<803a8d30>] (__slab_alloc.constprop.5) from [<803a9058>] (kmem_cache_alloc+0x16c/0x2d0)
[   45.073999]  r4:bd00d180 r3:be586360
---


[   44.966861] Mem-Info:
[   44.966872] active_anon:106078 inactive_anon:142 isolated_anon:0
[   44.966872]  active_file:39117 inactive_file:34254 isolated_file:101
[   44.966872]  unevictable:597 dirty:157 writeback:0 unstable:0
[   44.966872]  slab_reclaimable:4967 slab_unreclaimable:9288
[   44.966872]  mapped:60971 shmem:185 pagetables:5905 bounce:0
[   44.966872]  free:2363 free_pcp:334 free_cma:0
[   44.966879] Node 0 active_anon:424312kB inactive_anon:568kB active_file:156468kB inactive_file:137016kB unevictable:2388kB isolated(anon):0kB isolated(file):404kB mapped:243884kB dirty:628kB writeback:0kB shmem:740kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no
[   44.966889] DMA free:9348kB min:3772kB low:15684kB high:16624kB active_anon:420592kB inactive_anon:284kB active_file:155116kB inactive_file:135724kB unevictable:1592kB writepending:628kB present:928768kB managed:892508kB mlocked:1592kB kernel_stack:9304kB pagetables:23440kB bounce:0kB free_pcp:1336kB local_pcp:672kB free_cma:0kB
[   44.966890] lowmem_reserve[]: 0 0 8 8
[   44.966903] HighMem free:104kB min:128kB low:236kB high:244kB active_anon:2632kB inactive_anon:284kB active_file:1912kB inactive_file:1732kB unevictable:796kB writepending:0kB present:1056768kB managed:8192kB mlocked:796kB kernel_stack:0kB pagetables:180kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[   44.966904] lowmem_reserve[]: 0 0 0 0
[   44.966909] DMA: 148*4kB (UMH) 52*8kB (UMH) 30*16kB (UMH) 19*32kB (MH) 7*64kB (MH) 4*128kB (H) 5*256kB (H) 0*512kB 0*1024kB 1*2048kB (H) 1*4096kB (H) 0*8192kB 0*16384kB = 10480kB
[   44.966936] HighMem: 2*4kB (UM) 2*8kB (UM) 1*16kB (U) 0*32kB 1*64kB (U) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB 0*8192kB 0*16384kB = 104kB
[   44.966958] 74134 total pagecache pages
[   44.966961] 0 pages in swap cache
[   44.966963] Swap cache stats: add 0, delete 0, find 0/0
[   44.966965] Free swap  = 0kB
[   44.966966] Total swap = 0kB
[   44.966967] 496384 pages RAM
[   44.966969] 264192 pages HighMem/MovableOnly
[   44.966972] 271209 pages reserved
[   44.966975] 262144 pages cma reserved

/proc/sys/vm/min_free_kbytes = 3774
Would increasing this value solve the issue?

Regards,
Pankaj

Re: [PATCH 4/4] leds: lm3532: Introduce the lm3532 LED driver

2019-03-28 Thread Sebastian Reichel
Hi,

On Mon, Mar 25, 2019 at 11:01:18AM -0500, Dan Murphy wrote:
> On 3/25/19 9:54 AM, Tony Lindgren wrote:
> > * Dan Murphy  [190325 12:36]:
> >> On 3/22/19 5:16 PM, Tony Lindgren wrote:
> >>> I can control the backlight brightness just fine via /sys, and
> >>> backlight shows up as the trigger in /sys/class/leds/lm3532:backlight,
> >>> but /sys/class/backlight is empty and looks like drm can't find it.
> >>>
> >>> Do I need to enable some additional driver(s) to get this to work
> >>> with the drm driver?
> >>>
> >>
> >> Can you dump or point to the defconfig?
> > 
> > This is just with the omap2plus_defconfig as in the droid4-pending-v5.0
> > test branch below [0]. That branch has Sebastian's drm patches. The
> > branch also has the older ti-lmu patches that I reverted for testing
> > before applying your new series and enabling it in .config.
> > 
> > The drm driver (drivers/gpu/drm/omapdrm/displays/panel-dsi-cm.c) just
> > does:
> > 
> > backlight = of_parse_phandle(node, "backlight", 0);
> > ...
> > 
> > That should still work the same, right?
> 
> Yes it should still work the same.
> I did not change the node name.
> So the DRM driver should find the node.

This will not work, since the next line tries to get it as a
backlight device, but it's an LED device instead:

of_find_backlight_by_node(backlight);

I suppose the backlight device could be instantiated on top
of the LED device somehow.

(sorry for slow responses; I'm quite busy right now)

-- Sebastian



Re: [PATCH 2/2] ARM: dts: Add support for ZII i.MX7 RPU2 board

2019-03-28 Thread Andrey Smirnov
On Tue, Mar 26, 2019 at 11:41 PM Andrey Smirnov
 wrote:
>
> Add support for ZII's i.MX7 based Remote Peripheral Unit 2 (RPU2)
> board.
>
> Signed-off-by: Andrey Smirnov 
> Cc: Shawn Guo 
> Cc: Chris Healy 
> Cc: Andrew Lunn 
> Cc: Fabio Estevam 
> Cc: Rob Herring 
> Cc: linux-kernel@vger.kernel.org
> Cc: devicet...@vger.kernel.org
> ---
>  arch/arm/boot/dts/Makefile   |   1 +
>  arch/arm/boot/dts/imx7d-zii-rpu2.dts | 936 +++
>  2 files changed, 937 insertions(+)
>  create mode 100644 arch/arm/boot/dts/imx7d-zii-rpu2.dts
>
> diff --git a/arch/arm/boot/dts/Makefile b/arch/arm/boot/dts/Makefile
> index efefce8efa05..687222907154 100644
> --- a/arch/arm/boot/dts/Makefile
> +++ b/arch/arm/boot/dts/Makefile
> @@ -586,6 +586,7 @@ dtb-$(CONFIG_SOC_IMX7D) += \
> imx7d-sdb.dtb \
> imx7d-sdb-reva.dtb \
> imx7d-sdb-sht11.dtb \
> +   imx7d-zii-rpu2.dtb \
> imx7s-colibri-eval-v3.dtb \
> imx7s-warp.dtb
>  dtb-$(CONFIG_SOC_IMX7ULP) += \
> diff --git a/arch/arm/boot/dts/imx7d-zii-rpu2.dts 
> b/arch/arm/boot/dts/imx7d-zii-rpu2.dts
> new file mode 100644
> index ..6f36c7d9d214
> --- /dev/null
> +++ b/arch/arm/boot/dts/imx7d-zii-rpu2.dts
> @@ -0,0 +1,936 @@
> +// SPDX-License-Identifier: (GPL-2.0 OR MIT)
> +
> +/*
> + * Device tree file for ZII's RPU2 board
> + *
> + * RPU - Remote Peripheral Unit
> + *
> + * Copyright (C) 2019 Zodiac Inflight Innovations
> + */
> +
> +/dts-v1/;
> +#include "imx7d.dtsi"
> +
> +/ {
> +   model = "ZII RPU2 Board";
> +   compatible = "zii,imx7d-rpu2", "fsl,imx7d";
> +
> +   chosen {
> +   stdout-path = &uart1;
> +   };
> +
> +   gpio-leds {
> +   compatible = "gpio-leds";
> +   pinctrl-0 = <&pinctrl_leds_debug>;
> +   pinctrl-names = "default";
> +
> +   debug {
> +   label = "zii:green:debug1";
> +   gpios = <&gpio2 8 GPIO_ACTIVE_HIGH>;
> +   linux,default-trigger = "heartbeat";
> +   };
> +   };
> +
> +   reg_can1_stby: regulator-can1-stby {
> +   compatible = "regulator-fixed";
> +   pinctrl-names = "default";
> +   pinctrl-0 = <&pinctrl_flexcan1_stby>;
> +   regulator-name = "can1-3v3";
> +   regulator-min-microvolt = <3300000>;
> +   regulator-max-microvolt = <3300000>;
> +   gpio = <&gpio1 9 GPIO_ACTIVE_HIGH>;
> +   enable-active-high;
> +   };
> +
> +   reg_can2_stby: regulator-can2-stby {
> +   compatible = "regulator-fixed";
> +   pinctrl-names = "default";
> +   pinctrl-0 = <&pinctrl_flexcan2_stby>;
> +   regulator-name = "can2-3v3";
> +   regulator-min-microvolt = <3300000>;
> +   regulator-max-microvolt = <3300000>;
> +   gpio = <&gpio1 8 GPIO_ACTIVE_HIGH>;
> +   enable-active-high;
> +   };
> +
> +   reg_vref_1v8: regulator-vref-1v8 {
> +   compatible = "regulator-fixed";
> +   regulator-name = "vref-1v8";
> +   regulator-min-microvolt = <1800000>;
> +   regulator-max-microvolt = <1800000>;
> +   regulator-always-on;
> +   };
> +
> +   reg_3p3v: regulator-3p3v {
> +   compatible = "regulator-fixed";
> +   regulator-name = "GEN_3V3";
> +   regulator-min-microvolt = <3300000>;
> +   regulator-max-microvolt = <3300000>;
> +   regulator-always-on;
> +   };
> +
> +   reg_5p0v_main: regulator-5p0v-main {
> +   compatible = "regulator-fixed";
> +   regulator-name = "5V_MAIN";
> +   regulator-min-microvolt = <5000000>;
> +   regulator-max-microvolt = <5000000>;
> +   regulator-always-on;
> +   };
> +
> +   sound1 {
> +   compatible = "simple-audio-card";
> +   simple-audio-card,name = "Audio Output 1";
> +   simple-audio-card,format = "i2s";
> +   simple-audio-card,bitclock-master = <&sound1_codec>;
> +   simple-audio-card,frame-master = <&sound1_codec>;
> +   simple-audio-card,widgets =
> +   "Headphone", "Headphone Jack";
> +   simple-audio-card,routing =
> +   "Headphone Jack", "HPLEFT",
> +   "Headphone Jack", "HPRIGHT",
> +   "LEFTIN", "HPL",
> +   "RIGHTIN", "HPR";
> +   simple-audio-card,aux-devs = <&hpa1>;
> +
> +   simple-audio-card,cpu {
> +   sound-dai = <&sai1>;
> +   };
> +
> +   sound1_codec: simple-audio-card,codec {
> +   sound-dai = <&codec1>;
> +   clocks = <&cs2000>;
> +   };
> +   };
> +
> +   sound2 {
> +

Re: [PATCH 1/2] platform/x86: intel_pmc_core: Convert to a platform_driver

2019-03-28 Thread Rajat Jain
Hi Srinivas,


On Thu, Mar 28, 2019 at 8:41 PM Srinivas Pandruvada
 wrote:
>
> On Mon, 2019-03-25 at 18:41 -0700, Rajat Jain wrote:
> > Hi Rajneesh,
> >
> >
> > On Mon, Mar 25, 2019 at 3:23 AM Bhardwaj, Rajneesh
> >  wrote:
> > >
> > > Hi Rajat
> > >
> > > On 23-Mar-19 6:00 AM, Rajat Jain wrote:
> > > > Hi Rajneesh,
> > > >
> > > >
> > > >
> > > > On Fri, Mar 22, 2019 at 12:56 PM Bhardwaj, Rajneesh
> > > >  wrote:
> > > > > Some suggestions below
> > > > >
> > > > > On 18-Mar-19 8:36 PM, Rajat Jain wrote:
> > > > >
> > > > > On Sat, Mar 16, 2019 at 1:30 AM Rajneesh Bhardwaj
> > > > >  wrote:
> > > > >
> > > > > On Wed, Mar 13, 2019 at 03:21:23PM -0700, Rajat Jain wrote:
> > > > >
> > > > > Convert the intel_pmc_core driver to a platform driver. There
> > > > > is no
> > > > > functional change. Some code that tries to determine what kind
> > > > > of
> > > > > CPU this is, has been moved code is moved from pmc_core_probe()
> > > > > to
> > > > >
> > > > > Possible typo here.
> > > > >
> > > > > Ummm, you mean grammar error I guess? Sure, I will rephrase.
> > > > >
> > > > > pmc_core_init().
> > > > >
> > > > > Signed-off-by: Rajat Jain 
> > > > >
> > > > > Thanks for sending this. This is certainly useful to support
> > > > > suspend-resume
> > > > > functionality for this driver which is otherwise only possible
> > > > > with PM
> > > > > notifiers otherwise and that is not desirable. Initially this
> > > > > was a PCI
> > > > > driver and after design discussion it was converted to module.
> > > > > I would like
> > > > > to consult Andy and Srinivas for their opinion about binding it
> > > > > to actual
> > > > > platform bus instead of the virtual bus as in its current form.
> > > > > In one of the
> > > > > internal versions, we used a known acpi PNP HID.
> > > > >
> > > > > Sure, if there is an established ACPI PNP HID, then we could
> > > > > bind it
> > > > > using that, on platforms where we are still developing BIOS /
> > > > > coreboot. However, this might not be possible for shipping
> > > > > systems
> > > > > (Kabylake / skylake) where there is no plan to change the BIOS.
> > > > >
> > > > > In one of our internal patches, i had used HID of power engine
> > > > > plugin. IIRC, During my testing it was working on KBL, CNL with
> > > > > UEFI BIOS but i highly recommend testing it.
> > > > >
> > > > > ---8<8<-
> > > > >
> > > > > +static const struct acpi_device_id pmc_acpi_ids[] = {
> > > > >
> > > > > + {"INT33A1", 0}, /* _HID for Intel Power Engine,
> > > > > _CID PNP0D80*/
> > > > >
> > > > > + { }
> > > > >
> > > > >   };
> > > >
> > > > We do not have this device in any of our ACPI tables today. If
> > > > Intel
> > > > can confirm that this is a well known HID to be used for
> > > > attaching
> > > > this driver, we can start putting it on our platform's ACPI going
> > > > forward (Whiskeylake, Cometlake, Cannonlake, Icelake ...). But I
> > > > believe we also need to have this driver attach with the device
> > > > on
> > > > older platforms (Skylake, Kabylake, Amberlake) that are already
> > > > shipping, and running a Non UEFI BIOS (that may not have this HID
> > > > since it is not published).
> > > >
> > > > Currently the intel_pmc_core driver attaches itself to the
> > > > following
> > > > table of CPU families, without regard to whether it has that HID
> > > > in
> > > > the ACPI or not:
> > > >
> > > > static const struct x86_cpu_id intel_pmc_core_ids[] = {
> > > >  INTEL_CPU_FAM6(SKYLAKE_MOBILE, spt_reg_map),
> > > >  INTEL_CPU_FAM6(SKYLAKE_DESKTOP, spt_reg_map),
> > > >  INTEL_CPU_FAM6(KABYLAKE_MOBILE, spt_reg_map),
> > > >  INTEL_CPU_FAM6(KABYLAKE_DESKTOP, spt_reg_map),
> > > >  INTEL_CPU_FAM6(CANNONLAKE_MOBILE, cnp_reg_map),
> > > >  INTEL_CPU_FAM6(ICELAKE_MOBILE, icl_reg_map),
> > > >  {}
> > > > };
> > >
> > > In the past i tried one hybrid approach i.e. PCI and Platform
> > > driver at
> > > the same time. Based on that, i feel that this idea of spilling
> > > probe
> > > like this may not be the best option. The ACPI CID that i suggested
> > > is
> > > available on most Intel Core Platforms that i have worked on and i
> > > can
> > > help you in verifying it with UEFI BIOS if you want. Meanwhile,
> > > please
> > > see this https://patchwork.kernel.org/patch/9806565/ it gives some
> > > background about this ACPI ID and also points to the LPIT spec.
> > >
> > > >
> > > > So to avoid a regression, I suggest that we still maintain the
> > > > above
> > > > table (may be eliminate few entries) and always attach if the CPU
> > > > is
> > > > among the table, and if the CPU is not among the table, use the
> > > > ACPI
> > > > HID to attach. I propose to attach to at least Skylake and
> > > > Kabylake
> > > > systems using the table above, and for Canonlake and Icelake and
> > > > newer, we can rely on BIOS providing the ACPI HID. Of course I do
> > > > not
> > > > know if all non-Google Can

Re: [External] Re: vmscan: Reclaim unevictable pages

2019-03-28 Thread Pankaj Suryawanshi



From: Michal Hocko 
Sent: 26 March 2019 15:06
To: Pankaj Suryawanshi
Cc: Kirill Tkhai; Vlastimil Babka; aneesh.ku...@linux.ibm.com; 
linux-kernel@vger.kernel.org; minc...@kernel.org; linux...@kvack.org; 
khand...@linux.vnet.ibm.com
Subject: Re: [External] Re: vmscan: Reclaim unevictable pages

On Tue 26-03-19 09:16:11, Pankaj Suryawanshi wrote:
>
> 
> From: Michal Hocko 
> Sent: 26 March 2019 14:31
> To: Pankaj Suryawanshi
> Cc: Kirill Tkhai; Vlastimil Babka; aneesh.ku...@linux.ibm.com; 
> linux-kernel@vger.kernel.org; minc...@kernel.org; linux...@kvack.org; 
> khand...@linux.vnet.ibm.com
> Subject: Re: [External] Re: vmscan: Reclaim unevictable pages
>
> [You were asked to use a reasonable quoting several times. This is
> really annoying because it turns the email thread into a complete mess]
>
> [I have already fixed the email client, but I don't know the reason for the
> quoting. Maybe it is an account issue.]

You clearly haven't

> As i said earlier, i am using vanilla kernel 4.14.65.

This got lost in the quoting mess. Can you reproduce with 5.0?
Actually, I am using Android Pie 9.0. Can I replace kernel 4.14.65 with 5.0?
--
Michal Hocko
SUSE Labs
*
 eInfochips Business Disclaimer: This e-mail message and all attachments 
transmitted with it are intended solely for the use of the addressee and may 
contain legally privileged and confidential information. If the reader of this 
message is not the intended recipient, or an employee or agent responsible for 
delivering this message to the intended recipient, you are hereby notified that 
any dissemination, distribution, copying, or other use of this message or its 
attachments is strictly prohibited. If you have received this message in error, 
please notify the sender immediately by replying to this message and please 
delete it from your computer. Any views expressed in this message are those of 
the individual sender unless otherwise stated. Company has taken enough 
precautions to prevent the spread of viruses. However the company accepts no 
liability for any damage caused by any virus transmitted by this email. 
*


Re: [PATCH REBASED] hugetlbfs: fix potential over/underflow setting node specific nr_hugepages

2019-03-28 Thread Naoya Horiguchi
On Thu, Mar 28, 2019 at 03:05:33PM -0700, Mike Kravetz wrote:
> The number of node specific huge pages can be set via a file such as:
> /sys/devices/system/node/node1/hugepages/hugepages-2048kB/nr_hugepages
> When a node specific value is specified, the global number of huge
> pages must also be adjusted.  This adjustment is calculated as the
> specified node specific value + (global value - current node value).
> If the node specific value provided by the user is large enough, this
> calculation could overflow an unsigned long leading to a smaller
> than expected number of huge pages.
> 
> To fix, check the calculation for overflow.  If overflow is detected,
> use ULONG_MAX as the requested value.  This is inline with the user
> request to allocate as many huge pages as possible.
> 
> It was also noticed that the above calculation was done outside the
> hugetlb_lock.  Therefore, the values could be inconsistent and result
> in underflow.  To fix, the calculation is moved within the routine
> set_max_huge_pages() where the lock is held.
> 
> In addition, the code in __nr_hugepages_store_common() which tries to
> handle the case of not being able to allocate a node mask would likely
> result in incorrect behavior.  Luckily, it is very unlikely we will
> ever take this path.  If we do, simply return ENOMEM.
> 
> Reported-by: Jing Xiangfeng 
> Signed-off-by: Mike Kravetz 

Looks good to me.

Reviewed-by: Naoya Horiguchi 

> ---
> This was sent upstream during 5.1 merge window, but dropped as it was
> based on an earlier version of Alex Ghiti's patch which was dropped.
> Now rebased on top of Alex Ghiti's "[PATCH v8 0/4] Fix free/allocation
> of runtime gigantic pages" series which was just added to mmotm.
> 
>  mm/hugetlb.c | 41 ++---
>  1 file changed, 34 insertions(+), 7 deletions(-)
> 
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index f3e84c1bef11..f79ae4e42159 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -2287,13 +2287,33 @@ static int adjust_pool_surplus(struct hstate *h, nodemask_t *nodes_allowed,
>  }
>  
>  #define persistent_huge_pages(h) (h->nr_huge_pages - h->surplus_huge_pages)
> -static int set_max_huge_pages(struct hstate *h, unsigned long count,
> +static int set_max_huge_pages(struct hstate *h, unsigned long count, int nid,
> nodemask_t *nodes_allowed)
>  {
>   unsigned long min_count, ret;
>  
>   spin_lock(&hugetlb_lock);
>  
> + /*
> +  * Check for a node specific request.
> +  * Changing node specific huge page count may require a corresponding
> +  * change to the global count.  In any case, the passed node mask
> +  * (nodes_allowed) will restrict alloc/free to the specified node.
> +  */
> + if (nid != NUMA_NO_NODE) {
> + unsigned long old_count = count;
> +
> + count += h->nr_huge_pages - h->nr_huge_pages_node[nid];
> + /*
> +  * User may have specified a large count value which caused the
> +  * above calculation to overflow.  In this case, they wanted
> +  * to allocate as many huge pages as possible.  Set count to
> +  * largest possible value to align with their intention.
> +  */
> + if (count < old_count)
> + count = ULONG_MAX;
> + }
> +
>   /*
>* Gigantic pages runtime allocation depend on the capability for large
>* page range allocation.
> @@ -2445,15 +2465,22 @@ static ssize_t __nr_hugepages_store_common(bool obey_mempolicy,
>   }
>   } else if (nodes_allowed) {
>   /*
> -  * per node hstate attribute: adjust count to global,
> -  * but restrict alloc/free to the specified node.
> +  * Node specific request.  count adjustment happens in
> +  * set_max_huge_pages() after acquiring hugetlb_lock.
>*/
> - count += h->nr_huge_pages - h->nr_huge_pages_node[nid];
>   init_nodemask_of_node(nodes_allowed, nid);
> - } else
> - nodes_allowed = &node_states[N_MEMORY];
> + } else {
> + /*
> +  * Node specific request, but we could not allocate the few
> +  * words required for a node mask.  We are unlikely to hit
> +  * this condition.  Since we can not pass down the appropriate
> +  * node mask, just return ENOMEM.
> +  */
> + err = -ENOMEM;
> + goto out;
> + }
>  
> - err = set_max_huge_pages(h, count, nodes_allowed);
> + err = set_max_huge_pages(h, count, nid, nodes_allowed);
>  
>  out:
>   if (nodes_allowed != &node_states[N_MEMORY])
> -- 
> 2.20.1
> 
> 
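
As an illustration of the overflow handling described in the commit message
above, here is a minimal userspace sketch of the clamp. The helper name
node_to_global_count() and its parameters are hypothetical and are not part
of mm/hugetlb.c. The sketch also assumes the node count never exceeds the
global count, which the patch arranges by doing the arithmetic under
hugetlb_lock.

```c
#include <limits.h>

/*
 * Hypothetical userspace helper (not the kernel function): convert a
 * node-specific huge page request into a global count.
 *
 * requested    - value the user wrote for one node
 * global_pages - current global number of huge pages
 * node_pages   - current number of huge pages on that node
 *
 * Assumes node_pages <= global_pages, so the subtraction cannot wrap.
 */
static unsigned long node_to_global_count(unsigned long requested,
					  unsigned long global_pages,
					  unsigned long node_pages)
{
	unsigned long count = requested + (global_pages - node_pages);

	/*
	 * Unsigned addition wraps modulo ULONG_MAX + 1. If the sum is
	 * smaller than the original request, it wrapped, so clamp to the
	 * largest value: "allocate as many huge pages as possible".
	 */
	if (count < requested)
		count = ULONG_MAX;

	return count;
}
```

With 100 global pages of which 20 sit on the node, a request for 10 pages
yields a global target of 90, while a request of ULONG_MAX stays at
ULONG_MAX instead of wrapping to a small number.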


Re: [External] Re: Print map for total physical and virtual memory

2019-03-28 Thread Pankaj Suryawanshi



From: Matthew Wilcox 
Sent: 26 March 2019 18:13
To: Pankaj Suryawanshi
Cc: linux-kernel@vger.kernel.org; linux...@kvack.org
Subject: Re: [External] Re: Print map for total physical and virtual memory

On Tue, Mar 26, 2019 at 12:35:25PM +, Pankaj Suryawanshi wrote:
> From: Matthew Wilcox 
> Sent: 26 March 2019 17:06
> To: Pankaj Suryawanshi
> Cc: linux-kernel@vger.kernel.org; linux...@kvack.org
> Subject: [External] Re: Print map for total physical and virtual memory
>
> CAUTION: This email originated from outside of the organization. Do not click 
> links or open attachments unless you recognize the sender and know the 
> content is safe.

... you should probably use gmail or something.  Whatever broken email
system your employer provides makes it really hard for you to participate
in any meaningful way.

Okay i will use gmail.

> Can you please elaborate about tools/vm/page-types.c ?

cd tools/vm/
make
sudo ./page-types

If that doesn't do exactly what you need, you can use the source code to
make a program which does.

Thanks Matthew.


RE: [LINUX PATCH v13] rawnand: pl353: Add basic driver for arm pl353 smc nand interface

2019-03-28 Thread Naga Sureshkumar Relli
Hi Helmut,

> -Original Message-
> From: Helmut Grohne 
> Sent: Thursday, March 28, 2019 5:21 PM
> To: Naga Sureshkumar Relli 
> Cc: bbrezil...@kernel.org; miquel.ray...@bootlin.com; rich...@nod.at;
> dw...@infradead.org; computersforpe...@gmail.com; marek.va...@gmail.com;
> linux-m...@lists.infradead.org; linux-kernel@vger.kernel.org; Michal Simek;
> nagasureshkumarre...@gmail.com
> Subject: Re: [LINUX PATCH v13] rawnand: pl353: Add basic driver for arm pl353
> smc nand interface
> 
> Hi Naga,
> 
> On Wed, Mar 27, 2019 at 09:13:59AM +, Naga Sureshkumar Relli wrote:
> > It's an on-die ECC capable device. Did you mention nand-ecc-mode = "on-die"
> > in the dts? I tested the same part with the "on-die" property in the dts
> > and it worked for me.
> > Please share the dts entries for NAND.
> > Also, if it is an x8 bus, please specify nand-bus-width = <8>; if it
> > is x16, specify nand-bus-width = <16>;
> 
> Thank you for pointing at the relevant properties. Indeed, these were missing
> in my previous tests. I am now using the following dt (generated from
> multiple fragments, giving the decompiled dt here):
> 
> | memory-controller@e000e000 {
> | #address-cells = <0x2>;
> | #size-cells = <0x1>;
> | status = "okay";
> | clock-names = "memclk", "apb_pclk";
> | clocks = <0x1 0xb 0x1 0x2c>;
> | compatible = "arm,pl353-smc-r2p1", "arm,primecell";
> | interrupt-parent = <0x4>;
> | interrupts = <0x0 0x12 0x4>;
> | ranges = <0x0 0x0 0xe100 0x100>;
> | reg = <0xe000e000 0x1000>;
> |
> | flash@e100 {
> | status = "okay";
> | compatible = "arm,pl353-nand-r2p1";
> | reg = <0x0 0x0 0x100>;
> | #address-cells = <0x1>;
> | #size-cells = <0x1>;
> | nand-ecc-mode = "on-die";
> | nand-ecc-algo = "hamming";
> | nand-bus-width = <0x8>;
> | };
> | };
> 
> With this dt, the device is successfully initialized and the data read is
> mostly intact. When using it with jffs2, I get loads of ECC errors though
> (offsets and lengths vary):
> 
> | jffs2: mtd->read(0x800 bytes from 0xb6) returned ECC error
> 
> Reverting back to the out-of-tree driver (4.14), it works normally, so a
> hardware defect seems unlikely. I compared a register dump of the smc between
> those drivers and the only difference I could find was NAND timings (at
> 0xE000E180), which are much lower with the new drivers as it does not consume
> the arm,nand-cycle-* properties that the old driver consumed. I tried hard
> coding the previous timings, but the ECC errors persist. This leads me to
> conclude that timings are not the cause for what I am seeing.
> 
> Is there anything else I can try to diagnose it?
Thanks for trying with new dts.
Previously, we passed the nand-cycle-* timings through the dts.
Now the framework provides all the SDR timing information, so we just
configure those timings. I will recheck the driver's handling of the timings.

So far I have tried mtd-utils (mtd-debug) and ubifs.
I haven't tried jffs2. Let me give it a try and I will let you know.

Thanks,
Naga Sureshkumar Relli.
> 
> Helmut


Re: [PATCH v2] RISC-V: Implement ASID allocator

2019-03-28 Thread Anup Patel
On Fri, Mar 29, 2019 at 10:34 AM Paul Walmsley  wrote:
>
>
> On Thu, 28 Mar 2019, Anup Patel wrote:
>
> > Signed-off-by: Gary Guo 
> > Signed-off-by: Anup Patel 
> > ---
> > Changes since v1:
> > - We adapt good aspects from Gary Guo's ASID allocator implementation
> >   and provide due credit to him by adding his SoB.
>
> This isn't the right way to use Signed-off-by: lines in the kernel.  See
>
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/5.Posting.rst#n213
>
> and
>
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/submitting-patches.rst#n418
>
> The only people who should be listed in Signed-off-by: lines are patch
> authors.
>
> If your intention is to give Gary credit for changes in your patch,
> describe what ideas or code snippets you've taken in the patch description
> itself, above your Signed-off-by: lines so they would be included in the
> git commit description should your patch be merged.

Thanks for clarifying...

Under "Changes since v1" I have clarified which suggestions I have
taken from Gary.

Gary insists that I use "Co-developed-by:", so I have posted v3 with
"Co-developed-by:". Honestly, I don't mind "Signed-off-by:" or
"Co-developed-by:" as long as I am acknowledging the other person's
efforts.

Regards,
Anup


Re: [PATCH v2] RISC-V: Implement ASID allocator

2019-03-28 Thread Paul Walmsley


On Thu, 28 Mar 2019, Anup Patel wrote:

> Signed-off-by: Gary Guo 
> Signed-off-by: Anup Patel 
> ---
> Changes since v1:
> - We adapt good aspects from Gary Guo's ASID allocator implementation
>   and provide due credit to him by adding his SoB.

This isn't the right way to use Signed-off-by: lines in the kernel.  See

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/5.Posting.rst#n213

and

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/submitting-patches.rst#n418

The only people who should be listed in Signed-off-by: lines are patch 
authors.

If your intention is to give Gary credit for changes in your patch, 
describe what ideas or code snippets you've taken in the patch description 
itself, above your Signed-off-by: lines so they would be included in the 
git commit description should your patch be merged.


- Paul


[PATCH v3] RISC-V: Implement ASID allocator

2019-03-28 Thread Anup Patel
Currently, we do local TLB flush on every MM switch. This is very harsh on
performance because we are forcing page table walks after every MM switch.

This patch implements an ASID allocator for assigning an ASID to an MM
context. The number of ASIDs is limited in HW, so we create a logical entity
named CONTEXTID for assigning to an MM context. The lower bits of CONTEXTID
are the ASID and the upper bits are the VERSION number. The number of usable
ASID bits supported by HW is detected at boot time by writing 1s to the ASID
bits in the SATP CSR.

We allocate a new CONTEXTID on the first MM switch for an MM context, where
the ASID is allocated from an ASID bitmap and the VERSION is provided by an
atomic counter. At the time of allocating a new CONTEXTID, if we run out of
available ASIDs then:
1. We flush the ASID bitmap
2. Increment current VERSION atomic counter
3. Re-allocate ASID from ASID bitmap
4. Flush TLB on all CPUs
5. Try CONTEXTID re-assignment on all CPUs

Please note that we don't use ASID #0 because it is used at boot-time by
all CPUs for initial MM context. Also, newly created context is always
assigned CONTEXTID #0 (i.e. VERSION #0 and ASID #0) which is an invalid
context in our implementation.

Using the above approach, we have virtually infinite CONTEXTIDs on top of a
limited number of HW ASIDs. This approach is inspired by the ASID allocator
used for Linux ARM/ARM64, but we have adapted it for RISC-V. Overall, this
ASID allocator helps us reduce the rate of local TLB flushes on every CPU,
thereby increasing performance.
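
To make the CONTEXTID scheme above concrete, here is a minimal
single-threaded userspace sketch. It is not the code from this patch: the
names (struct asid_allocator, contextid_alloc() and friends) are
hypothetical, the ASID width is fixed at 8 bits instead of being detected
from the SATP CSR, the bitmap is a plain byte array, and the TLB flush is
only reported to the caller through a flag. Starting the VERSION at 1 so
that CONTEXTID #0 never looks valid is also an assumption of the sketch.

```c
#include <string.h>

#define ASID_BITS	8u			/* assumed; HW-detected in the patch */
#define NUM_ASIDS	(1u << ASID_BITS)
#define ASID_MASK	((unsigned long)NUM_ASIDS - 1)

struct asid_allocator {
	unsigned long version;			/* current VERSION, kept pre-shifted */
	unsigned char used[NUM_ASIDS];		/* one byte per ASID, for clarity */
};

static void asid_allocator_init(struct asid_allocator *a)
{
	memset(a->used, 0, sizeof(a->used));
	a->used[0] = 1;				/* ASID #0 is never handed out */
	a->version = NUM_ASIDS;			/* VERSION 1, so CONTEXTID 0 is invalid */
}

static unsigned long contextid_asid(unsigned long id)
{
	return id & ASID_MASK;
}

static unsigned long contextid_version(unsigned long id)
{
	return id & ~ASID_MASK;
}

/*
 * Allocate a new CONTEXTID. *rolled_over is set to 1 when the ASID space
 * was exhausted; a real implementation would then flush the TLB on all
 * CPUs before any recycled ASID is used.
 */
static unsigned long contextid_alloc(struct asid_allocator *a, int *rolled_over)
{
	unsigned int asid;

	*rolled_over = 0;
	for (asid = 1; asid < NUM_ASIDS; asid++)
		if (!a->used[asid])
			break;

	if (asid == NUM_ASIDS) {
		/* Out of ASIDs: flush the bitmap and bump the VERSION. */
		memset(a->used, 0, sizeof(a->used));
		a->used[0] = 1;
		a->version += NUM_ASIDS;
		*rolled_over = 1;
		asid = 1;
	}

	a->used[asid] = 1;
	return a->version | asid;
}
```

In this sketch the first 255 allocations return ASIDs 1 to 255 under
VERSION 1; the 256th allocation triggers the roll-over and returns ASID 1
under VERSION 2, which is the point at which every CPU would need its TLB
flushed.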

This patch is tested on QEMU/virt machine and SiFive Unleashed board. On
QEMU/virt machine, we see 10% (approx) performance improvement with SW
emulated TLBs provided by QEMU. Unfortunately, ASID bits of SATP CSR are
not implemented on SiFive Unleashed board so we don't see any change in
performance.

Co-developed-by: Gary Guo 
Signed-off-by: Anup Patel 
---
Changes since v2:
- Move to lazy TLB flushing because we get slow path warnings if we
  use flush_tlb_all()
- Don't set ASID bits to all 1s in head.S. Instead just do it on
  boot CPU calling asids_init() for determining number of HW ASID bits
- Make CONTEXT version comparison more readable in set_mm_asid()
- Fix typo in __flush_context()

Changes since v1:
- We adapt good aspects from Gary Guo's ASID allocator implementation
  and provide due credit to him by adding his SoB.
- Track ASIDs active during context flush and mark them as reserved
- Set ASID bits to all 1s to simplify number of ASID bit detection
- Use atomic_long_t instead of atomic64_t for being 32bit friendly
- Use unsigned long instead of u64 for being 32bit friendly
- Use flush_tlb_all() instead of lazy local_tlb_flush_all() at time
  of context flush

This patch is based on Linux-5.1-rc2 and TLB flush cleanup patches v4
from Gary Guo. It can be also found in riscv_asid_allocator_v3 branch
of https://github.com/avpatel/linux.git
---
 arch/riscv/include/asm/csr.h |   6 +
 arch/riscv/include/asm/mmu.h |   1 +
 arch/riscv/include/asm/mmu_context.h |   1 +
 arch/riscv/mm/context.c  | 261 ++-
 4 files changed, 259 insertions(+), 10 deletions(-)

diff --git a/arch/riscv/include/asm/csr.h b/arch/riscv/include/asm/csr.h
index 28a0d1cb374c..ce18ab8f53ed 100644
--- a/arch/riscv/include/asm/csr.h
+++ b/arch/riscv/include/asm/csr.h
@@ -45,10 +45,16 @@
 #define SATP_PPN _AC(0x003F, UL)
 #define SATP_MODE_32 _AC(0x8000, UL)
 #define SATP_MODESATP_MODE_32
+#define SATP_ASID_BITS 9
+#define SATP_ASID_SHIFT22
+#define SATP_ASID_MASK _AC(0x1FF, UL)
 #else
 #define SATP_PPN _AC(0x0FFF, UL)
 #define SATP_MODE_39 _AC(0x8000, UL)
 #define SATP_MODESATP_MODE_39
+#define SATP_ASID_BITS 16
+#define SATP_ASID_SHIFT44
+#define SATP_ASID_MASK _AC(0x, UL)
 #endif

 /* Interrupt Enable and Interrupt Pending flags */
diff --git a/arch/riscv/include/asm/mmu.h b/arch/riscv/include/asm/mmu.h
index 5df2dccdba12..42a9ca0fe1fb 100644
--- a/arch/riscv/include/asm/mmu.h
+++ b/arch/riscv/include/asm/mmu.h
@@ -18,6 +18,7 @@
 #ifndef __ASSEMBLY__

 typedef struct {
+   atomic_long_t id;
void *vdso;
 #ifdef CONFIG_SMP
/* A local icache flush is needed before user execution can resume. */
diff --git a/arch/riscv/include/asm/mmu_context.h b/arch/riscv/include/asm/mmu_context.h
index bf4f097a9051..bd271c6b0e5e 100644
--- a/arch/riscv/include/asm/mmu_context.h
+++ b/arch/riscv/include/asm/mmu_context.h
@@ -30,6 +30,7 @@ static inline void enter_lazy_tlb(struct mm_struct *mm,
 static inline int init_new_context(struct task_struct *task,
struct mm_struct *mm)
 {
+   atomic_long_set(&mm->context.id, 0);
return 0;
 }

diff --git a/arch/riscv/mm/context.c b/arch/riscv/mm/context.c
index 0f787bcd3a7a..863b6926d6d9 100644
--- a/arch/riscv/mm/context.c
+++ b/arch/riscv/mm/context.c
@@ -2,13 +2,213 @@
 /*
  * Copyright (C) 2012 Regents of the University of California
  * Copyright (C) 2017 SiFive
+ * Copyri

Re: [PATCH v2] RISC-V: Implement ASID allocator

2019-03-28 Thread Anup Patel
On Thu, Mar 28, 2019 at 8:03 PM Gary Guo  wrote:
>
>
>
> On 28/03/2019 14:13, Anup Patel wrote:
> > On Thu, Mar 28, 2019 at 7:07 PM Gary Guo  wrote:
> >>
> >> Hi Anup,
> >>
> >> The code still does not use ASID in TLB flush routines. Without this
> >> added the code does not boot on systems with true ASID support.
> >>
> >> We also need to consider the case of CONTEXTID overflow on 32-bit
> >> systems. 32-bit CONTEXTID may overflow in a month time.
> >>
> >> Please also see my inline comments.
> >>
> >> Best,
> >> Gary
> >>
> >> On 28/03/2019 06:32, Anup Patel wrote:
> >>> Currently, we do local TLB flush on every MM switch. This is very harsh
> >>> on performance because we are forcing page table walks after every MM
> >>> switch.
> >>>
> >>> This patch implements ASID allocator for assigning an ASID to every MM
> >>> context. The number of ASIDs is limited in HW so we create a logical
> >>> entity named CONTEXTID for assigning to MM context. The lower bits of
> >>> CONTEXTID are ASID and upper bits are VERSION number. The number of
> >>> usable ASID bits supported by HW are detected at boot-time by writing
> >>> 1s to ASID bits in SATP CSR. This means last ASID is always reserved
> >>> because it is used for initial MM context.
> >>>
> >>> We allocate new CONTEXTID on first MM switch for a MM context where
> >>> the ASID is allocated from an ASID bitmap and VERSION is provided by
> >>> an atomic counter. At time of allocating new CONTEXTID, if we run out
> >>> of available ASIDs then:
> >>> 1. We flush the ASID bitmap
> >>> 2. Increment current VERSION atomic counter
> >>> 3. Re-allocate ASID from ASID bitmap
> >>> 4. Flush TLB on all CPUs
> >>> 5. Try CONTEXTID re-assignment on all CPUs
> >>>
> >>> Using above approach, we have virtually infinite CONTEXTIDs on-top-of
> >>> limited number of HW ASIDs. This approach is inspired from ASID allocator
> >>> used for Linux ARM/ARM64 but we have adapted it for RISC-V. Overall,
> >>> this ASID allocator helps us reduce rate of local TLB flushes on every
> >>> CPU thereby increasing performance.
> >>>
> >>> This patch is tested on QEMU/virt machine and SiFive Unleashed board.
> >>> On QEMU/virt machine, we see 10% (approx) performance improvement with
> >>> SW emulated TLBs provided by QEMU. Unfortunately, ASID bits of SATP CSR
> >>> are not implemented on SiFive Unleashed board so we don't see any change
> >>> in performance.
> >>>
> >>> Signed-off-by: Gary Guo 
> >> Could you add a Co-developed-by line in addition to Signed-off-by as
> >> well? Thanks.
> >>> Signed-off-by: Anup Patel 
> >>> ---
> >>> Changes since v1:
> >>> - We adapt good aspects from Gary Guo's ASID allocator implementation
> >>> and provide due credit to him by adding his SoB.
> >>> - Track ASIDs active during context flush and mark them as reserved
> >>> - Set ASID bits to all 1s to simplify number of ASID bit detection
> >>> - Use atomic_long_t instead of atomic64_t for being 32bit friendly
> >>> - Use unsigned long instead of u64 for being 32bit friendly
> >>> - Use flush_tlb_all() instead of lazy local_tlb_flush_all() at time
> >>> of context flush
> >>>
> >>> This patch is based on Linux-5.1-rc2 and TLB flush cleanup patches v4
> >>> from Gary Guo. It can be also found in riscv_asid_allocator_v2 branch
> >>> of https://github.com/avpatel/linux.git
> >>> ---
> >>>arch/riscv/include/asm/csr.h |   6 +
> >>>arch/riscv/include/asm/mmu.h |   1 +
> >>>arch/riscv/include/asm/mmu_context.h |   1 +
> >>>arch/riscv/kernel/head.S |   2 +
> >>>arch/riscv/mm/context.c  | 249 +--
> >>>5 files changed, 247 insertions(+), 12 deletions(-)
> >>>
> >>> diff --git a/arch/riscv/include/asm/csr.h b/arch/riscv/include/asm/csr.h
> >>> index 28a0d1cb374c..ce18ab8f53ed 100644
> >>> --- a/arch/riscv/include/asm/csr.h
> >>> +++ b/arch/riscv/include/asm/csr.h
> >>> @@ -45,10 +45,16 @@
> >>>#define SATP_PPN _AC(0x003F, UL)
> >>>#define SATP_MODE_32 _AC(0x8000, UL)
> >>>#define SATP_MODESATP_MODE_32
> >>> +#define SATP_ASID_BITS   9
> >>> +#define SATP_ASID_SHIFT  22
> >>> +#define SATP_ASID_MASK   _AC(0x1FF, UL)
> >>>#else
> >>>#define SATP_PPN _AC(0x0FFF, UL)
> >>>#define SATP_MODE_39 _AC(0x8000, UL)
> >>>#define SATP_MODESATP_MODE_39
> >>> +#define SATP_ASID_BITS   16
> >>> +#define SATP_ASID_SHIFT  44
> >>> +#define SATP_ASID_MASK   _AC(0x, UL)
> >>>#endif
> >>>
> >>>/* Interrupt Enable and Interrupt Pending flags */
> >>> diff --git a/arch/riscv/include/asm/mmu.h b/arch/riscv/include/asm/mmu.h
> >>> index 5df2dccdba12..42a9ca0fe1fb 100644
> >>> --- a/arch/riscv/include/asm/mmu.h
> >>> +++ b/arch/riscv/include/asm/mmu.h
> >>> @@ -18,6 +18,7 @@
> >>>#ifndef __ASSEMBLY__
> >>>
> >>>typedef struct {
> >>> + atomic_long_t id;
> >>>void *vdso;
> >>>#ifdef CONFIG_SMP
> >>>/* A local ic

Re: [PATCH v2] RISC-V: Implement ASID allocator

2019-03-28 Thread Anup Patel
On Thu, Mar 28, 2019 at 8:00 PM Gary Guo  wrote:
>
>
>
> On 28/03/2019 14:09, Anup Patel wrote:
> > On Thu, Mar 28, 2019 at 7:07 PM Gary Guo  wrote:
> >>
> >> Hi Anup,
> >>
> >> The code still does not use ASID in TLB flush routines. Without this
> >> added the code does not boot on systems with true ASID support.
> >
> > Can you elaborate why flush by ASID is need and flush_tlb_all() will
> > not work?
>  >
> flush_tlb_all() will work, but not flush_tlb_mm, flush_tlb_page,
> flush_tlb_range. When we want to flush something related to a MM we need
> to get its ASID and SFENCE with that ASID.

Please don't make self contradicting statements without looking at code.

The flush_tlb_all()/local_flush_tlb_all() are only suitable here.

The ASID space is global across CPUs, so when we run out of ASIDs
we have to flush the complete TLB on all CPUs because we cannot be
certain which ASIDs were used on each CPU.

> >>
> >> We also need to consider the case of CONTEXTID overflow on 32-bit
> >> systems. 32-bit CONTEXTID may overflow in a month time.
> >
> > On 32bit systems, upper 24bits of CONTEXTID will be VERSION and
> > lower 8bits will be HW ASID.
> >
> > Can you elaborate on how you reached the conclusion that CONTEXTID
> > will overflow in a month's time?
> >
> Assume a case where we have 256 processes to run, and 8 cores,
> 2^32/(250Hz)/8 = 24 days.
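
The arithmetic behind the 24-day figure can be checked with a small sketch.
It follows one reading of the estimate, namely that each of the 8 cores
consumes one new 32-bit CONTEXTID on every 250 Hz tick; that premise and the
helper name days_until_wrap() are assumptions made here for illustration.

```c
/*
 * Hypothetical helper: number of days until a counter of `bits` bits has
 * been fully consumed, when each of `cpus` CPUs takes `allocs_per_sec`
 * new values per second. Only valid for bits < 64.
 */
static double days_until_wrap(unsigned int bits, double allocs_per_sec,
			      unsigned int cpus)
{
	double values = (double)(1ULL << bits);		/* 2^bits distinct values */
	double seconds = values / (allocs_per_sec * cpus);

	return seconds / 86400.0;			/* 86400 seconds per day */
}
```

For 32 bits, 250 allocations per second and 8 CPUs this gives
2^32 / 2000 = 2147483.648 seconds, which is a little under 25 days.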

I think you are confusing CONTEXTID with CONTEXTID register in
ARMv7/ARMv8 world.

The CONTEXTID is a software entity that we assign to an mmu_context.

It has two parts:
1. The lower bits are the ASID that we will program in the SATP CSR
2. The upper bits are the current version number

Whenever we run out of ASIDs in the ASID bitmap at the time of creating a new
CONTEXTID, we simply flush the complete ASID bitmap and atomically
increment the version number.

The most important thing here is that the version number before and after
an ASID bitmap flush should be different, so that all CPUs see the change
in the current version after an ASID bitmap flush.

It is totally fine if the current version number rolls over, as long as we
get a different current version after an ASID bitmap flush.
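
The roll-over argument can be illustrated with a small sketch for the 32-bit
layout mentioned earlier in this mail (24 VERSION bits above 8 ASID bits).
The helper names are hypothetical and this is not the code from the patch;
it only shows that an equality test on the VERSION bits keeps working when
the counter wraps.

```c
#include <stdbool.h>
#include <stdint.h>

#define ASID_BITS	8u				/* assumed width for the sketch */
#define ASID_MASK	((UINT32_C(1) << ASID_BITS) - 1)

/* True if the VERSION bits of `id` equal the current VERSION. */
static bool contextid_is_current(uint32_t id, uint32_t current_version)
{
	return ((id ^ current_version) & ~ASID_MASK) == 0;
}

/* Advance the pre-shifted VERSION by one; wraps modulo 2^32. */
static uint32_t next_version(uint32_t current_version)
{
	return current_version + (UINT32_C(1) << ASID_BITS);
}
```

A CONTEXTID created under the last representable VERSION (0xFFFFFF00) stops
matching once the VERSION wraps to 0, so its owner is forced to re-allocate.
Two caveats apply to the sketch: a CONTEXTID that is exactly 2^24
generations old would compare as current again, and a wrapped VERSION of 0
would make the "invalid" CONTEXTID #0 look current, so a real implementation
has to account for both cases.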

> >>
> >> Please also see my inline comments.
> >>
> >> Best,
> >> Gary
> >>
> >> On 28/03/2019 06:32, Anup Patel wrote:
> >>> Currently, we do local TLB flush on every MM switch. This is very harsh
> >>> on performance because we are forcing page table walks after every MM
> >>> switch.
> >>>
> >>> This patch implements ASID allocator for assigning an ASID to every MM
> >>> context. The number of ASIDs is limited in HW so we create a logical
> >>> entity named CONTEXTID for assigning to MM context. The lower bits of
> >>> CONTEXTID are ASID and upper bits are VERSION number. The number of
> >>> usable ASID bits supported by HW are detected at boot-time by writing
> >>> 1s to ASID bits in SATP CSR. This means last ASID is always reserved
> >>> because it is used for initial MM context.
> >>>
> >>> We allocate new CONTEXTID on first MM switch for a MM context where
> >>> the ASID is allocated from an ASID bitmap and VERSION is provided by
> >>> an atomic counter. At time of allocating new CONTEXTID, if we run out
> >>> of available ASIDs then:
> >>> 1. We flush the ASID bitmap
> >>> 2. Increment current VERSION atomic counter
> >>> 3. Re-allocate ASID from ASID bitmap
> >>> 4. Flush TLB on all CPUs
> >>> 5. Try CONTEXTID re-assignment on all CPUs
> >>>
> >>> Using above approach, we have virtually infinite CONTEXTIDs on-top-of
> >>> limited number of HW ASIDs. This approach is inspired from ASID allocator
> >>> used for Linux ARM/ARM64 but we have adapted it for RISC-V. Overall,
> >>> this ASID allocator helps us reduce rate of local TLB flushes on every
> >>> CPU thereby increasing performance.
> >>>
> >>> This patch is tested on QEMU/virt machine and SiFive Unleashed board.
> >>> On QEMU/virt machine, we see 10% (approx) performance improvement with
> >>> SW emulated TLBs provided by QEMU. Unfortunately, ASID bits of SATP CSR
> >>> are not implemented on SiFive Unleashed board so we don't see any change
> >>> in performance.
> >>>
> >>> Signed-off-by: Gary Guo 
> >> Could you add a Co-developed-by line in addition to Signed-off-by as
> >> well? Thanks.
> >
> > Sure, I will add.
> >
> >>> Signed-off-by: Anup Patel 
> >>> ---
> >>> Changes since v1:
> >>> - We adapt good aspects from Gary Guo's ASID allocator implementation
> >>> and provide due credit to him by adding his SoB.
> >>> - Track ASIDs active during context flush and mark them as reserved
> >>> - Set ASID bits to all 1s to simplify number of ASID bit detection
> >>> - Use atomic_long_t instead of atomic64_t for being 32bit friendly
> >>> - Use unsigned long instead of u64 for being 32bit friendly
> >>> - Use flush_tlb_all() instead of lazy local_tlb_flush_all() at time
> >>> of context flush
> >>>
> >>> This patch is based on Linux-5.1-rc2 and TLB flush cleanup patches v4
> >>> from Gary Guo. It can be also found in riscv_asid_allocator_v2 branch
> >>> of https://gith
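
The CONTEXTID scheme described in the quoted commit message can be sketched in plain C roughly as follows. This is only an illustration under stated assumptions, not the code from the patch: the names (asid_allocator, get_context, new_context), the 2-bit ASID width and the tlb_flushes counter are all hypothetical, the model is single-CPU (so step 5, re-assignment on all CPUs, is left out), and ASIDs that are still active are not preserved across a rollover as the v2 changelog says the real patch does.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical toy width; real hardware reports its own number of ASID bits. */
#define ASID_BITS 2u
#define NUM_ASIDS (1u << ASID_BITS)
#define ASID_MASK ((unsigned long)NUM_ASIDS - 1)

struct asid_allocator {
	unsigned long version;     /* current VERSION, stored pre-shifted above the ASID bits */
	bool used[NUM_ASIDS];      /* stand-in for the ASID bitmap */
	unsigned long tlb_flushes; /* counts simulated "flush TLB on all CPUs" events */
};

static void asid_allocator_init(struct asid_allocator *a)
{
	memset(a, 0, sizeof(*a));
	a->version = 1ul << ASID_BITS; /* start at VERSION 1 so CONTEXTID 0 means "none yet" */
	a->used[NUM_ASIDS - 1] = true; /* last ASID stays reserved for the initial MM context */
}

/* Return the first free ASID, or NUM_ASIDS when the bitmap is exhausted. */
static unsigned int find_free_asid(const struct asid_allocator *a)
{
	unsigned int asid;

	for (asid = 0; asid < NUM_ASIDS; asid++)
		if (!a->used[asid])
			return asid;
	return NUM_ASIDS;
}

static unsigned long new_context(struct asid_allocator *a)
{
	unsigned int asid = find_free_asid(a);

	if (asid == NUM_ASIDS) {
		memset(a->used, 0, sizeof(a->used)); /* 1. flush the ASID bitmap */
		a->used[NUM_ASIDS - 1] = true;       /*    (keep the reserved ASID) */
		a->version += 1ul << ASID_BITS;      /* 2. increment VERSION */
		asid = find_free_asid(a);            /* 3. re-allocate an ASID */
		a->tlb_flushes++;                    /* 4. flush TLB on all CPUs (simulated) */
	}
	a->used[asid] = true;
	return a->version | asid;                    /* CONTEXTID = VERSION bits | ASID bits */
}

/* Called on MM switch: keep the CONTEXTID only if its VERSION is current. */
static unsigned long get_context(struct asid_allocator *a, unsigned long *mm_cntx)
{
	if (*mm_cntx == 0 || ((*mm_cntx ^ a->version) & ~ASID_MASK))
		*mm_cntx = new_context(a);
	return *mm_cntx;
}
```

With these toy parameters three ASIDs are usable, so a fourth MM context forces a rollover and one simulated global TLB flush; contexts that still carry the old VERSION are re-assigned lazily the next time they are switched in.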

Re: [PATCH v6 1/2] Bluetooth: hci_qca: Added support for WCN3998

2019-03-28 Thread Harish Bandi

Hi Matthias,

On 2019-03-29 02:53, Matthias Kaehlcke wrote:

On Fri, Mar 29, 2019 at 05:05:49AM +0800, kbuild test robot wrote:

Hi Harish,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on bluetooth-next/master]
[also build test WARNING on next-20190328]
[cannot apply to v5.1-rc2]
[if your patch is applied to the wrong git tree, please drop us a note 
to help improve the system]


url:    https://github.com/0day-ci/linux/commits/Harish-Bandi/Bluetooth-hci_qca-Added-support-for-WCN3998/20190328-213357
base:   https://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next.git master
config: riscv-allmodconfig (attached as .config)
compiler: riscv64-linux-gcc (GCC) 8.1.0
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        GCC_VERSION=8.1.0 make.cross ARCH=riscv

Note: it may well be a FALSE warning. FWIW you are at least aware of 
it now.

http://gcc.gnu.org/wiki/Better_Uninitialized_Warnings

All warnings (new ones prefixed by >>):

   drivers/bluetooth/btqca.c: In function 'qca_uart_setup':
>> drivers/bluetooth/btqca.c:369:3: warning: 'rom_ver' may be used uninitialized in this function [-Wmaybe-uninitialized]
  snprintf(config.fwname, sizeof(config.fwname),
  ^~
"qca/crnv%02x.bin", rom_ver);


vim +/rom_ver +369 drivers/bluetooth/btqca.c

83e81961 Ben Young Tae Kim      2015-08-10  333
aadebac4 Balakrishna Godavarthi 2018-08-03  334  int qca_uart_setup(struct hci_dev *hdev, uint8_t baudrate,
aadebac4 Balakrishna Godavarthi 2018-08-03  335  		   enum qca_btsoc_type soc_type, u32 soc_ver)
83e81961 Ben Young Tae Kim      2015-08-10  336  {
83e81961 Ben Young Tae Kim      2015-08-10  337  	struct rome_config config;
83e81961 Ben Young Tae Kim      2015-08-10  338  	int err;
4219d468 Balakrishna Godavarthi 2018-08-03  339  	u8 rom_ver;


With the use of qca_is_wcn399x() the compiler can't determine anymore
that rom_ver is only read for WCN399x, in which case it is also
initialized. Just initialize rom_ver with 0 to keep the compiler happy.

[Harish] - sure, will wait for few more comments and address all 
comments in new patch version.
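
The suggested fix can be illustrated with a small userspace sketch. It is not the driver code: nvm_fwname() is a hypothetical helper that only mimics the two separate qca_is_wcn399x() checks from the listing, the fixed-width types stand in for the kernel's u8/u32, and the non-WCN399x file name "qca/nvm_%08x.bin" does not appear in this excerpt and is an assumption. The point is that rom_ver is written under one `if` and read under a second, so the compiler can no longer prove it is initialized unless it starts at 0.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Mirrors the enum from the btqca.h hunk of the patch under review. */
enum qca_btsoc_type {
	QCA_INVALID = -1,
	QCA_AR3002,
	QCA_ROME,
	QCA_WCN3990,
	QCA_WCN3998,
};

static bool qca_is_wcn399x(enum qca_btsoc_type soc_type)
{
	return soc_type == QCA_WCN3990 || soc_type == QCA_WCN3998;
}

/* Hypothetical helper: build the NVM file name the way the listing does. */
static void nvm_fwname(char *buf, size_t len,
		       enum qca_btsoc_type soc_type, uint32_t soc_ver)
{
	uint8_t rom_ver = 0; /* initialized, so -Wmaybe-uninitialized has nothing to flag */

	/* First check: derive the ROM version from the low two bytes of soc_ver. */
	if (qca_is_wcn399x(soc_type))
		rom_ver = (uint8_t)(((soc_ver & 0x0f00) >> 0x04) |
				    (soc_ver & 0x000f));

	/* Second, separate check: this is where rom_ver is read. */
	if (qca_is_wcn399x(soc_type))
		snprintf(buf, len, "qca/crnv%02x.bin", rom_ver);
	else
		snprintf(buf, len, "qca/nvm_%08x.bin", soc_ver); /* assumed name */
}
```

For example, a soc_ver whose low two bytes are 0x0214 yields rom_ver 0x24 for a WCN399x part; that input value is made up for illustration and is not taken from real hardware.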



83e81961 Ben Young Tae Kim      2015-08-10  340
ba493d4f Balakrishna Godavarthi 2018-08-03  341  	bt_dev_dbg(hdev, "QCA setup on UART");
83e81961 Ben Young Tae Kim      2015-08-10  342
83e81961 Ben Young Tae Kim      2015-08-10  343  	config.user_baud_rate = baudrate;
83e81961 Ben Young Tae Kim      2015-08-10  344
83e81961 Ben Young Tae Kim      2015-08-10  345  	/* Download rampatch file */
83e81961 Ben Young Tae Kim      2015-08-10  346  	config.type = TLV_TYPE_PATCH;
cc8a70bd Harish Bandi           2019-03-27  347  	if (qca_is_wcn399x(soc_type)) {
4219d468 Balakrishna Godavarthi 2018-08-03  348  		/* Firmware files to download are based on ROM version.
4219d468 Balakrishna Godavarthi 2018-08-03  349  		 * ROM version is derived from last two bytes of soc_ver.
4219d468 Balakrishna Godavarthi 2018-08-03  350  		 */
4219d468 Balakrishna Godavarthi 2018-08-03  351  		rom_ver = ((soc_ver & 0x0f00) >> 0x04) |
4219d468 Balakrishna Godavarthi 2018-08-03  352  			(soc_ver & 0x000f);
4219d468 Balakrishna Godavarthi 2018-08-03  353  		snprintf(config.fwname, sizeof(config.fwname),
4219d468 Balakrishna Godavarthi 2018-08-03  354  			 "qca/crbtfw%02x.tlv", rom_ver);
4219d468 Balakrishna Godavarthi 2018-08-03  355  	} else {
4219d468 Balakrishna Godavarthi 2018-08-03  356  		snprintf(config.fwname, sizeof(config.fwname),
4219d468 Balakrishna Godavarthi 2018-08-03  357  			 "qca/rampatch_%08x.bin", soc_ver);
4219d468 Balakrishna Godavarthi 2018-08-03  358  	}
4219d468 Balakrishna Godavarthi 2018-08-03  359
ba493d4f Balakrishna Godavarthi 2018-08-03  360  	err = qca_download_firmware(hdev, &config);
83e81961 Ben Young Tae Kim      2015-08-10  361  	if (err < 0) {
ba493d4f Balakrishna Godavarthi 2018-08-03  362  		bt_dev_err(hdev, "QCA Failed to download patch (%d)", err);
83e81961 Ben Young Tae Kim      2015-08-10  363  		return err;
83e81961 Ben Young Tae Kim      2015-08-10  364  	}
83e81961 Ben Young Tae Kim      2015-08-10  365
83e81961 Ben Young Tae Kim      2015-08-10  366  	/* Download NVM configuration */
83e81961 Ben Young Tae Kim      2015-08-10  367  	config.type = TLV_TYPE_NVM;
cc8a70bd Harish Bandi           2019-03-27  368  	if (qca_is_wcn399x(soc_type))
4219d468 Balakrishna Godavarthi 2018-08-03 @369  		snprintf(config.fwname, sizeof(config.fwname),
4219d468 Balakrishna Godavarthi 2018-08-03  370  			 "qca/crnv%02x.bin", rom_ver);
421

Re: [PATCH v6 1/2] Bluetooth: hci_qca: Added support for WCN3998

2019-03-28 Thread Harish Bandi

Hi Matthias,

On 2019-03-27 22:26, Matthias Kaehlcke wrote:

On Wed, Mar 27, 2019 at 05:58:42PM +0530, Harish Bandi wrote:

Added new compatible for WCN3998 and corresponding voltage
and current values to WCN3998 compatible.
Changed driver code to support WCN3998

Signed-off-by: Harish Bandi 


You forgot to add my 'Reviewed-by' tag from v5. No need to resend,
I'll add it again below, but it's the general practice to include tags
like 'Reviewed-by' or 'Acked-by' when sending a new revision.


[Harish] - sorry for missing it; I will follow this from the next version.

---
Changes in V6:
- changed return value to false in the qca_is_wcn399x()stub
---
 drivers/bluetooth/btqca.c   | 13 +++--
 drivers/bluetooth/btqca.h   |  8 +++-
 drivers/bluetooth/hci_qca.c | 40 ++--
 3 files changed, 44 insertions(+), 17 deletions(-)

diff --git a/drivers/bluetooth/btqca.c b/drivers/bluetooth/btqca.c
index 6122685..383e99f 100644
--- a/drivers/bluetooth/btqca.c
+++ b/drivers/bluetooth/btqca.c
@@ -344,7 +344,7 @@ int qca_uart_setup(struct hci_dev *hdev, uint8_t baudrate,

 	/* Download rampatch file */
 	config.type = TLV_TYPE_PATCH;
-	if (soc_type == QCA_WCN3990) {
+	if (qca_is_wcn399x(soc_type)) {
 		/* Firmware files to download are based on ROM version.
 		 * ROM version is derived from last two bytes of soc_ver.
 		 */
@@ -365,7 +365,7 @@ int qca_uart_setup(struct hci_dev *hdev, uint8_t baudrate,

 	/* Download NVM configuration */
 	config.type = TLV_TYPE_NVM;
-	if (soc_type == QCA_WCN3990)
+	if (qca_is_wcn399x(soc_type))
 		snprintf(config.fwname, sizeof(config.fwname),
 			 "qca/crnv%02x.bin", rom_ver);
 	else
@@ -410,6 +410,15 @@ int qca_set_bdaddr(struct hci_dev *hdev, const bdaddr_t *bdaddr)
 }
 EXPORT_SYMBOL_GPL(qca_set_bdaddr);

+bool qca_is_wcn399x(enum qca_btsoc_type soc_type)
+{
+	if ((soc_type == QCA_WCN3990) || (soc_type == QCA_WCN3998))
+		return true;
+
+	return false;
+}
+EXPORT_SYMBOL_GPL(qca_is_wcn399x);
+
 MODULE_AUTHOR("Ben Young Tae Kim ");
 MODULE_DESCRIPTION("Bluetooth support for Qualcomm Atheros family ver " VERSION);
 MODULE_VERSION(VERSION);
diff --git a/drivers/bluetooth/btqca.h b/drivers/bluetooth/btqca.h
index 6fdc25d..0f68c9e7 100644
--- a/drivers/bluetooth/btqca.h
+++ b/drivers/bluetooth/btqca.h
@@ -132,7 +132,8 @@ enum qca_btsoc_type {
 	QCA_INVALID = -1,
 	QCA_AR3002,
 	QCA_ROME,
-	QCA_WCN3990
+	QCA_WCN3990,
+	QCA_WCN3998,
 };

 #if IS_ENABLED(CONFIG_BT_QCA)
@@ -142,6 +143,7 @@ int qca_uart_setup(struct hci_dev *hdev, uint8_t baudrate,
 		   enum qca_btsoc_type soc_type, u32 soc_ver);
 int qca_read_soc_version(struct hci_dev *hdev, u32 *soc_version);
 int qca_set_bdaddr(struct hci_dev *hdev, const bdaddr_t *bdaddr);
+bool qca_is_wcn399x(enum qca_btsoc_type soc_type);
 #else

 static inline int qca_set_bdaddr_rome(struct hci_dev *hdev, const bdaddr_t *bdaddr)
@@ -165,4 +167,8 @@ static inline int qca_set_bdaddr(struct hci_dev *hdev, const bdaddr_t *bdaddr)
 	return -EOPNOTSUPP;
 }

+static inline bool qca_is_wcn399x(enum qca_btsoc_type soc_type)
+{
+	return false;
+}
 #endif
diff --git a/drivers/bluetooth/hci_qca.c b/drivers/bluetooth/hci_qca.c
index 4ea995d..4af580a 100644
--- a/drivers/bluetooth/hci_qca.c
+++ b/drivers/bluetooth/hci_qca.c
@@ -521,7 +521,7 @@ static int qca_open(struct hci_uart *hu)
 	if (hu->serdev) {

 		qcadev = serdev_device_get_drvdata(hu->serdev);
-		if (qcadev->btsoc_type != QCA_WCN3990) {
+		if (!qca_is_wcn399x(qcadev->btsoc_type)) {
 			gpiod_set_value_cansleep(qcadev->bt_en, 1);
 		} else {
 			hu->init_speed = qcadev->init_speed;
@@ -627,7 +627,7 @@ static int qca_close(struct hci_uart *hu)

 	if (hu->serdev) {
 		qcadev = serdev_device_get_drvdata(hu->serdev);
-		if (qcadev->btsoc_type == QCA_WCN3990)
+		if (qca_is_wcn399x(qcadev->btsoc_type))
 			qca_power_shutdown(hu);
 		else
 			gpiod_set_value_cansleep(qcadev->bt_en, 0);
@@ -1008,7 +1008,7 @@ static int qca_set_baudrate(struct hci_dev *hdev, uint8_t baudrate)
 		  msecs_to_jiffies(CMD_TRANS_TIMEOUT_MS));

 	/* Give the controller time to process the request */
-	if (qca_soc_type(hu) == QCA_WCN3990)
+	if (qca_is_wcn399x(qca_soc_type(hu)))
 		msleep(10);
 	else
 		msleep(300);
@@ -1084,7 +1084,7 @@ static unsigned int qca_get_speed(struct hci_uart *hu,

 static int qca_check_speeds(struct hci_uart *hu)
 {
-	if (qca_soc_type(hu) == QCA_WCN3990) {
+	if (qca_is_wcn399x(qca_soc_type(hu))) {
 		if (!qca_get_speed(hu, QCA_INIT_SPEED) &&
!qca_get_spee

Re: [PATCH v6 2/2] dt-bindings: net: bluetooth: Add device tree bindings for QTI chip WCN3998

2019-03-28 Thread Harish Bandi

Hi Matthias,

On 2019-03-27 22:59, Matthias Kaehlcke wrote:

On Wed, Mar 27, 2019 at 05:58:43PM +0530, Harish Bandi wrote:

This patch enables regulators for the Qualcomm Bluetooth WCN3998
controller.


I commented on this on v3, but you didn't update it:

  No, it doesn't.

  The next version should probably say something like "Add compatible
  string for the Qualcomm WCN3998 Bluetooth controller".


[Harish] - will take care in next version.

Thanks

Matthias





Query on how WWAN and Modem interact

2019-03-28 Thread Ajay Garg
Hi All.

I apologize upfront if this seems like a stupid question.
I would like some insight into how a modem plugs into the
linux-kernel at the architecture level.

First, a bit of history.
I have prior experience of using an onboard modem on Ubuntu, with
support from modem-manager. There, after the modem was (successfully)
set up via "mmcli", the "wwan" interface started showing up in
"ifconfig", and could then be used just like any other
wifi/ethernet interface listed in "ifconfig".

Now we are in a bit of a situation where only the AT interface (of the
modem) might be available to us (via a serial /dev/tty interface).
Using AT commands to actually set up sockets is also fine.

Now, my query is: once the AT commands successfully set up a socket, how
does that link to the "wwan" interface in the "ifconfig" listing?


Will be grateful to have the experts throw some light on this.



Thanks and Regards,
Ajay


[PATCH 6/5] lib/list_sort: Fix GCC warning

2019-03-28 Thread George Spelvin
It turns out that GCC 4.9, 7.3, and 8.1 ignore the __pure
attribute on function pointers and (with the standard kernel
compile flags) emit a warning about it.

Even though it accurately describes a comparison function
(the compiler need not reload cached pointers across the call),
it doesn't actually help GCC 8.3's code generation, so just
omit it.

Signed-off-by: George Spelvin 
Fixes: 820c81be5237 ("lib/list_sort: simplify and remove MAX_LIST_LENGTH_BITS")
Cc: Andrew Morton 
Cc: Stephen Rothwell 
---
 lib/list_sort.c | 14 +-
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/lib/list_sort.c b/lib/list_sort.c
index 623a9158ac8a..b1b492e20f1d 100644
--- a/lib/list_sort.c
+++ b/lib/list_sort.c
@@ -8,12 +8,16 @@
 #include 
 
 /*
- * By declaring the compare function with the __pure attribute, we give
- * the compiler more opportunity to optimize.  Ideally, we'd use this in
- * the prototype of list_sort(), but that would involve a lot of churn
- * at all call sites, so just cast the function pointer passed in.
+ * A more accurate type for comparison functions.  Ideally, we'd use
+ * this in the prototype of list_sort(), but that would involve a lot of
+ * churn at all call sites, so just cast the function pointer passed in.
+ *
+ * This could also include __pure to give the compiler more opportunity
+ * to optimize, but that elicits an "attribute ignored" warning on
+ * GCC <= 8.1, and doesn't change GCC 8.3's code generation at all,
+ * so it's omitted.
  */
-typedef int __pure __attribute__((nonnull(2,3))) (*cmp_func)(void *,
+typedef int __attribute__((nonnull(2,3))) (*cmp_func)(void *,
struct list_head const *, struct list_head const *);
 
 /*
-- 
2.20.1
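
As a standalone illustration of the typedef this patch leaves in place, here is a minimal userspace sketch. It assumes a GCC-compatible compiler for the attribute syntax; struct item, item_cmp() and compare_items() are hypothetical names, and struct list_head is redeclared locally. Unlike list_sort(), which casts the caller's function pointer, the comparator here is declared with the const-qualified signature directly so that no cast is needed.

```c
#include <stddef.h>

/* Local stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

/* The typedef as it reads after the patch: nonnull on arguments 2 and 3, no __pure. */
typedef int __attribute__((nonnull(2,3))) (*cmp_func)(void *,
		struct list_head const *, struct list_head const *);

/* Hypothetical list element, used only for this example. */
struct item {
	int key;
	struct list_head node;
};

/* Recover the enclosing item from its embedded list node (container_of by hand). */
static const struct item *item_of(const struct list_head *node)
{
	return (const struct item *)((const char *)node -
				     offsetof(struct item, node));
}

/* Comparison callback: negative, zero or positive, like the list_sort() contract. */
static int item_cmp(void *priv, const struct list_head *a,
		    const struct list_head *b)
{
	const struct item *ia = item_of(a);
	const struct item *ib = item_of(b);

	(void)priv; /* unused in this sketch */
	return (ia->key > ib->key) - (ia->key < ib->key);
}

/* Call through the typedef'd pointer, as the merge code in list_sort() would. */
static int compare_items(struct item *x, struct item *y)
{
	cmp_func cmp = item_cmp;

	return cmp(NULL, &x->node, &y->node);
}
```

The first argument (priv) is deliberately left out of the nonnull list, which is why passing NULL for it is fine.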



Re: [RESEND PATCH v2 4/5] lib/list_sort: Simplify and remove MAX_LIST_LENGTH_BITS

2019-03-28 Thread George Spelvin
Thank you all for the build warning report.

The warning produced by gcc versions 4.9, 7.3, 8.1, whatever version
Stephen Rothwell is running, is:
lib/list_sort.c:17:36: warning: __pure__ attribute ignored [-Wattributes]

The relevant code is:
10: /*
11:  * By declaring the compare function with the __pure attribute, we give
12:  * the compiler more opportunity to optimize.  Ideally, we'd use this in
13:  * the prototype of list_sort(), but that would involve a lot of churn
14:  * at all call sites, so just cast the function pointer passed in.
15:  */
16: typedef int __pure __attribute__((nonnull(2,3))) (*cmp_func)(void *,
17: struct list_head const *, struct list_head const *);

As the comment says, the purpose of the __pure attribute is to tell
the compiler that, after a call via a function pointer of this
type, memory is not clobbered and it is not necessary to reload
any cached list pointers.

This is, of course, purely optional and may be deleted harmlessly.
I just checked, and that makes no difference at all to gcc-8 code
generation, so there's no point messing with #ifdef.

There are only two questions: how to update the comment, and how
to submit the fix. I'm thinking of
/*
 * A more accurate type for comparison functions.  Ideally, we'd use
 * this in the prototype of list_sort(), but that would involve a lot of
 * churn at all call sites, so just cast the function pointer passed in.
 *
 * This could also include __pure to give the compiler more opportunity
 * to optimize, but that elicits an "attribute ignored" warning on
 * GCC <= 8.1, and doesn't change GCC 8.3's code generation at all,
 * so it's omitted.
 */

How to submit the fix: Andrew, do you prefer a replacement patch
or a small fix patch?  I'll assume the latter and send it in a few
minutes.


Re: [RFC PATCH] mm, kvm: account kvm_vcpu_mmap to kmemcg

2019-03-28 Thread Shakeel Butt
On Thu, Mar 28, 2019 at 7:36 PM Matthew Wilcox  wrote:
>
> On Thu, Mar 28, 2019 at 06:28:36PM -0700, Shakeel Butt wrote:
> > A VCPU of a VM can allocate up to three pages which can be mmap'ed by the
> > user space application. At the moment this memory is not charged. On a
> > large machine running large number of VMs (or small number of VMs having
> > large number of VCPUs), this unaccounted memory can be very significant.
> > So, this memory should be charged to a kmemcg. However that is not
> > possible as these pages are mmapped to the userspace and PageKmemcg()
> > was designed with the assumption that such pages will never be mmapped
> > to the userspace.
> >
> > One way to solve this problem is by introducing an additional memcg
> > charging API similar to mem_cgroup_[un]charge_skmem(). However skmem
> > charging API usage is contained and shared and no new users are
> > expected but the pages which can be mmapped and should be charged to
> > kmemcg can and will increase. So, requiring the usage for such API will
> > increase the maintenance burden. The simplest solution is to remove the
> > assumption of no mmapping PageKmemcg() pages to user space.
>
> The usual response under these circumstances is "No, you can't have a
> page flag bit".
>

I would say for systems having CONFIG_MEMCG_KMEM, a page flag bit is
not that expensive.

> I don't understand why we need a PageKmemcg anyway.  We already
> have an entire pointer in struct page; can we not just check whether
> page->mem_cgroup is NULL or not?

PageKmemcg is for kmem while page->mem_cgroup is used for anon, file
and kmem memory. So, page->mem_cgroup cannot be used for a NULL check
unless we unify them. I'm not sure how complicated that would be.

Shakeel


(.init.text+0x134): multiple definition of `plat_irq_setup'

2019-03-28 Thread kbuild test robot
Hi Takashi,

FYI, the error/warning still remains.

tree:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
head:   9936328b41ce4bce8f20269dcac8cb476c8d0820
commit: c97617a81a7616d49bc3700959e08c6c6f447093 ALSA: hda/ca0132 - Fix build error without CONFIG_PCI
date:   7 weeks ago
config: sh-allmodconfig (attached as .config)
compiler: sh4-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        git checkout c97617a81a7616d49bc3700959e08c6c6f447093
        # save the attached .config to linux build tree
        GCC_VERSION=7.2.0 make.cross ARCH=sh

All errors (new ones prefixed by >>):

   arch/sh/boards/of-generic.o: In function `plat_irq_setup':
>> (.init.text+0x134): multiple definition of `plat_irq_setup'
   arch/sh/kernel/cpu/sh2/setup-sh7619.o:(.init.text+0x30): first defined here
   arch/sh/boards/of-generic.o: In function `arch_init_clk_ops':
>> (.init.text+0x118): multiple definition of `arch_init_clk_ops'
   arch/sh/kernel/cpu/sh2/clock-sh7619.o:(.init.text+0x0): first defined here

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation


.config.gz
Description: application/gzip


Re: [PATCH 1/2] platform/x86: intel_pmc_core: Convert to a platform_driver

2019-03-28 Thread Srinivas Pandruvada
On Mon, 2019-03-25 at 18:41 -0700, Rajat Jain wrote:
> Hi Rajneesh,
> 
> 
> On Mon, Mar 25, 2019 at 3:23 AM Bhardwaj, Rajneesh
>  wrote:
> > 
> > Hi Rajat
> > 
> > On 23-Mar-19 6:00 AM, Rajat Jain wrote:
> > > Hi Rajneesh,
> > > 
> > > 
> > > 
> > > On Fri, Mar 22, 2019 at 12:56 PM Bhardwaj, Rajneesh
> > >  wrote:
> > > > Some suggestions below
> > > > 
> > > > On 18-Mar-19 8:36 PM, Rajat Jain wrote:
> > > > 
> > > > On Sat, Mar 16, 2019 at 1:30 AM Rajneesh Bhardwaj
> > > >  wrote:
> > > > 
> > > > On Wed, Mar 13, 2019 at 03:21:23PM -0700, Rajat Jain wrote:
> > > > 
> > > > Convert the intel_pmc_core driver to a platform driver. There
> > > > is no
> > > > functional change. Some code that tries to determine what kind
> > > > of
> > > > CPU this is, has been moved code is moved from pmc_core_probe()
> > > > to
> > > > 
> > > > Possible typo here.
> > > > 
> > > > Ummm, you mean grammar error I guess? Sure, I will rephrase.
> > > > 
> > > > pmc_core_init().
> > > > 
> > > > Signed-off-by: Rajat Jain 
> > > > 
> > > > Thanks for sending this. This is certainly useful to support
> > > > suspend-resume
> > > > functionality for this driver which is otherwise only possible
> > > > with PM
> > > > notifiers otherwise and that is not desirable. Initially this
> > > > was a PCI
> > > > driver and after design discussion it was converted to module.
> > > > I would like
> > > > to consult Andy and Srinivas for their opinion about binding it
> > > > to actual
> > > > platform bus instead of the virtual bus as in its current form.
> > > > In one of the
> > > > internal versions, we used a known acpi PNP HID.
> > > > 
> > > > Sure, if there is an established ACPI PNP HID, then we could
> > > > bind it
> > > > using that, on platforms where we are still developing BIOS /
> > > > coreboot. However, this might not be possible for shipping
> > > > systems
> > > > (Kabylake / skylake) where there is no plan to change the BIOS.
> > > > 
> > > > In one of our internal patches, i had used HID of power engine
> > > > plugin. IIRC, During my testing it was working on KBL, CNL with
> > > > UEFI BIOS but i highly recommend testing it.
> > > > 
> > > > ---8<8<-
> > > > 
> > > > +static const struct acpi_device_id pmc_acpi_ids[] = {
> > > > 
> > > > + {"INT33A1", 0}, /* _HID for Intel Power Engine,
> > > > _CID PNP0D80*/
> > > > 
> > > > + { }
> > > > 
> > > >   };
> > > 
> > > We do not have this device in any of our ACPI tables today. If
> > > Intel
> > > can confirm that this is a well known HID to be used for
> > > attaching
> > > this driver, we can start putting it on our platform's ACPI going
> > > forward (Whiskeylake, Cometlake, Cannonlake, Icelake ...). But I
> > > believe we also need to have this driver attach with the device
> > > on
> > > older platforms (Skylake, Kabylake, Amberlake) that are already
> > > shipping, and running a Non UEFI BIOS (that may not have this HID
> > > since it is not published).
> > > 
> > > Currently the intel_pmc_core driver attaches itself to the
> > > following
> > > table of CPU families, without regard to whether it has that HID
> > > in
> > > the ACPI or not:
> > > 
> > > static const struct x86_cpu_id intel_pmc_core_ids[] = {
> > >  INTEL_CPU_FAM6(SKYLAKE_MOBILE, spt_reg_map),
> > >  INTEL_CPU_FAM6(SKYLAKE_DESKTOP, spt_reg_map),
> > >  INTEL_CPU_FAM6(KABYLAKE_MOBILE, spt_reg_map),
> > >  INTEL_CPU_FAM6(KABYLAKE_DESKTOP, spt_reg_map),
> > >  INTEL_CPU_FAM6(CANNONLAKE_MOBILE, cnp_reg_map),
> > >  INTEL_CPU_FAM6(ICELAKE_MOBILE, icl_reg_map),
> > >  {}
> > > };
> > 
> > In the past i tried one hybrid approach i.e. PCI and Platform
> > driver at
> > the same time. Based on that, i feel that this idea of spilling
> > probe
> > like this may not be the best option. The ACPI CID that i suggested
> > is
> > available on most Intel Core Platforms that i have worked on and i
> > can
> > help you in verifying it with UEFI BIOS if you want. Meanwhile,
> > please
> > see this https://patchwork.kernel.org/patch/9806565/ it gives some
> > background about this ACPI ID and also points to the LPIT spec.
> > 
> > > 
> > > So to avoid a regression, I suggest that we still maintain the
> > > above
> > > table (may be eliminate few entries) and always attach if the CPU
> > > is
> > > among the table, and if the CPU is not among the table, use the
> > > ACPI
> > > HID to attach. I propose to attach to at least Skylake and
> > > Kabylake
> > > > systems using the table above, and for Cannonlake and Icelake and
> > > > newer, we can rely on BIOS providing the ACPI HID. Of course I do
> > > > not
> > > > know if all non-Google Cannonlake/Icelake platforms will have this
> > > > HID
> > > > in their BIOS. If we are not sure, we can include Cannonlake and
> > > > Icelake also in that list. Please let me know what you
> > > > think.
> > 
> > If Coreboot firmware can not be updated for the shipping devices,
> > then
> > can Chromium kernel

Re: [PATCH 4/6] x86, mm: make split_mem_range() more easy to read

2019-03-28 Thread Wei Yang
On Thu, Mar 28, 2019 at 09:08:43AM +0100, Thomas Gleixner wrote:
>On Thu, 28 Mar 2019, Wei Yang wrote:
>> On Sun, Mar 24, 2019 at 03:29:04PM +0100, Thomas Gleixner wrote:
>> My question is to the for loop.
>> 
>> For example, we have a range
>> 
>>    +--------+-----------+----------+
>>    ^        128M        1G         2G
>>    128M - 4K
>> 
>
>Yes. You misread mr_try_map().

You are right, I misunderstood the functionality of mr_try_map().

I went through the code and this looks nice to me. I have to say you are
a genius.

Thanks for your code; I really learned a lot from this.

BTW, for the test cases, I think mem-hotplug may introduce layout
diversity, since a mem-hotplug range has to be 128M aligned.

>
>Thanks,
>
>   tglx

-- 
Wei Yang
Help you, Help me


Re: [PATCH] ARM: dts: vf610-zii-cfu1: Disable NOR flash/SPI controller

2019-03-28 Thread Shawn Guo
On Mon, Mar 25, 2019 at 11:30:17AM -0700, Andrey Smirnov wrote:
> Only a certain number of CFU1's come with NOR flash populated. Disable
> it by default to avoid trying to probe NOR flash on devices that don't
> have it. Devices that do have it can rely on the bootloader to enable
> this node.
> 
> Signed-off-by: Andrey Smirnov 
> Cc: Shawn Guo 
> Cc: Chris Healy 
> Cc: Andrew Lunn 
> Cc: Fabio Estevam 
> Cc: linux-arm-ker...@lists.infradead.org
> Cc: linux-kernel@vger.kernel.org

Applied, thanks.


[RFC][PATCH 2/4 v2] tracing/syscalls: Pass in hardcoded 6 into syscall_get_arguments()

2019-03-28 Thread Steven Rostedt
From: "Steven Rostedt (Red Hat)" 

The only user that calls syscall_get_arguments() with a variable and not a
hard coded '6' is ftrace_syscall_enter(). syscall_get_arguments() can be
optimized by removing a variable input, and always grabbing 6 arguments
regardless of what the system call actually uses.

Change ftrace_syscall_enter() to pass the 6 args into a local stack array
and copy the necessary arguments into the trace event as needed.

This is needed to remove two parameters from syscall_get_arguments().

Link: http://lkml.kernel.org/r/20161107213233.627583...@goodmis.org

Signed-off-by: Steven Rostedt (VMware) 
---
 kernel/trace/trace_syscalls.c | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/kernel/trace/trace_syscalls.c b/kernel/trace/trace_syscalls.c
index f93a56d2db27..e9f5bbbad6d9 100644
--- a/kernel/trace/trace_syscalls.c
+++ b/kernel/trace/trace_syscalls.c
@@ -314,6 +314,7 @@ static void ftrace_syscall_enter(void *data, struct pt_regs *regs, long id)
 	struct ring_buffer_event *event;
 	struct ring_buffer *buffer;
 	unsigned long irq_flags;
+	unsigned long args[6];
 	int pc;
 	int syscall_nr;
 	int size;
@@ -347,7 +348,8 @@ static void ftrace_syscall_enter(void *data, struct pt_regs *regs, long id)

 	entry = ring_buffer_event_data(event);
 	entry->nr = syscall_nr;
-	syscall_get_arguments(current, regs, 0, sys_data->nb_args, entry->args);
+	syscall_get_arguments(current, regs, 0, 6, args);
+	memcpy(entry->args, args, sizeof(unsigned long) * sys_data->nb_args);

 	event_trigger_unlock_commit(trace_file, buffer, event, entry,
 				    irq_flags, pc);
@@ -583,6 +585,7 @@ static void perf_syscall_enter(void *ignore, struct pt_regs *regs, long id)
 	struct syscall_metadata *sys_data;
 	struct syscall_trace_enter *rec;
 	struct hlist_head *head;
+	unsigned long args[6];
 	bool valid_prog_array;
 	int syscall_nr;
 	int rctx;
@@ -613,8 +616,8 @@ static void perf_syscall_enter(void *ignore, struct pt_regs *regs, long id)
 		return;

 	rec->nr = syscall_nr;
-	syscall_get_arguments(current, regs, 0, sys_data->nb_args,
-			       (unsigned long *)&rec->args);
+	syscall_get_arguments(current, regs, 0, 6, args);
+	memcpy(&rec->args, args, sizeof(unsigned long) * sys_data->nb_args);

 	if ((valid_prog_array &&
 	     !perf_call_bpf_enter(sys_data->enter_event, regs, sys_data, rec)) ||
-- 
2.20.1
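
The pattern this patch switches to (always fetch all six arguments into a stack array, then copy only as many as the syscall really has) can be sketched outside the kernel like this. struct fake_regs, get_all_arguments() and record_arguments() are hypothetical stand-ins for struct pt_regs, the simplified syscall_get_arguments() and the tracing caller; none of them are kernel APIs.

```c
#include <string.h>

#define MAX_SYSCALL_ARGS 6

/* Hypothetical stand-in for struct pt_regs: just the six argument registers. */
struct fake_regs {
	unsigned long arg[MAX_SYSCALL_ARGS];
};

/* Stand-in for the simplified accessor: no start/count, always copies all six. */
static void get_all_arguments(const struct fake_regs *regs, unsigned long *args)
{
	memcpy(args, regs->arg, sizeof(regs->arg));
}

/*
 * Mirrors the new caller logic: grab six arguments into a local array and
 * copy only the first nb_args of them into the trace record.
 */
static void record_arguments(const struct fake_regs *regs,
			     unsigned long *dst, unsigned int nb_args)
{
	unsigned long args[MAX_SYSCALL_ARGS];

	get_all_arguments(regs, args);
	memcpy(dst, args, sizeof(unsigned long) * nb_args);
}
```

The extra copy through the stack array matters because the trace record is sized for nb_args entries only, so the accessor cannot write all six arguments straight into it.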




[RFC][PATCH 0/4 v2] sycalls: Remove args i and n from syscall_get_arguments()

2019-03-28 Thread Steven Rostedt


Two and a half years ago I sent out 3 patches and a title letter that
had this[1]:

  At Linux Plumbers, Andy Lutomirski approached me to tell me that the
  syscall_get_arguments() implementation in x86 was horrible and gcc
  certainly gets it wrong. He said that since the tracepoints only pass
  in 0 and 6 for i and n respectively, it should be optimized for that case.
  Inspecting the kernel, I discovered that all users pass in 0 for i and
  only one file passes in something other than 6 for the number of arguments.
  That code happens to be my own code used for the special syscall tracing.
  That can easily be converted to just using 0 and 6 as well, and only copying
  what is needed. Which is probably the faster path anyway for that case.

  I haven't run the numbers (I can do that when I get some time), but since
  pretty much all use cases use 0 and 6 and that would allow these functions
  not to need strange logic to handle odd cases, I think this is still a win.

It received positive comments but also Linus asked to remove the separate
arg pointers and replace them with a single structure and fill that
instead. But for some reason, this got pushed aside and forgotten (probably,
had to do with the fact that I left Red Hat shortly after this).

Recently, it was brought back up again[2] and I decided to dust off these
patches and resubmit them. I also added one more patch to do the same
for syscall_set_arguments() that I did for syscall_get_arguments() even
though syscall_set_arguments() does not currently have (and never has had)
any callers. But we are told that in the near future it may have one.

The changes do optimize the logic a little, but for most archs I just kept
the same logic (loops and such) as I don't have a way to test it, and
didn't want to break the logic.

I added a new struct syscall_info that holds seccomp_data and also
includes a stack pointer (sp) field. I would change seccomp_data,
but because it's in include/uapi/linux/seccomp.h I didn't want to
touch it and break userspace. Perhaps we could add the field at the
end, but I didn't want to chance it (unless others say it's OK).

I ran these through zero-day-bot and compile tested these changes for
all architectures except for csky, which I do not have a cross compiler
for.

Note the following archs fail normal builds, but they fail the same
with these patches:

   arc
   h8300
   parisc64

Note, you may notice that I have "(Red Hat)" as the author of the
first three patches (even though they are signed off by "(VMware)").
This is because those patches were originally written while I was
working for Red Hat. But as I forward ported them while working for
VMware, my signed-off-by reflects that.

[1] - https://lore.kernel.org/lkml/20161107212634.529267...@goodmis.org/T/#u
[2] - https://lore.kernel.org/lkml/20190326151244.gc16...@redhat.com/T/#u

Steven Rostedt (Red Hat) (3):
  ptrace: Remove maxargs from task_current_syscall()
  tracing/syscalls: Pass in hardcoded 6 into syscall_get_arguments()
  syscalls: Remove start and number from syscall_get_arguments() args

Steven Rostedt (VMware) (1):
  syscalls: Remove start and number from syscall_set_arguments() args


 arch/arc/include/asm/syscall.h        |   7 +-
 arch/arm/include/asm/syscall.h        |  47 ++-
 arch/arm64/include/asm/syscall.h      |  46 ++-
 arch/c6x/include/asm/syscall.h        |  79 ---
 arch/csky/include/asm/syscall.h       |  26 ++-
 arch/h8300/include/asm/syscall.h      |  34 ++--
 arch/hexagon/include/asm/syscall.h    |   4 +-
 arch/ia64/include/asm/syscall.h       |  13 +---
 arch/ia64/kernel/ptrace.c             |   7 +-
 arch/microblaze/include/asm/syscall.h |   8 +-
 arch/mips/include/asm/syscall.h       |   3 +-
 arch/mips/kernel/ptrace.c             |   2 +-
 arch/nds32/include/asm/syscall.h      |  62 +++
 arch/nios2/include/asm/syscall.h      |  84
 arch/openrisc/include/asm/syscall.h   |  12 +--
 arch/parisc/include/asm/syscall.h     |  30 ++-
 arch/powerpc/include/asm/syscall.h    |  15 ++--
 arch/riscv/include/asm/syscall.h      |  24 ++
 arch/s390/include/asm/syscall.h       |  28 +++
 arch/sh/include/asm/syscall_32.h      |  47 +++
 arch/sh/include/asm/syscall_64.h      |   8 +-
 arch/sparc/include/asm/syscall.h      |  11 ++-
 arch/um/include/asm/syscall-generic.h |  78 +++
 arch/x86/include/asm/syscall.h        | 142 --
 arch/xtensa/include/asm/syscall.h     |  33 ++--
 fs/proc/base.c                        |  17 ++--
 include/asm-generic/syscall.h         |  21 ++---
 include/linux/ptrace.h                |  11 ++-
 include/trace/events/syscalls.h       |   2 +-
 kernel/seccomp.c                      |   2 +-
 kernel/trace/trace_syscalls.c         |   9 ++-
 lib/syscall.c                         |  57 ++
 32 files changed, 247 insertions(+), 722 deletions(-)
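
For readers who want a picture of the struct syscall_info mentioned in the cover letter, a rough userspace sketch follows. The cover letter only says that it holds seccomp_data plus a stack pointer (sp) field; the field order, the fixed-width types and the fill_syscall_info() helper below are assumptions made for illustration. seccomp_data_sketch copies the layout of the uapi struct seccomp_data so that the example builds without kernel headers.

```c
#include <stdint.h>
#include <string.h>

/* Same layout as struct seccomp_data in include/uapi/linux/seccomp.h. */
struct seccomp_data_sketch {
	int nr;                       /* system call number */
	uint32_t arch;                /* AUDIT_ARCH_* value */
	uint64_t instruction_pointer;
	uint64_t args[6];             /* always six argument slots */
};

/* Assumed shape: the untouched uapi data plus the extra stack pointer. */
struct syscall_info_sketch {
	uint64_t sp;
	struct seccomp_data_sketch data;
};

/* Hypothetical helper that fills the whole structure in one place. */
static void fill_syscall_info(struct syscall_info_sketch *info, int nr,
			      uint64_t sp, uint64_t ip, const uint64_t *args)
{
	memset(info, 0, sizeof(*info)); /* arch is left as 0 in this sketch */
	info->sp = sp;
	info->data.nr = nr;
	info->data.instruction_pointer = ip;
	memcpy(info->data.args, args, sizeof(info->data.args));
}
```

Wrapping seccomp_data instead of extending it keeps the userspace-visible layout unchanged, which is the concern raised in the cover letter.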


[RFC][PATCH 4/4 v2] syscalls: Remove start and number from syscall_set_arguments() args

2019-03-28 Thread Steven Rostedt
From: "Steven Rostedt (VMware)" 

After removing the start and count arguments of syscall_get_arguments(), it
seems reasonable to remove them from syscall_set_arguments() as well. Note
that as of today there are no users of syscall_set_arguments(), although we
are told that there will be soon. For now, at least make it consistent with
syscall_get_arguments().

Link: http://lkml.kernel.org/r/20190327222014.ga32...@altlinux.org

Cc: Oleg Nesterov 
Cc: Thomas Gleixner 
Cc: Kees Cook 
Cc: Andy Lutomirski 
Cc: Dominik Brodowski 
Cc: Dave Martin 
Cc: "Dmitry V. Levin" 
Cc: x...@kernel.org
Cc: linux-snps-...@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-c6x-...@linux-c6x.org
Cc: uclinux-h8-de...@lists.sourceforge.jp
Cc: linux-hexa...@vger.kernel.org
Cc: linux-i...@vger.kernel.org
Cc: linux-m...@vger.kernel.org
Cc: nios2-...@lists.rocketboards.org
Cc: openr...@lists.librecores.org
Cc: linux-par...@vger.kernel.org
Cc: linuxppc-...@lists.ozlabs.org
Cc: linux-ri...@lists.infradead.org
Cc: linux-s...@vger.kernel.org
Cc: linux...@vger.kernel.org
Cc: sparcli...@vger.kernel.org
Cc: linux...@lists.infradead.org
Cc: linux-xte...@linux-xtensa.org
Cc: linux-a...@vger.kernel.org
Signed-off-by: Steven Rostedt (VMware) 
---
 arch/arm/include/asm/syscall.h| 22 ++---
 arch/arm64/include/asm/syscall.h  | 22 ++---
 arch/c6x/include/asm/syscall.h| 38 +++
 arch/csky/include/asm/syscall.h   | 13 ++---
 arch/ia64/include/asm/syscall.h   | 10 ++--
 arch/ia64/kernel/ptrace.c |  7 ++-
 arch/microblaze/include/asm/syscall.h |  4 +-
 arch/nds32/include/asm/syscall.h  | 29 ++-
 arch/nios2/include/asm/syscall.h  | 42 +++-
 arch/openrisc/include/asm/syscall.h   |  6 +--
 arch/powerpc/include/asm/syscall.h|  7 +--
 arch/riscv/include/asm/syscall.h  | 12 ++---
 arch/s390/include/asm/syscall.h   | 11 ++---
 arch/sh/include/asm/syscall_32.h  | 21 +++-
 arch/sh/include/asm/syscall_64.h  |  4 +-
 arch/sparc/include/asm/syscall.h  |  7 ++-
 arch/um/include/asm/syscall-generic.h | 39 +++
 arch/x86/include/asm/syscall.h| 69 +++
 arch/xtensa/include/asm/syscall.h | 17 ++-
 include/asm-generic/syscall.h | 10 +---
 20 files changed, 88 insertions(+), 302 deletions(-)

diff --git a/arch/arm/include/asm/syscall.h b/arch/arm/include/asm/syscall.h
index db969a2972ae..080ce70cab12 100644
--- a/arch/arm/include/asm/syscall.h
+++ b/arch/arm/include/asm/syscall.h
@@ -65,26 +65,12 @@ static inline void syscall_get_arguments(struct task_struct *task,
 
 static inline void syscall_set_arguments(struct task_struct *task,
 struct pt_regs *regs,
-unsigned int i, unsigned int n,
 const unsigned long *args)
 {
-   if (n == 0)
-   return;
-
-   if (i + n > SYSCALL_MAX_ARGS) {
-   pr_warn("%s called with max args %d, handling only %d\n",
-   __func__, i + n, SYSCALL_MAX_ARGS);
-   n = SYSCALL_MAX_ARGS - i;
-   }
-
-   if (i == 0) {
-   regs->ARM_ORIG_r0 = args[0];
-   args++;
-   i++;
-   n--;
-   }
-
-   memcpy(&regs->ARM_r0 + i, args, n * sizeof(args[0]));
+   regs->ARM_ORIG_r0 = args[0];
+   args++;
+
+   memcpy(&regs->ARM_r0 + 1, args, 5 * sizeof(args[0]));
 }
 
 static inline int syscall_get_arch(void)
diff --git a/arch/arm64/include/asm/syscall.h b/arch/arm64/include/asm/syscall.h
index 55b2dab21023..a179df3674a1 100644
--- a/arch/arm64/include/asm/syscall.h
+++ b/arch/arm64/include/asm/syscall.h
@@ -75,26 +75,12 @@ static inline void syscall_get_arguments(struct task_struct *task,
 
 static inline void syscall_set_arguments(struct task_struct *task,
 struct pt_regs *regs,
-unsigned int i, unsigned int n,
 const unsigned long *args)
 {
-   if (n == 0)
-   return;
-
-   if (i + n > SYSCALL_MAX_ARGS) {
-   pr_warning("%s called with max args %d, handling only %d\n",
-  __func__, i + n, SYSCALL_MAX_ARGS);
-   n = SYSCALL_MAX_ARGS - i;
-   }
-
-   if (i == 0) {
-   regs->orig_x0 = args[0];
-   args++;
-   i++;
-   n--;
-   }
-
-   memcpy(&regs->regs[i], args, n * sizeof(args[0]));
+   regs->orig_x0 = args[0];
+   args++;
+
+   memcpy(&regs->regs[1], args, 5 * sizeof(args[0]));
 }
 
 /*
diff --git a/arch/c6x/include/asm/syscall.h b/arch/c6x/include/asm/syscall.h
index 06db3251926b..15ba8599858e 100644
--- a/arch/c6x/include/asm/syscall.h
+++ b/arch/c6x/include/asm/syscall.h
@@ -59,40 +59,14 @@ static inline void syscall_get_arguments(struct task_struc

[RFC][PATCH 3/4 v2] syscalls: Remove start and number from syscall_get_arguments() args

2019-03-28 Thread Steven Rostedt
From: "Steven Rostedt (Red Hat)" 

At Linux Plumbers, Andy Lutomirski approached me and pointed out that the
function call syscall_get_arguments() implemented in x86 was horribly
written and not optimized for the standard case of passing in 0 and 6 for
the starting index and the number of system call arguments to get. When
looking at all the users of this function, I discovered that all instances
pass in only 0 and 6 for these arguments. Instead of having this function
handle different cases that are never used, simply rewrite it to return the
first 6 arguments of a system call.

This should help out the performance of tracing system calls by ptrace,
ftrace and perf.
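
The simplification described above can be sketched in user space. This is
only an illustration under stated assumptions: struct fake_regs and both
fake_* helpers are hypothetical stand-ins, not the kernel's struct pt_regs
or any architecture's real implementation. It shows that once every caller
passes 0 and 6, the variable-range version collapses to a fixed copy.

```c
#include <assert.h>
#include <string.h>

#define FAKE_MAX_ARGS 6

/* Hypothetical stand-in for struct pt_regs; not any real architecture. */
struct fake_regs {
	unsigned long orig_a0;	/* first argument as saved at syscall entry */
	unsigned long regs[8];	/* regs[1]..regs[5] hold arguments 2..6 */
};

/* Old shape: the caller picks a start index i and a count n. */
void fake_get_arguments_old(const struct fake_regs *regs,
			    unsigned int i, unsigned int n,
			    unsigned long *args)
{
	if (n == 0 || i >= FAKE_MAX_ARGS)
		return;
	if (i + n > FAKE_MAX_ARGS)
		n = FAKE_MAX_ARGS - i;	/* clamp to the supported range */
	if (i == 0) {
		/* argument 0 lives in a separate saved slot */
		args[0] = regs->orig_a0;
		args++;
		i++;
		n--;
	}
	memcpy(args, &regs->regs[i], n * sizeof(args[0]));
}

/* New shape: always return the first six arguments. */
void fake_get_arguments(const struct fake_regs *regs, unsigned long *args)
{
	args[0] = regs->orig_a0;
	memcpy(&args[1], &regs->regs[1], 5 * sizeof(args[0]));
}

/* Returns 1 when both shapes produce the same six values for (0, 6). */
int fake_get_arguments_agree(void)
{
	struct fake_regs regs = { 10, { 0, 11, 12, 13, 14, 15, 0, 0 } };
	unsigned long a[FAKE_MAX_ARGS], b[FAKE_MAX_ARGS];

	fake_get_arguments_old(&regs, 0, 6, a);
	fake_get_arguments(&regs, b);
	assert(a[0] == 10);
	return memcmp(a, b, sizeof(a)) == 0;
}
```

The fixed version has no branches, which is the property the commit message
relies on for the tracing fast path.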

Link: http://lkml.kernel.org/r/20161107213233.754809...@goodmis.org

Cc: Oleg Nesterov 
Cc: Thomas Gleixner 
Cc: Kees Cook 
Cc: Andy Lutomirski 
Cc: Dominik Brodowski 
Cc: Dave Martin 
Cc: "Dmitry V. Levin" 
Cc: x...@kernel.org
Cc: linux-snps-...@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-c6x-...@linux-c6x.org
Cc: uclinux-h8-de...@lists.sourceforge.jp
Cc: linux-hexa...@vger.kernel.org
Cc: linux-i...@vger.kernel.org
Cc: linux-m...@vger.kernel.org
Cc: nios2-...@lists.rocketboards.org
Cc: openr...@lists.librecores.org
Cc: linux-par...@vger.kernel.org
Cc: linuxppc-...@lists.ozlabs.org
Cc: linux-ri...@lists.infradead.org
Cc: linux-s...@vger.kernel.org
Cc: linux...@vger.kernel.org
Cc: sparcli...@vger.kernel.org
Cc: linux...@lists.infradead.org
Cc: linux-xte...@linux-xtensa.org
Cc: linux-a...@vger.kernel.org
Reported-by: Andy Lutomirski 
Signed-off-by: Steven Rostedt (VMware) 
---
 arch/arc/include/asm/syscall.h|  7 ++-
 arch/arm/include/asm/syscall.h| 23 ++---
 arch/arm64/include/asm/syscall.h  | 22 ++--
 arch/c6x/include/asm/syscall.h| 41 +++
 arch/csky/include/asm/syscall.h   | 13 ++---
 arch/h8300/include/asm/syscall.h  | 34 +++--
 arch/hexagon/include/asm/syscall.h|  4 +-
 arch/ia64/include/asm/syscall.h   |  5 +-
 arch/microblaze/include/asm/syscall.h |  4 +-
 arch/mips/include/asm/syscall.h   |  3 +-
 arch/mips/kernel/ptrace.c |  2 +-
 arch/nds32/include/asm/syscall.h  | 33 +++-
 arch/nios2/include/asm/syscall.h  | 42 +++
 arch/openrisc/include/asm/syscall.h   |  6 +--
 arch/parisc/include/asm/syscall.h | 30 +++
 arch/powerpc/include/asm/syscall.h|  8 ++-
 arch/riscv/include/asm/syscall.h  | 12 ++---
 arch/s390/include/asm/syscall.h   | 17 ++-
 arch/sh/include/asm/syscall_32.h  | 26 +++---
 arch/sh/include/asm/syscall_64.h  |  4 +-
 arch/sparc/include/asm/syscall.h  |  4 +-
 arch/um/include/asm/syscall-generic.h | 39 +++---
 arch/x86/include/asm/syscall.h| 73 +++
 arch/xtensa/include/asm/syscall.h | 16 ++
 include/asm-generic/syscall.h | 11 ++--
 include/trace/events/syscalls.h   |  2 +-
 kernel/seccomp.c  |  2 +-
 kernel/trace/trace_syscalls.c |  4 +-
 lib/syscall.c |  2 +-
 29 files changed, 113 insertions(+), 376 deletions(-)

diff --git a/arch/arc/include/asm/syscall.h b/arch/arc/include/asm/syscall.h
index 29de09804306..c7a4201ed62b 100644
--- a/arch/arc/include/asm/syscall.h
+++ b/arch/arc/include/asm/syscall.h
@@ -55,12 +55,11 @@ syscall_set_return_value(struct task_struct *task, struct pt_regs *regs,
  */
 static inline void
 syscall_get_arguments(struct task_struct *task, struct pt_regs *regs,
- unsigned int i, unsigned int n, unsigned long *args)
+ unsigned long *args)
 {
unsigned long *inside_ptregs = &(regs->r0);
-   inside_ptregs -= i;
-
-   BUG_ON((i + n) > 6);
+   unsigned int n = 6;
+   unsigned int i = 0;
 
while (n--) {
args[i++] = (*inside_ptregs);
diff --git a/arch/arm/include/asm/syscall.h b/arch/arm/include/asm/syscall.h
index 06dea6bce293..db969a2972ae 100644
--- a/arch/arm/include/asm/syscall.h
+++ b/arch/arm/include/asm/syscall.h
@@ -55,29 +55,12 @@ static inline void syscall_set_return_value(struct task_struct *task,
 
 static inline void syscall_get_arguments(struct task_struct *task,
 struct pt_regs *regs,
-unsigned int i, unsigned int n,
 unsigned long *args)
 {
-   if (n == 0)
-   return;
-
-   if (i + n > SYSCALL_MAX_ARGS) {
-   unsigned long *args_bad = args + SYSCALL_MAX_ARGS - i;
-   unsigned int n_bad = n + i - SYSCALL_MAX_ARGS;
-   pr_warn("%s called with max args %d, handling only %d\n",
-   __func__, i + n, SYSCALL_MAX_ARGS);
-   memset(args_bad, 0, n_bad * sizeof(args[0]));
-   n = SYSCALL_MAX_ARGS - i;
-   }
-
-   if (i == 0) {
-   args[0] = regs->ARM_ORIG_r0;
-

[RFC][PATCH 1/4 v2] ptrace: Remove maxargs from task_current_syscall()

2019-03-28 Thread Steven Rostedt
From: "Steven Rostedt (Red Hat)" 

task_current_syscall() has a single user that passes in 6 for maxargs, which
is the maximum number of system call arguments that syscall_get_arguments()
can return. Instead of passing in a number of arguments to grab, just get 6
arguments. The args argument even specifies that it's an array of 6 items.

This will also allow changing syscall_get_arguments() to not get a variable
number of arguments, but always grab 6.

Linus also suggested not passing in a bunch of arguments to
task_current_syscall() but to instead pass in a pointer to a structure, and
just fill the structure. struct seccomp_data has almost all the parameters
that are needed except for the stack pointer (sp). As seccomp_data is part of
uapi, and I'm afraid to change it, a new structure, "syscall_info", was
created, which includes seccomp_data and adds the "sp" field.
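
A minimal user-space sketch of that wrapper structure follows. It assumes
the layout shown in the patch (sp plus an embedded seccomp_data); the fake_*
types and helpers are hypothetical stand-ins so the example compiles without
kernel headers, and the formatting helper only mimics what proc_pid_syscall()
prints.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Stand-in with the same fields as the uapi struct seccomp_data. */
struct fake_seccomp_data {
	int nr;
	uint32_t arch;
	uint64_t instruction_pointer;
	uint64_t args[6];
};

/* Mirrors the patch: wrap seccomp_data and add the stack pointer. */
struct fake_syscall_info {
	uint64_t sp;
	struct fake_seccomp_data data;
};

/* Models the "task has no stack" path: zero everything, nr = -1. */
void fake_mark_not_in_syscall(struct fake_syscall_info *info)
{
	memset(info, 0, sizeof(*info));
	info->data.nr = -1;
}

/* Formats one line in the style of /proc/<pid>/syscall. */
int fake_format_syscall(const struct fake_syscall_info *info,
			char *buf, size_t len)
{
	const uint64_t *a = info->data.args;

	assert(info != NULL && buf != NULL);
	if (info->data.nr < 0)
		return snprintf(buf, len, "%d 0x%" PRIx64 " 0x%" PRIx64 "\n",
				info->data.nr, info->sp,
				info->data.instruction_pointer);
	return snprintf(buf, len,
			"%d 0x%" PRIx64 " 0x%" PRIx64 " 0x%" PRIx64
			" 0x%" PRIx64 " 0x%" PRIx64 " 0x%" PRIx64
			" 0x%" PRIx64 " 0x%" PRIx64 "\n",
			info->data.nr, a[0], a[1], a[2], a[3], a[4], a[5],
			info->sp, info->data.instruction_pointer);
}

/* Returns 0 when a task outside a syscall formats as expected. */
int fake_syscall_info_example(void)
{
	struct fake_syscall_info info;
	char buf[256];

	fake_mark_not_in_syscall(&info);
	fake_format_syscall(&info, buf, sizeof(buf));
	return strcmp(buf, "-1 0x0 0x0\n");
}
```

Passing one pointer instead of five out-parameters also means a single
memset() can clear every field, as the patched collect_syscall() does.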

Link: http://lkml.kernel.org/r/20161107213233.466776...@goodmis.org

Cc: Thomas Gleixner 
Cc: Andy Lutomirski 
Cc: Alexey Dobriyan 
Cc: Oleg Nesterov 
Cc: Kees Cook 
Cc: Al Viro 
Cc: linux-fsde...@vger.kernel.org
Signed-off-by: Steven Rostedt (VMware) 
---
 fs/proc/base.c | 17 +++--
 include/linux/ptrace.h | 11 +---
 lib/syscall.c  | 57 ++
 3 files changed, 42 insertions(+), 43 deletions(-)

diff --git a/fs/proc/base.c b/fs/proc/base.c
index ddef482f1334..6a803a0b75df 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -616,24 +616,25 @@ static int proc_pid_limits(struct seq_file *m, struct pid_namespace *ns,
 static int proc_pid_syscall(struct seq_file *m, struct pid_namespace *ns,
struct pid *pid, struct task_struct *task)
 {
-   long nr;
-   unsigned long args[6], sp, pc;
+   struct syscall_info info;
+   u64 *args = &info.data.args[0];
int res;
 
res = lock_trace(task);
if (res)
return res;
 
-   if (task_current_syscall(task, &nr, args, 6, &sp, &pc))
+   if (task_current_syscall(task, &info))
seq_puts(m, "running\n");
-   else if (nr < 0)
-   seq_printf(m, "%ld 0x%lx 0x%lx\n", nr, sp, pc);
+   else if (info.data.nr < 0)
+   seq_printf(m, "%d 0x%llx 0x%llx\n",
+  info.data.nr, info.sp, info.data.instruction_pointer);
else
seq_printf(m,
-  "%ld 0x%lx 0x%lx 0x%lx 0x%lx 0x%lx 0x%lx 0x%lx 0x%lx\n",
-  nr,
+  "%d 0x%llx 0x%llx 0x%llx 0x%llx 0x%llx 0x%llx 0x%llx 0x%llx\n",
+  info.data.nr,
   args[0], args[1], args[2], args[3], args[4], args[5],
-  sp, pc);
+  info.sp, info.data.instruction_pointer);
unlock_trace(task);
 
return 0;
diff --git a/include/linux/ptrace.h b/include/linux/ptrace.h
index edb9b040c94c..d5084ebd9f03 100644
--- a/include/linux/ptrace.h
+++ b/include/linux/ptrace.h
@@ -9,6 +9,13 @@
 #include  /* For BUG_ON.  */
 #include/* For task_active_pid_ns.  */
 #include 
+#include 
+
+/* Add sp to seccomp_data, as seccomp is user API, we don't want to modify it */
+struct syscall_info {
+   __u64   sp;
+   struct seccomp_data data;
+};
 
 extern int ptrace_access_vm(struct task_struct *tsk, unsigned long addr,
void *buf, int len, unsigned int gup_flags);
@@ -407,9 +414,7 @@ static inline void user_single_step_report(struct pt_regs *regs)
 #define current_user_stack_pointer() user_stack_pointer(current_pt_regs())
 #endif
 
-extern int task_current_syscall(struct task_struct *target, long *callno,
-   unsigned long args[6], unsigned int maxargs,
-   unsigned long *sp, unsigned long *pc);
+extern int task_current_syscall(struct task_struct *target, struct syscall_info *info);
 
 extern void sigaction_compat_abi(struct k_sigaction *act, struct k_sigaction *oact);
 #endif
diff --git a/lib/syscall.c b/lib/syscall.c
index 1a7077f20eae..e8467e17b9a2 100644
--- a/lib/syscall.c
+++ b/lib/syscall.c
@@ -5,16 +5,14 @@
 #include 
 #include 
 
-static int collect_syscall(struct task_struct *target, long *callno,
-  unsigned long args[6], unsigned int maxargs,
-  unsigned long *sp, unsigned long *pc)
+static int collect_syscall(struct task_struct *target, struct syscall_info *info)
 {
struct pt_regs *regs;
 
if (!try_get_task_stack(target)) {
/* Task has no stack, so the task isn't in a syscall. */
-   *sp = *pc = 0;
-   *callno = -1;
+   memset(info, 0, sizeof(*info));
+   info->data.nr = -1;
return 0;
}
 
@@ -24,12 +22,13 @@ static int collect_syscall(struct task_struct *target, long *callno,
return -EAGAIN;
}
 
-   *sp =

linux-next: Tree for Mar 29

2019-03-28 Thread Stephen Rothwell
Hi all,

Changes since 20190328:

The pidfd tree lost its build failures.

Non-merge commits (relative to Linus' tree): 3597
 3205 files changed, 100345 insertions(+), 49777 deletions(-)



I have created today's linux-next tree at
git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
(patches at http://www.kernel.org/pub/linux/kernel/next/ ).  If you
are tracking the linux-next tree using git, you should not use "git pull"
to do so as that will try to merge the new linux-next release with the
old one.  You should use "git fetch" and checkout or reset to the new
master.

You can see which trees have been included by looking in the Next/Trees
file in the source.  There are also quilt-import.log and merge.log
files in the Next directory.  Between each merge, the tree was built
with a ppc64_defconfig for powerpc, an allmodconfig for x86_64, a
multi_v7_defconfig for arm and a native build of tools/perf. After
the final fixups (if any), I do an x86_64 modules_install followed by
builds for x86_64 allnoconfig, powerpc allnoconfig (32 and 64 bit),
ppc44x_defconfig, allyesconfig and pseries_le_defconfig and i386, sparc
and sparc64 defconfig. And finally, a simple boot test of the powerpc
pseries_le_defconfig kernel in qemu (with and without kvm enabled).

Below is a summary of the state of the merge.

I am currently merging 299 trees (counting Linus' and 70 trees of bug
fix patches pending for the current merge release).

Stats about the size of the tree over time can be seen at
http://neuling.org/linux-next-size.html .

Status of my local build tests will be at
http://kisskb.ellerman.id.au/linux-next .  If maintainers want to give
advice about cross compilers/configs that work, we are always open to add
more builds.

Thanks to Randy Dunlap for doing many randconfig builds.  And to Paul
Gortmaker for triage and bug fixes.

-- 
Cheers,
Stephen Rothwell

$ git checkout master
$ git reset --hard stable
Merging origin/master (8c7ae38d1ce1 afs: Fix StoreData op marshalling)
Merging fixes/master (b352face4ca9 adfs: mark expected switch fall-throughs)
Merging kspp-gustavo/for-next/kspp (1f7ae812f87e x86/syscalls: Mark expected switch fall-throughs)
Merging kbuild-current/fixes (54a7151b1496 kbuild: modversions: Fix relative CRC byte order interpretation)
Merging arc-current/for-curr (172fe06c57b8 ARC: ioc: diasble ioc if HIGHMEM/PAE iso panic)
Merging arm-current/fixes (d410a8a49e3e ARM: 8849/1: NOMMU: Fix encodings for PMSAv8's PRBAR4/PRLAR4)
Merging arm64-fixes/for-next/fixes (9e0a17db517d arm64: replace memblock_alloc_low with memblock_alloc)
Merging m68k-current/for-linus (28713169d879 m68k: Add -ffreestanding to CFLAGS)
Merging powerpc-fixes/fixes (92edf8df0ff2 powerpc/security: Fix spectre_v2 reporting)
Merging sparc/master (7d762d69145a afs: Fix manually set volume location server list)
Merging fscrypt-current/for-stable (ae64f9bd1d36 Linux 4.15-rc2)
Merging net/master (d3332184f1e9 Merge tag 'batadv-net-for-davem-20190328' of git://git.open-mesh.org/linux-merge)
Merging bpf/master (8543e4378079 bpf, libbpf: fix quiet install_headers)
Merging ipsec/master (8dfb4eba4100 esp4: add length check for UDP encapsulation)
Merging netfilter/master (5f543a54eec0 net: hns3: fix for not calculating tx bd num correctly)
Merging ipvs/master (b2e3d68d1251 netfilter: nft_compat: destroy function must not have side effects)
Merging wireless-drivers/master (1017e0987117 vrf: prevent adding upper devices)
Merging mac80211/master (d235c48b40d3 net: dsa: mv88e6xxx: power serdes on/off for 10G interfaces on 6390X)
Merging rdma-fixes/for-rc (1abe186ed8a6 IB/mlx5: Reset access mask when looping inside page fault handler)
Merging sound-current/for-linus (e2a829b3da01 ALSA: hda/realtek - Fix speakers on Acer Predator Helios 500 Ryzen laptops)
Merging sound-asoc-fixes/for-linus (facc6b730db0 Merge branch 'asoc-5.1' into asoc-linus)
Merging regmap-fixes/for-linus (9e98c678c2d6 Linux 5.1-rc1)
Merging regulator-fixes/for-linus (6b0c8dbad170 Merge branch 'regulator-5.1' into regulator-linus)
Merging spi-fixes/for-linus (3f591b792427 Merge branch 'spi-5.1' into spi-linus)
Merging pci-current/for-linus (0fa635aec9ab PCI/LINK: Deduplicate bandwidth reports for multi-function devices)
Merging driver-core.current/driver-core-linus (cd1b772d4881 driver core: remove BUS_ATTR())
Merging tty.current/tty-linus (f4e68d58cf2b tty: fix NULL pointer issue when tty_port ops is not set)
Merging usb.current/usb-linus (f276e002793c usb: u132-hcd: fix resource leak)
Merging usb-gadget-fixes/fixes (072684e8c58d USB: gadget: f_hid: fix deadlock in f_hidg_write())
Merging usb-serial-fixes/usb-linus (84f3b43f7378 USB: serial: option: add Olicard 600)
Merging usb-chipidea-fixes/ci-for-usb-stable (d6d768a0ec3c usb: chipidea: fix static checker warning for NULL pointer)
Merging

Re: mmotm 2019-03-28-15-50 uploaded (gcov)

2019-03-28 Thread Randy Dunlap
On 3/28/19 3:51 PM, a...@linux-foundation.org wrote:
> The mm-of-the-moment snapshot 2019-03-28-15-50 has been uploaded to
> 
>http://www.ozlabs.org/~akpm/mmotm/
> 
> mmotm-readme.txt says
> 
> README for mm-of-the-moment:
> 
> http://www.ozlabs.org/~akpm/mmotm/
> 
> This is a snapshot of my -mm patch queue.  Uploaded at random hopefully
> more than once a week.



when # CONFIG_MODULES is not set:

  CC  kernel/gcov/gcc_4_7.o
../kernel/gcov/gcc_4_7.c: In function ‘gcov_info_within_module’:
../kernel/gcov/gcc_4_7.c:162:2: error: implicit declaration of function ‘within_module’ [-Werror=implicit-function-declaration]
  return within_module((unsigned long)info, mod);
  ^


-- 
~Randy


Re: [PATCH v3 1/3] dt-bindings: arm: fsl: Add supported ZII VF610 boards to DT schema

2019-03-28 Thread Shawn Guo
On Mon, Mar 25, 2019 at 11:22:41AM -0700, Andrey Smirnov wrote:
> Add already supported ZII VF610 boards to DT schema.
> 
> Signed-off-by: Andrey Smirnov 
> Cc: Shawn Guo 
> Cc: Chris Healy 
> Cc: Andrew Lunn 
> Cc: Fabio Estevam 
> Cc: Rob Herring 
> Cc: linux-arm-ker...@lists.infradead.org
> Cc: linux-kernel@vger.kernel.org
> Cc: devicet...@vger.kernel.org

Applied all, thanks.


kernfs: can read/write method grow buffer size?

2019-03-28 Thread Marek Behun
Hello Tejun and Greg,

kernfs_fop_open/read/write allocates a buffer for the ->read, ->write,
or ->seq_read methods. This buffer is either preallocated or allocated
on the spot, with minimum size being PAGE_SIZE, if ->atomic_write_len
is not given.

The led-trigger API currently has a problem: in some specific scenarios the
PAGE_SIZE buffer is too short.
(On read, the trigger file returns a space-separated list of all supported
triggers, with the currently chosen one specially marked. The cpu
activity trigger lists "cpu%i" for every CPU core, which actually broke
on some machines with a very large number of CPUs. Granted, this could
have been solved another way (and maybe will be), but we are now
discussing an API for HW LED triggers, which can raise the problem anyway
if a specific LED controller supports too many HW LED triggers.)

Is it allowed to grow this buffer if needed, either via krealloc or by
creating a special function in kernfs API which does this so that
led-trigger could use it?

Or is this completely forbidden?

Thank you.

Marek
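
To make the size problem above concrete, here is a rough arithmetic sketch.
It assumes a 4096-byte PAGE_SIZE and counts only the "cpu%u " entries, one
per core, ignoring every other trigger and the marker around the selected
one; the function names are hypothetical and nothing here is kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Assumption: 4 KiB pages; other configurations use larger pages. */
#define ASSUMED_PAGE_SIZE 4096u

/* Bytes needed to list "cpu0 cpu1 ... cpu<N-1> " with trailing spaces. */
size_t cpu_trigger_list_len(unsigned int ncpus)
{
	size_t total = 0;
	unsigned int i;

	for (i = 0; i < ncpus; i++) {
		/* snprintf(NULL, 0, ...) returns the length it would write */
		int n = snprintf(NULL, 0, "cpu%u ", i);

		assert(n > 0);
		total += (size_t)n;
	}
	return total;
}

/* Returns 1 if the cpu entries alone fit below the assumed page size. */
int cpu_trigger_list_fits(unsigned int ncpus)
{
	return cpu_trigger_list_len(ncpus) < ASSUMED_PAGE_SIZE;
}
```

Under these assumptions, 1024 cores need 7082 bytes for the cpu entries
alone, well past a single 4096-byte page.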


[PATCH v14 10/11] gpio: 74x164: Utilize the for_each_set_clump8 macro

2019-03-28 Thread William Breathitt Gray
Replace verbose implementation in set_multiple callback with
for_each_set_clump8 macro to simplify code and improve clarity.
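
For readers unfamiliar with the idea, the following user-space model shows
what iterating over set 8-bit clumps means. It is a sketch only: clump_get8(),
next_set_clump8() and model_set_multiple() are hypothetical names written for
this example, not the helpers added by this series, and the model assumes the
bit count is a multiple of 8.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define BITS_PER_ULONG (sizeof(unsigned long) * CHAR_BIT)

/* Read the 8-bit clump that starts at bit `start` (a multiple of 8). */
unsigned long clump_get8(const unsigned long *bitmap, unsigned long start)
{
	return (bitmap[start / BITS_PER_ULONG] >> (start % BITS_PER_ULONG)) & 0xFF;
}

/* Find the next nonzero clump at or after `offset`; nbits if none. */
unsigned long next_set_clump8(unsigned long *clump,
			      const unsigned long *bitmap,
			      unsigned long nbits, unsigned long offset)
{
	for (; offset < nbits; offset += 8) {
		*clump = clump_get8(bitmap, offset);
		if (*clump)
			return offset;
	}
	return nbits;
}

/* Models the patched loop: one shift-register byte per masked clump. */
void model_set_multiple(unsigned char *buffer, size_t registers,
			const unsigned long *mask, const unsigned long *bits)
{
	unsigned long nbits = registers * 8;
	unsigned long offset, bankmask;

	assert(buffer != NULL && mask != NULL && bits != NULL);
	for (offset = next_set_clump8(&bankmask, mask, nbits, 0);
	     offset < nbits;
	     offset = next_set_clump8(&bankmask, mask, nbits, offset + 8)) {
		/* the last register in the chain holds the lowest GPIOs */
		size_t bank = registers - 1 - offset / 8;
		unsigned long bitmask = clump_get8(bits, offset) & bankmask;

		buffer[bank] = (unsigned char)((buffer[bank] & ~bankmask) | bitmask);
	}
}

/* Returns 0 when only the masked bits of the selected bank change. */
int clump8_example(void)
{
	unsigned char buffer[2] = { 0xFF, 0x00 };
	const unsigned long mask[1] = { 0x00F0UL };
	const unsigned long bits[1] = { 0x0050UL };

	model_set_multiple(buffer, 2, mask, bits);
	return !(buffer[0] == 0xFF && buffer[1] == 0x50);
}
```

Banks whose mask byte is zero are never visited, which is how the loop drops
the explicit "if (!bankmask) continue;" check.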

Suggested-by: Andy Shevchenko 
Cc: Geert Uytterhoeven 
Cc: Phil Reid 
Signed-off-by: William Breathitt Gray 
---
 drivers/gpio/gpio-74x164.c | 19 +--
 1 file changed, 9 insertions(+), 10 deletions(-)

diff --git a/drivers/gpio/gpio-74x164.c b/drivers/gpio/gpio-74x164.c
index fb7b620763a2..de1e8b37e102 100644
--- a/drivers/gpio/gpio-74x164.c
+++ b/drivers/gpio/gpio-74x164.c
@@ -9,6 +9,7 @@
  *  published by the Free Software Foundation.
  */
 
+#include 
 #include 
 #include 
 #include 
@@ -75,20 +76,18 @@ static void gen_74x164_set_multiple(struct gpio_chip *gc, unsigned long *mask,
unsigned long *bits)
 {
struct gen_74x164_chip *chip = gpiochip_get_data(gc);
-   unsigned int i, idx, shift;
-   u8 bank, bankmask;
+   unsigned long offset;
+   unsigned long bankmask;
+   size_t bank;
+   unsigned long bitmask;
 
mutex_lock(&chip->lock);
-   for (i = 0, bank = chip->registers - 1; i < chip->registers;
-i++, bank--) {
-   idx = i / sizeof(*mask);
-   shift = i % sizeof(*mask) * BITS_PER_BYTE;
-   bankmask = mask[idx] >> shift;
-   if (!bankmask)
-   continue;
+   for_each_set_clump8(offset, bankmask, mask, chip->registers * 8) {
+   bank = chip->registers - 1 - offset / 8;
+   bitmask = bitmap_get_value8(bits, offset) & bankmask;
 
chip->buffer[bank] &= ~bankmask;
-   chip->buffer[bank] |= bankmask & (bits[idx] >> shift);
+   chip->buffer[bank] |= bitmask;
}
__gen_74x164_write_config(chip);
mutex_unlock(&chip->lock);
-- 
2.21.0



[PATCH v14 11/11] thermal: intel: intel_soc_dts_iosf: Utilize for_each_set_clump8 macro

2019-03-28 Thread William Breathitt Gray
Utilize for_each_set_clump8 macro, and the bitmap_set_value8 and
bitmap_get_value8 functions, where appropriate. In addition, remove the
now unnecessary temp_mask and temp_shift members of the
intel_soc_dts_sensor_entry structure.
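
The equivalence this change relies on can be checked with a small user-space
model. The model_* helpers below are assumptions written for illustration and
are not the kernel's bitmap_set_value8()/bitmap_get_value8(); the sketch only
shows that writing one byte-wide field through such helpers matches the
open-coded mask-and-shift it replaces.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

#define BITS_PER_ULONG (sizeof(unsigned long) * CHAR_BIT)

/* Overwrite the 8-bit field at bit `start` (a multiple of 8). */
void model_set_value8(unsigned long *bitmap, unsigned long value,
		      unsigned long start)
{
	size_t index = start / BITS_PER_ULONG;
	unsigned long shift = start % BITS_PER_ULONG;

	bitmap[index] &= ~(0xFFUL << shift);
	bitmap[index] |= (value & 0xFF) << shift;
}

/* Read the 8-bit field at bit `start` (a multiple of 8). */
unsigned long model_get_value8(const unsigned long *bitmap,
			       unsigned long start)
{
	return (bitmap[start / BITS_PER_ULONG] >> (start % BITS_PER_ULONG)) & 0xFF;
}

/* The removed code: explicit mask and shift on a 32-bit register. */
uint32_t update_threshold_open_coded(uint32_t store, uint32_t temp,
				     unsigned int idx)
{
	uint32_t out = store & ~((uint32_t)0xFF << (idx * 8));

	out |= (temp & 0xFF) << (idx * 8);
	return out;
}

/* The replacement shape: copy to a word, set one clump, copy back. */
uint32_t update_threshold_clump(uint32_t store, uint32_t temp,
				unsigned int idx)
{
	unsigned long word = store;

	assert(idx < 4);
	model_set_value8(&word, temp & 0xFF, idx * 8);
	return (uint32_t)word;
}

/* Reads the byte for sensor `id`, replacing temp_mask/temp_shift. */
uint32_t read_sensor_byte(uint32_t reg, unsigned int id)
{
	unsigned long word = reg;

	return (uint32_t)model_get_value8(&word, id * 8);
}
```

Because the field position is always id * 8, the per-sensor mask and shift
no longer need to be stored, which is why the two structure members can go.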

Suggested-by: Andy Shevchenko 
Tested-by: Andy Shevchenko 
Signed-off-by: William Breathitt Gray 
---
 drivers/thermal/intel/intel_soc_dts_iosf.c | 29 +-
 drivers/thermal/intel/intel_soc_dts_iosf.h |  2 --
 2 files changed, 17 insertions(+), 14 deletions(-)

diff --git a/drivers/thermal/intel/intel_soc_dts_iosf.c b/drivers/thermal/intel/intel_soc_dts_iosf.c
index e0813dfaa278..2aa16b5262e9 100644
--- a/drivers/thermal/intel/intel_soc_dts_iosf.c
+++ b/drivers/thermal/intel/intel_soc_dts_iosf.c
@@ -15,6 +15,7 @@
 
 #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
 
+#include 
 #include 
 #include 
 #include 
@@ -111,6 +112,7 @@ static int update_trip_temp(struct intel_soc_dts_sensor_entry *dts,
 {
int status;
u32 temp_out;
+   unsigned long update_ptps;
u32 out;
u32 store_ptps;
u32 store_ptmc;
@@ -129,8 +131,9 @@ static int update_trip_temp(struct intel_soc_dts_sensor_entry *dts,
if (status)
return status;
 
-   out = (store_ptps & ~(0xFF << (thres_index * 8)));
-   out |= (temp_out & 0xFF) << (thres_index * 8);
+   update_ptps = store_ptps;
+   bitmap_set_value8(&update_ptps, temp_out & 0xFF, thres_index * 8);
+   out = update_ptps;
status = iosf_mbi_write(BT_MBI_UNIT_PMC, MBI_REG_WRITE,
SOC_DTS_OFFSET_PTPS, out);
if (status)
@@ -232,6 +235,7 @@ static int sys_get_curr_temp(struct thermal_zone_device *tzd,
u32 out;
struct intel_soc_dts_sensor_entry *dts;
struct intel_soc_dts_sensors *sensors;
+   unsigned long temp_raw;
 
dts = tzd->devdata;
sensors = dts->sensors;
@@ -240,7 +244,8 @@ static int sys_get_curr_temp(struct thermal_zone_device *tzd,
if (status)
return status;
 
-   out = (out & dts->temp_mask) >> dts->temp_shift;
+   temp_raw = out;
+   out = bitmap_get_value8(&temp_raw, dts->id * 8);
out -= SOC_DTS_TJMAX_ENCODING;
*temp = sensors->tj_max - out * 1000;
 
@@ -290,10 +295,13 @@ static int add_dts_thermal_zone(int id, struct intel_soc_dts_sensor_entry *dts,
 {
char name[10];
int trip_count = 0;
+   int writable_trip_count = 0;
int trip_mask = 0;
u32 store_ptps;
int ret;
-   int i;
+   unsigned long i;
+   unsigned long trip;
+   unsigned long ptps;
 
/* Store status to restor on exit */
ret = iosf_mbi_read(BT_MBI_UNIT_PMC, MBI_REG_READ,
@@ -302,11 +310,10 @@ static int add_dts_thermal_zone(int id, struct intel_soc_dts_sensor_entry *dts,
goto err_ret;
 
dts->id = id;
-   dts->temp_mask = 0x00FF << (id * 8);
-   dts->temp_shift = id * 8;
if (notification_support) {
trip_count = min(SOC_MAX_DTS_TRIPS, trip_cnt);
-   trip_mask = BIT(trip_count - read_only_trip_cnt) - 1;
+   writable_trip_count = trip_count - read_only_trip_cnt;
+   trip_mask = GENMASK(writable_trip_count - 1, 0);
}
 
/* Check if the writable trip we provide is not used by BIOS */
@@ -315,11 +322,9 @@ static int add_dts_thermal_zone(int id, struct intel_soc_dts_sensor_entry *dts,
if (ret)
trip_mask = 0;
else {
-   for (i = 0; i < trip_count; ++i) {
-   if (trip_mask & BIT(i))
-   if (store_ptps & (0xff << (i * 8)))
-   trip_mask &= ~BIT(i);
-   }
+   ptps = store_ptps;
+   for_each_set_clump8(i, trip, &ptps, writable_trip_count * 8)
+   trip_mask &= ~BIT(i / 8);
}
dts->trip_mask = trip_mask;
dts->trip_count = trip_count;
diff --git a/drivers/thermal/intel/intel_soc_dts_iosf.h b/drivers/thermal/intel/intel_soc_dts_iosf.h
index 625e37bf93dc..d0362d7acdef 100644
--- a/drivers/thermal/intel/intel_soc_dts_iosf.h
+++ b/drivers/thermal/intel/intel_soc_dts_iosf.h
@@ -33,8 +33,6 @@ struct intel_soc_dts_sensors;
 
 struct intel_soc_dts_sensor_entry {
int id;
-   u32 temp_mask;
-   u32 temp_shift;
u32 store_status;
u32 trip_mask;
u32 trip_count;
-- 
2.21.0



[PATCH v14 09/11] gpio: uniphier: Utilize for_each_set_clump8 macro

2019-03-28 Thread William Breathitt Gray
Replace verbose implementation in set_multiple callback with
for_each_set_clump8 macro to simplify code and improve clarity. An
improvement in this case is that banks that are not masked will now be
skipped.

Cc: Masahiro Yamada 
Signed-off-by: William Breathitt Gray 
---
 drivers/gpio/gpio-uniphier.c | 16 ++--
 1 file changed, 6 insertions(+), 10 deletions(-)

diff --git a/drivers/gpio/gpio-uniphier.c b/drivers/gpio/gpio-uniphier.c
index 0f662b297a95..d79c34e9b23b 100644
--- a/drivers/gpio/gpio-uniphier.c
+++ b/drivers/gpio/gpio-uniphier.c
@@ -15,9 +15,6 @@
 #include 
 #include 
 
-#define UNIPHIER_GPIO_BANK_MASK\
-   GENMASK((UNIPHIER_GPIO_LINES_PER_BANK) - 1, 0)
-
 #define UNIPHIER_GPIO_IRQ_MAX_NUM  24
 
 #define UNIPHIER_GPIO_PORT_DATA0x0 /* data */
@@ -147,15 +144,14 @@ static void uniphier_gpio_set(struct gpio_chip *chip,
 static void uniphier_gpio_set_multiple(struct gpio_chip *chip,
   unsigned long *mask, unsigned long *bits)
 {
-   unsigned int bank, shift, bank_mask, bank_bits;
-   int i;
+   unsigned long i;
+   unsigned long bank_mask;
+   unsigned long bank;
+   unsigned long bank_bits;
 
-   for (i = 0; i < chip->ngpio; i += UNIPHIER_GPIO_LINES_PER_BANK) {
+   for_each_set_clump8(i, bank_mask, mask, chip->ngpio) {
bank = i / UNIPHIER_GPIO_LINES_PER_BANK;
-   shift = i % BITS_PER_LONG;
-   bank_mask = (mask[BIT_WORD(i)] >> shift) &
-   UNIPHIER_GPIO_BANK_MASK;
-   bank_bits = bits[BIT_WORD(i)] >> shift;
+   bank_bits = bitmap_get_value8(bits, i);
 
uniphier_gpio_bank_write(chip, bank, UNIPHIER_GPIO_PORT_DATA,
 bank_mask, bank_bits);
-- 
2.21.0



[PATCH v14 08/11] gpio: pcie-idio-24: Utilize for_each_set_clump8 macro

2019-03-28 Thread William Breathitt Gray
Replace verbose implementation in get_multiple/set_multiple callbacks
with for_each_set_clump8 macro to simplify code and improve clarity.

Reviewed-by: Linus Walleij 
Signed-off-by: William Breathitt Gray 
---
 drivers/gpio/gpio-pcie-idio-24.c | 109 ---
 1 file changed, 40 insertions(+), 69 deletions(-)

diff --git a/drivers/gpio/gpio-pcie-idio-24.c b/drivers/gpio/gpio-pcie-idio-24.c
index 52f1647a46fd..924ec916b358 100644
--- a/drivers/gpio/gpio-pcie-idio-24.c
+++ b/drivers/gpio/gpio-pcie-idio-24.c
@@ -198,52 +198,34 @@ static int idio_24_gpio_get_multiple(struct gpio_chip *chip,
unsigned long *mask, unsigned long *bits)
 {
struct idio_24_gpio *const idio24gpio = gpiochip_get_data(chip);
-   size_t i;
-   const unsigned int gpio_reg_size = 8;
-   unsigned int bits_offset;
-   size_t word_index;
-   unsigned int word_offset;
-   unsigned long word_mask;
-   const unsigned long port_mask = GENMASK(gpio_reg_size - 1, 0);
-   unsigned long port_state;
+   unsigned long offset;
+   unsigned long gpio_mask;
void __iomem *ports[] = {
&idio24gpio->reg->out0_7, &idio24gpio->reg->out8_15,
&idio24gpio->reg->out16_23, &idio24gpio->reg->in0_7,
&idio24gpio->reg->in8_15, &idio24gpio->reg->in16_23,
};
+   size_t index;
+   unsigned long port_state;
const unsigned long out_mode_mask = BIT(1);
 
/* clear bits array to a clean slate */
bitmap_zero(bits, chip->ngpio);
 
-   /* get bits are evaluated a gpio port register at a time */
-   for (i = 0; i < ARRAY_SIZE(ports) + 1; i++) {
-   /* gpio offset in bits array */
-   bits_offset = i * gpio_reg_size;
-
-   /* word index for bits array */
-   word_index = BIT_WORD(bits_offset);
-
-   /* gpio offset within current word of bits array */
-   word_offset = bits_offset % BITS_PER_LONG;
-
-   /* mask of get bits for current gpio within current word */
-   word_mask = mask[word_index] & (port_mask << word_offset);
-   if (!word_mask) {
-   /* no get bits in this port so skip to next one */
-   continue;
-   }
+   for_each_set_clump8(offset, gpio_mask, mask, ARRAY_SIZE(ports) * 8) {
+   index = offset / 8;
 
/* read bits from current gpio port (port 6 is TTL GPIO) */
-   if (i < 6)
-   port_state = ioread8(ports[i]);
+   if (index < 6)
+   port_state = ioread8(ports[index]);
else if (ioread8(&idio24gpio->reg->ctl) & out_mode_mask)
port_state = ioread8(&idio24gpio->reg->ttl_out0_7);
else
port_state = ioread8(&idio24gpio->reg->ttl_in0_7);
 
-   /* store acquired bits at respective bits array offset */
-   bits[word_index] |= (port_state << word_offset) & word_mask;
+   port_state &= gpio_mask;
+
+   bitmap_set_value8(bits, port_state, offset);
}
 
return 0;
@@ -294,59 +276,48 @@ static void idio_24_gpio_set_multiple(struct gpio_chip *chip,
unsigned long *mask, unsigned long *bits)
 {
struct idio_24_gpio *const idio24gpio = gpiochip_get_data(chip);
-   size_t i;
-   unsigned long bits_offset;
+   unsigned long offset;
unsigned long gpio_mask;
-   const unsigned int gpio_reg_size = 8;
-   const unsigned long port_mask = GENMASK(gpio_reg_size, 0);
-   unsigned long flags;
-   unsigned int out_state;
void __iomem *ports[] = {
&idio24gpio->reg->out0_7, &idio24gpio->reg->out8_15,
&idio24gpio->reg->out16_23
};
+   size_t index;
+   unsigned long bitmask;
+   unsigned long flags;
+   unsigned long out_state;
const unsigned long out_mode_mask = BIT(1);
-   const unsigned int ttl_offset = 48;
-   const size_t ttl_i = BIT_WORD(ttl_offset);
-   const unsigned int word_offset = ttl_offset % BITS_PER_LONG;
-   const unsigned long ttl_mask = (mask[ttl_i] >> word_offset) & port_mask;
-   const unsigned long ttl_bits = (bits[ttl_i] >> word_offset) & ttl_mask;
-
-   /* set bits are processed a gpio port register at a time */
-   for (i = 0; i < ARRAY_SIZE(ports); i++) {
-   /* gpio offset in bits array */
-   bits_offset = i * gpio_reg_size;
-
-   /* check if any set bits for current port */
-   gpio_mask = (*mask >> bits_offset) & port_mask;
-   if (!gpio_mask) {
-   /* no set bits for this port so move on to next port */
-   continue;
-   }
 
-   raw_spin_lock_irqsave(&idio24gpio->lock, flags);
+   for_each_set_clump8(offset, gpio_mask, mask, A

[PATCH v14 07/11] gpio: pci-idio-16: Utilize for_each_set_clump8 macro

2019-03-28 Thread William Breathitt Gray
Replace verbose implementation in get_multiple/set_multiple callbacks
with for_each_set_clump8 macro to simplify code and improve clarity.
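
The per-port update that the new set_multiple loop performs can be sketched in isolation. This is a minimal userspace illustration, not the driver code: the helper name sketch_update_port is hypothetical, and the register access (ioread8/iowrite8 under the spinlock) is replaced by plain values.

```c
#include <stdint.h>

/*
 * Read-modify-write of one 8-bit output port:
 * - bits selected by gpio_mask take their new state from 'bits'
 * - all other bits keep their previous state from 'out_state'
 * Hypothetical helper; the driver does this on a value read with ioread8().
 */
static uint8_t sketch_update_port(uint8_t out_state, uint8_t gpio_mask,
				  uint8_t bits)
{
	out_state &= (uint8_t)~gpio_mask;	/* clear the lines being set */
	out_state |= bits & gpio_mask;		/* apply only the masked bits */

	return out_state;
}
```

Only lines selected by the mask change, so a caller may pass a bits value with stray bits set outside the mask without disturbing other outputs.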

Reviewed-by: Linus Walleij 
Signed-off-by: William Breathitt Gray 
---
 drivers/gpio/gpio-pci-idio-16.c | 75 -
 1 file changed, 27 insertions(+), 48 deletions(-)

diff --git a/drivers/gpio/gpio-pci-idio-16.c b/drivers/gpio/gpio-pci-idio-16.c
index 6b7349783223..a67388db28ad 100644
--- a/drivers/gpio/gpio-pci-idio-16.c
+++ b/drivers/gpio/gpio-pci-idio-16.c
@@ -108,45 +108,23 @@ static int idio_16_gpio_get_multiple(struct gpio_chip *chip,
unsigned long *mask, unsigned long *bits)
 {
struct idio_16_gpio *const idio16gpio = gpiochip_get_data(chip);
-   size_t i;
-   const unsigned int gpio_reg_size = 8;
-   unsigned int bits_offset;
-   size_t word_index;
-   unsigned int word_offset;
-   unsigned long word_mask;
-   const unsigned long port_mask = GENMASK(gpio_reg_size - 1, 0);
-   unsigned long port_state;
+   unsigned long offset;
+   unsigned long gpio_mask;
void __iomem *ports[] = {
&idio16gpio->reg->out0_7, &idio16gpio->reg->out8_15,
&idio16gpio->reg->in0_7, &idio16gpio->reg->in8_15,
};
+   void __iomem *port_addr;
+   unsigned long port_state;
 
/* clear bits array to a clean slate */
bitmap_zero(bits, chip->ngpio);
 
-   /* get bits are evaluated a gpio port register at a time */
-   for (i = 0; i < ARRAY_SIZE(ports); i++) {
-   /* gpio offset in bits array */
-   bits_offset = i * gpio_reg_size;
-
-   /* word index for bits array */
-   word_index = BIT_WORD(bits_offset);
-
-   /* gpio offset within current word of bits array */
-   word_offset = bits_offset % BITS_PER_LONG;
-
-   /* mask of get bits for current gpio within current word */
-   word_mask = mask[word_index] & (port_mask << word_offset);
-   if (!word_mask) {
-   /* no get bits in this port so skip to next one */
-   continue;
-   }
+   for_each_set_clump8(offset, gpio_mask, mask, ARRAY_SIZE(ports) * 8) {
+   port_addr = ports[offset / 8];
+   port_state = ioread8(port_addr) & gpio_mask;
 
-   /* read bits from current gpio port */
-   port_state = ioread8(ports[i]);
-
-   /* store acquired bits at respective bits array offset */
-   bits[word_index] |= (port_state << word_offset) & word_mask;
+   bitmap_set_value8(bits, port_state, offset);
}
 
return 0;
@@ -186,30 +164,31 @@ static void idio_16_gpio_set_multiple(struct gpio_chip *chip,
unsigned long *mask, unsigned long *bits)
 {
struct idio_16_gpio *const idio16gpio = gpiochip_get_data(chip);
+   unsigned long offset;
+   unsigned long gpio_mask;
+   void __iomem *ports[] = {
+   &idio16gpio->reg->out0_7, &idio16gpio->reg->out8_15,
+   };
+   size_t index;
+   void __iomem *port_addr;
+   unsigned long bitmask;
unsigned long flags;
-   unsigned int out_state;
+   unsigned long out_state;
 
-   raw_spin_lock_irqsave(&idio16gpio->lock, flags);
+   for_each_set_clump8(offset, gpio_mask, mask, ARRAY_SIZE(ports) * 8) {
+   index = offset / 8;
+   port_addr = ports[index];
 
-   /* process output lines 0-7 */
-   if (*mask & 0xFF) {
-   out_state = ioread8(&idio16gpio->reg->out0_7) & ~*mask;
-   out_state |= *mask & *bits;
-   iowrite8(out_state, &idio16gpio->reg->out0_7);
-   }
+   bitmask = bitmap_get_value8(bits, offset) & gpio_mask;
+
+   raw_spin_lock_irqsave(&idio16gpio->lock, flags);
 
-   /* shift to next output line word */
-   *mask >>= 8;
+   out_state = ioread8(port_addr) & ~gpio_mask;
+   out_state |= bitmask;
+   iowrite8(out_state, port_addr);
 
-   /* process output lines 8-15 */
-   if (*mask & 0xFF) {
-   *bits >>= 8;
-   out_state = ioread8(&idio16gpio->reg->out8_15) & ~*mask;
-   out_state |= *mask & *bits;
-   iowrite8(out_state, &idio16gpio->reg->out8_15);
+   raw_spin_unlock_irqrestore(&idio16gpio->lock, flags);
}
-
-   raw_spin_unlock_irqrestore(&idio16gpio->lock, flags);
 }
 
 static void idio_16_irq_ack(struct irq_data *data)
-- 
2.21.0



[PATCH v14 06/11] gpio: ws16c48: Utilize for_each_set_clump8 macro

2019-03-28 Thread William Breathitt Gray
Replace verbose implementation in get_multiple/set_multiple callbacks
with for_each_set_clump8 macro to simplify code and improve clarity.

Reviewed-by: Linus Walleij 
Signed-off-by: William Breathitt Gray 
---
 drivers/gpio/gpio-ws16c48.c | 73 ++---
 1 file changed, 20 insertions(+), 53 deletions(-)

diff --git a/drivers/gpio/gpio-ws16c48.c b/drivers/gpio/gpio-ws16c48.c
index 5cf3697bfb15..ee30417d6394 100644
--- a/drivers/gpio/gpio-ws16c48.c
+++ b/drivers/gpio/gpio-ws16c48.c
@@ -134,42 +134,19 @@ static int ws16c48_gpio_get_multiple(struct gpio_chip *chip,
unsigned long *mask, unsigned long *bits)
 {
struct ws16c48_gpio *const ws16c48gpio = gpiochip_get_data(chip);
-   const unsigned int gpio_reg_size = 8;
-   size_t i;
-   const size_t num_ports = chip->ngpio / gpio_reg_size;
-   unsigned int bits_offset;
-   size_t word_index;
-   unsigned int word_offset;
-   unsigned long word_mask;
-   const unsigned long port_mask = GENMASK(gpio_reg_size - 1, 0);
+   unsigned long offset;
+   unsigned long gpio_mask;
+   unsigned int port_addr;
unsigned long port_state;
 
/* clear bits array to a clean slate */
bitmap_zero(bits, chip->ngpio);
 
-   /* get bits are evaluated a gpio port register at a time */
-   for (i = 0; i < num_ports; i++) {
-   /* gpio offset in bits array */
-   bits_offset = i * gpio_reg_size;
+   for_each_set_clump8(offset, gpio_mask, mask, chip->ngpio) {
+   port_addr = ws16c48gpio->base + offset / 8;
+   port_state = inb(port_addr) & gpio_mask;
 
-   /* word index for bits array */
-   word_index = BIT_WORD(bits_offset);
-
-   /* gpio offset within current word of bits array */
-   word_offset = bits_offset % BITS_PER_LONG;
-
-   /* mask of get bits for current gpio within current word */
-   word_mask = mask[word_index] & (port_mask << word_offset);
-   if (!word_mask) {
-   /* no get bits in this port so skip to next one */
-   continue;
-   }
-
-   /* read bits from current gpio port */
-   port_state = inb(ws16c48gpio->base + i);
-
-   /* store acquired bits at respective bits array offset */
-   bits[word_index] |= (port_state << word_offset) & word_mask;
+   bitmap_set_value8(bits, port_state, offset);
}
 
return 0;
@@ -203,39 +180,29 @@ static void ws16c48_gpio_set_multiple(struct gpio_chip *chip,
unsigned long *mask, unsigned long *bits)
 {
struct ws16c48_gpio *const ws16c48gpio = gpiochip_get_data(chip);
-   unsigned int i;
-   const unsigned int gpio_reg_size = 8;
-   unsigned int port;
-   unsigned int iomask;
-   unsigned int bitmask;
+   unsigned long offset;
+   unsigned long gpio_mask;
+   size_t index;
+   unsigned int port_addr;
+   unsigned long bitmask;
unsigned long flags;
 
-   /* set bits are evaluated a gpio register size at a time */
-   for (i = 0; i < chip->ngpio; i += gpio_reg_size) {
-   /* no more set bits in this mask word; skip to the next word */
-   if (!mask[BIT_WORD(i)]) {
-   i = (BIT_WORD(i) + 1) * BITS_PER_LONG - gpio_reg_size;
-   continue;
-   }
-
-   port = i / gpio_reg_size;
+   for_each_set_clump8(offset, gpio_mask, mask, chip->ngpio) {
+   index = offset / 8;
+   port_addr = ws16c48gpio->base + index;
 
/* mask out GPIO configured for input */
-   iomask = mask[BIT_WORD(i)] & ~ws16c48gpio->io_state[port];
-   bitmask = iomask & bits[BIT_WORD(i)];
+   gpio_mask &= ~ws16c48gpio->io_state[index];
+   bitmask = bitmap_get_value8(bits, offset) & gpio_mask;
 
raw_spin_lock_irqsave(&ws16c48gpio->lock, flags);
 
/* update output state data and set device gpio register */
-   ws16c48gpio->out_state[port] &= ~iomask;
-   ws16c48gpio->out_state[port] |= bitmask;
-   outb(ws16c48gpio->out_state[port], ws16c48gpio->base + port);
+   ws16c48gpio->out_state[index] &= ~gpio_mask;
+   ws16c48gpio->out_state[index] |= bitmask;
+   outb(ws16c48gpio->out_state[index], port_addr);
 
raw_spin_unlock_irqrestore(&ws16c48gpio->lock, flags);
-
-   /* prepare for next gpio register set */
-   mask[BIT_WORD(i)] >>= gpio_reg_size;
-   bits[BIT_WORD(i)] >>= gpio_reg_size;
}
 }
 
-- 
2.21.0



[PATCH v14 05/11] gpio: gpio-mm: Utilize for_each_set_clump8 macro

2019-03-28 Thread William Breathitt Gray
Replace verbose implementation in get_multiple/set_multiple callbacks
with for_each_set_clump8 macro to simplify code and improve clarity.

Reviewed-by: Linus Walleij 
Signed-off-by: William Breathitt Gray 
---
 drivers/gpio/gpio-gpio-mm.c | 73 +++--
 1 file changed, 21 insertions(+), 52 deletions(-)

diff --git a/drivers/gpio/gpio-gpio-mm.c b/drivers/gpio/gpio-gpio-mm.c
index 8c150fd68d9d..0cef50d14c5a 100644
--- a/drivers/gpio/gpio-gpio-mm.c
+++ b/drivers/gpio/gpio-gpio-mm.c
@@ -172,46 +172,25 @@ static int gpiomm_gpio_get(struct gpio_chip *chip, unsigned int offset)
return !!(port_state & mask);
 }
 
+static const size_t ports[] = { 0, 1, 2, 4, 5, 6 };
+
 static int gpiomm_gpio_get_multiple(struct gpio_chip *chip, unsigned long *mask,
unsigned long *bits)
 {
struct gpiomm_gpio *const gpiommgpio = gpiochip_get_data(chip);
-   size_t i;
-   static const size_t ports[] = { 0, 1, 2, 4, 5, 6 };
-   const unsigned int gpio_reg_size = 8;
-   unsigned int bits_offset;
-   size_t word_index;
-   unsigned int word_offset;
-   unsigned long word_mask;
-   const unsigned long port_mask = GENMASK(gpio_reg_size - 1, 0);
+   unsigned long offset;
+   unsigned long gpio_mask;
+   unsigned int port_addr;
unsigned long port_state;
 
/* clear bits array to a clean slate */
bitmap_zero(bits, chip->ngpio);
 
-   /* get bits are evaluated a gpio port register at a time */
-   for (i = 0; i < ARRAY_SIZE(ports); i++) {
-   /* gpio offset in bits array */
-   bits_offset = i * gpio_reg_size;
-
-   /* word index for bits array */
-   word_index = BIT_WORD(bits_offset);
-
-   /* gpio offset within current word of bits array */
-   word_offset = bits_offset % BITS_PER_LONG;
-
-   /* mask of get bits for current gpio within current word */
-   word_mask = mask[word_index] & (port_mask << word_offset);
-   if (!word_mask) {
-   /* no get bits in this port so skip to next one */
-   continue;
-   }
-
-   /* read bits from current gpio port */
-   port_state = inb(gpiommgpio->base + ports[i]);
+   for_each_set_clump8(offset, gpio_mask, mask, ARRAY_SIZE(ports) * 8) {
+   port_addr = gpiommgpio->base + ports[offset / 8];
+   port_state = inb(port_addr) & gpio_mask;
 
-   /* store acquired bits at respective bits array offset */
-   bits[word_index] |= (port_state << word_offset) & word_mask;
+   bitmap_set_value8(bits, port_state, offset);
}
 
return 0;
@@ -242,37 +221,27 @@ static void gpiomm_gpio_set_multiple(struct gpio_chip *chip,
unsigned long *mask, unsigned long *bits)
 {
struct gpiomm_gpio *const gpiommgpio = gpiochip_get_data(chip);
-   unsigned int i;
-   const unsigned int gpio_reg_size = 8;
-   unsigned int port;
-   unsigned int out_port;
-   unsigned int bitmask;
+   unsigned long offset;
+   unsigned long gpio_mask;
+   size_t index;
+   unsigned int port_addr;
+   unsigned long bitmask;
unsigned long flags;
 
-   /* set bits are evaluated a gpio register size at a time */
-   for (i = 0; i < chip->ngpio; i += gpio_reg_size) {
-   /* no more set bits in this mask word; skip to the next word */
-   if (!mask[BIT_WORD(i)]) {
-   i = (BIT_WORD(i) + 1) * BITS_PER_LONG - gpio_reg_size;
-   continue;
-   }
+   for_each_set_clump8(offset, gpio_mask, mask, ARRAY_SIZE(ports) * 8) {
+   index = offset / 8;
+   port_addr = gpiommgpio->base + ports[index];
 
-   port = i / gpio_reg_size;
-   out_port = (port > 2) ? port + 1 : port;
-   bitmask = mask[BIT_WORD(i)] & bits[BIT_WORD(i)];
+   bitmask = bitmap_get_value8(bits, offset) & gpio_mask;
 
spin_lock_irqsave(&gpiommgpio->lock, flags);
 
/* update output state data and set device gpio register */
-   gpiommgpio->out_state[port] &= ~mask[BIT_WORD(i)];
-   gpiommgpio->out_state[port] |= bitmask;
-   outb(gpiommgpio->out_state[port], gpiommgpio->base + out_port);
+   gpiommgpio->out_state[index] &= ~gpio_mask;
+   gpiommgpio->out_state[index] |= bitmask;
+   outb(gpiommgpio->out_state[index], port_addr);
 
spin_unlock_irqrestore(&gpiommgpio->lock, flags);
-
-   /* prepare for next gpio register set */
-   mask[BIT_WORD(i)] >>= gpio_reg_size;
-   bits[BIT_WORD(i)] >>= gpio_reg_size;
}
 }
 
-- 
2.21.0



[PATCH v14 04/11] gpio: 104-idi-48: Utilize for_each_set_clump8 macro

2019-03-28 Thread William Breathitt Gray
Replace verbose implementation in get_multiple/set_multiple callbacks
with for_each_set_clump8 macro to simplify code and improve clarity.

Reviewed-by: Linus Walleij 
Signed-off-by: William Breathitt Gray 
---
 drivers/gpio/gpio-104-idi-48.c | 36 +++---
 1 file changed, 7 insertions(+), 29 deletions(-)

diff --git a/drivers/gpio/gpio-104-idi-48.c b/drivers/gpio/gpio-104-idi-48.c
index 88dc6f2449f6..9b43964b0412 100644
--- a/drivers/gpio/gpio-104-idi-48.c
+++ b/drivers/gpio/gpio-104-idi-48.c
@@ -93,42 +93,20 @@ static int idi_48_gpio_get_multiple(struct gpio_chip *chip, unsigned long *mask,
unsigned long *bits)
 {
struct idi_48_gpio *const idi48gpio = gpiochip_get_data(chip);
-   size_t i;
+   unsigned long offset;
+   unsigned long gpio_mask;
static const size_t ports[] = { 0, 1, 2, 4, 5, 6 };
-   const unsigned int gpio_reg_size = 8;
-   unsigned int bits_offset;
-   size_t word_index;
-   unsigned int word_offset;
-   unsigned long word_mask;
-   const unsigned long port_mask = GENMASK(gpio_reg_size - 1, 0);
+   unsigned int port_addr;
unsigned long port_state;
 
/* clear bits array to a clean slate */
bitmap_zero(bits, chip->ngpio);
 
-   /* get bits are evaluated a gpio port register at a time */
-   for (i = 0; i < ARRAY_SIZE(ports); i++) {
-   /* gpio offset in bits array */
-   bits_offset = i * gpio_reg_size;
+   for_each_set_clump8(offset, gpio_mask, mask, ARRAY_SIZE(ports) * 8) {
+   port_addr = idi48gpio->base + ports[offset / 8];
+   port_state = inb(port_addr) & gpio_mask;
 
-   /* word index for bits array */
-   word_index = BIT_WORD(bits_offset);
-
-   /* gpio offset within current word of bits array */
-   word_offset = bits_offset % BITS_PER_LONG;
-
-   /* mask of get bits for current gpio within current word */
-   word_mask = mask[word_index] & (port_mask << word_offset);
-   if (!word_mask) {
-   /* no get bits in this port so skip to next one */
-   continue;
-   }
-
-   /* read bits from current gpio port */
-   port_state = inb(idi48gpio->base + ports[i]);
-
-   /* store acquired bits at respective bits array offset */
-   bits[word_index] |= (port_state << word_offset) & word_mask;
+   bitmap_set_value8(bits, port_state, offset);
}
 
return 0;
-- 
2.21.0



[PATCH v14 03/11] gpio: 104-dio-48e: Utilize for_each_set_clump8 macro

2019-03-28 Thread William Breathitt Gray
Replace verbose implementation in get_multiple/set_multiple callbacks
with for_each_set_clump8 macro to simplify code and improve clarity.
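
The mapping from a clump's bit offset to its I/O port, which the patch expresses as ports[offset / 8], can be sketched as a standalone helper. This is an illustration only: the table values are taken from the patch, while the helper name sketch_port_for_offset and the out-of-range return value are assumptions added for the example.

```c
#include <stddef.h>

/* Register offsets from the base address, as listed in the patch. */
static const size_t sketch_ports[] = { 0, 1, 2, 4, 5, 6 };

/*
 * Map the bit offset of an 8-bit clump (a multiple of 8) to the register
 * offset of the port that holds it. Returns (size_t)-1 when the offset lies
 * beyond the last port; the driver itself never passes such an offset
 * because the loop is bounded by ARRAY_SIZE(ports) * 8.
 */
static size_t sketch_port_for_offset(unsigned long offset)
{
	const size_t index = offset / 8;

	if (index >= sizeof(sketch_ports) / sizeof(sketch_ports[0]))
		return (size_t)-1;

	return sketch_ports[index];
}
```

The table lookup replaces the removed expression (port > 2) ? port + 1 : port, which encoded the same skip over register offset 3.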

Reviewed-by: Linus Walleij 
Signed-off-by: William Breathitt Gray 
---
 drivers/gpio/gpio-104-dio-48e.c | 73 ++---
 1 file changed, 21 insertions(+), 52 deletions(-)

diff --git a/drivers/gpio/gpio-104-dio-48e.c b/drivers/gpio/gpio-104-dio-48e.c
index 92c8f944bf64..2fc6d2b11d25 100644
--- a/drivers/gpio/gpio-104-dio-48e.c
+++ b/drivers/gpio/gpio-104-dio-48e.c
@@ -183,46 +183,25 @@ static int dio48e_gpio_get(struct gpio_chip *chip, unsigned offset)
return !!(port_state & mask);
 }
 
+static const size_t ports[] = { 0, 1, 2, 4, 5, 6 };
+
 static int dio48e_gpio_get_multiple(struct gpio_chip *chip, unsigned long *mask,
unsigned long *bits)
 {
struct dio48e_gpio *const dio48egpio = gpiochip_get_data(chip);
-   size_t i;
-   static const size_t ports[] = { 0, 1, 2, 4, 5, 6 };
-   const unsigned int gpio_reg_size = 8;
-   unsigned int bits_offset;
-   size_t word_index;
-   unsigned int word_offset;
-   unsigned long word_mask;
-   const unsigned long port_mask = GENMASK(gpio_reg_size - 1, 0);
+   unsigned long offset;
+   unsigned long gpio_mask;
+   unsigned int port_addr;
unsigned long port_state;
 
/* clear bits array to a clean slate */
bitmap_zero(bits, chip->ngpio);
 
-   /* get bits are evaluated a gpio port register at a time */
-   for (i = 0; i < ARRAY_SIZE(ports); i++) {
-   /* gpio offset in bits array */
-   bits_offset = i * gpio_reg_size;
-
-   /* word index for bits array */
-   word_index = BIT_WORD(bits_offset);
-
-   /* gpio offset within current word of bits array */
-   word_offset = bits_offset % BITS_PER_LONG;
-
-   /* mask of get bits for current gpio within current word */
-   word_mask = mask[word_index] & (port_mask << word_offset);
-   if (!word_mask) {
-   /* no get bits in this port so skip to next one */
-   continue;
-   }
-
-   /* read bits from current gpio port */
-   port_state = inb(dio48egpio->base + ports[i]);
+   for_each_set_clump8(offset, gpio_mask, mask, ARRAY_SIZE(ports) * 8) {
+   port_addr = dio48egpio->base + ports[offset / 8];
+   port_state = inb(port_addr) & gpio_mask;
 
-   /* store acquired bits at respective bits array offset */
-   bits[word_index] |= (port_state << word_offset) & word_mask;
+   bitmap_set_value8(bits, port_state, offset);
}
 
return 0;
@@ -252,37 +231,27 @@ static void dio48e_gpio_set_multiple(struct gpio_chip *chip,
unsigned long *mask, unsigned long *bits)
 {
struct dio48e_gpio *const dio48egpio = gpiochip_get_data(chip);
-   unsigned int i;
-   const unsigned int gpio_reg_size = 8;
-   unsigned int port;
-   unsigned int out_port;
-   unsigned int bitmask;
+   unsigned long offset;
+   unsigned long gpio_mask;
+   size_t index;
+   unsigned int port_addr;
+   unsigned long bitmask;
unsigned long flags;
 
-   /* set bits are evaluated a gpio register size at a time */
-   for (i = 0; i < chip->ngpio; i += gpio_reg_size) {
-   /* no more set bits in this mask word; skip to the next word */
-   if (!mask[BIT_WORD(i)]) {
-   i = (BIT_WORD(i) + 1) * BITS_PER_LONG - gpio_reg_size;
-   continue;
-   }
+   for_each_set_clump8(offset, gpio_mask, mask, ARRAY_SIZE(ports) * 8) {
+   index = offset / 8;
+   port_addr = dio48egpio->base + ports[index];
 
-   port = i / gpio_reg_size;
-   out_port = (port > 2) ? port + 1 : port;
-   bitmask = mask[BIT_WORD(i)] & bits[BIT_WORD(i)];
+   bitmask = bitmap_get_value8(bits, offset) & gpio_mask;
 
raw_spin_lock_irqsave(&dio48egpio->lock, flags);
 
/* update output state data and set device gpio register */
-   dio48egpio->out_state[port] &= ~mask[BIT_WORD(i)];
-   dio48egpio->out_state[port] |= bitmask;
-   outb(dio48egpio->out_state[port], dio48egpio->base + out_port);
+   dio48egpio->out_state[index] &= ~gpio_mask;
+   dio48egpio->out_state[index] |= bitmask;
+   outb(dio48egpio->out_state[index], port_addr);
 
raw_spin_unlock_irqrestore(&dio48egpio->lock, flags);
-
-   /* prepare for next gpio register set */
-   mask[BIT_WORD(i)] >>= gpio_reg_size;
-   bits[BIT_WORD(i)] >>= gpio_reg_size;
}
 }
 
-- 
2.21.0



[PATCH v14 01/11] bitops: Introduce the for_each_set_clump8 macro

2019-03-28 Thread William Breathitt Gray
This macro iterates over each 8-bit group of bits (clump) that has set bits
within a bitmap memory region. For each iteration, "start" is set to the
bit offset of the found clump, while the respective clump value is
stored to the location pointed to by "clump". Additionally, the
bitmap_get_value8 and bitmap_set_value8 functions are introduced to
respectively get and set an 8-bit value in a bitmap memory region.

Suggested-by: Andy Shevchenko 
Suggested-by: Rasmus Villemoes 
Suggested-by: Lukas Wunner 
Cc: Arnd Bergmann 
Cc: Andrew Morton 
Cc: Linus Walleij 
Acked-by: Andy Shevchenko 
Signed-off-by: William Breathitt Gray 
---
 include/asm-generic/bitops/find.h | 61 +++
 include/linux/bitops.h|  5 +++
 2 files changed, 66 insertions(+)

diff --git a/include/asm-generic/bitops/find.h b/include/asm-generic/bitops/find.h
index 8a1ee10014de..45aa6d718cbd 100644
--- a/include/asm-generic/bitops/find.h
+++ b/include/asm-generic/bitops/find.h
@@ -80,4 +80,65 @@ extern unsigned long find_first_zero_bit(const unsigned long *addr,
 
 #endif /* CONFIG_GENERIC_FIND_FIRST_BIT */
 
+/**
+ * bitmap_get_value8 - get an 8-bit value within a memory region
+ * @addr: address to the bitmap memory region
+ * @start: bit offset of the 8-bit value; must be a multiple of 8
+ *
+ * Returns the 8-bit value located at the @start bit offset within the @addr
+ * memory region.
+ */
+static inline unsigned long bitmap_get_value8(const unsigned long *addr,
+ unsigned long start)
+{
+   const size_t index = BIT_WORD(start);
+   const unsigned long offset = start % BITS_PER_LONG;
+
+   return (addr[index] >> offset) & 0xFF;
+}
+
+/**
+ * bitmap_set_value8 - set an 8-bit value within a memory region
+ * @addr: address to the bitmap memory region
+ * @value: the 8-bit value; values wider than 8 bits may clobber bitmap
+ * @start: bit offset of the 8-bit value; must be a multiple of 8
+ */
+static inline void bitmap_set_value8(unsigned long *addr, unsigned long value,
+unsigned long start)
+{
+   const size_t index = BIT_WORD(start);
+   const unsigned long offset = start % BITS_PER_LONG;
+
+   addr[index] &= ~(0xFFUL << offset);
+   addr[index] |= value << offset;
+}
+
+/**
+ * find_next_clump8 - find next 8-bit clump with set bits in a memory region
+ * @clump: location to store copy of found clump
+ * @addr: address to base the search on
+ * @size: bitmap size in number of bits
+ * @offset: bit offset at which to start searching
+ *
+ * Returns the bit offset for the next set clump; the found clump value is
+ * copied to the location pointed by @clump. If no bits are set, returns @size.
+ */
+static inline unsigned long find_next_clump8(unsigned long *clump,
+const unsigned long *addr,
+unsigned long size,
+unsigned long offset)
+{
+   offset = find_next_bit(addr, size, offset);
+   if (offset == size)
+   return size;
+
+   offset = round_down(offset, 8);
+   *clump = bitmap_get_value8(addr, offset);
+
+   return offset;
+}
+
+#define find_first_clump8(clump, bits, size) \
+   find_next_clump8((clump), (bits), (size), 0)
+
 #endif /*_ASM_GENERIC_BITOPS_FIND_H_ */
diff --git a/include/linux/bitops.h b/include/linux/bitops.h
index 602af23b98c7..1d9b5efb9bd4 100644
--- a/include/linux/bitops.h
+++ b/include/linux/bitops.h
@@ -40,6 +40,11 @@ extern unsigned long __sw_hweight64(__u64 w);
 (bit) < (size);\
 (bit) = find_next_zero_bit((addr), (size), (bit) + 1))
 
+#define for_each_set_clump8(start, clump, bits, size) \
+   for ((start) = find_first_clump8(&(clump), (bits), (size)); \
+(start) < (size); \
+(start) = find_next_clump8(&(clump), (bits), (size), (start) + 8))
+
 static inline int get_bitmask_order(unsigned int count)
 {
int order;
-- 
2.21.0



[PATCH v14 02/11] lib/test_bitmap.c: Add for_each_set_clump8 test cases

2019-03-28 Thread William Breathitt Gray
The introduction of the for_each_set_clump8 macro warrants test cases to
verify the implementation. This patch adds test case checks for whether
an out-of-bounds clump index is returned, a zero clump is returned, or
the returned clump value differs from the expected clump value.

Cc: Rasmus Villemoes 
Acked-by: Andrew Morton 
Reviewed-by: Andy Shevchenko 
Reviewed-by: Linus Walleij 
Signed-off-by: William Breathitt Gray 
---
 lib/test_bitmap.c | 65 +++
 1 file changed, 65 insertions(+)

diff --git a/lib/test_bitmap.c b/lib/test_bitmap.c
index 6cd7d0740005..8d1f268069c1 100644
--- a/lib/test_bitmap.c
+++ b/lib/test_bitmap.c
@@ -88,6 +88,36 @@ __check_eq_u32_array(const char *srcfile, unsigned int line,
return true;
 }
 
+static bool __init __check_eq_clump8(const char *srcfile, unsigned int line,
+   const unsigned int offset,
+   const unsigned int size,
+   const unsigned char *const clump_exp,
+   const unsigned long *const clump)
+{
+   unsigned long exp;
+
+   if (offset >= size) {
+   pr_warn("[%s:%u] bit offset for clump out-of-bounds: expected less than %u, got %u\n",
+   srcfile, line, size, offset);
+   return false;
+   }
+
+   exp = clump_exp[offset / 8];
+   if (!exp) {
+   pr_warn("[%s:%u] bit offset for zero clump: expected nonzero clump, got bit offset %u with clump value 0",
+   srcfile, line, offset);
+   return false;
+   }
+
+   if (*clump != exp) {
+   pr_warn("[%s:%u] expected clump value of 0x%lX, got clump value of 0x%lX",
+   srcfile, line, exp, *clump);
+   return false;
+   }
+
+   return true;
+}
+
 #define __expect_eq(suffix, ...)   \
({  \
int result = 0; \
@@ -104,6 +134,7 @@ __check_eq_u32_array(const char *srcfile, unsigned int line,
 #define expect_eq_bitmap(...)  __expect_eq(bitmap, ##__VA_ARGS__)
 #define expect_eq_pbl(...) __expect_eq(pbl, ##__VA_ARGS__)
 #define expect_eq_u32_array(...)   __expect_eq(u32_array, ##__VA_ARGS__)
+#define expect_eq_clump8(...)  __expect_eq(clump8, ##__VA_ARGS__)
 
 static void __init test_zero_clear(void)
 {
@@ -361,6 +392,39 @@ static void noinline __init test_mem_optimisations(void)
}
 }
 
+static const unsigned char clump_exp[] __initconst = {
+   0x01,   /* 1 bit set */
+   0x02,   /* non-edge 1 bit set */
+   0x00,   /* zero bits set */
+   0x38,   /* 3 bits set across 4-bit boundary */
+   0x38,   /* Repeated clump */
+   0x0F,   /* 4 bits set */
+   0xFF,   /* all bits set */
+   0x05,   /* non-adjacent 2 bits set */
+};
+
+static void __init test_for_each_set_clump8(void)
+{
+#define CLUMP_EXP_NUMBITS 64
+   DECLARE_BITMAP(bits, CLUMP_EXP_NUMBITS);
+   unsigned int start;
+   unsigned long clump;
+
+   /* set bitmap to test case */
+   bitmap_zero(bits, CLUMP_EXP_NUMBITS);
+   bitmap_set(bits, 0, 1); /* 0x01 */
+   bitmap_set(bits, 9, 1); /* 0x02 */
+   bitmap_set(bits, 27, 3);/* 0x38 */
+   bitmap_set(bits, 35, 3);/* 0x38 */
+   bitmap_set(bits, 40, 4);/* 0x0F */
+   bitmap_set(bits, 48, 8);/* 0xFF */
+   bitmap_set(bits, 56, 1);/* 0x05 - part 1 */
+   bitmap_set(bits, 58, 1);/* 0x05 - part 2 */
+
+   for_each_set_clump8(start, clump, bits, CLUMP_EXP_NUMBITS)
+   expect_eq_clump8(start, CLUMP_EXP_NUMBITS, clump_exp, &clump);
+}
+
 static int __init test_bitmap_init(void)
 {
test_zero_clear();
@@ -369,6 +433,7 @@ static int __init test_bitmap_init(void)
test_bitmap_arr32();
test_bitmap_parselist();
test_mem_optimisations();
+   test_for_each_set_clump8();
 
if (failed_tests == 0)
pr_info("all %u tests passed\n", total_tests);
-- 
2.21.0



[PATCH v14 00/11] Introduce the for_each_set_clump8 macro

2019-03-28 Thread William Breathitt Gray
Changes in v14:
  - Redefine bitmap_get_value8, bitmap_set_value8, and find_next_clump8
as static inline functions
  - Rename 'idx' variable to 'index' in the bitmap_get_value8 and
bitmap_set_value8 functions
  - Remove superfluous parens in gen_74x164_set_multiple

While adding GPIO get_multiple/set_multiple callback support for various
drivers, I noticed a recurring looping pattern that would be useful to
standardize as a macro.

This patchset introduces the for_each_set_clump8 macro and utilizes it
in several GPIO drivers. The for_each_set_clump8 macro facilitates a
for-loop syntax that iterates over a memory region one 8-bit group of
set bits at a time.

For example, suppose you would like to iterate over a 32-bit integer 8
bits at a time, skipping over 8-bit groups with no set bit, where
XXXXXXXX represents the current 8-bit group:

    Example:        10111110 00000000 11111111 00110011
    First loop:     10111110 00000000 11111111 XXXXXXXX
    Second loop:    10111110 00000000 XXXXXXXX 00110011
    Third loop:     XXXXXXXX 00000000 11111111 00110011

Each iteration of the loop returns the next 8-bit group that has at
least one set bit.

The for_each_set_clump8 macro has four parameters:

* start: set to the bit offset of the current clump
* clump: set to the current clump value
* bits: bitmap to search within
* size: bitmap size in number of bits
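
The iteration described above can be tried outside the kernel with a small userspace sketch. Everything prefixed sketch_ is hypothetical and written for this illustration; it scans byte by byte rather than using find_next_bit as the patch does, and it assumes the bitmap size is a multiple of 8.

```c
#include <limits.h>
#include <stddef.h>

#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Read the 8-bit group at bit offset 'start' (must be a multiple of 8). */
static unsigned long sketch_get_value8(const unsigned long *addr,
				       unsigned long start)
{
	return (addr[start / SKETCH_BITS_PER_LONG] >>
		(start % SKETCH_BITS_PER_LONG)) & 0xFFUL;
}

/*
 * Return the bit offset of the next nonzero 8-bit group at or after
 * 'offset' and store its value in *clump; return 'size' if none is left.
 * Assumes 'size' is a multiple of 8.
 */
static unsigned long sketch_find_next_clump8(unsigned long *clump,
					     const unsigned long *addr,
					     unsigned long size,
					     unsigned long offset)
{
	for (offset -= offset % 8; offset < size; offset += 8) {
		const unsigned long value = sketch_get_value8(addr, offset);

		if (value) {
			*clump = value;
			return offset;
		}
	}

	return size;
}

/* Same shape as the macro in the patch, built on the helpers above. */
#define sketch_for_each_set_clump8(start, clump, bits, size) \
	for ((start) = sketch_find_next_clump8(&(clump), (bits), (size), 0); \
	     (start) < (size); \
	     (start) = sketch_find_next_clump8(&(clump), (bits), (size), (start) + 8))

/* Record every visited group; returns the number of loop iterations. */
static size_t sketch_collect_clumps(const unsigned long *bits,
				    unsigned long size,
				    unsigned long *starts,
				    unsigned long *clumps)
{
	unsigned long start;
	unsigned long clump;
	size_t n = 0;

	sketch_for_each_set_clump8(start, clump, bits, size) {
		starts[n] = start;
		clumps[n] = clump;
		n++;
	}

	return n;
}
```

For a 32-bit value such as 0xBE00FF33 (chosen here for illustration), the loop body runs three times, at bit offsets 0, 8, and 24; the all-zero group at offset 16 is skipped.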

In this version of the patchset, the for_each_set_clump macro has been
reimplemented and simplified based on the suggestions provided by Rasmus
Villemoes and Andy Shevchenko in the version 4 submission.

In particular, the function of the for_each_set_clump macro has been
restricted to handle only 8-bit clumps; the drivers that use the
for_each_set_clump macro only handle 8-bit ports so a generic
for_each_set_clump implementation is not necessary. Thus, a solution for
large clumps (i.e. those larger than the width of a bitmap word) can be
postponed until a driver appears that actually requires such a generic
for_each_set_clump implementation.

For what it's worth, a semi-generic for_each_set_clump (i.e. for clumps
smaller than the width of a bitmap word) can be implemented by simply
replacing the hardcoded '8' and '0xFF' instances with respective
variables. I have not yet had a need for such an implementation, and
since it falls short of a true generic for_each_set_clump function, I
have decided to forgo such an implementation for now.
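
To make the idea concrete, here is a rough userspace sketch of what such a semi-generic variant might look like. It is not part of this patchset; the sketch_ names and the width parameter are hypothetical, and the code assumes that width is nonzero, smaller than the word width, divides the word width evenly (so no clump straddles two words), and that size is a multiple of width.

```c
#include <limits.h>
#include <stddef.h>

#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Read a 'width'-bit group at bit offset 'start' (a multiple of 'width'). */
static unsigned long sketch_get_value(const unsigned long *addr,
				      unsigned long start, unsigned int width)
{
	/* replaces the hardcoded 0xFF; valid only for width < word width */
	const unsigned long mask = (1UL << width) - 1;

	return (addr[start / SKETCH_BITS_PER_LONG] >>
		(start % SKETCH_BITS_PER_LONG)) & mask;
}

/*
 * Return the bit offset of the next nonzero 'width'-bit group at or after
 * 'offset' and store its value in *clump; return 'size' if none is left.
 */
static unsigned long sketch_find_next_clump(unsigned long *clump,
					    const unsigned long *addr,
					    unsigned long size,
					    unsigned long offset,
					    unsigned int width)
{
	/* replaces the hardcoded 8 in the rounding and the step */
	for (offset -= offset % width; offset < size; offset += width) {
		const unsigned long value = sketch_get_value(addr, offset, width);

		if (value) {
			*clump = value;
			return offset;
		}
	}

	return size;
}
```

With width set to 8 this behaves like the 8-bit version; clumps wider than a bitmap word would still need a different approach.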

In addition, the bitmap_get_value8 and bitmap_set_value8 functions are
introduced to get and set 8-bit values respectively. Their use is based
on the behavior suggested in the patchset version 4 review.
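
A minimal userspace sketch of the two accessors follows, for experimenting with their behavior at word boundaries. The sketch_ names are hypothetical. Two details are choices made for this illustration rather than statements about the patch: the mask is written as an unsigned long constant so the shift happens at full word width, and the value is masked to 8 bits so a wider value cannot clobber neighboring bits.

```c
#include <limits.h>
#include <stddef.h>

#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Return the 8-bit value at bit offset 'start' (must be a multiple of 8). */
static unsigned long sketch_bitmap_get_value8(const unsigned long *addr,
					      unsigned long start)
{
	const size_t index = start / SKETCH_BITS_PER_LONG;
	const unsigned long offset = start % SKETCH_BITS_PER_LONG;

	return (addr[index] >> offset) & 0xFFUL;
}

/* Store an 8-bit value at bit offset 'start' (must be a multiple of 8). */
static void sketch_bitmap_set_value8(unsigned long *addr, unsigned long value,
				     unsigned long start)
{
	const size_t index = start / SKETCH_BITS_PER_LONG;
	const unsigned long offset = start % SKETCH_BITS_PER_LONG;

	addr[index] &= ~(0xFFUL << offset);		/* clear the old byte */
	addr[index] |= (value & 0xFFUL) << offset;	/* write the new byte */
}
```

Because start is a multiple of 8 and the word width is a multiple of 8, an 8-bit value never straddles two words, which is why a single word access is enough.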

William Breathitt Gray (11):
  bitops: Introduce the for_each_set_clump8 macro
  lib/test_bitmap.c: Add for_each_set_clump8 test cases
  gpio: 104-dio-48e: Utilize for_each_set_clump8 macro
  gpio: 104-idi-48: Utilize for_each_set_clump8 macro
  gpio: gpio-mm: Utilize for_each_set_clump8 macro
  gpio: ws16c48: Utilize for_each_set_clump8 macro
  gpio: pci-idio-16: Utilize for_each_set_clump8 macro
  gpio: pcie-idio-24: Utilize for_each_set_clump8 macro
  gpio: uniphier: Utilize for_each_set_clump8 macro
  gpio: 74x164: Utilize the for_each_set_clump8 macro
  thermal: intel: intel_soc_dts_iosf: Utilize for_each_set_clump8 macro

 drivers/gpio/gpio-104-dio-48e.c|  73 --
 drivers/gpio/gpio-104-idi-48.c |  36 ++-
 drivers/gpio/gpio-74x164.c |  19 ++--
 drivers/gpio/gpio-gpio-mm.c|  73 --
 drivers/gpio/gpio-pci-idio-16.c|  75 +-
 drivers/gpio/gpio-pcie-idio-24.c   | 109 -
 drivers/gpio/gpio-uniphier.c   |  16 ++-
 drivers/gpio/gpio-ws16c48.c|  73 --
 drivers/thermal/intel/intel_soc_dts_iosf.c |  29 +++---
 drivers/thermal/intel/intel_soc_dts_iosf.h |   2 -
 include/asm-generic/bitops/find.h  |  61 
 include/linux/bitops.h |   5 +
 lib/test_bitmap.c  |  65 
 13 files changed, 299 insertions(+), 337 deletions(-)

-- 
2.21.0



[PATCH v2] sched/clock: Prevent generic sched_clock wrap caused by tick_freeze()

2019-03-28 Thread Chang-An Chen
tick_freeze(), introduced for suspend-to-idle in commit 124cf9117c5f
("PM / sleep: Make it possible to quiesce timers during suspend-to-idle"),
uses timekeeping_suspend() instead of syscore_suspend() during
suspend-to-idle. As a result, the generic sched_clock keeps running because
sched_clock_suspend() and sched_clock_resume() are not called during
suspend-to-idle. This can lead to a generic sched_clock wrap.

For example:
On my ARM system with suspend-to-idle enabled, sched_clock is registered
as "56 bits at 13MHz, resolution 76ns, wraps every 4398046511101ns", which
means the real wrapping duration is 8796093022202ns.

[  134.551779] suspend-to-idle suspend (timekeeping_suspend())
[ 1204.912239] suspend-to-idle resume (timekeeping_resume())
..
[ 1206.912239] suspend-to-idle suspend (timekeeping_suspend())
[ 5880.502807] suspend-to-idle resume (timekeeping_resume())
..
[ 6000.403724] suspend-to-idle suspend (timekeeping_suspend())
[ 8035.753167] suspend-to-idle resume  (timekeeping_resume())
..
[ 8795.786684] (2)[321:charger_thread]..
[ 8795.788387] (2)[321:charger_thread]..
[0.057226] (0)[0:swapper/0]..
[0.061447] (2)[0:swapper/2]..

sched_clock was not stopped during suspend-to-idle, and the sched_clock_poll
hrtimer did not expire because timekeeping_suspend() is called during
suspend-to-idle. This makes sched_clock wrap at kernel time 8796s.
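
The arithmetic behind the 8796s figure can be checked with a short sketch. It takes the relationship stated above as given, namely that the real wrap period is twice the interval printed at registration; the helper names are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

/*
 * Assumption taken from the description above: the interval reported in
 * the "wraps every ..." message is half of the real wrap period.
 */
static uint64_t sketch_real_wrap_ns(uint64_t reported_wrap_ns)
{
	return reported_wrap_ns * 2;
}

/* Convert nanoseconds to whole seconds, discarding the remainder. */
static uint64_t sketch_ns_to_whole_seconds(uint64_t ns)
{
	return ns / 1000000000ULL;
}
```

Feeding in the reported 4398046511101ns gives 8796093022202ns, or 8796 whole seconds, which lines up with the log above where the timestamps restart shortly after 8795.78s.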

To fix this issue, we add sched_clock_suspend() and sched_clock_resume() in
tick_freeze() together with timekeeping_suspend() and timekeeping_resume()
to make sure generic sched_clock wrapping will not happen.

Signed-off-by: Chang-An Chen 
Fixes: 124cf9117c5f ("PM / sleep: Make it possible to quiesce timers during suspend-to-idle")
---
 kernel/time/sched_clock.c |4 ++--
 kernel/time/tick-common.c |2 ++
 kernel/time/timekeeping.h |7 +++
 3 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/kernel/time/sched_clock.c b/kernel/time/sched_clock.c
index 094b82c..930113b 100644
--- a/kernel/time/sched_clock.c
+++ b/kernel/time/sched_clock.c
@@ -272,7 +272,7 @@ static u64 notrace suspended_sched_clock_read(void)
return cd.read_data[seq & 1].epoch_cyc;
 }
 
-static int sched_clock_suspend(void)
+int sched_clock_suspend(void)
 {
struct clock_read_data *rd = &cd.read_data[0];
 
@@ -283,7 +283,7 @@ static int sched_clock_suspend(void)
return 0;
 }
 
-static void sched_clock_resume(void)
+void sched_clock_resume(void)
 {
struct clock_read_data *rd = &cd.read_data[0];
 
diff --git a/kernel/time/tick-common.c b/kernel/time/tick-common.c
index 529143b..df40146 100644
--- a/kernel/time/tick-common.c
+++ b/kernel/time/tick-common.c
@@ -487,6 +487,7 @@ void tick_freeze(void)
trace_suspend_resume(TPS("timekeeping_freeze"),
 smp_processor_id(), true);
system_state = SYSTEM_SUSPEND;
+   sched_clock_suspend();
timekeeping_suspend();
} else {
tick_suspend_local();
@@ -510,6 +511,7 @@ void tick_unfreeze(void)
 
if (tick_freeze_depth == num_online_cpus()) {
timekeeping_resume();
+   sched_clock_resume();
system_state = SYSTEM_RUNNING;
trace_suspend_resume(TPS("timekeeping_freeze"),
 smp_processor_id(), false);
diff --git a/kernel/time/timekeeping.h b/kernel/time/timekeeping.h
index 7a9b4eb..141ab3a 100644
--- a/kernel/time/timekeeping.h
+++ b/kernel/time/timekeeping.h
@@ -14,6 +14,13 @@ extern ktime_t ktime_get_update_offsets_now(unsigned int *cwsseq,
 extern void timekeeping_warp_clock(void);
 extern int timekeeping_suspend(void);
 extern void timekeeping_resume(void);
+#ifdef CONFIG_GENERIC_SCHED_CLOCK
+extern int sched_clock_suspend(void);
+extern void sched_clock_resume(void);
+#else
+static inline int sched_clock_suspend(void) { return 0; }
+static inline void sched_clock_resume(void) { }
+#endif
 
 extern void do_timer(unsigned long ticks);
 extern void update_wall_time(void);
-- 
1.7.9.5
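
The wrap arithmetic in the patch description above can be checked with a few
lines of C. This is only a sketch: it assumes, as the description states, that
the interval printed at registration time is half of the real wrap interval.
The helper names are hypothetical and are not part of the kernel.

```c
#include <stdint.h>

/* Wrap interval printed at registration time, taken from the log line above. */
#define REPORTED_WRAP_NS 4398046511101ULL

/*
 * Assumption from the patch description: the printed value is half of the
 * real interval, so the real wrap point is twice as far out.
 */
static uint64_t real_wrap_ns(uint64_t reported_ns)
{
	return reported_ns * 2;
}

/* Whole seconds of kernel time that pass before the clock wraps. */
static uint64_t real_wrap_seconds(uint64_t reported_ns)
{
	return real_wrap_ns(reported_ns) / 1000000000ULL;
}
```

With the reported value from the log, real_wrap_seconds() gives 8796, which
lines up with the last timestamps before the log restarts near zero.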



Re: [PATCH v2 06/11] mm/hmm: improve driver API to work and wait over a range v2

2019-03-28 Thread Ira Weiny
On Thu, Mar 28, 2019 at 08:56:54PM -0400, Jerome Glisse wrote:
> On Thu, Mar 28, 2019 at 09:12:21AM -0700, Ira Weiny wrote:
> > On Mon, Mar 25, 2019 at 10:40:06AM -0400, Jerome Glisse wrote:
> > > From: Jérôme Glisse 
> > > 

[snip]

> > > +/*
> > > > + * HMM_RANGE_DEFAULT_TIMEOUT - default timeout (ms) when waiting for a range
> > > > + *
> > > > + * When waiting for mmu notifiers we need some kind of time out otherwise we
> > > > + * could potentialy wait for ever, 1000ms ie 1s sounds like a long time to
> > > > + * wait already.
> > > + */
> > > +#define HMM_RANGE_DEFAULT_TIMEOUT 1000
> > > +
> > >  /* This is a temporary helper to avoid merge conflict between trees. */
> > > +static inline bool hmm_vma_range_done(struct hmm_range *range)
> > > +{
> > > + bool ret = hmm_range_valid(range);
> > > +
> > > + hmm_range_unregister(range);
> > > + return ret;
> > > +}
> > > +
> > >  static inline int hmm_vma_fault(struct hmm_range *range, bool block)
> > >  {
> > > - long ret = hmm_range_fault(range, block);
> > > - if (ret == -EBUSY)
> > > - ret = -EAGAIN;
> > > - else if (ret == -EAGAIN)
> > > - ret = -EBUSY;
> > > - return ret < 0 ? ret : 0;
> > > + long ret;
> > > +
> > > + ret = hmm_range_register(range, range->vma->vm_mm,
> > > +  range->start, range->end);
> > > + if (ret)
> > > + return (int)ret;
> > > +
> > > + if (!hmm_range_wait_until_valid(range, HMM_RANGE_DEFAULT_TIMEOUT)) {
> > > + up_read(&range->vma->vm_mm->mmap_sem);
> > > + return -EAGAIN;
> > > + }
> > > +
> > > + ret = hmm_range_fault(range, block);
> > > + if (ret <= 0) {
> > > + if (ret == -EBUSY || !ret) {
> > > + up_read(&range->vma->vm_mm->mmap_sem);
> > > + ret = -EBUSY;
> > > + } else if (ret == -EAGAIN)
> > > + ret = -EBUSY;
> > > + hmm_range_unregister(range);
> > > + return ret;
> > > + }
> > > + return 0;
> > 
> > > Is hmm_vma_fault() also temporary to keep the nouveau driver working?  It looks
> > > like it to me.
> > > 
> > > This and hmm_vma_range_done() above are part of the old interface which is in
> > > the Documentation correct?  As stated above we should probably change that
> > > documentation with this patch to ensure no new users of these 2 functions
> > > appear.
> 
> Ok will update the documentation, note that i already posted patches to use
> this new API see the ODP RDMA link in the cover letter.
> 

Thanks.  Sorry for my previous email on this patch.  After looking more closely
I see that this is the old interface, but that was not clear at first, and I
have not had time to follow the previous threads.  I'm finding time to do this
now...

Sorry,
Ira



Re: [PATCH 2/5] ARM: dts: imx50: Add Kobo Aura DTS

2019-03-28 Thread Shawn Guo
On Tue, Mar 26, 2019 at 05:26:53PM +0100, Jonathan Neuschäfer wrote:
> Hi, thanks for your comments. I'll address them in v2.
> 
> On Fri, Mar 22, 2019 at 09:31:53AM +0800, Shawn Guo wrote:
> > On Tue, Mar 19, 2019 at 04:24:17PM +0100, Jonathan Neuschäfer wrote:
> > > The Kobo Aura is an e-book reader released in 2013.
> [...]
> > > + sd2_pwrseq: pwrseq {
> > > + compatible = "mmc-pwrseq-simple";
> > > + pinctrl-names = "default";
> > > + pinctrl-0 = <&pinctrl_sd2_reset>;
> > > +
> > 
> > Please do not have random newlines.
> 
> Does that apply to all empty lines between properties?

Yes, that's what we do for i.MX device trees.

> 
> > 
> > > + reset-gpios = <&gpio4 17 GPIO_ACTIVE_LOW>;
> > > + };
> > > +
> [...]
> > > +&iomuxc {
> > > + pinctrl_uart2: uart2 {
> > > + fsl,pins = <
> > > + MX50_PAD_UART2_TXD__UART2_TXD_MUX   0x1e4
> > > + MX50_PAD_UART2_RXD__UART2_RXD_MUX   0x1e4
> > > + >;
> > > + };
> > > +
> > > + pinctrl_i2c1: i2c1 {
> > 
> > Please sort these pinctrl nodes alphabetically.
> 
> It doesn't make a difference here, but should I generally sort by name
> or by label in cases like this one?

Keep using the naming scheme below, and then it never makes a
difference.

pinctrl_xxx: xxx

Shawn


Re: [PATCH v2 10/11] mm/hmm: add helpers for driver to safely take the mmap_sem v2

2019-03-28 Thread Ira Weiny
On Thu, Mar 28, 2019 at 04:34:04PM -0700, John Hubbard wrote:
> On 3/28/19 4:24 PM, Jerome Glisse wrote:
> > On Thu, Mar 28, 2019 at 04:20:37PM -0700, John Hubbard wrote:
> >> On 3/28/19 4:05 PM, Jerome Glisse wrote:
> >>> On Thu, Mar 28, 2019 at 03:43:33PM -0700, John Hubbard wrote:
> >>>> On 3/28/19 3:40 PM, Jerome Glisse wrote:
> >>>>> On Thu, Mar 28, 2019 at 03:25:39PM -0700, John Hubbard wrote:
> >>>>>> On 3/28/19 3:08 PM, Jerome Glisse wrote:
> >>>>>>> On Thu, Mar 28, 2019 at 02:41:02PM -0700, John Hubbard wrote:
> >>>>>>>> On 3/28/19 2:30 PM, Jerome Glisse wrote:
> >>>>>>>>> On Thu, Mar 28, 2019 at 01:54:01PM -0700, John Hubbard wrote:
> >>>>>>>>>> On 3/25/19 7:40 AM, jgli...@redhat.com wrote:
> >>>>>>>>>>> From: Jérôme Glisse 
> >>>>>> [...]
> >>>>>> OK, so let's either drop this patch, or if merge windows won't allow that,
> >>>>>> then *eventually* drop this patch. And instead, put in a hmm_sanity_check()
> >>>>>> that does the same checks.
> >>>>>
> >>>>> RDMA depends on this, so does the nouveau patchset that convert to new API.
> >>>>> So i do not see reason to drop this. They are user for this they are posted
> >>>>> and i hope i explained properly the benefit.
> >>>>>
> >>>>> It is a common pattern. Yes it only save couple lines of code but down the
> >>>>> road i will also help for people working on the mmap_sem patchset.
> >>>>>
> >>>>
> >>>> It *adds* a couple of lines that are misleading, because they look like they
> >>>> make things safer, but they don't actually do so.
> >>>
> >>> It is not about safety, sorry if it confused you but there is nothing about
> >>> safety here, i can add a big fat comment that explains that there is no safety
> >>> here. The intention is to allow the page fault handler that potential have
> >>> hundred of page fault queue up to abort as soon as it sees that it is pointless
> >>> to keep faulting on a dying process.
> >>>
> >>> Again if we race it is _fine_ nothing bad will happen, we are just doing
> >>> useless work that gonna be thrown on the floor and we are just slowing down the
> >>> process tear down.
> >>>
> >>
> >> In addition to a comment, how about naming this thing to indicate the above
> >> intention?  I have a really hard time with this odd down_read() wrapper, which
> >> allows code to proceed without really getting a lock. It's just too wrong-looking.
> >> If it were instead named:
> >>
> >>     hmm_is_exiting()
> > 
> > What about: hmm_lock_mmap_if_alive() ?
> > 
> 
> That's definitely better, but I want to vote for just doing a check, not 
> taking any locks.
> 
> I'm not super concerned about the exact name, but I really want a routine that
> just checks (and optionally asserts, via WARN or BUG), and that's *all*. Then
> drivers can scatter that around like pixie dust as they see fit. Maybe right before
> taking a lock, maybe in other places. Decoupled from locking.

I agree.  Names matter and any function which is called *_down_read and could
potentially not take the lock should be called try_*_down_read.  Furthermore
users should be checking the return values from any try_*.

It is also odd that we are calling "down/up" on something which is not a
semaphore.  So the user here needs to _know_ that they are really getting the
lock on the mm which sits behind the scenes.  What John is proposing is more
explicit when reading driver code.

Ira

> 
> thanks,
> -- 
> John Hubbard
> NVIDIA
> 
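
The two naming options discussed in this message can be put side by side in a
small user-space sketch. Everything here is hypothetical: struct fake_mm is a
stand-in and not the kernel's mm_struct, the reader counter only models a
read-side lock, and none of these helpers exist in HMM.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-process state; not the kernel's mm_struct. */
struct fake_mm {
	bool exiting;	/* set once the process is being torn down */
	int readers;	/* models the number of read-side lock holders */
};

/* Pure check: takes no lock and has no side effects. */
static bool fake_mm_is_exiting(const struct fake_mm *mm)
{
	return mm->exiting;
}

/*
 * "try_" variant: it may decline to take the lock, so the caller has to look
 * at the result. warn_unused_result is a GCC/Clang attribute.
 */
__attribute__((warn_unused_result))
static bool try_fake_mm_down_read(struct fake_mm *mm)
{
	if (fake_mm_is_exiting(mm))
		return false;
	mm->readers++;
	return true;
}

/* Release the read side taken by a successful try_fake_mm_down_read(). */
static void fake_mm_up_read(struct fake_mm *mm)
{
	mm->readers--;
}
```

The check-only helper is decoupled from locking and can be called anywhere,
while the try_ name makes it visible at the call site that the lock may not
have been taken.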


linux-next: build warning after merge of the akpm-current tree

2019-03-28 Thread Stephen Rothwell
Hi all,

After merging the akpm-current tree, today's linux-next build (arm
multi_v7_defconfig) produced this warning:

lib/list_sort.c:17:36: warning: 'pure' attribute ignored [-Wattributes]
   struct list_head const *, struct list_head const *);
^

Introduced by commit

  820c81be5237 ("lib/list_sort: simplify and remove MAX_LIST_LENGTH_BITS")

-- 
Cheers,
Stephen Rothwell




Re: [PATCH v3 03/12] dt-binding: gce: add binding for gce event property

2019-03-28 Thread Bibby Hsieh
Hi, Rob,

Thanks for your review and comments.

On Thu, 2019-03-28 at 13:44 -0500, Rob Herring wrote:
> On Thu, Mar 28, 2019 at 10:19:24AM +0800, Bibby Hsieh wrote:
> > Client hardware would send event to GCE hardware,
> > so #event-cells, mediatek,gce-event-names, mediatek,gce-events.
> > present the event.
> > 
> > Signed-off-by: Bibby Hsieh 
> > ---
> >  Documentation/devicetree/bindings/mailbox/mtk-gce.txt | 17 ++---
> >  1 file changed, 14 insertions(+), 3 deletions(-)
> > 
> > diff --git a/Documentation/devicetree/bindings/mailbox/mtk-gce.txt b/Documentation/devicetree/bindings/mailbox/mtk-gce.txt
> > index 1f7f8f2..2f175d6 100644
> > --- a/Documentation/devicetree/bindings/mailbox/mtk-gce.txt
> > +++ b/Documentation/devicetree/bindings/mailbox/mtk-gce.txt
> > @@ -21,12 +21,21 @@ Required properties:
> > priority: Priority of GCE thread.
> > atomic_exec: GCE processing continuous packets of commands in atomic
> > way.
> > +- #event-cells: Should be 1.
> > +   <&phandle event_number>
> > +   phandle: Label name of a gce node.
> > +   event_number: the event number defined in 'dt-bindings/gce/mt8173-gce.h'
> > + or 'dt-binding/gce/mt8183-gce.h'.
> 
> You only need to have a #*-cells if the number is variable.
> 
I think #event-cells can be removed here. cmdq_dev_get_event() can be
modified as follows:

+u32 cmdq_dev_get_event(struct device *dev, const char *name)
+{
+   <... snip ...>

+   index = of_property_match_string(dev->of_node,
+"mediatek,gce-event-names", 
+   <... snip ...>

+
+   if (of_property_read_u32_index(dev->of_node, "mediatek,gce-events",
index, &result)) {
+   dev_err(dev, "can't parse gce-events property");
+
+   return -ENODEV;
+   }
+
+   return result;
+}
+EXPORT_SYMBOL(cmdq_dev_get_event);

> What are 'events' here? Sounds like interrupts or MSI?
Yes, an event is a way for other hardware to communicate with the GCE.
It is similar to an interrupt.

> 
> >  
> >  Required properties for a client device:
> >  - mboxes: Client use mailbox to communicate with GCE, it should have this
> >property and list of phandle, mailbox specifiers.
> >  - mediatek,gce-subsys: u32, specify the sub-system id which is corresponding
> >    to the register address.
> > +Optional propertier for a client device:
> > +- mediatek,gce-event-names: the event name can be defined by user.
> > +- mediatek,gce-events: u32, the event number defined in
> > +  'dt-bindings/gce/mt8173-gce.h' or 'dt-binding/gce/mt8183-gce.h'.
> >  
> >  Some vaules of properties are defined in 'dt-bindings/gce/mt8173-gce.h'
> >  or 'dt-binding/gce/mt8183-gce.h'. Such as sub-system ids, thread priority, event ids.
> > @@ -40,6 +49,7 @@ Example:
> > clocks = <&infracfg CLK_INFRA_GCE>;
> > clock-names = "gce";
> > #mbox-cells = <3>;
> > +   #event-cells = <1>;
> > };
> >  
> >  Example for a client device:
> > @@ -49,8 +59,9 @@ Example for a client device:
> > mboxes = <&gce 0 CMDQ_THR_PRIO_LOWEST 1>,
> >  <&gce 1 CMDQ_THR_PRIO_LOWEST 1>;
> > mediatek,gce-subsys = ;
> > -   mutex-event-eof =  > -   CMDQ_EVENT_MUTEX1_STREAM_EOF>;
> > -
> > +   mediatek,gce-event-names = "rdma0_sof",
> > +  "rsz0_sof";
> > +   mediatek,gce-events = <&gce CMDQ_EVENT_MDP_RDMA0_SOF>,
> > + <&gce CMDQ_EVENT_MDP_RSZ0_SOF>;
> > ...
> > };
> > -- 
> > 1.9.1
> > 

-- 
Bibby
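
The name-to-number lookup described above (match a string in one property,
then read the value at the same index from the other) can be sketched in plain
user-space C. This is an illustration only: struct fake_client_node and
fake_get_event() are hypothetical, the arrays merely model the two properties,
and no kernel OF helpers are used. One deliberate difference from the snippet
above: the sketch returns the error code separately from the event number,
because a u32 return value cannot carry a negative errno.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the two properties of one client node. */
struct fake_client_node {
	const char *const *event_names;	/* models "mediatek,gce-event-names" */
	const uint32_t *events;		/* models "mediatek,gce-events" */
	size_t count;			/* number of entries in both arrays */
};

/*
 * Find `name` in the name list and store the event number found at the same
 * index. Returns 0 on success or -ENODEV when the name is absent; `*event`
 * is left untouched on failure.
 */
static int fake_get_event(const struct fake_client_node *node,
			  const char *name, uint32_t *event)
{
	for (size_t i = 0; i < node->count; i++) {
		if (strcmp(node->event_names[i], name) == 0) {
			*event = node->events[i];
			return 0;
		}
	}
	return -ENODEV;
}
```

A caller checks the return value first and uses the event number only when the
lookup succeeded.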



Re: [RFC PATCH] mm, kvm: account kvm_vcpu_mmap to kmemcg

2019-03-28 Thread Matthew Wilcox
On Thu, Mar 28, 2019 at 06:28:36PM -0700, Shakeel Butt wrote:
> A VCPU of a VM can allocate upto three pages which can be mmap'ed by the
> user space application. At the moment this memory is not charged. On a
> large machine running large number of VMs (or small number of VMs having
> large number of VCPUs), this unaccounted memory can be very significant.
> So, this memory should be charged to a kmemcg. However that is not
> possible as these pages are mmapped to the userspace and PageKmemcg()
> was designed with the assumption that such pages will never be mmapped
> to the userspace.
> 
> One way to solve this problem is by introducing an additional memcg
> charging API similar to mem_cgroup_[un]charge_skmem(). However skmem
> charging API usage is contained and shared and no new users are
> expected but the pages which can be mmapped and should be charged to
> kmemcg can and will increase. So, requiring the usage for such API will
> increase the maintenance burden. The simplest solution is to remove the
> assumption of no mmapping PageKmemcg() pages to user space.

The usual response under these circumstances is "No, you can't have a
page flag bit".

I don't understand why we need a PageKmemcg anyway.  We already
have an entire pointer in struct page; can we not just check whether
page->mem_cgroup is NULL or not?


Re: [PATCH] Convert struct pid count to refcount_t

2019-03-28 Thread Joel Fernandes
On Thu, Mar 28, 2019 at 10:39:58AM -0400, Joel Fernandes wrote:
> On Thu, Mar 28, 2019 at 03:26:19PM +0100, Oleg Nesterov wrote:
> > On 03/27, Joel Fernandes wrote:
> > >
> > > Also, based on Kees comment, I think it appears to me that get_pid and
> > > put_pid can race in this way in the original code right?
> > >
> > > get_pid   put_pid
> > >
> > >   atomic_dec_and_test returns 1
> > > atomic_inc
> > >   kfree
> > >
> > > deref pid /* boom */
> > > -
> > >
> > > I think get_pid needs to call atomic_inc_not_zero()
> > 
> > No.
> > 
> > get_pid() should only be used if you already have a reference or you do
> > something like
> > 
> > rcu_read_lock();
> > pid = find_vpid();
> > get_pid();
> > rcu_read_unlock();
> > 
> > in this case we rely on call_rcu(delayed_put_pid) which drops the initial
> > reference.
> > 
> > If put_pid() sees pid->count == 1, then a) nobody else has a reference and
> > b) nobody else can find this pid on rcu-protected lists, so it is safe to
> > free it.
> 
> I agree. Check my reply to Jann, I already replied to him about this. thanks!
> 

Also Oleg, why not just call refcount_dec_and_test like below? If count is 1,
then it will decrement to 0 and return true anyway. Is this because we want
to avoid writes at the cost of more reads? Did I miss something? Thank you.

I don't remember very clearly, but I think Kees also asked about the same thing.

diff --git a/kernel/pid.c b/kernel/pid.c
index 2095c7da644d..89c4849fab5d 100644
--- a/kernel/pid.c
+++ b/kernel/pid.c
@@ -106,8 +106,7 @@ void put_pid(struct pid *pid)
return;
 
ns = pid->numbers[pid->level].ns;
-   if ((refcount_read(&pid->count) == 1) ||
-refcount_dec_and_test(&pid->count)) {
+   if (refcount_dec_and_test(&pid->count)) {
kmem_cache_free(ns->pid_cachep, pid);
put_pid_ns(ns);
}
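
The question above can be modelled in user space with C11 atomics. This is a
single-threaded sketch under stated assumptions: struct fake_pid and the
helper names are hypothetical, and the kernel's refcount_t additionally does
saturation checking that is not modelled here.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the counted object; not the kernel's struct pid. */
struct fake_pid {
	atomic_int count;
};

/* Drop one reference; true means the caller dropped the last one. */
static bool fake_dec_and_test(struct fake_pid *pid)
{
	/* atomic_fetch_sub() returns the value held before the subtraction. */
	return atomic_fetch_sub(&pid->count, 1) == 1;
}

/* Shape of the existing code: a plain read first, then the atomic decrement. */
static bool should_free_with_read(struct fake_pid *pid)
{
	return atomic_load(&pid->count) == 1 || fake_dec_and_test(pid);
}

/* Shape of the proposed code: always decrement. */
static bool should_free_dec_only(struct fake_pid *pid)
{
	return fake_dec_and_test(pid);
}
```

Both variants reach the same free/no-free decision for a given starting count.
The visible difference in this model is that the read-first variant leaves the
counter at 1 when it decides to free, so it skips the final write.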


Re: [PATCH 1/1] sched/clock: Prevent generic sched_clock wrap caused by tick_freeze()

2019-03-28 Thread Chang-An Chen
On Mon, 2019-03-25 at 21:54 +0800, Thomas Gleixner wrote:
> On Mon, 25 Mar 2019, Chang-An Chen wrote:
> > --- a/include/linux/sched_clock.h
> > +++ b/include/linux/sched_clock.h
> > @@ -13,6 +13,10 @@
> >  
> >  extern void sched_clock_register(u64 (*read)(void), int bits,
> >  unsigned long rate);
> > +
> > +extern int sched_clock_suspend(void);
> > +
> > +extern void sched_clock_resume(void);
> >  #else
> >  static inline void generic_sched_clock_init(void) { }
> >  
> > @@ -20,6 +24,10 @@ static inline void sched_clock_register(u64 (*read)(void), int bits,
> > unsigned long rate)
> >  {
> >  }
> > +
> > +static int sched_clock_suspend(void) { }
> 
> static inline ...
> 
> > +
> > +static void sched_clock_resume(void) { }
> 
> Ditto
> 
> >  #endif
> 
> Please do not expose this in the global header. All of this is local to
> kernel/time/. So adding this to kernel/time/timekeeping.h is sufficient.
Thanks so much for the review and the suggestion. I'll fix it in the next
version.

Thanks,
Chang-An
> 
> Thanks,
> 
>   tglx
> 





Re: [PATCH] Convert struct pid count to refcount_t

2019-03-28 Thread Joel Fernandes
On Thu, Mar 28, 2019 at 04:00:52PM -0400, Joel Fernandes wrote:
> On Thu, Mar 28, 2019 at 04:17:50PM +0100, Jann Horn wrote:
> > Since we're just talking about RCU stuff now, adding Paul McKenney to
> > the thread.
> > 
> > On Thu, Mar 28, 2019 at 3:37 PM Joel Fernandes wrote:
> > > On Thu, Mar 28, 2019 at 03:57:44AM +0100, Jann Horn wrote:
> > > > On Thu, Mar 28, 2019 at 3:34 AM Joel Fernandes wrote:
> > > > > On Thu, Mar 28, 2019 at 01:59:45AM +0100, Jann Horn wrote:
> > > > > > On Thu, Mar 28, 2019 at 1:06 AM Kees Cook wrote:
> > > > > > > On Wed, Mar 27, 2019 at 7:53 AM Joel Fernandes (Google) wrote:
> > > > > > > >
> > > > > > > > struct pid's count is an atomic_t field used as a refcount. Use
> > > > > > > > refcount_t for it which is basically atomic_t but does 
> > > > > > > > additional
> > > > > > > > checking to prevent use-after-free bugs. No change in behavior 
> > > > > > > > if
> > > > > > > > CONFIG_REFCOUNT_FULL=n.
> > > > > > > >
> > > > > > > > Cc: keesc...@chromium.org
> > > > > > > > Cc: kernel-t...@android.com
> > > > > > > > Cc: kernel-harden...@lists.openwall.com
> > > > > > > > Signed-off-by: Joel Fernandes (Google) 
> > > > > > > > [...]
> > > > > > > > diff --git a/kernel/pid.c b/kernel/pid.c
> > > > > > > > index 20881598bdfa..2095c7da644d 100644
> > > > > > > > --- a/kernel/pid.c
> > > > > > > > +++ b/kernel/pid.c
> > > > > > > > @@ -37,7 +37,7 @@
> > > > > > > >  #include 
> > > > > > > >  #include 
> > > > > > > >  #include 
> > > > > > > > -#include 
> > > > > > > > +#include 
> > > > > > > >  #include 
> > > > > > > >  #include 
> > > > > > > >
> > > > > > > > @@ -106,8 +106,8 @@ void put_pid(struct pid *pid)
> > > > > > > > return;
> > > > > > > >
> > > > > > > > ns = pid->numbers[pid->level].ns;
> > > > > > > > -   if ((atomic_read(&pid->count) == 1) ||
> > > > > > > > -atomic_dec_and_test(&pid->count)) {
> > > > > > > > +   if ((refcount_read(&pid->count) == 1) ||
> > > > > > > > +refcount_dec_and_test(&pid->count)) {
> > > > > > >
> > > > > > > Why is this (and the original code) safe in the face of a race against
> > > > > > > get_pid()? i.e. shouldn't this only use refcount_dec_and_test()? I
> > > > > > > don't see this code pattern anywhere else in the kernel.
> > > > > >
> > > > > > Semantically, it doesn't make a difference whether you do this or
> > > > > > leave out the "refcount_read(&pid->count) == 1". If you read a 1 
> > > > > > from
> > > > > > refcount_read(), then you have the only reference to "struct pid", 
> > > > > > and
> > > > > > therefore you want to free it. If you don't get a 1, you have to
> > > > > > atomically drop a reference, which, if someone else is concurrently
> > > > > > also dropping a reference, may leave you with the last reference (in
> > > > > > the case where refcount_dec_and_test() returns true), in which case
> > > > > > you still have to take care of freeing it.
> > > > >
> > > > > Also, based on Kees comment, I think it appears to me that get_pid and
> > > > > put_pid can race in this way in the original code right?
> > > > >
> > > > > get_pid put_pid
> > > > >
> > > > > atomic_dec_and_test returns 1
> > > >
> > > > This can't happen. get_pid() can only be called on an existing
> > > > reference. If you are calling get_pid() on an existing reference, and
> > > > someone else is dropping another reference with put_pid(), then when
> > > > both functions start running, the refcount must be at least 2.
> > >
> > > Sigh, you are right. Ok. I was quite tired last night when I wrote this.
> > > Obviously, I should have waited a bit and thought it through.
> > >
> > > Kees can you describe more the race you had in mind?
> > >
> > > > > atomic_inc
> > > > > kfree
> > > > >
> > > > > deref pid /* boom */
> > > > > -
> > > > >
> > > > > I think get_pid needs to call atomic_inc_not_zero() and put_pid should
> > > > > not test for pid->count == 1 as condition for freeing, but rather just do
> > > > > atomic_dec_and_test. So something like the following diff. (And I see a
> > > > > similar pattern used in drivers/net/mac.c)
> > > >
> > > > get_pid() can only be called when you already have a refcounted
> > > > reference; in other words, when the reference count is at least one.
> > > > The lifetime management of struct pid differs from the lifetime
> > > > management of most other objects in the kernel; the usual patterns
> > > > don't quite apply here.
> > > >
> > > > Look at put_pid(): When the refcount has reached zero, there is no RCU
> > > > grace period (unlike most other objects with RCU-managed lifetimes).
> > > > Instead, free_pid() has an RCU grace period *before* it invokes
> > > > delayed_put_pid() to drop a reference; and free_pid() is also the
> > > > function that removes a PID f

Re: [PATCH v2 02/11] mm/hmm: use reference counting for HMM struct v2

2019-03-28 Thread Jerome Glisse
On Thu, Mar 28, 2019 at 11:21:00AM -0700, Ira Weiny wrote:
> On Thu, Mar 28, 2019 at 09:50:03PM -0400, Jerome Glisse wrote:
> > On Thu, Mar 28, 2019 at 06:18:35PM -0700, John Hubbard wrote:
> > > On 3/28/19 6:00 PM, Jerome Glisse wrote:
> > > > On Thu, Mar 28, 2019 at 09:57:09AM -0700, Ira Weiny wrote:
> > > >> On Thu, Mar 28, 2019 at 05:39:26PM -0700, John Hubbard wrote:
> > > >>> On 3/28/19 2:21 PM, Jerome Glisse wrote:
> > >  On Thu, Mar 28, 2019 at 01:43:13PM -0700, John Hubbard wrote:
> > > > On 3/28/19 12:11 PM, Jerome Glisse wrote:
> > > >> On Thu, Mar 28, 2019 at 04:07:20AM -0700, Ira Weiny wrote:
> > > >>> On Mon, Mar 25, 2019 at 10:40:02AM -0400, Jerome Glisse wrote:
> > >  From: Jérôme Glisse 
> > > >>> [...]
> > >  @@ -67,14 +78,9 @@ struct hmm {
> > >    */
> > >   static struct hmm *hmm_register(struct mm_struct *mm)
> > >   {
> > >  -struct hmm *hmm = READ_ONCE(mm->hmm);
> > >  +struct hmm *hmm = mm_get_hmm(mm);
> > > >>>
> > > >>> FWIW: having hmm_register == "hmm get" is a bit confusing...
> > > >>
> > > >> The thing is that you want only one hmm struct per process and thus
> > > >> if there is already one and it is not being destroy then you want 
> > > >> to
> > > >> reuse it.
> > > >>
> > > >> Also this is all internal to HMM code and so it should not confuse
> > > >> anyone.
> > > >>
> > > >
> > > > Well, it has repeatedly come up, and I'd claim that it is quite 
> > > > counter-intuitive. So if there is an easy way to make this internal 
> > > > HMM code clearer or better named, I would really love that to 
> > > > happen.
> > > >
> > > > And we shouldn't ever dismiss feedback based on "this is just 
> > > > internal
> > > > xxx subsystem code, no need for it to be as clear as other parts of 
> > > > the
> > > > kernel", right?
> > > 
> > >  Yes but i have not seen any better alternative that present code. If
> > >  there is please submit patch.
> > > 
> > > >>>
> > > >>> Ira, do you have any patch you're working on, or a more detailed 
> > > >>> suggestion there?
> > > >>> If not, then I might (later, as it's not urgent) propose a small 
> > > >>> cleanup patch 
> > > >>> I had in mind for the hmm_register code. But I don't want to 
> > > >>> duplicate effort 
> > > >>> if you're already thinking about it.
> > > >>
> > > >> No I don't have anything.
> > > >>
> > > >> I was just really digging into these this time around and I was about 
> > > >> to
> > > >> comment on the lack of "get's" for some "puts" when I realized that
> > > >> "hmm_register" _was_ the get...
> > > >>
> > > >> :-(
> > > >>
> > > > 
> > > > The get is mm_get_hmm() were you get a reference on HMM from a mm 
> > > > struct.
> > > > John in previous posting complained about me naming that function 
> > > > hmm_get()
> > > > and thus in this version i renamed it to mm_get_hmm() as we are getting
> > > > a reference on hmm from a mm struct.
> > > 
> > > Well, that's not what I recommended, though. The actual conversation went 
> > > like
> > > this [1]:
> > > 
> > > ---
> > > >> So for this, hmm_get() really ought to be symmetric with
> > > >> hmm_put(), by taking a struct hmm*. And the null check is
> > > >> not helping here, so let's just go with this smaller version:
> > > >>
> > > >> static inline struct hmm *hmm_get(struct hmm *hmm)
> > > >> {
> > > >> if (kref_get_unless_zero(&hmm->kref))
> > > >> return hmm;
> > > >>
> > > >> return NULL;
> > > >> }
> > > >>
> > > >> ...and change the few callers accordingly.
> > > >>
> > > >
> > > > What about renaning hmm_get() to mm_get_hmm() instead ?
> > > >
> > > 
> > > For a get/put pair of functions, it would be ideal to pass
> > > the same argument type to each. It looks like we are passing
> > > around hmm*, and hmm retains a reference count on hmm->mm,
> > > so I think you have a choice of using either mm* or hmm* as
> > > the argument. I'm not sure that one is better than the other
> > > here, as the lifetimes appear to be linked pretty tightly.
> > > 
> > > Whichever one is used, I think it would be best to use it
> > > in both the _get() and _put() calls. 
> > > ---
> > > 
> > > Your response was to change the name to mm_get_hmm(), but that's not
> > > what I recommended.
> > 
> > Because i can not do that, hmm_put() can _only_ take hmm struct as
> > input while hmm_get() can _only_ get mm struct as input.
> > 
> > hmm_put() can only take hmm because the hmm we are un-referencing
> > might no longer be associated with any mm struct and thus i do not
> > have a mm struct to use.
> > 
> > hmm_get() can only get mm as input as we need to be careful when
> > accessing the hmm field within the mm struct and thus it is better
> > to have that code 

Re: [PATCH v2 02/11] mm/hmm: use reference counting for HMM struct v2

2019-03-28 Thread Ira Weiny
On Thu, Mar 28, 2019 at 09:50:03PM -0400, Jerome Glisse wrote:
> On Thu, Mar 28, 2019 at 06:18:35PM -0700, John Hubbard wrote:
> > On 3/28/19 6:00 PM, Jerome Glisse wrote:
> > > On Thu, Mar 28, 2019 at 09:57:09AM -0700, Ira Weiny wrote:
> > >> On Thu, Mar 28, 2019 at 05:39:26PM -0700, John Hubbard wrote:
> > >>> On 3/28/19 2:21 PM, Jerome Glisse wrote:
> >  On Thu, Mar 28, 2019 at 01:43:13PM -0700, John Hubbard wrote:
> > > On 3/28/19 12:11 PM, Jerome Glisse wrote:
> > >> On Thu, Mar 28, 2019 at 04:07:20AM -0700, Ira Weiny wrote:
> > >>> On Mon, Mar 25, 2019 at 10:40:02AM -0400, Jerome Glisse wrote:
> >  From: Jérôme Glisse 
> > >>> [...]
> >  @@ -67,14 +78,9 @@ struct hmm {
> >    */
> >   static struct hmm *hmm_register(struct mm_struct *mm)
> >   {
> >  -  struct hmm *hmm = READ_ONCE(mm->hmm);
> >  +  struct hmm *hmm = mm_get_hmm(mm);
> > >>>
> > >>> FWIW: having hmm_register == "hmm get" is a bit confusing...
> > >>
> > >> The thing is that you want only one hmm struct per process and thus
> > >> if there is already one and it is not being destroy then you want to
> > >> reuse it.
> > >>
> > >> Also this is all internal to HMM code and so it should not confuse
> > >> anyone.
> > >>
> > >
> > > Well, it has repeatedly come up, and I'd claim that it is quite 
> > > counter-intuitive. So if there is an easy way to make this internal 
> > > HMM code clearer or better named, I would really love that to happen.
> > >
> > > And we shouldn't ever dismiss feedback based on "this is just internal
> > > xxx subsystem code, no need for it to be as clear as other parts of 
> > > the
> > > kernel", right?
> > 
> >  Yes but i have not seen any better alternative that present code. If
> >  there is please submit patch.
> > 
> > >>>
> > >>> Ira, do you have any patch you're working on, or a more detailed 
> > >>> suggestion there?
> > >>> If not, then I might (later, as it's not urgent) propose a small 
> > >>> cleanup patch 
> > >>> I had in mind for the hmm_register code. But I don't want to duplicate 
> > >>> effort 
> > >>> if you're already thinking about it.
> > >>
> > >> No I don't have anything.
> > >>
> > >> I was just really digging into these this time around and I was about to
> > >> comment on the lack of "get's" for some "puts" when I realized that
> > >> "hmm_register" _was_ the get...
> > >>
> > >> :-(
> > >>
> > > 
> > > The get is mm_get_hmm() were you get a reference on HMM from a mm struct.
> > > John in previous posting complained about me naming that function 
> > > hmm_get()
> > > and thus in this version i renamed it to mm_get_hmm() as we are getting
> > > a reference on hmm from a mm struct.
> > 
> > Well, that's not what I recommended, though. The actual conversation went 
> > like
> > this [1]:
> > 
> > ---
> > >> So for this, hmm_get() really ought to be symmetric with
> > >> hmm_put(), by taking a struct hmm*. And the null check is
> > >> not helping here, so let's just go with this smaller version:
> > >>
> > >> static inline struct hmm *hmm_get(struct hmm *hmm)
> > >> {
> > >> if (kref_get_unless_zero(&hmm->kref))
> > >> return hmm;
> > >>
> > >> return NULL;
> > >> }
> > >>
> > >> ...and change the few callers accordingly.
> > >>
> > >
> > > What about renaning hmm_get() to mm_get_hmm() instead ?
> > >
> > 
> > For a get/put pair of functions, it would be ideal to pass
> > the same argument type to each. It looks like we are passing
> > around hmm*, and hmm retains a reference count on hmm->mm,
> > so I think you have a choice of using either mm* or hmm* as
> > the argument. I'm not sure that one is better than the other
> > here, as the lifetimes appear to be linked pretty tightly.
> > 
> > Whichever one is used, I think it would be best to use it
> > in both the _get() and _put() calls. 
> > ---
> > 
> > Your response was to change the name to mm_get_hmm(), but that's not
> > what I recommended.
> 
> Because i can not do that, hmm_put() can _only_ take hmm struct as
> input while hmm_get() can _only_ get mm struct as input.
> 
> hmm_put() can only take hmm because the hmm we are un-referencing
> might no longer be associated with any mm struct and thus i do not
> have a mm struct to use.
> 
> hmm_get() can only get mm as input as we need to be careful when
> accessing the hmm field within the mm struct and thus it is better
> to have that code within a function than open coded and duplicated
> all over the place.

The input value is not the problem.  The problem is in the naming.

obj = get_obj( various parameters );
put_obj(obj);


The problem is that the function named hmm_register() either "gets" a
reference to _or_ creates and gets a reference to the hmm object.

What John is probably

Re: [PATCH v2 02/11] mm/hmm: use reference counting for HMM struct v2

2019-03-28 Thread Jerome Glisse
On Thu, Mar 28, 2019 at 07:11:17PM -0700, John Hubbard wrote:
> On 3/28/19 6:50 PM, Jerome Glisse wrote:
> [...]
> >>>
> >>> The hmm_put() is just releasing the reference on the hmm struct.
> >>>
> >>> Here i feel i am getting contradicting requirement from different people.
> >>> I don't think there is a way to please everyone here.
> >>>
> >>
> >> That's not a true conflict: you're comparing your actual implementation
> >> to Ira's request, rather than comparing my request to Ira's request.
> >>
> >> I think there's a way forward. Ira and I are actually both asking for the
> >> same thing:
> >>
> >> a) clear, concise get/put routines
> >>
> >> b) avoiding odd side effects in functions that have one name, but do
> >> additional surprising things.
> > 
> > Please show me code because i do not see any other way to do it then
> > how i did.
> > 
> 
> Sure, I'll take a run at it. I've driven you crazy enough with the naming 
> today, it's time to back it up with actual code. :)

Note that every single line in mm_get_hmm() does matter.

> I hope this is not one of those "we must also change Nouveau in N+M steps" 
> situations, though. I'm starting to despair about reviewing code that
> basically can't be changed...

It can be changed, but I would rather not make too many changes in one go.
Each change is like a tango with one partner, and dancing with several
partners at once is painful and much more likely to end with stepped-on feet.

Cheers,
Jérôme


Re: [PATCH v2 09/11] mm/hmm: allow to mirror vma of a file on a DAX backed filesystem v2

2019-03-28 Thread Jerome Glisse
On Thu, Mar 28, 2019 at 11:04:26AM -0700, Ira Weiny wrote:
> On Mon, Mar 25, 2019 at 10:40:09AM -0400, Jerome Glisse wrote:
> > From: Jérôme Glisse 
> > 
> > HMM mirror is a device driver helpers to mirror range of virtual address.
> > It means that the process jobs running on the device can access the same
> > virtual address as the CPU threads of that process. This patch adds support
> > for mirroring mapping of file that are on a DAX block device (ie range of
> > virtual address that is an mmap of a file in a filesystem on a DAX block
> > device). There is no reason to not support such case when mirroring virtual
> > address on a device.
> > 
> > Note that unlike GUP code we do not take page reference hence when we
> > back-off we have nothing to undo.
> > 
> > Changes since v1:
> > - improved commit message
> > > - squashed: Arnd Bergmann: fix unused variable warning in hmm_vma_walk_pud
> > 
> > Signed-off-by: Jérôme Glisse 
> > Reviewed-by: Ralph Campbell 
> > Cc: Andrew Morton 
> > Cc: Dan Williams 
> > Cc: John Hubbard 
> > Cc: Arnd Bergmann 
> > ---
> >  mm/hmm.c | 132 ++-
> >  1 file changed, 111 insertions(+), 21 deletions(-)
> > 
> > diff --git a/mm/hmm.c b/mm/hmm.c
> > index 64a33770813b..ce33151c6832 100644
> > --- a/mm/hmm.c
> > +++ b/mm/hmm.c
> > @@ -325,6 +325,7 @@ EXPORT_SYMBOL(hmm_mirror_unregister);
> >  
> >  struct hmm_vma_walk {
> > > struct hmm_range *range;
> > > +   struct dev_pagemap  *pgmap;
> > > unsigned long   last;
> > > bool fault;
> > > bool block;
> > @@ -499,6 +500,15 @@ static inline uint64_t pmd_to_hmm_pfn_flags(struct 
> > hmm_range *range, pmd_t pmd)
> > range->flags[HMM_PFN_VALID];
> >  }
> >  
> > > +static inline uint64_t pud_to_hmm_pfn_flags(struct hmm_range *range, pud_t pud)
> > +{
> > +   if (!pud_present(pud))
> > +   return 0;
> > +   return pud_write(pud) ? range->flags[HMM_PFN_VALID] |
> > +   range->flags[HMM_PFN_WRITE] :
> > +   range->flags[HMM_PFN_VALID];
> > +}
> > +
> >  static int hmm_vma_handle_pmd(struct mm_walk *walk,
> >   unsigned long addr,
> >   unsigned long end,
> > @@ -520,8 +530,19 @@ static int hmm_vma_handle_pmd(struct mm_walk *walk,
> > return hmm_vma_walk_hole_(addr, end, fault, write_fault, walk);
> >  
> > pfn = pmd_pfn(pmd) + pte_index(addr);
> > -   for (i = 0; addr < end; addr += PAGE_SIZE, i++, pfn++)
> > +   for (i = 0; addr < end; addr += PAGE_SIZE, i++, pfn++) {
> > +   if (pmd_devmap(pmd)) {
> > +   hmm_vma_walk->pgmap = get_dev_pagemap(pfn,
> > + hmm_vma_walk->pgmap);
> > +   if (unlikely(!hmm_vma_walk->pgmap))
> > +   return -EBUSY;
> > +   }
> > pfns[i] = hmm_pfn_from_pfn(range, pfn) | cpu_flags;
> > +   }
> > +   if (hmm_vma_walk->pgmap) {
> > +   put_dev_pagemap(hmm_vma_walk->pgmap);
> > +   hmm_vma_walk->pgmap = NULL;
> > +   }
> > hmm_vma_walk->last = end;
> > return 0;
> >  }
> > @@ -608,10 +629,24 @@ static int hmm_vma_handle_pte(struct mm_walk *walk, 
> > unsigned long addr,
> > if (fault || write_fault)
> > goto fault;
> >  
> > +   if (pte_devmap(pte)) {
> > +   hmm_vma_walk->pgmap = get_dev_pagemap(pte_pfn(pte),
> > + hmm_vma_walk->pgmap);
> > +   if (unlikely(!hmm_vma_walk->pgmap))
> > +   return -EBUSY;
> > > +   } else if (IS_ENABLED(CONFIG_ARCH_HAS_PTE_SPECIAL) && pte_special(pte)) {
> > +   *pfn = range->values[HMM_PFN_SPECIAL];
> > +   return -EFAULT;
> > +   }
> > +
> > *pfn = hmm_pfn_from_pfn(range, pte_pfn(pte)) | cpu_flags;
> 
>   
> 
> > return 0;
> >  
> >  fault:
> > +   if (hmm_vma_walk->pgmap) {
> > +   put_dev_pagemap(hmm_vma_walk->pgmap);
> > +   hmm_vma_walk->pgmap = NULL;
> > +   }
> > pte_unmap(ptep);
> > /* Fault any virtual address we were asked to fault */
> > return hmm_vma_walk_hole_(addr, end, fault, write_fault, walk);
> > @@ -699,12 +734,83 @@ static int hmm_vma_walk_pmd(pmd_t *pmdp,
> > return r;
> > }
> > }
> > +   if (hmm_vma_walk->pgmap) {
> > +   put_dev_pagemap(hmm_vma_walk->pgmap);
> > +   hmm_vma_walk->pgmap = NULL;
> > +   }
> 
> 
> Why is this here and not in hmm_vma_handle_pte()?  Unless I'm just getting
> tired this is the corresponding put when hmm_vma_handle_pte() returns 0 above
> at  above.

This is because get_dev_pagemap() optimizes away taking the reference
if we already hold a reference on the correct dev_pagemap. So if we
released the reference within hmm_vma_handle_pte(), we would
lose the get_dev_pagemap() optimization.

Cheers,
Jérôme
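
The reference-caching pattern described in the reply above can be sketched in userspace C. This is only an illustration under stated assumptions: `struct pagemap`, `pagemap_get()`, `pagemap_put()`, `walk_pfns()` and the two pfn ranges are hypothetical stand-ins, not the kernel's `struct dev_pagemap` API. The point it shows is that the lookup reuses the caller's cached pointer when it already covers the pfn, so a walker that keeps the pointer across iterations needs only one put at the end.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a refcounted device pfn range. */
struct pagemap {
	unsigned long start_pfn;
	unsigned long end_pfn;	/* exclusive */
	int refcount;
};

/* Two illustrative ranges; real code would look these up elsewhere. */
static struct pagemap maps[2] = {
	{ 0x1000, 0x2000, 0 },
	{ 0x8000, 0x9000, 0 },
};

static void pagemap_put(struct pagemap *pgmap)
{
	if (pgmap)
		pgmap->refcount--;
}

/*
 * Return a referenced pagemap covering pfn. If the caller already holds
 * a reference on the right pagemap (cached), reuse it and skip the
 * refcount update; otherwise drop the cached one and look up a new one.
 */
static struct pagemap *pagemap_get(unsigned long pfn, struct pagemap *cached)
{
	size_t i;

	if (cached) {
		if (pfn >= cached->start_pfn && pfn < cached->end_pfn)
			return cached;	/* fast path: no new reference taken */
		pagemap_put(cached);
	}

	for (i = 0; i < sizeof(maps) / sizeof(maps[0]); i++) {
		if (pfn >= maps[i].start_pfn && pfn < maps[i].end_pfn) {
			maps[i].refcount++;
			return &maps[i];
		}
	}
	return NULL;
}

/*
 * Walk n consecutive pfns the way a page-table walker might: keep the
 * cached pointer across iterations and drop the reference once at the
 * end. Returns the number of pfns that were covered by a pagemap.
 */
static int walk_pfns(unsigned long first_pfn, int n)
{
	struct pagemap *pgmap = NULL;
	int covered = 0;
	int i;

	for (i = 0; i < n; i++) {
		pgmap = pagemap_get(first_pfn + i, pgmap);
		if (!pgmap)
			break;
		covered++;
	}
	pagemap_put(pgmap);	/* single put for the whole walk */
	return covered;
}
```

If the put were done inside the per-entry step instead, every iteration would pass NULL as the cached pointer and pay for a full lookup plus a refcount update.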


[PATCH v2 1/2] mtd: maps: physmap: Store gpio_values correctly

2019-03-28 Thread Chris Packham
When the gpio-addr-flash.c driver was merged with physmap-core.c the
code to store the current gpio_values was lost. This meant that once a
gpio was asserted it was never de-asserted. Fix this by storing the
current offset in gpio_values like the old driver used to.

Fixes: commit ba32ce95cbd9 ("mtd: maps: Merge gpio-addr-flash.c into physmap-core.c")
Cc: 
Signed-off-by: Chris Packham 
Reviewed-by: Boris Brezillon 
---
Changes in v2:
- Cc stable, add Boris' review

 drivers/mtd/maps/physmap-core.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/mtd/maps/physmap-core.c b/drivers/mtd/maps/physmap-core.c
index d9a3e4bebe5d..21b556afc305 100644
--- a/drivers/mtd/maps/physmap-core.c
+++ b/drivers/mtd/maps/physmap-core.c
@@ -132,6 +132,8 @@ static void physmap_set_addr_gpios(struct physmap_flash_info *info,
 
gpiod_set_value(info->gpios->desc[i], !!(BIT(i) & ofs));
}
+
+   info->gpio_values = ofs;
 }
 
 #define win_mask(order)(BIT(order) - 1)
-- 
2.21.0
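
The effect described in the commit message above can be reproduced with a small userspace model. This is a sketch, assuming the driver skips any pin whose bit matches the remembered value, which is what the commit message implies; `struct flash_info`, `set_addr_gpios()`, the `pin[]` array and the `store_values` switch are hypothetical names used only for illustration.

```c
#include <assert.h>

#define NGPIOS 2

/* Hypothetical stand-in for the driver state. */
struct flash_info {
	unsigned int gpio_values;	/* last window index driven onto the pins */
	int pin[NGPIOS];		/* simulated GPIO line levels */
	int store_values;		/* 1 = remember the window (the fix), 0 = forget */
};

/* Drive the address pins for the requested window, touching only changed bits. */
static void set_addr_gpios(struct flash_info *info, unsigned int window)
{
	unsigned int i;

	if (info->gpio_values == window)
		return;

	for (i = 0; i < NGPIOS; i++) {
		unsigned int bit = 1u << i;

		/* Pin already believed to be at the right level: skip it. */
		if ((window & bit) == (info->gpio_values & bit))
			continue;
		info->pin[i] = !!(window & bit);
	}

	/* Without this store, the remembered value stays stale forever. */
	if (info->store_values)
		info->gpio_values = window;
}
```

With `store_values` set to 0, selecting window 1 and then window 0 leaves pin 0 high, because the second call compares against a remembered value of 0 and returns early. With the store in place the pin is de-asserted as expected.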



[PATCH v2 2/2] dt-binding: mtd: physmap: Add example using addr-gpios property

2019-03-28 Thread Chris Packham
Add an example showing how to use the addr-gpios property to deal with a
system with limited IO space.

Cc: devicet...@vger.kernel.org
Signed-off-by: Chris Packham 
---
Changes in v2:
- None

 .../devicetree/bindings/mtd/mtd-physmap.txt  | 16 
 1 file changed, 16 insertions(+)

diff --git a/Documentation/devicetree/bindings/mtd/mtd-physmap.txt 
b/Documentation/devicetree/bindings/mtd/mtd-physmap.txt
index 7df0dcaccb7d..c69f4f065d23 100644
--- a/Documentation/devicetree/bindings/mtd/mtd-physmap.txt
+++ b/Documentation/devicetree/bindings/mtd/mtd-physmap.txt
@@ -96,3 +96,19 @@ An example using SRAM:
bank-width = <2>;
};
 
+An example using addr-gpios:
+
+   flash@2000 {
+   #address-cells = <1>;
+   #size-cells = <1>;
+   compatible = "cfi-flash", "jedec-flash";
+   reg = <0x2000 0x0200>;
+   ranges = <0 0x 0x0200
+ 1 0x0200 0x0200>;
+   bank-width = <2>;
+   addr-gpios = <&gpio1 2 GPIO_ACTIVE_HIGH>;
+   partition@0 {
+   label = "test-part1";
+   reg = <0 0x0400>;
+   };
+   };
-- 
2.21.0



[PATCH v2 0/2] mtd: physmap: Using gpio-addrs

2019-03-28 Thread Chris Packham
I have a system with ADDR24 of the flash chip connected to a gpio pin. This
series fixes a bug in physmap-core.c and adds an example for using addr-gpios
to the dt-binding based on the system I'm using.

Chris Packham (2):
  mtd: maps: physmap: Store gpio_values correctly
  dt-binding: mtd: physmap: Add example using addr-gpios property

 .../devicetree/bindings/mtd/mtd-physmap.txt  | 16 
 drivers/mtd/maps/physmap-core.c  |  2 ++
 2 files changed, 18 insertions(+)

-- 
2.21.0



Re: [PATCH v2 02/11] mm/hmm: use reference counting for HMM struct v2

2019-03-28 Thread John Hubbard
On 3/28/19 6:50 PM, Jerome Glisse wrote:
[...]
>>>
>>> The hmm_put() is just releasing the reference on the hmm struct.
>>>
>>> Here i feel i am getting contradicting requirement from different people.
>>> I don't think there is a way to please everyone here.
>>>
>>
>> That's not a true conflict: you're comparing your actual implementation
>> to Ira's request, rather than comparing my request to Ira's request.
>>
>> I think there's a way forward. Ira and I are actually both asking for the
>> same thing:
>>
>> a) clear, concise get/put routines
>>
>> b) avoiding odd side effects in functions that have one name, but do
>> additional surprising things.
> 
> Please show me code because i do not see any other way to do it then
> how i did.
> 

Sure, I'll take a run at it. I've driven you crazy enough with the naming 
today, it's time to back it up with actual code. :)

I hope this is not one of those "we must also change Nouveau in N+M steps" 
situations, though. I'm starting to despair about reviewing code that
basically can't be changed...

thanks,
-- 
John Hubbard
NVIDIA
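
A symmetric get/put pair of the kind asked for in (a) might look like the following userspace sketch. It is not the HMM code: `struct obj`, `obj_init()`, `obj_get()` and `obj_put()` are hypothetical names, and C11 atomics stand in for the kernel's kref helpers. Both routines take the same pointer type, and the get refuses to resurrect an object whose count has already dropped to zero, which is the behaviour `kref_get_unless_zero()` provides in the quoted hmm_get() proposal.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical refcounted object. */
struct obj {
	atomic_int refcount;
	bool released;		/* set when the last reference is dropped */
};

static void obj_init(struct obj *o)
{
	atomic_init(&o->refcount, 1);	/* creator holds the first reference */
	o->released = false;
}

/* Take a reference only if the count has not already reached zero. */
static struct obj *obj_get(struct obj *o)
{
	int old = atomic_load(&o->refcount);

	while (old != 0) {
		/* On failure, old is reloaded with the current count. */
		if (atomic_compare_exchange_weak(&o->refcount, &old, old + 1))
			return o;
	}
	return NULL;
}

/* Drop a reference; the caller that drops the last one releases the object. */
static void obj_put(struct obj *o)
{
	if (atomic_fetch_sub(&o->refcount, 1) == 1)
		o->released = true;
}
```

Keeping lookup-or-create logic in a separately named function, rather than inside the get, is one way to avoid the "one name, surprising side effects" problem raised in (b).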


Re: [PATCH v2 07/11] mm/hmm: add default fault flags to avoid the need to pre-fill pfns arrays.

2019-03-28 Thread Jerome Glisse
On Thu, Mar 28, 2019 at 07:05:21PM -0700, John Hubbard wrote:
> On 3/28/19 6:59 PM, Jerome Glisse wrote:
> >> [...]
> > Indeed I did not realize there is an hmm "pfn" until I saw this 
> > function:
> >
> > /*
> >  * hmm_pfn_from_pfn() - create a valid HMM pfn value from pfn
> >  * @range: range use to encode HMM pfn value
> >  * @pfn: pfn value for which to create the HMM pfn
> >  * Returns: valid HMM pfn for the pfn
> >  */
> > static inline uint64_t hmm_pfn_from_pfn(const struct hmm_range *range,
> > unsigned long pfn)
> >
> > So should this patch contain some sort of helper like this... maybe?
> >
> > I'm assuming the "hmm_pfn" being returned above is the device pfn being
> > discussed here?
> >
> > I'm also thinking calling it pfn is confusing.  I'm not advocating a 
> > new type
> > but calling the "device pfn's" "hmm_pfn" or "device_pfn" seems like it 
> > would
> > have shortened the discussion here.
> >
> 
>  That helper is also use today by nouveau so changing that name is not 
>  that
>  easy it does require the multi-release dance. So i am not sure how much
>  value there is in a name change.
> 
> >>>
> >>> Once the dust settles, I would expect that a name change for this could go
> >>> via Andrew's tree, right? It seems incredible to claim that we've built 
> >>> something
> >>> that effectively does not allow any minor changes!
> >>>
> >>> I do think it's worth some *minor* trouble to improve the name, assuming 
> >>> that we
> >>> can do it in a simple patch, rather than some huge maintainer-level 
> >>> effort.
> >>
> >> Change to nouveau have to go through nouveau tree so changing name means:
> 
> Yes, I understand the guideline, but is that always how it must be done? Ben 
> (+cc)?

Yes. It is not only about nouveau; it will apply to every single
upstream driver using HMM. It is the easiest solution: every other
solution involves coordination and/or the risk that whoever handles
the conflict does something that breaks things.

Cheers,
Jérôme


Re: [PATCH v2 09/11] mm/hmm: allow to mirror vma of a file on a DAX backed filesystem v2

2019-03-28 Thread Ira Weiny
On Mon, Mar 25, 2019 at 10:40:09AM -0400, Jerome Glisse wrote:
> From: Jérôme Glisse 
> 
> HMM mirror is a device driver helpers to mirror range of virtual address.
> It means that the process jobs running on the device can access the same
> virtual address as the CPU threads of that process. This patch adds support
> for mirroring mapping of file that are on a DAX block device (ie range of
> virtual address that is an mmap of a file in a filesystem on a DAX block
> device). There is no reason to not support such case when mirroring virtual
> address on a device.
> 
> Note that unlike GUP code we do not take page reference hence when we
> back-off we have nothing to undo.
> 
> Changes since v1:
> - improved commit message
> - squashed: Arnd Bergmann: fix unused variable warning in hmm_vma_walk_pud
> 
> Signed-off-by: Jérôme Glisse 
> Reviewed-by: Ralph Campbell 
> Cc: Andrew Morton 
> Cc: Dan Williams 
> Cc: John Hubbard 
> Cc: Arnd Bergmann 
> ---
>  mm/hmm.c | 132 ++-
>  1 file changed, 111 insertions(+), 21 deletions(-)
> 
> diff --git a/mm/hmm.c b/mm/hmm.c
> index 64a33770813b..ce33151c6832 100644
> --- a/mm/hmm.c
> +++ b/mm/hmm.c
> @@ -325,6 +325,7 @@ EXPORT_SYMBOL(hmm_mirror_unregister);
>  
>  struct hmm_vma_walk {
>   struct hmm_range *range;
> + struct dev_pagemap  *pgmap;
>   unsigned long   last;
>   bool fault;
>   bool block;
> @@ -499,6 +500,15 @@ static inline uint64_t pmd_to_hmm_pfn_flags(struct hmm_range *range, pmd_t pmd)
>   range->flags[HMM_PFN_VALID];
>  }
>  
> +static inline uint64_t pud_to_hmm_pfn_flags(struct hmm_range *range, pud_t pud)
> +{
> + if (!pud_present(pud))
> + return 0;
> + return pud_write(pud) ? range->flags[HMM_PFN_VALID] |
> + range->flags[HMM_PFN_WRITE] :
> + range->flags[HMM_PFN_VALID];
> +}
> +
>  static int hmm_vma_handle_pmd(struct mm_walk *walk,
> unsigned long addr,
> unsigned long end,
> @@ -520,8 +530,19 @@ static int hmm_vma_handle_pmd(struct mm_walk *walk,
>   return hmm_vma_walk_hole_(addr, end, fault, write_fault, walk);
>  
>   pfn = pmd_pfn(pmd) + pte_index(addr);
> - for (i = 0; addr < end; addr += PAGE_SIZE, i++, pfn++)
> + for (i = 0; addr < end; addr += PAGE_SIZE, i++, pfn++) {
> + if (pmd_devmap(pmd)) {
> + hmm_vma_walk->pgmap = get_dev_pagemap(pfn,
> +   hmm_vma_walk->pgmap);
> + if (unlikely(!hmm_vma_walk->pgmap))
> + return -EBUSY;
> + }
>   pfns[i] = hmm_pfn_from_pfn(range, pfn) | cpu_flags;
> + }
> + if (hmm_vma_walk->pgmap) {
> + put_dev_pagemap(hmm_vma_walk->pgmap);
> + hmm_vma_walk->pgmap = NULL;
> + }
>   hmm_vma_walk->last = end;
>   return 0;
>  }
> @@ -608,10 +629,24 @@ static int hmm_vma_handle_pte(struct mm_walk *walk, unsigned long addr,
>   if (fault || write_fault)
>   goto fault;
>  
> + if (pte_devmap(pte)) {
> + hmm_vma_walk->pgmap = get_dev_pagemap(pte_pfn(pte),
> +   hmm_vma_walk->pgmap);
> + if (unlikely(!hmm_vma_walk->pgmap))
> + return -EBUSY;
> + } else if (IS_ENABLED(CONFIG_ARCH_HAS_PTE_SPECIAL) && pte_special(pte)) {
> + *pfn = range->values[HMM_PFN_SPECIAL];
> + return -EFAULT;
> + }
> +
>   *pfn = hmm_pfn_from_pfn(range, pte_pfn(pte)) | cpu_flags;



>   return 0;
>  
>  fault:
> + if (hmm_vma_walk->pgmap) {
> + put_dev_pagemap(hmm_vma_walk->pgmap);
> + hmm_vma_walk->pgmap = NULL;
> + }
>   pte_unmap(ptep);
>   /* Fault any virtual address we were asked to fault */
>   return hmm_vma_walk_hole_(addr, end, fault, write_fault, walk);
> @@ -699,12 +734,83 @@ static int hmm_vma_walk_pmd(pmd_t *pmdp,
>   return r;
>   }
>   }
> + if (hmm_vma_walk->pgmap) {
> + put_dev_pagemap(hmm_vma_walk->pgmap);
> + hmm_vma_walk->pgmap = NULL;
> + }


Why is this here and not in hmm_vma_handle_pte()?  Unless I'm just getting
tired this is the corresponding put when hmm_vma_handle_pte() returns 0 above
at  above.

Ira



Re: [PATCH v2 07/11] mm/hmm: add default fault flags to avoid the need to pre-fill pfns arrays.

2019-03-28 Thread John Hubbard
On 3/28/19 6:59 PM, Jerome Glisse wrote:
>> [...]
> Indeed I did not realize there is an hmm "pfn" until I saw this function:
>
> /*
>  * hmm_pfn_from_pfn() - create a valid HMM pfn value from pfn
>  * @range: range use to encode HMM pfn value
>  * @pfn: pfn value for which to create the HMM pfn
>  * Returns: valid HMM pfn for the pfn
>  */
> static inline uint64_t hmm_pfn_from_pfn(const struct hmm_range *range,
> unsigned long pfn)
>
> So should this patch contain some sort of helper like this... maybe?
>
> I'm assuming the "hmm_pfn" being returned above is the device pfn being
> discussed here?
>
> I'm also thinking calling it pfn is confusing.  I'm not advocating a new 
> type
> but calling the "device pfn's" "hmm_pfn" or "device_pfn" seems like it 
> would
> have shortened the discussion here.
>

 That helper is also use today by nouveau so changing that name is not that
 easy it does require the multi-release dance. So i am not sure how much
 value there is in a name change.

>>>
>>> Once the dust settles, I would expect that a name change for this could go
>>> via Andrew's tree, right? It seems incredible to claim that we've built 
>>> something
>>> that effectively does not allow any minor changes!
>>>
>>> I do think it's worth some *minor* trouble to improve the name, assuming 
>>> that we
>>> can do it in a simple patch, rather than some huge maintainer-level effort.
>>
>> Change to nouveau have to go through nouveau tree so changing name means:

Yes, I understand the guideline, but is that always how it must be done? Ben 
(+cc)?

>>  -  release N add function with new name, maybe make the old function just
>> a wrapper to the new function
>>  -  release N+1 update user to use the new name
>>  -  release N+2 remove the old name
>>
>> So it is doable but painful, so I would rather do that later than now,
>> as I am sure people will then complain again about some little thing and it
>> will postpone this whole patchset over that new bit. To avoid postponing
>> RDMA and a bunch of other patchsets that build on top of this, I would rather
>> get this patchset in and then make more changes in the next cycle.
>>
>> This is just a capacity thing.
> 
> Also, for clarity: the API changes I am making in this patchset are there to
> make the ODP conversion easier, and thus they bring real, hard value. Renaming
> those functions is aesthetic. I am not saying it is useless; I am saying it
> does not have the same value as those other changes, and I would rather not
> miss another merge window just for aesthetic changes.
> 

Agreed, that this minor point should not hold up this patch.

thanks,
-- 
John Hubbard
NVIDIA


[PATCH] max98357a:add dai without triggered by pcm

2019-03-28 Thread tony . zou . hq
From: Tony Zou 

The max98357a enable pin needs to be set independently
when the max98357a shares an I2S bus with another codec.

Add a DAI "max98357a-hifi" that has no PCM trigger,
and use "Spk PA Switch" to set the enable pin.

Signed-off-by: Tony Zou 
---
 sound/soc/codecs/max98357a.c |   94 ++
 1 file changed, 77 insertions(+), 17 deletions(-)

diff --git a/sound/soc/codecs/max98357a.c b/sound/soc/codecs/max98357a.c
index d469576..bd3e77b 100644
--- a/sound/soc/codecs/max98357a.c
+++ b/sound/soc/codecs/max98357a.c
@@ -51,12 +51,52 @@ static int max98357a_daiops_trigger(struct snd_pcm_substream *substream,
return 0;
 }
 
+static const char * const ext_spk_text[] = {
+   "Off", "On"
+};
+
+static const struct soc_enum ext_spk_enum =
+   SOC_ENUM_SINGLE(SND_SOC_NOPM, 0,
+   ARRAY_SIZE(ext_spk_text), ext_spk_text);
+
+
+static const struct snd_kcontrol_new ext_spk_mux =
+   SOC_DAPM_ENUM("Spk PA Switch", ext_spk_enum);
+
+
+static int max98357a_enable_spk_pa(struct snd_soc_dapm_widget *w,
+   struct snd_kcontrol *kcontrol, int event)
+{
+   struct snd_soc_component *cmpnt = snd_soc_dapm_to_component(w->dapm);
+   struct gpio_desc *sdmode = snd_soc_component_get_drvdata(cmpnt);
+
+   if (!sdmode)
+   return 0;
+
+   switch (event) {
+   case SND_SOC_DAPM_POST_PMU:
+   gpiod_set_value(sdmode, 1);
+   break;
+   case SND_SOC_DAPM_PRE_PMD:
+   gpiod_set_value(sdmode, 0);
+   break;
+   }
+   return 0;
+}
+
+
 static const struct snd_soc_dapm_widget max98357a_dapm_widgets[] = {
SND_SOC_DAPM_OUTPUT("Speaker"),
+   SND_SOC_DAPM_SPK("Spk PA", max98357a_enable_spk_pa),
+   SND_SOC_DAPM_MUX("Spk PA Switch", SND_SOC_NOPM, 0, 0,
+   &ext_spk_mux),
 };
 
 static const struct snd_soc_dapm_route max98357a_dapm_routes[] = {
{"Speaker", NULL, "HiFi Playback"},
+   {"Speaker", NULL, "Spk PA"},
+   {"Spk PA", NULL, "Spk PA Switch"},
+   {"Spk PA Switch", "On", "HiFi Playback1"},
 };
 
 static int max98357a_component_probe(struct snd_soc_component *component)
@@ -88,30 +128,50 @@ static int max98357a_component_probe(struct snd_soc_component *component)
.trigger= max98357a_daiops_trigger,
 };
 
-static struct snd_soc_dai_driver max98357a_dai_driver = {
-   .name = "HiFi",
-   .playback = {
-   .stream_name= "HiFi Playback",
-   .formats= SNDRV_PCM_FMTBIT_S16 |
-   SNDRV_PCM_FMTBIT_S24 |
-   SNDRV_PCM_FMTBIT_S32,
-   .rates  = SNDRV_PCM_RATE_8000 |
-   SNDRV_PCM_RATE_16000 |
-   SNDRV_PCM_RATE_48000 |
-   SNDRV_PCM_RATE_96000,
-   .rate_min   = 8000,
-   .rate_max   = 96000,
-   .channels_min   = 1,
-   .channels_max   = 2,
+static struct snd_soc_dai_driver max98357a_dai_driver[] = {
+   {
+   .name = "HiFi",
+   .playback = {
+   .stream_name= "HiFi Playback",
+   .formats= SNDRV_PCM_FMTBIT_S16 |
+   SNDRV_PCM_FMTBIT_S24 |
+   SNDRV_PCM_FMTBIT_S32,
+   .rates  = SNDRV_PCM_RATE_8000 |
+   SNDRV_PCM_RATE_16000 |
+   SNDRV_PCM_RATE_48000 |
+   SNDRV_PCM_RATE_96000,
+   .rate_min   = 8000,
+   .rate_max   = 96000,
+   .channels_min   = 1,
+   .channels_max   = 2,
+   },
+   .ops= &max98357a_dai_ops,
+   },
+   {
+   .name = "max98357a-hifi",
+   .playback = {
+   .stream_name= "HiFi Playback1",
+   .formats= SNDRV_PCM_FMTBIT_S16 |
+   SNDRV_PCM_FMTBIT_S24 |
+   SNDRV_PCM_FMTBIT_S32,
+   .rates  = SNDRV_PCM_RATE_8000 |
+   SNDRV_PCM_RATE_16000 |
+   SNDRV_PCM_RATE_48000 |
+   SNDRV_PCM_RATE_96000,
+   .rate_min   = 8000,
+   .rate_max   = 96000,
+   .channels_min   = 1,
+   .channels_max   = 2,
+   },
+   .ops= NULL,
},
-   .ops= &max98357a_dai_ops,
 };
 
 static int max98357a_platform_probe(struct platform_device 

Re: [PATCH RFC] KVM: x86: vmx: throttle immediate exit through preemtion timer to assist buggy guests

2019-03-28 Thread Liran Alon



> On 28 Mar 2019, at 22:31, Vitaly Kuznetsov  wrote:
> 
> This is embarrassing but we have another Windows/Hyper-V issue to workaround
> in KVM (or QEMU). Hope "RFC" makes it less offensive.
> 
> It was noticed that Hyper-V guest on q35 KVM/QEMU VM hangs on boot if e.g.
> 'piix4-usb-uhci' device is attached. The problem with this device is that
> it uses level-triggered interrupts.
> 
> The 'hang' scenario develops like this:
> 1) Hyper-V boots and QEMU is trying to inject two irq simultaneously. One
> of them is level-triggered. KVM injects the edge-triggered one and
> requests immediate exit to inject the level-triggered:
> 
> kvm_set_irq:  gsi 23 level 1 source 0
> kvm_msi_set_irq:  dst 0 vec 80 (Fixed|physical|level)
> kvm_apic_accept_irq:  apicid 0 vec 80 (Fixed|edge)
> kvm_msi_set_irq:  dst 0 vec 96 (Fixed|physical|edge)
> kvm_apic_accept_irq:  apicid 0 vec 96 (Fixed|edge)
> kvm_nested_vmexit_inject: reason EXTERNAL_INTERRUPT info1 0 info2 0 int_info 
> 8060 int_info_err 0

There is no immediate-exit here.
Here QEMU just set two pending irqs: vector 80 and vector 96.
Because vCPU 0 is running in non-root mode, KVM emulates an exit from L2 to L1
on EXTERNAL_INTERRUPT.
Note that EXTERNAL_INTERRUPT is emulated on vector 0x60==96, which is the
higher of the two pending vectors, and that is correct.

BTW, I don’t know why both are set in LAPIC as edge-triggered and not 
level-triggered.
But it can be seen from trace pattern that these interrupts are both 
level-triggered. (See QEMU’s ioapic_service()).
How did you deduce that one is edge-triggered and the other is level-triggered?

> 
> 2) Hyper-V requires one of its VMs to run to handle the situation but
> immediate exit happens:
> 
> kvm_entry:vcpu 0
> kvm_exit: reason VMRESUME rip 0xf80006a40115 info 0 0
> kvm_entry:vcpu 0
> kvm_exit: reason PREEMPTION_TIMER rip 0xf8022f3d8350 info 0 0
> kvm_nested_vmexit:rip f8022f3d8350 reason PREEMPTION_TIMER info1 0 
> info2 0 int_info 0 int_info_err 0
> kvm_nested_vmexit_inject: reason EXTERNAL_INTERRUPT info1 0 info2 0 int_info 
> 8050 int_info_err 0

I assume that as part of Hyper-V VMExit handler for EXTERNAL_INTERRUPT, it will 
forward the interrupt to the host.
As done in KVM vcpu_enter_guest() calling kvm_x86_ops->handle_external_intr().
Because vmcs->vm_exit_intr_info specifies vector 96, we are still left with 
vector 80 pending.

I also assume that Hyper-V utilises VM_EXIT_ACK_INTR_ON_EXIT and thus vector 96
is cleared from LAPIC IRR
and the bit in LAPIC ISR for vector 96 is set.
This is emulated by L0 KVM at nested_vmx_vmexit() -> kvm_cpu_get_interrupt().

I further assume that at the point that vector 96 runs in L1, interrupts are 
disabled.
Afterwards I would expect L1 to enable interrupts (Similar to 
vcpu_enter_guest() calling local_irq_enable() after 
kvm_x86_ops->handle_external_intr()).

I would expect Hyper-V handler for vector 96 at some point to do EOI such that 
when interrupts are later enabled, vector 80 will also get injected.
All of this before attempting to resume back into L2.

However, it can be seen that indeed at this resume, you receive, after an 
immediate-exit, an injection of EXTERNAL_INTERRUPT on vector 0x50==80.
As if Hyper-V never enabled interrupts after handling vector 96 before doing a 
resume again to L2.

This is still valid of course but just a bit bizarre and inefficient. Oh well. 
:)

> 
> 3) Hyper-V doesn't want to deal with the second irq (as its VM still didn't
> process the first one)

Both interrupts are for L1 not L2.

> so it just does 'EOI' for level-triggered interrupt
> and this causes re-injection:
> 
> kvm_exit: reason EOI_INDUCED rip 0xf80006a17e1a info 50 0
> kvm_eoi:  apicid 0 vector 80
> kvm_userspace_exit:   reason KVM_EXIT_IOAPIC_EOI (26)
> kvm_set_irq:  gsi 23 level 1 source 0
> kvm_msi_set_irq:  dst 0 vec 80 (Fixed|physical|level)
> kvm_apic_accept_irq:  apicid 0 vec 80 (Fixed|edge)
> kvm_entry:vcpu 0

What happens here is that Hyper-V, in response to the second EXTERNAL_INTERRUPT
on vector 80,
invokes the vector 80 handler, which performs an EOI. That vector is configured
in ioapic_exit_bitmap, so the EOI causes an EOI_INDUCED exit to L0.
The EOI_INDUCED handler will reach handle_apic_eoi_induced() -> 
kvm_apic_set_eoi_accelerated() -> kvm_ioapic_send_eoi() -> 
kvm_make_request(KVM_REQ_IOAPIC_EOI_EXIT),
which will cause the exit on KVM_EXIT_IOAPIC_EOI to QEMU as required.

As part of QEMU handling for this exit (ioapic_eoi_broadcast()), it will note 
that pin’s irr is still set (irq-line was not lowered by vector 80 interrupt 
handler before EOI),
and thus vector 80 is re-injected by IOAPIC at ioapic_service().
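
The re-injection behaviour described above can be modelled with a few lines of userspace C. This is a simplified sketch of a single level-triggered pin, not QEMU's or KVM's implementation: `struct pin`, `pin_service()`, `pin_set_level()` and `pin_eoi()` are hypothetical names, and the `deliveries` counter exists only so the effect can be observed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one level-triggered interrupt pin. */
struct pin {
	bool line_asserted;	/* the device is still driving the line */
	bool remote_irr;	/* delivered to the CPU, waiting for EOI */
	int deliveries;		/* how many times the vector was injected */
};

/* Deliver the interrupt unless one is already in flight. */
static void pin_service(struct pin *p)
{
	if (p->remote_irr)
		return;
	p->remote_irr = true;
	p->deliveries++;
}

/* The device raises or lowers the line. */
static void pin_set_level(struct pin *p, bool level)
{
	p->line_asserted = level;
	if (level)
		pin_service(p);
}

/*
 * The guest signals end-of-interrupt. If the handler did not make the
 * device lower the line first, the interrupt is delivered again.
 */
static void pin_eoi(struct pin *p)
{
	p->remote_irr = false;
	if (p->line_asserted)
		pin_service(p);
}
```

In this model an EOI issued while the line is still high produces a second delivery immediately, which matches the repeated vector 80 injection visible in the trace.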

If this is indeed a level-triggered interrupt, then it seems buggy to me that 
vector 80 handler hasn’t lowered the irq-line before EOI.
I would suggest adding a trace to QEMU’s ioapic_set_irq() for when vector=80 
and level=0 and i

Re: [PATCH v2 07/11] mm/hmm: add default fault flags to avoid the need to pre-fill pfns arrays.

2019-03-28 Thread Jerome Glisse
On Thu, Mar 28, 2019 at 09:42:59PM -0400, Jerome Glisse wrote:
> On Thu, Mar 28, 2019 at 06:30:26PM -0700, John Hubbard wrote:
> > On 3/28/19 6:17 PM, Jerome Glisse wrote:
> > > On Thu, Mar 28, 2019 at 09:42:31AM -0700, Ira Weiny wrote:
> > >> On Thu, Mar 28, 2019 at 04:28:47PM -0700, John Hubbard wrote:
> > >>> On 3/28/19 4:21 PM, Jerome Glisse wrote:
> >  On Thu, Mar 28, 2019 at 03:40:42PM -0700, John Hubbard wrote:
> > > On 3/28/19 3:31 PM, Jerome Glisse wrote:
> > >> On Thu, Mar 28, 2019 at 03:19:06PM -0700, John Hubbard wrote:
> > >>> On 3/28/19 3:12 PM, Jerome Glisse wrote:
> >  On Thu, Mar 28, 2019 at 02:59:50PM -0700, John Hubbard wrote:
> > > On 3/25/19 7:40 AM, jgli...@redhat.com wrote:
> > >> From: Jérôme Glisse 
> > >>> [...]
> > >> Indeed I did not realize there is an hmm "pfn" until I saw this function:
> > >>
> > >> /*
> > >>  * hmm_pfn_from_pfn() - create a valid HMM pfn value from pfn
> > >>  * @range: range use to encode HMM pfn value
> > >>  * @pfn: pfn value for which to create the HMM pfn
> > >>  * Returns: valid HMM pfn for the pfn
> > >>  */
> > >> static inline uint64_t hmm_pfn_from_pfn(const struct hmm_range *range,
> > >> unsigned long pfn)
> > >>
> > >> So should this patch contain some sort of helper like this... maybe?
> > >>
> > >> I'm assuming the "hmm_pfn" being returned above is the device pfn being
> > >> discussed here?
> > >>
> > >> I'm also thinking calling it pfn is confusing.  I'm not advocating a new 
> > >> type
> > >> but calling the "device pfn's" "hmm_pfn" or "device_pfn" seems like it 
> > >> would
> > >> have shortened the discussion here.
> > >>
> > > 
> > > That helper is also use today by nouveau so changing that name is not that
> > > easy it does require the multi-release dance. So i am not sure how much
> > > value there is in a name change.
> > > 
> > 
> > Once the dust settles, I would expect that a name change for this could go
> > via Andrew's tree, right? It seems incredible to claim that we've built 
> > something
> > that effectively does not allow any minor changes!
> > 
> > I do think it's worth some *minor* trouble to improve the name, assuming 
> > that we
> > can do it in a simple patch, rather than some huge maintainer-level effort.
> 
> Change to nouveau have to go through nouveau tree so changing name means:
>  -  release N add function with new name, maybe make the old function just
> a wrapper to the new function
>  -  release N+1 update user to use the new name
>  -  release N+2 remove the old name
> 
> So it is doable but painful, so I would rather do that later than now,
> as I am sure people will then complain again about some little thing and it
> will postpone this whole patchset over that new bit. To avoid postponing
> RDMA and a bunch of other patchsets that build on top of this, I would rather
> get this patchset in and then make more changes in the next cycle.
> 
> This is just a capacity thing.

Also, for clarity: the API changes I am making in this patchset are there to
make the ODP conversion easier, and thus they bring real, hard value. Renaming
those functions is aesthetic. I am not saying it is useless; I am saying it
does not have the same value as those other changes, and I would rather not
miss another merge window just for aesthetic changes.

Cheers,
Jérôme
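
The three-release rename quoted above (add the new name with the old one as a wrapper, convert the users, remove the old name) can be sketched as follows. The function names, the shift and the flag value are hypothetical and chosen only for illustration; the real helper takes a range argument and encodes flags differently.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative encoding only; not the real HMM pfn layout. */
#define ENTRY_SHIFT 8
#define ENTRY_VALID 0x1u

/* Release N: the new, clearer name carries the implementation. */
static inline uint64_t device_entry_from_pfn(unsigned long pfn)
{
	return ((uint64_t)pfn << ENTRY_SHIFT) | ENTRY_VALID;
}

/*
 * Release N: the old name stays as a thin wrapper so that drivers in
 * other trees keep building.
 * Release N+1: drivers switch to device_entry_from_pfn().
 * Release N+2: this wrapper is deleted.
 */
static inline uint64_t old_entry_from_pfn(unsigned long pfn)
{
	return device_entry_from_pfn(pfn);
}
```

Because both names return the same value during the transition, each tree can convert on its own schedule without a cross-tree merge conflict, at the cost of spreading the rename over several release cycles.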


Re: [PATCH v2 02/11] mm/hmm: use reference counting for HMM struct v2

2019-03-28 Thread Jerome Glisse
On Thu, Mar 28, 2019 at 06:18:35PM -0700, John Hubbard wrote:
> On 3/28/19 6:00 PM, Jerome Glisse wrote:
> > On Thu, Mar 28, 2019 at 09:57:09AM -0700, Ira Weiny wrote:
> >> On Thu, Mar 28, 2019 at 05:39:26PM -0700, John Hubbard wrote:
> >>> On 3/28/19 2:21 PM, Jerome Glisse wrote:
>  On Thu, Mar 28, 2019 at 01:43:13PM -0700, John Hubbard wrote:
> > On 3/28/19 12:11 PM, Jerome Glisse wrote:
> >> On Thu, Mar 28, 2019 at 04:07:20AM -0700, Ira Weiny wrote:
> >>> On Mon, Mar 25, 2019 at 10:40:02AM -0400, Jerome Glisse wrote:
>  From: Jérôme Glisse 
> >>> [...]
>  @@ -67,14 +78,9 @@ struct hmm {
>    */
>   static struct hmm *hmm_register(struct mm_struct *mm)
>   {
>  -struct hmm *hmm = READ_ONCE(mm->hmm);
>  +struct hmm *hmm = mm_get_hmm(mm);
> >>>
> >>> FWIW: having hmm_register == "hmm get" is a bit confusing...
> >>
> >> The thing is that you want only one hmm struct per process and thus
> >> if there is already one and it is not being destroyed then you want to
> >> reuse it.
> >>
> >> Also this is all internal to HMM code and so it should not confuse
> >> anyone.
> >>
> >
> > Well, it has repeatedly come up, and I'd claim that it is quite 
> > counter-intuitive. So if there is an easy way to make this internal 
> > HMM code clearer or better named, I would really love that to happen.
> >
> > And we shouldn't ever dismiss feedback based on "this is just internal
> > xxx subsystem code, no need for it to be as clear as other parts of the
> > kernel", right?
> 
>  Yes, but I have not seen any better alternative than the present code. If
>  there is one, please submit a patch.
> 
> >>>
> >>> Ira, do you have any patch you're working on, or a more detailed
> >>> suggestion there?
> >>> If not, then I might (later, as it's not urgent) propose a small cleanup
> >>> patch I had in mind for the hmm_register code. But I don't want to
> >>> duplicate effort if you're already thinking about it.
> >>
> >> No I don't have anything.
> >>
> >> I was just really digging into these this time around and I was about to
> >> comment on the lack of "get's" for some "puts" when I realized that
> >> "hmm_register" _was_ the get...
> >>
> >> :-(
> >>
> > 
> > The get is mm_get_hmm(), where you get a reference on HMM from a mm struct.
> > John, in a previous posting, complained about me naming that function
> > hmm_get(), and thus in this version I renamed it to mm_get_hmm(), as we
> > are getting a reference on hmm from a mm struct.
> 
> Well, that's not what I recommended, though. The actual conversation went like
> this [1]:
> 
> ---
> >> So for this, hmm_get() really ought to be symmetric with
> >> hmm_put(), by taking a struct hmm*. And the null check is
> >> not helping here, so let's just go with this smaller version:
> >>
> >> static inline struct hmm *hmm_get(struct hmm *hmm)
> >> {
> >> 	if (kref_get_unless_zero(&hmm->kref))
> >> 		return hmm;
> >>
> >> 	return NULL;
> >> }
> >>
> >> ...and change the few callers accordingly.
> >>
> >
> > What about renaming hmm_get() to mm_get_hmm() instead?
> >
> 
> For a get/put pair of functions, it would be ideal to pass
> the same argument type to each. It looks like we are passing
> around hmm*, and hmm retains a reference count on hmm->mm,
> so I think you have a choice of using either mm* or hmm* as
> the argument. I'm not sure that one is better than the other
> here, as the lifetimes appear to be linked pretty tightly.
> 
> Whichever one is used, I think it would be best to use it
> in both the _get() and _put() calls. 
> ---
> 
> Your response was to change the name to mm_get_hmm(), but that's not
> what I recommended.

Because I cannot do that: hmm_put() can _only_ take a hmm struct as
input, while hmm_get() can _only_ take a mm struct as input.

hmm_put() can only take a hmm because the hmm we are un-referencing
might no longer be associated with any mm struct, and thus I do not
have a mm struct to use.

hmm_get() can only take a mm as input because we need to be careful when
accessing the hmm field within the mm struct, and thus it is better
to have that code within a function than open coded and duplicated
all over the place.
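
To make the asymmetry concrete, here is a minimal userspace sketch of that
get/put shape. It is not the code from the patch: the structs are reduced
stand-ins, a plain int replaces struct kref, there is no locking or
READ_ONCE(), and the unhooking done in hmm_put() is only an assumption about
what the release path does.

```c
#include <assert.h>
#include <stddef.h>

/* Reduced stand-ins for the kernel structures; illustration only. */
struct mm;

struct hmm {
	int refcount;		/* plain int standing in for struct kref */
	struct mm *mm;		/* may already be NULL when we drop a ref */
};

struct mm {
	struct hmm *hmm;
};

/* Same idea as kref_get_unless_zero(): never revive a dying object. */
static int ref_get_unless_zero(int *refcount)
{
	if (*refcount == 0)
		return 0;
	(*refcount)++;
	return 1;
}

/* The get starts from the mm, so the careful mm->hmm access lives here. */
static struct hmm *mm_get_hmm(struct mm *mm)
{
	struct hmm *hmm = mm->hmm;

	if (hmm && ref_get_unless_zero(&hmm->refcount))
		return hmm;
	return NULL;
}

/* The put starts from the hmm, because hmm->mm may no longer exist. */
static void hmm_put(struct hmm *hmm)
{
	assert(hmm->refcount > 0);
	if (--hmm->refcount == 0 && hmm->mm) {
		/* Assumed release step: unhook from the mm if still attached. */
		hmm->mm->hmm = NULL;
		hmm->mm = NULL;
	}
}
```

With this shape a caller that only holds a hmm pointer can always drop its
reference, even after the mm has gone away, while every lookup through the mm
goes through one function.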

> 
> > 
> > The hmm_put() is just releasing the reference on the hmm struct.
> > 
> > Here I feel I am getting contradictory requirements from different people.
> > I don't think there is a way to please everyone here.
> > 
> 
> That's not a true conflict: you're comparing your actual implementation
> to Ira's request, rather than comparing my request to Ira's request.
> 
> I think there's a way forward. Ira and I are actually both asking for the
> same thing:
> 
> a) clear, concise get/put routines
> 
> b) avoiding odd side effects in functions that have one name, but do
> ad

[PATCH] regulator: vctrl: Remove unneeded continue statement

2019-03-28 Thread Axel Lin
Signed-off-by: Axel Lin 
---
 drivers/regulator/vctrl-regulator.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/regulator/vctrl-regulator.c b/drivers/regulator/vctrl-regulator.c
index 78de002037c7..259864520a06 100644
--- a/drivers/regulator/vctrl-regulator.c
+++ b/drivers/regulator/vctrl-regulator.c
@@ -334,10 +334,8 @@ static int vctrl_init_vtable(struct platform_device *pdev)
 		ctrl_uV = regulator_list_voltage(ctrl_reg, i);
 
 		if (ctrl_uV < vrange_ctrl->min_uV ||
-		    ctrl_uV > vrange_ctrl->max_uV) {
+		    ctrl_uV > vrange_ctrl->max_uV)
 			rdesc->n_voltages--;
-			continue;
-		}
 	}
 
 	if (rdesc->n_voltages == 0) {
-- 
2.17.1
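
The patch carries no description, so as a small standalone illustration of why
the continue can go: when it is the last statement in the loop body (as the
hunk above suggests), removing it does not change behavior. The function names
and values below are hypothetical and not taken from the driver.

```c
#include <assert.h>

/* Start from n and drop one for each out-of-range entry, with a continue. */
static int in_range_with_continue(const int *uv, int n, int min_uv, int max_uv)
{
	int count = n;

	for (int i = 0; i < n; i++) {
		if (uv[i] < min_uv || uv[i] > max_uv) {
			count--;
			continue;	/* nothing follows in the body: no effect */
		}
	}
	return count;
}

/* Same loop without the continue and without the braces it required. */
static int in_range_plain(const int *uv, int n, int min_uv, int max_uv)
{
	int count = n;

	for (int i = 0; i < n; i++) {
		if (uv[i] < min_uv || uv[i] > max_uv)
			count--;
	}
	return count;
}

/* Both variants agree: 900 and 1200 are the only values in [800, 1300]. */
static void check_equivalence(void)
{
	static const int uv[] = { 500, 900, 1200, 1500 };

	assert(in_range_with_continue(uv, 4, 800, 1300) == 2);
	assert(in_range_plain(uv, 4, 800, 1300) == 2);
}
```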


