Re: [v2] powernv:idle: Fix bug due to labeling ambiguity in power_enter_stop

2017-03-07 Thread Michael Ellerman
On Mon, 2017-02-27 at 05:40:07 UTC, "Gautham R. Shenoy" wrote:
> From: "Gautham R. Shenoy" 
> 
> Commit 09206b600c76 ("powernv: Pass PSSCR value and mask to
> power9_idle_stop") added additional code in power_enter_stop() to
> distinguish between stop requests whose PSSCR had ESL=EC=1 from those
> which did not. When ESL=EC=1, we do a forward-jump to a location
> labelled by "1", which had the code to handle the ESL=EC=1 case.
> 
> Unfortunately, just a couple of instructions before this label is the
> macro IDLE_STATE_ENTER_SEQ(), which also has a label "1" in its
> expansion.
> 
> As a result, the current code can directly execute the stop
> instruction for deep stop requests with PSSCR ESL=EC=1, without saving
> the hypervisor state.
> 
> Fix this BUG by labeling the location that handles ESL=EC=1 case with
> a more descriptive label ".Lhandle_esl_ec_set" (local label suggestion
> a la .Lxx from Anton Blanchard).
> 
> While at it, rename the label "2", which marks the code handling
> entry into deep stop states, to ".Lhandle_deep_stop".
> 
> For good measure, change the label in the IDLE_STATE_ENTER_SEQ() macro
> to a not-so-commonly-used value in order to avoid similar mishaps in
> the future.
> 
> Fixes: 09206b600c76 ("powernv: Pass PSSCR value and mask to
> power9_idle_stop")
> 
> Cc: Michael Neuling 
> Cc: Vaidyanathan Srinivasan 
> Cc: Michael Ellerman 
> Signed-off-by: Gautham R. Shenoy 

Applied to powerpc fixes, thanks.

https://git.kernel.org/powerpc/c/424f8acd328a111319ae30bf384e5d

cheers
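
The ambiguity described in the quoted commit message can be modelled in a few lines of C. This is only a toy sketch, assuming the GNU assembler rule that a reference such as "1f" binds to the next definition of label "1" after the reference. The listings, the mnemonics and the demo_* names are hypothetical and are not the code in power_enter_stop().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Toy model of forward local-label resolution: a reference "<name>f"
 * binds to the NEXT line after the reference that defines "<name>:".
 * Returns the index of that line, or -1 if no definition follows.
 */
static int demo_resolve_forward(const char *const *lines, size_t nlines,
                                size_t ref_index, const char *name)
{
    size_t len = strlen(name);

    for (size_t i = ref_index + 1; i < nlines; i++) {
        if (strncmp(lines[i], name, len) == 0 &&
            lines[i][len] == ':' && lines[i][len + 1] == '\0')
            return (int)i;
    }
    return -1;
}

/* Hypothetical listing: a macro expansion defines its own "1:" first. */
static const char *const demo_ambiguous[] = {
    "bne 1f",        /* 0: meant to reach the ESL=EC=1 handler */
    "1:",            /* 1: label that came from the macro expansion */
    "stop",          /* 2 */
    "1:",            /* 3: the handler the branch was meant to reach */
    "save state",    /* 4 */
};

/* Same listing with a descriptive local label for the handler. */
static const char *const demo_fixed[] = {
    "bne .Lhandle_esl_ec_set",
    "1:",
    "stop",
    ".Lhandle_esl_ec_set:",
    "save state",
};
```

With the first listing, demo_resolve_forward(demo_ambiguous, 5, 0, "1") returns 1, the label inside the macro expansion, rather than 3. With the second, looking up ".Lhandle_esl_ec_set" returns 3 no matter which numeric labels the macro uses.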


Re: [v3, 1/2] powerpc: Emulation support for load/store instructions on LE

2017-03-07 Thread Michael Ellerman
On Tue, 2017-02-14 at 09:16:42 UTC, Ravi Bangoria wrote:
> emulate_step() uses a number of underlying kernel functions that were
> initially not enabled for LE. This has been rectified since. So, fix
> emulate_step() for LE for the corresponding instructions.
> 
> Reported-by: Anton Blanchard 
> Signed-off-by: Ravi Bangoria 

Series applied to powerpc fixes, thanks.

https://git.kernel.org/powerpc/c/e148bd17f48bd17fca2f4f089ec879

cheers


[PATCH v4 7/7] arm: imx_v6_v7_defconfig: Select hid-multitouch driver

2017-03-07 Thread Jagan Teki
From: Jagan Teki 

Select CONFIG_HID_MULTITOUCH so that multi-touch touchscreen
functionality via USB is available by default on Engicam i.CoreM6 Quad
with OpenFrame Cap 10.1 display boards.

Cc: Matteo Lisi 
Cc: Michael Trimarchi 
Cc: Shawn Guo 
Signed-off-by: Jagan Teki 
---
Changes for v4:
- Newly added patch.

 arch/arm/boot/dts/imx6q-icore.dts | 9 +
 1 file changed, 9 insertions(+)
 arch/arm/configs/imx_v6_v7_defconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm/configs/imx_v6_v7_defconfig b/arch/arm/configs/imx_v6_v7_defconfig
index 842168f..914e8cb 100644
--- a/arch/arm/configs/imx_v6_v7_defconfig
+++ b/arch/arm/configs/imx_v6_v7_defconfig
@@ -174,6 +174,7 @@ CONFIG_TOUCHSCREEN_SX8654=y
 CONFIG_TOUCHSCREEN_COLIBRI_VF50=y
 CONFIG_INPUT_MISC=y
 CONFIG_INPUT_MMA8450=y
+CONFIG_HID_MULTITOUCH=y
 CONFIG_SERIO_SERPORT=m
 # CONFIG_LEGACY_PTYS is not set
 CONFIG_SERIAL_IMX=y
-- 
1.9.1



[PATCH 1/2] x86/efi: Correct a tiny mistake in code comment

2017-03-07 Thread Baoquan He
EFI allocates runtime services regions downward from EFI_VA_START (-4G),
so the comment should describe this as top-down allocation.

Signed-off-by: Baoquan He 
---
 arch/x86/platform/efi/efi_64.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/platform/efi/efi_64.c b/arch/x86/platform/efi/efi_64.c
index a4695da..6cbf9e0 100644
--- a/arch/x86/platform/efi/efi_64.c
+++ b/arch/x86/platform/efi/efi_64.c
@@ -47,7 +47,7 @@
 #include 
 
 /*
- * We allocate runtime services regions bottom-up, starting from -4G, i.e.
+ * We allocate runtime services regions top-down, starting from -4G, i.e.
  * 0x___ and limit EFI VA mapping space to 64G.
  */
 static u64 efi_va = EFI_VA_START;
-- 
2.5.5



[PATCH] mips: add missing include files

2017-03-07 Thread Arnd Bergmann
After the split of linux/sched.h, several platforms in arch/mips stopped
building. This adds the respective additional #include statements to fix
the problem. I first tried adding these into asm/processor.h, but ran into
circular header dependencies with that which I could not figure out.

The commit I listed as causing the problem is the branch merge, as there is
likely a combination of multiple patches in that branch.

Fixes: 1827adb11ad2 ("Merge branch 'WIP.sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip")
Signed-off-by: Arnd Bergmann 
---
 arch/mips/cavium-octeon/cpu.c  | 2 ++
 arch/mips/cavium-octeon/crypto/octeon-crypto.c | 1 +
 arch/mips/cavium-octeon/smp.c  | 1 +
 arch/mips/include/asm/fpu.h| 1 +
 arch/mips/kernel/smp-bmips.c   | 1 +
 arch/mips/kernel/smp-mt.c  | 1 +
 arch/mips/loongson64/loongson-3/cop2-ex.c  | 1 +
 arch/mips/netlogic/common/smp.c| 1 +
 arch/mips/netlogic/xlp/cop2-ex.c   | 3 +++
 arch/mips/sgi-ip22/ip28-berr.c | 1 +
 arch/mips/sgi-ip27/ip27-berr.c | 2 ++
 arch/mips/sgi-ip27/ip27-smp.c  | 3 +++
 arch/mips/sgi-ip32/ip32-berr.c | 1 +
 arch/mips/sgi-ip32/ip32-reset.c| 1 +
 14 files changed, 20 insertions(+)

diff --git a/arch/mips/cavium-octeon/cpu.c b/arch/mips/cavium-octeon/cpu.c
index a5b427909b5c..036d56cc4591 100644
--- a/arch/mips/cavium-octeon/cpu.c
+++ b/arch/mips/cavium-octeon/cpu.c
@@ -10,7 +10,9 @@
 #include 
 #include 
 #include 
+#include 
 #include 
+#include 
 
 #include 
 #include 
diff --git a/arch/mips/cavium-octeon/crypto/octeon-crypto.c b/arch/mips/cavium-octeon/crypto/octeon-crypto.c
index 4d22365844af..cfb4a146cf17 100644
--- a/arch/mips/cavium-octeon/crypto/octeon-crypto.c
+++ b/arch/mips/cavium-octeon/crypto/octeon-crypto.c
@@ -9,6 +9,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "octeon-crypto.h"
 
diff --git a/arch/mips/cavium-octeon/smp.c b/arch/mips/cavium-octeon/smp.c
index 4b94b7fbafa3..3de786545ded 100644
--- a/arch/mips/cavium-octeon/smp.c
+++ b/arch/mips/cavium-octeon/smp.c
@@ -12,6 +12,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
diff --git a/arch/mips/include/asm/fpu.h b/arch/mips/include/asm/fpu.h
index 321752bcbab6..f94455f964ec 100644
--- a/arch/mips/include/asm/fpu.h
+++ b/arch/mips/include/asm/fpu.h
@@ -12,6 +12,7 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
 
diff --git a/arch/mips/kernel/smp-bmips.c b/arch/mips/kernel/smp-bmips.c
index 3daa2cae50b0..1b070a76fcdd 100644
--- a/arch/mips/kernel/smp-bmips.c
+++ b/arch/mips/kernel/smp-bmips.c
@@ -11,6 +11,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
diff --git a/arch/mips/kernel/smp-mt.c b/arch/mips/kernel/smp-mt.c
index e077ea3e11fb..e398cbc3d776 100644
--- a/arch/mips/kernel/smp-mt.c
+++ b/arch/mips/kernel/smp-mt.c
@@ -23,6 +23,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include 
diff --git a/arch/mips/loongson64/loongson-3/cop2-ex.c b/arch/mips/loongson64/loongson-3/cop2-ex.c
index ea13764d0a03..621d6af5f6eb 100644
--- a/arch/mips/loongson64/loongson-3/cop2-ex.c
+++ b/arch/mips/loongson64/loongson-3/cop2-ex.c
@@ -13,6 +13,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
diff --git a/arch/mips/netlogic/common/smp.c b/arch/mips/netlogic/common/smp.c
index 10d86d54880a..bddf1ef553a4 100644
--- a/arch/mips/netlogic/common/smp.c
+++ b/arch/mips/netlogic/common/smp.c
@@ -35,6 +35,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
diff --git a/arch/mips/netlogic/xlp/cop2-ex.c b/arch/mips/netlogic/xlp/cop2-ex.c
index 52bc5de42005..21e439b3db70 100644
--- a/arch/mips/netlogic/xlp/cop2-ex.c
+++ b/arch/mips/netlogic/xlp/cop2-ex.c
@@ -9,11 +9,14 @@
  * Copyright (C) 2009 Wind River Systems,
  *   written by Ralf Baechle 
  */
+#include 
 #include 
 #include 
 #include 
 #include 
+#include 
 #include 
+#include 
 
 #include 
 #include 
diff --git a/arch/mips/sgi-ip22/ip28-berr.c b/arch/mips/sgi-ip22/ip28-berr.c
index 1f2a5bc4779e..75460e1e106b 100644
--- a/arch/mips/sgi-ip22/ip28-berr.c
+++ b/arch/mips/sgi-ip22/ip28-berr.c
@@ -9,6 +9,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include 
diff --git a/arch/mips/sgi-ip27/ip27-berr.c b/arch/mips/sgi-ip27/ip27-berr.c
index d12879eb2b1f..83efe03d5c60 100644
--- a/arch/mips/sgi-ip27/ip27-berr.c
+++ b/arch/mips/sgi-ip27/ip27-berr.c
@@ -12,7 +12,9 @@
 #include   /* for SIGBUS */
 #include/* schow_regs(), force_sig() */
 #include 
+#include 
 
+#include 
 #include 
 #include 
 #include 
diff --git a/arch/mips/sgi-ip27/ip27-smp.c b/arch/mips/sgi-ip27/ip27-smp.c
index f5ed45e8f442..4cd47d23d81a 100644
--- a/arch/mips/sgi-ip27/ip27-smp.c
+++ b/arch/mips/sgi-ip27/ip27-smp.c
@@ -8,10 +8,13 @@
  */
 #include 
 #include 
+#include 
 #include 
 #include 

[PATCH 2/2] x86/mm/KASLR: Correct the upper boundary of KASLR mm regions if adjacent to EFI

2017-03-07 Thread Baoquan He
EFI allocates runtime services regions top-down, from EFI_VA_START down
to EFI_VA_END. So EFI_VA_START is numerically bigger than EFI_VA_END and
is the top of the EFI region, while EFI_VA_END is its lowest address. The
upper boundary of the memory regions randomized by KASLR should therefore
be EFI_VA_END, not EFI_VA_START, when they are adjacent to the EFI region.

Correct it in this patch.

Signed-off-by: Baoquan He 
---
 arch/x86/mm/kaslr.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/mm/kaslr.c b/arch/x86/mm/kaslr.c
index 887e571..aed2064 100644
--- a/arch/x86/mm/kaslr.c
+++ b/arch/x86/mm/kaslr.c
@@ -48,7 +48,7 @@ static const unsigned long vaddr_start = __PAGE_OFFSET_BASE;
 #if defined(CONFIG_X86_ESPFIX64)
 static const unsigned long vaddr_end = ESPFIX_BASE_ADDR;
 #elif defined(CONFIG_EFI)
-static const unsigned long vaddr_end = EFI_VA_START;
+static const unsigned long vaddr_end = EFI_VA_END;
 #else
 static const unsigned long vaddr_end = __START_KERNEL_map;
 #endif
@@ -105,7 +105,7 @@ void __init kernel_randomize_memory(void)
 */
BUILD_BUG_ON(vaddr_start >= vaddr_end);
BUILD_BUG_ON(IS_ENABLED(CONFIG_X86_ESPFIX64) &&
-vaddr_end >= EFI_VA_START);
+vaddr_end >= EFI_VA_END);
BUILD_BUG_ON((IS_ENABLED(CONFIG_X86_ESPFIX64) ||
  IS_ENABLED(CONFIG_EFI)) &&
 vaddr_end >= __START_KERNEL_map);
-- 
2.5.5
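
The address arithmetic behind this fix can be checked with a small userspace sketch. It assumes only what the two commit messages state: the EFI region starts at -4G and is limited to 64G, so it ends at -68G. The DEMO_* macros and demo_* helpers are hypothetical names for illustration and are not the kernel's definitions.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_GIB (UINT64_C(1) << 30)

/*
 * Illustrative values: negative addresses written as 64-bit two's
 * complement. -4G is the start; a 64G limit puts the end at -68G.
 */
#define DEMO_EFI_VA_START ((uint64_t)0 - 4 * DEMO_GIB)   /* 0xffffffff00000000 */
#define DEMO_EFI_VA_END   ((uint64_t)0 - 68 * DEMO_GIB)  /* 0xffffffef00000000 */

/* EFI allocates top-down, so the numerically lower address is the END. */
static uint64_t demo_efi_region_low(void)
{
    return DEMO_EFI_VA_END;
}

/* Returns 1 if [start, end) stays below the EFI region, 0 otherwise. */
static int demo_range_below_efi(uint64_t start, uint64_t end)
{
    assert(start < end);
    return end <= demo_efi_region_low();
}
```

A randomized range that ends at DEMO_EFI_VA_END passes the check, while one that ends at DEMO_EFI_VA_START overlaps the whole 64G EFI window and fails it, which is the situation the patch corrects.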



Re: [PATCH 24/29] drivers: convert iblock_req.pending from atomic_t to refcount_t

2017-03-07 Thread Nicholas A. Bellinger
Hi Elena,

On Mon, 2017-03-06 at 16:21 +0200, Elena Reshetova wrote:
> The refcount_t type and its corresponding API should be
> used instead of atomic_t when the variable is used as
> a reference counter. This avoids accidental
> refcounter overflows that might lead to use-after-free
> situations.
> 
> Signed-off-by: Elena Reshetova 
> Signed-off-by: Hans Liljestrand 
> Signed-off-by: Kees Cook 
> Signed-off-by: David Windsor 
> ---
>  drivers/target/target_core_iblock.c | 12 ++--
>  drivers/target/target_core_iblock.h |  3 ++-
>  2 files changed, 8 insertions(+), 7 deletions(-)

For the target_core_iblock part:

Acked-by: Nicholas Bellinger 
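
The overflow concern in the quoted commit message can be illustrated with a userspace stand-in built on C11 atomics. This is a minimal sketch, assuming a counter that saturates at UINT_MAX. The demo_* names are hypothetical and this is not the kernel's refcount_t implementation.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>

/* Hypothetical userspace stand-in; not the kernel's refcount_t. */
struct demo_refcount {
    atomic_uint refs;
};

static void demo_refcount_init(struct demo_refcount *r, unsigned int n)
{
    atomic_init(&r->refs, n);
}

static unsigned int demo_refcount_read(struct demo_refcount *r)
{
    return atomic_load(&r->refs);
}

/*
 * Increment, but stick at UINT_MAX instead of wrapping to 0. A counter
 * that wraps can reach 0 while references are still held, which is how
 * an overflow can turn into a use-after-free.
 */
static void demo_refcount_inc(struct demo_refcount *r)
{
    unsigned int old = atomic_load(&r->refs);

    do {
        if (old == UINT_MAX)
            return;     /* saturated: leave the counter pinned */
    } while (!atomic_compare_exchange_weak(&r->refs, &old, old + 1));
}

/* Plain wrapping increment, as an unchecked counter would behave. */
static unsigned int demo_wrapping_inc(unsigned int v)
{
    return v + 1;       /* UINT_MAX + 1 wraps to 0 */
}
```

Starting from UINT_MAX - 1, two calls to demo_refcount_inc() leave the counter at UINT_MAX, whereas demo_wrapping_inc(UINT_MAX) yields 0 and would make a live object look unreferenced.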



Re: [PATCH 0/2] ARM: dts: sunxi: Remove no longer used pinctrl/sun4i-a10.h header

2017-03-07 Thread Maxime Ripard
On Wed, Mar 08, 2017 at 11:28:19AM +0800, Chen-Yu Tsai wrote:
> Hi Maxime,
> 
> This series gets rid of the last usage of the Allwinner specific pinconf
> bindings, and drops inclusion of dt-bindings/pinctrl/sun4i-a10.h across
> the tree.
> 
> Patch 1 gets rid of the last occurrence of Allwinner specific pinconf
> properties, which is actually a GPIO pinmux.
> 
> Patch 2 drops the #include for dt-bindings/pinctrl/sun4i-a10.h with a
> scripted approach.
> 
> A patch to remove the actual header file should be sent after this
> series is in Linus' tree, to avoid any unwanted cross-tree dependencies.
> If we want that to happen faster, maybe we could merge these as fixes.
> 
> Regards
> ChenYu

Applied both, thanks!
Maxime

-- 
Maxime Ripard, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com




Re: [PATCH] mm, vmalloc: use __GFP_HIGHMEM implicitly

2017-03-07 Thread Vlastimil Babka
On 03/07/2017 07:57 PM, Matthew Wilcox wrote:
> On Tue, Mar 07, 2017 at 10:28:41AM -0800, Matthew Wilcox wrote:
>> On Tue, Mar 07, 2017 at 03:10:20PM +0100, Michal Hocko wrote:
>>> This patch simply uses __GFP_HIGHMEM implicitly when allocating pages to
>>> be mapped to the vmalloc space. Current users which add __GFP_HIGHMEM
>>> are simplified and drop the flag.
> 
> btw, I had another idea for GFP_HIGHMEM -- remove it when CONFIG_HIGHMEM
> isn't enabled.  Saves 26 bytes of .text and 64 bytes of .data on my
> laptop's kernel build.  What do you think?
> 
> Also, I suspect the layout of bits is suboptimal from an assembly
> language perspective.  I still mostly care about x86 which doesn't
> benefit, so I'm not inclined to do the work, but certainly ARM, PA-RISC,
> SPARC and Itanium would all benefit from having frequently-used bits
> (ie those used in GFP_KERNEL and GFP_ATOMIC) placed in the low 8 bits.
> 
> diff --git a/include/linux/gfp.h b/include/linux/gfp.h
> index 0fe0b6295ab5..d88cb532d7c8 100644
> --- a/include/linux/gfp.h
> +++ b/include/linux/gfp.h
> @@ -16,7 +16,11 @@ struct vm_area_struct;
>  
>  /* Plain integer GFP bitmasks. Do not use this directly. */
>  #define ___GFP_DMA   0x01u
> +#ifdef CONFIG_HIGHMEM
>  #define ___GFP_HIGHMEM   0x02u
> +#else
> +#define ___GFP_HIGHMEM   0x0u

Make sure you don't break the users of __def_gfpflag_names e.g.
format_flags(). IIRC zero is a terminator in the table.

But the savings don't seem to be worth the trouble.

> +#endif
>  #define ___GFP_DMA32 0x04u
>  #define ___GFP_MOVABLE   0x08u
>  #define ___GFP_RECLAIMABLE   0x10u
> 
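
The hazard raised in this reply can be sketched in userspace. The code below assumes exactly what the reply recalls ("IIRC"), namely that a zero mask terminates the flag-name table. It is a hypothetical table walker with made-up demo_* names, not the kernel's format_flags() or __def_gfpflag_names. The mask values are the ones from the quoted diff.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flag-name table entry; a zero mask ends the table. */
struct demo_flag_name {
    unsigned int mask;
    const char *name;
};

/*
 * Write the names of all flags set in 'flags' into 'buf', separated by
 * '|'. The walk stops at the first entry whose mask is zero, which is
 * the assumption being illustrated. 'size' must be at least 1.
 * Returns the number of names written.
 */
static int demo_format_flags(char *buf, size_t size, unsigned int flags,
                             const struct demo_flag_name *names)
{
    int count = 0;

    buf[0] = '\0';
    for (; names->mask != 0; names++) {
        if ((flags & names->mask) != names->mask)
            continue;
        if (count > 0)
            strncat(buf, "|", size - strlen(buf) - 1);
        strncat(buf, names->name, size - strlen(buf) - 1);
        count++;
    }
    return count;
}

/* Table with the proposed zero-valued HIGHMEM entry in the middle. */
static const struct demo_flag_name demo_names_highmem_zero[] = {
    { 0x01u, "DMA" },
    { 0x00u, "HIGHMEM" },   /* ends the walk early */
    { 0x04u, "DMA32" },
    { 0x08u, "MOVABLE" },
    { 0, NULL },
};

/* Same table with the original non-zero HIGHMEM bit. */
static const struct demo_flag_name demo_names_highmem_bit[] = {
    { 0x01u, "DMA" },
    { 0x02u, "HIGHMEM" },
    { 0x04u, "DMA32" },
    { 0x08u, "MOVABLE" },
    { 0, NULL },
};
```

For flags 0x05, the first table yields only "DMA" because the walk ends at the zero-valued entry, while the second yields "DMA|DMA32". If a table is terminated some other way, a zero mask has a different problem: (flags & 0) == 0 holds for every value, so the entry would match unconditionally.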



[PATCH -v6 0/9] THP swap: Delay splitting THP during swapping out

2017-03-07 Thread Huang, Ying
From: Huang Ying 

Hi, Andrew, could you help me to check whether the overall design is
reasonable?

Hi, Hugh, Shaohua, Minchan and Rik, could you help me to review the
swap part of the patchset?  Especially [1/9], [3/9], [4/9], [5/9],
[6/9], [9/9].

Hi, Andrea could you help me to review the THP part of the patchset?
Especially [2/9], [7/9] and [8/9].

Hi, Johannes, Michal and Vladimir, I am not very confident about the
memory cgroup part, especially [2/9].  Could you help me to review it?

And for everyone: any comment is welcome!


Recently, the performance of storage devices has improved so fast that
we cannot saturate the disk bandwidth with a single logical CPU when
doing page swap-out, even on a high-end server machine, because the
performance of storage devices has improved faster than that of a single
logical CPU.  It seems that this trend will not change in the near
future.  On the other hand, THP is becoming more and more popular
because of increased memory sizes.  So it becomes necessary to optimize
THP swap performance.

The advantages of the THP swap support include:

- Batch the swap operations for the THP to reduce lock
  acquiring/releasing, including allocating/freeing the swap space,
  adding/deleting to/from the swap cache, and writing/reading the swap
  space, etc.  This will help improve the performance of the THP swap.

- The THP swap space read/write will be 2M sequential IO.  This is
  particularly helpful for swap reads, which are usually 4k random
  IO.  This will improve the performance of the THP swap too.

- It will help with memory fragmentation, especially when THP is
  heavily used by the applications.  The 2M of contiguous pages will be
  freed up after the THP is swapped out.

- It will improve THP utilization on systems with swap turned on,
  because khugepaged is quite slow at collapsing normal pages into a
  THP.  If the THP is split during swap-out, it will take quite a long
  time for the normal pages to collapse back into a THP after being
  swapped in.  High THP utilization also helps the efficiency of
  page-based memory management.
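
The batching argument in the first bullet above can be made concrete by counting lock round trips. This is a minimal sketch, assuming a 2M huge page made of 512 4k base pages and a single lock protecting the swap cache. The demo_* names are hypothetical, and the lock is only a counter, not a real lock.

```c
#include <assert.h>

#define DEMO_HPAGE_NR 512   /* 2M huge page / 4k base page */

/* Hypothetical swap cache that only counts lock acquisitions. */
struct demo_swap_cache {
    unsigned long lock_acquisitions;
    unsigned long pages_added;
};

static void demo_lock(struct demo_swap_cache *c)
{
    c->lock_acquisitions++;
}

static void demo_unlock(struct demo_swap_cache *c)
{
    (void)c;    /* nothing to count on release */
}

/* Split first: every base page takes the lock on its own. */
static void demo_add_split(struct demo_swap_cache *c)
{
    for (int i = 0; i < DEMO_HPAGE_NR; i++) {
        demo_lock(c);
        c->pages_added++;
        demo_unlock(c);
    }
}

/* Delay the split: one lock round trip covers the whole huge page. */
static void demo_add_batched(struct demo_swap_cache *c)
{
    demo_lock(c);
    c->pages_added += DEMO_HPAGE_NR;
    demo_unlock(c);
}
```

Both paths add 512 pages, but the split path takes the lock 512 times and the batched path takes it once.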

There are some concerns regarding THP swap-in, mainly because the
possibly enlarged read/write IO size (for swap in/out) may put more
overhead on the storage device.  To deal with that, THP swap-in should
be turned on only when necessary.  For example, it can be selected via
"always/never/madvise" logic, to be turned on globally, turned off
globally, or turned on only for VMAs with MADV_HUGEPAGE, etc.
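
The "always/never/madvise" selection described above could look roughly like the following. This is only a sketch of the decision logic. The enum, the function and the boolean standing in for "this VMA has MADV_HUGEPAGE" are hypothetical and do not correspond to an existing kernel interface.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical knob mirroring the "always/never/madvise" idea. */
enum demo_thp_swapin_mode {
    DEMO_SWAPIN_ALWAYS,     /* turned on globally */
    DEMO_SWAPIN_NEVER,      /* turned off globally */
    DEMO_SWAPIN_MADVISE,    /* only for VMAs marked with MADV_HUGEPAGE */
};

/* Decide whether a fault in this VMA should swap in a whole THP. */
static bool demo_thp_swapin_enabled(enum demo_thp_swapin_mode mode,
                                    bool vma_has_madv_hugepage)
{
    switch (mode) {
    case DEMO_SWAPIN_ALWAYS:
        return true;
    case DEMO_SWAPIN_NEVER:
        return false;
    case DEMO_SWAPIN_MADVISE:
        return vma_has_madv_hugepage;
    }
    return false;   /* unknown mode: fall back to the conservative choice */
}
```

Keeping the decision in one small predicate means the larger IO size is only paid where the mode, or the application via madvise, has asked for it.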

This patchset is based on 03/06 head of mmotm/master.

This patchset is the first step for the THP swap support.  The plan is
to delay splitting the THP step by step, and finally to avoid splitting
the THP during swap-out and to swap the THP out/in as a whole.

As the first step, in this patchset, splitting the huge page is
delayed from almost the first step of swapping out to after allocating
the swap space for the THP and adding the THP into the swap cache.
This will reduce lock acquiring/releasing for the locks used for the
swap cache management.

With the patchset, the swap out throughput improves 14.9% (from about
3.77GB/s to about 4.34GB/s) in the vm-scalability swap-w-seq test case
with 8 processes.  The test is done on a Xeon E5 v3 system.  The swap
device used is a RAM simulated PMEM (persistent memory) device.  To
test the sequential swapping out, the test case creates 8 processes,
which sequentially allocate and write to the anonymous pages until the
RAM and part of the swap device is used up.

The detailed comparison result is as follows:

      base             base+patchset
---------------- --------------------------
         %stddev     %change         %stddev
             \          |                \
   7043990 ±  0%     +21.2%    8536807 ±  0%  vm-scalability.throughput
    109.94 ±  1%     -16.2%      92.09 ±  0%  vm-scalability.time.elapsed_time
   3957091 ±  0%     +14.9%    4547173 ±  0%  vmstat.swap.so
     31.46 ±  1%     -38.3%      19.42 ±  0%  perf-stat.cache-miss-rate%
      1.04 ±  1%     +22.2%       1.27 ±  0%  perf-stat.ipc
      9.33 ±  2%     -60.7%       3.67 ±  1%  perf-profile.calltrace.cycles-pp.add_to_swap.shrink_page_list.shrink_inactive_list.shrink_node_memcg.shrink_node

Changelog:

v6:

- Rebased on latest -mm tree (cluster lock, etc).
- Fix a potential uninitialized variable bug in __swap_entry_free()
- Revise the swap read-ahead changes to avoid a potential race
  condition between swap off and swap out in theory.

v5:

- Per Hillf's comments, fix a locking bug in error path of
  __add_to_swap_cache().  And merge the code to calculate extra_pins
  into can_split_huge_page().

v4:

- Per Johannes' comments, simplified swap cgroup array accessing code.
- Per Kirill and Dave Hansen's comments, used HPAGE_PMD_NR instead of
  HPAGE_SIZE/PAGE_SIZE.
- Per Anshuman's comments, used HPAGE_PMD_NR instead of 512 in patch
  description.

v3:

- Per Andrew's 

[PATCH -v6 0/9] THP swap: Delay splitting THP during swapping out

2017-03-07 Thread Huang, Ying
From: Huang Ying 

Hi, Andrew, could you help me to check whether the overall design is
reasonable?

Hi, Hugh, Shaohua, Minchan and Rik, could you help me to review the
swap part of the patchset?  Especially [1/9], [3/9], [4/9], [5/9],
[6/9], [9/9].

Hi, Andrea, could you help me to review the THP part of the patchset?
Especially [2/9], [7/9] and [8/9].

Hi, Johannes, Michal and Vladimir, I am not very confident about the
memory cgroup part, especially [2/9].  Could you help me to review it?

And for all, any comment is welcome!


Recently, the performance of storage devices has improved so fast that
we cannot saturate the disk bandwidth with a single logical CPU when
doing page swap out, even on a high-end server machine, because the
performance of storage devices has improved faster than that of a
single logical CPU.  It seems that this trend will not change in the
near future.  On the other hand, THP is becoming more and more popular
because of increased memory size.  So it becomes necessary to optimize
THP swap performance.

The advantages of the THP swap support include:

- Batch the swap operations for the THP to reduce lock
  acquiring/releasing, including allocating/freeing the swap space,
  adding/deleting to/from the swap cache, and writing/reading the swap
  space, etc.  This will help improve the performance of the THP swap.

- The THP swap space read/write will be 2M sequential IO.  It is
  particularly helpful for the swap read, which is usually 4k random
  IO.  This will improve the performance of the THP swap too.

- It will help reduce memory fragmentation, especially when the THP is
  heavily used by the applications.  The 2M of contiguous pages will be
  freed up after the THP is swapped out.

- It will improve the THP utilization on the system with the swap
  turned on, because khugepaged is quite slow to collapse normal pages
  into a THP.  After the THP is split during the swapping out, it
  will take quite a long time for the normal pages to collapse back
  into the THP after being swapped in.  High THP utilization helps
  the efficiency of the page based memory management too.
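
To make the batching argument in the first item concrete, here is a
minimal userspace sketch.  It assumes x86_64-style sizes (4k base pages
and 2M PMD-mapped THP), and all names in it are hypothetical helpers,
not kernel APIs.  It only counts lock acquire/release round trips for
per-sub-page handling versus one batched operation per THP.

```c
#include <assert.h>

/* Assumed sizes: 4k base pages and 2M THP, as on x86_64. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PMD_SHIFT  21

/* Number of base pages backing one THP (HPAGE_PMD_NR in the kernel). */
static unsigned long sketch_hpage_pmd_nr(void)
{
	return 1UL << (SKETCH_PMD_SHIFT - SKETCH_PAGE_SHIFT);
}

/* Lock round trips if every sub-page is handled on its own. */
static unsigned long lock_ops_per_page(unsigned long nr_thp)
{
	return nr_thp * sketch_hpage_pmd_nr();
}

/* Lock round trips if each THP is handled as one batch. */
static unsigned long lock_ops_batched(unsigned long nr_thp)
{
	return nr_thp;
}
```

Under these assumptions, one THP needs 512 lock round trips when handled
page by page and a single one when handled as a batch.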

There are some concerns regarding THP swap in, mainly because the
possibly enlarged read/write IO size (for swap in/out) may put more
overhead on the storage device.  To deal with that, the THP swap in
should be turned on only when necessary.  For example, it can be
selected via "always/never/madvise" logic, to be turned on globally,
turned off globally, or turned on only for VMAs with MADV_HUGEPAGE, etc.
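
The "always/never/madvise" selection described above could look roughly
like the following sketch.  The enum, the function name and the boolean
parameter are assumptions made for illustration only; this patchset does
not define such an interface.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical policy values mirroring the "always/never/madvise" idea. */
enum thp_swapin_policy {
	THP_SWAPIN_ALWAYS,
	THP_SWAPIN_NEVER,
	THP_SWAPIN_MADVISE,
};

/*
 * Decide whether a swap in for one VMA should be done as a whole THP.
 * vma_has_madv_hugepage stands in for "the VMA was marked with
 * MADV_HUGEPAGE"; how that is looked up is out of scope here.
 */
static bool thp_swapin_enabled(enum thp_swapin_policy policy,
			       bool vma_has_madv_hugepage)
{
	switch (policy) {
	case THP_SWAPIN_ALWAYS:
		return true;
	case THP_SWAPIN_MADVISE:
		return vma_has_madv_hugepage;
	case THP_SWAPIN_NEVER:
	default:
		return false;
	}
}
```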

This patchset is based on 03/06 head of mmotm/master.

This patchset is the first step for the THP swap support.  The plan is
to delay splitting the THP step by step, and finally to avoid splitting
the THP during swap out and to swap out/in the THP as a whole.

As the first step, in this patchset, splitting the huge page is
delayed from almost the first step of swapping out to after allocating
the swap space for the THP and adding the THP into the swap cache.
This will reduce lock acquiring/releasing for the locks used for the
swap cache management.

With the patchset, the swap out throughput improves 14.9% (from about
3.77GB/s to about 4.34GB/s) in the vm-scalability swap-w-seq test case
with 8 processes.  The test is done on a Xeon E5 v3 system.  The swap
device used is a RAM-simulated PMEM (persistent memory) device.  To
test the sequential swapping out, the test case creates 8 processes,
which sequentially allocate and write to anonymous pages until the
RAM and part of the swap device are used up.

The detailed comparison results are as follows:

            base              base+patchset
---------------- --------------------------
         %stddev     %change         %stddev
             \          |                \
   7043990 ±  0%     +21.2%    8536807 ±  0%  vm-scalability.throughput
    109.94 ±  1%     -16.2%      92.09 ±  0%  vm-scalability.time.elapsed_time
   3957091 ±  0%     +14.9%    4547173 ±  0%  vmstat.swap.so
     31.46 ±  1%     -38.3%      19.42 ±  0%  perf-stat.cache-miss-rate%
      1.04 ±  1%     +22.2%       1.27 ±  0%  perf-stat.ipc
      9.33 ±  2%     -60.7%       3.67 ±  1%  perf-profile.calltrace.cycles-pp.add_to_swap.shrink_page_list.shrink_inactive_list.shrink_node_memcg.shrink_node
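
As a cross-check of the figures quoted above, the percentage and GB/s
values can be recomputed from the vmstat.swap.so row.  This is only a
sketch: the helper names are made up, and it assumes that
vmstat.swap.so is reported in KB/s and that 1GB means 2^20 KB.

```c
#include <assert.h>

/* Relative change from base to patched, in percent. */
static double pct_change(double base, double patched)
{
	return (patched - base) / base * 100.0;
}

/* Convert a vmstat.swap.so value, assumed to be in KB/s, to GB/s. */
static double kbps_to_gbps(double kbps)
{
	return kbps / (1024.0 * 1024.0);
}
```

With the table values, pct_change(3957091, 4547173) comes out at about
14.9, and the two converted rates are about 3.77 and 4.34, matching the
numbers in the text.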

Changelog:

v6:

- Rebased on latest -mm tree (cluster lock, etc).
- Fix a potential uninitialized variable bug in __swap_entry_free()
- Revise the swap read-ahead changes to avoid a theoretical race
  condition between swap off and swap out.

v5:

- Per Hillf's comments, fix a locking bug in the error path of
  __add_to_swap_cache(), and merge the code to calculate extra_pins
  into can_split_huge_page().

v4:

- Per Johannes' comments, simplified swap cgroup array accessing code.
- Per Kirill and Dave Hansen's comments, used HPAGE_PMD_NR instead of
  HPAGE_SIZE/PAGE_SIZE.
- Per Anshuman's comments, used HPAGE_PMD_NR instead of 512 in patch
  description.

v3:

- Per Andrew's suggestion, used a more 

[PATCH -mm -v6 3/9] mm, THP, swap: Add swap cluster allocate/free functions

2017-03-07 Thread Huang, Ying
From: Huang Ying 

The swap cluster allocation/free functions are added based on the
existing swap cluster management mechanism for SSD.  These functions
don't work for rotating hard disks because the existing swap cluster
management mechanism doesn't work for them.  Hard disk support may
be added if someone really needs it, but that needn't be included in
this patchset.

This will be used for the THP (Transparent Huge Page) swap support,
where one swap cluster will hold the contents of each THP swapped out.
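
The following userspace sketch models the allocate/free behaviour
described above on a toy free list.  It is an illustration under
assumptions, not the kernel code: the structures, the 8-cluster size and
all sketch_* names are hypothetical, assert() stands in for VM_BUG_ON(),
and the discard path is reduced to a counter.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CLUSTERS   8
#define CLUSTER_NULL  (~0u)

struct cluster {
	unsigned int next;	/* next free cluster index, or CLUSTER_NULL */
	unsigned int count;	/* swap slots in use in this cluster */
	bool free;
};

struct swap_dev {
	struct cluster ci[NR_CLUSTERS];
	unsigned int free_head, free_tail;
	bool discardable;
	unsigned int nr_discard_pending;
};

/* Start with every cluster on the free list, in index order. */
static void dev_init(struct swap_dev *si, bool discardable)
{
	unsigned int i;

	for (i = 0; i < NR_CLUSTERS; i++) {
		si->ci[i].next = (i + 1 < NR_CLUSTERS) ? i + 1 : CLUSTER_NULL;
		si->ci[i].count = 0;
		si->ci[i].free = true;
	}
	si->free_head = 0;
	si->free_tail = NR_CLUSTERS - 1;
	si->discardable = discardable;
	si->nr_discard_pending = 0;
}

/* Mark the cluster free and append it to the tail of the free list. */
static void sketch_free_cluster_now(struct swap_dev *si, unsigned int idx)
{
	si->ci[idx].free = true;
	si->ci[idx].next = CLUSTER_NULL;
	if (si->free_head == CLUSTER_NULL)
		si->free_head = idx;
	else
		si->ci[si->free_tail].next = idx;
	si->free_tail = idx;
}

/* Take the first free cluster off the list; idx must be the list head. */
static void sketch_alloc_cluster(struct swap_dev *si, unsigned int idx)
{
	assert(si->free_head == idx);	/* stands in for VM_BUG_ON() */
	si->free_head = si->ci[idx].next;
	if (si->free_head == CLUSTER_NULL)
		si->free_tail = CLUSTER_NULL;
	si->ci[idx].free = false;
	si->ci[idx].count = 0;
}

/* Free an empty cluster, deferring to a (here only counted) discard. */
static void sketch_free_cluster(struct swap_dev *si, unsigned int idx)
{
	assert(si->ci[idx].count == 0);
	if (si->discardable) {
		si->nr_discard_pending++;	/* real code schedules discard work */
		return;
	}
	sketch_free_cluster_now(si, idx);
}
```

The point of the split into two free functions is the same as in the
patch: the discard path and the direct path share one helper that does
the actual list insertion.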

Cc: Andrea Arcangeli 
Cc: Kirill A. Shutemov 
Cc: Hugh Dickins 
Cc: Shaohua Li 
Cc: Minchan Kim 
Cc: Rik van Riel 
Signed-off-by: "Huang, Ying" 
---
 mm/swapfile.c | 217 +-
 1 file changed, 156 insertions(+), 61 deletions(-)

diff --git a/mm/swapfile.c b/mm/swapfile.c
index a744604384ff..91876c33114b 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -378,6 +378,14 @@ static void swap_cluster_schedule_discard(struct swap_info_struct *si,
schedule_work(&si->discard_work);
 }
 
+static void __free_cluster(struct swap_info_struct *si, unsigned long idx)
+{
+   struct swap_cluster_info *ci = si->cluster_info;
+
+   cluster_set_flag(ci + idx, CLUSTER_FLAG_FREE);
+   cluster_list_add_tail(&si->free_clusters, ci, idx);
+}
+
 /*
  * Doing discard actually. After a cluster discard is finished, the cluster
  * will be added to free cluster list. caller should hold si->lock.
@@ -398,10 +406,7 @@ static void swap_do_scheduled_discard(struct swap_info_struct *si)
 
spin_lock(&si->lock);
ci = lock_cluster(si, idx * SWAPFILE_CLUSTER);
-   cluster_set_flag(ci, CLUSTER_FLAG_FREE);
-   unlock_cluster(ci);
-   cluster_list_add_tail(&si->free_clusters, info, idx);
-   ci = lock_cluster(si, idx * SWAPFILE_CLUSTER);
+   __free_cluster(si, idx);
memset(si->swap_map + idx * SWAPFILE_CLUSTER,
0, SWAPFILE_CLUSTER);
unlock_cluster(ci);
@@ -419,6 +424,34 @@ static void swap_discard_work(struct work_struct *work)
spin_unlock(&si->lock);
 }
 
+static void alloc_cluster(struct swap_info_struct *si, unsigned long idx)
+{
+   struct swap_cluster_info *ci = si->cluster_info;
+
+   VM_BUG_ON(cluster_list_first(&si->free_clusters) != idx);
+   cluster_list_del_first(&si->free_clusters, ci);
+   cluster_set_count_flag(ci + idx, 0, 0);
+}
+
+static void free_cluster(struct swap_info_struct *si, unsigned long idx)
+{
+   struct swap_cluster_info *ci = si->cluster_info + idx;
+
+   VM_BUG_ON(cluster_count(ci) != 0);
+   /*
+* If the swap is discardable, prepare discard the cluster
+* instead of free it immediately. The cluster will be freed
+* after discard.
+*/
+   if ((si->flags & (SWP_WRITEOK | SWP_PAGE_DISCARD)) ==
+   (SWP_WRITEOK | SWP_PAGE_DISCARD)) {
+   swap_cluster_schedule_discard(si, idx);
+   return;
+   }
+
+   __free_cluster(si, idx);
+}
+
 /*
  * The cluster corresponding to page_nr will be used. The cluster will be
  * removed from free cluster list and its usage counter will be increased.
@@ -430,11 +463,8 @@ static void inc_cluster_info_page(struct swap_info_struct *p,
 
if (!cluster_info)
return;
-   if (cluster_is_free(&cluster_info[idx])) {
-   VM_BUG_ON(cluster_list_first(&p->free_clusters) != idx);
-   cluster_list_del_first(&p->free_clusters, cluster_info);
-   cluster_set_count_flag(&cluster_info[idx], 0, 0);
-   }
+   if (cluster_is_free(&cluster_info[idx]))
+   alloc_cluster(p, idx);
 
VM_BUG_ON(cluster_count(&cluster_info[idx]) >= SWAPFILE_CLUSTER);
cluster_set_count(&cluster_info[idx],
@@ -458,21 +488,8 @@ static void dec_cluster_info_page(struct swap_info_struct *p,
cluster_set_count(&cluster_info[idx],
cluster_count(&cluster_info[idx]) - 1);
 
-   if (cluster_count(&cluster_info[idx]) == 0) {
-   /*
-* If the swap is discardable, prepare discard the cluster
-* instead of free it immediately. The cluster will be freed
-* after discard.
-*/
-   if ((p->flags & (SWP_WRITEOK | SWP_PAGE_DISCARD)) ==
-(SWP_WRITEOK | SWP_PAGE_DISCARD)) {
-   swap_cluster_schedule_discard(p, idx);
-   return;
-   }
-
-   cluster_set_flag(&cluster_info[idx], CLUSTER_FLAG_FREE);
-   cluster_list_add_tail(&p->free_clusters, cluster_info, idx);
-   }
+   if (cluster_count(&cluster_info[idx]) == 0)
+   free_cluster(p, idx);
 }
 
 /*
@@ -562,6 +579,71 @@ static bool scan_swap_map_try_ssd_cluster(struct swap_info_struct *si,
return found_free;
 }
 
+#ifdef CONFIG_THP_SWAP_CLUSTER
+static inline unsigned int huge_cluster_nr_entries(bool huge)
+{
+   return 

[PATCH -mm -v6 4/9] mm, THP, swap: Add get_huge_swap_page()

2017-03-07 Thread Huang, Ying
From: Huang Ying 

A variation of get_swap_page(), get_huge_swap_page(), is added to
allocate a swap cluster (HPAGE_PMD_NR swap slots) based on the swap
cluster allocation function.  A fairly simple algorithm is used: only
the first swap device in the priority list is tried when allocating
the swap cluster.  The function fails if that attempt is not
successful, and the caller falls back to allocating a single swap slot
instead.  This works well enough for normal cases.

This will be used for the THP (Transparent Huge Page) swap support,
where get_huge_swap_page() will be used to allocate one swap cluster for
each THP swapped out.

Because of the algorithm adopted, if the number of free swap clusters
differs significantly among multiple swap devices, it is possible that
some THPs are split earlier than necessary.  For example, this could be
caused by a big size difference among multiple swap devices.
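
The "first device only, then fall back" behaviour and its side effect
can be sketched in userspace as follows.  This is a simplified model
under assumptions, not the patch itself: the structure, the sketch_*
names and the 512-slot cluster size (HPAGE_PMD_NR on x86_64) are
illustrative, and the fallback takes just one slot where a real split
THP would need one per sub-page.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed number of swap slots per THP-sized cluster. */
#define SKETCH_CLUSTER_SLOTS 512

struct sketch_swap_dev {
	long free_clusters;	/* completely free clusters */
	long free_slots;	/* free slots outside of free clusters */
};

/*
 * Try to take one whole cluster from the first device in priority order
 * only.  Returns SKETCH_CLUSTER_SLOTS on success and 0 on failure.
 * Lower priority devices are deliberately not tried.
 */
static int sketch_get_huge_swap_page(struct sketch_swap_dev *devs, size_t n)
{
	if (n == 0 || devs[0].free_clusters == 0)
		return 0;
	devs[0].free_clusters--;
	return SKETCH_CLUSTER_SLOTS;
}

/* Fallback path: take a single slot from the first device that has one. */
static int sketch_get_swap_page(struct sketch_swap_dev *devs, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (devs[i].free_slots > 0) {
			devs[i].free_slots--;
			return 1;
		}
	}
	return 0;
}

/* Caller side: returns 1 if the THP had to be split before allocation. */
static int sketch_swap_out_thp(struct sketch_swap_dev *devs, size_t n)
{
	if (sketch_get_huge_swap_page(devs, n))
		return 0;			/* whole THP got one cluster */
	(void)sketch_get_swap_page(devs, n);	/* one slot after the split */
	return 1;
}
```

In this model a THP is split as soon as the first device runs out of
free clusters, even if a lower priority device still has some, which is
the early split mentioned above.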

Cc: Andrea Arcangeli 
Cc: Kirill A. Shutemov 
Cc: Hugh Dickins 
Cc: Shaohua Li 
Cc: Minchan Kim 
Cc: Rik van Riel 
Signed-off-by: "Huang, Ying" 
---
 include/linux/swap.h | 19 ++-
 mm/swap_slots.c  |  5 +++--
 mm/swapfile.c| 16 
 3 files changed, 33 insertions(+), 7 deletions(-)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index 278e1349a424..e3a7609a8989 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -388,7 +388,7 @@ static inline long get_nr_swap_pages(void)
 extern void si_swapinfo(struct sysinfo *);
 extern swp_entry_t get_swap_page(void);
 extern swp_entry_t get_swap_page_of_type(int);
-extern int get_swap_pages(int n, swp_entry_t swp_entries[]);
+extern int get_swap_pages(int n, swp_entry_t swp_entries[], bool huge);
 extern int add_swap_count_continuation(swp_entry_t, gfp_t);
 extern void swap_shmem_alloc(swp_entry_t);
 extern int swap_duplicate(swp_entry_t);
@@ -527,6 +527,23 @@ static inline swp_entry_t get_swap_page(void)
 
 #endif /* CONFIG_SWAP */
 
+#ifdef CONFIG_THP_SWAP_CLUSTER
+static inline swp_entry_t get_huge_swap_page(void)
+{
+   swp_entry_t entry;
+
+   if (get_swap_pages(1, &entry, true))
+   return entry;
+   else
+   return (swp_entry_t) {0};
+}
+#else
+static inline swp_entry_t get_huge_swap_page(void)
+{
+   return (swp_entry_t) {0};
+}
+#endif
+
 #ifdef CONFIG_MEMCG
 static inline int mem_cgroup_swappiness(struct mem_cgroup *memcg)
 {
diff --git a/mm/swap_slots.c b/mm/swap_slots.c
index 9b5bc86f96ad..075bb39e03c5 100644
--- a/mm/swap_slots.c
+++ b/mm/swap_slots.c
@@ -258,7 +258,8 @@ static int refill_swap_slots_cache(struct swap_slots_cache *cache)
 
cache->cur = 0;
if (swap_slot_cache_active)
-   cache->nr = get_swap_pages(SWAP_SLOTS_CACHE_SIZE, cache->slots);
+   cache->nr = get_swap_pages(SWAP_SLOTS_CACHE_SIZE, cache->slots,
+  false);
 
return cache->nr;
 }
@@ -334,7 +335,7 @@ swp_entry_t get_swap_page(void)
return entry;
}
 
-   get_swap_pages(1, &entry);
+   get_swap_pages(1, &entry, false);
 
return entry;
 }
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 91876c33114b..7241c937e52b 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -904,11 +904,12 @@ static unsigned long scan_swap_map(struct swap_info_struct *si,
 
 }
 
-int get_swap_pages(int n_goal, swp_entry_t swp_entries[])
+int get_swap_pages(int n_goal, swp_entry_t swp_entries[], bool huge)
 {
struct swap_info_struct *si, *next;
long avail_pgs;
int n_ret = 0;
+   int nr_pages = huge_cluster_nr_entries(huge);
 
avail_pgs = atomic_long_read(&nr_swap_pages);
if (avail_pgs <= 0)
@@ -920,6 +921,10 @@ int get_swap_pages(int n_goal, swp_entry_t swp_entries[])
if (n_goal > avail_pgs)
n_goal = avail_pgs;
 
+   n_goal *= nr_pages;
+   if (avail_pgs < n_goal)
+   goto noswap;
+
atomic_long_sub(n_goal, &nr_swap_pages);
 
spin_lock(&swap_avail_lock);
@@ -946,10 +951,13 @@ int get_swap_pages(int n_goal, swp_entry_t swp_entries[])
spin_unlock(&si->lock);
goto nextsi;
}
-   n_ret = scan_swap_map_slots(si, SWAP_HAS_CACHE,
-   n_goal, swp_entries);
+   if (likely(nr_pages == 1))
+   n_ret = scan_swap_map_slots(si, SWAP_HAS_CACHE,
+   n_goal, swp_entries);
+   else
+   n_ret = swap_alloc_huge_cluster(si, swp_entries);
spin_unlock(&si->lock);
-   if (n_ret)
+   if (n_ret || unlikely(nr_pages != 1))
goto check_out;
pr_debug("scan_swap_map of si %d failed to find offset\n",
si->type);
-- 
2.11.0



[PATCH -mm -v6 5/9] mm, THP, swap: Support to clear SWAP_HAS_CACHE for huge page

2017-03-07 Thread Huang, Ying
From: Huang Ying 

__swapcache_free() is added to support clearing the SWAP_HAS_CACHE flag
for the huge page.  For now, this frees the specified swap cluster,
because the function is called only in the error path to free the
swap cluster just allocated.  So the corresponding swap_map[i] ==
SWAP_HAS_CACHE, that is, the swap count is 0.  This makes the
implementation simpler than that of the ordinary swap entry.

This will be used for delaying splitting THP (Transparent Huge Page)
during swapping out: to swap out one THP, we will allocate a swap
cluster, add the THP into the swap cache, then split the THP.  If
anything fails after allocating the swap cluster and before splitting
the THP successfully, swapcache_free_trans_huge() will be used to
free the swap space allocated.
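
Why the error path is simple can be seen in a small userspace model.
This is a sketch under assumptions: the function name and the 512-slot
cluster size are illustrative, locking and memcg uncharging are left
out, and assert() stands in for VM_BUG_ON().  Since every entry holds
only the cache flag, clearing that flag leaves the slot unused.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_SWAP_HAS_CACHE	0x40	/* flag bit, same value as in the kernel */
#define SKETCH_CLUSTER_SLOTS	512	/* assumed: one slot per THP sub-page */

/*
 * Release a cluster that was just allocated for a THP and never used.
 * Every entry is expected to hold exactly the cache flag (swap count 0),
 * so clearing the flag leaves the slot free.  Returns the number of
 * slots released.
 */
static size_t sketch_swapcache_free_huge(unsigned char *swap_map, size_t offset)
{
	unsigned char *map = swap_map + offset;
	size_t i;

	for (i = 0; i < SKETCH_CLUSTER_SLOTS; i++) {
		assert(map[i] == SKETCH_SWAP_HAS_CACHE);  /* VM_BUG_ON() stand-in */
		map[i] &= (unsigned char)~SKETCH_SWAP_HAS_CACHE;
	}
	return i;
}
```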

Cc: Andrea Arcangeli 
Cc: Kirill A. Shutemov 
Cc: Hugh Dickins 
Cc: Shaohua Li 
Cc: Minchan Kim 
Cc: Rik van Riel 
Signed-off-by: "Huang, Ying" 
---
 include/linux/swap.h |  9 +++--
 mm/swapfile.c| 34 --
 2 files changed, 39 insertions(+), 4 deletions(-)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index e3a7609a8989..2f2a6c0363aa 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -394,7 +394,7 @@ extern void swap_shmem_alloc(swp_entry_t);
 extern int swap_duplicate(swp_entry_t);
 extern int swapcache_prepare(swp_entry_t);
 extern void swap_free(swp_entry_t);
-extern void swapcache_free(swp_entry_t);
+extern void __swapcache_free(swp_entry_t entry, bool huge);
 extern void swapcache_free_entries(swp_entry_t *entries, int n);
 extern int free_swap_and_cache(swp_entry_t);
 extern int swap_type_of(dev_t, sector_t, struct block_device **);
@@ -456,7 +456,7 @@ static inline void swap_free(swp_entry_t swp)
 {
 }
 
-static inline void swapcache_free(swp_entry_t swp)
+static inline void __swapcache_free(swp_entry_t swp, bool huge)
 {
 }
 
@@ -544,6 +544,11 @@ static inline swp_entry_t get_huge_swap_page(void)
 }
 #endif
 
+static inline void swapcache_free(swp_entry_t entry)
+{
+   __swapcache_free(entry, false);
+}
+
 #ifdef CONFIG_MEMCG
 static inline int mem_cgroup_swappiness(struct mem_cgroup *memcg)
 {
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 7241c937e52b..6019f94afbaf 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -855,6 +855,29 @@ static void swap_free_huge_cluster(struct swap_info_struct *si,
_swap_entry_free(si, offset, true);
 }
 
+static void swapcache_free_trans_huge(struct swap_info_struct *si,
+ swp_entry_t entry)
+{
+   unsigned long offset = swp_offset(entry);
+   unsigned long idx = offset / SWAPFILE_CLUSTER;
+   struct swap_cluster_info *ci;
+   unsigned char *map;
+   unsigned int i;
+
+   spin_lock(&si->lock);
+   ci = lock_cluster(si, offset);
+   map = si->swap_map + offset;
+   for (i = 0; i < SWAPFILE_CLUSTER; i++) {
+   VM_BUG_ON(map[i] != SWAP_HAS_CACHE);
+   map[i] &= ~SWAP_HAS_CACHE;
+   }
+   unlock_cluster(ci);
+   /* Cluster size is same as huge pmd size */
+   mem_cgroup_uncharge_swap(entry, HPAGE_PMD_NR);
+   swap_free_huge_cluster(si, idx);
+   spin_unlock(&si->lock);
+}
+
 static int swap_alloc_huge_cluster(struct swap_info_struct *si,
   swp_entry_t *slot)
 {
@@ -887,6 +910,11 @@ static inline int swap_alloc_huge_cluster(struct swap_info_struct *si,
 {
return 0;
 }
+
+static inline void swapcache_free_trans_huge(struct swap_info_struct *si,
+swp_entry_t entry)
+{
+}
 #endif
 
 static unsigned long scan_swap_map(struct swap_info_struct *si,
@@ -1161,13 +1189,15 @@ void swap_free(swp_entry_t entry)
 /*
  * Called after dropping swapcache to decrease refcnt to swap entries.
  */
-void swapcache_free(swp_entry_t entry)
+void __swapcache_free(swp_entry_t entry, bool huge)
 {
struct swap_info_struct *p;
 
p = _swap_info_get(entry);
if (p) {
-   if (!__swap_entry_free(p, entry, SWAP_HAS_CACHE))
+   if (unlikely(huge))
+   swapcache_free_trans_huge(p, entry);
+   else if (!__swap_entry_free(p, entry, SWAP_HAS_CACHE))
free_swap_slot(entry);
}
 }
-- 
2.11.0



[PATCH -mm -v6 6/9] mm, THP, swap: Support to add/delete THP to/from swap cache

2017-03-07 Thread Huang, Ying
From: Huang Ying 

With this patch, a THP (Transparent Huge Page) can be added/deleted
to/from the swap cache as a set of (HPAGE_PMD_NR) sub-pages.

This will be used for the THP (Transparent Huge Page) swap support,
where one THP may be added/deleted to/from the swap cache.  This will
also batch the swap cache operations to reduce the number of lock
acquire/release operations for the THP swap.
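
The all-or-nothing insertion of a set of sub-pages can be sketched in
userspace like this.  It is a toy model under assumptions: a fixed
array stands in for the radix tree, all sketch_* names are hypothetical,
and locking and page reference counting are omitted.  It shows only the
insert loop, the single accounting update for the whole batch, and the
rollback on failure.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_TABLE_SIZE 16

/* Toy stand-in for the swap cache: one pointer per swap offset. */
struct sketch_cache {
	const void *slot[SKETCH_TABLE_SIZE];
	size_t nrpages;
};

/* Insert one entry; fails if the offset is out of range or occupied. */
static int sketch_insert(struct sketch_cache *c, size_t off, const void *p)
{
	if (off >= SKETCH_TABLE_SIZE || c->slot[off])
		return -1;
	c->slot[off] = p;
	return 0;
}

/*
 * Add nr consecutive sub-pages starting at offset off.  Either all of
 * them end up in the cache, or none do: on failure the entries inserted
 * so far are removed again, in reverse order.
 */
static int sketch_add_huge(struct sketch_cache *c, size_t off,
			   const char *pages, size_t nr)
{
	size_t i;
	int error = 0;

	for (i = 0; i < nr; i++) {
		error = sketch_insert(c, off + i, pages + i);
		if (error)
			break;
	}
	if (!error) {
		c->nrpages += nr;	/* account for the batch in one step */
		return 0;
	}
	while (i--)
		c->slot[off + i] = NULL;	/* roll back */
	return error;
}
```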

Cc: Hugh Dickins 
Cc: Shaohua Li 
Cc: Minchan Kim 
Cc: Rik van Riel 
Cc: Andrea Arcangeli 
Cc: Kirill A. Shutemov 
Signed-off-by: "Huang, Ying" 
---
 include/linux/page-flags.h |  5 ++--
 mm/swap_state.c| 64 ++
 2 files changed, 45 insertions(+), 24 deletions(-)

diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index 6b5818d6de32..f4acd6c4f808 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -326,11 +326,12 @@ PAGEFLAG_FALSE(HighMem)
 #ifdef CONFIG_SWAP
 static __always_inline int PageSwapCache(struct page *page)
 {
+   page = compound_head(page);
return PageSwapBacked(page) && test_bit(PG_swapcache, &page->flags);
 
 }
-SETPAGEFLAG(SwapCache, swapcache, PF_NO_COMPOUND)
-CLEARPAGEFLAG(SwapCache, swapcache, PF_NO_COMPOUND)
+SETPAGEFLAG(SwapCache, swapcache, PF_NO_TAIL)
+CLEARPAGEFLAG(SwapCache, swapcache, PF_NO_TAIL)
 #else
 PAGEFLAG_FALSE(SwapCache)
 #endif
diff --git a/mm/swap_state.c b/mm/swap_state.c
index 3c248f0a0abc..387466fd114b 100644
--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -38,6 +38,7 @@ struct address_space *swapper_spaces[MAX_SWAPFILES];
 static unsigned int nr_swapper_spaces[MAX_SWAPFILES];
 
 #define INC_CACHE_INFO(x)  do { swap_cache_info.x++; } while (0)
+#define ADD_CACHE_INFO(x, nr)  do { swap_cache_info.x += (nr); } while (0)
 
 static struct {
unsigned long add_total;
@@ -90,39 +91,52 @@ void show_swap_cache_info(void)
  */
 int __add_to_swap_cache(struct page *page, swp_entry_t entry)
 {
-   int error;
+   int error, i, nr = hpage_nr_pages(page);
struct address_space *address_space;
+   struct page *cur_page;
+   swp_entry_t cur_entry;
 
VM_BUG_ON_PAGE(!PageLocked(page), page);
VM_BUG_ON_PAGE(PageSwapCache(page), page);
VM_BUG_ON_PAGE(!PageSwapBacked(page), page);
 
-   get_page(page);
+   page_ref_add(page, nr);
SetPageSwapCache(page);
-   set_page_private(page, entry.val);
 
address_space = swap_address_space(entry);
+   cur_page = page;
+   cur_entry.val = entry.val;
spin_lock_irq(&address_space->tree_lock);
-   error = radix_tree_insert(&address_space->page_tree,
- swp_offset(entry), page);
-   if (likely(!error)) {
-   address_space->nrpages++;
-   __inc_node_page_state(page, NR_FILE_PAGES);
-   INC_CACHE_INFO(add_total);
+   for (i = 0; i < nr; i++, cur_page++, cur_entry.val++) {
+   set_page_private(cur_page, cur_entry.val);
+   error = radix_tree_insert(&address_space->page_tree,
+ swp_offset(cur_entry), cur_page);
+   if (unlikely(error))
+   break;
}
-   spin_unlock_irq(&address_space->tree_lock);
-
-   if (unlikely(error)) {
+   if (likely(!error)) {
+   address_space->nrpages += nr;
+   __mod_node_page_state(page_pgdat(page), NR_FILE_PAGES, nr);
+   ADD_CACHE_INFO(add_total, nr);
+   } else {
/*
 * Only the context which have set SWAP_HAS_CACHE flag
 * would call add_to_swap_cache().
 * So add_to_swap_cache() doesn't returns -EEXIST.
 */
VM_BUG_ON(error == -EEXIST);
-   set_page_private(page, 0UL);
+   set_page_private(cur_page, 0UL);
+   while (i--) {
+   cur_page--;
+   cur_entry.val--;
+   radix_tree_delete(&address_space->page_tree,
+ swp_offset(cur_entry));
+   set_page_private(cur_page, 0UL);
+   }
ClearPageSwapCache(page);
-   put_page(page);
+   page_ref_sub(page, nr);
}
+   spin_unlock_irq(&address_space->tree_lock);
 
return error;
 }
@@ -132,7 +146,7 @@ int add_to_swap_cache(struct page *page, swp_entry_t entry, gfp_t gfp_mask)
 {
int error;
 
-   error = radix_tree_maybe_preload(gfp_mask);
+   error = radix_tree_maybe_preload_order(gfp_mask, compound_order(page));
if (!error) {
error = __add_to_swap_cache(page, entry);
radix_tree_preload_end();
@@ -148,6 +162,7 @@ void __delete_from_swap_cache(struct page *page)
 {
swp_entry_t entry;
  

[PATCH -mm -v6 5/9] mm, THP, swap: Support to clear SWAP_HAS_CACHE for huge page

2017-03-07 Thread Huang, Ying
From: Huang Ying 

__swapcache_free() is added to support to clear the SWAP_HAS_CACHE flag
for the huge page.  This will free the specified swap cluster now.
Because now this function will be called only in the error path to free
the swap cluster just allocated.  So the corresponding swap_map[i] ==
SWAP_HAS_CACHE, that is, the swap count is 0.  This makes the
implementation simpler than that of the ordinary swap entry.

This will be used for delaying splitting THP (Transparent Huge Page)
during swapping out.  Where for one THP to swap out, we will allocate a
swap cluster, add the THP into the swap cache, then split the THP.  If
anything fails after allocating the swap cluster and before splitting
the THP successfully, the swapcache_free_trans_huge() will be used to
free the swap space allocated.

Cc: Andrea Arcangeli 
Cc: Kirill A. Shutemov 
Cc: Hugh Dickins 
Cc: Shaohua Li 
Cc: Minchan Kim 
Cc: Rik van Riel 
Signed-off-by: "Huang, Ying" 
---
 include/linux/swap.h |  9 +++--
 mm/swapfile.c| 34 --
 2 files changed, 39 insertions(+), 4 deletions(-)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index e3a7609a8989..2f2a6c0363aa 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -394,7 +394,7 @@ extern void swap_shmem_alloc(swp_entry_t);
 extern int swap_duplicate(swp_entry_t);
 extern int swapcache_prepare(swp_entry_t);
 extern void swap_free(swp_entry_t);
-extern void swapcache_free(swp_entry_t);
+extern void __swapcache_free(swp_entry_t entry, bool huge);
 extern void swapcache_free_entries(swp_entry_t *entries, int n);
 extern int free_swap_and_cache(swp_entry_t);
 extern int swap_type_of(dev_t, sector_t, struct block_device **);
@@ -456,7 +456,7 @@ static inline void swap_free(swp_entry_t swp)
 {
 }
 
-static inline void swapcache_free(swp_entry_t swp)
+static inline void __swapcache_free(swp_entry_t swp, bool huge)
 {
 }
 
@@ -544,6 +544,11 @@ static inline swp_entry_t get_huge_swap_page(void)
 }
 #endif
 
+static inline void swapcache_free(swp_entry_t entry)
+{
+   __swapcache_free(entry, false);
+}
+
 #ifdef CONFIG_MEMCG
 static inline int mem_cgroup_swappiness(struct mem_cgroup *memcg)
 {
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 7241c937e52b..6019f94afbaf 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -855,6 +855,29 @@ static void swap_free_huge_cluster(struct swap_info_struct 
*si,
_swap_entry_free(si, offset, true);
 }
 
+static void swapcache_free_trans_huge(struct swap_info_struct *si,
+ swp_entry_t entry)
+{
+   unsigned long offset = swp_offset(entry);
+   unsigned long idx = offset / SWAPFILE_CLUSTER;
+   struct swap_cluster_info *ci;
+   unsigned char *map;
+   unsigned int i;
+
+   spin_lock(&si->lock);
+   ci = lock_cluster(si, offset);
+   map = si->swap_map + offset;
+   for (i = 0; i < SWAPFILE_CLUSTER; i++) {
+   VM_BUG_ON(map[i] != SWAP_HAS_CACHE);
+   map[i] &= ~SWAP_HAS_CACHE;
+   }
+   unlock_cluster(ci);
+   /* Cluster size is same as huge pmd size */
+   mem_cgroup_uncharge_swap(entry, HPAGE_PMD_NR);
+   swap_free_huge_cluster(si, idx);
+   spin_unlock(&si->lock);
+}
+
 static int swap_alloc_huge_cluster(struct swap_info_struct *si,
   swp_entry_t *slot)
 {
@@ -887,6 +910,11 @@ static inline int swap_alloc_huge_cluster(struct 
swap_info_struct *si,
 {
return 0;
 }
+
+static inline void swapcache_free_trans_huge(struct swap_info_struct *si,
+swp_entry_t entry)
+{
+}
 #endif
 
 static unsigned long scan_swap_map(struct swap_info_struct *si,
@@ -1161,13 +1189,15 @@ void swap_free(swp_entry_t entry)
 /*
  * Called after dropping swapcache to decrease refcnt to swap entries.
  */
-void swapcache_free(swp_entry_t entry)
+void __swapcache_free(swp_entry_t entry, bool huge)
 {
struct swap_info_struct *p;
 
p = _swap_info_get(entry);
if (p) {
-   if (!__swap_entry_free(p, entry, SWAP_HAS_CACHE))
+   if (unlikely(huge))
+   swapcache_free_trans_huge(p, entry);
+   else if (!__swap_entry_free(p, entry, SWAP_HAS_CACHE))
free_swap_slot(entry);
}
 }
-- 
2.11.0



[PATCH -mm -v6 6/9] mm, THP, swap: Support to add/delete THP to/from swap cache

2017-03-07 Thread Huang, Ying
From: Huang Ying 

With this patch, a THP (Transparent Huge Page) can be added/deleted
to/from the swap cache as a set of (HPAGE_PMD_NR) sub-pages.

This will be used for the THP swap support, where one THP may be
added/deleted to/from the swap cache as a whole.  This also batches the
swap cache operations, reducing the number of lock acquisitions and
releases for THP swap.
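
The batching with rollback can be sketched in user space as follows.
This is only an illustration under stated assumptions: struct toy_cache,
toy_cache_add_range() and CACHE_SLOTS are hypothetical names, and a flat
array stands in for the radix tree.

```c
#include <stddef.h>

#define CACHE_SLOTS 1024	/* size of the toy cache, an assumption */

/* Toy stand-in for the radix tree: slot i holds a page pointer or NULL. */
struct toy_cache {
	const void *slot[CACHE_SLOTS];
	size_t nrpages;
};

/*
 * Insert nr consecutive sub-pages starting at offset.  On any failure,
 * undo the insertions made so far so the cache is left unchanged.
 * Returns 0 on success, -1 on failure.
 */
int toy_cache_add_range(struct toy_cache *c, size_t offset,
			const char *pages, size_t nr)
{
	size_t i;

	if (offset > CACHE_SLOTS || nr > CACHE_SLOTS - offset)
		return -1;

	for (i = 0; i < nr; i++) {
		if (c->slot[offset + i] != NULL)
			break;			/* slot already in use */
		c->slot[offset + i] = pages + i;
	}
	if (i == nr) {
		c->nrpages += nr;	/* account once for the whole batch */
		return 0;
	}
	while (i--)			/* roll back the slots filled so far */
		c->slot[offset + i] = NULL;
	return -1;
}
```

In the real patch the whole loop runs under a single tree_lock
acquisition, which is where the saving over HPAGE_PMD_NR separate
insertions comes from.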

Cc: Hugh Dickins 
Cc: Shaohua Li 
Cc: Minchan Kim 
Cc: Rik van Riel 
Cc: Andrea Arcangeli 
Cc: Kirill A. Shutemov 
Signed-off-by: "Huang, Ying" 
---
 include/linux/page-flags.h |  5 ++--
 mm/swap_state.c| 64 ++
 2 files changed, 45 insertions(+), 24 deletions(-)

diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index 6b5818d6de32..f4acd6c4f808 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -326,11 +326,12 @@ PAGEFLAG_FALSE(HighMem)
 #ifdef CONFIG_SWAP
 static __always_inline int PageSwapCache(struct page *page)
 {
+   page = compound_head(page);
return PageSwapBacked(page) && test_bit(PG_swapcache, &page->flags);
 
 }
-SETPAGEFLAG(SwapCache, swapcache, PF_NO_COMPOUND)
-CLEARPAGEFLAG(SwapCache, swapcache, PF_NO_COMPOUND)
+SETPAGEFLAG(SwapCache, swapcache, PF_NO_TAIL)
+CLEARPAGEFLAG(SwapCache, swapcache, PF_NO_TAIL)
 #else
 PAGEFLAG_FALSE(SwapCache)
 #endif
diff --git a/mm/swap_state.c b/mm/swap_state.c
index 3c248f0a0abc..387466fd114b 100644
--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -38,6 +38,7 @@ struct address_space *swapper_spaces[MAX_SWAPFILES];
 static unsigned int nr_swapper_spaces[MAX_SWAPFILES];
 
 #define INC_CACHE_INFO(x)  do { swap_cache_info.x++; } while (0)
+#define ADD_CACHE_INFO(x, nr)  do { swap_cache_info.x += (nr); } while (0)
 
 static struct {
unsigned long add_total;
@@ -90,39 +91,52 @@ void show_swap_cache_info(void)
  */
 int __add_to_swap_cache(struct page *page, swp_entry_t entry)
 {
-   int error;
+   int error, i, nr = hpage_nr_pages(page);
struct address_space *address_space;
+   struct page *cur_page;
+   swp_entry_t cur_entry;
 
VM_BUG_ON_PAGE(!PageLocked(page), page);
VM_BUG_ON_PAGE(PageSwapCache(page), page);
VM_BUG_ON_PAGE(!PageSwapBacked(page), page);
 
-   get_page(page);
+   page_ref_add(page, nr);
SetPageSwapCache(page);
-   set_page_private(page, entry.val);
 
address_space = swap_address_space(entry);
+   cur_page = page;
+   cur_entry.val = entry.val;
spin_lock_irq(&address_space->tree_lock);
-   error = radix_tree_insert(&address_space->page_tree,
- swp_offset(entry), page);
-   if (likely(!error)) {
-   address_space->nrpages++;
-   __inc_node_page_state(page, NR_FILE_PAGES);
-   INC_CACHE_INFO(add_total);
+   for (i = 0; i < nr; i++, cur_page++, cur_entry.val++) {
+   set_page_private(cur_page, cur_entry.val);
+   error = radix_tree_insert(&address_space->page_tree,
+ swp_offset(cur_entry), cur_page);
+   if (unlikely(error))
+   break;
}
-   spin_unlock_irq(&address_space->tree_lock);
-
-   if (unlikely(error)) {
+   if (likely(!error)) {
+   address_space->nrpages += nr;
+   __mod_node_page_state(page_pgdat(page), NR_FILE_PAGES, nr);
+   ADD_CACHE_INFO(add_total, nr);
+   } else {
/*
 * Only the context which have set SWAP_HAS_CACHE flag
 * would call add_to_swap_cache().
 * So add_to_swap_cache() doesn't returns -EEXIST.
 */
VM_BUG_ON(error == -EEXIST);
-   set_page_private(page, 0UL);
+   set_page_private(cur_page, 0UL);
+   while (i--) {
+   cur_page--;
+   cur_entry.val--;
+   radix_tree_delete(&address_space->page_tree,
+ swp_offset(cur_entry));
+   set_page_private(cur_page, 0UL);
+   }
ClearPageSwapCache(page);
-   put_page(page);
+   page_ref_sub(page, nr);
}
+   spin_unlock_irq(&address_space->tree_lock);
 
return error;
 }
@@ -132,7 +146,7 @@ int add_to_swap_cache(struct page *page, swp_entry_t entry, 
gfp_t gfp_mask)
 {
int error;
 
-   error = radix_tree_maybe_preload(gfp_mask);
+   error = radix_tree_maybe_preload_order(gfp_mask, compound_order(page));
if (!error) {
error = __add_to_swap_cache(page, entry);
radix_tree_preload_end();
@@ -148,6 +162,7 @@ void __delete_from_swap_cache(struct page *page)
 {
swp_entry_t entry;
struct address_space *address_space;
+   int i, nr = hpage_nr_pages(page);
 
VM_BUG_ON_PAGE(!PageLocked(page), page);

[PATCH -mm -v6 7/9] mm, THP: Add can_split_huge_page()

2017-03-07 Thread Huang, Ying
From: Huang Ying 

Separate the check for whether a huge page can be split out of
split_huge_page_to_list() into its own function.  This makes it possible
to do the check before actually splitting the THP (Transparent Huge
Page).

This will be used to delay splitting the THP during swap out: for a
THP, we will allocate a swap cluster, add the THP into the swap cache,
then split the THP.  To avoid unnecessary operations on an un-splittable
THP, we will do this check first.

There is no functional change in this patch.
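
The reference-count arithmetic behind the check can be modelled in user
space as below.  This is a sketch, not the kernel function: struct
toy_thp and toy_can_split() are hypothetical, HPAGE_PMD_NR is set to 512
on the assumption of 2M huge pages with 4k base pages, and reading the
"- 1" as the caller's own pin is an interpretation.

```c
#include <stdbool.h>

#define HPAGE_PMD_NR 512	/* assumes 2M THP with 4k base pages */

struct toy_thp {
	bool anon;		/* anonymous, as opposed to page-cache backed */
	int total_mapcount;	/* references held by page table mappings */
	int page_count;		/* total reference count */
};

/* Mirrors the equality tested by the patch, on plain integers. */
bool toy_can_split(const struct toy_thp *p, int *pextra_pins)
{
	/* A non-anonymous THP is pinned once per sub-page by the radix tree. */
	int extra_pins = p->anon ? 0 : HPAGE_PMD_NR;

	if (pextra_pins)
		*pextra_pins = extra_pins;
	/* The "- 1" is taken here to be the pin held by the caller. */
	return p->total_mapcount == p->page_count - extra_pins - 1;
}
```

Any reference beyond the mappings, the radix tree pins and the caller's
pin makes the equality fail, so the page is reported as not splittable.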

Cc: Andrea Arcangeli 
Cc: Ebru Akagunduz 
Signed-off-by: "Huang, Ying" 
Acked-by: Kirill A. Shutemov 
---
 include/linux/huge_mm.h |  7 +++
 mm/huge_memory.c| 17 ++---
 2 files changed, 21 insertions(+), 3 deletions(-)

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index a3762d49ba39..d3b3e8fcc717 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -113,6 +113,7 @@ extern unsigned long thp_get_unmapped_area(struct file 
*filp,
 extern void prep_transhuge_page(struct page *page);
 extern void free_transhuge_page(struct page *page);
 
+bool can_split_huge_page(struct page *page, int *pextra_pins);
 int split_huge_page_to_list(struct page *page, struct list_head *list);
 static inline int split_huge_page(struct page *page)
 {
@@ -231,6 +232,12 @@ static inline void prep_transhuge_page(struct page *page) 
{}
 
 #define thp_get_unmapped_area  NULL
 
+static inline bool
+can_split_huge_page(struct page *page, int *pextra_pins)
+{
+   BUILD_BUG();
+   return false;
+}
 static inline int
 split_huge_page_to_list(struct page *page, struct list_head *list)
 {
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 2b4120f6930c..45f944db43b0 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2362,6 +2362,19 @@ int page_trans_huge_mapcount(struct page *page, int 
*total_mapcount)
return ret;
 }
 
+/* Racy check whether the huge page can be split */
+bool can_split_huge_page(struct page *page, int *pextra_pins)
+{
+   int extra_pins = 0;
+
+   /* Additional pins from radix tree */
+   if (!PageAnon(page))
+   extra_pins = HPAGE_PMD_NR;
+   if (pextra_pins)
+   *pextra_pins = extra_pins;
+   return total_mapcount(page) == page_count(page) - extra_pins - 1;
+}
+
 /*
  * This function splits huge page into normal pages. @page can point to any
  * subpage of huge page to split. Split doesn't change the position of @page.
@@ -2421,8 +2434,6 @@ int split_huge_page_to_list(struct page *page, struct 
list_head *list)
goto out;
}
 
-   /* Addidional pins from radix tree */
-   extra_pins = HPAGE_PMD_NR;
anon_vma = NULL;
i_mmap_lock_read(mapping);
}
@@ -2431,7 +2442,7 @@ int split_huge_page_to_list(struct page *page, struct 
list_head *list)
 * Racy check if we can split the page, before freeze_page() will
 * split PMDs
 */
-   if (total_mapcount(head) != page_count(head) - extra_pins - 1) {
+   if (!can_split_huge_page(head, &extra_pins)) {
ret = -EBUSY;
goto out_unlock;
}
-- 
2.11.0



[PATCH -mm -v6 9/9] mm, THP, swap: Delay splitting THP during swap out

2017-03-07 Thread Huang, Ying
From: Huang Ying 

In this patch, splitting the huge page is delayed from almost the first
step of swap out until after the swap space for the THP (Transparent
Huge Page) has been allocated and the THP has been added into the swap
cache.  This reduces the acquiring/releasing of the locks used for swap
cache management.

This is the first step of the THP swap support.  The plan is to delay
splitting the THP step by step and finally avoid splitting it at all.

The advantages of the THP swap support include:

- Batch the swap operations for the THP to reduce lock
  acquiring/releasing, including allocating/freeing the swap space,
  adding/deleting to/from the swap cache, and writing/reading the swap
  space, etc.  This will help to improve the THP swap performance.

- The THP swap space read/write will be 2M sequential IO.  This is
  particularly helpful for swap reads, which are usually 4k random
  IO.  This will help to improve the THP swap performance too.

- It will help reduce memory fragmentation, especially when THP is
  heavily used by the applications.  2M of contiguous pages will be
  freed up after the THP is swapped out.

- It will improve the THP utilization on systems with swap turned on,
  because khugepaged is quite slow at collapsing normal pages into a
  THP.  If the THP is split during swap out, it will take quite a long
  time for the normal pages to collapse back into a THP after being
  swapped in.  High THP utilization also helps the efficiency of the
  page based memory management.

There are some concerns regarding THP swap in, mainly because the
possibly enlarged read/write IO size (for swap in/out) may put more
overhead on the storage device.  To deal with that, THP swap in should
be turned on only when necessary.  For example, it can be selected via
"always/never/madvise" logic, to be turned on globally, turned off
globally, or turned on only for VMAs with MADV_HUGEPAGE, etc.

With the patchset, the swap out throughput improves 14.9% (from about
3.77GB/s to about 4.34GB/s) in the vm-scalability swap-w-seq test case
with 8 processes.  The test is done on a Xeon E5 v3 system.  The swap
device used is a RAM-simulated PMEM (persistent memory) device.  To
test sequential swap out, the test case creates 8 processes, which
sequentially allocate and write to anonymous pages until the RAM and
part of the swap device are used up.
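
The quoted figures can be reproduced from the vmstat.swap.so row of the
comparison table (3957091 for base, 4547173 with the patchset).  The
sketch below assumes that row is in KiB/s and that "GB" means 2^30
bytes, which is the reading that matches 3.77 and 4.34; the helper names
are made up for the example.

```c
/* Relative change from base to patched, in percent. */
double percent_change(double base, double patched)
{
	return (patched - base) / base * 100.0;
}

/* Convert a rate in KiB/s to GiB/s (assumed units, see above). */
double kib_to_gib(double kib)
{
	return kib / (1024.0 * 1024.0);
}
```

With these helpers, percent_change(3957091, 4547173) is about 14.9,
and kib_to_gib() maps the two values to about 3.77 and 4.34.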

The detailed comparison results are as follows:

            base               base+patchset
---------------- --------------------------
         %stddev     %change         %stddev
             \          |                \
   7043990 ±  0%     +21.2%    8536807 ±  0%  vm-scalability.throughput
    109.94 ±  1%     -16.2%      92.09 ±  0%  vm-scalability.time.elapsed_time
   3957091 ±  0%     +14.9%    4547173 ±  0%  vmstat.swap.so
     31.46 ±  1%     -38.3%      19.42 ±  0%  perf-stat.cache-miss-rate%
      1.04 ±  1%     +22.2%       1.27 ±  0%  perf-stat.ipc
      9.33 ±  2%     -60.7%       3.67 ±  1%  perf-profile.calltrace.cycles-pp.add_to_swap.shrink_page_list.shrink_inactive_list.shrink_node_memcg.shrink_node

Signed-off-by: "Huang, Ying" 
---
 mm/swap_state.c | 60 ++---
 1 file changed, 57 insertions(+), 3 deletions(-)

diff --git a/mm/swap_state.c b/mm/swap_state.c
index 387466fd114b..12e7a461cf4c 100644
--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -19,6 +19,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 
@@ -183,12 +184,53 @@ void __delete_from_swap_cache(struct page *page)
ADD_CACHE_INFO(del_total, nr);
 }
 
+#ifdef CONFIG_THP_SWAP_CLUSTER
+int add_to_swap_trans_huge(struct page *page, struct list_head *list)
+{
+   swp_entry_t entry;
+   int ret = 0;
+
+   /* cannot split, which may be needed during swap in, skip it */
+   if (!can_split_huge_page(page, NULL))
+   return -EBUSY;
+   /* fallback to split huge page firstly if no PMD map */
+   if (!compound_mapcount(page))
+   return 0;
+   entry = get_huge_swap_page();
+   if (!entry.val)
+   return 0;
+   if (mem_cgroup_try_charge_swap(page, entry, HPAGE_PMD_NR)) {
+   __swapcache_free(entry, true);
+   return -EOVERFLOW;
+   }
+   ret = add_to_swap_cache(page, entry,
+   __GFP_HIGH | __GFP_NOMEMALLOC|__GFP_NOWARN);
+   /* -ENOMEM radix-tree allocation failure */
+   if (ret) {
+   __swapcache_free(entry, true);
+   return 0;
+   }
+   ret = split_huge_page_to_list(page, list);
+   if (ret) {
+   delete_from_swap_cache(page);
+   return -EBUSY;
+   }
+   return 1;
+}
+#else
+static inline int add_to_swap_trans_huge(struct page *page,
+struct list_head *list)
+{
+   return 0;
+}
+#endif
+
 /**
  * add_to_swap - allocate swap space for a page
  * 

[PATCH -mm -v6 2/9] mm, memcg: Support to charge/uncharge multiple swap entries

2017-03-07 Thread Huang, Ying
From: Huang Ying 

This patch makes it possible to charge or uncharge a set of contiguous
swap entries in the swap cgroup.  The number of swap entries is
specified via an added parameter.

This will be used for the THP (Transparent Huge Page) swap support,
where a swap cluster backing a THP may be allocated and freed as a
whole.  So a set of (HPAGE_PMD_NR) contiguous swap entries backing one
THP needs to be charged or uncharged together.  This will batch the
cgroup operations for the THP swap too.
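
A user-space sketch of the batched interface is shown below.  The
functions toy_swap_cgroup_record() and toy_swap_statistics() are
hypothetical stand-ins operating on plain arrays; in particular,
returning the old id of the first entry is an assumption of the sketch,
not something the patch text here confirms.

```c
#include <stddef.h>

/*
 * Toy model: ids[] holds one cgroup id per swap entry.  Record id for
 * nr_ents consecutive entries starting at offset and return the id
 * previously stored in the first of them.
 */
unsigned short toy_swap_cgroup_record(unsigned short *ids, size_t offset,
				      unsigned short id, unsigned int nr_ents)
{
	unsigned short old = ids[offset];
	unsigned int i;

	for (i = 0; i < nr_ents; i++)
		ids[offset + i] = id;
	return old;
}

/* Per-cgroup swap statistic adjusted by a signed batch size. */
void toy_swap_statistics(long *stat, int nr_entries)
{
	*stat += nr_entries;
}
```

Passing a signed count is what lets one statistics helper serve both
charge (positive) and uncharge (negative), as the -1/1 call sites in the
diff do for a single entry.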

Cc: Andrea Arcangeli 
Cc: Kirill A. Shutemov 
Cc: Vladimir Davydov 
Cc: Johannes Weiner 
Cc: Michal Hocko 
Cc: Tejun Heo 
Cc: cgro...@vger.kernel.org
Signed-off-by: "Huang, Ying" 
---
 include/linux/swap.h| 12 ++
 include/linux/swap_cgroup.h |  6 +++--
 mm/memcontrol.c | 57 +
 mm/shmem.c  |  2 +-
 mm/swap_cgroup.c| 40 +++
 mm/swap_state.c |  2 +-
 mm/swapfile.c   |  2 +-
 7 files changed, 77 insertions(+), 44 deletions(-)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index 486494e6b2fc..278e1349a424 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -550,8 +550,10 @@ static inline int mem_cgroup_swappiness(struct mem_cgroup 
*mem)
 
 #ifdef CONFIG_MEMCG_SWAP
 extern void mem_cgroup_swapout(struct page *page, swp_entry_t entry);
-extern int mem_cgroup_try_charge_swap(struct page *page, swp_entry_t entry);
-extern void mem_cgroup_uncharge_swap(swp_entry_t entry);
+extern int mem_cgroup_try_charge_swap(struct page *page, swp_entry_t entry,
+ unsigned int nr_entries);
+extern void mem_cgroup_uncharge_swap(swp_entry_t entry,
+unsigned int nr_entries);
 extern long mem_cgroup_get_nr_swap_pages(struct mem_cgroup *memcg);
 extern bool mem_cgroup_swap_full(struct page *page);
 #else
@@ -560,12 +562,14 @@ static inline void mem_cgroup_swapout(struct page *page, 
swp_entry_t entry)
 }
 
 static inline int mem_cgroup_try_charge_swap(struct page *page,
-swp_entry_t entry)
+swp_entry_t entry,
+unsigned int nr_entries)
 {
return 0;
 }
 
-static inline void mem_cgroup_uncharge_swap(swp_entry_t entry)
+static inline void mem_cgroup_uncharge_swap(swp_entry_t entry,
+   unsigned int nr_entries)
 {
 }
 
diff --git a/include/linux/swap_cgroup.h b/include/linux/swap_cgroup.h
index 145306bdc92f..b2b8ec7bda3f 100644
--- a/include/linux/swap_cgroup.h
+++ b/include/linux/swap_cgroup.h
@@ -7,7 +7,8 @@
 
 extern unsigned short swap_cgroup_cmpxchg(swp_entry_t ent,
unsigned short old, unsigned short new);
-extern unsigned short swap_cgroup_record(swp_entry_t ent, unsigned short id);
+extern unsigned short swap_cgroup_record(swp_entry_t ent, unsigned short id,
+unsigned int nr_ents);
 extern unsigned short lookup_swap_cgroup_id(swp_entry_t ent);
 extern int swap_cgroup_swapon(int type, unsigned long max_pages);
 extern void swap_cgroup_swapoff(int type);
@@ -15,7 +16,8 @@ extern void swap_cgroup_swapoff(int type);
 #else
 
 static inline
-unsigned short swap_cgroup_record(swp_entry_t ent, unsigned short id)
+unsigned short swap_cgroup_record(swp_entry_t ent, unsigned short id,
+ unsigned int nr_ents)
 {
return 0;
 }
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 537362e4108e..2b0b6012ad22 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2393,10 +2393,9 @@ void mem_cgroup_split_huge_fixup(struct page *head)
 
 #ifdef CONFIG_MEMCG_SWAP
 static void mem_cgroup_swap_statistics(struct mem_cgroup *memcg,
-bool charge)
+  int nr_entries)
 {
-   int val = (charge) ? 1 : -1;
-   this_cpu_add(memcg->stat->count[MEM_CGROUP_STAT_SWAP], val);
+   this_cpu_add(memcg->stat->count[MEM_CGROUP_STAT_SWAP], nr_entries);
 }
 
 /**
@@ -2422,8 +2421,8 @@ static int mem_cgroup_move_swap_account(swp_entry_t entry,
new_id = mem_cgroup_id(to);
 
if (swap_cgroup_cmpxchg(entry, old_id, new_id) == old_id) {
-   mem_cgroup_swap_statistics(from, false);
-   mem_cgroup_swap_statistics(to, true);
+   mem_cgroup_swap_statistics(from, -1);
+   mem_cgroup_swap_statistics(to, 1);
return 0;
}
return -EINVAL;
@@ -5446,7 +5445,7 @@ void mem_cgroup_commit_charge(struct page *page, struct 
mem_cgroup *memcg,
 * let's not wait for it.  The page already received a
 * memory+swap charge, drop the swap entry duplicate.
 */
-   mem_cgroup_uncharge_swap(entry);
+   

[PATCH -mm -v6 8/9] mm, THP, swap: Support to split THP in swap cache

2017-03-07 Thread Huang, Ying
From: Huang Ying 

This patch enhances split_huge_page_to_list() to work properly for a
THP (Transparent Huge Page) in the swap cache during swap out.

This is used to delay splitting the THP during swap out: for a THP to
be swapped out, we will allocate a swap cluster, add the THP into the
swap cache, then split the THP.  The page lock will be held during this
process.  So in code paths other than swap out, if the THP needs to be
split, PageSwapCache(THP) will always be false.
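
How the swap cache changes the pin accounting can be summarised with a
small user-space sketch.  The function names are hypothetical and
HPAGE_PMD_NR is set to 512 on the assumption of 2M huge pages with 4k
base pages; the case analysis itself follows the diff below.

```c
#include <stdbool.h>

#define HPAGE_PMD_NR 512	/* assumes 2M THP with 4k base pages */

/* Extra pins expected on a THP before it can be split. */
int toy_extra_pins(bool anon, bool in_swap_cache)
{
	if (anon)
		return in_swap_cache ? HPAGE_PMD_NR : 0;
	return HPAGE_PMD_NR;	/* page cache: one radix tree pin per sub-page */
}

/* References added to the head page when the split completes. */
int toy_head_ref_gain(bool anon, bool in_swap_cache)
{
	/* One extra reference whenever a radix tree also points at the page. */
	return (anon && !in_swap_cache) ? 1 : 2;
}
```

An anonymous THP in the swap cache is thus treated like a page-cache
THP for reference counting, since the swap cache radix tree holds a pin
on each sub-page.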

Cc: Andrea Arcangeli 
Cc: Ebru Akagunduz 
Signed-off-by: "Huang, Ying" 
Acked-by: Kirill A. Shutemov 
---
 mm/huge_memory.c | 16 +++-
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 45f944db43b0..ffb7da440fb8 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2180,7 +2180,7 @@ static void __split_huge_page_tail(struct page *head, int 
tail,
 * atomic_set() here would be safe on all archs (and not only on x86),
 * it's safer to use atomic_inc()/atomic_add().
 */
-   if (PageAnon(head)) {
+   if (PageAnon(head) && !PageSwapCache(head)) {
page_ref_inc(page_tail);
} else {
/* Additional pin to radix tree */
@@ -2191,6 +2191,7 @@ static void __split_huge_page_tail(struct page *head, int tail,
page_tail->flags |= (head->flags &
((1L << PG_referenced) |
 (1L << PG_swapbacked) |
+(1L << PG_swapcache) |
 (1L << PG_mlocked) |
 (1L << PG_uptodate) |
 (1L << PG_active) |
@@ -2253,7 +2254,11 @@ static void __split_huge_page(struct page *page, struct list_head *list,
ClearPageCompound(head);
/* See comment in __split_huge_page_tail() */
if (PageAnon(head)) {
-   page_ref_inc(head);
+   /* Additional pin to radix tree of swap cache */
+   if (PageSwapCache(head))
+   page_ref_add(head, 2);
+   else
+   page_ref_inc(head);
} else {
/* Additional pin to radix tree */
page_ref_add(head, 2);
@@ -2365,10 +2370,12 @@ int page_trans_huge_mapcount(struct page *page, int *total_mapcount)
 /* Racy check whether the huge page can be split */
 bool can_split_huge_page(struct page *page, int *pextra_pins)
 {
-   int extra_pins = 0;
+   int extra_pins;
 
/* Additional pins from radix tree */
-   if (!PageAnon(page))
+   if (PageAnon(page))
+   extra_pins = PageSwapCache(page) ? HPAGE_PMD_NR : 0;
+   else
extra_pins = HPAGE_PMD_NR;
if (pextra_pins)
*pextra_pins = extra_pins;
@@ -2422,7 +2429,6 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
ret = -EBUSY;
goto out;
}
-   extra_pins = 0;
mapping = NULL;
anon_vma_lock_write(anon_vma);
} else {
-- 
2.11.0
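
The reference-count rule this patch introduces can be sketched as a small userspace model. This is only an illustration under stated assumptions, not kernel code: `struct model_page`, `model_extra_pins()` and `model_head_ref_add()` are hypothetical names, the two booleans stand in for PageAnon() and PageSwapCache(), and the value 512 for HPAGE_PMD_NR assumes 2 MiB huge pages built from 4 KiB base pages.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: 2 MiB huge pages made of 4 KiB base pages (e.g. x86-64). */
#define MODEL_HPAGE_PMD_NR 512

/* Hypothetical stand-in for the two page flags the patch tests. */
struct model_page {
	bool anon;        /* models PageAnon() */
	bool swap_cache;  /* models PageSwapCache() */
};

/*
 * Extra references expected from the radix tree, following the patched
 * can_split_huge_page(): an anonymous page only has them while it sits
 * in the swap cache; any other page always has them.
 */
int model_extra_pins(const struct model_page *page)
{
	if (page->anon)
		return page->swap_cache ? MODEL_HPAGE_PMD_NR : 0;
	return MODEL_HPAGE_PMD_NR;
}

/*
 * References added to the head page after the split, following the
 * patched __split_huge_page(): 1 for a plain anonymous page, 2 when a
 * radix tree (page cache or swap cache) also holds the page.
 */
int model_head_ref_add(const struct model_page *page)
{
	if (page->anon && !page->swap_cache)
		return 1;
	return 2;
}
```

In this model an anonymous THP in the swap cache is treated like a page-cache THP for pin accounting, which seems to be the point of the `PageSwapCache()` checks added above.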



Re: iommu/rockchip: Fix bugs and enable on ARM64

2017-03-07 Thread Caesar Wang

Shunqian,

Something depends on these patches. Can you resend them to fix the
compile errors?

-Caesar

On 2016-07-16 00:16, Joerg Roedel wrote:

On Fri, Jul 15, 2016 at 05:32:02PM +0200, Matthias Brugger wrote:

The drm rockchip patches are dependent on iommu/rockchip patches, can
you also apply these patches together? So that can avoid compile problem.


While at it. I don't see patch 8 (iommu/Kconfig) in linux-next.
I suppose you forgot to pick that one.

I picked it up first, but it causes compile errors, so I removed it for
now.


Joerg


___
Linux-rockchip mailing list
linux-rockc...@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-rockchip





[PATCH v4 3/7] arm: dts: imx6q: Add Engicam i.CoreM6 Quad/Dual OpenFrame Cap 12.3 initial support

2017-03-07 Thread Jagan Teki
From: Jagan Teki 

i.CoreM6 Quad/Dual OpenFrame modules are "system on modules plus
openframe display carriers", which are a good solution for developing
user-friendly graphical user interfaces.

General features:
CPU   NXP i.MX6Q rev1.2 at 792 MHz
RAM   1GB, 32, 64 bit, DDR3-800/1066
NAND  SLC,512MB
LVDS Display  TFT 12.3" industrial, 1280x480 resolution
Backlight LED backlight, brightness 350 Cd/m2
Power supply  15 to 30 Vdc

Cc: Domenico Acri 
Cc: Matteo Lisi 
Cc: Michael Trimarchi 
Cc: Shawn Guo 
Signed-off-by: Jagan Teki 
---
Changes for v4:
- Fix checkpatch.pl Errors/Warnings
Changes for v3:
- Use native-mode as timing0 since this is the initial lvds-channel
- Rename hsd100pxn1 reference as timing0
- Update the correct patch author
Changes for v2:
- none

 arch/arm/boot/dts/Makefile|  1 +
 arch/arm/boot/dts/imx6q-icore-ofcap12.dts | 76 +++
 2 files changed, 77 insertions(+)
 create mode 100644 arch/arm/boot/dts/imx6q-icore-ofcap12.dts

diff --git a/arch/arm/boot/dts/Makefile b/arch/arm/boot/dts/Makefile
index 458ec09..adfdaee 100644
--- a/arch/arm/boot/dts/Makefile
+++ b/arch/arm/boot/dts/Makefile
@@ -399,6 +399,7 @@ dtb-$(CONFIG_SOC_IMX6Q) += \
imx6q-hummingboard.dtb \
imx6q-icore.dtb \
imx6q-icore-ofcap10.dtb \
+   imx6q-icore-ofcap12.dtb \
imx6q-icore-rqs.dtb \
imx6q-marsboard.dtb \
imx6q-mccmon6.dtb \
diff --git a/arch/arm/boot/dts/imx6q-icore-ofcap12.dts b/arch/arm/boot/dts/imx6q-icore-ofcap12.dts
new file mode 100644
index 000..9e230f5
--- /dev/null
+++ b/arch/arm/boot/dts/imx6q-icore-ofcap12.dts
@@ -0,0 +1,76 @@
+/*
+ * Copyright (C) 2016 Amarula Solutions B.V.
+ * Copyright (C) 2016 Engicam S.r.l.
+ *
+ * This file is dual-licensed: you can use it either under the terms
+ * of the GPL or the X11 license, at your option. Note that this dual
+ * licensing only applies to this file, and not this project as a
+ * whole.
+ *
+ *  a) This file is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * version 2 as published by the Free Software Foundation.
+ *
+ * This file is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * Or, alternatively,
+ *
+ *  b) Permission is hereby granted, free of charge, to any person
+ * obtaining a copy of this software and associated documentation
+ * files (the "Software"), to deal in the Software without
+ * restriction, including without limitation the rights to use,
+ * copy, modify, merge, publish, distribute, sublicense, and/or
+ * sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following
+ * conditions:
+ *
+ * The above copyright notice and this permission notice shall be
+ * included in all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
+ * OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
+ * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
+ * WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ */
+
+/dts-v1/;
+
+#include "imx6q.dtsi"
+#include "imx6qdl-icore.dtsi"
+
+/ {
+   model = "Engicam i.CoreM6 Quad/Dual OpenFrame Capacitive touch 12 Kit";
+   compatible = "engicam,imx6-icore", "fsl,imx6q";
+};
+
+ {
+   status = "okay";
+
+   lvds-channel@0 {
+   fsl,data-mapping = "spwg";
+   fsl,data-width = <18>;
+   status = "okay";
+
+   display-timings {
+   native-mode = <>;
+   timing0: timing0 {
+   clock-frequency = <4680>;
+   hactive = <1280>;
+   vactive = <480>;
+   hback-porch = <353>;
+   hfront-porch = <47>;
+   vback-porch = <39>;
+   vfront-porch = <4>;
+   hsync-len = <8>;
+   vsync-len = <2>;
+   };
+   };
+   };
+};
-- 
1.9.1
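
As a sanity check on the display-timings node above, the frame totals follow directly from the listed properties: 1280 + 353 + 47 + 8 = 1688 pixels per line and 480 + 39 + 4 + 2 = 525 lines per frame. The sketch below shows that arithmetic in a minimal, hypothetical form. `struct model_timing` and the `model_*` helpers are illustrative names, not kernel APIs, and the pixel clock is taken as a parameter because the clock-frequency value in this archived copy cannot be relied on.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mirror of the display-timings properties in the patch. */
struct model_timing {
	uint32_t hactive, hback_porch, hfront_porch, hsync_len;
	uint32_t vactive, vback_porch, vfront_porch, vsync_len;
};

/* Total pixels per scan line, including blanking. */
uint32_t model_htotal(const struct model_timing *t)
{
	return t->hactive + t->hback_porch + t->hfront_porch + t->hsync_len;
}

/* Total lines per frame, including blanking. */
uint32_t model_vtotal(const struct model_timing *t)
{
	return t->vactive + t->vback_porch + t->vfront_porch + t->vsync_len;
}

/* Refresh rate in millihertz for a caller-supplied pixel clock in Hz. */
uint64_t model_refresh_millihz(const struct model_timing *t,
			       uint64_t pixel_clock_hz)
{
	uint64_t pixels_per_frame =
		(uint64_t)model_htotal(t) * model_vtotal(t);

	assert(pixels_per_frame != 0);
	return pixel_clock_hz * 1000 / pixels_per_frame;
}
```

Passing the real pixel clock from the board's device tree to `model_refresh_millihz()` gives the refresh rate the panel would be driven at, which is a quick way to cross-check porch and sync values against a panel datasheet.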



Re: [PATCH v5 1/5] arm64: dts: exynos: Add the burst and esc clock frequency properties to DSI node

2017-03-07 Thread Krzysztof Kozlowski
On Wed, Mar 08, 2017 at 01:54:08PM +0900, Hoegeun Kwon wrote:
> Add the burst and esc clock frequency properties to the parent (DSI node).
> Currently the clock is parsed from the port node, while it should be
> taken from the dsi node.
> 
> Signed-off-by: Hoegeun Kwon 
> Reviewed-by: Andrzej Hajda 
> Reviewed-by: Andi Shyti 
> ---
>  arch/arm64/boot/dts/exynos/exynos5433-tm2-common.dtsi | 2 ++
>  1 file changed, 2 insertions(+)

If anyone needs this, then:


The following changes since commit c1ae3cfa0e89fa1a7ecc4c99031f5e9ae99d9201:

  Linux 4.11-rc1 (2017-03-05 12:59:56 -0800)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux.git tags/samsung-dt64-clock-dsi-property-4.12

for you to fetch changes up to e3c07546747cdec07ff15c984bc6cebc9c9f788c:

  arm64: dts: exynos: Add the burst and esc clock frequency properties to DSI node (2017-03-08 08:55:39 +0200)


Add the burst and esc clock frequency properties to DSI node in Exynos ARM64
DeviceTree, before the OF graph could be parsed.


Andi Shyti (2):
  arm64: dts: exynos: Enable ir-spi in the TM2 and TM2E boards
  arm64: dts: exynos: Add stmfts touchscreen node for TM2 and TM2E

Hoegeun Kwon (1):
  arm64: dts: exynos: Add the burst and esc clock frequency properties to DSI node

Hyungwon Hwang (1):
  arm64: dts: exynos: Add support for S6E3HA2 panel device on TM2 board

 .../boot/dts/exynos/exynos5433-tm2-common.dtsi | 43 +-
 arch/arm64/boot/dts/exynos/exynos5433-tm2.dts  | 17 +
 arch/arm64/boot/dts/exynos/exynos5433-tm2e.dts |  7 
 3 files changed, 65 insertions(+), 2 deletions(-)


[PATCH v4 4/7] arm: dts: imx6q-icore: Add LVDS support

2017-03-07 Thread Jagan Teki
From: Jagan Teki 

Add LVDS display support for the OpenFrame Capacitive touch 7 inch
display, which is supported by the Engicam i.CoreM6 QDL Starter Kit.

Cc: Domenico Acri 
Cc: Matteo Lisi 
Cc: Michael Trimarchi 
Cc: Shawn Guo 
Signed-off-by: Jagan Teki 
---
Changes for v4:
- Fix checkpatch.pl Errors/Warnings
Changes for v3:
- Use native-mode as timing0 since this is the initial lvds-channel
- Rename hsd100pxn1 reference as timing0
- Update the correct patch author
Changes for v2:
- none

 arch/arm/boot/dts/imx6q-icore.dts | 25 +
 1 file changed, 25 insertions(+)

diff --git a/arch/arm/boot/dts/imx6q-icore.dts b/arch/arm/boot/dts/imx6q-icore.dts
index 59eb7ad..73f34d1 100644
--- a/arch/arm/boot/dts/imx6q-icore.dts
+++ b/arch/arm/boot/dts/imx6q-icore.dts
@@ -57,3 +57,28 @@
  {
status = "okay";
 };
+
+ {
+   status = "okay";
+
+   lvds-channel@0 {
+   fsl,data-mapping = "spwg";
+   fsl,data-width = <18>;
+   status = "okay";
+
+   display-timings {
+   native-mode = <>;
+   timing0: timing0 {
+   clock-frequency = <6000>;
+   hactive = <800>;
+   vactive = <480>;
+   hback-porch = <30>;
+   hfront-porch = <30>;
+   vback-porch = <5>;
+   vfront-porch = <5>;
+   hsync-len = <64>;
+   vsync-len = <20>;
+   };
+   };
+   };
+};
-- 
1.9.1



random-dev misses merge window?

2017-03-07 Thread Jason A. Donenfeld
Hey Ted,

I was disappointed to see that this series:

https://git.kernel.org/cgit/linux/kernel/git/tytso/random.git/log/?h=dev

missed 4.11-rc1, especially considering those patches were ready since
January, and I had sent you several reminders to even get them merged
in your random.git tree. Did you simply forget, or is there just
something procedural we're waiting for in the 4.11 series?

IOW, can we make this happen?

Thanks,
Jason


[PATCH v4 6/7] arm: imx_v6_v7_defconfig: Select max11801_ts touchscreen driver

2017-03-07 Thread Jagan Teki
From: Jagan Teki 

Select CONFIG_TOUCHSCREEN_MAX11801 so that touchscreen
functionality is available by default on Engicam i.CoreM6 Quad boards.

Cc: Matteo Lisi 
Cc: Michael Trimarchi 
Cc: Shawn Guo 
Signed-off-by: Jagan Teki 
---
Changes for v4:
- Newly added patch.

 arch/arm/boot/dts/imx6q-icore.dts | 9 +
 1 file changed, 9 insertions(+)
 arch/arm/configs/imx_v6_v7_defconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm/configs/imx_v6_v7_defconfig b/arch/arm/configs/imx_v6_v7_defconfig
index eaba3b1..842168f 100644
--- a/arch/arm/configs/imx_v6_v7_defconfig
+++ b/arch/arm/configs/imx_v6_v7_defconfig
@@ -165,6 +165,7 @@ CONFIG_TOUCHSCREEN_ADS7846=y
 CONFIG_TOUCHSCREEN_EGALAX=y
 CONFIG_TOUCHSCREEN_IMX6UL_TSC=y
 CONFIG_TOUCHSCREEN_EDT_FT5X06=y
+CONFIG_TOUCHSCREEN_MAX11801=y
 CONFIG_TOUCHSCREEN_MC13783=y
 CONFIG_TOUCHSCREEN_TSC2004=y
 CONFIG_TOUCHSCREEN_TSC2007=y
-- 
1.9.1



[PATCH v4 5/7] arm: dts: imx6q-icore: Add touchscreen node

2017-03-07 Thread Jagan Teki
From: Jagan Teki 

The max11801 touchscreen on the Engicam iCoreM6 Quad module is
connected via i2c1, so add ts: max11801@48 on i2c1.

Cc: Domenico Acri 
Cc: Matteo Lisi 
Cc: Michael Trimarchi 
Cc: Shawn Guo 
Signed-off-by: Jagan Teki 
---
Changes for v4:
- Newly added patch.

 arch/arm/boot/dts/imx6q-icore.dts | 9 +
 1 file changed, 9 insertions(+)

diff --git a/arch/arm/boot/dts/imx6q-icore.dts b/arch/arm/boot/dts/imx6q-icore.dts
index 73f34d1..8c1a572 100644
--- a/arch/arm/boot/dts/imx6q-icore.dts
+++ b/arch/arm/boot/dts/imx6q-icore.dts
@@ -58,6 +58,15 @@
status = "okay";
 };
 
+ {
+   ts: max11801@48 {
+   compatible = "max11801";
+   reg = <0x48>;
+   interrupt-parent = <>;
+   interrupts = <31 IRQ_TYPE_EDGE_FALLING>;
+   };
+};
+
  {
status = "okay";
 
-- 
1.9.1



Re: [PATCH v5 2/5] arm: dts: Add the burst and esc clock frequency properties to DSI node

2017-03-07 Thread Krzysztof Kozlowski
On Wed, Mar 08, 2017 at 01:54:09PM +0900, Hoegeun Kwon wrote:
> Add the burst and esc clock frequency properties to the parent (DSI node).
> Currently the clock is parsed from the port node, while it should be
> taken from the dsi node.
> 
> Signed-off-by: Hoegeun Kwon 
> Reviewed-by: Andrzej Hajda 
> Reviewed-by: Andi Shyti 
> ---
>  arch/arm/boot/dts/exynos3250-rinato.dts | 2 ++
>  arch/arm/boot/dts/exynos4210-trats.dts  | 2 ++
>  arch/arm/boot/dts/exynos4412-trats2.dts | 2 ++
>  3 files changed, 6 insertions(+)
>

If anyone needs this, then:


The following changes since commit c1ae3cfa0e89fa1a7ecc4c99031f5e9ae99d9201:

  Linux 4.11-rc1 (2017-03-05 12:59:56 -0800)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux.git tags/samsung-dt-clock-dsi-property-4.12

for you to fetch changes up to 4c74ea4e20b582bf90a2cf509d88aa7c2dbffb12:

  ARM: dts: exynos: Add the burst and esc clock frequency properties to DSI node (2017-03-08 09:09:05 +0200)


Add the burst and esc clock frequency properties to DSI node in Exynos ARM
DeviceTree, before the OF graph could be parsed.


Hoegeun Kwon (1):
  ARM: dts: exynos: Add the burst and esc clock frequency properties to DSI node

Jaehoon Chung (1):
  ARM: dts: exynos: Add phy-pcie node for pcie to Exynos5440

Krzysztof Kozlowski (1):
  ARM: dts: exynos: Do not ignore real-world fuse values for thermal zone 0 on Exynos5420

 arch/arm/boot/dts/exynos3250-rinato.dts   |  2 ++
 arch/arm/boot/dts/exynos4210-trats.dts|  2 ++
 arch/arm/boot/dts/exynos4412-trats2.dts   |  2 ++
 arch/arm/boot/dts/exynos5420-tmu-sensor-conf.dtsi | 25 +++
 arch/arm/boot/dts/exynos5420.dtsi | 10 
 arch/arm/boot/dts/exynos5440.dtsi | 30 +++
 6 files changed, 56 insertions(+), 15 deletions(-)
 create mode 100644 arch/arm/boot/dts/exynos5420-tmu-sensor-conf.dtsi



Re: [RFC PATCH v4 25/28] x86: Access the setup data through sysfs decrypted

2017-03-07 Thread Dave Young
On 02/16/17 at 09:47am, Tom Lendacky wrote:
> Use memremap() to map the setup data.  This will make the appropriate
> decision as to whether a RAM remapping can be done or if a fallback to
> ioremap_cache() is needed (similar to the setup data debugfs support).
> 
> Signed-off-by: Tom Lendacky 
> ---
>  arch/x86/kernel/ksysfs.c |   27 ++-
>  1 file changed, 14 insertions(+), 13 deletions(-)
> 
> diff --git a/arch/x86/kernel/ksysfs.c b/arch/x86/kernel/ksysfs.c
> index 4afc67f..d653b3e 100644
> --- a/arch/x86/kernel/ksysfs.c
> +++ b/arch/x86/kernel/ksysfs.c
> @@ -16,6 +16,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include 
>  #include 
> @@ -79,12 +80,12 @@ static int get_setup_data_paddr(int nr, u64 *paddr)
>   *paddr = pa_data;
>   return 0;
>   }
> - data = ioremap_cache(pa_data, sizeof(*data));
> + data = memremap(pa_data, sizeof(*data), MEMREMAP_WB);
>   if (!data)
>   return -ENOMEM;
>  
>   pa_data = data->next;
> - iounmap(data);
> + memunmap(data);
>   i++;
>   }
>   return -EINVAL;
> @@ -97,17 +98,17 @@ static int __init get_setup_data_size(int nr, size_t *size)
>   u64 pa_data = boot_params.hdr.setup_data;
>  
>   while (pa_data) {
> - data = ioremap_cache(pa_data, sizeof(*data));
> + data = memremap(pa_data, sizeof(*data), MEMREMAP_WB);
>   if (!data)
>   return -ENOMEM;
>   if (nr == i) {
>   *size = data->len;
> - iounmap(data);
> + memunmap(data);
>   return 0;
>   }
>  
>   pa_data = data->next;
> - iounmap(data);
> + memunmap(data);
>   i++;
>   }
>   return -EINVAL;
> @@ -127,12 +128,12 @@ static ssize_t type_show(struct kobject *kobj,
>   ret = get_setup_data_paddr(nr, );
>   if (ret)
>   return ret;
> - data = ioremap_cache(paddr, sizeof(*data));
> + data = memremap(paddr, sizeof(*data), MEMREMAP_WB);
>   if (!data)
>   return -ENOMEM;
>  
>   ret = sprintf(buf, "0x%x\n", data->type);
> - iounmap(data);
> + memunmap(data);
>   return ret;
>  }
>  
> @@ -154,7 +155,7 @@ static ssize_t setup_data_data_read(struct file *fp,
>   ret = get_setup_data_paddr(nr, );
>   if (ret)
>   return ret;
> - data = ioremap_cache(paddr, sizeof(*data));
> + data = memremap(paddr, sizeof(*data), MEMREMAP_WB);
>   if (!data)
>   return -ENOMEM;
>  
> @@ -170,15 +171,15 @@ static ssize_t setup_data_data_read(struct file *fp,
>   goto out;
>  
>   ret = count;
> - p = ioremap_cache(paddr + sizeof(*data), data->len);
> + p = memremap(paddr + sizeof(*data), data->len, MEMREMAP_WB);
>   if (!p) {
>   ret = -ENOMEM;
>   goto out;
>   }
>   memcpy(buf, p + off, count);
> - iounmap(p);
> + memunmap(p);
>  out:
> - iounmap(data);
> + memunmap(data);
>   return ret;
>  }
>  
> @@ -250,13 +251,13 @@ static int __init get_setup_data_total_num(u64 pa_data, int *nr)
>   *nr = 0;
>   while (pa_data) {
>   *nr += 1;
> - data = ioremap_cache(pa_data, sizeof(*data));
> + data = memremap(pa_data, sizeof(*data), MEMREMAP_WB);
>   if (!data) {
>   ret = -ENOMEM;
>   goto out;
>   }
>   pa_data = data->next;
> - iounmap(data);
> + memunmap(data);
>   }
>  
>  out:
> 

It would be better if these cleanup patches were sent separately.

Acked-by: Dave Young 

Thanks
Dave
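
To illustrate the loop pattern this patch touches (map a header, read its next pointer, unmap, repeat), here is a minimal userspace sketch. It is not the kernel code: fake_ram, fake_memremap(), fake_memunmap() and struct fake_setup_data are hypothetical stand-ins for physical memory, memremap(), memunmap() and struct setup_data, and the header is copied out with memcpy() to keep the example portable.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the setup_data header. */
struct fake_setup_data {
	uint64_t next;	/* "physical" offset of the next node, 0 terminates */
	uint32_t type;
	uint32_t len;	/* payload length in bytes */
};

static unsigned char fake_ram[256];	/* pretend physical memory */
static int live_mappings;		/* map/unmap balance, for checking */

/* Stand-in for memremap(): NULL when the range is out of bounds. */
static const unsigned char *fake_memremap(uint64_t pa, size_t size)
{
	if (pa > sizeof(fake_ram) || size > sizeof(fake_ram) - pa)
		return NULL;
	live_mappings++;
	return fake_ram + pa;
}

/* Stand-in for memunmap(). */
static void fake_memunmap(const unsigned char *p)
{
	(void)p;
	live_mappings--;
}

/* Count the nodes in the list, one short-lived mapping per node. */
static int count_setup_data(uint64_t pa_data, int *nr)
{
	struct fake_setup_data hdr;
	const unsigned char *p;

	*nr = 0;
	while (pa_data) {
		p = fake_memremap(pa_data, sizeof(hdr));
		if (!p)
			return -ENOMEM;
		memcpy(&hdr, p, sizeof(hdr));
		*nr += 1;
		pa_data = hdr.next;	/* fetched before the mapping is dropped */
		fake_memunmap(p);
	}
	return 0;
}

/* Build a three-node list at offsets 16, 64 and 128. */
static void build_fake_list(void)
{
	static const uint64_t at[] = { 16, 64, 128 };
	struct fake_setup_data n;
	size_t i;

	memset(fake_ram, 0, sizeof(fake_ram));
	for (i = 0; i < 3; i++) {
		n.next = (i < 2) ? at[i + 1] : 0;
		n.type = (uint32_t)i;
		n.len = 0;
		memcpy(fake_ram + at[i], &n, sizeof(n));
	}
}
```

The point of the sketch is only that each iteration pairs exactly one map with one unmap, which is what the ioremap_cache()/iounmap() to memremap()/memunmap() conversion has to preserve.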


Re: [PATCH v4 3/4] dt-bindings: phy: Add support for QMP phy

2017-03-07 Thread Vivek Gautam



On 03/07/2017 07:30 PM, Stephen Boyd wrote:

(Not sure I replied so here it is)

On 01/27, Vivek Gautam wrote:


On 01/27/2017 05:13 AM, Stephen Boyd wrote:

On 01/24, Vivek Gautam wrote:

 From "./Documentation/devicetree/bindings/graph.txt" -
"The device tree graph bindings described herein abstract more complex
devices that can have multiple specifiable ports, each of which can be
linked to one or more ports of other devices."

So, this means we use 'port', 'ports' and 'endpoint' for devices that have one
or more ports connected to one or more ports of another device.

I can use 'lane' for the node name here.

Ok.


 reg = <0x035000 0x130>,
 <0x035200 0x200>,
 <0x035400 0x1dc>;
 #phy-cells = <0>;

 clocks = < GCC_PCIE_0_PIPE_CLK>;
 clock-names = "pipe0";
 resets = < GCC_PCIE_0_PHY_BCR>;
 reset-names = "lane0";
 };

   pciephy_p1: port@1 {
 reg = <0x036000 0x130>,
 <0x036200 0x200>,
 <0x036400 0x1dc>;
 #phy-cells = <0>;

 clocks = < GCC_PCIE_1_PIPE_CLK>;
 clock-names = "pipe1";
 resets = < GCC_PCIE_1_PHY_BCR>;
 reset-names = "lane1";
 };

 pciephy_p2: port@2 {
 reg = <0x037000 0x130>,
 <0x037200 0x200>,
 <0x037400 0x1dc>;
 #phy-cells = <0>;

 clocks = < GCC_PCIE_2_PIPE_CLK>;
 clock-names = "pipe2";
 resets = < GCC_PCIE_2_PHY_BCR>;
 reset-names = "lane2";
 };
 };


let me know if this looks okay.



What's the plan for non-pcie qmp phy binding? In that case we
don't have ports, so it gets folded into one node?


The non-pcie qmp phys still have one lane, that provides tx/rx.

I am of the opinion that we shouldn't have two different ways to create
phys in the driver, and should keep one port/lane for such phys in dt.


Ok so we would still have a subnode in that case. Sounds ok.


Cool.

Thanks
Vivek

--
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project



Re: [RFC v2 10/10] mm, page_alloc: introduce MIGRATE_MIXED migratetype

2017-03-07 Thread Vlastimil Babka
On 03/08/2017 03:16 AM, Yisheng Xie wrote:
> Hi Vlastimil ,
> 
> On 2017/2/11 1:23, Vlastimil Babka wrote:
>> @@ -1977,7 +1978,7 @@ static void steal_suitable_fallback(struct zone *zone, 
>> struct page *page,
>>  unsigned int current_order = page_order(page);
>>  struct free_area *area;
>>  int free_pages, good_pages;
>> -int old_block_type;
>> +int old_block_type, new_block_type;
>>  
>>  /* Take ownership for orders >= pageblock_order */
>>  if (current_order >= pageblock_order) {
>> @@ -1991,11 +1992,27 @@ static void steal_suitable_fallback(struct zone 
>> *zone, struct page *page,
>>  if (!whole_block) {
>>  area = &zone->free_area[current_order];
>>  list_move(&page->lru, &area->free_list[start_type]);
>> -return;
>> +free_pages = 1 << current_order;
>> +/* TODO: We didn't scan the block, so be pessimistic */
>> +good_pages = 0;
>> +} else {
>> +free_pages = move_freepages_block(zone, page, start_type,
>> &good_pages);
>> +/*
>> + * good_pages is now the number of movable pages, but if we
>> + * want UNMOVABLE or RECLAIMABLE, we consider all non-movable
>> + * as good (but we can't fully distinguish them)
>> + */
>> +if (start_type != MIGRATE_MOVABLE)
>> +good_pages = pageblock_nr_pages - free_pages -
>> +good_pages;
>>  }
>>  
>>  free_pages = move_freepages_block(zone, page, start_type,
>>  &good_pages);
> It seems this move_freepages_block() should be removed, if we can steal whole 
> block
> then just  do it. If not we can check whether we can set it as mixed mt, 
> right?
> Please let me know if I miss something..

Right. My results suggested this patch was buggy, so this might be the
bug (or one of the bugs), thanks for pointing it out. I've reposted v3
without the RFC patches 9 and 10 and will return to them later.
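
For reference, the good_pages arithmetic in the quoted hunk can be sketched in plain C as below. This is only an illustration of the comment in the patch, not the kernel code: SKETCH_PAGEBLOCK_NR_PAGES and sketch_good_pages() are made-up names, and the block size of 512 pages is an assumption.

```c
#include <stdbool.h>

enum { SKETCH_PAGEBLOCK_NR_PAGES = 512 };	/* assumed pageblock size */

/*
 * free_pages:    pages of the block that were free and got moved
 * movable_pages: in-use pages of the block that look movable
 *
 * For a MOVABLE request the movable pages are the "good" ones. For
 * UNMOVABLE or RECLAIMABLE, every in-use page that is not movable is
 * treated as good, since the two cannot be told apart here.
 */
static int sketch_good_pages(bool want_movable, int free_pages,
			     int movable_pages)
{
	if (want_movable)
		return movable_pages;
	return SKETCH_PAGEBLOCK_NR_PAGES - free_pages - movable_pages;
}
```

With 100 free and 300 movable pages in a 512-page block, an unmovable request would see 112 good pages. A second, unconditional move_freepages_block() call after the if/else overwrites both counters, which is the problem pointed out above.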



Re: [RFC PATCH v4 24/28] x86: Access the setup data through debugfs decrypted

2017-03-07 Thread Dave Young
On 02/16/17 at 09:47am, Tom Lendacky wrote:
> Use memremap() to map the setup data.  This simplifies the code and will
> make the appropriate decision as to whether a RAM remapping can be done
> or if a fallback to ioremap_cache() is needed (which includes checking
> PageHighMem).
> 
> Signed-off-by: Tom Lendacky 
> ---
>  arch/x86/kernel/kdebugfs.c |   30 +++---
>  1 file changed, 11 insertions(+), 19 deletions(-)
> 
> diff --git a/arch/x86/kernel/kdebugfs.c b/arch/x86/kernel/kdebugfs.c
> index bdb83e4..c3d354d 100644
> --- a/arch/x86/kernel/kdebugfs.c
> +++ b/arch/x86/kernel/kdebugfs.c
> @@ -48,17 +48,13 @@ static ssize_t setup_data_read(struct file *file, char 
> __user *user_buf,
>  
>   pa = node->paddr + sizeof(struct setup_data) + pos;
>   pg = pfn_to_page((pa + count - 1) >> PAGE_SHIFT);
> - if (PageHighMem(pg)) {
> - p = ioremap_cache(pa, count);
> - if (!p)
> - return -ENXIO;
> - } else
> - p = __va(pa);
> + p = memremap(pa, count, MEMREMAP_WB);
> + if (!p)
> + return -ENXIO;

-ENOMEM looks better for memremap, ditto for other places..

>  
>   remain = copy_to_user(user_buf, p, count);
>  
> - if (PageHighMem(pg))
> - iounmap(p);
> + memunmap(p);
>  
>   if (remain)
>   return -EFAULT;
> @@ -127,15 +123,12 @@ static int __init create_setup_data_nodes(struct dentry 
> *parent)
>   }
>  
>   pg = pfn_to_page((pa_data+sizeof(*data)-1) >> PAGE_SHIFT);
> - if (PageHighMem(pg)) {
> - data = ioremap_cache(pa_data, sizeof(*data));
> - if (!data) {
> - kfree(node);
> - error = -ENXIO;
> - goto err_dir;
> - }
> - } else
> - data = __va(pa_data);
> + data = memremap(pa_data, sizeof(*data), MEMREMAP_WB);
> + if (!data) {
> + kfree(node);
> + error = -ENXIO;
> + goto err_dir;
> + }
>  
>   node->paddr = pa_data;
>   node->type = data->type;
> @@ -143,8 +136,7 @@ static int __init create_setup_data_nodes(struct dentry 
> *parent)
>   error = create_setup_data_node(d, no, node);
>   pa_data = data->next;
>  
> - if (PageHighMem(pg))
> - iounmap(data);
> + memunmap(data);
>   if (error)
>   goto err_dir;
>   no++;
> 

Thanks
Dave


Re: [RFC PATCH v4 14/28] Add support to access boot related data in the clear

2017-03-07 Thread Dave Young
On 02/16/17 at 09:45am, Tom Lendacky wrote:
[snip]
> + * This function determines if an address should be mapped encrypted.
> + * Boot setup data, EFI data and E820 areas are checked in making this
> + * determination.
> + */
> +static bool memremap_should_map_encrypted(resource_size_t phys_addr,
> +   unsigned long size)
> +{
> + /*
> +  * SME is not active, return true:
> +  *   - For early_memremap_pgprot_adjust(), returning true or false
> +  * results in the same protection value
> +  *   - For arch_memremap_do_ram_remap(), returning true will allow
> +  * the RAM remap to occur instead of falling back to ioremap()
> +  */
> + if (!sme_active())
> + return true;

From the function name, shouldn't the above return false?

> +
> + /* Check if the address is part of the setup data */
> + if (memremap_is_setup_data(phys_addr, size))
> + return false;
> +
> + /* Check if the address is part of EFI boot/runtime data */
> + switch (efi_mem_type(phys_addr)) {
> + case EFI_BOOT_SERVICES_DATA:
> + case EFI_RUNTIME_SERVICES_DATA:

Are only these two types needed? I'm not sure about this, just raising the
question.

> + return false;
> + default:
> + break;
> + }
> +
> + /* Check if the address is outside kernel usable area */
> + switch (e820__get_entry_type(phys_addr, phys_addr + size - 1)) {
> + case E820_TYPE_RESERVED:
> + case E820_TYPE_ACPI:
> + case E820_TYPE_NVS:
> + case E820_TYPE_UNUSABLE:
> + return false;
> + default:
> + break;
> + }
> +
> + return true;
> +}
> +

Thanks
Dave
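
To make the decision order in the quoted function easier to follow, here is a minimal userspace model of it. It is a sketch under assumptions: the enums, struct sketch_region and sketch_should_map_encrypted() are hypothetical, and the real lookups (memremap_is_setup_data(), efi_mem_type(), the e820 entry type) are replaced by precomputed fields.

```c
#include <stdbool.h>

enum sketch_efi_type {
	SK_EFI_OTHER,
	SK_EFI_BOOT_SERVICES_DATA,
	SK_EFI_RUNTIME_SERVICES_DATA,
};

enum sketch_e820_type {
	SK_E820_RAM,
	SK_E820_RESERVED,
	SK_E820_ACPI,
	SK_E820_NVS,
	SK_E820_UNUSABLE,
};

/* Precomputed answers to the lookups the kernel function performs. */
struct sketch_region {
	bool is_setup_data;
	enum sketch_efi_type efi;
	enum sketch_e820_type e820;
};

static bool sketch_should_map_encrypted(bool sme_active,
					const struct sketch_region *r)
{
	/* Without SME the answer does not change the protection value. */
	if (!sme_active)
		return true;

	/* Boot setup data was written unencrypted. */
	if (r->is_setup_data)
		return false;

	/* EFI boot/runtime services data is likewise in the clear. */
	switch (r->efi) {
	case SK_EFI_BOOT_SERVICES_DATA:
	case SK_EFI_RUNTIME_SERVICES_DATA:
		return false;
	default:
		break;
	}

	/* Areas outside the kernel-usable range are not encrypted either. */
	switch (r->e820) {
	case SK_E820_RESERVED:
	case SK_E820_ACPI:
	case SK_E820_NVS:
	case SK_E820_UNUSABLE:
		return false;
	default:
		break;
	}

	return true;
}
```

The model shows why the name is confusing when SME is off: the function answers "true" even for setup data, because per the quoted comment the return value then only selects the RAM remap path and has no effect on encryption.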


Re: [RFC 06/11] mm: remove SWAP_MLOCK in ttu

2017-03-07 Thread Minchan Kim
On Tue, Mar 07, 2017 at 06:24:37PM +0300, Kirill A. Shutemov wrote:
> On Mon, Mar 06, 2017 at 11:15:08AM +0900, Minchan Kim wrote:
> > Hi Anshuman,
> > 
> > On Fri, Mar 03, 2017 at 06:06:38PM +0530, Anshuman Khandual wrote:
> > > On 03/02/2017 12:09 PM, Minchan Kim wrote:
> > > > ttu doesn't need to return SWAP_MLOCK. Instead, just return SWAP_FAIL
> > > > because it means the page is not swappable, so it should move to
> > > > another LRU list (active or unevictable). putback friends will
> > > > move it to the right list depending on the page's LRU flag.
> > > 
> > > Right, if it cannot be swapped out there is not much difference with
> > > SWAP_FAIL once we change the callers who expected to see a SWAP_MLOCK
> > > return instead.
> > > 
> > > > 
> > > > A side effect is shrink_page_list accounts unevictable list movement
> > > > by PGACTIVATE but I don't think it corrupts something severe.
> > > 
> > > Not sure I got that, could you please elaborate on this. We will still
> > > activate the page and put it in an appropriate LRU list if it is marked
> > > mlocked ?
> > 
> > Right. putback_inactive_pages/putback_lru_page has logic to filter
> > out unevictable pages and move them to the unevictable LRU list, so it
> > doesn't break LRU behavior. The concern is that until now
> > we have accounted PGACTIVATE only for evictable LRU list pages, but
> > with this change it is accounted for unevictable LRU list pages as well.
> > Although I don't think it's a big problem in practice,
> > we can fix it simply by checking PG_mlocked if someone reports it.
> 
> I think it's better to do this pro-actively. Let's hide both pgactivate++
> and SetPageActive() under "if (!PageMlocked())".
> SetPageActive() is not free.

I will consider it in the next spin.

Thanks!
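
A rough userspace model of Kirill's suggestion, for illustration only: struct sketch_page, sketch_pgactivate and sketch_activate_locked() are hypothetical names standing in for struct page, the PGACTIVATE counter and the activate path in shrink_page_list().

```c
#include <stdbool.h>

/* Hypothetical reduction of struct page to the two flags that matter. */
struct sketch_page {
	bool mlocked;	/* models PageMlocked() */
	bool active;	/* models PageActive() */
};

static unsigned long sketch_pgactivate;	/* models the PGACTIVATE counter */

/*
 * Activate path taken when the page could not be reclaimed: mlocked
 * pages are headed for the unevictable list, so they are neither
 * marked active nor counted.
 */
static void sketch_activate_locked(struct sketch_page *page)
{
	if (!page->mlocked) {
		page->active = true;
		sketch_pgactivate++;
	}
}
```

This keeps PGACTIVATE limited to evictable pages and also skips the flag update for pages that will end up on the unevictable list anyway.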


Re: [PATCH v5 2/5] arm: dts: Add the burst and esc clock frequency properties to DSI node

2017-03-07 Thread Krzysztof Kozlowski
On Wed, Mar 08, 2017 at 01:54:09PM +0900, Hoegeun Kwon wrote:
> Add the burst and esc clock frequency properties to the parent (DSI node).
> Currently the clock is parsed from the port node, while it should be
> taken from the dsi node.
> 
> Signed-off-by: Hoegeun Kwon 
> Reviewed-by: Andrzej Hajda 
> Reviewed-by: Andi Shyti 
> ---
>  arch/arm/boot/dts/exynos3250-rinato.dts | 2 ++
>  arch/arm/boot/dts/exynos4210-trats.dts  | 2 ++
>  arch/arm/boot/dts/exynos4412-trats2.dts | 2 ++
>  3 files changed, 6 insertions(+)
> 

Thanks, applied.

Best regards,
Krzysztof



Re: [PATCH v5 1/5] arm64: dts: exynos: Add the burst and esc clock frequency properties to DSI node

2017-03-07 Thread Krzysztof Kozlowski
On Wed, Mar 08, 2017 at 01:54:08PM +0900, Hoegeun Kwon wrote:
> Add the burst and esc clock frequency properties to the parent (DSI node).
> Currently the clock is parsed from the port node, while it should be
> taken from the dsi node.
> 
> Signed-off-by: Hoegeun Kwon 
> Reviewed-by: Andrzej Hajda 
> Reviewed-by: Andi Shyti 
> ---
>  arch/arm64/boot/dts/exynos/exynos5433-tm2-common.dtsi | 2 ++
>  1 file changed, 2 insertions(+)
> 

Thanks, applied.

I'll prepare tags with these.

Best regards,
Krzysztof



RE: [PATCH v17 2/3] usb: USB Type-C connector class

2017-03-07 Thread Peter Chen
 
>>> You mean type-C trigger an ACPI event, and this ACPI event can notify
>>> related USB controller driver doing role switch?
>>
>> No (firmware programs the dual-role hw/registers), but never mind.
>> That could be the case.
>>
>>> If it is correct, there is a notifier between type-C and USB
>>> controller driver, how to define this notifier for non-ACPI platform?
>>
>> Once there is a platform with Type-C like that, the problem needs to
>> be solved. However..
>>
>>>> I'm not commenting on Roger's dual role patch series, but I don't
>>>> really think it should be mixed with Type-C. USB Type-C and USB
>>>> Power Delivery define their own ways of handling the roles, and they
>>>> are not limited to the data role only. Things like OTG for example
>>>> will, and actually can not be supported. With Type-C we will have
>>>> competing state machines compared to OTG. The dual-role framework
>>>> may be useful on systems that provide more traditional connectors,
>>>> which possibly have the ID-pin like micro-AB, and possibly also
>>>> support OTG. It can also be something that exist in parallel with the
>>>> Type-C class, but there just can not be any dependencies between the two.

>>>
>>> Yes, there are two independent things. But if the kernel doesn't have
>>> a notifier between type-C message sender (type-c class) and message
>>> receiver (like USB controller driver for role switch or other drivers
>>> for alternate mode message), we had to find some ways at userspace.
>>
>> ..what ever the solution is, it really can't rely on user space.
>>
>
>... and, at least for our application, using extcon for the necessary
>notifications works just fine.
>

I see, that means you have a hardware signal to notify role switch.

Peter


Re: [PATCH v3 0/4] fujitsu_init() cleanup

2017-03-07 Thread Jonathan Woithe
On Tue, Mar 07, 2017 at 11:15:12AM +0100, Michał Kępień wrote:
> These patches should make fujitsu_init() a bit more palatable.  No
> changes are made to platform device code yet, for clarity these will be
> posted in a separate series after this one gets applied.

I will test and review these as soon as possible.  It is likely to be later
this week or over the weekend.  I suspect this version will be good to go
but we should confirm this. :-)

Regards
  jonathan


Re: [PATCH v11 3/3] arm64: dts: exynos: Add support for S6E3HA2 panel device on TM2 board

2017-03-07 Thread Krzysztof Kozlowski
On Wed, Mar 08, 2017 at 10:42:37AM +0900, Hoegeun Kwon wrote:
> From: Hyungwon Hwang 
> 
> This patch add the panel device tree node for S6E3HA2 display
> controller to TM2 dts.
> 
> Signed-off-by: Hyungwon Hwang 
> Signed-off-by: Andrzej Hajda 
> Signed-off-by: Chanwoo Choi 
> Signed-off-by: Hoegeun Kwon 
> Tested-by: Chanwoo Choi 
> Reviewed-by: Javier Martinez Canillas 
> ---
>  arch/arm64/boot/dts/exynos/exynos5433-tm2.dts | 12 
>  1 file changed, 12 insertions(+)
> 

Thanks, applied.

Best regards,
Krzysztof



Re: [RFC 05/11] mm: make the try_to_munlock void function

2017-03-07 Thread Minchan Kim
On Tue, Mar 07, 2017 at 06:17:47PM +0300, Kirill A. Shutemov wrote:
> On Thu, Mar 02, 2017 at 03:39:19PM +0900, Minchan Kim wrote:
> > try_to_munlock returns SWAP_MLOCK if one of the VMAs mapping
> > the page has the VM_LOCKED flag. At that time, the VM sets PG_mlocked
> > on the page, unless the page is a pte-mapped THP, which cannot be
> > mlocked either.
> > 
> > With that, __munlock_isolated_page can use PageMlocked to check
> > whether try_to_munlock is successful or not without relying on
> > try_to_munlock's retval. It helps to make ttu/ttuo simple with
> > upcoming patches.
> 
> I *think* you're correct, but it took time to wrap my head around.
> We basically rely on try_to_munlock() never being called for PTE-mapped THP.
> And we don't at the moment.
> 
> It's worth adding something like
> 
>   VM_BUG_ON_PAGE(PageCompound(page) && PageDoubleMap(page), page);
> 
> into try_to_munlock().

Agree.

> 
> Otherwise looks good to me.
> 
> Feel free to add my Acked-by once this nit is addressed.

Thanks for reviewing this part, Kirill!

> 
> -- 
>  Kirill A. Shutemov
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majord...@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: em...@kvack.org


Re: [RFC 01/11] mm: use SWAP_SUCCESS instead of 0

2017-03-07 Thread Minchan Kim
Hi Kirill,

On Tue, Mar 07, 2017 at 05:19:33PM +0300, Kirill A. Shutemov wrote:
> On Thu, Mar 02, 2017 at 03:39:15PM +0900, Minchan Kim wrote:
> > The value 0 behind SWAP_SUCCESS could change at any time, so don't rely on
> > it. Instead, use the explicit macro.
> 
> I'm okay with this as long as it's preparation for something meaningful.
> 0 as success is widely used. I don't think replacing it with a macro here
> has value on its own.

It's preparation for making try_to_unmap return bool, although strictly
speaking it's not necessary. I wanted to replace it with SWAP_SUCCESS
at this opportunity because the function has several *defined* return values,
so it would be clearer if we used one of those defined values, IMO.
However, my rule of thumb is to respect the author/maintainer's preference in
trivial cases, and it seems you don't like it, so I will drop it in the next spin.

Thanks.


> 
> -- 
>  Kirill A. Shutemov
> 


Re: [RFC 04/11] mm: remove SWAP_MLOCK check for SWAP_SUCCESS in ttu

2017-03-07 Thread Minchan Kim
On Tue, Mar 07, 2017 at 05:26:43PM +0300, Kirill A. Shutemov wrote:
> On Thu, Mar 02, 2017 at 03:39:18PM +0900, Minchan Kim wrote:
> > If the page is mapped and rescued in ttuo, page_mapcount(page) == 0 cannot
> > be true, so the page_mapcount check in ttu is enough to return SWAP_SUCCESS.
> > IOW, the SWAP_MLOCK check is redundant, so remove it.
> > 
> > Signed-off-by: Minchan Kim 
> > ---
> >  mm/rmap.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/mm/rmap.c b/mm/rmap.c
> > index 3a14013..0a48958 100644
> > --- a/mm/rmap.c
> > +++ b/mm/rmap.c
> > @@ -1523,7 +1523,7 @@ int try_to_unmap(struct page *page, enum ttu_flags 
> > flags)
> > else
> > ret = rmap_walk(page, &rwc);
> >  
> > -   if (ret != SWAP_MLOCK && !page_mapcount(page))
> > +   if (!page_mapcount(page))
> 
> Hm. I think there's a bug in the current code.
> It should be !total_mapcount(page), otherwise it can be a false positive if
> there's a THP mapped with PTEs.

Hmm, I haven't kept up with THP at all these days, so I could easily be missing
something. From what I can see, every page passed to try_to_unmap has already
been split by split_huge_page_to_list, which calls freeze_page, which
splits the pmd. So I guess it's not a problem. Right?

Anyway, it's out of scope for this patch, so if it's really a problem,
I'd like to handle it separately.

One question:

When should we use total_mapcount instead of page_mapcount?
If total_mapcount had a more detailed description, it would be very helpful
for anyone who is not familiar with it.
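
As a rough illustration of the distinction being asked about, here is a
toy userspace model. It is not kernel code: struct toy_thp and both helper
functions are hypothetical names, and the model ignores the kernel's
double-map adjustments. It only shows why a per-page count can read zero
while some other part of the same huge page is still mapped.

```c
#include <stddef.h>

#define TOY_NR_SUBPAGES 4 /* a real THP has far more subpages; 4 keeps the toy small */

/* Toy compound page. All names here are hypothetical, not kernel types. */
struct toy_thp {
	int pmd_maps;                  /* mappings of the huge page as a whole */
	int pte_maps[TOY_NR_SUBPAGES]; /* PTE mappings of each individual subpage */
};

/* Rough analogue of page_mapcount(): mappings that cover one given subpage. */
int toy_page_mapcount(const struct toy_thp *thp, size_t idx)
{
	return thp->pmd_maps + thp->pte_maps[idx];
}

/* Rough analogue of total_mapcount(): mappings of any part of the huge page. */
int toy_total_mapcount(const struct toy_thp *thp)
{
	int total = thp->pmd_maps;

	for (size_t i = 0; i < TOY_NR_SUBPAGES; i++)
		total += thp->pte_maps[i];
	return total;
}
```

With only subpage 1 mapped by a PTE, toy_page_mapcount() on subpage 0
returns 0 while toy_total_mapcount() returns 1, which is the kind of
false positive described in the quoted review comment.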

> 
> And in this case ret != SWAP_MLOCK is helpful to cut down some cost.
> Although it should be fine to remove it, I guess.

Sure, but it would be hard to measure, I think. Also, a later patch removes
SWAP_MLOCK.


RE: [PATCH] Input: elan_i2c - add ASUS EeeBook X205TA special touchpad fw

2017-03-07 Thread 廖崇榮
Hi Matjaz,

-----Original Message-----
From: Matjaž Hegedič [mailto:matjaz.hege...@gmail.com] 
Sent: Tuesday, March 07, 2017 6:52 PM
To: 廖崇榮; 'Dmitry Torokhov'
Cc: linux-in...@vger.kernel.org; linux-kernel@vger.kernel.org; 黃世鵬 經理; 
miller_w...@emc.com.tw
Subject: Re: [PATCH] Input: elan_i2c - add ASUS EeeBook X205TA special touchpad 
fw

Hi Dmitry, KT!

On 2017-03-07 08:05, 廖崇榮 wrote:
> Hi Dmitry
>
> -----Original Message-----
> From: Dmitry Torokhov [mailto:dmitry.torok...@gmail.com]
> Sent: Tuesday, March 07, 2017 3:55 AM
> To: KT Liao
> Cc: linux-in...@vger.kernel.org; linux-kernel@vger.kernel.org; Matjaz 
> Hegedic
> Subject: Re: [PATCH] Input: elan_i2c - add ASUS EeeBook X205TA special 
> touchpad fw
>
> On Sun, Mar 05, 2017 at 03:13:02AM +0100, Matjaz Hegedic wrote:
>> EeeBook X205TA is yet another ASUS device with a special touchpad 
>> firmware that needs to be accounted for during initialization, or 
>> else the touchpad will go into an invalid state upon suspend/resume.
>> Adding the appropriate ic_type and product_id check fixes the problem.
>
> KT, does this look reasonable? Are there more ASUS models that need 
> such handling?
> [KT] : I just discussed it with the FW team.
> We can't confirm it right now because it's an old product. Also, the
> solution focuses on a power-on issue, not suspend/resume.
> I will let you know once we figure it out.
>
> Our FW has since been modified, so the issue should not happen on new models.
>
> Thanks, KT

As it is now, the touchpad will stop working upon resume, returning an invalid 
id (and requires a cumbersome workaround of reloading the module). As the 
touchpad FW is opaque to me, the only way I could resolve the bug was through 
trial-and-error. Including the touchpad in the 'special fw' resolves the bug 
and the touchpad resumes without issue.

Now, even if the function is indeed used to resolve a different issue on other 
ASUS touchpads, I would argue that this is the most pragmatic way of resolving 
the problem on X205TA, X205TAW and F205TA (and possibly also X200HA & X206HA, 
though I don't have those to test).

It shouldn't affect any other models or touchpad products.

Thanks!

I agree with you.
The special handling simply changes the sequence of commands and adds a delay
cycle.
It is harmless to general Elan products but works around control flaws in some FW.

Thanks, KT
>>
>> Signed-off-by: Matjaz Hegedic 
>> ---
>>  drivers/input/mouse/elan_i2c_core.c | 22 ++++++++++++----------
>>  1 file changed, 12 insertions(+), 10 deletions(-)
>>
>> diff --git a/drivers/input/mouse/elan_i2c_core.c
>> b/drivers/input/mouse/elan_i2c_core.c
>> index 2c7d287..dde3ad7 100644
>> --- a/drivers/input/mouse/elan_i2c_core.c
>> +++ b/drivers/input/mouse/elan_i2c_core.c
>> @@ -218,17 +218,19 @@ static int elan_query_product(struct 
>> elan_tp_data *data)
>>
>>  static int elan_check_ASUS_special_fw(struct elan_tp_data *data)  {
>> -if (data->ic_type != 0x0E)
>> -return false;
>> -
>> -switch (data->product_id) {
>> -case 0x05 ... 0x07:
>> -case 0x09:
>> -case 0x13:
>> -return true;
>> -default:
>> -return false;
>> +if (data->ic_type == 0x0E) {
>> +switch (data->product_id) {
>> +case 0x05 ... 0x07:
>> +case 0x09:
>> +case 0x13:
>> +return true;
>> +}
>>  }
>> +/* ASUS EeeBook X205TA */
>> +else if (data->ic_type == 0x8 && data->product_id == 0x26)
>> +return true;
>> +
>> +return false;
>>  }
>>
>>  static int __elan_initialize(struct elan_tp_data *data)
>> --
>> 2.7.4
>>
>
> Thanks.
>
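
The check in the quoted patch can be read as a small predicate over two
IDs. The following is a minimal userspace sketch of that logic, assuming
only what the quoted diff shows. struct toy_elan_ids and the field widths
are hypothetical stand-ins for the driver's struct elan_tp_data, and
explicit comparisons replace the GCC case-range syntax.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the two fields of struct elan_tp_data used here. */
struct toy_elan_ids {
	uint8_t ic_type;
	uint16_t product_id;
};

/* Returns true for touchpads that need the special ASUS initialization path. */
bool toy_check_asus_special_fw(const struct toy_elan_ids *ids)
{
	if (ids->ic_type == 0x0E) {
		/* Product IDs that were already listed before the patch. */
		return (ids->product_id >= 0x05 && ids->product_id <= 0x07) ||
		       ids->product_id == 0x09 || ids->product_id == 0x13;
	}

	/* ASUS EeeBook X205TA, the case the patch adds. */
	return ids->ic_type == 0x08 && ids->product_id == 0x26;
}
```

Every other ic_type/product_id combination still returns false, which
matches the claim in the thread that other models should be unaffected.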



Re: [PATCH v2] staging: wilc1000: Fix sparse warnings incorrect type assignment

2017-03-07 Thread Dan Carpenter
I think this change is buggy.

On Tue, Mar 07, 2017 at 10:36:53PM +0100, Andrea Ghittino wrote:
> Fixed sparse warnings related to the conversion of le16 and le32 to u16 and 
> u32, during the update of internal structures
> 
> Fixed sparse warnings:
> drivers/staging/wilc1000//wilc_wfi_cfgoperations.c:2011:52: warning: 
> incorrect type in assignment (different base types)
> drivers/staging/wilc1000//wilc_wfi_cfgoperations.c:2011:52:expected 
> unsigned short [unsigned] [assigned] [usertype] ht_ext_params
> drivers/staging/wilc1000//wilc_wfi_cfgoperations.c:2011:52:got restricted 
> __le16 const [usertype] extended_ht_cap_info
> drivers/staging/wilc1000//wilc_wfi_cfgoperations.c:2012:51: warning: 
> incorrect type in assignment (different base types)
> drivers/staging/wilc1000//wilc_wfi_cfgoperations.c:2012:51:expected 
> unsigned int [unsigned] [assigned] [usertype] ht_tx_bf_cap
> drivers/staging/wilc1000//wilc_wfi_cfgoperations.c:2012:51:got restricted 
> __le32 const [usertype] tx_BF_cap_info
> drivers/staging/wilc1000//wilc_wfi_cfgoperations.c:2078:51: warning: 
> incorrect type in assignment (different base types)
> drivers/staging/wilc1000//wilc_wfi_cfgoperations.c:2078:51:expected 
> unsigned short [unsigned] [assigned] [usertype] ht_capa_info
> drivers/staging/wilc1000//wilc_wfi_cfgoperations.c:2078:51:got restricted 
> __le16 const [usertype] cap_info
> drivers/staging/wilc1000//wilc_wfi_cfgoperations.c:2083:52: warning: 
> incorrect type in assignment (different base types)
> drivers/staging/wilc1000//wilc_wfi_cfgoperations.c:2083:52:expected 
> unsigned short [unsigned] [assigned] [usertype] ht_ext_params
> drivers/staging/wilc1000//wilc_wfi_cfgoperations.c:2083:52:got restricted 
> __le16 const [usertype] extended_ht_cap_info
> drivers/staging/wilc1000//wilc_wfi_cfgoperations.c:2084:51: warning: 
> incorrect type in assignment (different base types)
> drivers/staging/wilc1000//wilc_wfi_cfgoperations.c:2084:51:expected 
> unsigned int [unsigned] [assigned] [usertype] ht_tx_bf_cap
> drivers/staging/wilc1000//wilc_wfi_cfgoperations.c:2084:51:got restricted 
> __le32 const [usertype] tx_BF_cap_info
> 

What I want in a changelog is:  "We're copying data that's little endian
from (some place) to wherever and then we (whatever) do math using the
variable so it needs to be CPU endian.  Presumably this wasn't caught
in testing because it was only used on x86 or other little endian
systems."

In this case we're copying little endian data and then sending it
directly back to some place which requires little endian data so
converting it is a bug.

regards,
dan carpenter
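
The distinction drawn above can be sketched in plain userspace C. This is
an illustration only, with hypothetical helper names, not the wilc1000
code: decoding is what the CPU needs before doing arithmetic on a
little-endian field, while a value that is merely passed on to another
little-endian consumer should be copied byte for byte.

```c
#include <stdint.h>
#include <string.h>

/* Decode two little-endian bytes into a native integer,
 * the effect le16_to_cpu() has on a __le16 field. */
uint16_t toy_le16_decode(const uint8_t src[2])
{
	return (uint16_t)(src[0] | (src[1] << 8));
}

/* Encode a native integer as two little-endian bytes,
 * the effect cpu_to_le16() has before a value goes back on the wire. */
void toy_le16_encode(uint16_t v, uint8_t dst[2])
{
	dst[0] = (uint8_t)(v & 0xff);
	dst[1] = (uint8_t)(v >> 8);
}

/* Pass-through case: the consumer also expects little-endian bytes,
 * so the bytes are copied unchanged and no conversion is applied. */
void toy_forward_le16(const uint8_t src[2], uint8_t dst[2])
{
	memcpy(dst, src, 2);
}
```

Decoding and then forwarding the native value without re-encoding it
would hand byte-swapped data to the consumer on a big-endian CPU, while
on x86 the mistake is invisible because both layouts coincide.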


[PATCH] drm/rockchip: Refactor the component match logic.

2017-03-07 Thread Jeffy Chen
Currently we add all components from the dts, so if one of their
drivers has been disabled, we are not able to bring up the others.

Refactor the component match logic, following exynos drm.

Signed-off-by: Jeffy Chen 

---

 drivers/gpu/drm/rockchip/Kconfig|  10 +-
 drivers/gpu/drm/rockchip/Makefile   |  16 +--
 drivers/gpu/drm/rockchip/analogix_dp-rockchip.c |   9 +-
 drivers/gpu/drm/rockchip/cdn-dp-core.c  |   8 +-
 drivers/gpu/drm/rockchip/dw-mipi-dsi.c  |   8 +-
 drivers/gpu/drm/rockchip/dw_hdmi-rockchip.c |  10 +-
 drivers/gpu/drm/rockchip/inno_hdmi.c|  10 +-
 drivers/gpu/drm/rockchip/rockchip_drm_drv.c | 156 
 drivers/gpu/drm/rockchip/rockchip_drm_drv.h |   6 +
 drivers/gpu/drm/rockchip/rockchip_vop_reg.c |   8 +-
 10 files changed, 133 insertions(+), 108 deletions(-)

diff --git a/drivers/gpu/drm/rockchip/Kconfig b/drivers/gpu/drm/rockchip/Kconfig
index 0e4eb84..50c41c0 100644
--- a/drivers/gpu/drm/rockchip/Kconfig
+++ b/drivers/gpu/drm/rockchip/Kconfig
@@ -13,7 +13,7 @@ config DRM_ROCKCHIP
  IP found on the SoC.
 
 config ROCKCHIP_ANALOGIX_DP
-   tristate "Rockchip specific extensions for Analogix DP driver"
+   bool "Rockchip specific extensions for Analogix DP driver"
depends on DRM_ROCKCHIP
select DRM_ANALOGIX_DP
help
@@ -22,7 +22,7 @@ config ROCKCHIP_ANALOGIX_DP
  on RK3288 based SoC, you should selet this option.
 
 config ROCKCHIP_CDN_DP
-tristate "Rockchip cdn DP"
+bool "Rockchip cdn DP"
 depends on DRM_ROCKCHIP
depends on EXTCON
select SND_SOC_HDMI_CODEC if SND_SOC
@@ -33,7 +33,7 @@ config ROCKCHIP_CDN_DP
  option.
 
 config ROCKCHIP_DW_HDMI
-tristate "Rockchip specific extensions for Synopsys DW HDMI"
+bool "Rockchip specific extensions for Synopsys DW HDMI"
 depends on DRM_ROCKCHIP
 select DRM_DW_HDMI
 help
@@ -43,7 +43,7 @@ config ROCKCHIP_DW_HDMI
  option.
 
 config ROCKCHIP_DW_MIPI_DSI
-   tristate "Rockchip specific extensions for Synopsys DW MIPI DSI"
+   bool "Rockchip specific extensions for Synopsys DW MIPI DSI"
depends on DRM_ROCKCHIP
select DRM_MIPI_DSI
help
@@ -53,7 +53,7 @@ config ROCKCHIP_DW_MIPI_DSI
 option.
 
 config ROCKCHIP_INNO_HDMI
-   tristate "Rockchip specific extensions for Innosilicon HDMI"
+   bool "Rockchip specific extensions for Innosilicon HDMI"
depends on DRM_ROCKCHIP
help
  This selects support for Rockchip SoC specific extensions
diff --git a/drivers/gpu/drm/rockchip/Makefile 
b/drivers/gpu/drm/rockchip/Makefile
index c931e2a..fa8dc9d 100644
--- a/drivers/gpu/drm/rockchip/Makefile
+++ b/drivers/gpu/drm/rockchip/Makefile
@@ -3,14 +3,14 @@
 # Direct Rendering Infrastructure (DRI) in XFree86 4.1.0 and higher.
 
 rockchipdrm-y := rockchip_drm_drv.o rockchip_drm_fb.o \
-   rockchip_drm_gem.o rockchip_drm_psr.o rockchip_drm_vop.o
+   rockchip_drm_gem.o rockchip_drm_psr.o \
+   rockchip_drm_vop.o rockchip_vop_reg.o
 rockchipdrm-$(CONFIG_DRM_FBDEV_EMULATION) += rockchip_drm_fbdev.o
 
-obj-$(CONFIG_ROCKCHIP_ANALOGIX_DP) += analogix_dp-rockchip.o
-obj-$(CONFIG_ROCKCHIP_CDN_DP) += cdn-dp.o
-cdn-dp-objs := cdn-dp-core.o cdn-dp-reg.o
-obj-$(CONFIG_ROCKCHIP_DW_HDMI) += dw_hdmi-rockchip.o
-obj-$(CONFIG_ROCKCHIP_DW_MIPI_DSI) += dw-mipi-dsi.o
-obj-$(CONFIG_ROCKCHIP_INNO_HDMI) += inno_hdmi.o
+rockchipdrm-$(CONFIG_ROCKCHIP_ANALOGIX_DP) += analogix_dp-rockchip.o
+rockchipdrm-$(CONFIG_ROCKCHIP_CDN_DP) += cdn-dp-core.o cdn-dp-reg.o
+rockchipdrm-$(CONFIG_ROCKCHIP_DW_HDMI) += dw_hdmi-rockchip.o
+rockchipdrm-$(CONFIG_ROCKCHIP_DW_MIPI_DSI) += dw-mipi-dsi.o
+rockchipdrm-$(CONFIG_ROCKCHIP_INNO_HDMI) += inno_hdmi.o
 
-obj-$(CONFIG_DRM_ROCKCHIP) += rockchipdrm.o rockchip_vop_reg.o
+obj-$(CONFIG_DRM_ROCKCHIP) += rockchipdrm.o
diff --git a/drivers/gpu/drm/rockchip/analogix_dp-rockchip.c 
b/drivers/gpu/drm/rockchip/analogix_dp-rockchip.c
index 8548e82..91ebe5c 100644
--- a/drivers/gpu/drm/rockchip/analogix_dp-rockchip.c
+++ b/drivers/gpu/drm/rockchip/analogix_dp-rockchip.c
@@ -507,7 +507,7 @@ static const struct of_device_id rockchip_dp_dt_ids[] = {
 };
 MODULE_DEVICE_TABLE(of, rockchip_dp_dt_ids);
 
-static struct platform_driver rockchip_dp_driver = {
+struct platform_driver rockchip_dp_driver = {
.probe = rockchip_dp_probe,
.remove = rockchip_dp_remove,
.driver = {
@@ -516,10 +516,3 @@ static struct platform_driver rockchip_dp_driver = {
   .of_match_table = of_match_ptr(rockchip_dp_dt_ids),
},
 };
-
-module_platform_driver(rockchip_dp_driver);
-
-MODULE_AUTHOR("Yakir Yang ");
-MODULE_AUTHOR("Jeff chen ");
-MODULE_DESCRIPTION("Rockchip Specific Analogix-DP Driver Extension");
-MODULE_LICENSE("GPL v2");
diff --git 

Re: [PATCH v7 kernel 3/5] virtio-balloon: implementation of VIRTIO_BALLOON_F_CHUNK_TRANSFER

2017-03-07 Thread Michael S. Tsirkin
On Fri, Mar 03, 2017 at 01:40:28PM +0800, Wei Wang wrote:
> From: Liang Li 
> 
> The implementation of the current virtio-balloon is not very
> efficient, because the pages are transferred to the host one by one.
> Here is the breakdown of the time in percentage spent on each
> step of the balloon inflating process (inflating 7GB of an 8GB
> idle guest).
> 
> 1) allocating pages (6.5%)
> 2) sending PFNs to host (68.3%)
> 3) address translation (6.1%)
> 4) madvise (19%)
> 
> It takes about 4126ms for the inflating process to complete.
> The above profiling shows that the bottlenecks are stage 2)
> and stage 4).
> 
> This patch optimizes step 2) by transferring pages to the host in
> chunks. A chunk consists of guest physically contiguous pages, and
> it is offered to the host via a base PFN (i.e. the start PFN of
> those physically contiguous pages) and the size (i.e. the total
> number of the pages). A normal chunk is formatted as below:
> ------------------------------------------------
> |       Base (52 bit)          | Size (12 bit) |
> ------------------------------------------------
> For large size chunks, an extended chunk format is used:
> ------------------------------------------------
> |                Base (64 bit)                 |
> ------------------------------------------------
> ------------------------------------------------
> |                Size (64 bit)                 |
> ------------------------------------------------
> 
> By doing so, step 4) can also be optimized by doing address
> translation and madvise() in chunks rather than page by page.
> 
> This optimization requires the negotiation of a new feature bit,
> VIRTIO_BALLOON_F_CHUNK_TRANSFER.
> 
> With this new feature, the above ballooning process takes ~590ms
> resulting in an improvement of ~85%.
> 
> TODO: optimize stage 1) by allocating/freeing a chunk of pages
> instead of a single page each time.
> 
> Signed-off-by: Liang Li 
> Signed-off-by: Wei Wang 
> Suggested-by: Michael S. Tsirkin 
> Cc: Michael S. Tsirkin 
> Cc: Paolo Bonzini 
> Cc: Cornelia Huck 
> Cc: Amit Shah 
> Cc: Dave Hansen 
> Cc: Andrea Arcangeli 
> Cc: David Hildenbrand 
> Cc: Liang Li 
> Cc: Wei Wang 

Does this pass sparse? I see some endian-ness issues here.
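
One way to make the byte order of the 52/12-bit chunk explicit is to
build it with shifts and masks and convert the whole 64-bit word once.
This is a userspace sketch, not taken from the patch: the helper names
are hypothetical, and placing the size in the low 12 bits is an
assumption, since the quoted diagram does not say which end holds which
field. The layout of C bit-fields, by contrast, is implementation-defined.

```c
#include <stdint.h>

#define TOY_CHUNK_SIZE_BITS 12
#define TOY_CHUNK_SIZE_MASK ((UINT64_C(1) << TOY_CHUNK_SIZE_BITS) - 1)

/* Pack base PFN and page count into one native 64-bit word.
 * Assumption: the size occupies the low 12 bits, the base the upper 52. */
uint64_t toy_chunk_pack(uint64_t base_pfn, uint64_t nr_pages)
{
	return (base_pfn << TOY_CHUNK_SIZE_BITS) |
	       (nr_pages & TOY_CHUNK_SIZE_MASK);
}

/* Extract the base PFN from a packed native word. */
uint64_t toy_chunk_base(uint64_t chunk)
{
	return chunk >> TOY_CHUNK_SIZE_BITS;
}

/* Extract the page count from a packed native word. */
uint64_t toy_chunk_size(uint64_t chunk)
{
	return chunk & TOY_CHUNK_SIZE_MASK;
}

/* Store a native word as 8 little-endian bytes,
 * the userspace analogue of cpu_to_le64() before handing data to the host. */
void toy_store_le64(uint64_t v, uint8_t out[8])
{
	for (int i = 0; i < 8; i++)
		out[i] = (uint8_t)(v >> (8 * i));
}
```

A base that does not fit in 52 bits or a size that does not fit in 12
bits is silently truncated here; a real implementation would have to fall
back to the extended format in that case.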

> ---
>  drivers/virtio/virtio_balloon.c | 351 
> 
>  1 file changed, 323 insertions(+), 28 deletions(-)
> 
> diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
> index f59cb4f..4416370 100644
> --- a/drivers/virtio/virtio_balloon.c
> +++ b/drivers/virtio/virtio_balloon.c
> @@ -42,6 +42,10 @@
>  #define OOM_VBALLOON_DEFAULT_PAGES 256
>  #define VIRTBALLOON_OOM_NOTIFY_PRIORITY 80
>  
> +#define PAGE_BMAP_SIZE   (8 * PAGE_SIZE)
> +#define PFNS_PER_PAGE_BMAP   (PAGE_BMAP_SIZE * BITS_PER_BYTE)
> +#define PAGE_BMAP_COUNT_MAX  32
> +
>  static int oom_pages = OOM_VBALLOON_DEFAULT_PAGES;
>  module_param(oom_pages, int, S_IRUSR | S_IWUSR);
>  MODULE_PARM_DESC(oom_pages, "pages to free on OOM");
> @@ -50,6 +54,16 @@ MODULE_PARM_DESC(oom_pages, "pages to free on OOM");
>  static struct vfsmount *balloon_mnt;
>  #endif
>  
> +struct balloon_page_chunk {
> + __le64 base : 52;
> + __le64 size : 12;
> +};
> +
> +struct balloon_page_chunk_ext {
> + __le64 base;
> + __le64 size;
> +};
> +
>  struct virtio_balloon {
>   struct virtio_device *vdev;
>   struct virtqueue *inflate_vq, *deflate_vq, *stats_vq;
> @@ -67,6 +81,20 @@ struct virtio_balloon {
>  
>   /* Number of balloon pages we've told the Host we're not using. */
>   unsigned int num_pages;
> + /* Pointer to the response header. */
> + struct virtio_balloon_resp_hdr resp_hdr;
> + /* Pointer to the start address of response data. */
> + __le64 *resp_data;
> + /* Size of response data buffer. */
> + unsigned int resp_buf_size;
> + /* Pointer offset of the response data. */
> + unsigned int resp_pos;
> + /* Bitmap used to save the pfns info */
> + unsigned long *page_bitmap[PAGE_BMAP_COUNT_MAX];
> + /* Number of split page bitmaps */
> + unsigned int nr_page_bmap;
> + /* Used to record the processed pfn range */
> + unsigned long min_pfn, max_pfn, start_pfn, end_pfn;
>   /*
>* The pages we've told the Host we're not using are enqueued
>* at vb_dev_info->pages list.
> @@ -110,20 +138,180 @@ static void balloon_ack(struct virtqueue *vq)
>   wake_up(&vb->acked);
>  }
>  
> -static void tell_host(struct virtio_balloon *vb, struct virtqueue *vq)
> +static inline void init_bmap_pfn_range(struct virtio_balloon *vb)
>  {
> - struct scatterlist sg;
> + vb->min_pfn = ULONG_MAX;
> + vb->max_pfn = 0;
> +}
> +
> +static inline void update_bmap_pfn_range(struct virtio_balloon *vb,
> +  struct page *page)
> +{
> + unsigned long balloon_pfn = page_to_balloon_pfn(page);
> +
> + vb->min_pfn = min(balloon_pfn, vb->min_pfn);
> + vb->max_pfn = 

Re: [PATCH 2/2] Staging: comedi: comedi_fops: Fix "out of minor numbers for board device files"

2017-03-07 Thread Dan Carpenter
On Sun, Mar 05, 2017 at 03:22:33AM +0800, Cheah Kok Cheong wrote:
> If comedi module is loaded with the following max allowed parameter
> [comedi_num_legacy_minors=48], subsequent loading of an auto-configured
> device will fail.

Don't set comedi_num_legacy_minors=48, then?

This doesn't seem like the right fix at all.  Why only allow one auto
configured board?  Why not 5 or 10?
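
The arithmetic behind the question, as a small sketch. It assumes that legacy and auto-configured boards draw from one fixed pool of 48 board minors, with the legacy ones reserved first; the pool size is taken from the "max allowed" value quoted above, and auto_minors_left() is a hypothetical helper, not a comedi function.

```c
/* Assumption for illustration: one shared pool of 48 board minors,
 * with the legacy minors reserved from the start of the pool. */
#define NUM_BOARD_MINORS 48

/* Minors left for auto-configured boards, or -1 if the parameter is out of range. */
static int auto_minors_left(int num_legacy_minors)
{
	if (num_legacy_minors < 0 || num_legacy_minors > NUM_BOARD_MINORS)
		return -1;
	return NUM_BOARD_MINORS - num_legacy_minors;
}
```

Under that assumption, 48 legacy minors leave nothing for auto-configured boards, and any fixed reservation is just a different choice of how many of the 48 to hold back.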

regards,
dan carpenter



Re: [PATCH V5 6/6] proc: show MADV_FREE pages info in smaps

2017-03-07 Thread Minchan Kim
Hi Andrew,

On Tue, Mar 07, 2017 at 02:43:38PM -0800, Andrew Morton wrote:
> On Tue, 7 Mar 2017 11:05:45 +0100 Michal Hocko  wrote:
> 
> > On Fri 03-03-17 16:10:27, Andrew Morton wrote:
> > > On Thu, 2 Mar 2017 17:30:54 +0100 Michal Hocko  wrote:
> > > 
> > > > > It's not that I think you're wrong: it *is* an implementation detail.
> > > > > But we take a bit of incoherency from batching all over the place, so
> > > > > it's a little odd to take a stand over this particular instance of it
> > > > > - whether demanding that it'd be fixed, or be documented, which would
> > > > > only suggest to users that this is special when it really isn't etc.
> > > > 
> > > > I am not aware of other counter printed in smaps that would suffer from
> > > > the same problem, but I haven't checked too deeply so I might be wrong. 
> > > > 
> > > > Anyway it seems that I am alone in my position so I will not insist.
> > > > If we have any bug report then we can still fix it.
> > > 
> > > A single lru_add_drain_all() right at the top level (in smaps_show()?)
> > > won't kill us
> > 
> > I do not think we want to put lru_add_drain_all cost to a random
> > > process reading /proc/<pid>/smaps.
> 
> Why not?  It's that process which is calling for the work to be done.
> 
> > If anything the one which does the
> > madvise should be doing this.
> 
> But it would be silly to do extra work in madvise() if nobody will be
> reading smaps for the next two months.
> 
> How much work is it anyway?  What would be the relative impact upon a
> smaps read?

I would agree only if draining guaranteed that every mapped page in the
range gets marked lazyfree. However, that is not the case, because there
are a few places in madvise_free_pte_range that skip marking a page.

So my conclusion is that draining helps a bit but guarantees nothing.
In that case, IMHO, let's not spend the effort trying to make it better.

Thanks.


Re: [PATCH] zram: set physical queue limits to avoid array out of bounds accesses

2017-03-07 Thread Minchan Kim
Hi Johannes,

On Tue, Mar 07, 2017 at 10:51:45AM +0100, Johannes Thumshirn wrote:
> On 03/07/2017 09:55 AM, Minchan Kim wrote:
> > On Tue, Mar 07, 2017 at 08:48:06AM +0100, Hannes Reinecke wrote:
> >> On 03/07/2017 08:23 AM, Minchan Kim wrote:
> >>> Hi Hannes,
> >>>
> >>> On Tue, Mar 7, 2017 at 4:00 PM, Hannes Reinecke  wrote:
> >>>> On 03/07/2017 06:22 AM, Minchan Kim wrote:
> >>>>> Hello Johannes,
> >>>>>
> >>>>> On Mon, Mar 06, 2017 at 11:23:35AM +0100, Johannes Thumshirn wrote:
> >>>>>> zram can handle at most SECTORS_PER_PAGE sectors in a bio's bvec. When
> >>>>>> using the NVMe over Fabrics loopback target which potentially sends a
> >>>>>> huge bulk of pages attached to the bio's bvec this results in a kernel
> >>>>>> panic because of array out of bounds accesses in zram_decompress_page().
> >>>>>
> >>>>> First of all, thanks for the report and fix up!
> >>>>> Unfortunately, I'm not familiar with that interface of block layer.
> >>>>>
> >>>>> It seems this is a material for stable so I want to understand it clear.
> >>>>> Could you say more specific things to educate me?
> >>>>>
> >>>>> What scenario/When/How it is problem?  It will help for me to
> >>>>> understand!
> >>>>>
> >>>
> >>> Thanks for the quick response!
> >>>
> >>>> The problem is that zram as it currently stands can only handle bios
> >>>> where each bvec contains a single page (or, to be precise, a chunk of
> >>>> data with a length of a page).
> >>>
> >>> Right.
> >>>
> >>>>
> >>>> This is not an automatic guarantee from the block layer (who is free to
> >>>> send us bios with arbitrary-sized bvecs), so we need to set the queue
> >>>> limits to ensure that.
> >>>
> >>> What does it mean "bios with arbitrary-sized bvecs"?
> >>> What kinds of scenario is it used/useful?
> >>>
> >> Each bio contains a list of bvecs, each of which points to a specific
> >> memory area:
> >>
> >> struct bio_vec {
> >>     struct page     *bv_page;
> >>     unsigned int    bv_len;
> >>     unsigned int    bv_offset;
> >> };
> >>
> >> The trick now is that while 'bv_page' does point to a page, the memory
> >> area pointed to might in fact be contiguous (if several pages are
> >> adjacent). Hence we might be getting a bio_vec where bv_len is _larger_
> >> than a page.
> > 
> > Thanks for detail, Hannes!
> > 
> > If I understand it correctly, it seems to be related to bio_add_page
> > with high-order page. Right?
> > 
> > If so, I really wonder why I don't see such problem because several
> > places have used it and I expected some of them might do IO with
> > contiguous pages intentionally or by chance. Hmm,
> > 
> > IIUC, it's not an nvme-specific problem but a general problem which
> > can trigger with normal FSes if they use contiguous pages?
> > 
> 
> I'm not a FS expert, but a quick grep shows that none of the file-systems
> does the
> 
> for_each_sg()
>   while(bio_add_page()))
> 
> trick NVMe does.

Aah, I see.

> 
> >>
> >> Hence the check for 'is_partial_io' in zram_drv.c (which just does a
> >> test 'if bv_len != PAGE_SIZE') is in fact wrong, as it would trigger for
> >> partial I/O (ie if the overall length of the bio_vec is _smaller_ than a
> >> page), but also for multipage bvecs (where the length of the bio_vec is
> >> _larger_ than a page).
> > 
> > Right. I need to look into that. Thanks for the pointing out!
> > 
> >>
> >> So rather than fixing the bio scanning loop in zram it's easier to set
> >> the queue limits correctly so that 'is_partial_io' does the correct
> >> thing and the overall logic in zram doesn't need to be altered.
> > 
> > 
> > Doesn't that approach require new bio allocation through blk_queue_split?
> > Maybe it wouldn't cause a severe regression in zram-FS workloads, but it
> > needs testing.
> 
> Yes, but blk_queue_split() needs information how big the bvecs can be,
> hence the patch.
> 
> For details have a look into blk_bio_segment_split() in block/blk-merge.c
> 
> It gets the max_sectors from blk_max_size_offset() which is
> q->limits.max_sectors when q->limits.chunk_sectors isn't set and
> then loops over the bio's bvecs to check when to split the bio and then
> calls bio_split() when appropriate.

Yeah, so it causes the bio to be split, which means new bio allocations
that were not needed before.
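
To make the cases concrete, here is a small userspace sketch of how a bvec of arbitrary length maps onto page-sized pieces. It separates the three situations that a bare "bv_len != PAGE_SIZE" test folds into two, and counts how many pieces one bvec would have to be handled as, whether by splitting the bio or by walking the bvec in the driver. The names classify_bvec() and bvec_pieces() are hypothetical and the 4096-byte page size is an assumption; none of this is taken from zram_drv.c.

```c
#define SKETCH_PAGE_SIZE 4096u	/* assumed page size, illustration only */

enum bvec_kind { BVEC_PARTIAL, BVEC_FULL_PAGE, BVEC_MULTIPAGE };

/* Tell apart a short bvec, an exact page, and a multipage bvec. */
static enum bvec_kind classify_bvec(unsigned int bv_len)
{
	if (bv_len < SKETCH_PAGE_SIZE)
		return BVEC_PARTIAL;
	if (bv_len == SKETCH_PAGE_SIZE)
		return BVEC_FULL_PAGE;
	return BVEC_MULTIPAGE;
}

/* Number of pieces, each within one page, needed to cover a bvec,
 * taking the starting offset inside the first page into account. */
static unsigned int bvec_pieces(unsigned int bv_offset, unsigned int bv_len)
{
	unsigned int start = bv_offset % SKETCH_PAGE_SIZE;

	if (bv_len == 0)
		return 0;
	return (start + bv_len + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
}
```

A 16 KiB bvec starting on a page boundary comes out as four pieces, while a page-sized bvec starting 512 bytes into a page straddles two.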

> 
> > 
> > Is there any way to trigger the problem without a real nvme device?
> > It would really help to test/measure zram.
> 
> It isn't a /real/ device but the fabrics loopback target. If you want a
> fast reproducible test-case, have a look at:
> 
> https://github.com/ddiss/rapido/
> the cut_nvme_local.sh script sets up the correct VM for this test. Then
> a simple mkfs.xfs /dev/nvme0n1 will oops.

Thanks! I will look into that.

And could you test this patch? It avoids splitting the bio, so no new bio
allocations are needed, and it keeps the zram code simple.

From f778d7564d5cd772f25bb181329362c29548a257 Mon Sep 17 00:00:00 2001
From: Minchan Kim 
Date: Wed, 8 Mar 2017 13:35:29 +0900
Subject: [PATCH] fix

Not-yet-Signed-off-by: 

Re: [PATCH] mm: Do not use double negation for testing page flags

2017-03-07 Thread Minchan Kim
Hi Anshuman,

On Tue, Mar 07, 2017 at 09:31:18PM +0530, Anshuman Khandual wrote:
> On 03/07/2017 12:06 PM, Minchan Kim wrote:
> > From the discussion[1], it seems that every PageFlags function
> > returns bool at this moment, so we don't need double
> > negation any more.
> > Although it's not a problem to keep it, it might lead future users
> > to use double negation for them, too.
> > 
> > Remove such possibility.
> 
> A quick search of '!!Page' in the source tree does not show any other
> place having this double negation. So I guess this is all which need
> to be fixed.

Yeah. That's why my patch touches only the khugepaged part. My
concern is that the PageFlags functions return int rather than bool, so
users can easily get confused and be tempted to use double negation.

The other side is that whoever creates a new custom PageXXX (e.g.,
PageMovable) has to keep in mind that it should return 0 or 1 even
though the function prototype's return type is int. That isn't
documented anywhere. Although we could add a short description
somewhere in page-flags.h, I believe changing to bool is clearer and
less error-prone, so I think Chen's work is well worth it.
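
A minimal userspace sketch of the difference, for illustration. The flag bit and both helper names are hypothetical and are not the real page-flags.h macros: an int-returning test written as a plain mask can hand back the raw mask value, which is why callers reach for "!!", while a bool-returning test is normalized by the conversion itself.

```c
#include <stdbool.h>

#define PG_DEMO_BIT 2	/* hypothetical flag bit, illustration only */

/* int-returning test: yields 0 or the raw mask (4 here), not necessarily 1. */
static int page_demo_raw(unsigned long flags)
{
	return flags & (1UL << PG_DEMO_BIT);
}

/* bool-returning test: the conversion to bool normalizes the result to 0 or 1. */
static bool page_demo_bool(unsigned long flags)
{
	return flags & (1UL << PG_DEMO_BIT);
}
```

With a bool return type, a comparison such as "PageA(p) == PageB(q)" behaves as expected without any double negation at the call site.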


[PATCH] ARM: dts: mvebu: linksys: enable buffer manager support

2017-03-07 Thread Ralph Sennhauser
Add appropriate properties to devices in the Linksys WRT AC Series for the
mvneta driver to use hardware buffer management.

Also update "soc" ranges property and set the status of bm and bm-bppi
to "okay" (SRAM).

Signed-off-by: Ralph Sennhauser 
---
 arch/arm/boot/dts/armada-385-linksys.dtsi | 17 -
 arch/arm/boot/dts/armada-xp-linksys-mamba.dts | 17 -
 2 files changed, 32 insertions(+), 2 deletions(-)

diff --git a/arch/arm/boot/dts/armada-385-linksys.dtsi b/arch/arm/boot/dts/armada-385-linksys.dtsi
index df47bf1..4aac375 100644
--- a/arch/arm/boot/dts/armada-385-linksys.dtsi
+++ b/arch/arm/boot/dts/armada-385-linksys.dtsi
@@ -59,7 +59,8 @@
ranges = ;
+ MBUS_ID(0x09, 0x15) 0 0xf1110000 0x10000
+ MBUS_ID(0x0c, 0x04) 0 0xf1200000 0x100000>;
 
internal-regs {
i2c@11000 {
@@ -88,6 +89,9 @@
ethernet@70000 {
status = "okay";
phy-mode = "rgmii-id";
+   buffer-manager = <&bm>;
+   bm,pool-long = <1>;
+   bm,pool-short = <3>;
fixed-link {
speed = <1000>;
full-duplex;
@@ -97,6 +101,9 @@
ethernet@34000 {
status = "okay";
phy-mode = "sgmii";
+   buffer-manager = <&bm>;
+   bm,pool-long = <0>;
+   bm,pool-short = <3>;
fixed-link {
speed = <1000>;
full-duplex;
@@ -159,6 +166,10 @@
status = "okay";
};
 
+   bm@c8000 {
+   status = "okay";
+   };
+
/* USB part of the eSATA/USB 2.0 port */
usb@58000 {
status = "okay";
@@ -241,6 +252,10 @@
};
};
 
+   bm-bppi {
+   status = "okay";
+   };
+
pcie-controller {
status = "okay";
 
diff --git a/arch/arm/boot/dts/armada-xp-linksys-mamba.dts b/arch/arm/boot/dts/armada-xp-linksys-mamba.dts
index 3744ba3..b188a4dc 100644
--- a/arch/arm/boot/dts/armada-xp-linksys-mamba.dts
+++ b/arch/arm/boot/dts/armada-xp-linksys-mamba.dts
@@ -71,7 +71,8 @@
ranges = ;
+ MBUS_ID(0x09, 0x05) 0 0 0xf1110000 0x10000
+ MBUS_ID(0x0c, 0x04) 0 0 0xf1200000 0x100000>;
 
internal-regs {
 
@@ -95,6 +96,9 @@
pinctrl-names = "default";
status = "okay";
phy-mode = "rgmii-id";
+   buffer-manager = <&bm>;
+   bm,pool-long = <0>;
+   bm,pool-short = <3>;
fixed-link {
speed = <1000>;
full-duplex;
@@ -106,6 +110,9 @@
pinctrl-names = "default";
status = "okay";
phy-mode = "rgmii-id";
+   buffer-manager = <&bm>;
+   bm,pool-long = <1>;
+   bm,pool-short = <3>;
fixed-link {
speed = <1000>;
full-duplex;
@@ -186,6 +193,10 @@
};
};
 
+   bm@c8000 {
+   status = "okay";
+   };
+
nand@d0000 {
status = "okay";
num-cs = <1>;
@@ -259,6 +270,10 @@
};
};
};
+
+   bm-bppi {
+   status = "okay";
+   };
};
 
gpio_keys {
-- 
2.10.2



Re: [PATCH v3 3/3] printk: fix double printing with earlycon

2017-03-07 Thread Sergey Senozhatsky
Hello,

sorry for the delay.

On (03/07/17 15:54), Aleksey Makarov wrote:
> On 03/06/2017 03:59 PM, Sergey Senozhatsky wrote:
> > On (03/03/17 18:49), Aleksey Makarov wrote:
> > [..]
> > > +static enum { CONSOLE_MATCH, CONSOLE_MATCH_RETURN, CONSOLE_MATCH_NEXT }
> > > +match_console(struct console *newcon, struct console_cmdline *c)
> > 
> > that enum in function return is interesting :)
> > can we make it less hackish?
> We probably can, but I can not figure out how to do that.
> Suggestions will be appreciated.
> We should signal 3 different outcomes.
> I thought that using standard errnos is not quite descriptive.

no problems with the enum on its own. errnos probably can also do
the trick.

the way it's defined, however, is a bit unusual and may be
inconvenient - we can add, say, 5 more CONSOLE_MATCH_FOO someday
in the future and match_console() function definition thus will be:

static enum { CONSOLE_MATCH, CONSOLE_MATCH_RETURN, CONSOLE_MATCH_NEXT,
              CONSOLE_MATCH_FOO1, CONSOLE_MATCH_FOO2,
              CONSOLE_MATCH_FOO3, CONSOLE_MATCH_FOO4,
              CONSOLE_MATCH_FOO5}
match_console(struct console *newcon, struct console_cmdline *c)
{
        ...
}

or something like this

static enum { CONSOLE_MATCH,
              CONSOLE_MATCH_RETURN,
              CONSOLE_MATCH_NEXT,
              CONSOLE_MATCH_FOO1,
              CONSOLE_MATCH_FOO2,
              CONSOLE_MATCH_FOO3,
              CONSOLE_MATCH_FOO4,
              CONSOLE_MATCH_FOO5 }
match_console(struct console *newcon, struct console_cmdline *c)
{
        ..
}

or anything else. which is, to my admittedly imperfect taste, slightly
"unpretty".

[..]
> > > + /*
> > >*  See if this console matches one we selected on
> > >*  the command line.
> > >*/
> > >   for (i = 0, c = console_cmdline;
> > >i < MAX_CMDLINECONSOLES && c->name[0];
> > >i++, c++) {
> > > - if (!newcon->match ||
> > > - newcon->match(newcon, c->name, c->index, c->options) != 0) {
> > > - /* default matching */
> > > - BUILD_BUG_ON(sizeof(c->name) != sizeof(newcon->name));
> > > - if (strcmp(c->name, newcon->name) != 0)
> > > - continue;
> > > - if (newcon->index >= 0 &&
> > > - newcon->index != c->index)
> > > - continue;
> > > - if (newcon->index < 0)
> > > - newcon->index = c->index;
> > > -
> > > - if (_braille_register_console(newcon, c))
> > > - return;
> > > 
> > > - if (newcon->setup &&
> > > - newcon->setup(newcon, c->options) != 0)
> > > - break;
> > > - }
> > > + if (preferred_console == i)
> > > + continue;
> > > 
> > > - newcon->flags |= CON_ENABLED;
> > > - if (i == preferred_console) {
> > > - newcon->flags |= CON_CONSDEV;
> > > - has_preferred = true;
> > > + switch (match_console(newcon, c)) {
> > > + case CONSOLE_MATCH:
> > > + goto match;
> > > + case CONSOLE_MATCH_RETURN:
> > > + return;
> > > + default:
> > > + break;
> > 
> > sorry, it was a rather long day for me today. need to look more at this.
> > for what is now CONSOLE_MATCH_NEXT we used to have continue,
> 
> CONSOLE_MATCH is for the case when the console matches against the description,
> CONSOLE_MATCH_NEXT - it does not, we should try next,

my bad, sorry. I misread the patch: there was another `break' right after
that switch, which you have removed, and I wrongly concluded that
CONSOLE_MATCH_NEXT would now `break' from the `default' label *and* `break'
from the console_cmdline loop right after it.

bikeshedding:
maybe an explicit CONSOLE_MATCH_NEXT test will save us from problems (in
case match_console() returns more codes someday), maybe it won't.
hard to say. 'default: continue' is probably OK. or maybe we can do without
that 'match' label at all. something like this (_maybe_)

	for (i = 0, c = console_cmdline; ... ) {
		if (preferred_console == i)
			continue;

		match = match_console(newcon, c);
		if (match == CONSOLE_MATCH_NEXT)
			continue;
		if (match == CONSOLE_MATCH_FOUND)
			break;
		if (match == CONSOLE_MATCH_STOP)
			return;
	}
...
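
a self-contained toy version of the loop above, just to make the
control flow concrete. everything outside the loop shape is an
assumption of mine: the structs are cut down, the `braille' field is a
hypothetical stand-in for _braille_register_console(), find_console()
and its -1/-2 return values are invented for the example, and
MAX_CMDLINECONSOLES is set to 8 only for this sketch. it uses a switch
without a `default' label, so that gcc's -Wswitch (part of -Wall) can
point out a return code that nobody handles.

```c
#include <assert.h>
#include <string.h>

#define MAX_CMDLINECONSOLES 8

enum console_match_result {
	CONSOLE_MATCH_FOUND,
	CONSOLE_MATCH_STOP,
	CONSOLE_MATCH_NEXT,
};

/* simplified stand-ins for the kernel structures */
struct console { char name[16]; int index; };
struct console_cmdline {
	char name[16];
	int index;
	int braille;	/* hypothetical: entry is taken over by braille */
};

static enum console_match_result
match_console(struct console *newcon, const struct console_cmdline *c)
{
	if (strcmp(c->name, newcon->name) != 0)
		return CONSOLE_MATCH_NEXT;
	if (newcon->index >= 0 && newcon->index != c->index)
		return CONSOLE_MATCH_NEXT;
	if (newcon->index < 0)
		newcon->index = c->index;
	if (c->braille)
		return CONSOLE_MATCH_STOP;
	return CONSOLE_MATCH_FOUND;
}

/*
 * Returns the index of the matching entry, -1 if nothing matched,
 * or -2 if matching was stopped early.
 */
static int find_console(struct console *newcon,
			const struct console_cmdline *cmdline,
			int preferred_console)
{
	const struct console_cmdline *c;
	int i;

	for (i = 0, c = cmdline;
	     i < MAX_CMDLINECONSOLES && c->name[0];
	     i++, c++) {
		if (preferred_console == i)
			continue;

		/* no default label: -Wswitch flags any unhandled code */
		switch (match_console(newcon, c)) {
		case CONSOLE_MATCH_NEXT:
			continue;
		case CONSOLE_MATCH_FOUND:
			return i;
		case CONSOLE_MATCH_STOP:
			return -2;
		}
	}
	return -1;
}

/* usage: the second entry matches and fills in the unset index */
static void example_loop(void)
{
	struct console_cmdline tbl[MAX_CMDLINECONSOLES] = {
		{ "ttyS", 0, 0 }, { "ttyAMA", 0, 0 }, { "brl", 0, 1 },
	};
	struct console con = { "ttyAMA", -1 };

	assert(find_console(&con, tbl, -1) == 1);
	assert(con.index == 0);
}
```

with a `default: continue' the compiler stays silent when a new code
shows up; without it, the missing case gets a warning. which one is
nicer is, again, bikeshedding.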



CONSOLE_MATCH_RETURN  -  basically means that we should stop matching.
can we thus rename it to CONSOLE_MATCH_STOP, or similar?

match_console() returned CONSOLE_MATCH_STOP

is a bit better than

match_console() returned CONSOLE_MATCH_RETURN.

isn't it? :)


// I also used CONSOLE_MATCH_FOUND in the example above instead of
// CONSOLE_MATCH. not insisting that CONSOLE_MATCH_FOUND is much
// better than 
