date:20161117

[PATCH 2/2] vhost: forbid IOTLB invalidation when not enabled

2016-11-17 Thread Jason Wang

When IOTLB is not enabled, we should forbid IOTLB invalidation to
avoid a NULL pointer dereference.

Signed-off-by: Jason Wang 
---
 drivers/vhost/vhost.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index c6f2d89..7d338d5 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -959,6 +959,10 @@ int vhost_process_iotlb_msg(struct vhost_dev *dev,
 		vhost_iotlb_notify_vq(dev, msg);
 		break;
 	case VHOST_IOTLB_INVALIDATE:
+		if (!dev->iotlb) {
+			ret = -EFAULT;
+			break;
+		}
 		vhost_del_umem_range(dev->iotlb, msg->iova,
 				     msg->iova + msg->size - 1);
 		break;
-- 
2.7.4
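
The guard added by this patch can be illustrated with a small user-space model. This is only a sketch under stated assumptions: `model_dev`, `model_iotlb`, `model_del_range` and `model_invalidate` are hypothetical stand-ins, not the kernel's vhost structures or functions. It shows that an invalidation request is rejected with -EFAULT while no IOTLB exists, instead of passing a NULL table to the range-deletion helper.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel types; not the real vhost structures. */
struct model_iotlb {
	uint64_t last_start;
	uint64_t last_end;
	int ranges_deleted;
};

struct model_dev {
	struct model_iotlb *iotlb;	/* NULL until the device IOTLB is enabled */
};

/* Stand-in for the range-deletion helper: it dereferences the table. */
static void model_del_range(struct model_iotlb *tlb, uint64_t start, uint64_t end)
{
	tlb->last_start = start;
	tlb->last_end = end;
	tlb->ranges_deleted++;
}

/* Returns 0 on success, -EFAULT when no IOTLB has been set up. */
static int model_invalidate(struct model_dev *dev, uint64_t iova, uint64_t size)
{
	if (!dev->iotlb)
		return -EFAULT;

	/* The end of the range is inclusive, as in the patch. */
	model_del_range(dev->iotlb, iova, iova + size - 1);
	return 0;
}
```

Without the NULL check, the first call on a device that never enabled its IOTLB would dereference a NULL pointer inside the helper.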

Bug in fs/gs_base PTRACE_SETREGS on pre-4.7 kernels

2016-11-17 Thread Keno Fischer

Hi Andy,

this is more of a heads up than a bug report, since it turns out you
already fixed this in

731e33e: x86/arch_prctl/64: Remove FSBASE/GSBASE < 4G optimization

In any case, without that commit, trying to use PTRACE_SETREGS to set
either fs_base or gs_base to 0 when it was previously below 4G but
nonzero fails to take effect in the tracee.

This is caused by the `if (child->thread.fs != value)` check in
`ptrace.c:putreg`, which skips the `do_arch_prctl` call. The problem
is that while the optimization is in place, `fs` is set to 0 and does
not actually hold the fs base, so the call is incorrectly skipped.
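
The skipped update can be reproduced in a small user-space model. This is only a sketch: `model_thread`, `model_set_fs`, `model_putreg_old` and `model_putreg_fixed` are hypothetical names, and the model assumes, as described above, that the old code keeps 0 in `fs` whenever the real base is below 4G.

```c
#include <stdint.h>

#define MODEL_4G (1ULL << 32)

/* Hypothetical simplified model of a tracee; not the kernel's thread_struct. */
struct model_thread {
	uint64_t fs;		/* cached value: 0 whenever the base is below 4G */
	uint64_t real_fs_base;	/* base the tracee actually runs with */
};

/* Models the old set-base path with the "< 4G" optimization. */
static void model_set_fs(struct model_thread *t, uint64_t value)
{
	t->real_fs_base = value;
	t->fs = (value < MODEL_4G) ? 0 : value;
}

/* Models the old putreg(): skips the update when the cached value matches. */
static void model_putreg_old(struct model_thread *t, uint64_t value)
{
	if (t->fs != value)
		model_set_fs(t, value);
}

/* Models the behavior without the optimization: compare against the real base. */
static void model_putreg_fixed(struct model_thread *t, uint64_t value)
{
	if (t->real_fs_base != value) {
		t->real_fs_base = value;
		t->fs = value;
	}
}
```

Starting from a base of 0x1000, `model_putreg_old(&t, 0)` compares 0 against the cached 0 and leaves the real base untouched, while `model_putreg_fixed(&t, 0)` clears it.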

In any case, figured you may be interested that the commit changes behavior
(for the better - not complaining ;), even if user code does not go
out of its way to confuse ptrace.

Keno

[PATCH 1/2] vhost: remove unused feature bit

2016-11-17 Thread Jason Wang

Signed-off-by: Jason Wang 
---
 include/uapi/linux/vhost.h | 2 --
 1 file changed, 2 deletions(-)

diff --git a/include/uapi/linux/vhost.h b/include/uapi/linux/vhost.h
index 56b7ab5..60180c0 100644
--- a/include/uapi/linux/vhost.h
+++ b/include/uapi/linux/vhost.h
@@ -172,8 +172,6 @@ struct vhost_memory {
 #define VHOST_F_LOG_ALL 26
 /* vhost-net should add virtio_net_hdr for RX, and strip for TX packets. */
 #define VHOST_NET_F_VIRTIO_NET_HDR 27
-/* Vhost have device IOTLB */
-#define VHOST_F_DEVICE_IOTLB 63
 
 /* VHOST_SCSI specific definitions */
 
-- 
2.7.4

Re: [PATCH v2] ARM: Drop fixed 200 Hz timer requirement from Samsung platforms

2016-11-17 Thread Kukjin Kim

2016. 11. 18. 16:16 Krzysztof Kozlowski  wrote:

> All Samsung platforms, including the Exynos, are selecting HZ_FIXED with
> 200 Hz.  Unfortunately, in a multiplatform image this also affects
> other platforms when Exynos is enabled.
> 
> This looks like very old legacy code, dating back to the initial
> upstreaming of S3C24xx.  It was probably required for the s3c24xx timer
> driver, which was removed in commit ad38bdd15d5b ("ARM: SAMSUNG: Remove
> unused plat-samsung/time.c").
> 
> Since then, this fixed 200 Hz spread everywhere, including out-of-tree
> Samsung kernels (SoC vendor's and Tizen's).  I believe this choice
> was a coincidence rather than a conscious decision.
> 
> Exynos uses its own MCT or arch timer and can work with all HZ values.
> Older platforms use the newer Samsung PWM timer driver, which should handle
> down to 100 Hz.
> 
> A few perf mem and sched tests on an Odroid XU3 board (Exynos5422, 4x Cortex
> A7, 4x Cortex A15) show no regressions when switching from 200 Hz to
> other values.
> 
> Reported-by: Lee Jones 
> [Dropping 200_HZ from S3C/S5P suggested by Arnd]
> Reported-by: Arnd Bergmann 
> Signed-off-by: Krzysztof Kozlowski 
> Cc: Kukjin Kim 

Acked-by: Kukjin Kim 

> Tested-by: Javier Martinez Canillas 
> 
> ---
> 
> Tested on Exynos5422 and Exynos5800 (by Javier). It would be
> appreciated if anyone could test it on S3C24xx or S5PV210.
> 
> Changes since v1:
> 1. Add Javier's tested-by.
> 2. Drop HZ_FIXED also from ARCH_S5PV210 and ARCH_S3C24XX after Arnd
>   suggestions and analysis.
> ---
> arch/arm/Kconfig | 3 +--
> 1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
> index b5d529fdffab..ced2e08a9d08 100644
> --- a/arch/arm/Kconfig
> +++ b/arch/arm/Kconfig
> @@ -1496,8 +1496,7 @@ source kernel/Kconfig.preempt
> 
> config HZ_FIXED
>int
> -default 200 if ARCH_EBSA110 || ARCH_S3C24XX || \
> -ARCH_S5PV210 || ARCH_EXYNOS4
> +default 200 if ARCH_EBSA110
>default 128 if SOC_AT91RM9200
>default 0
> 
> -- 
> 2.7.4
>

RE: [PATCH net 1/2] r8152: fix the sw rx checksum is unavailable

2016-11-17 Thread Hayes Wang

Mark Lord [mailto:ml...@pobox.com]
> Sent: Thursday, November 17, 2016 9:42 PM
[...]
> What the above sample shows, is the URB transfer buffer ran out of space in
> the middle of a packet, and the hardware then tried to just continue that
> same packet in the next URB, without an rx_desc header inserted.  The r8152.c
> driver always assumes the URB buffer begins with an rx_desc, so of course this
> behaviour produces really weird effects, and system crashes, etc..

The USB device doesn't know the address or size of the buffer. Only
the USB host controller knows. Therefore, the device sends the
data to the host, and the host fills the memory. According to your
description, it seems the host splits the data from the device
into two different buffers (or URB transfers). I wonder whether that
can actually occur. As far as I know, the host does not allow a
buffer smaller than the data length.

Our hw engineers need the log from a USB analyzer to confirm
what the device sends to the host. However, I don't think you
have a USB analyzer for this. I will try to reproduce the issue,
but I am busy, so I don't expect to respond quickly.

Besides, the maximum data length which the RTL8152 sends to
the host is 16KB. That is, if agg_buf_sz is 16KB, the host
wouldn't split it. However, you still see problems with that size.

[...]
> It is not clear to me how the chip decides when to forward an rx URB to the
> host. If you could describe how that part works for us, then it would help in
> further understanding why fast systems (eg. a PC) don't generally notice the
> issue, while much slower embedded systems do see the issue regularly.

The driver expects the rx buffer to be laid out as

rx_desc + a packet + padding to 8 alignment +
rx_desc + a packet + padding to 8 alignment + ...

Therefore, when a urb transfer is completed, the driver parses
the buffer this way. After the buffer is handled, it is
submitted to the host again, until the next transfer is completed.
If the submission fails, the driver tries again later.
urb->actual_length tells how much data the host filled in. The driver
uses it to find the end of the data. urb->status tells whether
the transfer was successful. The driver submits the urb to the
host directly if the status is not successful.
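
The layout described above can be sketched as a small parser. This is a minimal illustration, not the r8152 driver's code: `model_rx_desc` is a hypothetical 8-byte descriptor (the real rx_desc is laid out differently), and `model_count_packets` and `model_align8` are made-up names. The `actual_length` argument plays the role of urb->actual_length.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 8-byte descriptor; the real r8152 rx_desc is laid out differently. */
struct model_rx_desc {
	uint32_t pkt_len;	/* length of the packet that follows, in bytes */
	uint32_t reserved;
};

/* Round n up to the next multiple of 8. */
static size_t model_align8(size_t n)
{
	return (n + 7) & ~(size_t)7;
}

/*
 * Walk a buffer laid out as desc + packet + padding, repeated.
 * Returns the number of complete packets found, or -1 if a descriptor
 * claims more data than the buffer holds.
 */
static int model_count_packets(const uint8_t *buf, size_t actual_length)
{
	size_t off = 0;
	int count = 0;

	while (off + sizeof(struct model_rx_desc) <= actual_length) {
		struct model_rx_desc desc;

		/* Copy out the descriptor to avoid unaligned access. */
		memcpy(&desc, buf + off, sizeof(desc));
		off += sizeof(desc);

		/* The packet must fit in what the host actually filled in. */
		if (desc.pkt_len > actual_length - off)
			return -1;

		count++;
		off = model_align8(off + desc.pkt_len);
	}
	return count;
}
```

A parser of this shape depends on every buffer starting with a descriptor. If a buffer began in the middle of a packet, the first bytes would be read as a descriptor with an arbitrary length.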

Best Regards,
Hayes

Re: [GIT PULL] STi DT update for v4.10 round 2

2016-11-17 Thread Olof Johansson

On Thu, Nov 10, 2016 at 10:00:48AM +0100, Patrice Chotard wrote:
> Hi Arnd, Kevin, Olof
> 
> Please consider this second round of STi dts update for v4.10 :
> 
> The following changes since commit 97a0b97f9e8197429eee5f87ce14373f73dbd9d3:
> 
>   ARM: dts: stih410-clocks: Add PROC_STFE as a critical clock (2016-10-20 16:20:26 +0200)
> 
> are available in the git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/pchotard/sti.git tags/sti-dt-for-4.10-round2
> 
> for you to fetch changes up to 64783ea7de0bff3de77cfdff1ed76428c288faac:
> 
>   ARM: dts: STiHxxx-b2120: change sound card name (2016-11-10 09:52:49 +0100)
> 
> 
> STi dts update:
> 
> Change sound card name for B2120
> Enable sound card for B2260
> Remove stih415-clks.h
> Identify critical clocks for STiH407
> Fix typo in stih407-pinctrl.dtsi
> 
> 
> Arnaud Pouliquen (2):
>   ARM: dts: STiH410-B2260: enable sound card
>   ARM: dts: STiHxxx-b2120: change sound card name
> 
> Geert Uytterhoeven (1):
>   ARM: dts: STiH407: DT fix s/interrupts-names/interrupt-names/
> 
> Patrice Chotard (1):
>   ARM: dts: remove stih415-clks.h
> 
> Peter Griffin (1):
>   ARM: dts: stih407-clocks: Identify critical clocks
> 
>  arch/arm/boot/dts/stih407-clock.dtsi | 10 ++
>  arch/arm/boot/dts/stih407-pinctrl.dtsi   |  2 +-
>  arch/arm/boot/dts/stih410-b2260.dts  | 22 ++
>  arch/arm/boot/dts/stihxxx-b2120.dtsi |  2 +-
>  include/dt-bindings/clock/stih415-clks.h | 16 
>  5 files changed, 34 insertions(+), 18 deletions(-)
>  delete mode 100644 include/dt-bindings/clock/stih415-clks.h
> 
> 
> 

Merged, thanks!


-Olof

Re: [GIT PULL] STi defconfig updates for v4.10 round 2

2016-11-17 Thread Olof Johansson

On Thu, Nov 10, 2016 at 10:00:32AM +0100, Patrice Chotard wrote:
> Hi Olof, Arnd and Kevin,
> 
> Please consider the second round of multi_v7_defconfig updates for v4.10 :
> 
> 
> The following changes since commit 620c52f4db4d47e1f33c64e641392fe575d5397f:
> 
>   ARM: multi_v7_defconfig: Remove stih41x phy Kconfig symbol. (2016-10-20 17:05:08 +0200)
> 
> are available in the git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/pchotard/sti.git sti-defconfig-for-4.10-round2
> 
> for you to fetch changes up to 57dae748959d0abae2b382ccee68621a82f827c8:
> 
>   ARM: multi_v7_defconfig: Remove ST_THERMAL_SYSCFG Kconfig symbol (2016-10-21 17:05:54 +0200)
> 
> 
> 
> Remove STiH415/416 specific IPs
> 
> As STiH415/416 have been removed from kernel, remove IPs only
> found on these socs, remove ST_THERMAL_SYSCFG.

Merged, thanks.


-Olof

Re: [GIT PULL 1/3] ARM: dts: exynos: DT for v4.10

2016-11-17 Thread Olof Johansson

Hi,

On Tue, Nov 08, 2016 at 08:26:28PM +0200, Krzysztof Kozlowski wrote:
> Hi,
> 
> Hurray! New board! ... Exynos4415 slowly is going away.
> 
> Best regards,
> Krzysztof
> 
> 
> The following changes since commit 1001354ca34179f3db924eb66672442a173147dc:
> 
>   Linux 4.9-rc1 (2016-10-15 12:17:50 -0700)
> 
> are available in the git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux.git tags/samsung-dt-4.10
> 
> for you to fetch changes up to 05a3589f46f913fbe91704f12fdca46a0eb0a27b:
> 
>   ARM: dts: exynos: Add SCU device node to exynos4.dtsi (2016-11-05 17:39:50 +0200)
> 
> 
> Samsung DeviceTree update for v4.10:
> 1. Add TOPEET itop core and Elite boards, based on Exynos4412.
> 2. Remove the Exynos4415 DTSI. We did not have any mainlined boards
>using it. I am also not aware of any popular out-of-tree boards using it.
> 3. Add Snoop Control Unit node for Exynos4.
> 4. Minor cleanups.

Merged, thanks.


-Olof

Re: [GIT PULL 2/3] ARM64: dts: exynos: DT for v4.10

2016-11-17 Thread Olof Johansson

On Tue, Nov 08, 2016 at 08:26:29PM +0200, Krzysztof Kozlowski wrote:
> Hi,
> 
> Exynos5433 + two boards using it. Mobile boards! :)
> 
> I am really happy to push it. I know that it has been a lot of effort
> in Samsung to mainline this.
> 
> Best regards,
> Krzysztof
> 
> 
> The following changes since commit 1001354ca34179f3db924eb66672442a173147dc:
> 
>   Linux 4.9-rc1 (2016-10-15 12:17:50 -0700)
> 
> are available in the git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux.git tags/samsung-dt64-4.10
> 
> for you to fetch changes up to 8ac46fc57df82efbc19194335b6c7a960c31:
> 
>   arm64: dts: exynos: Add dts file for Exynos5433-based TM2E board (2016-11-03 22:19:57 +0200)
> 
> 
> Finally, I am really pleased to announce adding support for Exynos5433 ARMv8
> SoC along with two boards.  A lot of Samsung people contributed into this
> but the final work and commits were done by Chanwoo Choi.
> 
> This means that for v4.10 we got:
> 1. Exynos5433 DTSI.
> 2. Two boards: TM2 and TM2E.  These are (almost fully) working mobile phones.

Awesome! Looks like TM2 is a Tizen reference board? Great to see the support,
even if it's taken a while.


-Olof

Re: [GIT PULL 3/3] ARM: defconfig: Samsung defconfigs for v4.10

2016-11-17 Thread Olof Johansson

On Tue, Nov 08, 2016 at 08:26:27PM +0200, Krzysztof Kozlowski wrote:
> Hi,
> 
> Nothing special.
> 
> Best regards,
> Krzysztof
> 
> 
> The following changes since commit 1001354ca34179f3db924eb66672442a173147dc:
> 
>   Linux 4.9-rc1 (2016-10-15 12:17:50 -0700)
> 
> are available in the git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux.git tags/samsung-defconfig-4.10
> 
> for you to fetch changes up to e471e9b4b13b59ee8cb7079018472c4dda46cb7a:
> 
>   ARM: multi_v7_defconfig: Enable exynos-gsc driver as module (2016-10-17 19:43:29 +0300)
> 
> 
> Samsung defconfig update for v4.10:
> 1. Enable the Exynos gscaler driver on multi_v7 and exynos defconfigs.

Merged, thanks.


-Olof

Re: [PATCH] crypto: sun4i-ss: support the Security System PRNG

2016-11-17 Thread Corentin Labbe

On Thu, Nov 17, 2016 at 08:07:09PM -0500, Sandy Harris wrote:
> Add Ted Ts'o to cc list. Shouldn't he be included on anything affecting
> the random(4) driver?
> 

I blindly used get_maintainer.pl, and since the file is in crypto, the
hw_random people were not included.
Note that get_maintainer.pl on drivers/char/hw_random/ does not give his
address either.
My v2 patch will have them in CC/TO.

> On Tue, Oct 18, 2016 at 8:34 AM, Corentin Labbe
>  wrote:
> 
> > From: LABBE Corentin 
> >
> > The Security System has a PRNG.
> > This patch adds support for it as an hwrng.
> 
> Which is it? A PRNG & a HW RNG are quite different things.
> It would, in general, be a fairly serious error to treat a PRNG
> as a HWRNG.
> 
> If it is just a prng (which it appears to be from a quick look
> at your code) then it is not clear it is useful since the
> random(4) driver already has two PRNGs. It might be
> but I cannot tell.

For me, hwrng is a way to give user space another way to get "random" data
via /dev/hwrng.
The only impact of hwrng on random is that, just after init, some data from
the hwrng is used to add entropy.

Grepping for prng in drivers/char/hw_random/ and drivers/crypto shows some
other PRNGs used with hwrng.
Regards
Corentin Labbe

Re: [PATCH v3] staging: lustre: llog: fix wrong offset in llog_process_thread()

2016-11-17 Thread Greg Kroah-Hartman

On Thu, Nov 17, 2016 at 06:29:08PM -0500, James Simmons wrote:
> From: Mikhail Pershin 
> 
> - llh_cat_idx may become bigger than llog bitmap size in
>   llog_cat_set_first_idx() function
> - it is wrong to use previous cur_offset as new buffer offset,
>   new offset should be calculated from value returned by
>   llog_next_block().
> - optimize llog_skip_over() to find llog entry offset by index
>   for llog with fixed-size records.
> 
> Signed-off-by: Mikhail Pershin 
> Signed-off-by: Bob Glossman 
> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6714
> Reviewed-on: http://review.whamcloud.com/15316
> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6163
> Reviewed-on: http://review.whamcloud.com/18819
> Reviewed-by: John L. Hammond 
> Reviewed-by: James Simmons 
> Reviewed-by: Oleg Drokin 
> Signed-off-by: James Simmons 
> ---
> 
> ChangeLog:
> 
> v1) Initial patch with umoddi issue
> v2) Included fix from patch LU-6163 that fixed umoddi problem
> v3) Remove no longer needed last_offset variable
> 
>  drivers/staging/lustre/lustre/obdclass/llog.c |   82 +---
>  include/linux/fs.h|2 +-
>  2 files changed, 59 insertions(+), 25 deletions(-)
> 
> diff --git a/drivers/staging/lustre/lustre/obdclass/llog.c b/drivers/staging/lustre/lustre/obdclass/llog.c
> index 3bc1789..ae63047 100644
> --- a/drivers/staging/lustre/lustre/obdclass/llog.c
> +++ b/drivers/staging/lustre/lustre/obdclass/llog.c
> @@ -217,8 +217,7 @@ static int llog_process_thread(void *arg)
>   struct llog_log_hdr *llh = loghandle->lgh_hdr;
>   struct llog_process_cat_data*cd  = lpi->lpi_catdata;
>   char*buf;
> - __u64cur_offset;
> - __u64last_offset;
> + u64 cur_offset, tmp_offset;
>   int chunk_size;
>   int  rc = 0, index = 1, last_index;
>   int  saved_index = 0;
> @@ -229,6 +228,8 @@ static int llog_process_thread(void *arg)
>  
>   cur_offset = llh->llh_hdr.lrh_len;
>   chunk_size = llh->llh_hdr.lrh_len;
> + /* expect chunk_size to be power of two */
> + LASSERT(is_power_of_2(chunk_size));
>  
>   buf = libcfs_kvzalloc(chunk_size, GFP_NOFS);
>   if (!buf) {
> @@ -245,38 +246,50 @@ static int llog_process_thread(void *arg)
>   else
>   last_index = LLOG_HDR_BITMAP_SIZE(llh) - 1;
>  
> - /* Record is not in this buffer. */
> - if (index > last_index)
> - goto out;
> -
>   while (rc == 0) {
> + unsigned int buf_offset = 0;
>   struct llog_rec_hdr *rec;
> + bool partial_chunk;
> + off_t chunk_offset;
>  
>   /* skip records not set in bitmap */
>   while (index <= last_index &&
>  !ext2_test_bit(index, LLOG_HDR_BITMAP(llh)))
>   ++index;
>  
> - LASSERT(index <= last_index + 1);
> - if (index == last_index + 1)
> + if (index > last_index)
>   break;
> -repeat:
> +
>   CDEBUG(D_OTHER, "index: %d last_index %d\n",
>  index, last_index);
> -
> +repeat:
>   /* get the buf with our target record; avoid old garbage */
>   memset(buf, 0, chunk_size);
> - last_offset = cur_offset;
>   rc = llog_next_block(lpi->lpi_env, loghandle, &saved_index,
>index, &cur_offset, buf, chunk_size);
>   if (rc)
>   goto out;
>  
> + /*
> +  * NB: after llog_next_block() call the cur_offset is the
> +  * offset of the next block after read one.
> +  * The absolute offset of the current chunk is calculated
> +  * from cur_offset value and stored in chunk_offset variable.
> +  */
> + tmp_offset = cur_offset;
> + if (do_div(tmp_offset, chunk_size)) {
> + partial_chunk = true;
> + chunk_offset = cur_offset & ~(chunk_size - 1);
> + } else {
> + partial_chunk = false;
> + chunk_offset = cur_offset - chunk_size;
> + }
> +
>   /* NB: when rec->lrh_len is accessed it is already swabbed
>* since it is used at the "end" of the loop and the rec
>* swabbing is done at the beginning of the loop.
>*/
> - for (rec = (struct llog_rec_hdr *)buf;
> + for (rec = (struct llog_rec_hdr *)(buf + buf_offset);
>(char *)rec < buf + chunk_size;
>rec = llog_rec_hdr_next(rec)) {
>

> + LASSERT(is_power_of_2(chunk_size));
>  
>   buf = libcfs_kvzalloc(chunk_size, GFP_NOFS);
>   if (!buf) {
> @@ -245,38 +246,50 @@ static int llog_process_thread(void *arg)
>   else
>   last_index = LLOG_HDR_BITMAP_SIZE(llh) - 1;
>  
> - /* Record is not in this buffer. */
> - if (index > last_index)
> - goto out;
> -
>   while (rc == 0) {
> + unsigned int buf_offset = 0;
>   struct llog_rec_hdr *rec;
> + bool partial_chunk;
> + off_t chunk_offset;
>  
>   /* skip records not set in bitmap */
>   while (index <= last_index &&
>  !ext2_test_bit(index, LLOG_HDR_BITMAP(llh)))
>   ++index;
>  
> - LASSERT(index <= last_index + 1);
> - if (index == last_index + 1)
> + if (index > last_index)
>   break;
> -repeat:
> +
>   CDEBUG(D_OTHER, "index: %d last_index %d\n",
>  index, last_index);
> -
> +repeat:
>   /* get the buf with our target record; avoid old garbage */
>   memset(buf, 0, chunk_size);
> - last_offset = cur_offset;
>   rc = llog_next_block(lpi->lpi_env, loghandle, _index,
>index, _offset, buf, chunk_size);
>   if (rc)
>   goto out;
>  
> + /*
> +  * NB: after llog_next_block() call the cur_offset is the
> +  * offset of the next block after read one.
> +  * The absolute offset of the current chunk is calculated
> +  * from cur_offset value and stored in chunk_offset variable.
> +  */
> + tmp_offset = cur_offset;
> + if (do_div(tmp_offset, chunk_size)) {
> + partial_chunk = true;
> + chunk_offset = cur_offset & ~(chunk_size - 1);
> + } else {
> + partial_chunk = false;
> + chunk_offset = cur_offset - chunk_size;
> + }
> +
>   /* NB: when rec->lrh_len is accessed it is already swabbed
>* since it is used at the "end" of the loop and the rec
>* swabbing is done at the beginning of the loop.
>*/
> - for (rec = (struct llog_rec_hdr *)buf;
> + for (rec = (struct llog_rec_hdr *)(buf + buf_offset);
>(char *)rec < buf + chunk_size;
>rec = llog_rec_hdr_next(rec)) {
>   CDEBUG(D_OTHER, "processing rec 0x%p type %#x\n",
> @@ -288,13 +301,28 @@ static int llog_process_thread(void *arg)
>   CDEBUG(D_OTHER, "after
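
The chunk offset arithmetic in the hunk above can be tried out in
userspace. The following is only a sketch under stated assumptions:
chunk_start() is a hypothetical helper name, not a function in llog.c,
and the plain % operator stands in for the kernel's do_div(). It assumes,
as the patch asserts, that chunk_size is a power of two, and that a full
read leaves cur_offset at least one chunk past the start of the log.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of llog.c): given the offset returned by
 * the block reader, compute the absolute offset of the chunk just read.
 */
static uint64_t chunk_start(uint64_t cur_offset, uint32_t chunk_size,
			    bool *partial_chunk)
{
	/* The patch checks this with LASSERT(is_power_of_2(chunk_size)). */
	assert(chunk_size != 0 && (chunk_size & (chunk_size - 1)) == 0);

	if (cur_offset % chunk_size) {
		/* Short read: the offset stopped inside the chunk, round down. */
		*partial_chunk = true;
		return cur_offset & ~((uint64_t)chunk_size - 1);
	}

	/* Full read: the offset already sits on the next chunk boundary. */
	*partial_chunk = false;
	return cur_offset - chunk_size;
}
```

With a 4096-byte chunk, an offset of 8192 maps back to the chunk at 4096,
while an offset of 9000 marks a partial chunk starting at 8192.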

Re: [PATCH 5/9] lib: radix-tree: check accounting of existing slot replacement users

2016-11-17 Thread Jan Kara

On Thu 17-11-16 14:30:21, Johannes Weiner wrote:
> The bug in khugepaged fixed earlier in this series shows that radix
> tree slot replacement is fragile; and it will become more so when not
> only NULL<->!NULL transitions need to be caught but transitions from
> and to exceptional entries as well. We need checks.
> 
> Re-implement radix_tree_replace_slot() on top of the sanity-checked
> __radix_tree_replace(). This requires existing callers to also pass
> the radix tree root, but it'll warn us when somebody replaces slots
> with contents that need proper accounting (transitions between NULL
> entries, real entries, exceptional entries) and where a replacement
> through the slot pointer would corrupt the radix tree node counts.
> 
> Suggested-by: Jan Kara 
> Signed-off-by: Johannes Weiner 

Looks good. You can add:

Reviewed-by: Jan Kara 

One nit below:

> @@ -785,6 +776,50 @@ void __radix_tree_replace(struct radix_tree_root *root,
>  }
>  
>  /**
> + * __radix_tree_replace  - replace item in a slot
> + * @root:radix tree root
> + * @node:pointer to tree node
> + * @slot:pointer to slot in @node
> + * @item:new item to store in the slot.
> + *
> + * For use with __radix_tree_lookup().  Caller must hold tree write locked
> + * across slot lookup and replacement.
> + */

I'd comment here that even this function cannot be used for NULL <->
non-NULL replacements. Those are what radix_tree_delete() and
radix_tree_insert() are for.

Honza
-- 
Jan Kara 
SUSE Labs, CR
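
The restriction mentioned in the nit can be illustrated with a toy model.
This is a sketch only, not the kernel's implementation: struct toy_node,
toy_replace() and TOY_EXCEPTIONAL are made-up names, and the tag bit is
merely modelled on the radix tree's exceptional entry marker. The point
is that a slot replacement may change the kind of entry (and must then
adjust the exceptional counter), but it must not change whether the slot
is occupied, because that would alter the node's entry count.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_SLOTS	4
#define TOY_EXCEPTIONAL	2UL	/* assumed low-bit tag for exceptional entries */

struct toy_node {
	void *slots[TOY_SLOTS];
	unsigned int count;		/* occupied slots */
	unsigned int exceptional;	/* occupied slots carrying the tag */
};

static int toy_is_exceptional(const void *item)
{
	return ((uintptr_t)item & TOY_EXCEPTIONAL) != 0;
}

/* Returns 0 on success, -1 if the replacement would change node->count. */
static int toy_replace(struct toy_node *node, unsigned int idx, void *item)
{
	void *old = node->slots[idx];

	/* NULL <-> non-NULL transitions belong to insert and delete. */
	if (!old != !item)
		return -1;

	/* Keep the exceptional counter in step with the entry type. */
	if (toy_is_exceptional(old))
		node->exceptional--;
	if (toy_is_exceptional(item))
		node->exceptional++;

	node->slots[idx] = item;
	return 0;
}
```

Replacing a regular entry with a tagged one succeeds and bumps the
exceptional counter; trying to store NULL over an occupied slot, or an
item into an empty slot, is refused and leaves the node untouched.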

Re: [PATCH v3 1/2] staging: slicoss: fix different address space warnings: 32 bits

2016-11-17 Thread Greg KH

On Thu, Nov 17, 2016 at 12:46:12PM +0100, Sergio Paracuellos wrote:
> On Thu, Nov 17, 2016 at 12:33 PM, Dan Carpenter
>  wrote:
> > Give it a shot and see if the warnings go away.  I don't think the tag
> > is correct.
> 
> Just removing __iomem tag in shmem_data field of slic_shmemory struct
> makes sparse happy. No warnings around.
> 
> Should I send a v4 patch with the tag removed?

Yes, please fix up this series and resend.

thanks,

greg k-h

Re: [PATCH 4/9] lib: radix-tree: native accounting of exceptional entries

2016-11-17 Thread Jan Kara

On Thu 17-11-16 14:29:45, Johannes Weiner wrote:
> The way the page cache is sneaking shadow entries of evicted pages
> into the radix tree past the node entry accounting and tracking them
> manually in the upper bits of node->count is fraught with problems.
> 
> These shadow entries are marked in the tree as exceptional entries,
> which are a native concept to the radix tree. Maintain an explicit
> counter of exceptional entries in the radix tree node. Subsequent
> patches will switch shadow entry tracking over to that counter.
> 
> DAX and shmem are the other users of exceptional entries. Since slot
> replacements that change the entry type from regular to exceptional
> must now be accounted, introduce a __radix_tree_replace() function
> that does replacement and accounting, and switch DAX and shmem over.
> 
> The increase in radix tree node size is temporary. A followup patch
> switches the shadow tracking to this new scheme and we'll no longer
> need the upper bits in node->count and shrink that back to one byte.
> 
> Signed-off-by: Johannes Weiner 

Looks good to me. You can add:

Reviewed-by: Jan Kara 

Honza

> ---
>  fs/dax.c   |  5 +++--
>  include/linux/radix-tree.h | 10 +++---
>  lib/radix-tree.c   | 46 
> +++---
>  mm/shmem.c |  8 
>  4 files changed, 57 insertions(+), 12 deletions(-)
> 
> diff --git a/fs/dax.c b/fs/dax.c
> index 014defd2e744..db78bae0dc0f 100644
> --- a/fs/dax.c
> +++ b/fs/dax.c
> @@ -643,12 +643,13 @@ static void *dax_insert_mapping_entry(struct 
> address_space *mapping,
>   }
>   mapping->nrexceptional++;
>   } else {
> + struct radix_tree_node *node;
>   void **slot;
>   void *ret;
>  
> - ret = __radix_tree_lookup(page_tree, index, NULL, &slot);
> + ret = __radix_tree_lookup(page_tree, index, &node, &slot);
>   WARN_ON_ONCE(ret != entry);
> - radix_tree_replace_slot(slot, new_entry);
> + __radix_tree_replace(page_tree, node, slot, new_entry);
>   }
>   if (vmf->flags & FAULT_FLAG_WRITE)
>   radix_tree_tag_set(page_tree, index, PAGECACHE_TAG_DIRTY);
> diff --git a/include/linux/radix-tree.h b/include/linux/radix-tree.h
> index af3581b8a451..7ced8a70cc8b 100644
> --- a/include/linux/radix-tree.h
> +++ b/include/linux/radix-tree.h
> @@ -85,9 +85,10 @@ static inline bool radix_tree_is_internal_node(void *ptr)
>  #define RADIX_TREE_COUNT_MASK   ((1UL << RADIX_TREE_COUNT_SHIFT) - 1)
>  
>  struct radix_tree_node {
> - unsigned char   shift;  /* Bits remaining in each slot */
> - unsigned char   offset; /* Slot offset in parent */
> - unsigned int    count;
> + unsigned char   shift;  /* Bits remaining in each slot */
> + unsigned char   offset; /* Slot offset in parent */
> + unsigned int    count;  /* Total entry count */
> + unsigned char   exceptional;    /* Exceptional entry count */
>   union {
>   struct {
>   /* Used when ascending tree */
> @@ -276,6 +277,9 @@ void *__radix_tree_lookup(struct radix_tree_root *root, 
> unsigned long index,
> struct radix_tree_node **nodep, void ***slotp);
>  void *radix_tree_lookup(struct radix_tree_root *, unsigned long);
>  void **radix_tree_lookup_slot(struct radix_tree_root *, unsigned long);
> +void __radix_tree_replace(struct radix_tree_root *root,
> +   struct radix_tree_node *node,
> +   void **slot, void *item);
>  bool __radix_tree_delete_node(struct radix_tree_root *root,
> struct radix_tree_node *node);
>  void *radix_tree_delete_item(struct radix_tree_root *, unsigned long, void 
> *);
> diff --git a/lib/radix-tree.c b/lib/radix-tree.c
> index 8e6d552c40dd..7885796d35ae 100644
> --- a/lib/radix-tree.c
> +++ b/lib/radix-tree.c
> @@ -220,10 +220,10 @@ static void dump_node(struct radix_tree_node *node, 
> unsigned long index)
>  {
>   unsigned long i;
>  
> - pr_debug("radix node: %p offset %d tags %lx %lx %lx shift %d count %d 
> parent %p\n",
> + pr_debug("radix node: %p offset %d tags %lx %lx %lx shift %d count %d 
> exceptional %d parent %p\n",
>   node, node->offset,
>   node->tags[0][0], node->tags[1][0], node->tags[2][0],
> - node->shift, node->count, node->parent);
> + node->shift, node->count, node->exceptional, node->parent);
>  
>   for (i = 0; i < RADIX_TREE_MAP_SIZE; i++) {
>   unsigned long first = index | (i << node->shift);
> @@ -522,8 +522,13 @@ static int radix_tree_extend(struct radix_tree_root 
> *root,
>   node->offset = 0;
>   node->count = 1;
>   node->parent = NULL;
> - if (radix_tree_is_internal_node(slot))
> +

Re: [PATCH v12 2/7] x86/arch_prctl/64: Rename do_arch_prctl to do_arch_prctl_64

2016-11-17 Thread Thomas Gleixner

On Fri, 18 Nov 2016, Ingo Molnar wrote:

> 
> * Kyle Huey  wrote:
> 
> > In order to introduce new arch_prctls that are not 64 bit only, rename the
> > existing 64 bit implementation to do_arch_prctl_64(). Also rename the second
> > argument to arch_prctl(), which will no longer always be an address.
> 
> >  #ifdef CONFIG_X86_64
> >  void entry_SYSCALL_64(void);
> > +long do_arch_prctl_64(struct task_struct *task, int code, unsigned long 
> > arg2);
> >  #endif
> 
> Could you please also rename the weirdly named 'code' argument to 'option',
> to be in line with the existing sys_prctl() interface nomenclature?

I'll fix that up when picking up the series. No need for another iteration, ok?

Thanks,

tglx

Re: [PATCH 3/9] mm: workingset: turn shadow node shrinker bugs into warnings

2016-11-17 Thread Jan Kara

On Thu 17-11-16 14:11:32, Johannes Weiner wrote:
> When the shadow page shrinker tries to reclaim a radix tree node but
> finds it in an unexpected state - it should contain no pages, and
> non-zero shadow entries - there is no need to kill the executing task
> or even the entire system. Warn about the invalid state, then leave
> that tree node be. Simply don't put it back on the shadow LRU for
> future reclaim and move on.
> 
> Signed-off-by: Johannes Weiner 

Looks good. You can add:

Reviewed-by: Jan Kara 

Honza

> ---
>  mm/workingset.c | 20 
>  1 file changed, 12 insertions(+), 8 deletions(-)
> 
> diff --git a/mm/workingset.c b/mm/workingset.c
> index 617475f529f4..3cfc61d84a52 100644
> --- a/mm/workingset.c
> +++ b/mm/workingset.c
> @@ -418,23 +418,27 @@ static enum lru_status shadow_lru_isolate(struct 
> list_head *item,
>* no pages, so we expect to be able to remove them all and
>* delete and free the empty node afterwards.
>*/
> - BUG_ON(!workingset_node_shadows(node));
> - BUG_ON(workingset_node_pages(node));
> -
> + if (WARN_ON_ONCE(!workingset_node_shadows(node)))
> + goto out_invalid;
> + if (WARN_ON_ONCE(workingset_node_pages(node)))
> + goto out_invalid;
>   for (i = 0; i < RADIX_TREE_MAP_SIZE; i++) {
>   if (node->slots[i]) {
> - BUG_ON(!radix_tree_exceptional_entry(node->slots[i]));
> + if 
> (WARN_ON_ONCE(!radix_tree_exceptional_entry(node->slots[i])))
> + goto out_invalid;
> + if (WARN_ON_ONCE(!mapping->nrexceptional))
> + goto out_invalid;
>   node->slots[i] = NULL;
>   workingset_node_shadows_dec(node);
> - BUG_ON(!mapping->nrexceptional);
>   mapping->nrexceptional--;
>   }
>   }
> - BUG_ON(workingset_node_shadows(node));
> + if (WARN_ON_ONCE(workingset_node_shadows(node)))
> + goto out_invalid;
>   inc_node_state(page_pgdat(virt_to_page(node)), WORKINGSET_NODERECLAIM);
> - if (!__radix_tree_delete_node(&mapping->page_tree, node))
> - BUG();
> + __radix_tree_delete_node(&mapping->page_tree, node);
>  
> +out_invalid:
>   spin_unlock(&mapping->tree_lock);
>   ret = LRU_REMOVED_RETRY;
>  out:
> -- 
> 2.10.2
> 
-- 
Jan Kara 
SUSE Labs, CR
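
The warn-and-bail pattern in this patch can be sketched in userspace as
follows. Everything here is an illustrative assumption: warn_once() is a
stand-in for WARN_ON_ONCE(), and struct toy_shadow_node and toy_reclaim()
are invented names that only mimic the two sanity checks at the top of
shadow_lru_isolate(). An inconsistent node is reported once and then left
alone instead of bringing the program down.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

struct toy_shadow_node {
	unsigned int pages;	/* must be 0 for a reclaimable node */
	unsigned int shadows;	/* must be non-zero for a reclaimable node */
};

/* Stand-in for WARN_ON_ONCE(): report the first hit only, return cond. */
static bool warn_once(bool cond, bool *warned, const char *what)
{
	if (cond && !*warned) {
		*warned = true;
		fprintf(stderr, "WARNING: %s\n", what);
	}
	return cond;
}

/* Returns true if the node was reclaimed, false if it was left alone. */
static bool toy_reclaim(struct toy_shadow_node *node)
{
	static bool warned_shadows, warned_pages;

	if (warn_once(node->shadows == 0, &warned_shadows,
		      "node has no shadow entries"))
		return false;
	if (warn_once(node->pages != 0, &warned_pages,
		      "node still holds pages"))
		return false;

	/* Node is in the expected state: drop all shadow entries. */
	node->shadows = 0;
	return true;
}
```

A node with pages still attached is skipped with a single warning, and
its counters are left exactly as they were.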

Re: [PATCH v12 4/7] x86/syscalls/32: Wire up arch_prctl on x86-32

2016-11-17 Thread Ingo Molnar


* Kyle Huey  wrote:

> --- a/include/linux/compat.h
> +++ b/include/linux/compat.h
> @@ -716,16 +716,18 @@ int __compat_save_altstack(compat_stack_t __user *, 
> unsigned long);
>  } while (0);
>  
>  asmlinkage long compat_sys_sched_rr_get_interval(compat_pid_t pid,
>struct compat_timespec __user 
> *interval);
>  
>  asmlinkage long compat_sys_fanotify_mark(int, unsigned int, __u32, __u32,
>   int, const char __user *);
>  
> +asmlinkage long compat_sys_arch_prctl(int, unsigned long);

Please always use prototypes with proper argument names spelled out, i.e.:

+asmlinkage long compat_sys_arch_prctl(int option, unsigned long arg2);

Thanks,

Ingo

Re: [PATCH 2/9] mm: khugepaged: fix radix tree node leak in shmem collapse error path

2016-11-17 Thread Jan Kara

On Thu 17-11-16 14:11:31, Johannes Weiner wrote:
> The radix tree counts valid entries in each tree node. Entries stored
> in the tree cannot be removed by simply storing NULL in the slot or
> the internal counters will be off and the node never gets freed again.
> 
> When collapsing a shmem page fails, restore the holes that were filled
> with radix_tree_insert() with a proper radix tree deletion.
> 
> Fixes: f3f0e1d2150b ("khugepaged: add support of collapse for tmpfs/shmem 
> pages")
> Reported-by: Jan Kara 
> Signed-off-by: Johannes Weiner 

Looks good. You can add:

Reviewed-by: Jan Kara 

Honza

> ---
>  mm/khugepaged.c | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> index bdfdab40a813..d553c294de40 100644
> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -1523,9 +1523,11 @@ static void collapse_shmem(struct mm_struct *mm,
>   if (!page || iter.index < page->index) {
>   if (!nr_none)
>   break;
> - /* Put holes back where they were */
> - radix_tree_replace_slot(slot, NULL);
>   nr_none--;
> + /* Put holes back where they were */
> + radix_tree_delete(&mapping->page_tree,
> +   iter.index);
> + slot = radix_tree_iter_next(&iter);
>   continue;
>   }
>  
> -- 
> 2.10.2
> 
-- 
Jan Kara 
SUSE Labs, CR
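
The leak described in the commit message can be reproduced with a toy
node that keeps its own entry count. This is a sketch with invented names
(struct leak_node, toy_insert(), raw_clear(), toy_delete()), not the
radix tree code: a node may only be freed once its count reaches zero, so
clearing a slot behind the counter's back leaves the node looking
occupied forever.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define LEAK_SLOTS 4

struct leak_node {
	void *slots[LEAK_SLOTS];
	unsigned int count;	/* entries the node believes it holds */
};

static void toy_insert(struct leak_node *n, unsigned int idx, void *item)
{
	n->slots[idx] = item;
	n->count++;
}

/* The buggy pattern: the slot is cleared but the count is not updated. */
static void raw_clear(struct leak_node *n, unsigned int idx)
{
	n->slots[idx] = NULL;
}

/* Proper deletion: returns true once the node is empty and can be freed. */
static bool toy_delete(struct leak_node *n, unsigned int idx)
{
	if (n->slots[idx]) {
		n->slots[idx] = NULL;
		n->count--;
	}
	return n->count == 0;
}
```

After two raw_clear() calls the node has no entries left but still
reports a count of two, so nothing would ever free it; with toy_delete()
the second removal reports that the node is empty.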

Re: [PATCH 1/9] mm: khugepaged: close use-after-free race during shmem collapsing

2016-11-17 Thread Jan Kara

On Thu 17-11-16 14:11:30, Johannes Weiner wrote:
> When a radix tree iteration drops the tree lock, another thread might
> swoop in and free the node holding the current slot. The iteration
> needs to do another tree lookup from the current index to continue.
> 
> [kirill.shute...@linux.intel.com: re-lookup for replacement]
> Fixes: f3f0e1d2150b ("khugepaged: add support of collapse for tmpfs/shmem 
> pages")
> Signed-off-by: Johannes Weiner 

The patch looks good. You can add:

Reviewed-by: Jan Kara 

Honza

> ---
>  mm/khugepaged.c | 5 +
>  1 file changed, 5 insertions(+)
> 
> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> index 728d7790dc2d..bdfdab40a813 100644
> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -1401,6 +1401,9 @@ static void collapse_shmem(struct mm_struct *mm,
>  
>   spin_lock_irq(&mapping->tree_lock);
>  
> + slot = radix_tree_lookup_slot(&mapping->page_tree, index);
> + VM_BUG_ON_PAGE(page != radix_tree_deref_slot_protected(slot,
> + &mapping->tree_lock), page);
>   VM_BUG_ON_PAGE(page_mapped(page), page);
>  
>   /*
> @@ -1424,6 +1427,7 @@ static void collapse_shmem(struct mm_struct *mm,
>   radix_tree_replace_slot(slot,
>   new_page + (index % HPAGE_PMD_NR));
>  
> + slot = radix_tree_iter_next(&iter);
>   index++;
>   continue;
>  out_lru:
> @@ -1535,6 +1539,7 @@ static void collapse_shmem(struct mm_struct *mm,
>   putback_lru_page(page);
>   unlock_page(page);
>   spin_lock_irq(&mapping->tree_lock);
> + slot = radix_tree_iter_next(&iter);
>   }
>   VM_BUG_ON(nr_none);
>   spin_unlock_irq(&mapping->tree_lock);
> -- 
> 2.10.2
> 
-- 
Jan Kara 
SUSE Labs, CR

Re: [PATCH v12 2/7] x86/arch_prctl/64: Rename do_arch_prctl to do_arch_prctl_64

2016-11-17 Thread Ingo Molnar


* Kyle Huey  wrote:

> In order to introduce new arch_prctls that are not 64 bit only, rename the
> existing 64 bit implementation to do_arch_prctl_64(). Also rename the second
> argument to arch_prctl(), which will no longer always be an address.

>  #ifdef CONFIG_X86_64
>  void entry_SYSCALL_64(void);
> +long do_arch_prctl_64(struct task_struct *task, int code, unsigned long 
> arg2);
>  #endif

Could you please also rename the weirdly named 'code' argument to 'option',
to be in line with the existing sys_prctl() interface nomenclature?

Thanks,

Ingo

[PATCH v2] ARM: Drop fixed 200 Hz timer requirement from Samsung platforms

2016-11-17 Thread Krzysztof Kozlowski

All Samsung platforms, including Exynos, select HZ_FIXED with 200 Hz.
Unfortunately, in a multiplatform image this also affects other
platforms whenever Exynos is enabled.

This looks like very old legacy code, dating back to the initial
upstreaming of S3C24xx.  It was probably required for the s3c24xx timer
driver, which was removed in commit ad38bdd15d5b ("ARM: SAMSUNG: Remove
unused plat-samsung/time.c").

Since then, this fixed 200 Hz has spread everywhere, including
out-of-tree Samsung kernels (SoC vendor's and Tizen's).  I believe this
choice was a coincidence rather than a conscious decision.

Exynos uses its own MCT or the arch timer and can work with all HZ values.
Older platforms use the newer Samsung PWM timer driver, which should
handle down to 100 Hz.

A few perf mem and sched tests on an Odroid XU3 board (Exynos5422, 4x
Cortex A7, 4x Cortex A15) show no regressions when switching from 200 Hz
to other values.

Reported-by: Lee Jones 
[Dropping 200_HZ from S3C/S5P suggested by Arnd]
Reported-by: Arnd Bergmann 
Signed-off-by: Krzysztof Kozlowski 
Cc: Kukjin Kim 
Tested-by: Javier Martinez Canillas 

---

Tested on Exynos5422 and Exynos5800 (by Javier). It would be
appreciated if anyone could test it on S3C24xx or S5PV210.

Changes since v1:
1. Add Javier's tested-by.
2. Also drop HZ_FIXED from ARCH_S5PV210 and ARCH_S3C24XX, following
   Arnd's suggestions and analysis.
---
 arch/arm/Kconfig | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index b5d529fdffab..ced2e08a9d08 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -1496,8 +1496,7 @@ source kernel/Kconfig.preempt
 
 config HZ_FIXED
int
-   default 200 if ARCH_EBSA110 || ARCH_S3C24XX || \
-   ARCH_S5PV210 || ARCH_EXYNOS4
+   default 200 if ARCH_EBSA110
default 128 if SOC_AT91RM9200
default 0
 
-- 
2.7.4
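
For readers less familiar with Kconfig: a list of `default` lines is evaluated top to bottom and the first entry whose condition holds wins. The following is a minimal userspace C sketch of how HZ_FIXED would resolve after this patch. The struct and function names are hypothetical and only model the Kconfig logic; they are not kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the platform symbols that HZ_FIXED still tests. */
struct arm_config {
	bool arch_ebsa110;
	bool soc_at91rm9200;
};

/*
 * Mirrors the patched "config HZ_FIXED" block: the first matching
 * default wins, and 0 means the tick rate is not fixed by the platform.
 */
static int hz_fixed(const struct arm_config *cfg)
{
	assert(cfg != NULL);
	if (cfg->arch_ebsa110)
		return 200;
	if (cfg->soc_at91rm9200)
		return 128;
	return 0;
}
```

In this model a configuration that enables only Samsung platforms resolves to 0, so a multiplatform image is no longer forced to 200 Hz on their account.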

[PATCH] clk: sunxi-ng: sun6i-a31: Enable PLL-MIPI LDOs when ungating it

2016-11-17 Thread Chen-Yu Tsai

The PLL-MIPI clock is somewhat special as it has its own LDOs which
need to be turned on for this PLL to actually work and output a clock
signal.

Add the 2 LDO enable bits to the gate bits. This fixes issues with
the TCON not sending vblank interrupts when the tcon and dot clock are
indirectly clocked from the PLL-MIPI clock.

Fixes: c6e6c96d8fa6 ("clk: sunxi-ng: Add A31/A31s clocks")
Signed-off-by: Chen-Yu Tsai 
---

This can be queued for either 4.9 or 4.10.

The clock driver was introduced in 4.9,
but the users won't appear until 4.10.

---
 drivers/clk/sunxi-ng/ccu-sun6i-a31.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/clk/sunxi-ng/ccu-sun6i-a31.c b/drivers/clk/sunxi-ng/ccu-sun6i-a31.c
index 4a82a49cff5e..fc75a335a7ce 100644
--- a/drivers/clk/sunxi-ng/ccu-sun6i-a31.c
+++ b/drivers/clk/sunxi-ng/ccu-sun6i-a31.c
@@ -143,7 +143,7 @@ static SUNXI_CCU_NKM_WITH_MUX_GATE_LOCK(pll_mipi_clk, "pll-mipi",
4, 2,   /* K */
0, 4,   /* M */
21, 0,  /* mux */
-   BIT(31),/* gate */
+   BIT(31) | BIT(23) | BIT(22), /* gate */
BIT(28),/* lock */
CLK_SET_RATE_UNGATE);
 
-- 
2.10.2
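
The effect of widening the gate value can be sketched in plain C. This assumes the gate field is treated as a bit mask that is set as a whole when the clock is ungated and cleared as a whole when it is gated; the function names and the in-memory register are hypothetical stand-ins, not the sunxi-ng API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BIT(n) (UINT32_C(1) << (n))

/* Gate mask from the patch: PLL enable (bit 31) plus the two LDO enables. */
#define PLL_MIPI_GATE (BIT(31) | BIT(23) | BIT(22))

/* Ungating sets every bit of the mask in one read-modify-write. */
static void pll_mipi_ungate(uint32_t *reg)
{
	assert(reg != NULL);
	*reg |= PLL_MIPI_GATE;
}

/* Gating clears the same bits, so the LDOs go off together with the PLL. */
static void pll_mipi_gate(uint32_t *reg)
{
	assert(reg != NULL);
	*reg &= ~PLL_MIPI_GATE;
}
```

With the old single-bit mask, ungating would have left bits 23 and 22 untouched, which matches the symptom described in the commit message.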

Re: [PATCH 4.8 00/92] 4.8.9-stable review

2016-11-17 Thread Greg Kroah-Hartman

On Thu, Nov 17, 2016 at 02:23:50PM -0800, Guenter Roeck wrote:
> On Thu, Nov 17, 2016 at 11:31:33AM +0100, Greg Kroah-Hartman wrote:
> > This is the start of the stable review cycle for the 4.8.9 release.
> > There are 92 patches in this series, all will be posted as a response
> > to this one.  If anyone has any issues with these being applied, please
> > let me know.
> > 
> > Responses should be made by Sat Nov 19 10:32:04 UTC 2016.
> > Anything received after that time might be too late.
> > 
> Build results:
>   total: 149 pass: 149 fail: 0
> Qemu test results:
>   total: 114 pass: 114 fail: 0
> 
> Details are available at http://kerneltests.org/builders.

Great, thanks for testing all of these and letting me know.

greg k-h

[PATCH] arm64: dts: exynos: add the mshc_2 node for supporting T-Flash

2016-11-17 Thread Jaehoon Chung

Add the mshc_2 node to support T-Flash.

The "mshc*" aliases also need to be added, because the dwmmc driver
assigns "ctrl_id" by parsing the "mshc" alias.
If there are no aliases for mshc, the controller might be set up with
the wrong capabilities.

Signed-off-by: Jaehoon Chung 
---
 arch/arm64/boot/dts/exynos/exynos5433-tm2.dts | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/arch/arm64/boot/dts/exynos/exynos5433-tm2.dts b/arch/arm64/boot/dts/exynos/exynos5433-tm2.dts
index 9ea3f32..75dab01 100644
--- a/arch/arm64/boot/dts/exynos/exynos5433-tm2.dts
+++ b/arch/arm64/boot/dts/exynos/exynos5433-tm2.dts
@@ -42,6 +42,8 @@
spi2 = &spi_2;
spi3 = &spi_3;
spi4 = &spi_4;
+   mshc0 = &mshc_0;
+   mshc2 = &mshc_2;
};
 
chosen {
@@ -661,6 +663,23 @@
assigned-clock-rates = <8>;
 };
 
+&mshc_2 {
+   status = "okay";
+   num-slots = <1>;
+   cap-sd-highspeed;
+   disable-wp;
+   cd-gpios = < 4 0>;
+   cd-inverted;
+   card-detect-delay = <200>;
+   samsung,dw-mshc-ciu-div = <3>;
+   samsung,dw-mshc-sdr-timing = <0 4>;
+   samsung,dw-mshc-ddr-timing = <0 2>;
+   fifo-depth = <0x80>;
+   pinctrl-names = "default";
+   pinctrl-0 = <_clk _cmd _bus1 _bus4>;
+   bus-width = <4>;
+};
+
 _alive {
pinctrl-names = "default";
pinctrl-0 = <_alive>;
-- 
2.10.1
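
The commit message says the controller index is derived from the "mshc" alias. As a rough illustration of that kind of lookup, here is a userspace C sketch that maps an alias name to its numeric suffix. The helper `alias_id()` is hypothetical; it only imitates the idea of an alias-to-id lookup and is not the kernel's device tree API.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: return the numeric suffix of an alias such as
 * "mshc2" when it starts with the given stem, or -1 if it does not match.
 */
static int alias_id(const char *alias, const char *stem)
{
	size_t n;

	assert(alias != NULL && stem != NULL);
	n = strlen(stem);
	if (strncmp(alias, stem, n) != 0)
		return -1;
	if (!isdigit((unsigned char)alias[n]))
		return -1;
	return atoi(alias + n);
}
```

Without an mshc alias there is nothing for such a lookup to match, which is the situation the commit message warns about.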

Re: [PATCH/RFC] add "failfast" support for raid1/raid10.

2016-11-17 Thread Hannes Reinecke

(Seeing that it was me who initiated those patches I guess I should
speak up here)

On 11/18/2016 06:16 AM, NeilBrown wrote:
> Hi,
> 
>  I've been sitting on these patches for a while because although they
>  solve a real problem, it is a fairly limited use-case, and I don't
>  really like some of the details.
> 
>  So I'm posting them as RFC in the hope that a different perspective
>  might help me like them better, or find a better approach.
> 
[ .. ]
>  My two main concerns are:
>   - does this functionality have any use-case outside of mirrored
> storage arrays, and are there other storage arrays which
> occasionally inserted excessive latency (seems like a serious
> misfeature to me, but I know few of the details)?
Yes, there are.
I've come across some storage arrays which really take some liberty when
doing internal error recovery; some even take up to 20 minutes
before sending a command completion (the response was "there's nothing
in the SCSI spec which forbids us to do so")

>   - would it be at all possible to have "real" failfast functionality
> in the block layer?  I.e. something that is based on time rather
> than retry count.  Maybe in some cases a retry would be
> appropriate if the first failure was very fast.
> I.e. it would reduce timeouts and decide on retries based on
> elapsed time rather than number of attempts.
> With this would come the question of "how fast is fast" and I
> don't have a really good answer.  Maybe md would need to set a
> timeout, which it would double whenever it got failures on all
> drives.  Otherwise the timeout would drift towards (say) 10 times
> the typical response time.
> 
The current 'failfast' is rather a 'do not attempt error recovery' flag;
i.e. the SCSI stack should _not_ start error recovery but rather pass the
request upwards in case of failure.
The problem is that there is no real upper limit on the time error
recovery could take, and it's virtually impossible to give any I/O
response time guarantees once error recovery has been invoked.
And to make matters worse, in most cases error recovery won't work
_anyway_ if the transport is severed.
So this has more to do with error recovery than with the time
each request can/should spend in flight.

The S/390 DASD case is even worse, as the DASD driver _by design_ will
always have to wait for an answer from the storage array. So if the link
to the array is severed you are in deep trouble, as you'll never get a
completion (or any status, for that matter) until the array is reconnected.

So while the FAILFAST flag is a mere convenience for SCSI, it's a
positive must for S/390 if you want to have a functional RAID.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
h...@suse.de   +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)
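
The time-based policy floated in the quoted RFC text (double the timeout when all drives fail, otherwise let it drift towards roughly 10 times the typical response time) could be sketched as below. This is only an illustration under stated assumptions: the function name is hypothetical, and the step of one eighth of the remaining distance per update is an arbitrary choice that the thread does not specify.

```c
#include <assert.h>

/*
 * Hypothetical policy: double the timeout when every drive failed,
 * otherwise move one eighth of the way (rounded up) towards 10x the
 * typical latency. The 1/8 step is an assumption for illustration.
 */
static unsigned long next_timeout_ms(unsigned long timeout_ms,
				     unsigned long typical_ms,
				     int all_drives_failed)
{
	unsigned long target = 10 * typical_ms;

	assert(timeout_ms > 0);
	if (all_drives_failed)
		return timeout_ms * 2;
	if (timeout_ms > target)
		return timeout_ms - (timeout_ms - target + 7) / 8;
	return timeout_ms + (target - timeout_ms + 7) / 8;
}
```

Rounding the step up means the value always reaches the target instead of stalling a few milliseconds short of it. None of this addresses the point made above that error recovery itself has no upper time bound.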

Re: [PATCH] mm: support anonymous stable page

2016-11-17 Thread Minchan Kim

Hi Hugh,

On Thu, Nov 17, 2016 at 08:35:10PM -0800, Hugh Dickins wrote:
> On Fri, 11 Nov 2016, Minchan Kim wrote:
> > Sorry for sending a wrong version. Here is new one.
> > 
> > From 2d42ead9335cde51fd58d6348439ca03cf359ba2 Mon Sep 17 00:00:00 2001
> > From: Minchan Kim 
> > Date: Fri, 11 Nov 2016 15:02:57 +0900
> > Subject: [PATCH] mm: support anonymous stable page
> > 
> > While developing zram-swap asynchronous writeback, I found
> > strange corruption of compressed pages. Investigation revealed
> > that stable pages currently do not support anonymous pages.
> > IOW, reuse_swap_page can reuse the page without waiting for
> > writeback completion, so it can corrupt data during
> > zram compression. It can affect every swap device which supports
> > asynchronous writeback and CRC checking, as well as zRAM.
> > 
> > Unfortunately, reuse_swap_page must be atomic, so we
> > cannot wait on writeback there; the approach in this patch
> > is simply to return false if we find the page needs to be stable.
> > Although this increases the memory footprint temporarily, it happens
> > rarely and the memory should be easily reclaimed when it does.
> > Also, it is better than waiting for IO completion, which
> > is on the critical path for application latency.
> > 
> > Cc: Hugh Dickins 
> > Cc: Darrick J. Wong 
> > Signed-off-by: Minchan Kim 
> 
> Ack to your intention (we discussed this together years ago, but saw
> no actual demand for it before now), and I like what you're doing;
> but it has to be NAK to this implementation.
> 
> I sensed there was a problem when you posted; but only now, after
> searching through the uses of mapping->host, do I see that problem.
> 
> You're setting swap's mapping->host = inode when it used to be NULL:
> which seems like a very good way to get what you need, but I'm afraid
> it's a change which goes way beyond your intention.
> 
> See inode_to_bdi(): for ordinary disk-based swap, it will now pick
> up the bdi of the block device instead of noop_backing_dev_info, so
> swap would then pass the mapping_cap_account_dirty() and similar
> tests (mostly in mm/page-writeback.c), and go down codepaths it
> has never gone down before.
> 
> It's possible that swap (and shmem) would be better off going down
> those paths, to be throttled in a similar way to files; but that's
> debatable, and a much bigger change than you want to get into for
> zram stable pages.

Good point.
Thanks for the review, Hugh.

> 
> Maybe add SWP_STABLE_WRITES in include/linux/swap.h, and set that
> in swap_info->flags according to bdi_cap_stable_pages_required(),
> leaving mapping->host itself NULL as before?

The problem with that approach is that we need to get the
swap_info_struct in reuse_swap_page, so every caller would have to pass
a swp_entry_t into reuse_swap_page. That would be no problem if the swap
slot really references the page (IOW, the pte is a real swp_entry_t), but
in cases where the swap slot is already empty and the page remains only
in the swap cache, we cannot pass a swp_entry_t, which means that we
cannot get the swap_info_struct.

So, if I didn't miss anything, another option I can imagine is to move
SWP_STABLE_WRITES to address_space->flags as AS_STABLE_WRITES.
With that, we can always get the information without passing
swp_entry_t. Is there any better idea?
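
A userspace model of the idea may help. It is independent of the kernel diff in this message and makes two assumptions: the structure and `can_reuse()` are hypothetical stand-ins, and plain bit operations are used where the kernel would use atomic set_bit()/test_bit(). The decision rule follows the commit message: refuse reuse while a page that must stay stable is under writeback.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { AS_STABLE_WRITES = 6 };	/* bit number proposed in the diff */

/* Hypothetical stand-in for struct address_space. */
struct address_space_model {
	unsigned long flags;
};

static void mapping_set_stable(struct address_space_model *m)
{
	m->flags |= 1UL << AS_STABLE_WRITES;
}

static bool mapping_stable(const struct address_space_model *m)
{
	return (m->flags & (1UL << AS_STABLE_WRITES)) != 0;
}

/*
 * Assumed decision: a page with a single user may be reused unless it is
 * still under writeback to a device that needs stable pages.
 */
static bool can_reuse(const struct address_space_model *m, int count,
		      bool under_writeback)
{
	assert(m != NULL);
	if (count != 1)
		return false;
	if (under_writeback && mapping_stable(m))
		return false;
	return true;
}
```

Because the flag lives on the mapping, the check needs no swp_entry_t, which is the point of moving it out of swap_info_struct.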


diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index dd15d39e1985..5397e82bfd57 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -26,6 +26,8 @@ enum mapping_flags {
AS_EXITING  = 4,/* final truncate in progress */
/* writeback related tags are not used */
AS_NO_WRITEBACK_TAGS = 5,
+   /* need stable write for swap */
+   AS_STABLE_WRITES = 6,
 };
 
 static inline void mapping_set_error(struct address_space *mapping, int error)
@@ -55,6 +57,21 @@ static inline int mapping_unevictable(struct address_space *mapping)
return !!mapping;
 }
 
+static inline void mapping_set_stable(struct address_space *mapping)
+{
+   set_bit(AS_STABLE_WRITES, &mapping->flags);
+}
+
+static inline void mapping_clear_stable(struct address_space *mapping)
+{
+   clear_bit(AS_STABLE_WRITES, &mapping->flags);
+}
+
+static inline int mapping_stable(struct address_space *mapping)
+{
+   return test_bit(AS_STABLE_WRITES, &mapping->flags);
+}
+
 static inline void mapping_set_exiting(struct address_space *mapping)
 {
set_bit(AS_EXITING, &mapping->flags);
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 2210de290b54..0c31fd814933 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -943,11 +943,20 @@ bool reuse_swap_page(struct page *page, int *total_mapcount)
count = page_trans_huge_mapcount(page, total_mapcount);
if (count <= 1 && PageSwapCache(page)) {
count += page_swapcount(page);
-   if (count == 1 && !PageWriteback(page)) {
+   if (count != 1)
+   goto out;
+   if (!PageWriteback(page)) {
delete_from_swap_cache(page);

Re: [PATCH v4 05/10] IB/isert: Replace semaphore sem with completion

2016-11-17 Thread Binoy Jayan

Hi Sagi,

On 31 October 2016 at 02:42, Sagi Grimberg  wrote:
>> The semaphore 'sem' in isert_device is used as completion, so convert
>> it to struct completion. Semaphores are going away in the future.
>
>
> Umm, this is 100% *not* true. np->sem is designed as a counting to
> sync the iscsi login thread with the connect requests coming from the
> initiators. So this is actually a reliable bug insertion :(
>
> NAK from me...

Sorry for the late reply as I was held up in other activities.

I converted this to a wait_event() implementation but as I was doing it,
I was wondering how it would have been different if it was a completion
and not a semaphore.

File: drivers/infiniband/ulp/isert/ib_isert.c

If isert_connected_handler() is called multiple times, each call adding an
entry to the list, and we use a completion, 'done' (part of struct
completion) would be incremented by 1 each time 'complete' is called from
isert_connected_handler. After 'n' iterations, done will be equal to 'n'. If we
call wait_for_completion now from isert_accept_np, it would just decrement
'done' by one and continue without blocking, consuming one node at a time
from the list 'isert_np->pending'.

Alternatively, if "done" becomes zero and wait_for_completion is called
again, the API would add a node at the end of the wait queue 'wait' in 'struct
completion' and block until "done" is nonzero. (Ref: do_wait_for_common)
It exits the wait when a call to 'complete' turns 'done' back to 1.
But if there are multiple waits called before calling complete, all the
tasks calling the wait get queued up and they would all see "done" set to
zero. When complete is called now, done turns 1 again and the first task in
the queue is woken up, as wakeups are serialized FIFO. Now the first wait
returns and done is decremented by 1 just before the return.

Am I missing something here?

Thanks,
Binoy
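
The counting behaviour described above can be modelled in a few lines of userspace C. This is a single-threaded sketch with no wait queue or locking; the type and function names are hypothetical, and the "wait" is a non-blocking stand-in that reports where the real call would sleep.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model of the 'done' counter in struct completion. */
struct completion_model {
	unsigned int done;
};

/* Stand-in for complete(): each call banks one wakeup. */
static void model_complete(struct completion_model *c)
{
	assert(c != NULL);
	c->done++;
}

/*
 * Non-blocking stand-in for wait_for_completion(): consume one banked
 * wakeup and return true, or return false where the real call would sleep.
 */
static bool model_try_wait(struct completion_model *c)
{
	assert(c != NULL);
	if (c->done == 0)
		return false;
	c->done--;
	return true;
}
```

In this model, 'n' calls to model_complete() allow exactly 'n' waits to pass without blocking, which is the semaphore-like behaviour the question is about.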

[PATCH] sched/rt: Change rt_nr_running to rt_queued in the comment

2016-11-17 Thread T.Zhou

The code actually checks rt_queued, not rt_nr_running,
in pick_next_task_rt(), so change the corresponding
comment.

Signed-off-by: T.Zhou 
---
 kernel/sched/rt.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 2516b8d..9b4a5c5 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -1550,7 +1550,7 @@ pick_next_task_rt(struct rq *rq, struct task_struct *prev, struct pin_cookie coo
 
/*
 * We may dequeue prev's rt_rq in put_prev_task().
-* So, we update time before rt_nr_running check.
+* So, we update time before rt_queued check.
 */
if (prev->sched_class == &rt_sched_class)
update_curr_rt(rq);
-- 
2.7.3

Re: [PATCH] ARM: Drop fixed 200 Hz timer requirement from Exynos platforms

2016-11-17 Thread Krzysztof Kozlowski

On Thu, Nov 17, 2016 at 01:35:45PM +0100, Arnd Bergmann wrote:
> On Monday, November 14, 2016 8:27:05 PM CET Krzysztof Kozlowski wrote:
> > @@ -1497,7 +1497,7 @@ source kernel/Kconfig.preempt
> >  config HZ_FIXED
> > int
> > default 200 if ARCH_EBSA110 || ARCH_S3C24XX || \
> > -   ARCH_S5PV210 || ARCH_EXYNOS4
> > +   ARCH_S5PV210
> > default 128 if SOC_AT91RM9200
> > default 0
> 
> After further research, I've concluded that we should also drop the
> settings for ARCH_S5PV210 and ARCH_S3C24XX here.
> 
> ARCH_S5PV210 behaves exactly like EXYNOS here, it has 32-bit timers
> so there won't be any overflow with 100Hz.
> 
> For ARCH_S3C24XX, the requirement was that HZ_100 could not
> be used with the old arch/arm/plat-samsung/time.c code that would
> overflow its 16-bit counter.
> However, the new drivers/clocksource/samsung_pwm_timer.c configures
> the clock divider to '50' instead of '6', so there is no longer
> a 16-bit overflow before the 100Hz tick, it now overflows every
> 3.7ms for the typical 12MHz clock.

I can send an updated version; however, testing would be nice... I know
Sylwester has an S3C6410 platform running, maybe S3C24xx as well.

Best regards,
Krzysztof

Re: [PATCH 1/3] arm: hisi: add ARCH_MULTI_V5 support

2016-11-17 Thread Jiancheng Xue

Hi Marty,

On 2016/11/17 11:03, Jiancheng Xue wrote:
> Hi Wei,
> 
> On 2016/11/16 17:31, Wei Xu wrote:
>> Hi Pan,
>>
>> On 2016/11/16 8:56, wenpan wrote:
>>> Hi Marty,
>>> Does this conflict with your patch? If not, I hope this could be merged
>>> first. Besides, could you tell me the link to your related patch?
>>
>> This is the link: https://patchwork.kernel.org/patch/9334743/
>>

Could you give your comments on this patch?
If you have any objections to it, please let us know.

> 
> Thank you for offering this. If I want to give some comments on Marty's patch,
> what should I do?
> 
> For Marty's patch, I think there's no need to add a specific config item
> ARCH_HI for every chipset. Some existing chipsets, like Hi3519 and
> Hi3798CV200, depend on ARCH_HISI directly. If options like ARM_GIC are
> removed from ARCH_HISI, such chipsets will have to select them somewhere
> else. I suggest we keep selecting ARM_GIC under ARCH_HISI, as Pan's patch
> does.
> 
> The code may be like this:
> 
> config ARCH_HISI
>   bool "Hisilicon SoC Support"
> - depends on ARCH_MULTI_V7
> + depends on ARCH_MULTI_V5 || ARCH_MULTI_V6 || ARCH_MULTI_V7
>   select ARM_AMBA
> - select ARM_GIC
> + select ARM_GIC if ARCH_MULTI_V7
> + select ARM_VIC if ARCH_MULTI_V5 || ARCH_MULTI_V6
>   select ARM_TIMER_SP804
>   select POWER_RESET
>   select POWER_RESET_HISI
>   select POWER_SUPPLY
> 

What's your opinion about this?

Best Regards,
Jiancheng

>>> On 2016/10/17 21:48, Arnd Bergmann wrote:
 On Monday, October 17, 2016 8:07:03 PM CEST Pan Wen wrote:
> Add support for some HiSilicon SoCs which depend on ARCH_MULTI_V5.
>
> Signed-off-by: Pan Wen 
>

 Looks ok. I've added Marty Plummer to Cc, he was recently proposing
 patches for Hi3520, which I think is closely related to this one.
 Please try to work together so the patches don't conflict. It should
 be fairly straightforward since you are basically doing the same
 change here.


[PATCH] sched/dl: Delete the argument flags of __dequeue_task_dl()

2016-11-17 Thread T.Zhou

@flags is not used there, so delete it.

Signed-off-by: T.Zhou 
---
 kernel/sched/deadline.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index c61b461..f276a81 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -318,7 +318,7 @@ static inline void queue_pull_task(struct rq *rq)
 #endif /* CONFIG_SMP */
 
 static void enqueue_task_dl(struct rq *rq, struct task_struct *p, int flags);
-static void __dequeue_task_dl(struct rq *rq, struct task_struct *p, int flags);
+static void __dequeue_task_dl(struct rq *rq, struct task_struct *p);
 static void check_preempt_curr_dl(struct rq *rq, struct task_struct *p,
  int flags);
 
@@ -744,7 +744,7 @@ static void update_curr_dl(struct rq *rq)
 throttle:
 	if (dl_runtime_exceeded(dl_se) || dl_se->dl_yielded) {
 		dl_se->dl_throttled = 1;
-		__dequeue_task_dl(rq, curr, 0);
+		__dequeue_task_dl(rq, curr);
 		if (unlikely(dl_se->dl_boosted || !start_dl_timer(curr)))
 			enqueue_task_dl(rq, curr, ENQUEUE_REPLENISH);
 
@@ -962,7 +962,7 @@ static void enqueue_task_dl(struct rq *rq, struct task_struct *p, int flags)
 	enqueue_pushable_dl_task(rq, p);
 }
 
-static void __dequeue_task_dl(struct rq *rq, struct task_struct *p, int flags)
+static void __dequeue_task_dl(struct rq *rq, struct task_struct *p)
 {
 	dequeue_dl_entity(&p->dl);
 	dequeue_pushable_dl_task(rq, p);
@@ -971,7 +971,7 @@ static void __dequeue_task_dl(struct rq *rq, struct task_struct *p, int flags)
 static void dequeue_task_dl(struct rq *rq, struct task_struct *p, int flags)
 {
 	update_curr_dl(rq);
-	__dequeue_task_dl(rq, p, flags);
+	__dequeue_task_dl(rq, p);
 }
 
 /*
-- 
2.7.3

[PATCH 4/5] x86: remove x86_test_and_clear_bit_percpu()

2016-11-17 Thread Len Brown

From: Len Brown 

Upon removal of the "is_idle" flag, x86_test_and_clear_bit_percpu()
is no longer used.

Signed-off-by: Len Brown 
Acked-by: Peter Zijlstra (Intel) 
---
 arch/x86/include/asm/percpu.h | 11 -----------
 1 file changed, 11 deletions(-)

diff --git a/arch/x86/include/asm/percpu.h b/arch/x86/include/asm/percpu.h
index 84f58de08c2b..9fa03604b2b3 100644
--- a/arch/x86/include/asm/percpu.h
+++ b/arch/x86/include/asm/percpu.h
@@ -507,17 +507,6 @@ do {									\
 
 #endif
 
-/* This is not atomic against other CPUs -- CPU preemption needs to be off */
-#define x86_test_and_clear_bit_percpu(bit, var)				\
-({									\
-	bool old__;							\
-	asm volatile("btr %2,"__percpu_arg(1)"\n\t"			\
-		     CC_SET(c)						\
-		     : CC_OUT(c) (old__), "+m" (var)			\
-		     : "dIr" (bit));					\
-	old__;								\
-})
-
 static __always_inline bool x86_this_cpu_constant_test_bit(unsigned int nr,
 const unsigned long __percpu *addr)
 {
-- 
2.11.0.rc1
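
For readers unfamiliar with the removed macro: it used the x86 "btr"
instruction to clear one bit of a per-CPU variable and return the bit's
previous value. A minimal portable sketch of that behaviour follows, assuming
an ordinary (not per-CPU) unsigned char and a bit index below 8. The name
test_and_clear_bit_sketch is hypothetical, and like the original it is not
atomic against other CPUs.

```c
#include <stdbool.h>

/*
 * Hypothetical, portable stand-in for the removed macro: clear bit 'bit'
 * (0..7) in *var and report whether it was set beforehand.
 * Not atomic, and not per-CPU.
 */
static bool test_and_clear_bit_sketch(unsigned int bit, unsigned char *var)
{
	unsigned char mask = (unsigned char)(1u << bit);
	bool old = (*var & mask) != 0;

	*var = (unsigned char)(*var & ~mask);
	return old;
}
```

With the is_idle flag gone, nothing needs this read-modify-write any more,
which is why the helper can be dropped.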

[PATCH 2/5] x86: remove idle_notifier

2016-11-17 Thread Len Brown

From: Len Brown 

Upon removal of the i7300_idle driver, the idle_notifier is unused.

Signed-off-by: Len Brown 
Acked-by: Peter Zijlstra (Intel) 
---
 arch/x86/include/asm/idle.h |  7 -------
 arch/x86/kernel/process.c   | 15 ---------------
 2 files changed, 22 deletions(-)

diff --git a/arch/x86/include/asm/idle.h b/arch/x86/include/asm/idle.h
index c5d1785373ed..02bab09707f2 100644
--- a/arch/x86/include/asm/idle.h
+++ b/arch/x86/include/asm/idle.h
@@ -1,13 +1,6 @@
 #ifndef _ASM_X86_IDLE_H
 #define _ASM_X86_IDLE_H
 
-#define IDLE_START 1
-#define IDLE_END 2
-
-struct notifier_block;
-void idle_notifier_register(struct notifier_block *n);
-void idle_notifier_unregister(struct notifier_block *n);
-
 #ifdef CONFIG_X86_64
 void enter_idle(void);
 void exit_idle(void);
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 0888a879120f..f51950715145 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -67,19 +67,6 @@ EXPORT_PER_CPU_SYMBOL(cpu_tss);
 
 #ifdef CONFIG_X86_64
 static DEFINE_PER_CPU(unsigned char, is_idle);
-static ATOMIC_NOTIFIER_HEAD(idle_notifier);
-
-void idle_notifier_register(struct notifier_block *n)
-{
-	atomic_notifier_chain_register(&idle_notifier, n);
-}
-EXPORT_SYMBOL_GPL(idle_notifier_register);
-
-void idle_notifier_unregister(struct notifier_block *n)
-{
-	atomic_notifier_chain_unregister(&idle_notifier, n);
-}
-EXPORT_SYMBOL_GPL(idle_notifier_unregister);
 #endif
 
 /*
@@ -255,14 +242,12 @@ static inline void play_dead(void)
 void enter_idle(void)
 {
 	this_cpu_write(is_idle, 1);
-	atomic_notifier_call_chain(&idle_notifier, IDLE_START, NULL);
 }
 
 static void __exit_idle(void)
 {
 	if (x86_test_and_clear_bit_percpu(0, is_idle) == 0)
 		return;
-	atomic_notifier_call_chain(&idle_notifier, IDLE_END, NULL);
 }
 
 /* Called from interrupts to signify idle end */
-- 
2.11.0.rc1
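
To illustrate what the removed hooks did on every idle entry and exit, here
is a minimal userspace model of a notifier chain. It is a sketch under stated
assumptions only: model_notifier, model_register, model_unregister,
model_call_chain and model_record are hypothetical names, the list is
unlocked and unordered, and the kernel's atomic notifier chains additionally
handle priorities and concurrent readers.

```c
#include <stddef.h>

#define IDLE_START 1
#define IDLE_END   2

/* Hypothetical userspace model of a notifier block; not the kernel type. */
struct model_notifier {
	int (*call)(struct model_notifier *nb, unsigned long action, void *data);
	struct model_notifier *next;
};

static struct model_notifier *model_idle_chain;
static unsigned long model_last_action;

/* Add a subscriber at the head of the chain. */
static void model_register(struct model_notifier *n)
{
	n->next = model_idle_chain;
	model_idle_chain = n;
}

/* Unlink a subscriber if it is on the chain. */
static void model_unregister(struct model_notifier *n)
{
	struct model_notifier **pp;

	for (pp = &model_idle_chain; *pp; pp = &(*pp)->next) {
		if (*pp == n) {
			*pp = n->next;
			n->next = NULL;
			return;
		}
	}
}

/* Invoke every subscriber; returns how many callbacks ran. */
static int model_call_chain(unsigned long action, void *data)
{
	struct model_notifier *n;
	int calls = 0;

	for (n = model_idle_chain; n; n = n->next) {
		n->call(n, action, data);
		calls++;
	}
	return calls;
}

/* Example subscriber: remembers the last action it was told about. */
static int model_record(struct model_notifier *nb, unsigned long action,
			void *data)
{
	(void)nb;
	(void)data;
	model_last_action = action;
	return 0;
}
```

Once the only subscriber is gone, every walk of the chain is pure overhead on
the idle path, which is the motivation given in the cover letter for removing
the notifier.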

[PATCH 5/5] x86: remove enter_idle(), exit_idle()

2016-11-17 Thread Len Brown

From: Len Brown 

Upon removal of the is_idle flag, these routines became NOPs.

Signed-off-by: Len Brown 
Acked-by: Peter Zijlstra (Intel) 
---
 arch/x86/include/asm/apic.h      |  1 -
 arch/x86/include/asm/idle.h      |  9 ---------
 arch/x86/kernel/kvm.c            |  2 --
 arch/x86/kernel/process.c        | 25 -------------------------
 drivers/xen/events/events_base.c |  1 -
 5 files changed, 38 deletions(-)

diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
index f5aaf6c83222..5731274bfdba 100644
--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -639,7 +639,6 @@ extern void irq_exit(void);
 static inline void entering_irq(void)
 {
irq_enter();
-   exit_idle();
 }
 
 static inline void entering_ack_irq(void)
diff --git a/arch/x86/include/asm/idle.h b/arch/x86/include/asm/idle.h
index 02bab09707f2..dcebb1c634f1 100644
--- a/arch/x86/include/asm/idle.h
+++ b/arch/x86/include/asm/idle.h
@@ -1,15 +1,6 @@
 #ifndef _ASM_X86_IDLE_H
 #define _ASM_X86_IDLE_H
 
-#ifdef CONFIG_X86_64
-void enter_idle(void);
-void exit_idle(void);
-#else /* !CONFIG_X86_64 */
-static inline void enter_idle(void) { }
-static inline void exit_idle(void) { }
-static inline void __exit_idle(void) { }
-#endif /* CONFIG_X86_64 */
-
 void amd_e400_remove_cpu(int cpu);
 
 #endif /* _ASM_X86_IDLE_H */
diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index edbbfc854e39..093f550f372d 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -267,13 +267,11 @@ do_async_page_fault(struct pt_regs *regs, unsigned long error_code)
case KVM_PV_REASON_PAGE_NOT_PRESENT:
/* page is swapped out by the host. */
prev_state = exception_enter();
-   exit_idle();
kvm_async_pf_task_wait((u32)read_cr2());
exception_exit(prev_state);
break;
case KVM_PV_REASON_PAGE_READY:
rcu_irq_enter();
-   exit_idle();
kvm_async_pf_task_wake((u32)read_cr2());
rcu_irq_exit();
break;
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index d8e9d794e114..ee023919e476 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -234,34 +234,9 @@ static inline void play_dead(void)
 }
 #endif
 
-#ifdef CONFIG_X86_64
-void enter_idle(void)
-{
-}
-
-static void __exit_idle(void)
-{
-}
-
-/* Called from interrupts to signify idle end */
-void exit_idle(void)
-{
-   /* idle loop has pid 0 */
-   if (current->pid)
-   return;
-   __exit_idle();
-}
-#endif
-
 void arch_cpu_idle_enter(void)
 {
local_touch_nmi();
-   enter_idle();
-}
-
-void arch_cpu_idle_exit(void)
-{
-   __exit_idle();
 }
 
 void arch_cpu_idle_dead(void)
diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c
index 9ecfcdcdd6d6..9ad622ab05dc 100644
--- a/drivers/xen/events/events_base.c
+++ b/drivers/xen/events/events_base.c
@@ -1256,7 +1256,6 @@ void xen_evtchn_do_upcall(struct pt_regs *regs)
 
irq_enter();
 #ifdef CONFIG_X86
-   exit_idle();
inc_irq_stat(irq_hv_callback_count);
 #endif
 
-- 
2.11.0.rc1

[PATCH 1/5] i7300_idle: remove this driver

2016-11-17 Thread Len Brown

From: Len Brown 

In preparation for removing the idle_notifier,
remove its only user, the i7300_idle driver.

i7300_idle was deployed in 2008 to reduce idle memory power
on systems using the i7300 chipset.  The driver worked by throttling
the fully-buffered DIMMs during idle periods using the IOAT DMA engine.

The driver ran only on the i7300 chip-set, and no other hardware
has used this mechanism.  The driver no longer has a maintainer.

Removing this driver will increase idle power on i7300 systems
when they run the new kernel without the driver.

Signed-off-by: Len Brown 
Acked-by: Peter Zijlstra (Intel) 
---
 MAINTAINERS  |   6 -
 drivers/dma/ioat/registers.h |   2 -
 drivers/idle/Kconfig |  17 --
 drivers/idle/Makefile|   1 -
 drivers/idle/i7300_idle.c| 612 ---
 5 files changed, 638 deletions(-)
 delete mode 100644 drivers/idle/i7300_idle.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 2a58eeac9452..3cdccf5b64f3 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -6087,12 +6087,6 @@ S:   Maintained
 F: Documentation/cdrom/ide-cd
 F: drivers/ide/ide-cd*
 
-IDLE-I7300
-M: Andy Henroid 
-L: linux...@vger.kernel.org
-S: Supported
-F: drivers/idle/i7300_idle.c
-
 IEEE 802.15.4 SUBSYSTEM
 M: Alexander Aring 
 L: linux-w...@vger.kernel.org
diff --git a/drivers/dma/ioat/registers.h b/drivers/dma/ioat/registers.h
index 48fa4cf9f64a..2f3bbc88ff2a 100644
--- a/drivers/dma/ioat/registers.h
+++ b/drivers/dma/ioat/registers.h
@@ -106,8 +106,6 @@
 #define IOAT_DMA_COMP_V1	0x0001	/* Compatibility with DMA version 1 */
 #define IOAT_DMA_COMP_V2	0x0002	/* Compatibility with DMA version 2 */
 
-/* IOAT1 define left for i7300_idle driver to not fail compiling */
-#define IOAT1_CHANSTS_OFFSET	0x04
 #define IOAT_CHANSTS_OFFSET	0x08	/* 64-bit Channel Status Register */
 #define IOAT_CHANSTS_COMPLETED_DESCRIPTOR_ADDR (~0x3fULL)
 #define IOAT_CHANSTS_SOFT_ERR  0x10ULL
diff --git a/drivers/idle/Kconfig b/drivers/idle/Kconfig
index 4732dfc15447..55bcf803841e 100644
--- a/drivers/idle/Kconfig
+++ b/drivers/idle/Kconfig
@@ -8,20 +8,3 @@ config INTEL_IDLE
  native Intel hardware idle features.  The acpi_idle driver
  can be configured at the same time, in order to handle
  processors intel_idle does not support.
-
-menu "Memory power savings"
-depends on X86_64
-
-config I7300_IDLE_IOAT_CHANNEL
-   bool
-
-config I7300_IDLE
-   tristate "Intel chipset idle memory power saving driver"
-   select I7300_IDLE_IOAT_CHANNEL
-   help
- Enable memory power savings when idle with certain Intel server
- chipsets. The chipset must have I/O AT support, such as the
- Intel 7300. The power savings depends on the type and quantity of
- DRAM devices.
-
-endmenu
diff --git a/drivers/idle/Makefile b/drivers/idle/Makefile
index 23d295cf10f2..0007111d73e9 100644
--- a/drivers/idle/Makefile
+++ b/drivers/idle/Makefile
@@ -1,3 +1,2 @@
-obj-$(CONFIG_I7300_IDLE)   += i7300_idle.o
 obj-$(CONFIG_INTEL_IDLE)   += intel_idle.o
 
diff --git a/drivers/idle/i7300_idle.c b/drivers/idle/i7300_idle.c
deleted file mode 100644
index ffeebc7e9f1c..
--- a/drivers/idle/i7300_idle.c
+++ /dev/null
@@ -1,612 +0,0 @@
-/*
- * (C) Copyright 2008 Intel Corporation
- * Authors:
- * Andy Henroid 
- * Venkatesh Pallipadi 
- */
-
-/*
- * Save DIMM power on Intel 7300-based platforms when all CPUs/cores
- * are idle, using the DIMM thermal throttling capability.
- *
- * This driver depends on the Intel integrated DMA controller (I/O AT).
- * If the driver for I/O AT (drivers/dma/ioatdma*) is also enabled,
- * this driver should work cooperatively.
- */
-
-/* #define DEBUG */
-
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-
-#include 
-
-#include "../dma/ioat/hw.h"
-#include "../dma/ioat/registers.h"
-
-#define I7300_IDLE_DRIVER_VERSION  "1.55"
-#define I7300_PRINT"i7300_idle:"
-
-#define MAX_STOP_RETRIES   10
-
-static int debug;
-module_param_named(debug, debug, uint, 0644);
-MODULE_PARM_DESC(debug, "Enable debug printks in this driver");
-
-static int forceload;
-module_param_named(forceload, forceload, uint, 0644);
-MODULE_PARM_DESC(debug, "Enable driver testing on unvalidated i5000");
-
-#define dprintk(fmt, arg...) \
-   do { if (debug) printk(KERN_INFO I7300_PRINT fmt, ##arg); } while (0)
-
-/*
- * Value to set THRTLOW to when initiating throttling
- *  0 = No throttling
- *  1 = Throttle when > 4 activations per eval window (Maximum throttling)
- *  2 = Throttle when > 8 activations
- *  168 = Throttle when > 672 activations (Minimum throttling)
- */
-#define MAX_THROTTLE_LOW_LIMIT 168
-static uint throttle_low_limit = 1;
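
The THRTLOW comment in the removed driver lists three sample values
(1 -> 4, 2 -> 8, 168 -> 672 activations), which suggests a linear mapping of
four activations per step. A minimal sketch under that assumption follows;
throttle_activation_threshold is a hypothetical helper and is not part of the
driver.

```c
#define MAX_THROTTLE_LOW_LIMIT 168

/*
 * Hypothetical helper: activation count per evaluation window above which
 * throttling starts, assuming 4 activations per step as the comment's
 * sample values suggest. Returns 0 for "no throttling" (limit == 0) and for
 * values above MAX_THROTTLE_LOW_LIMIT.
 */
static unsigned int throttle_activation_threshold(unsigned int limit)
{
	if (limit == 0 || limit > MAX_THROTTLE_LOW_LIMIT)
		return 0;
	return 4 * limit;
}
```

Under this reading, the default throttle_low_limit of 1 corresponds to the
maximum throttling setting listed in the comment.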

[PATCH 0/5] x86: remove idle notifier

2016-11-17 Thread Len Brown

The return from idle path is latency sensitive,
so the less code and fewer data accesses, the better.

Remove the un-maintained i7300_idle driver,
the only user of the x86 idle-notifier,
and then remove the notifier itself.

[PATCH 3/5] x86: remove is_idle flag

2016-11-17 Thread Len Brown

From: Len Brown 

Upon removal of the idle_notifier, all accesses to
the "is_idle" flag serve no purpose.

Signed-off-by: Len Brown 
Acked-by: Peter Zijlstra (Intel) 
---
 arch/x86/kernel/process.c | 7 ---
 1 file changed, 7 deletions(-)

diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index f51950715145..d8e9d794e114 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -65,10 +65,6 @@ __visible DEFINE_PER_CPU_SHARED_ALIGNED(struct tss_struct, cpu_tss) = {
 };
 EXPORT_PER_CPU_SYMBOL(cpu_tss);
 
-#ifdef CONFIG_X86_64
-static DEFINE_PER_CPU(unsigned char, is_idle);
-#endif
-
 /*
  * this gets called so that we can store lazy state into memory and copy the
  * current task into the new thread.
@@ -241,13 +237,10 @@ static inline void play_dead(void)
 #ifdef CONFIG_X86_64
 void enter_idle(void)
 {
-   this_cpu_write(is_idle, 1);
 }
 
 static void __exit_idle(void)
 {
-   if (x86_test_and_clear_bit_percpu(0, is_idle) == 0)
-   return;
 }
 
 /* Called from interrupts to signify idle end */
-- 
2.11.0.rc1

[RFC PATCH v2 3/7] iio: inkern: api for manipulating ext_info of iio channels

2016-11-17 Thread Peter Rosin

Extend the inkern api with functions for reading and writing ext_info
of iio channels.
---
 drivers/iio/inkern.c | 55 
 include/linux/iio/consumer.h |  6 +
 2 files changed, 61 insertions(+)

diff --git a/drivers/iio/inkern.c b/drivers/iio/inkern.c
index cfca17ba2535..a8099b164222 100644
--- a/drivers/iio/inkern.c
+++ b/drivers/iio/inkern.c
@@ -850,3 +850,58 @@ int iio_write_channel_raw(struct iio_channel *chan, int val)
return ret;
 }
 EXPORT_SYMBOL_GPL(iio_write_channel_raw);
+
+int iio_get_channel_ext_info_count(struct iio_channel *chan)
+{
+   const struct iio_chan_spec_ext_info *ext_info;
+   unsigned int i = 0;
+
+   if (!chan->channel->ext_info)
+   return i;
+
+   for (ext_info = chan->channel->ext_info; ext_info->name; ext_info++)
+   ++i;
+
+   return i;
+}
+EXPORT_SYMBOL_GPL(iio_get_channel_ext_info_count);
+
+ssize_t iio_read_channel_ext_info(struct iio_channel *chan,
+ const char *attr, char *buf)
+{
+   const struct iio_chan_spec_ext_info *ext_info;
+
+   if (!chan->channel->ext_info)
+   return -EINVAL;
+
+   for (ext_info = chan->channel->ext_info; ext_info->name; ++ext_info) {
+   if (strcmp(attr, ext_info->name))
+   continue;
+
+   return ext_info->read(chan->indio_dev, ext_info->private,
+ chan->channel, buf);
+   }
+
+   return -EINVAL;
+}
+EXPORT_SYMBOL_GPL(iio_read_channel_ext_info);
+
+ssize_t iio_write_channel_ext_info(struct iio_channel *chan, const char *attr,
+  const char *buf, size_t len)
+{
+   const struct iio_chan_spec_ext_info *ext_info;
+
+   if (!chan->channel->ext_info)
+   return -EINVAL;
+
+   for (ext_info = chan->channel->ext_info; ext_info->name; ++ext_info) {
+   if (strcmp(attr, ext_info->name))
+   continue;
+
+   return ext_info->write(chan->indio_dev, ext_info->private,
+  chan->channel, buf, len);
+   }
+
+   return -EINVAL;
+}
+EXPORT_SYMBOL_GPL(iio_write_channel_ext_info);
diff --git a/include/linux/iio/consumer.h b/include/linux/iio/consumer.h
index 9a4f336d8b4a..471dece8729a 100644
--- a/include/linux/iio/consumer.h
+++ b/include/linux/iio/consumer.h
@@ -299,4 +299,10 @@ int iio_read_channel_scale(struct iio_channel *chan, int *val,
 int iio_convert_raw_to_processed(struct iio_channel *chan, int raw,
int *processed, unsigned int scale);
 
+int iio_get_channel_ext_info_count(struct iio_channel *chan);
+ssize_t iio_read_channel_ext_info(struct iio_channel *chan,
+ const char *attr, char *buf);
+ssize_t iio_write_channel_ext_info(struct iio_channel *chan, const char *attr,
+  const char *buf, size_t len);
+
 #endif
-- 
2.1.4
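
The lookup pattern used by the three new functions (walk a table terminated by a NULL `name`, match on `strcmp`, dispatch to a callback) can be modelled outside the kernel. The following is only a user-space sketch: `struct ext_info`, `demo_ext_info`, `read_mode`, `ext_info_count` and `ext_info_read` are hypothetical stand-ins, not the IIO types or functions from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>	/* ssize_t */

/* Simplified stand-in for struct iio_chan_spec_ext_info (assumption). */
struct ext_info {
	const char *name;			/* NULL name terminates the table */
	ssize_t (*read)(char *buf, size_t len);	/* may be NULL */
};

/* Hypothetical attribute callback: writes a fixed string into buf. */
static ssize_t read_mode(char *buf, size_t len)
{
	return snprintf(buf, len, "fast\n");
}

/* Hypothetical table with one attribute and the NULL-name sentinel. */
static const struct ext_info demo_ext_info[] = {
	{ .name = "mode", .read = read_mode },
	{ .name = NULL },
};

/* Count entries up to, but not including, the sentinel. */
static int ext_info_count(const struct ext_info *tbl)
{
	int n = 0;

	if (!tbl)
		return 0;

	for (; tbl->name; tbl++)
		n++;

	return n;
}

/* Find the attribute by name and call its read callback. */
static ssize_t ext_info_read(const struct ext_info *tbl, const char *attr,
			     char *buf, size_t len)
{
	assert(attr && buf);

	if (!tbl)
		return -EINVAL;

	for (; tbl->name; tbl++) {
		if (strcmp(attr, tbl->name))
			continue;
		/* Extra guard in this sketch; the patch calls ->read directly. */
		if (!tbl->read)
			return -EINVAL;
		return tbl->read(buf, len);
	}

	return -EINVAL;
}
```

One difference from the patch is deliberate: the sketch returns -EINVAL when the matching entry has no `read` callback, whereas the posted code dereferences `ext_info->read` (and `ext_info->write`) without a check. Whether such a guard is needed in the kernel version depends on whether providers may leave those callbacks unset.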

Re: [RFC PATCH v2 2/2] module: When modifying a module's text ignore modules which are going away too

2016-11-17 Thread Rusty Russell

Aaron Tomlin  writes:
> By default, during the access permission modification of a module's core
> and init pages, we only ignore modules that are unformed. However, for a
> module which is going away, it does not make sense to change its text to
> RO since the module should be RW before deallocation.
>
> This patch makes set_all_modules_text_ro() skip modules which are going
> away too.
>
> Signed-off-by: Aaron Tomlin 

Acked-by: Rusty Russell 

Thanks!
Rusty.

> ---
>  kernel/module.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/module.c b/kernel/module.c
> index ff93ab8..2a383df 100644
> --- a/kernel/module.c
> +++ b/kernel/module.c
> @@ -1969,7 +1969,8 @@ void set_all_modules_text_ro(void)
>  
>   mutex_lock(&module_mutex);
>   list_for_each_entry_rcu(mod, &modules, list) {
> - if (mod->state == MODULE_STATE_UNFORMED)
> + if (mod->state == MODULE_STATE_UNFORMED ||
> + mod->state == MODULE_STATE_GOING)
>   continue;
>  
>   frob_text(&mod->core_layout, set_memory_ro);
> -- 
> 2.5.5
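
The effect of the extra state check in the quoted patch can be illustrated with a small user-space model. This is only a sketch: `struct module` here is a two-field stand-in, `text_ro` replaces the real page-permission change, and `should_skip` and `set_all_text_ro` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Same state names as the kernel's enum module_state; values are illustrative. */
enum module_state {
	MODULE_STATE_LIVE,
	MODULE_STATE_COMING,
	MODULE_STATE_GOING,
	MODULE_STATE_UNFORMED,
};

/* Minimal stand-in for struct module (assumption). */
struct module {
	enum module_state state;
	int text_ro;	/* 1 once the text has been marked read-only */
};

/* The condition from the patch: skip unformed modules and modules going away. */
static int should_skip(const struct module *mod)
{
	return mod->state == MODULE_STATE_UNFORMED ||
	       mod->state == MODULE_STATE_GOING;
}

/* Mark text read-only for every module that is not skipped; return the count. */
static size_t set_all_text_ro(struct module *mods, size_t n)
{
	size_t changed = 0;
	size_t i;

	assert(mods || n == 0);

	for (i = 0; i < n; i++) {
		if (should_skip(&mods[i]))
			continue;
		mods[i].text_ro = 1;
		changed++;
	}

	return changed;
}
```

With one module in each state, only the LIVE and COMING entries end up read-only; the GOING entry stays writable, which is the behaviour the patch description asks for.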

Re: [PATCH v4] mm: don't cap request size based on read-ahead setting

2016-11-17 Thread Hillf Danton

On Friday, November 18, 2016 5:23 AM Jens Axboe wrote: 
> 
> We ran into a funky issue, where someone doing 256K buffered reads saw
> 128K requests at the device level. Turns out it is read-ahead capping
> the request size, since we use 128K as the default setting. This doesn't
> make a lot of sense - if someone is issuing 256K reads, they should see
> 256K reads, regardless of the read-ahead setting, if the underlying
> device can support a 256K read in a single command.
> 
> To make matters more confusing, there's an odd interaction with the
> fadvise hint setting. If we tell the kernel we're doing sequential IO on
> this file descriptor, we can get twice the read-ahead size. But if we
> tell the kernel that we are doing random IO, hence disabling read-ahead,
> we do get nice 256K requests at the lower level. This is because
> ondemand and forced read-ahead behave differently, with the latter doing
> the right thing. 

As far as I can tell, forced RA is innocent, yet it is changed below.
And with RA disabled, we should not need to care about ondemand.

I'm scratching my head.

> An application developer will be, rightfully,
> scratching his head at this point, wondering wtf is going on. A good one
> will dive into the kernel source, and silently weep.
> 
> This patch introduces a bdi hint, io_pages. This is the soft max IO size
> for the lower level, I've hooked it up to the bdev settings here.
> Read-ahead is modified to issue the maximum of the user request size,
> and the read-ahead max size, but capped to the max request size on the
> device side. The latter is done to avoid reading ahead too much, if the
> application asks for a huge read. With this patch, the kernel behaves
> like the application expects.
> 
> Signed-off-by: Jens Axboe 
> 
> ---
> 
> Changes since v3:
> 
> - Went over it with Johannes, cleaned up the logic as a result
> 
> Changes since v2:
> 
> - Fix up the last minute typo on io_pages (Johannes/Hillf)
> - Apply the same limit to force_page_cache_readahead().
> 
> 
> diff --git a/block/blk-settings.c b/block/blk-settings.c
> index f679ae1..65f16cf 100644
> --- a/block/blk-settings.c
> +++ b/block/blk-settings.c
> @@ -249,6 +249,7 @@ void blk_queue_max_hw_sectors(struct request_queue
> *q, unsigned int max_hw_secto
>   max_sectors = min_not_zero(max_hw_sectors, limits->max_dev_sectors);
>   max_sectors = min_t(unsigned int, max_sectors, BLK_DEF_MAX_SECTORS);
>   limits->max_sectors = max_sectors;
> + q->backing_dev_info.io_pages = max_sectors >> (PAGE_SHIFT - 9);
>   }
>   EXPORT_SYMBOL(blk_queue_max_hw_sectors);
> 
> diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
> index 9cc8d7c..ea374e8 100644
> --- a/block/blk-sysfs.c
> +++ b/block/blk-sysfs.c
> @@ -212,6 +212,7 @@ queue_max_sectors_store(struct request_queue *q,
> const char *page, size_t count)
> 
>   spin_lock_irq(q->queue_lock);
>   q->limits.max_sectors = max_sectors_kb << 1;
> + q->backing_dev_info.io_pages = max_sectors_kb >> (PAGE_SHIFT - 10);
>   spin_unlock_irq(q->queue_lock);
> 
>   return ret;
> diff --git a/include/linux/backing-dev-defs.h
> b/include/linux/backing-dev-defs.h
> index c357f27..b8144b2 100644
> --- a/include/linux/backing-dev-defs.h
> +++ b/include/linux/backing-dev-defs.h
> @@ -136,6 +136,7 @@ struct bdi_writeback {
>   struct backing_dev_info {
>   struct list_head bdi_list;
>   unsigned long ra_pages; /* max readahead in PAGE_SIZE units */
> + unsigned long io_pages; /* max allowed IO size */
>   unsigned int capabilities; /* Device capabilities */
>   congested_fn *congested_fn; /* Function pointer if device is md/dm */
>   void *congested_data;   /* Pointer to aux data for congested func */
> diff --git a/mm/readahead.c b/mm/readahead.c
> index c8a955b..344c1da 100644
> --- a/mm/readahead.c
> +++ b/mm/readahead.c
> @@ -207,12 +207,17 @@ int __do_page_cache_readahead(struct address_space
> *mapping, struct file *filp,
>* memory at once.
>*/
>   int force_page_cache_readahead(struct address_space *mapping, struct
> file *filp,
> - pgoff_t offset, unsigned long nr_to_read)
> +pgoff_t offset, unsigned long nr_to_read)
>   {
> + struct backing_dev_info *bdi = inode_to_bdi(mapping->host);
> + struct file_ra_state *ra = &filp->f_ra;
> + unsigned long max_pages;
> +
>   if (unlikely(!mapping->a_ops->readpage && !mapping->a_ops->readpages))
>   return -EINVAL;
> 
> - nr_to_read = min(nr_to_read, inode_to_bdi(mapping->host)->ra_pages);
> + max_pages = max_t(unsigned long, bdi->io_pages, ra->ra_pages);
> + nr_to_read = min(nr_to_read, max_pages);
>   while (nr_to_read) {
>   int err;
> 
> @@ -369,10 +374,18 @@ ondemand_readahead(struct address_space *mapping,
>  bool hit_readahead_marker, pgoff_t offset,
>  unsigned long req_size)
>   {
> - unsigned long max = ra->ra_pages;
> + struct backing_dev_info *bdi =
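
The unit conversions and the new cap in the force_page_cache_readahead() hunk quoted above can be checked with a few lines of arithmetic. The sketch below is a user-space model under stated assumptions: the helper names are hypothetical, the page shift is passed in explicitly, and the device limit used in the usage note (1280 KB) is an example value, not something taken from the patch. The ondemand_readahead() hunk is cut off in the quote and is not modelled.

```c
#include <assert.h>

#define SECTOR_SHIFT	9	/* 512-byte sectors */
#define KB_SHIFT	10	/* 1024-byte units used by max_sectors_kb */

/* io_pages as computed in blk_queue_max_hw_sectors(): sectors -> pages. */
static unsigned long sectors_to_pages(unsigned long sectors,
				      unsigned int page_shift)
{
	assert(page_shift >= SECTOR_SHIFT);
	return sectors >> (page_shift - SECTOR_SHIFT);
}

/* io_pages as computed in queue_max_sectors_store(): KB -> pages. */
static unsigned long kb_to_pages(unsigned long kb, unsigned int page_shift)
{
	assert(page_shift >= KB_SHIFT);
	return kb >> (page_shift - KB_SHIFT);
}

/*
 * New cap from the quoted hunk:
 * nr_to_read = min(nr_to_read, max(io_pages, ra_pages)).
 */
static unsigned long forced_readahead_pages(unsigned long nr_to_read,
					    unsigned long io_pages,
					    unsigned long ra_pages)
{
	unsigned long max_pages = io_pages > ra_pages ? io_pages : ra_pages;

	return nr_to_read < max_pages ? nr_to_read : max_pages;
}
```

With 4 KB pages (page shift 12), a 128K read-ahead setting is 32 pages and a 256K read is 64 pages. The old code capped the request at min(64, 32) = 32 pages. With an assumed device limit of 1280 KB, io_pages is 320, so the new cap is min(64, max(320, 32)) = 64 pages and the 256K request goes through whole.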

[patch v8 1/1] i2c: add master driver for mellanox systems

2016-11-17 Thread vadimp

From: Vadim Pasternak 

Device driver for Mellanox I2C controller logic, implemented in Lattice
CPLD device.
Device supports:
 - Master mode
 - One physical bus
 - Polling mode

The Kconfig currently controlling compilation of this code is:
drivers/i2c/busses/Kconfig:config I2C_MLXCPLD

Signed-off-by: Michael Shych 
Signed-off-by: Vadim Pasternak 
Reviewed-by: Jiri Pirko 
Reviewed-by: Vladimir Zapolskiy 
---
v7->v8
 Comments pointed out by Wolfram:
 - Remove descriptions for two structures, since the members have
   self-explaining names;
 - Populate the structure i2c_adapter_quirks, so core will make length
   validation, as a result remove mlxcpld_i2c_invalid_len,
   and move common length calculation to mlxcpld_i2c_xfer. Consider
   removing of mlxcpld_i2c_check_msg_params after core quirks is able
   to handle messages full validation;
 - Use ENXIO error code for NACK;
v6->v7
 Comments pointed out by Peter:
 - Fix grammar in doc file;
 - Fix description for CMD in doc file;
 - Fix grammar for NUM_DATA, DATAx in doc file;
 - Rename mlxcpld_i2c_curr_transf to mlxcpld_i2c_curr_xfer;
 - Make code in mlxcpld_i2c_lpc_write_buf more readable and
   mlxcpld_i2c_lpc_read_buf, compact and elegant;
 - Fix multiline comments;
v5->v6:
 Comments pointed out by Vladimir:
 - Drop the line with module path from the header;
 - In description of mlxcpld_i2c_priv remove lpc_gen_dec_reg and dev_id;
 - In mlxcpld_i2c_priv change type of the field base_addr to u16 for
   the alignment with in/out and remove unused dev_id;
 - Fix misspelling in comment for mlxcpld_i2c_invalid_len;
 - Remove comment regarding EBUSY return in mlxcpld_i2c_check_busy;
 - Use sizeof of the target storage in allocation in probe routine;
v4->v5:
 Comments pointed out by Vladimir:
 - Remove "default n" from Kconfig;
 - Fix the comments for timeout and poll time;
 - Optimize error flow in mlxcpld_i2c_probe;
v3->v4:
 Comments pointed out by Vladimir:
 - Set default to no in Kconfig;
 - Make mlxcpld_i2c_plat_dev static and add empty line before the
   declaration;
 - In function mlxcpld_i2c_invalid_len remove (msg->len < 0), since len is
   unsigned;
 - Remove unused symbol mlxcpld_i2c_plat_dev;
 - Remove extra spaces in comments to mlxcpld_i2c_check_msg_params;
 - Remove unnecessary round braces in mlxcpld_i2c_set_transf_data;
 - Remove the assignment of 'i' variable in mlxcpld_i2c_wait_for_tc;
 - Add extra line in mlxcpld_i2c_xfer;
 - Move assignment of the adapter's fields retries and nr inside
   mlxcpld_i2c_adapter declaration;
v2->v3:
 Comments pointed out by Vladimir:
 - Use tab symbol as indentation in Kconfig
 - Add the Kconfig section preserving the alphabetical order - added
   within "Other I2C/SMBus bus drivers" after I2C_ELEKTOR (although the
   sections after it do not follow alphabetical order);
 - Change license to dual;
 - Replace ADRR with ADDR in macros;
 - Remove unused macros: MLXCPLD_LPCI2C_LPF_DFLT,
   MLXCPLD_LPCI2C_HALF_CYC_100, MLXCPLD_LPCI2C_I2C_HOLD_100,
   MLXCPLD_LPCI2C_HALF_CYC_REG, MLXCPLD_LPCI2C_I2C_HOLD_REG;
 - Fix checkpatch warnings (**/ and the end of comment);
 - Add empty line before structures mlxcpld_i2c_regs,
   mlxcpld_i2c_curr_transf, mlxcpld_i2c_priv;
 - Remove unused structure mlxcpld_i2c_regs;
 - Remove from mlxcpld_i2c_priv the next fields:
   retr_num, poll_time, block_sz, xfer_to; use instead macros
   respectively: MLXCPLD_I2C_RETR_NUM, MLXCPLD_I2C_POLL_TIME,
   MLXCPLD_I2C_DATA_REG_SZ, MLXCPLD_I2C_XFER_TO;
 - In mlxcpld_i2c_invalid_len remove unnecessary else;
 - Optimize mlxcpld_i2c_set_transf_data;
 - mlxcpld_i2c_reset - add empty lines after/before mutex
   lock/unlock;
 - mlxcpld_i2c_wait_for_free - cover case timeout is equal
   MLXCPLD_I2C_XFER_TO;
 - mlxcpld_i2c_wait_for_tc:
   - Do not assign err in declaration (also err is removed);
   - Insert empty line before case MLXCPLD_LPCI2C_ACK_IND;
   - inside case MLXCPLD_LPCI2C_ACK_IND - avoid unnecessary
 indentation;
   - Remove case MLXCPLD_LPCI2C_ERR_IND and remove this macro;
 - Add empty lines in mlxcpld_i2c_xfer before/after mutex_lock/
   mutex_unlock;
 - In mlxcpld_i2c_probe add empty line after platform_set_drvdata;
 - Replace platform handle pdev in mlxcpld_i2c_priv with the pointer
   to the structure device;
 - Place assignment of base_addr near the others;
 - Enclose e-mail with <>;
 Fixes added by Vadim:
 - Change structure description format according to
   Documentation/kernel-documentation.rst guideline;
 - mlxcpld_i2c_wait_for_tc: return error if status reaches default case;
v1->v2
 Fixes added by Vadim:
 - Put new record in Makefile in alphabetic order;
 - Remove http://www.mellanox.com from MAINTAINERS record;
---
 Documentation/i2c/busses/i2c-mlxcpld |  47 
 MAINTAINERS  |   8 +
 drivers/i2c/busses/Kconfig   |  11 +
 drivers/i2c/busses/Makefile  |   1 +
 drivers/i2c/busses/i2c-mlxcpld.c | 502 +++
 5 files changed, 569 insertions(+)
 create mode

[PATCH v11 1/6] drivers/platform/x86/p2sb: New Primary to Sideband bridge support driver for Intel SOC's

2016-11-17 Thread Tan Jui Nee

From: Andy Shevchenko 

There is already one user, and at least one more is coming, that
requires access to the Primary to Sideband bridge (P2SB) in order
to get an IO or MMIO BAR hidden by the BIOS.
Create a driver to access P2SB for x86 devices.

Signed-off-by: Yong, Jonathan 
Signed-off-by: Andy Shevchenko 
---
Changes in V11:
- No change

Changes in V10:
- Move the driver to drivers/platform/x86, since P2SB is a platform
  enablement driver (suggested by tglx).

Changes in V9:
- No change

Changes in V8:
- No change

Changes in V7:
- EXPORT_SYMBOL_GPL() and MODULE_LICENSE("GPL v2") are used for new file
  p2sb.c.

Changes in V6:
- No change

Changes in V5:
- No change

Changes in V4:
- Move Kconfig option CONFIG_X86_INTEL_NON_ACPI from
  [PATCH 2/3] x86/platform/p2sb: New Primary to Sideband bridge support driver for Intel SOC's
  to
  [PATCH 3/3] mfd: lpc_ich: Add support for Intel Apollo Lake GPIO pinctrl in non-ACPI system
  since the config is used in the latter patch.

Changes in V3:
- No change

Changes in V2:
- Add new config option CONFIG_X86_INTEL_NON_ACPI and "select PINCTRL"
  to fix kbuildbot error

 arch/x86/include/asm/p2sb.h   | 27 
 drivers/platform/x86/Kconfig  |  4 ++
 drivers/platform/x86/Makefile |  1 +
 drivers/platform/x86/p2sb.c   | 98 +++
 4 files changed, 130 insertions(+)
 create mode 100644 arch/x86/include/asm/p2sb.h
 create mode 100644 drivers/platform/x86/p2sb.c

diff --git a/arch/x86/include/asm/p2sb.h b/arch/x86/include/asm/p2sb.h
new file mode 100644
index 000..686e07b
--- /dev/null
+++ b/arch/x86/include/asm/p2sb.h
@@ -0,0 +1,27 @@
+/*
+ * Primary to Sideband bridge (P2SB) access support
+ */
+
+#ifndef P2SB_SYMS_H
+#define P2SB_SYMS_H
+
+#include 
+#include 
+
+#if IS_ENABLED(CONFIG_P2SB)
+
+int p2sb_bar(struct pci_dev *pdev, unsigned int devfn,
+   struct resource *res);
+
+#else /* CONFIG_P2SB is not set */
+
+static inline
+int p2sb_bar(struct pci_dev *pdev, unsigned int devfn,
+   struct resource *res)
+{
+   return -ENODEV;
+}
+
+#endif /* CONFIG_P2SB */
+
+#endif /* P2SB_SYMS_H */
diff --git a/drivers/platform/x86/Kconfig b/drivers/platform/x86/Kconfig
index b8a21d7..65ef6a0 100644
--- a/drivers/platform/x86/Kconfig
+++ b/drivers/platform/x86/Kconfig
@@ -1027,4 +1027,8 @@ config INTEL_TELEMETRY
  used to get various SoC events and parameters
  directly via debugfs files. Various tools may use
  this interface for SoC state monitoring.
+
+config P2SB
+   tristate
+   depends on PCI
 endif # X86_PLATFORM_DEVICES
diff --git a/drivers/platform/x86/Makefile b/drivers/platform/x86/Makefile
index 2efa86d..c39a13d 100644
--- a/drivers/platform/x86/Makefile
+++ b/drivers/platform/x86/Makefile
@@ -71,3 +71,4 @@ obj-$(CONFIG_INTEL_TELEMETRY) += intel_telemetry_core.o \
   intel_telemetry_pltdrv.o \
   intel_telemetry_debugfs.o
 obj-$(CONFIG_INTEL_PMC_CORE)+= intel_pmc_core.o
+obj-$(CONFIG_P2SB) += p2sb.o
diff --git a/drivers/platform/x86/p2sb.c b/drivers/platform/x86/p2sb.c
new file mode 100644
index 000..b1d784c
--- /dev/null
+++ b/drivers/platform/x86/p2sb.c
@@ -0,0 +1,98 @@
+/*
+ * Primary to Sideband bridge (P2SB) driver
+ *
+ * Copyright (c) 2016, Intel Corporation.
+ *
+ * Authors: Andy Shevchenko 
+ * Jonathan Yong 
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ */
+
+#include 
+#include 
+#include 
+#include 
+
+#include 
+
+#define SBREG_BAR  0x10
+#define SBREG_HIDE 0xe1
+
+static DEFINE_SPINLOCK(p2sb_spinlock);
+
+/*
+ * p2sb_bar - Get Primary to Sideband bridge (P2SB) BAR
+ * @pdev:  PCI device to get PCI bus to communicate with
+ * @devfn: PCI device and function to communicate with
+ * @res:   resources to be filled in
+ *
+ * The BIOS prevents the P2SB device from being enumerated by the PCI
+ * subsystem, so we need to unhide and hide it back to lookup the P2SB BAR.
+ *
+ * Locking is handled by spinlock - cannot sleep.
+ *
+ * Return:
+ * 0 on success or appropriate errno value on error.
+ */
+int p2sb_bar(struct pci_dev *pdev, unsigned int devfn,
+   struct resource *res)
+{
+   u32 base_addr;
+   u64 base64_addr;
+   unsigned long flags;
+
+   if (!res)
+   return -EINVAL;
+
+   spin_lock(&p2sb_spinlock);
+
+   /* Unhide the P2SB device */
+

[PATCH v11 4/6] mfd: move enum lpc_chipsets into lpc_ich.h

2016-11-17 Thread Tan Jui Nee

Move the enum's definition into a standalone header file which can be used
wherever its definition is needed.

Signed-off-by: Tan Jui Nee 
Reviewed-by: Mika Westerberg 
---
Changes in V11:
- No change

Changes in V10:
- No change

Changes in V9:
- No change

Changes in V8:
- No change

 drivers/mfd/lpc_ich_core.c  | 71 -
 include/linux/mfd/lpc_ich.h | 71 +
 2 files changed, 71 insertions(+), 71 deletions(-)

diff --git a/drivers/mfd/lpc_ich_core.c b/drivers/mfd/lpc_ich_core.c
index 7cbe037..920198a 100644
--- a/drivers/mfd/lpc_ich_core.c
+++ b/drivers/mfd/lpc_ich_core.c
@@ -145,77 +145,6 @@ struct lpc_ich_priv {
.ignore_resource_conflicts = true,
 };
 
-/* chipset related info */
-enum lpc_chipsets {
-   LPC_ICH = 0,/* ICH */
-   LPC_ICH0,   /* ICH0 */
-   LPC_ICH2,   /* ICH2 */
-   LPC_ICH2M,  /* ICH2-M */
-   LPC_ICH3,   /* ICH3-S */
-   LPC_ICH3M,  /* ICH3-M */
-   LPC_ICH4,   /* ICH4 */
-   LPC_ICH4M,  /* ICH4-M */
-   LPC_CICH,   /* C-ICH */
-   LPC_ICH5,   /* ICH5 & ICH5R */
-   LPC_6300ESB,/* 6300ESB */
-   LPC_ICH6,   /* ICH6 & ICH6R */
-   LPC_ICH6M,  /* ICH6-M */
-   LPC_ICH6W,  /* ICH6W & ICH6RW */
-   LPC_631XESB,/* 631xESB/632xESB */
-   LPC_ICH7,   /* ICH7 & ICH7R */
-   LPC_ICH7DH, /* ICH7DH */
-   LPC_ICH7M,  /* ICH7-M & ICH7-U */
-   LPC_ICH7MDH,/* ICH7-M DH */
-   LPC_NM10,   /* NM10 */
-   LPC_ICH8,   /* ICH8 & ICH8R */
-   LPC_ICH8DH, /* ICH8DH */
-   LPC_ICH8DO, /* ICH8DO */
-   LPC_ICH8M,  /* ICH8M */
-   LPC_ICH8ME, /* ICH8M-E */
-   LPC_ICH9,   /* ICH9 */
-   LPC_ICH9R,  /* ICH9R */
-   LPC_ICH9DH, /* ICH9DH */
-   LPC_ICH9DO, /* ICH9DO */
-   LPC_ICH9M,  /* ICH9M */
-   LPC_ICH9ME, /* ICH9M-E */
-   LPC_ICH10,  /* ICH10 */
-   LPC_ICH10R, /* ICH10R */
-   LPC_ICH10D, /* ICH10D */
-   LPC_ICH10DO,/* ICH10DO */
-   LPC_PCH,/* PCH Desktop Full Featured */
-   LPC_PCHM,   /* PCH Mobile Full Featured */
-   LPC_P55,/* P55 */
-   LPC_PM55,   /* PM55 */
-   LPC_H55,/* H55 */
-   LPC_QM57,   /* QM57 */
-   LPC_H57,/* H57 */
-   LPC_HM55,   /* HM55 */
-   LPC_Q57,/* Q57 */
-   LPC_HM57,   /* HM57 */
-   LPC_PCHMSFF,/* PCH Mobile SFF Full Featured */
-   LPC_QS57,   /* QS57 */
-   LPC_3400,   /* 3400 */
-   LPC_3420,   /* 3420 */
-   LPC_3450,   /* 3450 */
-   LPC_EP80579,/* EP80579 */
-   LPC_CPT,/* Cougar Point */
-   LPC_CPTD,   /* Cougar Point Desktop */
-   LPC_CPTM,   /* Cougar Point Mobile */
-   LPC_PBG,/* Patsburg */
-   LPC_DH89XXCC,   /* DH89xxCC */
-   LPC_PPT,/* Panther Point */
-   LPC_LPT,/* Lynx Point */
-   LPC_LPT_LP, /* Lynx Point-LP */
-   LPC_WBG,/* Wellsburg */
-   LPC_AVN,/* Avoton SoC */
-   LPC_BAYTRAIL,   /* Bay Trail SoC */
-   LPC_COLETO, /* Coleto Creek */
-   LPC_WPT_LP, /* Wildcat Point-LP */
-   LPC_BRASWELL,   /* Braswell SoC */
-   LPC_LEWISBURG,  /* Lewisburg */
-   LPC_9S, /* 9 Series */
-};
-
 static struct lpc_ich_info lpc_chipset_info[] = {
[LPC_ICH] = {
.name = "ICH",
diff --git a/include/linux/mfd/lpc_ich.h b/include/linux/mfd/lpc_ich.h
index 2b300b4..42307ee 100644
--- a/include/linux/mfd/lpc_ich.h
+++ b/include/linux/mfd/lpc_ich.h
@@ -43,4 +43,75 @@ struct lpc_ich_info {
u8 use_gpio;
 };
 
+/* chipset related info */
+enum lpc_chipsets {
+   LPC_ICH = 0,/* ICH */
+   LPC_ICH0,   /* ICH0 */
+   LPC_ICH2,   /* ICH2 */
+   LPC_ICH2M,  /* ICH2-M */
+   LPC_ICH3,   /* ICH3-S */
+   LPC_ICH3M,  /* ICH3-M */
+   LPC_ICH4,   /* ICH4 */
+   LPC_ICH4M,  /* ICH4-M */
+   LPC_CICH,   /* C-ICH */
+   LPC_ICH5,   /* ICH5 & ICH5R */
+   LPC_6300ESB,/* 6300ESB */
+   LPC_ICH6,   /* ICH6 & ICH6R */
+   LPC_ICH6M,  /* ICH6-M */
+   LPC_ICH6W,  /* ICH6W & ICH6RW */
+   LPC_631XESB,/* 631xESB/632xESB */
+   LPC_ICH7,   /* ICH7 & ICH7R */
+   LPC_ICH7DH, /* ICH7DH */
+   LPC_ICH7M,  /* ICH7-M & ICH7-U */
+   LPC_ICH7MDH,/* ICH7-M DH */
+   LPC_NM10,   /* NM10 */
+   LPC_ICH8,   /* ICH8 & ICH8R */
+   LPC_ICH8DH, /* ICH8DH */
+   LPC_ICH8DO, /* ICH8DO */
+   LPC_ICH8M,  /* ICH8M */
+   LPC_ICH8ME, /* ICH8M-E */
+   LPC_ICH9,   /* ICH9 */
+   LPC_ICH9R,  /* ICH9R */
+   LPC_ICH9DH,

[PATCH v11 5/6] mfd: lpc_ich: Add Device IDs for Intel Apollo Lake PCH

2016-11-17 Thread Tan Jui Nee

Add Intel Apollo Lake platform device IDs for the PCH.

Signed-off-by: Tan Jui Nee 
Acked-for-MFD-by: Lee Jones 
---
Changes in V11:
- No change

Changes in V10:
- No change

Changes in V9:
- No change

Changes in V8:
- No change

 drivers/mfd/lpc_ich_core.c  | 6 ++
 include/linux/mfd/lpc_ich.h | 1 +
 2 files changed, 7 insertions(+)

diff --git a/drivers/mfd/lpc_ich_core.c b/drivers/mfd/lpc_ich_core.c
index 920198a..3bb6334 100644
--- a/drivers/mfd/lpc_ich_core.c
+++ b/drivers/mfd/lpc_ich_core.c
@@ -54,6 +54,7 @@
  * document number TBD : Wildcat Point-LP
  * document number TBD : 9 Series
  * document number TBD : Lewisburg
+ * document number TBD : Apollo Lake
  */
 
 #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
@@ -458,6 +459,10 @@ struct lpc_ich_priv {
.name = "9 Series",
.iTCO_version = 2,
},
+   [LPC_APL]  = {
+   .name = "Apollo Lake SoC",
+   .iTCO_version = 5,
+   },
 };
 
 /*
@@ -606,6 +611,7 @@ struct lpc_ich_priv {
{ PCI_VDEVICE(INTEL, 0x3b14), LPC_3420},
{ PCI_VDEVICE(INTEL, 0x3b16), LPC_3450},
{ PCI_VDEVICE(INTEL, 0x5031), LPC_EP80579},
+   { PCI_VDEVICE(INTEL, 0x5ae8), LPC_APL},
{ PCI_VDEVICE(INTEL, 0x8c40), LPC_LPT},
{ PCI_VDEVICE(INTEL, 0x8c41), LPC_LPT},
{ PCI_VDEVICE(INTEL, 0x8c42), LPC_LPT},
diff --git a/include/linux/mfd/lpc_ich.h b/include/linux/mfd/lpc_ich.h
index 42307ee..397008c 100644
--- a/include/linux/mfd/lpc_ich.h
+++ b/include/linux/mfd/lpc_ich.h
@@ -112,6 +112,7 @@ enum lpc_chipsets {
LPC_BRASWELL,   /* Braswell SoC */
LPC_LEWISBURG,  /* Lewisburg */
LPC_9S, /* 9 Series */
+   LPC_APL,/* Apollo Lake SoC */
 };
 
 #endif
-- 
1.9.1
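
For readers following the table changes in this patch, here is a minimal
userspace sketch of the pattern: an enum value indexes an info table through
designated initializers, and a device ID table maps a PCI device ID to that
enum value. The struct names, lookup_chipset() and the trimmed two-entry enum
are assumptions made for illustration; only the "9 Series" and "Apollo Lake
SoC" entries and the 0x5ae8 device ID come from the diff above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Two entries taken from the patch; the real enum has many more. */
enum lpc_chipsets {
	LPC_9S,		/* 9 Series */
	LPC_APL,	/* Apollo Lake SoC */
};

/* Hypothetical, trimmed stand-in for struct lpc_ich_info. */
struct chipset_info {
	const char *name;
	unsigned int iTCO_version;
};

/* Designated initializers keep each entry tied to its enum value. */
static const struct chipset_info chipset_info[] = {
	[LPC_9S]  = { .name = "9 Series",        .iTCO_version = 2 },
	[LPC_APL] = { .name = "Apollo Lake SoC", .iTCO_version = 5 },
};

/* Hypothetical stand-in for one PCI_VDEVICE(INTEL, id) table row. */
struct id_entry {
	uint16_t device;
	enum lpc_chipsets chipset;
};

static const struct id_entry id_table[] = {
	{ 0x5ae8, LPC_APL },	/* device ID added by this patch */
};

/* Return the info entry for a device ID, or NULL if the ID is unknown. */
static const struct chipset_info *lookup_chipset(uint16_t device)
{
	for (size_t i = 0; i < ARRAY_SIZE(id_table); i++) {
		if (id_table[i].device == device) {
			/* The enum value must be a valid index into the table. */
			assert((size_t)id_table[i].chipset < ARRAY_SIZE(chipset_info));
			return &chipset_info[id_table[i].chipset];
		}
	}
	return NULL;
}
```

Because the table is indexed by the enum, appending LPC_APL at the end of the
enum, as the patch does, leaves every existing index unchanged.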

[PATCH v11 6/6] mfd: lpc_ich: Add support for Intel Apollo Lake GPIO pinctrl in non-ACPI system

2016-11-17 Thread Tan Jui Nee

This driver uses the P2SB hide/unhide mechanism cooperatively
to pass the PCI BAR address to the gpio platform driver.

Signed-off-by: Tan Jui Nee 
Reviewed-by: Mika Westerberg 
---
Changes in V11:
- Remove duplicated object file lpc_ich-objs in Makefile.
- Put p2sb.h header file in separate section in lpc_ich-apl.c, as asm stuff
  is platform specific (suggested by Andy).
- Rearrange variable declarations in lpc_ich_add_gpio() function
  (suggested by Andy).
- Move warn_continue label before if/else statement for the sake of
  readability (suggested by Andy).
- Add comment to #endif in lpc_ich_apl.h file.

Changes in V10:
- No change

Changes in V9:
- No change

Changes in V8:
- Rename source file lpc_ich-apl.c to lpc_ich_apl.c (suggested by Mika).

Changes in V7:
- Add author information and rewrite description of source file 
  lpc_ich-apl.c and lpc_ich_apl.h.
- Sort the header files by alphabetical order in lpc_ich-apl.c.
- Rename header file lpc_ich-apl.h to lpc_ich_apl.h (suggested by Lee).
- Remove unneeded pdata_size and platform_data from mfd_cell.
  Also, remove unneeded apl_pinctrl_pdata.
- Since variable apl_p2sb is only used once, hence switch it out for the
  PCI_DEVFN macro (suggested by Lee).
- Define APL_GPIO_COMMUNITY_MAX as total Apollo Lake GPIO communities
  supported.
- Set resources in mfd_cell for each GPIO community.
- Call p2sb_bar() once instead of four times inside the for loop.
  p2sb_bar() now just fills in the base address in a scratch
  "struct resource", and the loop does the additions to base/end.
- Remove entire apl_pinctrl_pdata.name memory allocation since it is no
  longer needed.
- Return ret at the end of lpc_ich_add_gpio() function.

Changes in V6:
- Rename CONFIG_X86_INTEL_APL to CONFIG_X86_INTEL_IVI so that it
  relates to the actual product, as suggested by Mika.
- Rework Makefile according to Andy's comments.
- Rename lpc_ich_misc() to lpc_ich_add_gpio() so that the name is not
  so generic, as suggested by Andy.
- Call lpc_ich_add_gpio() via priv->chipset.
- lpc_ich_add_gpio() function will be moved from 
  .../include/linux/mfd/lpc_ich.h to
  .../drivers/mfd/lpc_ich-apl.h
  as this is a part of internal driver interface as suggested by Andy.
- Move enum lpc_chipsets from 
  .../drivers/mfd/lpc_ich-core.c to
  .../include/linux/mfd/lpc_ich.h
  as lpc_chipsets is also accessed by lpc_ich_add_gpio().
- Check the kasprintf() return value for all 4 gpio controllers before
  proceeding to add the platform device with mfd_add_devices().

Changes in V5:
- Split lpc-ich driver into two parts (lpc_ich-core and lpc_ich-apl).
  The file lpc_ich-apl.c introduces gpio platform driver in MFD.
- Rename Kconfig option CONFIG_X86_INTEL_NON_ACPI to CONFIG_X86_INTEL_APL
  so that it reflects the actual product, as suggested by Mika.

Changes in V4:
- Move Kconfig option CONFIG_X86_INTEL_NON_ACPI from
  [PATCH 2/3] x86/platform/p2sb: New Primary to Sideband bridge support driver for Intel SOC's
  to
  [PATCH 3/3] mfd: lpc_ich: Add support for Intel Apollo Lake GPIO pinctrl in non-ACPI system
  since the config is used in the latter patch.
- Select CONFIG_P2SB when CONFIG_LPC_ICH is enabled.
- Remove #ifdef CONFIG_X86_INTEL_NON_ACPI and use
  #if defined(CONFIG_X86_INTEL_NON_ACPI) when lpc_ich_misc is called
  as suggested by Lee Jones.
- Use single dimensional array instead of 2D array for apl_gpio_io_res
  structure and use DEFINE_RES_IRQ for its IRQ resource.

Changes in V3:
- Simplify register addresses calculation and use DEFINE_RES_MEM_NAMED
  defines for apl_gpio_io_res structure
- Define magic number for P2SB PCI ID
- Replace switch-case with if-else since currently we have only one
  use case
- Only call mfd_add_devices() once for all gpio communities

Changes in V2:
- Add new config option CONFIG_X86_INTEL_NON_ACPI and "select PINCTRL"
  to fix kbuildbot error

 drivers/mfd/Makefile   |   3 ++
 drivers/mfd/lpc_ich_apl.c  | 121 +
 drivers/mfd/lpc_ich_apl.h  |  28 +++
 drivers/mfd/lpc_ich_core.c |   5 ++
 4 files changed, 157 insertions(+)
 create mode 100644 drivers/mfd/lpc_ich_apl.c
 create mode 100644 drivers/mfd/lpc_ich_apl.h

diff --git a/drivers/mfd/Makefile b/drivers/mfd/Makefile
index 06a91ea..b7fb703 100644
--- a/drivers/mfd/Makefile
+++ b/drivers/mfd/Makefile
@@ -161,6 +161,9 @@

[PATCH v11 3/6] x86/intel-ivi: Add Intel In-Vehicle Infotainment (IVI) systems used in cars support

2016-11-17 Thread Tan Jui Nee

Add support for non-ACPI systems, such as systems that use the Advanced Boot
Loader (ABL), where a platform device has to be created in order to bind
with PINCTRL/GPIO.

At the moment, the Intel Apollo Lake SoC requires the P2SB driver to unhide
and re-hide P2SB in order to look up the P2SB BAR and pass the PCI BAR
address to GPIO.

Signed-off-by: Tan Jui Nee 
Reviewed-by: Mika Westerberg 
---
Changes in V11:
- Select CONFIG_P2SB when CONFIG_X86_INTEL_IVI is enabled instead of
  CONFIG_LPC_ICH is enabled. This is to fix kbuildbot error.

Changes in V10:
- No change

Changes in V9:
- No change

Changes in V8:
- No change

 arch/x86/Kconfig | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index bada636..6019755 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -512,6 +512,16 @@ config X86_INTEL_CE
  This option compiles in support for the CE4100 SOC for settop
  boxes and media devices.
 
+config X86_INTEL_IVI
+   bool "Intel In-Vehicle Infotainment (IVI) systems used in cars"
+   depends on X86 && PCI
+   select P2SB
+   ---help---
+ Select this option to enable MMIO BAR access over the P2SB for
+ non-ACPI Intel Apollo Lake SoC platforms. This driver uses the P2SB
+ hide/unhide mechanism cooperatively to pass the PCI BAR address to
+ the platform driver, currently GPIO.
+
 config X86_INTEL_MID
bool "Intel MID platform support"
depends on X86_EXTENDED_PLATFORM
-- 
1.9.1
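
The "select P2SB" in this patch matters because of the stub pattern in p2sb.h
from patch 1/6: when CONFIG_P2SB is not set, p2sb_bar() is an inline stub that
returns -ENODEV, so a caller built without P2SB support fails at run time
rather than at link time. A minimal userspace sketch of that pattern follows.
HAVE_P2SB, struct resource_sketch, p2sb_bar_sketch(), add_gpio_sketch() and
the placeholder address are assumptions made for illustration and are not part
of the series.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical switch standing in for IS_ENABLED(CONFIG_P2SB). */
#ifndef HAVE_P2SB
#define HAVE_P2SB 0
#endif

/* Hypothetical, trimmed stand-in for struct resource. */
struct resource_sketch {
	uint64_t start;
	uint64_t end;
};

#if HAVE_P2SB
/* Placeholder "real" implementation; the address is made up. */
static int p2sb_bar_sketch(struct resource_sketch *res)
{
	res->start = 0xfd000000u;
	res->end = res->start;
	return 0;
}
#else
/* Stub used when support is compiled out, mirroring the header in 1/6. */
static inline int p2sb_bar_sketch(struct resource_sketch *res)
{
	(void)res;
	return -ENODEV;
}
#endif

/* Hypothetical caller: fetch the base address and propagate any error. */
static int add_gpio_sketch(uint64_t *mmio_base)
{
	struct resource_sketch base = { 0, 0 };
	int ret;

	ret = p2sb_bar_sketch(&base);
	if (ret)
		return ret;	/* -ENODEV when P2SB support is compiled out */

	*mmio_base = base.start;
	return 0;
}
```

With the default HAVE_P2SB=0 the caller sees -ENODEV and leaves its output
untouched; building with -DHAVE_P2SB=1 switches to the placeholder
implementation.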

[PATCH v11 2/6] mfd: lpc_ich: Rename lpc-ich driver

2016-11-17 Thread Tan Jui Nee

This patch follows the example of mfd/wm831x to rename the driver
from "lpc_ich" to "lpc_ich_core".

Signed-off-by: Tan Jui Nee 
Reviewed-by: Mika Westerberg 
---
Changes in V11:
- No change

Changes in V10:
- No change

Changes in V9:
- Remove the filename from the header of lpc_ich_core.c (suggested by Lee).

Changes in V8:
- Update the source file description with the new file name lpc_ich_core.c.
- Rework Makefile with new source file name lpc_ich_apl.c.

Changes in V7:
- No change

Changes in V6:
- none, just a subject line and commit message change.

 drivers/mfd/Makefile  | 1 +
 drivers/mfd/{lpc_ich.c => lpc_ich_core.c} | 2 --
 2 files changed, 1 insertion(+), 2 deletions(-)
 rename drivers/mfd/{lpc_ich.c => lpc_ich_core.c} (99%)

diff --git a/drivers/mfd/Makefile b/drivers/mfd/Makefile
index 9834e66..06a91ea 100644
--- a/drivers/mfd/Makefile
+++ b/drivers/mfd/Makefile
@@ -159,6 +159,7 @@ obj-$(CONFIG_PMIC_ADP5520)  += adp5520.o
 obj-$(CONFIG_MFD_KEMPLD)   += kempld-core.o
 obj-$(CONFIG_MFD_INTEL_QUARK_I2C_GPIO) += intel_quark_i2c_gpio.o
 obj-$(CONFIG_LPC_SCH)  += lpc_sch.o
+lpc_ich-objs   := lpc_ich_core.o
 obj-$(CONFIG_LPC_ICH)  += lpc_ich.o
 obj-$(CONFIG_MFD_RDC321X)  += rdc321x-southbridge.o
 obj-$(CONFIG_MFD_JANZ_CMODIO)  += janz-cmodio.o
diff --git a/drivers/mfd/lpc_ich.c b/drivers/mfd/lpc_ich_core.c
similarity index 99%
rename from drivers/mfd/lpc_ich.c
rename to drivers/mfd/lpc_ich_core.c
index c8dee47..7cbe037 100644
--- a/drivers/mfd/lpc_ich.c
+++ b/drivers/mfd/lpc_ich_core.c
@@ -1,6 +1,4 @@
 /*
- *  lpc_ich.c - LPC interface for Intel ICH
- *
  *  LPC bridge function of the Intel ICH contains many other
  *  functional units, such as Interrupt controllers, Timers,
  *  Power Management, System Management, GPIO, RTC, and LPC
-- 
1.9.1

[PATCH v11 0/6] pinctrl/broxton: enable platform device in the absent of ACPI enumeration

2016-11-17 Thread Tan Jui Nee

Hi,
These patches cater to non-ACPI systems, where a platform device
has to be created in order to bind with the Apollo Lake pinctrl GPIO
platform driver.

The MMIO BAR is accessed over the Primary to Sideband bridge
(P2SB). Because the BIOS prevents the P2SB device from being
enumerated by the PCI subsystem, we need to unhide the P2SB to
look up its BAR, hide it again, and pass the PCI BAR address to the
GPIO platform driver.
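
The unhide/read/hide sequence described above can be sketched as a small
userspace model. This is only an illustration, not the code from this
series: the fake config space, the hide register offset and bit, and the
helper names (fake_p2sb_init, cfg_read32, p2sb_read_bar) are all
assumptions made for the example. The one behaviour taken from PCI itself
is that config reads of a device that does not respond return all ones.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy model of one hidden PCI function; this is not kernel code. */
#define CFG_SIZE	256
#define BAR0_OFFSET	0x10	/* standard PCI BAR0 offset */
#define HIDE_OFFSET	0xe1	/* assumed location of the hide control byte */
#define HIDE_BIT	0x01	/* assumed hide bit */

struct fake_p2sb {
	uint8_t cfg[CFG_SIZE];
};

/* Set up a device that the "BIOS" left hidden, with the given BAR0 value. */
static void fake_p2sb_init(struct fake_p2sb *dev, uint32_t bar0)
{
	memset(dev->cfg, 0, sizeof(dev->cfg));
	memcpy(&dev->cfg[BAR0_OFFSET], &bar0, sizeof(bar0));
	dev->cfg[HIDE_OFFSET] = HIDE_BIT;
}

/* A hidden device reads back as all ones, like an absent device. */
static uint32_t cfg_read32(const struct fake_p2sb *dev, unsigned int off)
{
	uint32_t val;

	assert(off + sizeof(val) <= CFG_SIZE);
	if (dev->cfg[HIDE_OFFSET] & HIDE_BIT)
		return 0xffffffffu;
	memcpy(&val, &dev->cfg[off], sizeof(val));
	return val;
}

/* Unhide, read BAR0, then restore the previous hide state. */
static int p2sb_read_bar(struct fake_p2sb *dev, uint32_t *base)
{
	uint8_t saved = dev->cfg[HIDE_OFFSET];
	uint32_t bar;

	dev->cfg[HIDE_OFFSET] = (uint8_t)(saved & ~HIDE_BIT);
	bar = cfg_read32(dev, BAR0_OFFSET);
	dev->cfg[HIDE_OFFSET] = saved;

	if (bar == 0xffffffffu)
		return -1;
	*base = bar & ~0xfu;	/* drop the low flag bits of a memory BAR */
	return 0;
}
```

Restoring the saved hide state, rather than unconditionally hiding, keeps
the helper neutral if the device happened to be visible already.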

All patches in this series depend on each other.

Changes in V11:
- Select CONFIG_P2SB when CONFIG_X86_INTEL_IVI is enabled rather than
  when CONFIG_LPC_ICH is enabled. This fixes a kbuildbot error.
- Remove duplicated object file lpc_ich-objs in Makefile.
- Put p2sb.h header file in a separate section in lpc_ich-apl.c, as asm
  stuff is platform specific (suggested by Andy).
- Rearrange variable declarations in lpc_ich_add_gpio() function
  (suggested by Andy).
- Move warn_continue label before if/else statement for the sake of
  readability (suggested by Andy).
- Add comment to #endif in lpc_ich_apl.h file.

Changes in V10:
- Move P2SB into drivers/platform/x86, since it is a platform enablement
  driver (suggested by tglx).

Changes in V9:
- Remove the filename from the header of lpc_ich_core.c (suggested by
  Lee).

Changes in V8:
- Update the description of the source file with the new file name
  lpc_ich_core.c.
- Rename source file lpc_ich-apl.c to lpc_ich_apl.c (suggested by Mika).
- Rework Makefile with new source file name lpc_ich_apl.c.

Changes in V7:
- EXPORT_SYMBOL_GPL() and MODULE_LICENSE("GPL v2") are used for new file
  p2sb.c.
- Split Kconfig option CONFIG_X86_INTEL_IVI into a separate patch
  (suggested by Lee).
- Split new platform enabling into a separate patch.
- Move the lpc_chipsets enum definition into a standalone header file
  that can be used wherever its definition is needed.
- Add author information and rewrite the description of source files
  lpc_ich-apl.c and lpc_ich_apl.h.
- Sort the header files in alphabetical order in lpc_ich-apl.c.
- Rename header file lpc_ich-apl.h to lpc_ich_apl.h (suggested by Lee).
- Remove unneeded pdata_size and platform_data from mfd_cell.
  Also, remove unneeded apl_pinctrl_pdata.
- Since variable apl_p2sb is only used once, replace it with the
  PCI_DEVFN macro (suggested by Lee).
- Define APL_GPIO_COMMUNITY_MAX as the total number of Apollo Lake GPIO
  communities supported.
- Set resources in mfd_cell for each GPIO community.
- Call p2sb_bar() once instead of four times inside the for loop, and
  make p2sb_bar() just fill in the base address into a scratch
  "struct resource" so that the loop does the additions to base/end.
- Remove the entire apl_pinctrl_pdata.name memory allocation since it is
  no longer needed.
- Return ret at the end of lpc_ich_add_gpio() function.

Changes in V6:
- Rename CONFIG_X86_INTEL_APL to CONFIG_X86_INTEL_IVI so that it
  relates to the actual product, as suggested by Mika.
- Rework Makefile according to Andy's comments.
- Rename lpc_ich_misc() to lpc_ich_add_gpio() so that the name is not
  so generic, as suggested by Andy.
- Call lpc_ich_add_gpio() via priv->chipset.
- Move lpc_ich_add_gpio() function from
  .../include/linux/mfd/lpc_ich.h to
  .../drivers/mfd/lpc_ich-apl.h
  as it is part of the internal driver interface, as suggested by Andy.
- Move enum lpc_chipsets from
  .../drivers/mfd/lpc_ich-core.c to
  .../include/linux/mfd/lpc_ich.h
  as lpc_chipsets is also accessed by lpc_ich_add_gpio().
- Check the kasprintf() return value for all 4 gpio controllers before
  proceeding to add the platform device with mfd_add_devices().

Changes in V5:
- Split lpc-ich driver into two parts (lpc_ich-core and lpc_ich-apl).
  The file lpc_ich-apl.c introduces gpio platform driver in MFD.
- Rename Kconfig option CONFIG_X86_INTEL_NON_ACPI to
  CONFIG_X86_INTEL_APL so that it reflects the actual product, as
  suggested by Mika.
- The patch:
  [PATCH] pinctrl/broxton: enable platform device in the absent of ACPI
  enumeration
  is removed in the V5 patch-set as it is already applied in Linus'
  pinctrl tree.

Changes in V4:
- Move Kconfig option CONFIG_X86_INTEL_NON_ACPI from
  [PATCH 2/3] x86/platform/p2sb: New Primary to Sideband bridge support
  driver for Intel SOC's
  to
  [PATCH 3/3] mfd: lpc_ich: Add support for Intel Apollo Lake GPIO
  pinctrl in non-ACPI system
  since the config is used in the latter patch.
- Select CONFIG_P2SB when

Re: [PATCH v5] drm/mediatek: fixed the calc method of data rate per lane

2016-11-17 Thread CK Hu

Hi, Daniel:

On Fri, 2016-11-18 at 11:22 +0800, Daniel Kurtz wrote:
> Hi CK,
> 
> On Thu, Nov 17, 2016 at 1:36 PM, CK Hu  wrote:
> > Hi, Jitao:
> >
> >
> > On Wed, 2016-11-16 at 11:20 +0800, Jitao Shi wrote:
> >> Tune the DSI frame rate by pixel clock. DSI adds some extra signals
> >> (i.e. Tlpx, Ths-prepare, Ths-zero, Ths-trail, Ths-exit) when entering
> >> and exiting LP mode. Those signals make h-time larger than normal and
> >> reduce FPS, so the data rate must be multiplied by a coefficient to
> >> offset their effect:
> >>   coefficient = ((htotal*bpp/lane_number)+Tlpx+Ths_prep+Ths_zero+
> >>                  Ths_trail+Ths_exit)/(htotal*bpp/lane_number)
> >>
> >> Signed-off-by: Jitao Shi 
> >
> > It looks good to me.
> > But this patch conflicts with [1], which is one patch of the MT2701
> > series. I want to apply the MT2701 patches first, so please help to
> > refine this patch based on the MT2701 patches.
> 
> I don't think the MT2701 DSI patches are quite ready yet (I just
> reviewed the one below).
> Can we instead land Jitao's small targeted change first, and then
> rebase the MT2701 series on top?
> 
> Thanks,
> -Dan


The MT2701 series still seems to have some defects to fix.
Therefore, I will apply this patch first.
Thanks for your help.

Regards,
CK

> >
> > [1] https://patchwork.kernel.org/patch/9422821/
> >
> > Regards,
> > CK
> >
> >> ---
> >> Change since v4:
> >>  - make the calculation comment clearer.
> >>  - define the phy timings as constants.
> >>
> >> Change since v3:
> >>  - wrap the commit msg.
> >>  - fix alignment of some lines.
> >>
> >> Change since v2:
> >>  - move phy timing back to dsi_phy_timconfig.
> >>
> >> Change since v1:
> >>  - phy_timing2 and phy_timing3 refer to clock cycle time.
> >>  - define values of LPX HS_PRPR HS_ZERO HS_TRAIL TA_GO TA_SURE TA_GET
> >>    DA_HS_EXIT.
> >> ---
> >>
> >
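
The coefficient formula quoted above can be worked through with a short
standalone sketch. It is not the driver code: the struct, the function
name dsi_rate_coefficient, and the assumption that the LP/HS overhead
terms are expressed in the same unit as htotal*bpp/lane_number are all
made up for illustration.

```c
#include <assert.h>

/* Per-line overhead terms; assumed to share the unit of the payload term. */
struct dsi_lp_overhead {
	unsigned int lpx;
	unsigned int hs_prep;
	unsigned int hs_zero;
	unsigned int hs_trail;
	unsigned int hs_exit;
};

/*
 * coefficient = (payload + overhead) / payload,
 * where payload = htotal * bpp / lane_number.
 */
static double dsi_rate_coefficient(unsigned int htotal, unsigned int bpp,
				   unsigned int lanes,
				   const struct dsi_lp_overhead *t)
{
	assert(htotal > 0 && bpp > 0 && lanes > 0);

	double payload = (double)htotal * bpp / lanes;
	double overhead = (double)t->lpx + t->hs_prep + t->hs_zero +
			  t->hs_trail + t->hs_exit;

	return (payload + overhead) / payload;
}
```

With htotal = 1000, bpp = 24 and 4 lanes the payload term is 6000, so an
overhead totalling 60 gives a coefficient of about 1.01, i.e. the lane
data rate would be raised by roughly 1% to keep the intended frame rate.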

[md PATCH 4/6] md/raid1: add failfast handling for writes.

2016-11-17 Thread NeilBrown

When writing to a failfast device we use MD_FAILFAST unless
it is the only device being written to.

For resync/recovery, assume there was a working device to
read from so always use REQ_FAILFAST_DEV.

If a write for resync/recovery fails, we just fail the
device - there is not much else to do.

If a normal failfast write fails, but the device cannot be
failed (must be only one left), we queue for write error
handling.  This will call narrow_write_error() to retry the
write synchronously and without any FAILFAST flags.

Signed-off-by: NeilBrown 
---
 drivers/md/raid1.c |   26 +-
 1 file changed, 25 insertions(+), 1 deletion(-)

diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index 44f93297698d..731fd9fe79ef 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -423,7 +423,24 @@ static void raid1_end_write_request(struct bio *bio)
set_bit(MD_RECOVERY_NEEDED, &
conf->mddev->recovery);
 
-   set_bit(R1BIO_WriteError, &r1_bio->state);
+   if (test_bit(FailFast, &rdev->flags) &&
+   (bio->bi_opf & MD_FAILFAST) &&
+   /* We never try FailFast to WriteMostly devices */
+   !test_bit(WriteMostly, &rdev->flags)) {
+   md_error(r1_bio->mddev, rdev);
+   if (!test_bit(Faulty, &rdev->flags))
+   /* This is the only remaining device,
+* We need to retry the write without
+* FailFast
+*/
+   set_bit(R1BIO_WriteError, &r1_bio->state);
+   else {
+   /* Finished with this branch */
+   r1_bio->bios[mirror] = NULL;
+   to_put = bio;
+   }
+   } else
+   set_bit(R1BIO_WriteError, &r1_bio->state);
} else {
/*
 * Set R1BIO_Uptodate in our master bio, so that we
@@ -1393,6 +1410,10 @@ static void raid1_make_request(struct mddev *mddev, struct bio * bio)
mbio->bi_bdev = conf->mirrors[i].rdev->bdev;
mbio->bi_end_io = raid1_end_write_request;
bio_set_op_attrs(mbio, op, do_flush_fua | do_sync);
+   if (test_bit(FailFast, &conf->mirrors[i].rdev->flags) &&
+   !test_bit(WriteMostly, &conf->mirrors[i].rdev->flags) &&
+   conf->raid_disks - mddev->degraded > 1)
+   mbio->bi_opf |= MD_FAILFAST;
mbio->bi_private = r1_bio;
 
atomic_inc(&r1_bio->remaining);
@@ -2061,6 +2082,9 @@ static void sync_request_write(struct mddev *mddev, struct r1bio *r1_bio)
continue;
 
bio_set_op_attrs(wbio, REQ_OP_WRITE, 0);
+   if (test_bit(FailFast, &conf->mirrors[i].rdev->flags))
+   wbio->bi_opf |= MD_FAILFAST;
+
wbio->bi_end_io = end_sync_write;
atomic_inc(&r1_bio->remaining);
md_sync_acct(conf->mirrors[i].rdev->bdev, bio_sectors(wbio));
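
The write-path policy in this patch can be summarised with a small
userspace model. It is a sketch for reading along, not kernel code: the
struct member, the enum, and the helper try_fail_member() (a stand-in for
md_error() that refuses to fail the last working device) are hypothetical
names introduced here.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified per-device state; mirrors the FailFast/WriteMostly/Faulty bits. */
struct member {
	bool failfast;
	bool write_mostly;
	bool faulty;
};

enum write_error_action {
	QUEUE_WRITE_ERROR,	/* retry later, synchronously, without failfast */
	DROP_BIO,		/* device was failed; nothing more to do */
};

/* Normal writes carry failfast only if another working device remains. */
static bool use_failfast_for_write(const struct member *m,
				   int raid_disks, int degraded)
{
	return m->failfast && !m->write_mostly && raid_disks - degraded > 1;
}

/* Toy stand-in for md_error(): the last working device is never failed. */
static void try_fail_member(struct member *m, int *working)
{
	assert(*working >= 1);
	if (*working > 1) {
		m->faulty = true;
		(*working)--;
	}
}

/* Decide what to do when a write to this member reports an error. */
static enum write_error_action on_write_error(struct member *m,
					      bool bio_was_failfast,
					      int *working)
{
	if (m->failfast && bio_was_failfast && !m->write_mostly) {
		try_fail_member(m, working);
		if (!m->faulty)
			return QUEUE_WRITE_ERROR; /* only device left */
		return DROP_BIO;
	}
	return QUEUE_WRITE_ERROR;	/* ordinary write-error handling */
}
```

In this model a failed failfast write on a two-device array fails the
device and drops the bio, while the same error on the last remaining
device is queued for the slower retry path.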
