date:20160816

Re: [PATCH] x86/efi-bgrt: remove the check of the version field

2016-08-16 Thread Dave Young

On 08/15/16 at 01:56pm, Matt Fleming wrote:
> On Tue, 09 Aug, at 01:25:46PM, Icenowy Zheng wrote:
> > Some broken firmware fills in the version field of the BGRT table
> > incorrectly.
> > (See http://wiki.osdev.org/Broken_UEFI_implementations )
> > 
> > As we know, such firmware can still provide a correct BGRT image even
> > though the version field is wrong.
> > 
> > With the version check removed, the kernel extracts the image
> > correctly, and the rest of the table information is also correct.
> > 
> > Tested on a Thinkpad E531 (68854UC).
> > 
> > Signed-off-by: Icenowy Zheng 
> > ---
> >  arch/x86/platform/efi/efi-bgrt.c | 5 -----
> >  1 file changed, 5 deletions(-)
> > 
> > diff --git a/arch/x86/platform/efi/efi-bgrt.c b/arch/x86/platform/efi/efi-bgrt.c
> > index 6a2f569..f492ea0 100644
> > --- a/arch/x86/platform/efi/efi-bgrt.c
> > +++ b/arch/x86/platform/efi/efi-bgrt.c
> > @@ -47,11 +47,6 @@ void __init efi_bgrt_init(void)
> >  		       bgrt_tab->header.length, sizeof(*bgrt_tab));
> >  		return;
> >  	}
> > -	if (bgrt_tab->version != 1) {
> > -		pr_notice("Ignoring BGRT: invalid version %u (expected 1)\n",
> > -		       bgrt_tab->version);
> > -		return;
> > -	}
> >  	if (bgrt_tab->status & 0xfe) {
> >  		pr_notice("Ignoring BGRT: reserved status bits are non-zero %u\n",
> >  		       bgrt_tab->status);
> 
> This would be less scary if we checked for known broken and known good
> version values instead of removing the check altogether, i.e. 0 and 1.

Could we add a quirk for the broken hardware instead of changing
the normal code?

> 
> The whole point of the version field is that it tells us about the
> layout of the BGRT table, so it's not exactly a useless check.

Agreed.

Thanks
Dave
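
For reference, the alternative Matt suggests (accept the known-good value 1 and the known-broken value 0, reject everything else) could look roughly like the userspace sketch below. The helper name bgrt_version_acceptable() is hypothetical and is not part of the patch or of efi-bgrt.c; it only illustrates the shape of such a check.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not kernel code: accept only the version values
 * named in the thread instead of dropping the check entirely.
 */
bool bgrt_version_acceptable(uint16_t version)
{
	switch (version) {
	case 1:	/* known-good value that the existing check expects */
	case 0:	/* known-broken value mentioned in the review */
		return true;
	default:
		return false;	/* unknown layout, keep ignoring the table */
	}
}
```

A check of this kind keeps rejecting tables whose version hints at a different layout, which is the concern raised about removing the check altogether.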

Re: [lkp] [x86/hweight] 65ea11ec6a: will-it-scale.per_process_ops 9.3% improvement

2016-08-16 Thread Borislav Petkov

On Tue, Aug 16, 2016 at 04:09:19PM -0700, H. Peter Anvin wrote:
> On August 16, 2016 10:16:35 AM PDT, Borislav Petkov  wrote:
> >On Tue, Aug 16, 2016 at 09:59:00AM -0700, H. Peter Anvin wrote:
> >> Dang...
> >
> >Isn't 9.3% improvement a good thing(tm) ?
> 
> Yes, it's huge.  The only explanation I could imagine is that scrambling %rdi 
> caused the scheduler to do completely the wrong thing.

I'm questioning the validity, actually. Report says test machine was
Sandy Bridge-EP and I'd bet good money this one has POPCNT support so
how are we even hitting that __sw_hweight64() path, at all?

-- 
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
--
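
For readers unfamiliar with the path being questioned: a software Hamming-weight routine is the fallback used when the CPU has no POPCNT instruction. The block below is a portable sketch of such a bit-counting fallback using the classic parallel-sum technique. It is not the kernel's __sw_hweight64() source, and the name sw_hweight64_sketch() is made up for illustration.

```c
#include <stdint.h>

/*
 * Illustrative software population count (parallel bit summing).
 * Hypothetical userspace helper, not the kernel implementation.
 */
unsigned int sw_hweight64_sketch(uint64_t w)
{
	/* Sum adjacent bits into 2-bit fields. */
	w -= (w >> 1) & 0x5555555555555555ULL;
	/* Sum 2-bit fields into 4-bit fields. */
	w = (w & 0x3333333333333333ULL) + ((w >> 2) & 0x3333333333333333ULL);
	/* Sum 4-bit fields into per-byte counts. */
	w = (w + (w >> 4)) & 0x0f0f0f0f0f0f0f0fULL;
	/* Add all byte counts together into the top byte. */
	return (unsigned int)((w * 0x0101010101010101ULL) >> 56);
}
```

On a machine with POPCNT the same result comes from a single instruction, which is why hitting the software path on Sandy Bridge-EP would be surprising.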

Re: [PATCH v3 2/5] net: ethernet: ti: davinci_cpdma: fix locking while ctrl_stop

2016-08-16 Thread Mugunthan V N

On Tuesday 16 August 2016 04:55 AM, Ivan Khoronzhuk wrote:
> The interrupts shouldn't be disabled while receiving skbs, but during
> ctrl_stop the channels are stopped and all remaining packets are
> handled with netif_receive_skb():
> 
> lock_irq_save
> cpdma_ctlr_stop
>     cpdma_chan_stop
>         __cpdma_chan_free
>             cpsw_rx_handler
>                 netif_receive_skb
> 
> So, split the locking in ctrl stop so that interrupts stay enabled
> while the skbs are handled. Otherwise it can trigger a WARN_ONCE in
> the rare case where ctrl is stopped before all packets have been
> handled by NAPI.
> 
> Signed-off-by: Ivan Khoronzhuk 

Reviewed-by: Mugunthan V N 

Regards
Mugunthan V N
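
The ordering described in the commit message can be modelled in a few lines of userspace C. This is only a sketch of the idea (update state under the lock, drop the lock, then run the receive handlers); it is not the davinci_cpdma code, and every name in it (ctlr_sketch, sketch_ctlr_stop(), NUM_PENDING, and so on) is hypothetical. A plain flag stands in for "spinlock held with interrupts disabled".

```c
#include <stdbool.h>

#define NUM_PENDING 4	/* hypothetical number of packets still queued */

struct ctlr_sketch {
	bool lock_held;		/* models "lock taken, IRQs disabled" */
	bool stopping;		/* controller state changed under the lock */
	int handled;		/* packets passed to the rx handler */
	int handled_locked;	/* packets handled while the lock was held */
};

void sketch_lock(struct ctlr_sketch *c)   { c->lock_held = true; }
void sketch_unlock(struct ctlr_sketch *c) { c->lock_held = false; }

/* Stands in for the rx handler that ends up in netif_receive_skb(). */
void sketch_rx_handler(struct ctlr_sketch *c)
{
	c->handled++;
	if (c->lock_held)
		c->handled_locked++;	/* this is the case to avoid */
}

void sketch_ctlr_stop(struct ctlr_sketch *c)
{
	int i;

	/* First critical section: only flip the state. */
	sketch_lock(c);
	c->stopping = true;
	sketch_unlock(c);

	/* Remaining packets are handled with the lock dropped. */
	for (i = 0; i < NUM_PENDING; i++)
		sketch_rx_handler(c);

	/* Second critical section: finish the state change. */
	sketch_lock(c);
	c->stopping = false;
	sketch_unlock(c);
}
```

After sketch_ctlr_stop() returns, handled_locked stays at zero, which is the property the split locking is meant to provide.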

Re: [PATCH v5 3/4] drm/bridge: analogix_dp: add the PSR function support

2016-08-16 Thread Archit Taneja


Hi,

On 07/24/2016 12:27 PM, Yakir Yang wrote:

PSR stands for Panel Self Refresh: the panel can refresh itself from
its own hardware framebuffer, which saves a lot of power.

This patch exports two symbols so that platform drivers can implement
the PSR function on the hardware side:
- analogix_dp_enable_psr()
- analogix_dp_disable_psr()


Could this in any way mess things up if the dev_type is EXYNOS_DP?

Otherwise,

Reviewed-by: Archit Taneja 



Signed-off-by: Yakir Yang 
Reviewed-by: Sean Paul 
---
Changes in v5:
- Add reviewed flag from Sean.

Changes in v4.1:
- Make use of the existing edp_vsc_psr struct to swap the HBx and DBx settings. (Sean)
- Remove the PSR_VID_CRC_FLUSH setting in analogix_dp_enable_psr_crc().
- Add a comment about the PBx magic numbers. (Sean)

Changes in v4:
- Downgrade the PSR version print message to debug level. (Sean)
- Return 'void' instead of 'int' in analogix_dp_enable_sink_psr(). (Sean)
- Delete the unused DPCD read operations in analogix_dp_enable_sink_psr(). (Sean)
- Delete the arbitrary usleep_range in analogix_dp_enable_psr_crc(). (Sean)
- Clean up the hardcoded values in analogix_dp_send_psr_spd(). (Sean)
- Rename "active/inactive" to "enable/disable". (Sean, Dominik)
- Keep setting the PSR_VID_CRC_FLUSH gate in analogix_dp_enable_psr_crc().

Changes in v3:
- Split analogix_dp_enable_psr() to make it clearer:
 analogix_dp_detect_sink_psr()
 analogix_dp_enable_sink_psr()
- Remove some noisy register-setting comments.

Changes in v2:
- Introduced in v2; split out the common Analogix DP changes.

  drivers/gpu/drm/bridge/analogix/analogix_dp_core.c | 81 ++
  drivers/gpu/drm/bridge/analogix/analogix_dp_core.h |  5 ++
  drivers/gpu/drm/bridge/analogix/analogix_dp_reg.c  | 51 ++
  drivers/gpu/drm/bridge/analogix/analogix_dp_reg.h  | 34 +
  include/drm/bridge/analogix_dp.h   |  3 +
  5 files changed, 174 insertions(+)

diff --git a/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c b/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c
index 32715da..381b25e 100644
--- a/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c
+++ b/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c
@@ -97,6 +97,83 @@ static int analogix_dp_detect_hpd(struct analogix_dp_device *dp)
 	return 0;
 }

+int analogix_dp_enable_psr(struct device *dev)
+{
+	struct analogix_dp_device *dp = dev_get_drvdata(dev);
+	struct edp_vsc_psr psr_vsc;
+
+	if (!dp->psr_support)
+		return -EINVAL;
+
+	/* Prepare VSC packet as per EDP 1.4 spec, Table 6.9 */
+	memset(&psr_vsc, 0, sizeof(psr_vsc));
+	psr_vsc.sdp_header.HB0 = 0;
+	psr_vsc.sdp_header.HB1 = 0x7;
+	psr_vsc.sdp_header.HB2 = 0x2;
+	psr_vsc.sdp_header.HB3 = 0x8;
+
+	psr_vsc.DB0 = 0;
+	psr_vsc.DB1 = EDP_VSC_PSR_STATE_ACTIVE | EDP_VSC_PSR_CRC_VALUES_VALID;
+
+	analogix_dp_send_psr_spd(dp, &psr_vsc);
+	return 0;
+}
+EXPORT_SYMBOL_GPL(analogix_dp_enable_psr);
+
+int analogix_dp_disable_psr(struct device *dev)
+{
+	struct analogix_dp_device *dp = dev_get_drvdata(dev);
+	struct edp_vsc_psr psr_vsc;
+
+	if (!dp->psr_support)
+		return -EINVAL;
+
+	/* Prepare VSC packet as per EDP 1.4 spec, Table 6.9 */
+	memset(&psr_vsc, 0, sizeof(psr_vsc));
+	psr_vsc.sdp_header.HB0 = 0;
+	psr_vsc.sdp_header.HB1 = 0x7;
+	psr_vsc.sdp_header.HB2 = 0x2;
+	psr_vsc.sdp_header.HB3 = 0x8;
+
+	psr_vsc.DB0 = 0;
+	psr_vsc.DB1 = 0;
+
+	analogix_dp_send_psr_spd(dp, &psr_vsc);
+	return 0;
+}
+EXPORT_SYMBOL_GPL(analogix_dp_disable_psr);
+
+static bool analogix_dp_detect_sink_psr(struct analogix_dp_device *dp)
+{
+	unsigned char psr_version;
+
+	analogix_dp_read_byte_from_dpcd(dp, DP_PSR_SUPPORT, &psr_version);
+	dev_dbg(dp->dev, "Panel PSR version : %x\n", psr_version);
+
+	return (psr_version & DP_PSR_IS_SUPPORTED) ? true : false;
+}
+
+static void analogix_dp_enable_sink_psr(struct analogix_dp_device *dp)
+{
+	unsigned char psr_en;
+
+	/* Disable psr function */
+	analogix_dp_read_byte_from_dpcd(dp, DP_PSR_EN_CFG, &psr_en);
+	psr_en &= ~DP_PSR_ENABLE;
+	analogix_dp_write_byte_to_dpcd(dp, DP_PSR_EN_CFG, psr_en);
+
+	/* Main-Link transmitter remains active during PSR active states */
+	psr_en = DP_PSR_MAIN_LINK_ACTIVE | DP_PSR_CRC_VERIFICATION;
+	analogix_dp_write_byte_to_dpcd(dp, DP_PSR_EN_CFG, psr_en);
+
+	/* Enable psr function */
+	psr_en = DP_PSR_ENABLE | DP_PSR_MAIN_LINK_ACTIVE |
+		 DP_PSR_CRC_VERIFICATION;
+	analogix_dp_write_byte_to_dpcd(dp, DP_PSR_EN_CFG, psr_en);
+
+	analogix_dp_enable_psr_crc(dp);
+}
+
 static unsigned char analogix_dp_calc_edid_check_sum(unsigned char *edid_data)
 {
 	int i;
@@ -921,6 +998,10 @@ static void analogix_dp_commit(struct analogix_dp_device *dp)

 	/* Enable video */

Re: [LKP] [lkp] [sctp] a6c2f79287: netperf.Throughput_Mbps -37.2% regression

2016-08-16 Thread Xin Long

> The perf-profile data for the two commits is attached (for the
> prsctp_enable=1 case; the data didn't get collected for the 0 case
> for some reason, and I'm looking into that now).
>
> The CPU gets much more idle time in the bisected commit a6c2f79287:
>
> 68.89% 0.70%  [kernel.kallsyms]   [k] entry_SYSCALL_64_fastpath
> 49.32% 0.12%  [kernel.kallsyms]   [k] sys_sendmsg
> 49.17% 0.12%  [kernel.kallsyms]   [k] __sys_sendmsg
> 48.58% 0.22%  [kernel.kallsyms]   [k] ___sys_sendmsg
> 46.69% 0.06%  [kernel.kallsyms]   [k] sock_sendmsg
> 46.31% 0.16%  [kernel.kallsyms]   [k] inet_sendmsg
> 45.90% 0.98%  [kernel.kallsyms]   [k] sctp_sendmsg
> 29.66% 0.45%  [kernel.kallsyms]   [k] sctp_do_sm
> 29.54% 0.23%  [kernel.kallsyms]   [k] cpu_startup_entry
> 28.81% 0.68%  [kernel.kallsyms]   [k] sctp_cmd_interpreter.isra.24
> 26.20% 0.00%  [kernel.kallsyms]   [k] start_secondary
> 23.04% 0.09%  [kernel.kallsyms]   [k] sctp_inq_push
> 23.03% 0.08%  [kernel.kallsyms]   [k] call_cpuidle
> 22.94% 0.00%  [kernel.kallsyms]   [k] cpuidle_enter
> 22.60% 0.18%  [kernel.kallsyms]   [k] cpuidle_enter_state
> 21.99% 21.99%  [kernel.kallsyms]   [k] intel_idle
> ... ...
>
> While its immediate parent commit 826d253d57 is mostly busy working:
>
> 98.53% 0.83%  [kernel.kallsyms]   [k] entry_SYSCALL_64_fastpath
> 78.13% 0.12%  [kernel.kallsyms]   [k] sys_sendmsg
> 78.03% 0.16%  [kernel.kallsyms]   [k] __sys_sendmsg
> 77.08% 0.28%  [kernel.kallsyms]   [k] ___sys_sendmsg
> 74.44% 0.08%  [kernel.kallsyms]   [k] sock_sendmsg
> 73.82% 0.13%  [kernel.kallsyms]   [k] inet_sendmsg
> 73.34% 1.44%  [kernel.kallsyms]   [k] sctp_sendmsg
> 47.52% 0.75%  [kernel.kallsyms]   [k] sctp_do_sm
> 46.19% 0.90%  [kernel.kallsyms]   [k] sctp_cmd_interpreter.isra.24
> 37.17% 1.43%  [kernel.kallsyms]   [k] sctp_outq_flush
> 36.93% 0.08%  [kernel.kallsyms]   [k] sctp_outq_uncork
> 34.24% 0.15%  [kernel.kallsyms]   [k] sctp_inq_push
> ... ...
> No idle-related function is above 1%.
>
> Could the bisected commit be causing this idle time?
No, not at all. :)

Please help debug this as I described in my last reply.

Re: [LKP] [lkp] [sctp] a6c2f79287: netperf.Throughput_Mbps -37.2% regression

2016-08-16 Thread Aaron Lu

On 08/17/2016 01:04 PM, Aaron Lu wrote:
> On 08/16/2016 05:56 PM, Xin Long wrote:
>
> I'm testing on Linus' master, can we all use that please?
>

 [git] git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

 [machine]
 Intel(R) Xeon(R) CPU E5-2690 v2 @ 3.00GHz
 mem 62G (66000220K)

 [system]
 # cat /etc/redhat-release
 Red Hat Enterprise Linux Server release 7.3 Beta (Maipo)

 [commit 3684b03]
 [root@hp-dl380pg8-11 lxin]# uname -r
 4.8.0-rc2.3684b03
 [root@hp-dl380pg8-11 lxin]# cat test.sh
 killall -0 netserver || netserver -4 &
 netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1
>>>
>>> I just realized the test we are doing is not exactly the same.
>>> As the original report says:
>>> ip: ipv4
>>> runtime: 300s
>>> nr_threads: 200%
>>> cluster: cs-localhost
>>> send_size: 10K
>>> test: SCTP_STREAM_MANY
>>> cpufreq_governor: performance
>>>
>>> Note the nr_threads: 200%, which means starting twice as many netperf
>>> processes as there are CPUs.
>>>
>>> In our IVB i3(2 cores, 2 threads per core) case, 8 netperf processes
>>> are started concurrently:
>> OK, understand.
>>
>>>
>>> 2016-07-27 03:48:09 netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1 &
>>> 2016-07-27 03:48:09 netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1 &
>>> 2016-07-27 03:48:09 netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1 &
>>> 2016-07-27 03:48:09 netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1 &
>>> 2016-07-27 03:48:09 netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1 &
>>> 2016-07-27 03:48:09 netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1 &
>>> 2016-07-27 03:48:09 netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1 &
>>> 2016-07-27 03:48:09 netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1 &
>>>
>>> The throughput is the average of those runs.
>>>
>>> And I think we should be doing test on:
>>> commit a6c2f79287 ("sctp: implement prsctp TTL policy") (the bisected one)
>>> and
>>> commit 826d253d57 ("sctp: add SCTP_PR_ASSOC_STATUS on sctp sockopt") (its immediate parent)
>>> instead of Linus' master HEAD to avoid other factors.
>>>
>> OK, I will do tests as your suggestion now,  but need to rebuild again :D
>>
>> can you disable pr_enable with "sysctl -w net.sctp.prsctp_enable=0",
>> then try again?
> 
> For commit a6c2f79287 ("sctp: implement prsctp TTL policy"), no matter
> the value of net.sctp.prsctp_enable, the throughput is almost the same:

The perf-profile data for the two commits is attached (for the
prsctp_enable=1 case; the data didn't get collected for the 0 case
for some reason, and I'm looking into that now).

The CPU gets much more idle time in the bisected commit a6c2f79287:

68.89% 0.70%  [kernel.kallsyms]   [k] entry_SYSCALL_64_fastpath
49.32% 0.12%  [kernel.kallsyms]   [k] sys_sendmsg
49.17% 0.12%  [kernel.kallsyms]   [k] __sys_sendmsg
48.58% 0.22%  [kernel.kallsyms]   [k] ___sys_sendmsg
46.69% 0.06%  [kernel.kallsyms]   [k] sock_sendmsg
46.31% 0.16%  [kernel.kallsyms]   [k] inet_sendmsg
45.90% 0.98%  [kernel.kallsyms]   [k] sctp_sendmsg
29.66% 0.45%  [kernel.kallsyms]   [k] sctp_do_sm
29.54% 0.23%  [kernel.kallsyms]   [k] cpu_startup_entry
28.81% 0.68%  [kernel.kallsyms]   [k] sctp_cmd_interpreter.isra.24
26.20% 0.00%  [kernel.kallsyms]   [k] start_secondary
23.04% 0.09%  [kernel.kallsyms]   [k] sctp_inq_push
23.03% 0.08%  [kernel.kallsyms]   [k] call_cpuidle
22.94% 0.00%  [kernel.kallsyms]   [k] cpuidle_enter
22.60% 0.18%  [kernel.kallsyms]   [k] cpuidle_enter_state
21.99% 21.99%  [kernel.kallsyms]   [k] intel_idle
... ...

While its immediate parent commit 826d253d57 is mostly busy working:

98.53% 0.83%  [kernel.kallsyms]   [k] entry_SYSCALL_64_fastpath
78.13% 0.12%  [kernel.kallsyms]   [k] sys_sendmsg
78.03% 0.16%  [kernel.kallsyms]   [k] __sys_sendmsg
77.08% 0.28%  [kernel.kallsyms]   [k] ___sys_sendmsg
74.44% 0.08%  [kernel.kallsyms]   [k] sock_sendmsg
73.82% 0.13%  [kernel.kallsyms]   [k] inet_sendmsg
73.34% 1.44%  [kernel.kallsyms]   [k] sctp_sendmsg
47.52% 0.75%  [kernel.kallsyms]   [k] sctp_do_sm
46.19% 0.90%  [kernel.kallsyms]   [k] sctp_cmd_interpreter.isra.24
37.17% 1.43%  [kernel.kallsyms]   [k] sctp_outq_flush
36.93% 0.08%  [kernel.kallsyms]   [k] sctp_outq_uncork
34.24% 0.15%  [kernel.kallsyms]   [k] sctp_inq_push
... ...
No idle-related function is above 1%.

Could the bisected commit be causing this idle time?

Thanks,
Aaron


perf-profile-a6c2f79287.gz
Description: application/gzip


perf-profile-826d253d57.gz
Description:
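
The process count quoted above follows directly from the nr_threads setting: 2 cores with 2 threads each give 4 logical CPUs, and 200% of that is 8 netperf processes. A small sketch of the arithmetic, using a hypothetical helper name nr_processes() that is not part of the LKP tooling:

```c
/*
 * Hypothetical helper: how many netperf processes an
 * "nr_threads: N%" setting starts on a given machine.
 */
unsigned int nr_processes(unsigned int cores, unsigned int threads_per_core,
			  unsigned int nr_threads_percent)
{
	unsigned int cpus = cores * threads_per_core;	/* logical CPUs */

	return cpus * nr_threads_percent / 100;
}
```

A single netperf instance, as in the test.sh shown earlier in the thread, therefore exercises a quite different load than the original report.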

Re: [LKP] [lkp] [sctp] a6c2f79287: netperf.Throughput_Mbps -37.2% regression

2016-08-16 Thread Xin Long

>
> For commit a6c2f79287 ("sctp: implement prsctp TTL policy"), no matter
> the value of net.sctp.prsctp_enable, the throughput is almost the same:
>
> net.sctp.prsctp_enable = 0
> {
>   "netperf.Throughput_Mbps": [
> 2353.311249997
>   ]
> }
>
> net.sctp.prsctp_enable = 1
> {
>   "netperf.Throughput_Mbps": [
> 2371.586250003
>   ]
> }
>
> For its immediate parent:
> commit 826d253d57 ("sctp: add SCTP_PR_ASSOC_STATUS on sctp sockopt")
> No matter the value of net.sctp.prsctp_enable, the throughput is again
> almost the same:
>
> net.sctp.prsctp_enable = 0
> {
>   "netperf.Throughput_Mbps": [
> 3838.83004
>   ]
> }
>
> net.sctp.prsctp_enable = 1
> {
>   "netperf.Throughput_Mbps": [
> 3751.46005
>   ]
> }
>
> Does this result give any hint?
OK. With prsctp_enable disabled, commit a6c2f79287 really only adds
two if () checks, which definitely can't affect performance.

If it really is an issue, please revert the code from commit a6c2f79287
piece by piece, rebuilding the kernel and retesting each time. That
should show exactly which line causes the performance drop. It seems
to be the only way to locate the issue, since so far it is only
reproducible in your environment.

Thanks.
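
To put the quoted throughput numbers side by side: relative to the parent commit, the bisected commit is down by roughly 38.7% with prsctp_enable=0 and roughly 36.8% with prsctp_enable=1, which is in line with the -37.2% in the subject. The sketch below shows the calculation; pct_change() is a hypothetical helper, not something from the LKP scripts.

```c
/*
 * Hypothetical helper: relative change of the bisected commit's
 * throughput versus its parent, in percent (negative = regression).
 */
double pct_change(double parent_mbps, double bisected_mbps)
{
	return (bisected_mbps - parent_mbps) / parent_mbps * 100.0;
}
```

For example, pct_change(3838.83004, 2353.311249997) gives the prsctp_enable=0 figure and pct_change(3751.46005, 2371.586250003) the prsctp_enable=1 figure.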

Re: [PATCH v6 0/5] /dev/random - a new approach

2016-08-16 Thread Stephan Mueller

On Tuesday, 16 August 2016, 15:28:45 CEST, H. Peter Anvin wrote:

Hi Peter,

> > 
> > There are two motivations for that:
> > 
> > - the current /dev/random is compliant with NTG.1 from AIS 20/31 which
> > requires (in brief words) that entropy comes from auditable noise
> > sources. Currently in my LRNG only RDRAND is a fast noise source which is
> > not auditable (and it is designed to cause a VM exit making it even
> > harder to assess it). To make the LRNG comply with NTG.1, RDRAND can
> > provide entropy but must not become the sole entropy provider which is
> > the case now with that change.
> > 
> > - the current /dev/random implementation follows the same concept with the
> > exception of 3.15 and 3.16 where RDRAND was not rate-limited. In later
> > versions, this was changed.
> 
> I'm not saying it should be *sole*.  I am questioning the value in
> limiting it, as it seems to me that it could only ever produce a worse
> result.

It is not about limiting the data. It is all about the entropy estimate
for those noise sources and how they affect the entropy estimator behind
/dev/random. If that fast noise source injects a large amount of data but does
not increase the entropy estimator, it is of no concern.

Ciao
Stephan
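
The distinction made above, between how much data a source injects and how much entropy it is credited with, can be illustrated with a toy model. This is a sketch under stated assumptions and not the LRNG or /dev/random implementation: the class name, the 256-bit capacity, the SHA-256 mixing step and the per-call credit values are all hypothetical.

```python
import hashlib


class ToyEntropyPool:
    """Toy model: mixing data and crediting entropy are separate steps."""

    def __init__(self, capacity_bits: int = 256) -> None:
        self.capacity_bits = capacity_bits
        self.estimate_bits = 0        # stands in for the entropy estimator
        self._state = b"\x00" * 32    # stands in for the pool contents

    def mix(self, data: bytes, credit_bits: int) -> None:
        """Mix `data` into the pool and raise the estimate by `credit_bits`."""
        # Mixing always happens, regardless of how much entropy is credited.
        self._state = hashlib.sha256(self._state + data).digest()
        # The estimate only grows by the credited amount, capped at capacity.
        self.estimate_bits = min(self.capacity_bits,
                                 self.estimate_bits + max(0, credit_bits))


pool = ToyEntropyPool()

# Hypothetical fast source: a lot of data, credited with zero bits.
for _ in range(1000):
    pool.mix(b"\xAA" * 64, credit_bits=0)
after_fast = pool.estimate_bits

# Hypothetical audited source: little data, credited conservatively.
pool.mix(b"\x01\x02", credit_bits=1)
after_slow = pool.estimate_bits

print(after_fast, after_slow)
```

In this model the fast source changes the pool state a thousand times without moving the estimate, which is the sense in which its data volume is "of no concern" to the estimator.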

[PATCH v3 1/5] arm: dts: msm8974: Add thermal zones, tsens and qfprom nodes

2016-08-16 Thread Rajendra Nayak

Add thermal zones, tsens and qfprom nodes

Acked-by: Eduardo Valentin 
Acked-by: Rob Herring 
Signed-off-by: Rajendra Nayak 
---
 arch/arm/boot/dts/qcom-msm8974.dtsi | 103 
 1 file changed, 103 insertions(+)

diff --git a/arch/arm/boot/dts/qcom-msm8974.dtsi 
b/arch/arm/boot/dts/qcom-msm8974.dtsi
index 561d4d1..255c61a 100644
--- a/arch/arm/boot/dts/qcom-msm8974.dtsi
+++ b/arch/arm/boot/dts/qcom-msm8974.dtsi
@@ -131,6 +131,88 @@
};
};
 
+   thermal-zones {
+   cpu-thermal0 {
+   polling-delay-passive = <250>;
+   polling-delay = <1000>;
+
+   thermal-sensors = <&tsens 5>;
+
+   trips {
+   cpu_alert0: trip0 {
+   temperature = <75000>;
+   hysteresis = <2000>;
+   type = "passive";
+   };
+   cpu_crit0: trip1 {
+   temperature = <110000>;
+   hysteresis = <2000>;
+   type = "critical";
+   };
+   };
+   };
+
+   cpu-thermal1 {
+   polling-delay-passive = <250>;
+   polling-delay = <1000>;
+
+   thermal-sensors = <&tsens 6>;
+
+   trips {
+   cpu_alert1: trip0 {
+   temperature = <75000>;
+   hysteresis = <2000>;
+   type = "passive";
+   };
+   cpu_crit1: trip1 {
+   temperature = <110000>;
+   hysteresis = <2000>;
+   type = "critical";
+   };
+   };
+   };
+
+   cpu-thermal2 {
+   polling-delay-passive = <250>;
+   polling-delay = <1000>;
+
+   thermal-sensors = <&tsens 7>;
+
+   trips {
+   cpu_alert2: trip0 {
+   temperature = <75000>;
+   hysteresis = <2000>;
+   type = "passive";
+   };
+   cpu_crit2: trip1 {
+   temperature = <110000>;
+   hysteresis = <2000>;
+   type = "critical";
+   };
+   };
+   };
+
+   cpu-thermal3 {
+   polling-delay-passive = <250>;
+   polling-delay = <1000>;
+
+   thermal-sensors = <&tsens 8>;
+
+   trips {
+   cpu_alert3: trip0 {
+   temperature = <75000>;
+   hysteresis = <2000>;
+   type = "passive";
+   };
+   cpu_crit3: trip1 {
+   temperature = <110000>;
+   hysteresis = <2000>;
+   type = "critical";
+   };
+   };
+   };
+   };
+
cpu-pmu {
compatible = "qcom,krait-pmu";
interrupts = <1 7 0xf04>;
@@ -287,6 +369,27 @@
reg = <0xf9011000 0x1000>;
};
 
+   qfprom: qfprom@fc4bc000 {
+   #address-cells = <1>;
+   #size-cells = <1>;
+   compatible = "qcom,qfprom";
+   reg = <0xfc4bc000 0x1000>;
+   tsens_calib: calib@d0 {
+   reg = <0xd0 0x18>;
+   };
+   tsens_backup: backup@440 {
+   reg = <0x440 0x10>;
+   };
+   };
+
+   tsens: thermal-sensor@fc4a8000 {
+   compatible = "qcom,msm8974-tsens";
+   reg = <0xfc4a8000 0x2000>;
+   nvmem-cells = <&tsens_calib>, <&tsens_backup>;
+   nvmem-cell-names = "calib", "calib_backup";
+   #thermal-sensor-cells = <1>;
+   };
+
timer@f902 {
#address-cells = <1>;

[PATCH v3 0/5] dts patches for qcom tsens support

2016-08-16 Thread Rajendra Nayak

Hey Andy,

This is a respin of v2 with some minor fixes pointed out by Rob.
Please pull these in for 4.9

Thanks,
Rajendra

Rajendra Nayak (5):
  arm: dts: msm8974: Add thermal zones, tsens and qfprom nodes
  arm: dts: apq8064: Add thermal zones, tsens and qfprom nodes
  arm: dts: apq8084: Add thermal zones, tsens and qfprom nodes
  arm64: dts: msm8916: Add thermal zones, tsens and qfprom nodes
  arm64: dts: msm8996: Add thermal zones, tsens and qfprom nodes

 .../devicetree/bindings/clock/qcom,gcc.txt |  16 
 arch/arm/boot/dts/qcom-apq8064.dtsi| 103 +
 arch/arm/boot/dts/qcom-apq8084.dtsi| 103 +
 arch/arm/boot/dts/qcom-msm8974.dtsi| 103 +
 arch/arm64/boot/dts/qcom/msm8916.dtsi  |  64 +
 arch/arm64/boot/dts/qcom/msm8996.dtsi  |  92 ++
 6 files changed, 481 insertions(+)

-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation

[PATCH v3 2/5] arm: dts: apq8064: Add thermal zones, tsens and qfprom nodes

2016-08-16 Thread Rajendra Nayak

TSENS is part of GCC, hence add TSENS properties as part of GCC node.
Also add thermal zones and qfprom nodes.
Update GCC bindings doc to mention the possibility of optional TSENS
properties that can be part of GCC node.

Acked-by: Eduardo Valentin 
Acked-by: Rob Herring 
Signed-off-by: Rajendra Nayak 
---
 .../devicetree/bindings/clock/qcom,gcc.txt |  16 
 arch/arm/boot/dts/qcom-apq8064.dtsi| 103 +
 2 files changed, 119 insertions(+)

diff --git a/Documentation/devicetree/bindings/clock/qcom,gcc.txt 
b/Documentation/devicetree/bindings/clock/qcom,gcc.txt
index 9a60fde..ea893cb 100644
--- a/Documentation/devicetree/bindings/clock/qcom,gcc.txt
+++ b/Documentation/devicetree/bindings/clock/qcom,gcc.txt
@@ -22,6 +22,11 @@ Required properties :
 
 Optional properties :
 - #power-domain-cells : shall contain 1
+- Qualcomm TSENS (thermal sensor device) on some devices can
+be part of GCC and hence the TSENS properties can also be
+part of the GCC/clock-controller node.
+For more details on the TSENS properties please refer
+Documentation/devicetree/bindings/thermal/qcom-tsens.txt
 
 Example:
clock-controller@90 {
@@ -31,3 +36,14 @@ Example:
#reset-cells = <1>;
#power-domain-cells = <1>;
};
+
+Example of GCC with TSENS properties:
+   clock-controller@90 {
+   compatible = "qcom,gcc-apq8064";
+   reg = <0x0090 0x4000>;
+   nvmem-cells = <&tsens_calib>, <&tsens_backup>;
+   nvmem-cell-names = "calib", "calib_backup";
+   #clock-cells = <1>;
+   #reset-cells = <1>;
+   #thermal-sensor-cells = <1>;
+   };
diff --git a/arch/arm/boot/dts/qcom-apq8064.dtsi 
b/arch/arm/boot/dts/qcom-apq8064.dtsi
index 74a9b6c..0313da3 100644
--- a/arch/arm/boot/dts/qcom-apq8064.dtsi
+++ b/arch/arm/boot/dts/qcom-apq8064.dtsi
@@ -86,6 +86,92 @@
};
};
 
+   thermal-zones {
+   cpu-thermal0 {
+   polling-delay-passive = <250>;
+   polling-delay = <1000>;
+
+   thermal-sensors = < 7>;
+   coefficients = <1199 0>;
+
+   trips {
+   cpu_alert0: trip0 {
+   temperature = <75000>;
+   hysteresis = <2000>;
+   type = "passive";
+   };
+   cpu_crit0: trip1 {
+   temperature = <110000>;
+   hysteresis = <2000>;
+   type = "critical";
+   };
+   };
+   };
+
+   cpu-thermal1 {
+   polling-delay-passive = <250>;
+   polling-delay = <1000>;
+
+   thermal-sensors = < 8>;
+   coefficients = <1132 0>;
+
+   trips {
+   cpu_alert1: trip0 {
+   temperature = <75000>;
+   hysteresis = <2000>;
+   type = "passive";
+   };
+   cpu_crit1: trip1 {
+   temperature = <110000>;
+   hysteresis = <2000>;
+   type = "critical";
+   };
+   };
+   };
+
+   cpu-thermal2 {
+   polling-delay-passive = <250>;
+   polling-delay = <1000>;
+
+   thermal-sensors = < 9>;
+   coefficients = <1199 0>;
+
+   trips {
+   cpu_alert2: trip0 {
+   temperature = <75000>;
+   hysteresis = <2000>;
+   type = "passive";
+   };
+   cpu_crit2: trip1 {
+   temperature = <110000>;
+   hysteresis = <2000>;
+   type = "critical";
+   };
+   };
+   };
+
+   cpu-thermal3 {
+   polling-delay-passive = <250>;
+   polling-delay = <1000>;
+
+   thermal-sensors = < 10>;
+   coefficients = <1132 0>;
+
+   trips {
+   cpu_alert3: trip0 {
+   temperature = <75000>;
+

[PATCH v3 4/5] arm64: dts: msm8916: Add thermal zones, tsens and qfprom nodes

2016-08-16 Thread Rajendra Nayak

Add thermal zones, tsens and qfprom nodes

Acked-by: Eduardo Valentin 
Acked-by: Rob Herring 
Signed-off-by: Rajendra Nayak 
---
 arch/arm64/boot/dts/qcom/msm8916.dtsi | 64 +++
 1 file changed, 64 insertions(+)

diff --git a/arch/arm64/boot/dts/qcom/msm8916.dtsi 
b/arch/arm64/boot/dts/qcom/msm8916.dtsi
index 11bdc24..4bc047b 100644
--- a/arch/arm64/boot/dts/qcom/msm8916.dtsi
+++ b/arch/arm64/boot/dts/qcom/msm8916.dtsi
@@ -155,6 +155,49 @@
interrupts = ;
};
 
+   thermal-zones {
+   cpu-thermal0 {
+   polling-delay-passive = <250>;
+   polling-delay = <1000>;
+
+   thermal-sensors = <&tsens 4>;
+
+   trips {
+   cpu_alert0: trip0 {
+   temperature = <75000>;
+   hysteresis = <2000>;
+   type = "passive";
+   };
+   cpu_crit0: trip1 {
+   temperature = <110000>;
+   hysteresis = <2000>;
+   type = "critical";
+   };
+   };
+   };
+
+   cpu-thermal1 {
+   polling-delay-passive = <250>;
+   polling-delay = <1000>;
+
+   thermal-sensors = <&tsens 3>;
+
+   trips {
+   cpu_alert1: trip0 {
+   temperature = <75000>;
+   hysteresis = <2000>;
+   type = "passive";
+   };
+   cpu_crit1: trip1 {
+   temperature = <110000>;
+   hysteresis = <2000>;
+   type = "critical";
+   };
+   };
+   };
+
+   };
+
timer {
compatible = "arm,armv8-timer";
interrupts = ,
@@ -609,6 +652,27 @@
clocks = < GCC_PRNG_AHB_CLK>;
clock-names = "core";
};
+
+   qfprom: qfprom@5c000 {
+   compatible = "qcom,qfprom";
+   reg = <0x5c000 0x1000>;
+   #address-cells = <1>;
+   #size-cells = <1>;
+   tsens_caldata: caldata@d0 {
+   reg = <0xd0 0x8>;
+   };
+   tsens_calsel: calsel@ec {
+   reg = <0xec 0x4>;
+   };
+   };
+
+   tsens: thermal-sensor@4a8000 {
+   compatible = "qcom,msm8916-tsens";
+   reg = <0x4a8000 0x2000>;
+   nvmem-cells = <&tsens_caldata>, <&tsens_calsel>;
+   nvmem-cell-names = "calib", "calib_sel";
+   #thermal-sensor-cells = <1>;
+   };
};
 
smd {
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation
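
As a reading aid for the trip points in these patches, the following sketch shows how a temperature/hysteresis pair is commonly interpreted: the trip becomes active at or above `temperature` and is released only once the reading falls below `temperature - hysteresis`. It is a simplified model and not the kernel thermal core; the `Trip` class and the sample readings are hypothetical, while the 75000 and 2000 values (millidegrees Celsius) are taken from the passive trips in the patches.

```python
from dataclasses import dataclass


@dataclass
class Trip:
    temperature: int      # millidegrees Celsius, as in the DT property
    hysteresis: int       # millidegrees Celsius
    active: bool = False  # whether the trip is currently considered crossed

    def update(self, temp: int) -> bool:
        """Update the trip state for reading `temp` and return it."""
        if temp >= self.temperature:
            self.active = True
        elif temp < self.temperature - self.hysteresis:
            self.active = False
        # Between the two thresholds the previous state is kept.
        return self.active


passive = Trip(temperature=75000, hysteresis=2000)
readings = [70000, 75000, 74000, 73000, 72999, 74500]
states = [passive.update(t) for t in readings]
print(states)
```

The hysteresis band keeps the state from flapping when the reading hovers around the trip temperature: once released at 72999, the trip stays inactive at 74500 until 75000 is reached again.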

[PATCH v3 3/5] arm: dts: apq8084: Add thermal zones, tsens and qfprom nodes

2016-08-16 Thread Rajendra Nayak

Add thermal zones, tsens and qfprom nodes

Acked-by: Eduardo Valentin 
Acked-by: Rob Herring 
Signed-off-by: Rajendra Nayak 
---
 arch/arm/boot/dts/qcom-apq8084.dtsi | 103 
 1 file changed, 103 insertions(+)

diff --git a/arch/arm/boot/dts/qcom-apq8084.dtsi 
b/arch/arm/boot/dts/qcom-apq8084.dtsi
index 7c2df06..39eb7a4 100644
--- a/arch/arm/boot/dts/qcom-apq8084.dtsi
+++ b/arch/arm/boot/dts/qcom-apq8084.dtsi
@@ -94,6 +94,88 @@
};
};
 
+   thermal-zones {
+   cpu-thermal0 {
+   polling-delay-passive = <250>;
+   polling-delay = <1000>;
+
+   thermal-sensors = <&tsens 5>;
+
+   trips {
+   cpu_alert0: trip0 {
+   temperature = <75000>;
+   hysteresis = <2000>;
+   type = "passive";
+   };
+   cpu_crit0: trip1 {
+   temperature = <110000>;
+   hysteresis = <2000>;
+   type = "critical";
+   };
+   };
+   };
+
+   cpu-thermal1 {
+   polling-delay-passive = <250>;
+   polling-delay = <1000>;
+
+   thermal-sensors = <&tsens 6>;
+
+   trips {
+   cpu_alert1: trip0 {
+   temperature = <75000>;
+   hysteresis = <2000>;
+   type = "passive";
+   };
+   cpu_crit1: trip1 {
+   temperature = <110000>;
+   hysteresis = <2000>;
+   type = "critical";
+   };
+   };
+   };
+
+   cpu-thermal2 {
+   polling-delay-passive = <250>;
+   polling-delay = <1000>;
+
+   thermal-sensors = <&tsens 7>;
+
+   trips {
+   cpu_alert2: trip0 {
+   temperature = <75000>;
+   hysteresis = <2000>;
+   type = "passive";
+   };
+   cpu_crit2: trip1 {
+   temperature = <110000>;
+   hysteresis = <2000>;
+   type = "critical";
+   };
+   };
+   };
+
+   cpu-thermal3 {
+   polling-delay-passive = <250>;
+   polling-delay = <1000>;
+
+   thermal-sensors = <&tsens 8>;
+
+   trips {
+   cpu_alert3: trip0 {
+   temperature = <75000>;
+   hysteresis = <2000>;
+   type = "passive";
+   };
+   cpu_crit3: trip1 {
+   temperature = <110000>;
+   hysteresis = <2000>;
+   type = "critical";
+   };
+   };
+   };
+   };
+
cpu-pmu {
compatible = "qcom,krait-pmu";
interrupts = <1 7 0xf04>;
@@ -150,6 +232,27 @@
reg = <0xf9011000 0x1000>;
};
 
+   qfprom: qfprom@fc4bc000 {
+   #address-cells = <1>;
+   #size-cells = <1>;
+   compatible = "qcom,qfprom";
+   reg = <0xfc4bc000 0x1000>;
+   tsens_calib: calib@d0 {
+   reg = <0xd0 0x18>;
+   };
+   tsens_backup: backup@440 {
+   reg = <0x440 0x10>;
+   };
+   };
+
+   tsens: thermal-sensor@fc4a8000 {
+   compatible = "qcom,msm8974-tsens";
+   reg = <0xfc4a8000 0x2000>;
+   nvmem-cells = <&tsens_calib>, <&tsens_backup>;
+   nvmem-cell-names = "calib", "calib_backup";
+   #thermal-sensor-cells = <1>;
+   };
+
timer@f902 {
#address-cells = <1>;

[PATCH v3 5/5] arm64: dts: msm8996: Add thermal zones, tsens and qfprom nodes

2016-08-16 Thread Rajendra Nayak

Add thermal zones and tsens node

Acked-by: Eduardo Valentin 
Acked-by: Rob Herring 
Signed-off-by: Rajendra Nayak 
---
 arch/arm64/boot/dts/qcom/msm8996.dtsi | 92 +++
 1 file changed, 92 insertions(+)

diff --git a/arch/arm64/boot/dts/qcom/msm8996.dtsi 
b/arch/arm64/boot/dts/qcom/msm8996.dtsi
index 55ec3e8..f52cba3 100644
--- a/arch/arm64/boot/dts/qcom/msm8996.dtsi
+++ b/arch/arm64/boot/dts/qcom/msm8996.dtsi
@@ -97,6 +97,92 @@
};
};
 
+   thermal-zones {
+   cpu-thermal0 {
+   polling-delay-passive = <250>;
+   polling-delay = <1000>;
+
+   thermal-sensors = <&tsens0 3>;
+
+   trips {
+   cpu_alert0: trip0 {
+   temperature = <75000>;
+   hysteresis = <2000>;
+   type = "passive";
+   };
+
+   cpu_crit0: trip1 {
+   temperature = <110000>;
+   hysteresis = <2000>;
+   type = "critical";
+   };
+   };
+   };
+
+   cpu-thermal1 {
+   polling-delay-passive = <250>;
+   polling-delay = <1000>;
+
+   thermal-sensors = <&tsens0 5>;
+
+   trips {
+   cpu_alert1: trip0 {
+   temperature = <75000>;
+   hysteresis = <2000>;
+   type = "passive";
+   };
+
+   cpu_crit1: trip1 {
+   temperature = <110000>;
+   hysteresis = <2000>;
+   type = "critical";
+   };
+   };
+   };
+
+   cpu-thermal2 {
+   polling-delay-passive = <250>;
+   polling-delay = <1000>;
+
+   thermal-sensors = <&tsens0 8>;
+
+   trips {
+   cpu_alert2: trip0 {
+   temperature = <75000>;
+   hysteresis = <2000>;
+   type = "passive";
+   };
+
+   cpu_crit2: trip1 {
+   temperature = <110000>;
+   hysteresis = <2000>;
+   type = "critical";
+   };
+   };
+   };
+
+   cpu-thermal3 {
+   polling-delay-passive = <250>;
+   polling-delay = <1000>;
+
+   thermal-sensors = <&tsens0 10>;
+
+   trips {
+   cpu_alert3: trip0 {
+   temperature = <75000>;
+   hysteresis = <2000>;
+   type = "passive";
+   };
+
+   cpu_crit3: trip1 {
+   temperature = <110000>;
+   hysteresis = <2000>;
+   type = "critical";
+   };
+   };
+   };
+   };
+
timer {
compatible = "arm,armv8-timer";
interrupts = ,
@@ -181,6 +267,12 @@
status = "disabled";
};
 
+   tsens0: thermal-sensor@4a8000 {
+   compatible = "qcom,msm8996-tsens";
+   reg = <0x4a8000 0x2000>;
+   #thermal-sensor-cells = <1>;
+   };
+
blsp2_uart1: serial@75b {
compatible = "qcom,msm-uartdm-v1.4", "qcom,msm-uartdm";
reg = <0x75b 0x1000>;
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation

Re: [PATCH v3 0/7] x86: Rewrite switch_to()

2016-08-16 Thread Herbert Xu

Andy Lutomirski  wrote:
>
> I will be quite surprised if you can measure any effect at all.  I've
> never seen context switches take fewer than ~2k cycles, and on my
> laptop, they take 8k-9k cycles.  The scheduler is really, really slow.

Indeed, I think I still have a bugzilla entry somewhere regarding
the huge performance regression when we first switched over to the
current scheduler.

Over the years people have been trying to hack around it, e.g.,
network driver polling, low latency socket, anything to avoid
going into the scheduler.  We really should fix it properly some
day.

Cheers,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

Re: [PATCH v5] spi: spi-ti-qspi: Add DMA support for QSPI mmap read

2016-08-16 Thread Vignesh R



On Wednesday 17 August 2016 12:12 AM, Mark Brown wrote:
> On Tue, Aug 09, 2016 at 06:20:01PM +0530, Vignesh R wrote:
>>
>>
> 
>> According to this thread[1], converting virtual address
>> pointer into scatterlist which is then DMA mapped is unsafe on systems
>> with certain cache architecture. Hence, I added code to handle kmap
>> buffers inside the driver rather than updating generic spi-core code.
> 
> That's saying that things that aren't covered aren't DMAable safely at
> all which is a more general issue...
> 
>> If it's okay to update spi_map_buf() to handle kmap buffers as above
>> then I can submit the patch accordingly.
> 
> Yes, the whole point is that this is all in generic code so drivers
> don't need to worry about it.

Ok, I will send another version updating spi_map_buf() to handle kmap
buffers. Thanks!

-- 
Regards
Vignesh

RE: [PATCH v2] Added perf functionality to mmdc driver

2016-08-16 Thread Zhengyu Shen

> > > > +   hrtimer_start(_mmdc->hrtimer, mmdc_timer_period(),
> > > > +   HRTIMER_MODE_REL_PINNED);
> > >
> > > Why is a hrtimer necessary? Is this just copy-pasted from CCN, or do
> > > you have similar HW issues?
> > >
> > > Is there no overflow interrupt?
> >
> > When overflow occurs, a register bit is set to one. There is no
> > overflow interrupt which is why the timer is needed.
> 
> I see. Please add a comment in the driver explaining this, so that this is
> obvious.
> 
> Does the counter itself wrap and continue counting, or does it saturate?
> 
> How have you tuned your polling period so as to avoid missing events in the
> case of an overflow?
> 
> Thanks,
> Mark.
The counter wraps around once every ten seconds for total-cycles (which is the
fastest-increasing counter). Polling is done once every second, to be safe.
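
To illustrate why a one-second poll is safe against a ten-second wrap period: as long as the counter wraps at most once between two reads, unsigned subtraction recovers the true increment. This is only a minimal sketch, assuming a 32-bit free-running counter; the names counter_delta, counter_state and counter_poll are hypothetical and are not taken from the mmdc driver.

```c
#include <stdint.h>

/*
 * Hypothetical helper: increment between two reads of a free-running
 * 32-bit counter. Unsigned arithmetic is modulo 2^32, so the result is
 * correct provided the counter wrapped at most once between the reads.
 */
static uint32_t counter_delta(uint32_t prev, uint32_t now)
{
	return now - prev;
}

/* Hypothetical accumulator: folds each polled raw value into a 64-bit total. */
struct counter_state {
	uint32_t prev;
	uint64_t total;
};

static void counter_poll(struct counter_state *s, uint32_t raw)
{
	s->total += counter_delta(s->prev, raw);
	s->prev = raw;
}
```

With a poll interval well below the wrap period, each call to counter_poll() sees at most one wrap, so the 64-bit total keeps counting monotonically even though the raw register does not.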

Re: [PATCH v2] drm/bridge: analogix_dp: Ensure the panel is properly prepared/unprepared

2016-08-16 Thread Archit Taneja




On 08/09/2016 08:05 AM, Yakir Yang wrote:

+ Archit


On 08/09/2016 02:53 AM, Sean Paul wrote:

Instead of just preparing the panel on bind, actually prepare/unprepare
during modeset/disable. The panel must be prepared in order to read hpd
status and edid, so we need to keep state around the prepares in order
to ensure we don't accidentally turn the panel off at the wrong time.

Signed-off-by: Sean Paul 


Reviewed-by: Yakir Yang 

And I also tested this patch on RK3399 Kevin board, panel works rightly,
so:
Tested-by: Yakir Yang 

Also add Archit into CC list, guess this patch should go through his
drm_bridge's tree.


Reviewed-by: Archit Taneja 



Thanks,
- Yakir


---

Changes in v2:
  - Added panel_is_modeset state/lock to avoid racing detect with
modeset (marcheu)
  - Added prepare/unprepare in .get_modes (yakir)

  drivers/gpu/drm/bridge/analogix/analogix_dp_core.c | 101
++---
  drivers/gpu/drm/bridge/analogix/analogix_dp_core.h |   3 +
  2 files changed, 93 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c
b/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c
index 32715da..47c449a 100644
--- a/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c
+++ b/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c
@@ -923,11 +923,63 @@ static void analogix_dp_commit(struct
analogix_dp_device *dp)
  analogix_dp_start_video(dp);
  }
+/*
+ * This function is a bit of a catch-all for panel preparation,
hopefully
+ * simplifying the logic of functions that need to prepare/unprepare
the panel
+ * below.
+ *
+ * If @prepare is true, this function will prepare the panel.
Conversely, if it
+ * is false, the panel will be unprepared.
+ *
+ * If @is_modeset_prepare is true, the function will disregard the
current state
+ * of the panel and either prepare/unprepare the panel based on
@prepare. Once
+ * it finishes, it will update dp->panel_is_modeset to reflect the
current state
+ * of the panel.
+ */
+static int analogix_dp_prepare_panel(struct analogix_dp_device *dp,
+ bool prepare, bool is_modeset_prepare)
+{
+int ret = 0;
+
+if (!dp->plat_data->panel)
+return 0;
+
+mutex_lock(&dp->panel_lock);
+
+/*
+ * Exit early if this is a temporary prepare/unprepare and we're
already
+ * modeset (since we neither want to prepare twice or unprepare
early).
+ */
+if (dp->panel_is_modeset && !is_modeset_prepare)
+goto out;
+
+if (prepare)
+ret = drm_panel_prepare(dp->plat_data->panel);
+else
+ret = drm_panel_unprepare(dp->plat_data->panel);
+
+if (ret)
+goto out;
+
+if (is_modeset_prepare)
+dp->panel_is_modeset = prepare;
+
+out:
+mutex_unlock(&dp->panel_lock);
+return ret;
+}
+
  int analogix_dp_get_modes(struct drm_connector *connector)
  {
  struct analogix_dp_device *dp = to_dp(connector);
  struct edid *edid = (struct edid *)dp->edid;
-int num_modes = 0;
+int ret, num_modes = 0;
+
+ret = analogix_dp_prepare_panel(dp, true, false);
+if (ret) {
+DRM_ERROR("Failed to prepare panel (%d)\n", ret);
+return 0;
+}
  if (analogix_dp_handle_edid(dp) == 0) {
  drm_mode_connector_update_edid_property(&dp->connector, edid);
@@ -940,6 +992,10 @@ int analogix_dp_get_modes(struct drm_connector
*connector)
  if (dp->plat_data->get_modes)
  num_modes += dp->plat_data->get_modes(dp->plat_data,
connector);
+ret = analogix_dp_prepare_panel(dp, false, false);
+if (ret)
+DRM_ERROR("Failed to unprepare panel (%d)\n", ret);
+
  return num_modes;
  }
@@ -960,11 +1016,23 @@ enum drm_connector_status
  analogix_dp_detect(struct drm_connector *connector, bool force)
  {
  struct analogix_dp_device *dp = to_dp(connector);
+enum drm_connector_status status = connector_status_disconnected;
+int ret;
-if (analogix_dp_detect_hpd(dp))
+ret = analogix_dp_prepare_panel(dp, true, false);
+if (ret) {
+DRM_ERROR("Failed to prepare panel (%d)\n", ret);
  return connector_status_disconnected;
+}
+
+if (!analogix_dp_detect_hpd(dp))
+status = connector_status_connected;
-return connector_status_connected;
+ret = analogix_dp_prepare_panel(dp, false, false);
+if (ret)
+DRM_ERROR("Failed to unprepare panel (%d)\n", ret);
+
+return status;
  }
  static void analogix_dp_connector_destroy(struct drm_connector
*connector)
@@ -1035,6 +1103,16 @@ static int analogix_dp_bridge_attach(struct
drm_bridge *bridge)
  return 0;
  }
+static void analogix_dp_bridge_pre_enable(struct drm_bridge *bridge)
+{
+struct analogix_dp_device *dp = bridge->driver_private;
+int ret;
+
+ret = analogix_dp_prepare_panel(dp, true, true);
+if (ret)
+DRM_ERROR("failed to setup the panel ret = %d\n", ret);
+}
+
  static void

Re: [RFC 00/11] THP swap: Delay splitting THP during swapping out

2016-08-16 Thread Minchan Kim

On Tue, Aug 16, 2016 at 07:06:00PM -0700, Huang, Ying wrote:
> Hi, Kim,
> 
> Minchan Kim  writes:
> 
> > Hello Huang,
> >
> > On Tue, Aug 09, 2016 at 09:37:42AM -0700, Huang, Ying wrote:
> >> From: Huang Ying 
> >> 
> >> This patchset is based on 8/4 head of mmotm/master.
> >> 
> >> This is the first step for Transparent Huge Page (THP) swap support.
> >> The plan is to delaying splitting THP step by step and avoid splitting
> >> THP finally during THP swapping out and swapping in.
> >
> > What does it mean "delay splitting THP on swapping-in"?
> 
> Sorry for my poor English.  We will only delay splitting the THP during
> swapping out.  The final target is to avoid splitting the THP during
> swapping out, and swap out/in the THP directly.  Thanks for pointing out
> that.  I will revise the patch description in the next version.

Thanks.

> 
> >> 
> >> The advantages of THP swap support are:
> >> 
> >> - Batch swap operations for THP to reduce lock acquiring/releasing,
> >>   including allocating/freeing swap space, adding/deleting to/from swap
> >>   cache, and writing/reading swap space, etc.
> >> 
> >> - THP swap space read/write will be 2M sequence IO.  It is particularly
> >>   helpful for swap read, which usually are 4k random IO.
> >> 
> >> - It will help memory fragmentation, especially when THP is heavily used
> >>   by the applications.  2M continuous pages will be free up after THP
> >>   swapping out.
> >
> > Could we take the benefit for normal pages as well as THP page?
> 
> This patchset benefits the THP swap only.  It has no effect for normal pages.
> 
> > I think Tim and me discussed about that a few weeks ago.
> 
> I work closely with Tim on swap optimization.  This patchset is the part
> of our swap optimization plan.
> 
> > Please search below topics.
> >
> > [1] mm: Batch page reclamation under shink_page_list
> > [2] mm: Cleanup - Reorganize the shrink_page_list code into smaller 
> > functions
> >
> > It's different with yours which focused on THP swapping while the suggestion
> > would be more general if we can do so it's worth to try it, I think.
> 
> I think the general optimization above will benefit both normal pages
> and THP at least for now.  And I think there are no hard conflict
> between those two patchsets.

If we could do the general optimization, I guess THP swap without splitting
would be more straightforward.

If we can reclaim a batch of pages all at once, we can
do scan_swap_map(si, SWAP_HAS_CACHE, nr_pages). The nr_pages could be
greater or less than 512 pages. With that, scan_swap_map can effectively
search for empty swap slots from scan_map or the free cluster list.
Then the only part needed from your patchset is to delay splitting of the THP.

> 
> The THP swap has more opportunity to be optimized, because we can batch
> 512 operations together more easily.  For full THP swap support, unmap a
> THP could be more efficient with only one swap count operation instead
> of 512, so do many other operations, such as add/remove from swap cache
> with multi-order radix tree etc.  And it will help memory fragmentation.
> THP can be kept after swapping out/in, need not to rebuild THP via
> khugepaged.

It seems you increased the cluster size to 512 and search for an empty cluster
for a THP swap. With that approach, I have a concern that once clusters
are fragmented, THP swap support brings no benefit at all.

Why do we need an empty cluster for swapping out 512 pages?
IOW, the case below could work for the goal.

A : Allocated slot
F : Free slot

cluster A   cluster B
  -  

That's one of the reasons I suggested doing the batch reclaim work first and
supporting THP swap based on it. With that, scan_swap_map can be aware of nr_pages
and select the right clusters.

With that approach, justification of THP swap support would be easier, too.
IOW, I'm not sure how valuable THP-only swap support is in real workloads.

Anyways, that's just my two cents.

> 
> But not all pages are huge, so normal pages swap optimization is
> necessary and good anyway.
> 
> > Anyway, I hope [1/11] should be merged regardless of the patchset because
> > I believe anyone doesn't feel comfortable with cluser_info functions. ;-)
> 
> Thanks,
> 
> Best Regards,
> Huang, Ying
> 
> [snip]
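
The point about not needing a whole empty cluster can be sketched as a search for nr_pages contiguous free slots that ignores cluster boundaries. This is an illustration only: find_free_run and the one-byte-per-slot map are hypothetical, and this is not the kernel's scan_swap_map().

```c
#include <stddef.h>

/*
 * Hypothetical swap map: one byte per slot, 0 means free, nonzero means
 * allocated. Returns the index of the first run of nr_pages contiguous
 * free slots, or -1 if there is none. Cluster boundaries are deliberately
 * ignored, so a run may span the tail of one cluster and the head of the
 * next.
 */
static long find_free_run(const unsigned char *map, size_t nr_slots,
			  size_t nr_pages)
{
	size_t run = 0;

	if (nr_pages == 0 || nr_pages > nr_slots)
		return -1;

	for (size_t i = 0; i < nr_slots; i++) {
		/* Extend the current free run, or reset it on an allocated slot. */
		run = map[i] ? 0 : run + 1;
		if (run == nr_pages)
			return (long)(i + 1 - nr_pages);
	}
	return -1;
}
```

With a toy cluster size of 4 and a map of A A F F | F F A F, a request for 4 slots succeeds at index 2 even though neither cluster is empty.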

Re: [PATCH] staging: comedi: ni_mio_common: fix AO inttrig backwards compatibility

2016-08-16 Thread Spencer E Olson

Sorry for the very belated reply on this.  I'm assuming that this was
already accepted, but I've been working with this patch for a bit.  This
fixes the problems I raised in any case.

Reviewed-by: Spencer E Olson 

On Wed, 2016-07-20 at 17:07 +0100, Ian Abbott wrote:
> On 20/07/16 16:55, Hartley Sweeten wrote:
> > On Tuesday, July 19, 2016 4:18 AM, Ian Abbott wrote:
> >> Commit ebb657babfa9 ("staging: comedi: ni_mio_common: clarify the
> >> cmd->start_arg validation and use") introduced a backwards compatibility
> >> issue in the use of asynchronous commands on the AO subdevice when
> >> `start_src` is `TRIG_EXT`.  Valid values for `start_src` are `TRIG_INT`
> >> (for internal, software trigger), and `TRIG_EXT` (for external trigger).
> >> When set to `TRIG_EXT`.  In both cases, the driver relies on an
> >> internal, software trigger to set things up (allowing the user
> >> application to write sufficient samples to the data buffer before the
> >> trigger), so it acts as a software "pre-trigger" in the `TRIG_EXT` case.
> >> The software trigger is handled by `ni_ao_inttrig()`.
> >>
> >> Prior to the above change, when `start_src` was `TRIG_INT`, `start_arg`
> >> was required to be 0, and `ni_ao_inttrig()` checked that the software
> >> trigger number was also 0.  After the above change, when `start_src` was
> >> `TRIG_INT`, any value was allowed for `start_arg`, and `ni_ao_inttrig()`
> >> checked that the software trigger number matched this `start_arg` value.
> >> The backwards compatibility issue is that the internal trigger number
> >> now has to match `start_arg` when `start_src` is `TRIG_EXT` when it
> >> previously had to be 0.
> >>
> >> Fix the backwards compatibility issue in `ni_ao_inttrig()` by always
> >> allowing software trigger number 0 when `start_src` is something other
> >> than `TRIG_INT`.
> >>
> >> Thanks to Spencer Olson for reporting the issue.
> >>
> >> Signed-off-by: Ian Abbott 
> >> Reported-by: Spencer Olson 
> >> Fixes: ebb657babfa9 ("staging: comedi: ni_mio_common: clarify the
> >>   cmd->start_arg validation and use")
> >> ---
> >>  drivers/staging/comedi/drivers/ni_mio_common.c | 10 +-
> >>  1 file changed, 9 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/drivers/staging/comedi/drivers/ni_mio_common.c 
> >> b/drivers/staging/comedi/drivers/ni_mio_common.c
> >> index 8dabb19..9f4036f 100644
> >> --- a/drivers/staging/comedi/drivers/ni_mio_common.c
> >> +++ b/drivers/staging/comedi/drivers/ni_mio_common.c
> >> @@ -2772,7 +2772,15 @@ static int ni_ao_inttrig(struct comedi_device *dev,
> >>int i;
> >>static const int timeout = 1000;
> >>
> >> -  if (trig_num != cmd->start_arg)
> >> +  /*
> >> +   * Require trig_num == cmd->start_arg when cmd->start_src == TRIG_INT.
> >> +   * For backwards compatibility, also allow trig_num == 0 when
> >> +   * cmd->start_src != TRIG_INT (i.e. when cmd->start_src == TRIG_EXT);
> >> +   * in that case, the internal trigger is being used as a pre-trigger
> >> +   * before the external trigger.
> >> +   */
> >> +  if (!(trig_num == cmd->start_arg ||
> >> +(trig_num == 0 && cmd->start_src != TRIG_INT)))
> >>return -EINVAL;
> >
> > Ian,
> >
> > I think this test is a bit clearer:
> >
> > +   /*
> > +* Require trig_num == cmd->start_arg when cmd->start_src == TRIG_INT.
> > +* For backwards compatibility, any trig_num is valid when
> > +* cmd->start_src != TRIG_INT (i.e. when cmd->start_src == TRIG_EXT);
> > +* in that case, the internal trigger is being used as a pre-trigger
> > +* before the external trigger.
> > +*/
> > +   if (cmd->start_src == TRIG_INT && trig_num != cmd->start_arg)
> > return -EINVAL;
> 
> True, but I restricted it to only accept trig_num values that have been 
> valid in the past.
> 
> >
> > But, either way:
> >
> > Reviewed-by: H Hartley Sweeten 
> >
> > Thanks!
> >
> 
> Thanks for the review.
>
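
The two checks discussed above differ only when start_src is not TRIG_INT and trig_num is nonzero and does not match start_arg. A minimal sketch comparing them as standalone predicates follows; the TRIG_*_DEMO values and the function names are placeholders for illustration, not the comedi definitions.

```c
#include <stdbool.h>

/*
 * Placeholder values for illustration only; the real constants come from
 * the comedi headers.
 */
enum { TRIG_INT_DEMO = 1, TRIG_EXT_DEMO = 2 };

/*
 * Check as posted in the patch: accept a matching trig_num, or trig_num 0
 * when the start source is not the internal trigger.
 */
static bool inttrig_ok_patch(unsigned int trig_num, unsigned int start_src,
			     unsigned int start_arg)
{
	return trig_num == start_arg ||
	       (trig_num == 0 && start_src != TRIG_INT_DEMO);
}

/* Check as suggested in review: only validate trig_num for TRIG_INT. */
static bool inttrig_ok_review(unsigned int trig_num, unsigned int start_src,
			      unsigned int start_arg)
{
	return !(start_src == TRIG_INT_DEMO && trig_num != start_arg);
}
```

For an external start source with start_arg 5, both accept trig_num 0 and 5, but only the review variant accepts an arbitrary value such as 7, which is the behaviour the patch author chose to keep rejecting.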

Re: [LKP] [lkp] [sctp] a6c2f79287: netperf.Throughput_Mbps -37.2% regression

2016-08-16 Thread Aaron Lu

On 08/16/2016 05:56 PM, Xin Long wrote:

 I'm testing on Linus' master, can we all use that please?

>>>
>>> [git] git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
>>>
>>> [machine]
>>> Intel(R) Xeon(R) CPU E5-2690 v2 @ 3.00GHz
>>> mem 62G (66000220K)
>>>
>>> [system]
>>> # cat /etc/redhat-release
>>> Red Hat Enterprise Linux Server release 7.3 Beta (Maipo)
>>>
>>> [commit 3684b03]
>>> [root@hp-dl380pg8-11 lxin]# uname -r
>>> 4.8.0-rc2.3684b03
>>> [root@hp-dl380pg8-11 lxin]# cat test.sh
>>> killall -0 netserver || netserver -4 &
>>> netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1
>>
>> I just realized the test we are doing is not exactly the same.
>> As the original report says:
>> ip: ipv4
>> runtime: 300s
>> nr_threads: 200%
>> cluster: cs-localhost
>> send_size: 10K
>> test: SCTP_STREAM_MANY
>> cpufreq_governor: performance
>>
>> Note the nr_threads: 200%, which means starting twice as many netperf
>> processes as there are CPUs.
>>
>> In our IVB i3 (2 cores, 2 threads per core) case, 8 netperf processes
>> are started concurrently:
> OK, understand.
> 
>>
>> 2016-07-27 03:48:09 netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1 &
>> 2016-07-27 03:48:09 netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1 &
>> 2016-07-27 03:48:09 netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1 &
>> 2016-07-27 03:48:09 netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1 &
>> 2016-07-27 03:48:09 netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1 &
>> 2016-07-27 03:48:09 netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1 &
>> 2016-07-27 03:48:09 netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1 &
>> 2016-07-27 03:48:09 netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1 &
>>
>> The throughput is the average of those runs.
>>
>> And I think we should be testing on:
>> commit a6c2f79287 ("sctp: implement prsctp TTL policy") (the bisected one)
>> and
>> commit 826d253d57 ("sctp: add SCTP_PR_ASSOC_STATUS on sctp sockopt") (its immediate parent)
>> instead of Linus' master HEAD to avoid other factors.
>>
> OK, I will run the tests as you suggest now, but I need to rebuild again :D
> 
> can you disable pr_enable with "sysctl -w net.sctp.prsctp_enable=0",
> then try again?

For commit a6c2f79287 ("sctp: implement prsctp TTL policy"), no matter
the value of net.sctp.prsctp_enable, the throughput is almost the same:

net.sctp.prsctp_enable = 0
{
  "netperf.Throughput_Mbps": [
2353.311249997
  ]
}

net.sctp.prsctp_enable = 1
{
  "netperf.Throughput_Mbps": [
2371.586250003
  ]
}

For its immediate parent:
commit 826d253d57 ("sctp: add SCTP_PR_ASSOC_STATUS on sctp sockopt")
No matter the value of net.sctp.prsctp_enable, the throughput is again
almost the same:

net.sctp.prsctp_enable = 0
{
  "netperf.Throughput_Mbps": [
3838.83004
  ]
}

net.sctp.prsctp_enable = 1
{
  "netperf.Throughput_Mbps": [
3751.46005
  ]
}
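For reference, the relative change between the parent and the bisected commit can be worked out from the four averages above. This is only a minimal sketch of that arithmetic; the helper name `pct_change` is made up for illustration.

```c
/* Relative change, in percent, going from the parent commit's
 * throughput to the bisected commit's throughput. */
static double pct_change(double parent_mbps, double bisected_mbps)
{
	return (bisected_mbps - parent_mbps) / parent_mbps * 100.0;
}
```

Calling `pct_change(3838.83004, 2353.311249997)` gives about -38.7% for prsctp_enable = 0, and `pct_change(3751.46005, 2371.586250003)` gives about -36.8% for prsctp_enable = 1, so the drop is of the same order in both settings.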

Does this result give any hint?

Thanks,
Aaron

Re: [PATCH v2 0/6] kexec_file: Add buffer hand-over for the next kernel

2016-08-16 Thread Thiago Jung Bauermann

Hello Dave,

On Wednesday, 17 August 2016, 10:52:26, Dave Young wrote:
> On 08/13/16 at 12:18am, Thiago Jung Bauermann wrote:
> > This series applies on top of v5 of the "kexec_file_load implementation
> > for PowerPC" patch series (which applies on top of v4.8-rc1):
> > 
> > https://lists.infradead.org/pipermail/kexec/2016-August/016843.html
> 
> I'm trying to review your patches, but seems I can not apply them
> cleanly to mainline kernel or v4.8-rc1

Strange, I just did a test using the patches I received via the kexec 
mailing list, and git am applied them cleanly on v4.8-rc1.

> Apply the kexec_file_load series failed as below on v4.8-rc1:
> 
> Applying: kexec_file: Allow arch-specific memory walking for
> kexec_add_buffer
> error: patch failed: include/linux/kexec.h:149
> error: include/linux/kexec.h: patch does not apply
> Patch failed at 0001 kexec_file: Allow arch-specific memory walking for
> kexec_add_buffer
> The copy of the patch that failed is found in: .git/rebase-apply/patch
> When you have resolved this problem, run "git am --continue".
> If you prefer to skip this patch, run "git am --skip" instead.
> To restore the original branch and stop patching, run "git am --abort".
> 
> What is the order of your patch series of the three patchset?
> 
> [PATCH v2 0/2] extend kexec_file_load system call
> [PATCH v5 00/13] kexec_file_load implementation for PowerPC
> [PATCH v2 0/6] kexec_file: Add buffer hand-over for the next kernel

Yes, that is correct.

> Do they depend on other patches?

No, they apply directly on v4.8-rc1.

I just published a branch with the patches, if you want you can use that 
instead. The branch is called kexec-patches and is at the repo 
g...@github.com:bauermann/linux

-- 
[]'s
Thiago Jung Bauermann
IBM Linux Technology Center

perf: rdpmc and PERF_EVENT_IOC_RESET

2016-08-16 Thread Vince Weaver

Hello

so using rdpmc() and the mmap page to do fast perf_event reads seems to 
interact poorly with the PERF_EVENT_IOC_RESET ioctl.

From what I can tell, on reset event->count is set to zero, but
event->hw.prev_count is not, so the userpg->offset field ends up negative 
and weird things happen.
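The arithmetic described here can be modelled in a few lines of userspace C. This is a toy model under the assumption that the offset published in the mmap page is derived as count minus prev_count; the struct and the `toy_*` names are hypothetical and are not kernel code.

```c
#include <stdint.h>

/* Toy stand-in for the two fields mentioned in the report. */
struct toy_event {
	int64_t count;       /* models event->count */
	int64_t prev_count;  /* models event->hw.prev_count */
};

/* Assumed derivation of the value published as userpg->offset. */
static int64_t toy_userpg_offset(const struct toy_event *ev)
{
	return ev->count - ev->prev_count;
}

/* Reset as described in the report: only the count is cleared. */
static void toy_reset(struct toy_event *ev)
{
	ev->count = 0;
}
```

Starting from count = 1000 and prev_count = 900, the modelled offset is 100; after `toy_reset()` it becomes -900, which matches the negative offset described above.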

Should reset just not be called if you are using the rdpmc() interface?

Vince

Re: [PATCH v6 6/8] Documentation: bindings: add dt documentation for rk3399 dmc

2016-08-16 Thread Chanwoo Choi

Hi Lin,

On 2016-08-17 07:36, Lin Huang wrote:
> This patch adds the documentation for rockchip rk3399 dmc driver.
> 
> Signed-off-by: Lin Huang 
> ---
> Changes in v6:
> -Add more detail in Documentation
> 
> Changes in v5:
> -None
> 
> Changes in v4:
> -None
> 
> Changes in v3:
> -None
> 
> Changes in v2:
> -None 
> 
> Changes in v1:
> -None
>  .../devicetree/bindings/devfreq/rk3399_dmc.txt | 84 
> ++
>  1 file changed, 84 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/devfreq/rk3399_dmc.txt
> 
> diff --git a/Documentation/devicetree/bindings/devfreq/rk3399_dmc.txt 
> b/Documentation/devicetree/bindings/devfreq/rk3399_dmc.txt
> new file mode 100644
> index 000..e73067c
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/devfreq/rk3399_dmc.txt
> @@ -0,0 +1,84 @@
> +* Rockchip rk3399 DMC(Dynamic Memory Controller) device
> +
> +Required properties:
> +- compatible: Must be "rockchip,rk3399-dmc".
> +- devfreq-events: Node to get ddr loading, Refer to
> +   Documentation/devicetree/bindings/devfreq/rockchip-dif.txt
> +- interrupts: The interrupt number to the cpu. The interrupt specifier format
> +   depends on the interrupt controller. it should be dcf interrupts,
> +   when ddr dvfs finish, it will happen.

If possible, please keep the indentation consistent with the other properties.
s/it->It, dcf->DCF, ddr->DDR


> +- clocks: Phandles for clock specified in "clock-names" property
> +- clock-names : The name of clock used by the DFI, must be "pclk_ddr_mon";
> +- operating-points-v2: Refer to 
> Documentation/devicetree/bindings/power/opp.txt
> +for details.

ditto.

> +- center-supply: Dmc supply node.

s/Dmc/DMC because DMC is an abbreviation.

> +- status: Marks the node enabled/disabled.
> +
> +Optional properties:
> +- ddr_timing: ddr timing need to pass to arm trust firmware
> +- upthreshold: the upthreshold to simpleondeamnd policy
> +- downdifferential: The downdifferential to simpleondeamnd policy
> +
> +Example:
> + ddr_timing: ddr_timing {
> + compatible = "rockchip,ddr-timing";

I can't find a 'rockchip,ddr-timing' driver in the linux-next git repo (20160816).
If the ddr_timing node only contains DDR timing properties,
I recommend moving the DDR timing configuration into a separate .dtsi file.
I have added a reference and an example below.

> + ddr3_speed_bin = <21>;
> + pd_idle = <0>;
> + sr_idle = <0>;
> + sr_mc_gate_idle = <0>;
> + srpd_lite_idle  = <0>;
> + standby_idle = <0>;
> + dram_dll_dis_freq = <300>;
> + phy_dll_dis_freq = <125>;
> +
> + ddr3_odt_dis_freq = <333>;
> + ddr3_drv = ;
> + ddr3_odt = ;
> + phy_ddr3_ca_drv = ;
> + phy_ddr3_dq_drv = ;
> + phy_ddr3_odt = ;
> +
> + lpddr3_odt_dis_freq = <333>;
> + lpddr3_drv = ;
> + lpddr3_odt = ;
> + phy_lpddr3_ca_drv = ;
> + phy_lpddr3_dq_drv = ;
> + phy_lpddr3_odt = ;
> +
> + lpddr4_odt_dis_freq = <333>;
> + lpddr4_drv = ;
> + lpddr4_dq_odt = ;
> + lpddr4_ca_odt = ;
> + phy_lpddr4_ca_drv = ;
> + phy_lpddr4_ck_cs_drv = ;
> + phy_lpddr4_dq_drv = ;
> + phy_lpddr4_odt = ;
> + };
> +
> + dmc_opp_table: dmc_opp_table {
> + compatible = "operating-points-v2";
> +
> + opp00 {
> + opp-hz = /bits/ 64 <3>;
> + opp-microvolt = <90>;
> + };
> + opp01 {
> + opp-hz = /bits/ 64 <66600>;
> + opp-microvolt = <90>;
> + };
> + };
> +
> + dmc: dmc {
> + compatible = "rockchip,rk3399-dmc";
> + devfreq-events = <>;
> + interrupts = ;
> + clocks = < SCLK_DDRCLK>;
> + clock-names = "dmc_clk";
> + ddr_timing = <_timing>;

You can use the following '#include' instead of the 'ddr_timing' node,
because ddr_timing is not a device with its own driver. Rather,
it is the rk3399-dmc that needs the DDR timing configuration.

#include "rk3399-dmc-timing-conf.dtsi"

You can refer to a similar use case [1].
The *.conf.dtsi file is used by the exynos3250 tmu DT node [2].

[1] arch/arm/

RE: [PATCH v7 1/2] ACPI / button: Fix an issue that the platform triggered reliable events may not be delivered to the userspace

2016-08-16 Thread Zheng, Lv

Hi, Rafael

> From: linux-acpi-ow...@vger.kernel.org 
> [mailto:linux-acpi-ow...@vger.kernel.org] On Behalf Of Rafael J.
> Wysocki
> Subject: Re: [PATCH v7 1/2] ACPI / button: Fix an issue that the platform 
> triggered reliable events
> may not be delivered to the userspace
> 
> On Tuesday, July 26, 2016 05:52:24 PM Lv Zheng wrote:
> > On most platforms, _LID returning value, lid open/close events are all
> > reliable, but there are exceptions. Some AML tables report wrong initial
> > lid state (Link 1), and some of them never report lid open state (Link 2).
> > The usage model on such buggy platforms is:
> > 1. The initial lid state returned from _LID is not reliable;
> > 2. The lid open event is not reliable;
> > 3. The lid close event is always reliable, used by the platform firmware to
> >trigger OSPM power saving operations.
> > This usage model is not compliant to the Linux SW_LID model as the Linux
> > userspace is very strict to the reliability of the open events.
> >
> > In order not to trigger issues on such buggy platforms, the ACPI button
> > driver currently implements a lid_init_state=open quirk to send additional
> > "open" event after resuming. However, this is still not sufficient because:
> > 1. Some special usage models (e.g., the dark resume scenario) cannot be
> >supported by this mode.
> > 2. If a "close" event is not used to trigger "suspend", then the subsequent
> >"close" events cannot be seen by the userspace.
> > So we need to stop sending the additional "open" event and switch the
> > driver to lid_init_state=ignore mode and make sure the platform triggered
> > events can be reliably delivered to the userspace. The userspace programs
> > then can be changed to not to be strict to the "open" events on such buggy
> > platforms.
> >
> > Why will the subsequent "close" events be lost? This is because the input
> > layer automatically filters redundant events for switch events. Thus given
> > that the buggy AML tables do not guarantee paired "open"/"close" events,
> > the ACPI button driver currently is not able to guarantee that the platform
> > triggered reliable events can always be seen by the userspace via
> > SW_LID.
> >
> > This patch adds a mechanism to insert lid events as a compensation for the
> > platform triggered ones to form a complete event switches in order to make
> > sure that the platform triggered events can always be reliably delivered
> > to the userspace. This essentially guarantees that the platform triggered
> > reliable "close" events will always be reliably delivered to the userspace.
> >
> > However this mechanism is not suitable for lid_init_state=open/method as
> > it should not send the complement switch event for the unreliable initial
> > lid state notification. 2 unreliable events can trigger unexpected
> > behavior. Thus this patch only implements this mechanism for
> > lid_init_state=ignore.
> >
> > Link 1: https://bugzilla.kernel.org/show_bug.cgi?id=89211
> > https://bugzilla.kernel.org/show_bug.cgi?id=106151
> > Link 2: https://bugzilla.kernel.org/show_bug.cgi?id=106941
> > Signed-off-by: Lv Zheng 
> > Suggested-by: Dmitry Torokhov 
> > Cc: Benjamin Tissoires 
> > Cc: Bastien Nocera: 
> > Cc: linux-in...@vger.kernel.org
> > ---
> >  drivers/acpi/button.c |   51 
> > -
> >  1 file changed, 50 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/acpi/button.c b/drivers/acpi/button.c
> > index 148f4e5..dca1806 100644
> > --- a/drivers/acpi/button.c
> > +++ b/drivers/acpi/button.c
> > @@ -19,6 +19,8 @@
> >   * 
> > ~~
> >   */
> >
> > +#define pr_fmt(fmt) "ACPI : button: " fmt
> > +
> >  #include 
> >  #include 
> >  #include 
> > @@ -104,6 +106,8 @@ struct acpi_button {
> > struct input_dev *input;
> > char phys[32];  /* for input device */
> > unsigned long pushed;
> > +   int last_state;
> > +   unsigned long last_time;
> 
> Why don't you use ktime_t here?

OK.
I'll update the patch to use the ktime interfaces
and send it after testing.

Thanks,
Lv
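The filtering problem described in the quoted commit message, and the compensation it proposes, can be illustrated with a small userspace model. This is only a sketch: the struct and the `toy_*` names are hypothetical, and it assumes nothing more than the stated behaviour that a switch event which does not change the switch state is dropped.

```c
#include <stdbool.h>

/* Toy model of a switch as seen through the input layer. */
struct toy_switch {
	int lid_closed;  /* last state seen by userspace (1 = closed) */
	int delivered;   /* number of events that reached userspace */
};

/* A report that does not change the state is filtered as redundant. */
static bool toy_report(struct toy_switch *sw, int lid_closed)
{
	if (sw->lid_closed == lid_closed)
		return false;
	sw->lid_closed = lid_closed;
	sw->delivered++;
	return true;
}

/* Compensation: if the platform event would be filtered, insert the
 * complement event first so that the platform event is a real change. */
static bool toy_report_with_complement(struct toy_switch *sw, int lid_closed)
{
	if (sw->lid_closed == lid_closed)
		toy_report(sw, !lid_closed);
	return toy_report(sw, lid_closed);
}
```

With the modelled state already "closed", a plain `toy_report(&sw, 1)` is dropped, while `toy_report_with_complement(&sw, 1)` delivers an "open" followed by the "close".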

> 
> > bool suspended;
> >  };
> >
> > @@ -111,6 +115,10 @@ static BLOCKING_NOTIFIER_HEAD(acpi_lid_notifier);
> >  static struct acpi_device *lid_device;
> >  static u8 lid_init_state = ACPI_BUTTON_LID_INIT_METHOD;
> >
> > +static unsigned long lid_report_interval __read_mostly = 500;
> > +module_param(lid_report_interval, ulong, 0644);
> > +MODULE_PARM_DESC(lid_report_interval, "Interval (ms) between lid key 
> > events");
> > +
> >  /* 
> > --
> >FS Interface (/proc)
> > 
> > -- 
> > */
> > @@ -135,9 +143,48 @@ static int acpi_lid_notify_state(struct acpi_device 
> > *device, int state)
> > struct acpi_button *button = acpi_driver_data(device);
> >

RE: [PATCH v4 3/3] tools/power/acpi/acpidbg: Add multi-commands support in batch mode

2016-08-16 Thread Zheng, Lv

Hi, Rafael

> From: Rafael J. Wysocki [mailto:r...@rjwysocki.net]
> Subject: Re: [PATCH v4 3/3] tools/power/acpi/acpidbg: Add multi-commands 
> support in batch mode
> 
> On Tuesday, July 26, 2016 07:01:45 PM Lv Zheng wrote:
> > This patch adds multi-commands support for the batch mode. The same mode
> > can be seen in acpiexec.
> >
> > However people may think this is not useful for an in-kernel debugger,
> > because the in-kernel debugger is always running, never exits. So we can
> > run another command by running another acpidbg batch mode instance.
> >
> > But this mode should still be useful for acpidbg. The reason is: when the
> > in-kernel debugger has entered the single-stepping mode, ending acpidbg
> > (which closes the debugger IO interface) will lead to the end of the
> > single-stepping mode.
> >
> > So we need the acpidbg multi-commands batch mode in order to execute
> > multiple single-stepping mode commands in scripts.
> 
> An example would be really useful here IMO.

Consider the following control method:

Name (TVAL, Zero)
Method (TMTD)
{
Increment (TVAL)
}

When it is executed in acpiexec:
---
#!/bin/sh
acpiexec -b "ex \TMTD" -es dsdt.aml
acpiexec -b "ex \TVAL" -es dsdt.aml
---
The result is "0" for TVAL, because each acpiexec instance re-initializes the 
namespace.
Thus acpiexec provides a multi-command batch mode:
---
#!/bin/sh
acpiexec -b "ex \TMTD; ex \TVAL" -es dsdt.aml
---
The result is "1" now.

Then for this case (I'll use it to compare acpidbg behavior):
---
#!/bin/sh
acpiexec -b "ex \TMTD" -es dsdt.aml
acpiexec -b "ex \TMTD; ex \TVAL" -es dsdt.aml
---
The result is "1".

Unlike acpiexec, the in-kernel namespace is not re-initialized, regardless of
whether the AML debugger is initialized or terminated.

So for the above cases:
---
#!/bin/sh
acpidbg -b "ex \TMTD"
acpidbg -b "ex \TVAL"
---
The result is "1".
---
#!/bin/sh
acpidbg -b "ex \TMTD"
acpidbg -b "ex \TMTD; ex \TVAL"
---
The result is "2", which is no different from:
---
#!/bin/sh
acpidbg -b "ex \TMTD"
acpidbg -b "ex \TMTD"
acpidbg -b "ex \TVAL"
---
Thus I said:
People may think this (the multi-command support) is not useful for an 
in-kernel debugger.
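The difference between the two tools in the cases above comes down to the lifetime of the namespace, which can be modelled in plain C. This is a toy model only: the `toy_*` names are hypothetical, and each function call stands for one tool invocation that evaluates TMTD a given number of times and then reads TVAL.

```c
/* Toy namespace holding the TVAL object from the example. */
struct toy_ns {
	int tval;
};

/* Models Name (TVAL, Zero) at table load time. */
static void toy_ns_init(struct toy_ns *ns)
{
	ns->tval = 0;
}

/* Models Method (TMTD), which increments TVAL. */
static void toy_tmtd(struct toy_ns *ns)
{
	ns->tval++;
}

/* acpiexec-like invocation: a fresh namespace for every run. */
static int toy_acpiexec_run(int tmtd_calls)
{
	struct toy_ns ns;

	toy_ns_init(&ns);
	while (tmtd_calls-- > 0)
		toy_tmtd(&ns);
	return ns.tval;
}

/* acpidbg-like invocation: every run shares the long-lived namespace. */
static int toy_acpidbg_run(struct toy_ns *kernel_ns, int tmtd_calls)
{
	while (tmtd_calls-- > 0)
		toy_tmtd(kernel_ns);
	return kernel_ns->tval;
}
```

In this model, `toy_acpiexec_run(1)` followed by `toy_acpiexec_run(0)` reads back 0, whereas two `toy_acpidbg_run()` calls on the same `struct toy_ns` accumulate, mirroring the "0" versus "1" and "1" versus "2" results above.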


But there is a special case for the single stepping mode.
---
#!/bin/sh
acpidbg -b "debug \TMTD"
acpidbg -b "locals"
---
The "debug" command is special: it puts the AML debugger into single-stepping mode.
In this mode, the user can use single-stepping commands to debug the
evaluation of TMTD.
The result of this script is:
There is no method currently executing.

This is because, for the in-kernel AML debugger, leaving \TMTD unfinished would
mean that any mutex held in the method could block normal kernel evaluation of
other methods that need the same mutex.

Thus, if acpidbg exits (closing the acpi_dbg IO), single-stepping mode is also
ended.
See:
acpi_terminate_debugger() in acpi_aml_release().

So we need the multi-command batch mode for this case:
---
#!/bin/sh
acpidbg -b "debug \TMTD; locals"
---
The result of this script is:
No Local Variables are initialized for method (TMTD)

Best regards
Lv

Re: [PATCH] tpm: fix a race condition tpm2_unseal_trusted()

2016-08-16 Thread Jarkko Sakkinen

On Tue, Aug 16, 2016 at 10:38:22PM +0300, Jarkko Sakkinen wrote:
> The load and unseal operations should be performed as one atomic operation.
> This commit introduces an unlocked tpm_transmit() so that
> tpm2_unseal_trusted() can do the locking by itself.
> 
> v2: Introduced an unlocked unseal operation instead of changing the locking
> strategy, in order to make the bug fix less intrusive and thus more
> backportable.
> 
> CC: sta...@vger.kernel.org
> Fixes: 954650efb79f ("tpm: seal/unseal for TPM 2.0")
> Signed-off-by: Jarkko Sakkinen 
> ---
>  drivers/char/tpm/tpm-dev.c   |  2 +-
>  drivers/char/tpm/tpm-interface.c | 14 --
>  drivers/char/tpm/tpm.h   | 18 ++
>  drivers/char/tpm/tpm2-cmd.c  | 13 +
>  4 files changed, 32 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/char/tpm/tpm-dev.c b/drivers/char/tpm/tpm-dev.c
> index f5d4521..8df8846 100644
> --- a/drivers/char/tpm/tpm-dev.c
> +++ b/drivers/char/tpm/tpm-dev.c
> @@ -145,7 +145,7 @@ static ssize_t tpm_write(struct file *file, const char __user *buf,
>   return -EPIPE;
>   }
>   out_size = tpm_transmit(priv->chip, priv->data_buffer,
> - sizeof(priv->data_buffer));
> + sizeof(priv->data_buffer), TPM_TRANSMIT_LOCK);
>  
>   tpm_put_ops(priv->chip);
>   if (out_size < 0) {
> diff --git a/drivers/char/tpm/tpm-interface.c b/drivers/char/tpm/tpm-interface.c
> index 43ef0ef..627daa7 100644
> --- a/drivers/char/tpm/tpm-interface.c
> +++ b/drivers/char/tpm/tpm-interface.c
> @@ -331,7 +331,7 @@ EXPORT_SYMBOL_GPL(tpm_calc_ordinal_duration);
>   * Internal kernel interface to transmit TPM commands
>   */
>  ssize_t tpm_transmit(struct tpm_chip *chip, const char *buf,
> -  size_t bufsiz)
> +  size_t bufsiz, unsigned int flags)
>  {
>   ssize_t rc;
>   u32 count, ordinal;
> @@ -350,7 +350,8 @@ ssize_t tpm_transmit(struct tpm_chip *chip, const char *buf,
>   return -E2BIG;
>   }
>  
> - mutex_lock(&chip->tpm_mutex);
> + if (flags & TPM_TRANSMIT_LOCK)
> + mutex_lock(&chip->tpm_mutex);
>  
>   rc = chip->ops->send(chip, (u8 *) buf, count);
>   if (rc < 0) {
> @@ -393,20 +394,21 @@ out_recv:
>   dev_err(&chip->dev,
>   "tpm_transmit: tpm_recv: error %zd\n", rc);
>  out:
> - mutex_unlock(&chip->tpm_mutex);
> + if (flags & TPM_TRANSMIT_LOCK)
> + mutex_unlock(&chip->tpm_mutex);
>   return rc;
>  }
>  
>  #define TPM_DIGEST_SIZE 20
>  #define TPM_RET_CODE_IDX 6
>  
> -ssize_t tpm_transmit_cmd(struct tpm_chip *chip, void *cmd,
> -  int len, const char *desc)
> +ssize_t __tpm_transmit_cmd(struct tpm_chip *chip, void *cmd,
> +int len, const char *desc, unsigned int flags)
>  {
>   struct tpm_output_header *header;
>   int err;
>  
> - len = tpm_transmit(chip, (u8 *) cmd, len);
> + len = tpm_transmit(chip, cmd, len, flags);
>   if (len <  0)
>   return len;
>   else if (len < TPM_HEADER_SIZE)
> diff --git a/drivers/char/tpm/tpm.h b/drivers/char/tpm/tpm.h
> index 6e002c4..b9383fd 100644
> --- a/drivers/char/tpm/tpm.h
> +++ b/drivers/char/tpm/tpm.h
> @@ -476,12 +476,22 @@ extern dev_t tpm_devt;
>  extern const struct file_operations tpm_fops;
>  extern struct idr dev_nums_idr;
>  
> +enum tpm_transmit_flags {
> + TPM_TRANSMIT_LOCK,
> +};
> +
> +ssize_t tpm_transmit(struct tpm_chip *chip, const char *buf,
> +  size_t bufsiz, unsigned int flags);
> +ssize_t __tpm_transmit_cmd(struct tpm_chip *chip, void *cmd, int len,
> +const char *desc, unsigned int flags);
> +static inline ssize_t tpm_transmit_cmd(struct tpm_chip *chip, void *cmd,
> +int len, const char *desc)
> +{
> + return __tpm_transmit_cmd(chip, cmd, len, desc, TPM_TRANSMIT_LOCK);
> +}
> +
>  ssize_t tpm_getcap(struct tpm_chip *chip, __be32 subcap_id, cap_t *cap,
>  const char *desc);
> -ssize_t tpm_transmit(struct tpm_chip *chip, const char *buf,
> -  size_t bufsiz);

Also adding a __tpm_transmit() would keep the patch more localized and
therefore easier to backport. I think I'll add that.

/Jarkko
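
The wrapper idea can be sketched in user space roughly as follows. This is a minimal illustration under stated assumptions, not the driver code: every sketch_* name is hypothetical, a pthread mutex stands in for the chip mutex, and the flag is given an explicit non-zero bit value, since an enumerator left at the default value of 0 could never satisfy a "flags & FLAG" test.

```c
#include <pthread.h>

/* Hypothetical flag value: must be a non-zero bit for the "&" test to work. */
#define SKETCH_TRANSMIT_LOCK (1u << 0)

struct sketch_chip {
    pthread_mutex_t mutex;   /* stands in for the chip mutex */
    int commands_sent;       /* counts simulated command exchanges */
};

/* Flag-taking variant: takes the mutex only if the caller asks for it. */
static int sketch_transmit_flags(struct sketch_chip *chip, unsigned int flags)
{
    if (flags & SKETCH_TRANSMIT_LOCK)
        pthread_mutex_lock(&chip->mutex);

    chip->commands_sent++;   /* placeholder for the send/receive exchange */

    if (flags & SKETCH_TRANSMIT_LOCK)
        pthread_mutex_unlock(&chip->mutex);
    return 0;
}

/* Wrapper with the old signature, so existing callers stay untouched. */
static int sketch_transmit(struct sketch_chip *chip)
{
    return sketch_transmit_flags(chip, SKETCH_TRANSMIT_LOCK);
}

/* Two commands under one lock, so no other caller can interleave. */
static int sketch_load_and_unseal(struct sketch_chip *chip)
{
    int rc;

    pthread_mutex_lock(&chip->mutex);
    rc = sketch_transmit_flags(chip, 0);        /* "load", lock already held */
    if (rc == 0)
        rc = sketch_transmit_flags(chip, 0);    /* "unseal", same section */
    pthread_mutex_unlock(&chip->mutex);
    return rc;
}
```

The point of the wrapper is that only the one caller needing atomicity has to know about the flag-taking variant; everything else keeps calling the locked entry point.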

> -ssize_t tpm_transmit_cmd(struct tpm_chip *chip, void *cmd, int len,
> -  const char *desc);
>  int tpm_get_timeouts(struct tpm_chip *chip);
>  int tpm1_auto_startup(struct tpm_chip *chip);
>  int tpm_do_selftest(struct tpm_chip *chip);
> diff --git a/drivers/char/tpm/tpm2-cmd.c b/drivers/char/tpm/tpm2-cmd.c
> index 499f405..99007d8 100644
> --- a/drivers/char/tpm/tpm2-cmd.c
> +++ b/drivers/char/tpm/tpm2-cmd.c
> @@ -576,7 +576,7 @@ static int tpm2_load(struct tpm_chip *chip,
>   goto out;
>   }
>  
> - rc = tpm_transmit_cmd(chip, buf.data, PAGE_SIZE, "loading blob");
> + rc = __tpm_transmit_cmd(chip, buf.data, PAGE_SIZE, "loading

linux-next: Tree for Aug 17

2016-08-16 Thread Stephen Rothwell

Hi all,

Changes since 20160816:

The net-next tree gained a conflict against the net tree.

The kbuild tree gained build warnings for PowerPC, so I reverted a commit.

Non-merge commits (relative to Linus' tree): 2406
 2530 files changed, 99359 insertions(+), 40525 deletions(-)



I have created today's linux-next tree at
git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
(patches at http://www.kernel.org/pub/linux/kernel/next/ ).  If you
are tracking the linux-next tree using git, you should not use "git pull"
to do so as that will try to merge the new linux-next release with the
old one.  You should use "git fetch" and checkout or reset to the new
master.

You can see which trees have been included by looking in the Next/Trees
file in the source.  There are also quilt-import.log and merge.log
files in the Next directory.  Between each merge, the tree was built
with a ppc64_defconfig for powerpc and an allmodconfig (with
CONFIG_BUILD_DOCSRC=n) for x86_64, a multi_v7_defconfig for arm and a
native build of tools/perf. After the final fixups (if any), I do an
x86_64 modules_install followed by builds for x86_64 allnoconfig,
powerpc allnoconfig (32 and 64 bit), ppc44x_defconfig, allyesconfig
(this fails its final link) and pseries_le_defconfig and i386, sparc
and sparc64 defconfig.

Below is a summary of the state of the merge.

I am currently merging 241 trees (counting Linus' and 35 trees of patches
pending for Linus' tree).

Stats about the size of the tree over time can be seen at
http://neuling.org/linux-next-size.html .

Status of my local build tests will be at
http://kisskb.ellerman.id.au/linux-next .  If maintainers want to give
advice about cross compilers/configs that work, we are always open to add
more builds.

Thanks to Randy Dunlap for doing many randconfig builds.  And to Paul
Gortmaker for triage and bug fixes.

-- 
Cheers,
Stephen Rothwell

$ git checkout master
$ git reset --hard stable
Merging origin/master (3ec60b92d3ba Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost)
Merging fixes/master (d3396e1e4ec4 Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc)
Merging kbuild-current/rc-fixes (d3e2773c4ede builddeb: Skip gcc-plugins when not configured)
Merging arc-current/for-curr (b854f5e83bb2 ARC: Support syscall ABI v4)
Merging arm-current/fixes (87eed3c74d7c ARM: fix address limit restoration for undefined instructions)
Merging m68k-current/for-linus (6bd80f372371 m68k/defconfig: Update defconfigs for v4.7-rc2)
Merging metag-fixes/fixes (97b1d23f7bcb metag: Drop show_mem() from mem_init())
Merging powerpc-fixes/fixes (ca49e64f0cb1 selftests/powerpc: Specify we expect to build with std=gnu99)
Merging powerpc-merge-mpe/fixes (bc0195aad0da Linux 4.2-rc2)
Merging sparc/master (4620a06e4b3c shmem: Fix link error if huge pages support is disabled)
Merging net/master (a1560dd7a47f Merge branch 'mediatek-fixes')
Merging ipsec/master (1625f4529957 net/xfrm_input: fix possible NULL deref of tunnel.ip6->parms.i_key)
Merging netfilter/master (4b5b9ba553f9 openvswitch: do not ignore netdev errors when creating tunnel vports)
Merging ipvs/master (ea43f860d984 Merge branch 'ethoc-fixes')
Merging wireless-drivers/master (034fdd4a17ff Merge ath-current from ath.git)
Merging mac80211/master (4d0bd46a4d55 Revert "wext: Fix 32 bit iwpriv compatibility issue with 64 bit Kernel")
Merging sound-current/for-linus (a52ff34e5ec6 ALSA: hda - Manage power well properly for resume)
Merging pci-current/for-linus (a855d4d8e71f PCI: Call pci_intx() when using legacy interrupts in pci_alloc_irq_vectors())
Merging driver-core.current/driver-core-linus (694d0d0bb203 Linux 4.8-rc2)
Merging tty.current/tty-linus (29b4817d4018 Linux 4.8-rc1)
Merging usb.current/usb-linus (f1f6d9a8b540 xhci: don't dereference a xhci member after removing xhci)
Merging usb-gadget-fixes/fixes (a0ad85ae866f usb: dwc3: gadget: stop processing on HWO set)
Merging usb-serial-fixes/usb-linus (3b7c7e52efda USB: serial: mos7840: fix non-atomic allocation in write path)
Merging usb-chipidea-fixes/ci-for-usb-stable (ea1d39a31d3b usb: common: otg-fsm: add license to usb-otg-fsm)
Merging staging.current/staging-linus (99f1c013194e staging/lustre/llite: Close atomic_open race with several openers)
Merging char-misc.current/char-misc-linus (83cf8df2d4fa drivers/iio/light/Kconfig: SENSORS_BH1780 cleanup)
Merging input-current/for-linus (22fe874f3803 Input: silead - remove some dead code)
Merging crypto-current/master (e67479b13ede crypto: sha512-mb - fix ctx pointer)
Merging ide/master (797cee982eef Merge branch 'stable-4.8' of git://git.infradead.org/users/pcmoore/audit)
Merging rr-fixes/fixes (8244062ef1e5 modules: fix longstanding /proc/kallsyms vs module insertion race.)
Merging vfio-fixes/for-linus (c8952a707556 vfio/pci: Fix NULL point

[PATCH] ARM: qcom: Cleanup/Remove unnecessary board file

2016-08-16 Thread Andy Gross

This patch removes the unnecessary board file.  The generic machine
definition is sufficient for the Qualcomm platforms.

Signed-off-by: Andy Gross 
---
 arch/arm/mach-qcom/Makefile |  1 -
 arch/arm/mach-qcom/board.c  | 31 ---
 2 files changed, 32 deletions(-)
 delete mode 100644 arch/arm/mach-qcom/board.c

diff --git a/arch/arm/mach-qcom/Makefile b/arch/arm/mach-qcom/Makefile
index e324375..12878e9 100644
--- a/arch/arm/mach-qcom/Makefile
+++ b/arch/arm/mach-qcom/Makefile
@@ -1,2 +1 @@
-obj-y  := board.o
 obj-$(CONFIG_SMP)  += platsmp.o
diff --git a/arch/arm/mach-qcom/board.c b/arch/arm/mach-qcom/board.c
deleted file mode 100644
index d8060df..000
--- a/arch/arm/mach-qcom/board.c
+++ /dev/null
@@ -1,31 +0,0 @@
-/* Copyright (c) 2010-2014 The Linux Foundation. All rights reserved.
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 and
- * only version 2 as published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
- * GNU General Public License for more details.
- */
-
-#include 
-
-#include 
-
-static const char * const qcom_dt_match[] __initconst = {
-   "qcom,apq8064",
-   "qcom,apq8074-dragonboard",
-   "qcom,apq8084",
-   "qcom,ipq8062",
-   "qcom,ipq8064",
-   "qcom,msm8660-surf",
-   "qcom,msm8960-cdp",
-   "qcom,mdm9615",
-   NULL
-};
-
-DT_MACHINE_START(QCOM_DT, "Qualcomm (Flattened Device Tree)")
-   .dt_compat = qcom_dt_match,
-MACHINE_END
-- 
1.9.1

Re: [PATCH] be2iscsi: Use a more current logging style

2016-08-16 Thread Joe Perches

On Wed, 2016-08-17 at 09:20 +0530, Jitendra Bhivare wrote:
> > 
> > -Original Message-
> > From: Joe Perches [mailto:j...@perches.com]
> > Sent: Tuesday, August 16, 2016 3:57 PM
> > To: Jitendra Bhivare; Christophe JAILLET; Jayamohan Kallickal; Ketan
> Mukadam
> > 
> > Cc: Bart Van Assche; James E.J. Bottomley; Martin K. Petersen; linux-
> > s...@vger.kernel.org; linux-kernel@vger.kernel.org
> > Subject: Re: [PATCH] be2iscsi: Use a more current logging style
> > 
> > On Tue, 2016-08-16 at 11:32 +0530, Jitendra Bhivare wrote:
> > > 
> > > Thanks Joe for taking this up. It has been pending for long time from
> > > our side.
> > Thanks, not a problem, it took ~10 minutes.
> > 
> > There was a bit of an issue about your reply though.
> > 
> > First there was ~50 k of quoted stuff without any content
> > 
> > > 
> > > [ hundreds and hundreds of quoted lines ]
> > and then this happened:
> > 
> > > 
> > > > 
> > > > diff --git a/drivers/scsi/be2iscsi/be_main.h
> > > b/drivers/scsi/be2iscsi/be_main.h
> > > > 
> > > > 
> > > > index aa9c682..7cce6e3 100644
> > > > --- a/drivers/scsi/be2iscsi/be_main.h
> > > > +++ b/drivers/scsi/be2iscsi/be_main.h
> > > > @@ -1081,15 +1081,19 @@ struct hwi_context_memory {
> > > >  #define BEISCSI_LOG_CONFIG 0x0020  /* CONFIG Code Path */
> > > >  #define BEISCSI_LOG_ISCSI  0x0040  /* SCSI/iSCSI Protocol
> related
> > 
> > > 
> > > Logs */
> > > > 
> > > > 
> > > > 
> > > > -#define __beiscsi_log(phba, level, fmt, ...)
> \
> > 
> > > 
> > > > 
> > > > -   shost_printk(level, phba->shost, fmt, ##__VA_ARGS__)
> > > > -
> > > > -#define beiscsi_log(phba, level, mask, prefix, fmt, ...)   
> > > > \
> > > > +#define beiscsi_printk(level, phba, mask, fmt, ...)
> \
> > 
> > > 
> > > > 
> > > >  do {
> > \
> > > 
> > > > 
> > > > -   uint32_t log_value = phba->attr_log_enable; 
> > > > \
> > > > -   if (((mask) & log_value) || (level[1] <= '3'))  
> > > > \
> > > > -   __beiscsi_log(phba, level, prefix "_%d: " fmt,  
> > > > \
> > > > -     __LINE__, ##__VA_ARGS__); 
> > > > \
> > > > +   if ((mask) & (phba)->attr_log_enable)   
> > > > \
> > > > +   shost_printk(level, phba->shost,
> > > > \
> > > [JB] PCI dev_printk would be more useful with SCSI host_no included by
> > > default in the message.
> > This is a good note that seems simple enough, but I almost missed this.
> > 
> > Given the reply at the top and the _very_ long uncommented quoted block,
> I just
> > 
> > about assumed it was a useless block quote that you didn't bother to
> trim.
> > 
> > 
> > Please make it easier to find your replies and notes by deleting
> irrelevant quoted
> > 
> > stuff.
> > 
> > Also, I think I misread the code.
> > 
> > The original code is <= '3' i.e.: show all KERN_ERR.
> > That is not correct in the new code.
> > 
> > I don't know the code well and don't have a test bed with the hardware.
> > 
> > Is it possible for a beiscsi_ message to be called before
> phba->pcidev is
> > 
> > set to a valid value in beiscsi_hba_alloc?   It appears the code is
> careful to only
> > 
> > use dev_ logging calls before probe.
> [JB] KERN_ERR messages need to be logged irrespective of the masks.
> I understand, that in some places, mask is unnecessarily passed.
> I had made sure to call __beiscsi_log in some places.

I did as well.

> Can we please keep it that way? So beiscsi_err calls dev_err directly or
> is replaced with dev_err.

No worries, I'll respin the series after Christophe's
patches are applied.

> It's safe to assume pcidev will be valid for all beiscsi_log calls.
> Will test your change on my setup before ack'ing.

Don't bother until you get another patchset.

I suggest you fix your email client when sending
replies to me and to lkml.

What I received is very difficult to read due to
the odd line wrapping.
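
For reference, the level test from the original macro (level[1] <= '3') can be sketched in user space as below. This is only an illustration under stated assumptions: the SKETCH_* names and sketch_should_log() are hypothetical, the level strings mirror the kernel's KERN_* encoding of an SOH byte followed by a level digit, and the two mask values are copied from the BEISCSI_LOG_* defines quoted above.

```c
/* Same encoding as the kernel's KERN_* prefixes: SOH byte plus a level digit. */
#define SKETCH_ERR      "\001" "3"
#define SKETCH_WARNING  "\001" "4"
#define SKETCH_INFO     "\001" "6"

/* Mask bits, values taken from the quoted BEISCSI_LOG_* defines. */
#define SKETCH_LOG_CONFIG 0x0020u
#define SKETCH_LOG_ISCSI  0x0040u

/*
 * Non-zero when a message should be emitted: levels at KERN_ERR or more
 * severe are always logged, anything else only when its mask bit is enabled.
 */
static int sketch_should_log(const char *level, unsigned int mask,
                             unsigned int enabled)
{
    return (mask & enabled) != 0 || level[1] <= '3';
}
```

With no mask bits enabled, an error-level message still passes the check while warning and info messages do not, which is the behavior being asked for in the thread.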

RE: [PATCH] be2iscsi: Use a more current logging style

2016-08-16 Thread Jitendra Bhivare

> -Original Message-
> From: Joe Perches [mailto:j...@perches.com]
> Sent: Tuesday, August 16, 2016 3:57 PM
> To: Jitendra Bhivare; Christophe JAILLET; Jayamohan Kallickal; Ketan
Mukadam
> Cc: Bart Van Assche; James E.J. Bottomley; Martin K. Petersen; linux-
> s...@vger.kernel.org; linux-kernel@vger.kernel.org
> Subject: Re: [PATCH] be2iscsi: Use a more current logging style
>
> On Tue, 2016-08-16 at 11:32 +0530, Jitendra Bhivare wrote:
> > Thanks Joe for taking this up. It has been pending for long time from
> > our side.
>
> Thanks, not a problem, it took ~10 minutes.
>
> There was a bit of an issue about your reply though.
>
> First there was ~50 k of quoted stuff without any content
>
> > [ hundreds and hundreds of quoted lines ]
>
> and then this happened:
>
> > > diff --git a/drivers/scsi/be2iscsi/be_main.h
> > b/drivers/scsi/be2iscsi/be_main.h
> > >
> > > index aa9c682..7cce6e3 100644
> > > --- a/drivers/scsi/be2iscsi/be_main.h
> > > +++ b/drivers/scsi/be2iscsi/be_main.h
> > > @@ -1081,15 +1081,19 @@ struct hwi_context_memory {
> > >  #define BEISCSI_LOG_CONFIG   0x0020  /* CONFIG Code Path */
> > >  #define BEISCSI_LOG_ISCSI0x0040  /* SCSI/iSCSI Protocol
related
> > Logs */
> > >
> > >
> > > -#define __beiscsi_log(phba, level, fmt, ...)
\
> > > - shost_printk(level, phba->shost, fmt, ##__VA_ARGS__)
> > > -
> > > -#define beiscsi_log(phba, level, mask, prefix, fmt, ...) \
> > > +#define beiscsi_printk(level, phba, mask, fmt, ...)
\
> > >  do {
>   \
> > > - uint32_t log_value = phba->attr_log_enable; \
> > > - if (((mask) & log_value) || (level[1] <= '3'))  \
> > > - __beiscsi_log(phba, level, prefix "_%d: " fmt,  \
> > > -   __LINE__, ##__VA_ARGS__); \
> > > + if ((mask) & (phba)->attr_log_enable)   \
> > > + shost_printk(level, phba->shost,\
> > [JB] PCI dev_printk would be more useful with SCSI host_no included by
> > default in the message.
>
> This is a good note that seems simple enough, but I almost missed this.
>
> Given the reply at the top and the _very_ long uncommented quoted block,
I just
> about assumed it was a useless block quote that you didn't bother to
trim.
>
> Please make it easier to find your replies and notes by deleting
irrelevant quoted
> stuff.
>
> Also, I think I misread the code.
>
> The original code is <= '3' i.e.: show all KERN_ERR.
> That is not correct in the new code.
>
> I don't know the code well and don't have a test bed with the hardware.
>
> Is it possible for a beiscsi_ message to be called before
phba->pcidev is
> set to a valid value in beiscsi_hba_alloc?   It appears the code is
careful to only
> use dev_ logging calls before probe.
[JB] KERN_ERR messages need to be logged irrespective of the masks.
I understand, that in some places, mask is unnecessarily passed.
I had made sure to call __beiscsi_log in some places.
Can we please keep it that way? So beiscsi_err calls dev_err directly or
is replaced with dev_err.

It's safe to assume pcidev will be valid for all beiscsi_log calls.
Will test your change on my setup before ack'ing.

Actually, we too wanted to get rid of BC_/BM_... line# way and replace
with
ABCD = error identifier.
A 
B 
CD 

But that will be substantial change with some testing requirements. For
now, this looks good.
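
[Editor's sketch] The gating rule under discussion (emit when the category
bit is enabled, but always emit KERN_ERR and more severe levels) can be
illustrated in userspace roughly as follows. This is a sketch under stated
assumptions, not the driver's code: struct demo_hba, demo_should_log(),
demo_log(), the LVL_* strings and LOG_INIT are hypothetical stand-ins, and
fprintf() stands in for shost_printk()/dev_printk().

```c
#include <assert.h>
#include <stdarg.h>
#include <stdint.h>
#include <stdio.h>

/* Userspace stand-ins for the kernel's KERN_* prefixes (SOH + level digit). */
#define LVL_ERR     "\001" "3"
#define LVL_WARNING "\001" "4"
#define LVL_INFO    "\001" "6"

/* Category bits; LOG_CONFIG mirrors the quoted 0x0020, LOG_INIT is made up. */
#define LOG_INIT   0x0001u
#define LOG_CONFIG 0x0020u

/* Hypothetical stand-in for the driver's per-host structure. */
struct demo_hba {
	uint32_t attr_log_enable;	/* bitmask of enabled categories */
	unsigned int host_no;
};

/*
 * Returns 1 when a message should be emitted: either its category bit is
 * enabled, or its level is ERR or more severe (digit <= '3').
 */
static int demo_should_log(const struct demo_hba *hba, const char *level,
			   uint32_t mask)
{
	assert(hba != NULL && level != NULL && level[0] == '\001');
	return (mask & hba->attr_log_enable) != 0 || level[1] <= '3';
}

/* Emits the message with a host prefix; returns 1 if it was printed. */
static int demo_log(const struct demo_hba *hba, const char *level,
		    uint32_t mask, const char *fmt, ...)
{
	va_list ap;

	if (!demo_should_log(hba, level, mask))
		return 0;

	fprintf(stderr, "host%u: ", hba->host_no);
	va_start(ap, fmt);
	vfprintf(stderr, fmt, ap);
	va_end(ap);
	return 1;
}
```

With only LOG_CONFIG enabled, an LVL_INFO message tagged LOG_INIT is
dropped, while an LVL_ERR message with the same tag still gets through.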

Re: [PATCH 0/2] be2iscsi: Logging neatening

2016-08-16 Thread Joe Perches

On Wed, 2016-08-17 at 01:19 +, Bart Van Assche wrote:
> On 08/14/16 10:29, Joe Perches wrote:
> > On Sun, 2016-08-14 at 17:09 +, Bart Van Assche wrote:
> > > My primary concern is how to enable and disable log messages from user
> > > space.
[]
> > I think you are looking for a system wide equivalent
> > for the ethtool/netif_ mechanism.
> > 
> > Nothing like that exists currently.
> > 
> > Some code uses a bitmask/and, other code uses a
> > level/comparison.
[]
> As far as I can see all that the ethtool msglevel API implements is a 
> mechanism to query and set the log level from user space. What various 
> SCSI drivers implement is not a log level but a log mask mechanism. How 
> about the following approach to associate a name with each bit in a log 
> mask, to export these names to user space and to make it possible to 
> enable/disable messages per log category:
> * Introduce a variant of pr_debug() that allows to specify a textual
>    representation of the log category (a short string without spaces).
> * Make the log category names available in
>    /sys/kernel/debug/dynamic_debug/...
> * Today dynamic debug allows to enable/disable log messages by
>    specifying the source file name, function name, line number, module
>    name and/or format string. My proposal is to make it also possible to
>    enable/disable log messages based on the log category name.

Many of these logging mechanisms are not just debug
facilities.

Perhaps a dynamic_debug control would be inappropriate.

There have also been various custom scsi log level
facilities, like blogic_msg for the very old
BusLogic driver.

These functions also sometimes write into some
device-specific buffer.

Perhaps the largest problem, if this is to be scsi only
rather than system wide, is finding out what and how
the various bits in a mask should be used.
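
[Editor's sketch] The name-per-bit idea quoted above (associate a short
name with each bit in a log mask and enable or disable messages by
category name) might look roughly like this in userspace. Everything here
is an assumption for illustration: the table, the category names, and
demo_set_category() are hypothetical, and no such dynamic_debug interface
is implied to exist.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One entry per log category: a short name without spaces, and its bit. */
struct demo_log_category {
	const char *name;
	uint32_t bit;
};

/* Hypothetical table; names and values are illustrative only. */
static const struct demo_log_category demo_categories[] = {
	{ "init",   0x0001u },
	{ "config", 0x0020u },
	{ "iscsi",  0x0040u },
};

/*
 * Sets (enable != 0) or clears (enable == 0) the bit for the named
 * category in *mask. Returns 0 on success, -1 if the name is unknown.
 */
static int demo_set_category(uint32_t *mask, const char *name, int enable)
{
	size_t i;
	size_t n = sizeof(demo_categories) / sizeof(demo_categories[0]);

	assert(mask != NULL && name != NULL);

	for (i = 0; i < n; i++) {
		if (strcmp(demo_categories[i].name, name) != 0)
			continue;
		if (enable)
			*mask |= demo_categories[i].bit;
		else
			*mask &= ~demo_categories[i].bit;
		return 0;
	}
	return -1;
}
```

A user space write such as "config" or "iscsi" would then map to a bit
in the per-driver mask; the open question raised in the reply, which bits
mean what across drivers, is exactly what such a table would have to pin
down.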

Re: [PATCH v2 4/5] bug: Provide toggle for BUG on data corruption

2016-08-16 Thread Kees Cook

On Tue, Aug 16, 2016 at 5:26 PM, Joe Perches  wrote:
> On Tue, 2016-08-16 at 17:20 -0700, Kees Cook wrote:
>> The kernel checks for cases of data structure corruption under some
>> CONFIGs (e.g. CONFIG_DEBUG_LIST). When corruption is detected, some
>> systems may want to BUG() immediately instead of letting the system run
>> with known corruption.  Usually these kinds of manipulation primitives can
>> be used by security flaws to gain arbitrary memory write control. This
>> provides a new config CONFIG_BUG_ON_DATA_CORRUPTION and a corresponding
>> macro CHECK_DATA_CORRUPTION for handling these situations. Notably, even
>> if not BUGing, the kernel should not continue processing the corrupted
>> structure.
> []
>> diff --git a/include/linux/bug.h b/include/linux/bug.h
> []
>> @@ -118,4 +118,21 @@ static inline enum bug_trap_type report_bug(unsigned 
>> long bug_addr,
>>  }
>>
>>  #endif   /* CONFIG_GENERIC_BUG */
>> +
>> +/*
>> + * Since detected data corruption should stop operation on the affected
>> + * structures, this returns false if the corruption condition is found.
>> + */
>> +#define CHECK_DATA_CORRUPTION(condition, format...)   \
>
> My preference would be to use (condition, fmt, ...)
>
>> + do { \
>> + if (unlikely(condition)) {   \
>> + if (IS_ENABLED(CONFIG_BUG_ON_DATA_CORRUPTION)) { \
>> + printk(KERN_ERR format); \
>
> and
> pr_err(fmt, ##__VA_ARGS__);
>
> so that any use would also get any local pr_fmt applied as well.
>
>> + BUG();   \
>> + } else   \
>> + WARN(1, format); \
>> + return false;\
>> + }\
>> + } while (0)
>> +
>>  #endif   /* _LINUX_BUG_H */
>

Ah yes, excellent point. I'll convert this for my v3. Thanks!

-Kees

-- 
Kees Cook
Nexus Security
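
[Editor's sketch] The suggested (condition, fmt, ...) form, with the
message routed through a pr_fmt-style prefix, can be illustrated in
userspace as below. This is a simplified sketch, not the v3 patch: the
demo_ names are hypothetical, fprintf() stands in for pr_err(), abort()
for BUG(), and the DEMO_BUG_ON_DATA_CORRUPTION constant for
IS_ENABLED(CONFIG_BUG_ON_DATA_CORRUPTION). The WARN() branch is reduced
to a plain message. It relies on the GNU ", ##__VA_ARGS__" extension, as
the kernel does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Stand-in for the config toggle; 0 means "report and refuse", 1 aborts. */
#ifndef DEMO_BUG_ON_DATA_CORRUPTION
#define DEMO_BUG_ON_DATA_CORRUPTION 0
#endif

/* A local prefix, analogous to a file defining its own pr_fmt(). */
#define demo_pr_fmt(fmt) "demo: " fmt
#define demo_pr_err(fmt, ...) \
	fprintf(stderr, demo_pr_fmt(fmt), ##__VA_ARGS__)

/*
 * Reports the corruption and makes the enclosing function return false.
 * Only usable inside functions that return bool.
 */
#define DEMO_CHECK_DATA_CORRUPTION(condition, fmt, ...)		\
	do {								\
		if (condition) {					\
			demo_pr_err(fmt, ##__VA_ARGS__);		\
			if (DEMO_BUG_ON_DATA_CORRUPTION)		\
				abort();				\
			return false;					\
		}							\
	} while (0)

struct demo_node {
	struct demo_node *next, *prev;
};

/* Returns true only if inserting entry between prev and next looks sane. */
static bool demo_list_add_valid(struct demo_node *entry,
				struct demo_node *prev,
				struct demo_node *next)
{
	assert(entry != NULL && prev != NULL && next != NULL);

	DEMO_CHECK_DATA_CORRUPTION(next->prev != prev,
		"list_add corruption: next->prev should be %p, but was %p\n",
		(void *)prev, (void *)next->prev);
	DEMO_CHECK_DATA_CORRUPTION(prev->next != next,
		"list_add corruption: prev->next should be %p, but was %p\n",
		(void *)next, (void *)prev->next);
	DEMO_CHECK_DATA_CORRUPTION(entry == prev || entry == next,
		"list_add double add\n");
	return true;
}
```

Because the format string passes through demo_pr_fmt(), every caller
picks up the local prefix without repeating it, which is the benefit
pointed out in the review.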

Re: [PATCH v2 0/5] bug: Provide toggle for BUG on data corruption

2016-08-16 Thread Kees Cook

On Tue, Aug 16, 2016 at 5:55 PM, Henrique de Moraes Holschuh
 wrote:
> On Tue, 16 Aug 2016, Kees Cook wrote:
>> This adds a CONFIG to trigger BUG()s when the kernel encounters
>> unexpected data structure integrity as currently detected with
>> CONFIG_DEBUG_LIST.
>>
>> Specifically list operations have been a target for widening flaws to gain
>> "write anywhere" primitives for attackers, so this also consolidates the
>> debug checking to avoid code and check duplication (e.g. RCU list debug
>> was missing a check that got added to regular list debug). It also stops
>> manipulations when corruption is detected, since worsening the corruption
>> makes no sense. (Really, everyone should build with CONFIG_DEBUG_LIST
>> since the checks are so inexpensive.)
>
> Well, maybe it wants a name that it looks like something that should be
> enabled by default on production kernels?
>
> I.e. CONFIG_DETECT_LIST_CORRUPTION or somesuch?

Yeah, that may very well be true. I'd currently like to avoid CONFIG name
churn, but I've added it to my list of CONFIGs to rename (along with
CONFIG_DEBUG_RODATA). :)

-Kees

-- 
Kees Cook
Nexus Security

Re: [PATCH] dma-buf: fix kernel-doc warning and typos

2016-08-16 Thread Sumit Semwal

Hi Randy,

On 17 August 2016 at 05:01, Randy Dunlap  wrote:
> From: Randy Dunlap 
>
> Fix dma-buf kernel-doc warning and 2 minor typos in
> fence_array_create().
>
Thanks for your patch, I will queue it up!
> Fixes this warning:
> ..//drivers/dma-buf/fence-array.c:124: warning: No description found for 
> parameter 'signal_on_any'
>
> Signed-off-by: Randy Dunlap 
> Cc: Sumit Semwal 
> Cc: linux-me...@vger.kernel.org
> Cc: dri-de...@lists.freedesktop.org
> Cc: linaro-mm-...@lists.linaro.org
> ---
>  drivers/dma-buf/fence-array.c |6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> --- lnx-48-rc2.orig/drivers/dma-buf/fence-array.c
> +++ lnx-48-rc2/drivers/dma-buf/fence-array.c
> @@ -106,14 +106,14 @@ const struct fence_ops fence_array_ops =
>   * @fences:[in]array containing the fences
>   * @context:   [in]fence context to use
>   * @seqno: [in]sequence number to use
> - * @signal_on_any  [in]signal on any fence in the array
> + * @signal_on_any: [in]signal on any fence in the array
>   *
>   * Allocate a fence_array object and initialize the base fence with 
> fence_init().
>   * In case of error it returns NULL.
>   *
> - * The caller should allocte the fences array with num_fences size
> + * The caller should allocate the fences array with num_fences size
>   * and fill it with the fences it wants to add to the object. Ownership of 
> this
> - * array is take and fence_put() is used on each fence on release.
> + * array is taken and fence_put() is used on each fence on release.
>   *
>   * If @signal_on_any is true the fence array signals if any fence in the 
> array
>   * signals, otherwise it signals when all fences in the array signal.

Best,
Sumit.

Re: [PATCH v2] clk: max77686: Migrate to clk_hw based OF and registration APIs

2016-08-16 Thread Javier Martinez Canillas

Hello Stephen,

On 08/16/2016 06:38 PM, Stephen Boyd wrote:
> Now that we have clk_hw based provider APIs to register clks, we
> can get rid of struct clk pointers while registering clks in
> these drivers, allowing us to move closer to a clear split of
> consumer and provider clk APIs.
> 
> Cc: Javier Martinez Canillas 
> Cc: Laxman Dewangan 
> Cc: Krzysztof Kozlowski 
> Signed-off-by: Stephen Boyd 
> ---

The patch looks good to me.

Reviewed-by: Javier Martinez Canillas 

Also, I've tested this on an Exynos5800 Peach Pi Chromebook that has a
max77802 (supported by this driver) and the clocks are working correctly.

Tested-by: Javier Martinez Canillas 

Best regards,
-- 
Javier Martinez Canillas
Open Source Group
Samsung Research America

[PATCH v3 0/5] Fixing a set of bugs for ioapic hotplug

2016-08-16 Thread Rui Wang

On Wed, Aug 17, 2016 8:36 AM Rafael J. Wysocki wrote:
> On Wednesday, August 10, 2016 12:01:53 PM Rui Wang wrote:
> > A set of patches fixing bugs found while testing IOAPIC hotplug.
> 
> This should have been posted to the x...@kernel.org list too for the benefit
> of the maintainers.
> 
> Can you please resend it with a CC to that one (and the Bjorn's ACK on the
> second patch)?
> 

Thanks for the advice.
Will do.

Regards,
Rui

> Thanks,
> Rafael
> 
> 
> > Changelog:
> >
> > Changes from v2 to v3:
> > * Rebased on top of 4.8-rc1 per Bjorn & Rafael.
> > * Improved the commit message of 0003, w/ clearer explanation.
> >
> > Changes from v1 to v2:
> > * Split the first patch into two as advised by Bjorn: "would be nicer
> > if the interface change and header file munging were in a separate
> > patch so they wouldn't obscure the meat of the change, i.e., the
> > addition of calls to acpi_ioapic_add()."
> > * Removed acpi_ioapic_add() as an exported symbol.
> > * Fixed some typos, and s/acpi/ACPI/, s/ioapic/IOAPIC/ throughout.
> > * Fixed a warning from 0-day testing.
> >
> > Rui Wang (5):
> >   x86/ioapic: Change prototype of acpi_ioapic_add()
> >   x86/ioapic: Support hot-removal of IOAPICs present during boot
> >   x86/ioapic: Fix setup_res() failing to get resource
> >   x86/ioapic: Fix lost IOAPIC resource after hot-removal and hotadd
> >   x86/ioapic: Fix ioapic failing to request resource
> >
> >  drivers/acpi/internal.h |  2 --
> >  drivers/acpi/ioapic.c   | 46 ++----
> >  drivers/acpi/pci_root.c | 12 +++-
> >  drivers/pci/setup-bus.c |  5 -
> >  include/linux/acpi.h|  6 ++
> >  5 files changed, 47 insertions(+), 24 deletions(-)

[PATCH] Staging: android: ion: ion_heap.c: fix parenthesis alignment

2016-08-16 Thread Ben LeMasurier

This fixes the checkpatch.pl "Alignment should match open parenthesis"
issues in ion_heap.c.

Signed-off-by: Ben LeMasurier 
---
 drivers/staging/android/ion/ion_heap.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/staging/android/ion/ion_heap.c 
b/drivers/staging/android/ion/ion_heap.c
index ca15a87..4e5c0f1 100644
--- a/drivers/staging/android/ion/ion_heap.c
+++ b/drivers/staging/android/ion/ion_heap.c
@@ -93,7 +93,7 @@ int ion_heap_map_user(struct ion_heap *heap, struct 
ion_buffer *buffer,
}
len = min(len, remainder);
ret = remap_pfn_range(vma, addr, page_to_pfn(page), len,
-   vma->vm_page_prot);
+ vma->vm_page_prot);
if (ret)
return ret;
addr += len;
@@ -116,7 +116,7 @@ static int ion_heap_clear_pages(struct page **pages, int 
num, pgprot_t pgprot)
 }
 
 static int ion_heap_sglist_zero(struct scatterlist *sgl, unsigned int nents,
-   pgprot_t pgprot)
+   pgprot_t pgprot)
 {
int p = 0;
int ret = 0;
@@ -181,7 +181,7 @@ size_t ion_heap_freelist_size(struct ion_heap *heap)
 }
 
 static size_t _ion_heap_freelist_drain(struct ion_heap *heap, size_t size,
-   bool skip_pools)
+  bool skip_pools)
 {
struct ion_buffer *buffer;
size_t total_drained = 0;
@@ -266,7 +266,7 @@ int ion_heap_init_deferred_free(struct ion_heap *heap)
 }
 
 static unsigned long ion_heap_shrink_count(struct shrinker *shrinker,
-   struct shrink_control *sc)
+  struct shrink_control *sc)
 {
struct ion_heap *heap = container_of(shrinker, struct ion_heap,
 shrinker);
@@ -279,7 +279,7 @@ static unsigned long ion_heap_shrink_count(struct shrinker 
*shrinker,
 }
 
 static unsigned long ion_heap_shrink_scan(struct shrinker *shrinker,
-   struct shrink_control *sc)
+ struct shrink_control *sc)
 {
struct ion_heap *heap = container_of(shrinker, struct ion_heap,
 shrinker);
-- 
2.9.3

Re: [PATCH 17/34] clk: maxgen: Migrate to clk_hw based OF and registration APIs

2016-08-16 Thread Javier Martinez Canillas

Hello Stephen,

On 08/16/2016 04:06 PM, Stephen Boyd wrote:
> On Tue, Jun 7, 2016 at 11:55 AM, Javier Martinez Canillas
>>>
>>> I tried this patch on top of linux-next and my Peach Pi Chromebook
>>> (that has a max77802 chip) failed to boot. Following is the relevant
>>> parts from the boot log:
>>>
>>
>> It seems the mailer did a mess with the line wrapping so here's another 
>> attempt:
>>
> 
> Thanks! Found the problem too.
> 

Great! Thanks a lot for finding the issue.
I tested v2 of your patch and it worked well indeed.

Best regards,
-- 
Javier Martinez Canillas
Open Source Group
Samsung Research America

Re: [PATCH v2] Bluetooth: Add LED triggers for HCI frames tx and rx

2016-08-16 Thread Guodong Xu

On 16 August 2016 at 20:33, Marcel Holtmann  wrote:
>
> Hi Guodong,
>
> >>> Two LED triggers are added into hci_dev: tx_led and rx_led. Upon ACL/SCO
> >>> packets available in tx or rx, the LEDs will blink.
> >>>
> >>> For each hci registration, two triggers are added into LED subsystem:
> >>> [hdev->name]-tx and [hdev-name]-rx.
> >>> Refer to Documentation/leds/leds-class.txt for usage.
> >>>
> >>> Verified on HiKey 96boards, which uses HiSilicon hi6220 SoC and TI
> >>> WL1835 WiFi/BT combo chip.
> >>
> >> so I have no idea what to do with adding adding hci0-rx and hci0-tx 
> >> triggers. Combined with hci0-power trigger these are already 3 triggers. 
> >> And if you have 2 Bluetooth controllers in your system, then you have 6 
> >> triggers.
> >>
> >
> > True, 6 triggers. But, taking example for other subsytems, eg. cpu
> > cores. On my board, I have "heartbeat cpu0 cpu1 cpu2 cpu3 cpu4 cpu5
> > cpu6 cpu7". It doesn't have to mean you need all of them connected to
> > some LED(s). Actually, in most of the case, I only need heartbeat.
> >
> >
> >
> >> If we then maybe add another trigger, then this number just goes up and up.
> >>
> >> As far as I can tell you can only assign a single trigger to a LED.
> >>
> >
> > That's true. And people get to choose which feature they want to visualize.
>
> and as a result we keep adding senseless triggers to the kernel and bloating 
> it up for no reason. Especially since it feels like 99% of the LED triggers 
> are not used at all. This makes no sense to me.
>
> >> So this means that to even use these triggers, you now need 3 LEDs per
> >> Bluetooth controller. How is that useful for anybody in a real system?
> >> Maybe I am missing something here and somehow there is magic to combine
> >> triggers, but I have not found it yet. So please someone enlighten me on
> >> how this is supposed to be used with real devices.
> >>
> >> Recently I have added a simple bluetooth-power trigger that combines all 
> >> Bluetooth controllers into a single trigger. If any of them is enabled, 
> >> then you can control your LED. Which makes a lot more sense to me since 
> >> you most likely have a single Bluetooth LED on your system. And you want 
> >> it to show the correct state no matter what Bluetooth controller is in 
> >> use. However I can see the case that someone might want to assign one 
> >> specific Bluetooth controller to a LED status.
> >>
> >> So instead of adding many independent triggers to each controller, why not
> >> create one global bluetooth trigger and one individual bluetooth-hci0
> >> trigger for each controller, and then combine power, tx, rx and whatever
> >> else we need to trigger the LED for?
> >>
> >
> > When I started this work, I referred to the WiFi subsystem. See
> > CONFIG_MAC80211_LEDS. The WiFi subsystem implements these types of
> > triggers, "phy0rx phy0tx phy0assoc phy0radio", for each 'controller'.
>
> And I actually wonder who ever used these triggers. You need 4 LEDs to
> visualize the WiFi status. Which system has 4 LEDs to spare to visualize
> this?
>
> > Besides, there is also RFKILL, which represents the WiFi/BT power status.
> > RFKILL adds triggers for each module too. E.g. in the example below, I
> > have one WiFi (phy0) and one BT (hci0). Trigger rfkill1 is equivalent to
> > hci0-power.
> >
> > Ref: here are all LED triggers I found in my 96boards/HiKey:
> >
> > # cat trigger
> > none kbd-scrollock kbd-numlock kbd-capslock kbd-kanalock kbd-shiftlock
> > kbd-altgrlock kbd-ctrllock kbd-altlock kbd-shiftllock kbd-shiftrlock
> > kbd-ctrlllock kbd-ctrlrlock mmc0 mmc1 heartbeat cpu0 cpu1 cpu2 cpu3
> > cpu4 cpu5 cpu6 cpu7 mmc2 rfkill0 phy0rx phy0tx phy0assoc phy0radio
> > hci0-power hci0-tx [hci0-rx] rfkill1
>
> And how many LEDs do you have in your system? I think you are making my
> point here.
>
> So I think what we need to do is to not add to this madness and instead 
> create one "bluetooth" LED trigger that combines power and TX/RX for all 
> controllers. And then allow for individual "bluetooth-hci0" LED triggers so 
> that you can bind a single Bluetooth controller to a single LED.
>
> For me, if I can not combine hci0-power, hci0-tx and hci0-rx into a single 
> LED,

By combining them into a single LED, do you mean a use case like this?
 - when hci0 is powered on, the LED turns on.
 - then, when there is tx/rx traffic, the LED blinks (inverted, of
   course, since it is already on).
 - when hci0 is powered off, the LED turns off.
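
If that reading is right, the combined behaviour could be modelled roughly
as follows. This is only a sketch: struct bt_led and the bt_led_*() helpers
are made-up names for illustration, not existing Bluetooth or LED subsystem
APIs, and the one-tick blink length is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one LED driven by a combined Bluetooth trigger. */
struct bt_led {
	bool powered;		/* controller power state */
	int blink_ticks;	/* remaining ticks of the "off" blink phase */
};

/* Power events set the steady state and cancel any blink in progress. */
void bt_led_power(struct bt_led *led, bool on)
{
	assert(led != NULL);
	led->powered = on;
	led->blink_ticks = 0;
}

/* TX/RX traffic starts a blink, but only while the controller is powered. */
void bt_led_traffic(struct bt_led *led)
{
	assert(led != NULL);
	if (led->powered)
		led->blink_ticks = 1;	/* assumed blink length: one tick */
}

/* Return whether the LED is lit for this tick, then advance time by one tick. */
bool bt_led_tick(struct bt_led *led)
{
	assert(led != NULL);
	if (!led->powered)
		return false;		/* powered off: LED stays dark */
	if (led->blink_ticks > 0) {
		led->blink_ticks--;
		return false;		/* inverted blink: briefly dark */
	}
	return true;			/* powered and idle: steady on */
}
```

With this model a single LED carries all three pieces of information:
dark means powered off, steady on means powered and idle, and short dark
pulses mean traffic.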

-Guodong

> it becomes utterly useless on pretty much every system that is out there.
>
> Regards
>
> Marcel
>

Re: [PATCH v2 0/6] kexec_file: Add buffer hand-over for the next kernel

2016-08-16 Thread Dave Young

On 08/13/16 at 12:18am, Thiago Jung Bauermann wrote:
> Hello,
> 
> This patch series implements a mechanism which allows the kernel to pass
> on a buffer to the kernel that will be kexec'd. This buffer is passed
> as a segment which is added to the kimage when it is being prepared
> by kexec_file_load.
> 
> How the second kernel is informed of this buffer is architecture-specific.
> On powerpc, this is done via the device tree, by checking
> the properties /chosen/linux,kexec-handover-buffer-start and
> /chosen/linux,kexec-handover-buffer-end, which is analogous to how the
> kernel finds the initrd.
> 
> This is needed because the Integrity Measurement Architecture subsystem
> needs to preserve its measurement list across the kexec reboot. The
> following patch series for the IMA subsystem uses this feature for that
> purpose:
> 
> https://lists.infradead.org/pipermail/kexec/2016-August/016745.html
> 
> This is so that IMA can implement trusted boot support on the OpenPower
> platform, because on such systems an intermediary Linux instance running
> as part of the firmware is used to boot the target operating system via
> kexec. Using this mechanism, IMA on this intermediary instance can
> hand over to the target OS the measurements of the components that were
> used to boot it.
> 
> Because there could be additional measurement events between the
> kexec_file_load call and the actual reboot, IMA needs a way to update the
> buffer with those additional events before rebooting. One can minimize
> the interval between the kexec_file_load and the reboot syscalls, but as
> small as it can be, there is always the possibility that the measurement
> list will be out of date at the time of reboot.
> 
> To address this issue, this patch series also introduces
> kexec_update_segment, which allows a reboot notifier to change the
> contents of the image segment during the reboot process.
> 
> Patch 5 makes kimage_load_normal_segment and kexec_update_segment share
> code. It's not much code that they can share though, so I'm not sure if
> the result is actually better.
> 
> The last patch is not intended to be merged, it just demonstrates how
> this feature can be used.
> 
> This series applies on top of v5 of the "kexec_file_load implementation
> for PowerPC" patch series (which applies on top of v4.8-rc1):
> 
> https://lists.infradead.org/pipermail/kexec/2016-August/016843.html
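
For reference while reviewing: on powerpc the hand-over quoted above comes
down to reading two addresses from /chosen and deriving a buffer range from
them. A minimal sketch, assuming each property holds a single 64-bit
big-endian value and that the end address is exclusive (as with
linux,initrd-end). Both assumptions may not match the series, and
be64_decode() and handover_buffer_range() are hypothetical names, not
functions from the patches.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode one big-endian 64-bit value, the byte order device tree data uses. */
uint64_t be64_decode(const unsigned char *p)
{
	uint64_t v = 0;
	size_t i;

	assert(p != NULL);
	for (i = 0; i < 8; i++)
		v = (v << 8) | p[i];
	return v;
}

/*
 * Hypothetical helper: turn the raw contents of the -start and -end
 * properties into an address and a size.  Returns 0 on success and
 * -1 if the range is empty or inverted.
 */
int handover_buffer_range(const unsigned char *start_prop,
			  const unsigned char *end_prop,
			  uint64_t *addr, uint64_t *size)
{
	uint64_t start = be64_decode(start_prop);
	uint64_t end = be64_decode(end_prop);

	if (end <= start)
		return -1;
	*addr = start;
	*size = end - start;	/* end is assumed to be exclusive */
	return 0;
}
```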

I'm trying to review your patches, but it seems I cannot apply them
cleanly to the mainline kernel or v4.8-rc1.

Applying the kexec_file_load series on v4.8-rc1 failed as shown below:

Applying: kexec_file: Allow arch-specific memory walking for kexec_add_buffer
error: patch failed: include/linux/kexec.h:149
error: include/linux/kexec.h: patch does not apply
Patch failed at 0001 kexec_file: Allow arch-specific memory walking for kexec_add_buffer
The copy of the patch that failed is found in: .git/rebase-apply/patch
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".

In what order should the three patch series below be applied?

[PATCH v2 0/2] extend kexec_file_load system call
[PATCH v5 00/13] kexec_file_load implementation for PowerPC
[PATCH v2 0/6] kexec_file: Add buffer hand-over for the next kernel

Do they depend on other patches?

Thanks
Dave

RE: [PATCH v4 2/3] tools/power/acpi/acpidbg: Use new flushing mechanism

2016-08-16 Thread Zheng, Lv

Hi, Rafael

> From: Rafael J. Wysocki [mailto:r...@rjwysocki.net]
> Subject: Re: [PATCH v4 2/3] tools/power/acpi/acpidbg: Use new flushing 
> mechanism
> 
> On Tuesday, July 26, 2016 07:01:39 PM Lv Zheng wrote:
> > This patch converts tools/power/acpi/tools/acpidbg/acpidbg to use the new
> > flushing mechanism.
> 
> I guess it will use the flush interface provided by the kernel instead of the
> previously existing flush implementation in user space?

Yes.
The existing user space flush implementation was just a compromise for the
period while the kernel flush was not ready.

Thanks
Lv

RE: [PATCH v4 1/3] ACPI / debugger: Add kernel flushing support

2016-08-16 Thread Zheng, Lv

Hi, Rafael

> From: Rafael J. Wysocki [mailto:r...@rjwysocki.net]
> Subject: Re: [PATCH v4 1/3] ACPI / debugger: Add kernel flushing support
> 
> On Tuesday, July 26, 2016 07:01:33 PM Lv Zheng wrote:
> > This patch adds debugger output flushing support in kernel via .ioctl()
> > callback. The in-kernel flushing is more efficient, because it reduces
> > useless log IOs by bypassing log user_read/kern_write during the flush
> > period.
> >
> > This mechanism is useful for the batch mode.
> 
> Is it only useful or is it required?

Should be required.

> 
> Also the batch mode is introduced by the remaining patches in the series,
> isn't it?

No, batch mode is already upstream.
It's acpidbg -b "debugger commands".
For example:
acpidbg -b "namespace"
It's currently working.

> 
> > Scripts can integrate a batch mode acpidbg instance to perform AML debugger
> > functionalities.
> 
> This sentence is not parsable for me.  What does it mean, really?

By default, just like acpiexec, acpidbg is a shell-like interactive utility.
Running acpidbg enters an interactive acpidbg shell,
so it cannot be used from shell scripts.

With the -b option, acpidbg executes one command and exits,
so it can be used in a script like:

#!/bin/sh

acpidbg -b "find _LID"
acpidbg -b "namespace"

So that validators can use "acpidbg -b" to create recursive test cases.

> 
> > As the batch mode always starts from a new command write, it thus requires
> > the kernel debugger driver to drop the old input/output first.
> 
> What does "the old" mean here?

There are 2 cases that require a "flush" operation before an
"acpidbg -b" execution.

A. The first case requiring a flush is the "prompt string":
After the kernel has booted, the user may choose to run the first acpidbg
execution in either batch mode or interactive mode.

A.1. First use is batch mode, for example, "acpidbg -b namespace"

1. acpi_dbg kernel driver outputs the prompt string into the output buffer.
2. acpidbg user program writes the "namespace" command to the acpi_dbg kernel
   driver's input buffer.
3. acpi_dbg kernel driver reads the input buffer.
   It then obtains the "namespace" command and starts to execute it.
4. acpi_dbg kernel driver puts the command result into the output buffer
   and waits for the userspace acpidbg to read the output.
5. acpidbg user program reads the acpi_dbg kernel driver's output buffer.
   It then obtains both the unexpected "prompt string" and the expected command
   result.

If there is a "flush" between 1 and 2, the acpidbg user program won't obtain the
unexpected "prompt string".
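
The sequence in A.1 can be reproduced with a toy model of the output
buffer. This is only an illustration: struct out_buf and its helpers are
invented here and are not the acpi_dbg driver's data structures, and the
"- " prompt text is a placeholder.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define OUT_BUF_SIZE 256

/* Toy stand-in for the driver's output buffer (not the real structure). */
struct out_buf {
	char data[OUT_BUF_SIZE];
	size_t len;
};

/* Driver side: append text to the buffer (steps 1 and 4). */
void out_buf_write(struct out_buf *b, const char *s)
{
	size_t n = strlen(s);

	assert(b->len + n < OUT_BUF_SIZE);
	memcpy(b->data + b->len, s, n);
	b->len += n;
}

/* The "flush": discard everything that is currently buffered. */
void out_buf_flush(struct out_buf *b)
{
	b->len = 0;
}

/* User side: read the whole buffer out as a NUL-terminated string (step 5). */
size_t out_buf_read(struct out_buf *b, char *dst, size_t dst_size)
{
	size_t n = b->len;

	assert(n < dst_size);
	memcpy(dst, b->data, n);
	dst[n] = '\0';
	b->len = 0;
	return n;
}
```

Writing the prompt and then the result and reading once yields both strings
concatenated. Calling out_buf_flush() after the prompt is written and before
the command is issued leaves only the result for the reader.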

A.2. First use is interactive mode

However, it is not guaranteed that the "prompt string" always appears before
the command result.
For example:
1. User runs acpidbg in interactive mode.
2. acpi_dbg kernel driver outputs the prompt string into the output buffer.
3. acpidbg user program reads the output buffer and gets the "prompt string".
4. User exits acpidbg.
5. User runs acpidbg in batch mode.
In this case, the "prompt string" has already been read by the 1st
(interactive) acpidbg execution.
Thus the 2nd (batch) acpidbg execution won't read an unexpected "prompt string".

There is no harm in having a "flush" between 4 and 5.

Conclusion:
Whether there is a "prompt string" in the output buffer depends on whether the
"acpidbg -b" is the first use.
Running a "flush" operation before batch mode execution, however, always drains
the output buffer.

B. The second case requiring "flush" is the old output:

Old output can remain in the output buffer because we allow
"asynchronous termination" of acpi_dbg IO.

The use case is:

1. User runs acpidbg and executes a command in either interactive mode or
   batch mode.
2. acpi_dbg kernel driver puts the command result into the output buffer.
3. When the output buffer is full, acpi_dbg kernel driver blocks, waiting for
   userspace to read the result.
4. Instead of reading the result, userspace can close the acpi_dbg IO.
   The user can achieve this by typing "ctrl+C" in the acpidbg user program.
5. User runs acpidbg in batch mode.
   It then obtains the unexpected "old output" and "prompt string" along with
   the expected command result.

If there is a "flush" between 4 and 5, the acpidbg user program won't obtain the
unexpected "old output" and "prompt string".
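
For comparison, a flush done purely in user space amounts to reading and
discarding whatever is pending before the new command is written. The sketch
below shows that idea with plain POSIX calls. It is not the acpidbg code:
drain_fd() is a hypothetical helper, and the real tool may differ.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Switch fd to non-blocking mode and read until nothing is pending.
 * Returns the number of bytes discarded, or -1 on a real error.
 */
long drain_fd(int fd)
{
	char scratch[128];
	long dropped = 0;
	int flags;

	assert(fd >= 0);
	flags = fcntl(fd, F_GETFL);
	if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)
		return -1;

	for (;;) {
		ssize_t n = read(fd, scratch, sizeof(scratch));

		if (n > 0) {
			dropped += n;	/* stale data: throw it away */
			continue;
		}
		if (n == 0)
			break;		/* writer closed: nothing more can arrive */
		if (errno == EINTR)
			continue;
		if (errno == EAGAIN || errno == EWOULDBLOCK)
			break;		/* nothing pending right now */
		dropped = -1;
		break;
	}

	/* Restore the original mode for the real read that follows. */
	if (fcntl(fd, F_SETFL, flags) < 0)
		return -1;
	return dropped;
}
```

Every discarded byte here still has to be copied out to user space, which is
the extra log IO that the in-kernel flush avoids.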

> 
> > The old input is automatically dropped by acpi_os_get_line() via an error
> > returning value,
> 
> I can't parse this too, sorry.

OK, so we don't need this statement.
The flush is mainly used to drain the output buffer,
because stale data left there can produce wrong output in batch mode.

> 
> > but the output remains in acpi_dbg output buffers and should be
> > dropped prior to reading the new command; otherwise, the old output can
> > be read out by the batch mode instance and the result of the batch mode
> > command will be messed up.
> 
> Can you give an example here for clarity, please?

Please see the 2 cases in A.1 and B above.

Thanks and best regards
Lv

Re: [PATCH 2/6] x86/boot: Move compressed kernel to end of decompression buffer

2016-08-16 Thread Matt Mullins

On Tue, Aug 16, 2016 at 12:19:42PM -0700, Yinghai Lu wrote:
> On Mon, Aug 15, 2016 at 9:01 PM, Matt Mullins  wrote:
> >
> > This appears to have a negative effect on booting the Intel Edison
> > platform, as it uses u-boot as its bootloader.  u-boot does not copy the
> > init_size parameter when booting a bzImage: it copies a fixed-size
> > setup_header [1], and its definition of setup_header doesn't include the
> > parameters beyond setup_data [2].
> >
> > With a zero value for init_size, this calculates a %rsp value of
> > 0x101ff9600.  This causes the boot process to hard-stop at the
> > immediately-following pushq, as this platform has no usable physical
> > addresses above 4G.
> >
> > What are the options for getting this type of platform to function again?
> > For now, kexec from a working Linux system does seem to be a work-around,
> > but there appears to be other x86 hardware using u-boot: the chromium.org
> > folks seem to be maintaining the u-boot x86 tree.
> >
> > [1] http://git.denx.de/?p=u-boot.git;a=blob;f=arch/x86/lib/zimage.c;h=1b33c771391f49ffe82864ff1582bdfd07e5e97d;hb=HEAD#l156
> > [2] http://git.denx.de/?p=u-boot.git;a=blob;f=arch/x86/include/asm/bootparam.h;h=140095117e5a2daef0a097c55f0ed10e08acc781;hb=HEAD#l24
> 
> Then u-boot's assumption about the header size should be fixed.

I was hoping to avoid that, since the Edison's u-boot is a 10,000-line patch
atop the upstream -- I don't trust myself to build and flash one quite yet.

If this turned out to affect Chromebooks, I'd spend more effort pushing for
a kernel fix, but it seems that ChromeOS has a different kernel load procedure
and doesn't use "zboot".  For now, I'll probably just keep a local patch that
hard-codes a value large enough to decompress and launch the kernel.

I may turn that local patch into something gated by a Kconfig eventually, in
hopes that users of the other x86 u-boot platforms will see it in a "make
oldconfig" run.
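
To make the failure mode in the quoted report concrete, here is a small
sketch of the arithmetic. It assumes the relocation target is computed as
load address + init_size - _end with the subtraction done in 32 bits, and
that the stack pointer is then placed at a fixed offset from that target;
that is my reading of the patch, not something spelled out in this thread.
stack_pointer_for() is a hypothetical name, and the input values used in the
note below are invented so that the result lands on the reported
0x101ff9600; the real symbol offsets are not given here.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Sketch only: all inputs are hypothetical.  The 32-bit subtraction wraps
 * modulo 2^32 when init_size is smaller than image_end, and the wrapped
 * value is then zero-extended and added to the 64-bit load address.
 */
uint64_t stack_pointer_for(uint32_t init_size, uint32_t image_end,
			   uint64_t load_addr, uint64_t stack_end_off)
{
	uint32_t rel;

	assert(stack_end_off <= image_end);
	rel = init_size - image_end;	/* wraps when init_size == 0 */
	return load_addr + (uint64_t)rel + stack_end_off;
}
```

With init_size = 0, image_end = 0x700000, load_addr = 0x2000000 and
stack_end_off = 0x6f9600, the wrapped difference is 0xff900000 and the
result is 0x101ff9600, which is above 4G. With a plausible non-zero
init_size such as 0x1000000 the same inputs give 0x2ff9600, well below 4G.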

[PATCH v3] sched/cputime: Resync steal time when guest & host lose sync

2016-08-16 Thread Wanpeng Li

From: Wanpeng Li 

Commit:

  57430218317e ("sched/cputime: Count actually elapsed irq & softirq time")

... triggered a regression:

| An i5 laptop, 4 pCPUs, 4 vCPUs for one full dynticks guest; there are four
| CPU hog processes (for loops) running in the guest. I hot-unplug the pCPUs
| on the host one by one until there is only one left, then observe top in
| the guest: there is 100% st for cpu0 (housekeeping) and 75% st for the other
| cpus (nohz full mode). However, w/o this commit, it is 75% for all four cpus.

When a guest is interrupted for a longer amount of time, missed clock ticks
are not redelivered later. Because of that, we should not limit the amount
of steal time accounted to the amount of time that the calling functions
think has passed.

However, the interval returned by account_other_time() is NOT rounded down
to the nearest jiffy, while the base interval in get_vtime_delta() that it is
subtracted from is, so the max cputime limit is required there to avoid
underflow.

This patch fixes the regression by limiting account_other_time() only when
called from get_vtime_delta(), to avoid underflow, and lets the other three
call sites (of account_other_time() and steal_account_process_time()) account
however much steal time the host told us elapsed.

Suggested-by: Rik van Riel  
Suggested-by: Paolo Bonzini 
Cc: Ingo Molnar 
Cc: Peter Zijlstra 
Cc: Rik van Riel 
Cc: Paolo Bonzini 
Cc: Radim Krcmar 
Cc: Mike Galbraith 
Cc: Frederic Weisbecker 
Cc: Thomas Gleixner 
Signed-off-by: Wanpeng Li 
---
v2 -> v3:
 * update code comments
v1 -> v2:
 * add code comments and update the changelog

 kernel/sched/cputime.c | 18 +++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
index 9858266..2b9e5e5 100644
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -263,6 +263,11 @@ void account_idle_time(cputime_t cputime)
 		cpustat[CPUTIME_IDLE] += (__force u64) cputime;
 }
 
+/*
+ * When a guest is interrupted for a longer amount of time, missed clock
+ * ticks are not redelivered later. Due to that, this function may on
+ * occasion account more time than the calling functions think elapsed.
+ */
 static __always_inline cputime_t steal_account_process_time(cputime_t maxtime)
 {
 #ifdef CONFIG_PARAVIRT
@@ -371,7 +376,7 @@ static void irqtime_account_process_tick(struct task_struct *p, int user_tick,
 	 * idle, or potentially user or system time. Due to rounding,
 	 * other time can exceed ticks occasionally.
 	 */
-	other = account_other_time(cputime);
+	other = account_other_time(ULONG_MAX);
 	if (other >= cputime)
 		return;
 	cputime -= other;
@@ -486,7 +491,7 @@ void account_process_tick(struct task_struct *p, int user_tick)
 	}
 
 	cputime = cputime_one_jiffy;
-	steal = steal_account_process_time(cputime);
+	steal = steal_account_process_time(ULONG_MAX);
 
 	if (steal >= cputime)
 		return;
@@ -516,7 +521,7 @@ void account_idle_ticks(unsigned long ticks)
 	}
 
 	cputime = jiffies_to_cputime(ticks);
-	steal = steal_account_process_time(cputime);
+	steal = steal_account_process_time(ULONG_MAX);
 
 	if (steal >= cputime)
 		return;
@@ -694,6 +699,13 @@ static cputime_t get_vtime_delta(struct task_struct *tsk)
 	unsigned long now = READ_ONCE(jiffies);
 	cputime_t delta, other;
 
+	/*
+	 * Unlike tick based timing, vtime based timing never has lost
+	 * ticks, and no need for steal time accounting to make up for
+	 * lost ticks. Vtime accounts a rounded version of actual
+	 * elapsed time. Limit account_other_time to prevent rounding
+	 * errors from causing elapsed vtime to go negative.
+	 */
 	delta = jiffies_to_cputime(now - tsk->vtime_snap);
 	other = account_other_time(delta);
 	WARN_ON_ONCE(tsk->vtime_snap_whence == VTIME_INACTIVE);
-- 
1.9.1
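
The underflow argument in the changelog above can be illustrated with a small
userspace sketch. This is not the kernel code: demo_cputime_t and the demo_*
functions are hypothetical stand-ins, and the "pending" argument is an assumed
simplification of whatever steal/irq time the host has reported. It only shows
why the vtime path clamps before subtracting from an unsigned value while the
tick paths can account everything and return early.

```c
#include <stdint.h>

/* Hypothetical stand-in for the kernel's cputime_t; not the real type. */
typedef uint64_t demo_cputime_t;

/*
 * Return how much "other" (steal/irq) time to account, never more than
 * maxtime. Passing UINT64_MAX means "no limit".
 */
static demo_cputime_t demo_account_other(demo_cputime_t pending,
					 demo_cputime_t maxtime)
{
	return pending < maxtime ? pending : maxtime;
}

/*
 * vtime-style caller: delta is rounded down to whole jiffies while the
 * pending time is not, so the pending time is clamped to delta before the
 * subtraction to keep the unsigned result from wrapping.
 */
static demo_cputime_t demo_vtime_delta(demo_cputime_t delta,
				       demo_cputime_t pending)
{
	demo_cputime_t other = demo_account_other(pending, delta);

	return delta - other;
}

/*
 * Tick-style caller: account all pending time, then bail out if it covers
 * the whole tick, mirroring the "if (steal >= cputime) return;" checks.
 */
static demo_cputime_t demo_tick_remaining(demo_cputime_t cputime,
					  demo_cputime_t pending)
{
	demo_cputime_t steal = demo_account_other(pending, UINT64_MAX);

	if (steal >= cputime)
		return 0;
	return cputime - steal;
}
```

In this sketch the tick-style caller is safe without a limit only because of
the early return; the vtime-style caller has no such check, which is why the
limit stays there.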

Re: [PATCH v2] sched/cputime: Resync steal time when guest & host lose sync

2016-08-16 Thread Wanpeng Li

2016-08-17 9:54 GMT+08:00 Rik van Riel :
> On Wed, 2016-08-17 at 09:16 +0800, Wanpeng Li wrote:
>>
>> @@ -694,6 +699,12 @@ static cputime_t get_vtime_delta(struct
>> task_struct *tsk)
>>   unsigned long now = READ_ONCE(jiffies);
>>   cputime_t delta, other;
>>
>> + /*
>> +  * The interval returned by account_other_time() is NOT
>> +  * rounded down to the nearest jiffy, while the base
>> +  * interval it is subtracted from is. So the max cputime
>> +  * limit is required to avoid underflow.
>> +  */
>>   delta = jiffies_to_cputime(now - tsk->vtime_snap);
>>   other = account_other_time(delta);
>>   WARN_ON_ONCE(tsk->vtime_snap_whence == VTIME_INACTIVE);
>
> That comment makes sense in the context of the discussion
> we have been having over the past few days, but could be
> somewhat cryptic to someone looking at it 3 years from now.
>
> How about something like the following?
>
> /*
>  * Unlike tick based timing, vtime based timing never has lost
>  * ticks, and no need for steal time accounting to make up for
>  * lost ticks. Vtime accounts a rounded version of actual
>  * elapsed time. Limit account_other_time to prevent rounding
>  * errors from causing elapsed vtime to go negative.
>  */

Great, thanks for your help. I will send out a new version soon. :)

Regards,
Wanpeng Li

Re: [PATCH v3] sched/cputime: Resync steal time when guest & host lose sync

2016-08-16 Thread Rik van Riel

On Wed, 2016-08-17 at 10:05 +0800, Wanpeng Li wrote:

> This patch fix the regression by limiting the account_other_time()
> from 
> get_vtime_delta() to avoid underflow, and let other three call sites
> (account_other_time() and steal_account_process_time()) account
> however 
> much steal time the host told us elapsed. 
> 
> Suggested-by: Rik van Riel  
> Suggested-by: Paolo Bonzini 
> Cc: Ingo Molnar 
> Cc: Peter Zijlstra 
> Cc: Rik van Riel 
> Cc: Paolo Bonzini 
> Cc: Radim Krcmar 
> Cc: Mike Galbraith 
> Cc: Frederic Weisbecker 
> Cc: Thomas Gleixner 
> Signed-off-by: Wanpeng Li 
> 

Reviewed-by: Rik van Riel 

-- 

All Rights Reversed.

Re: [RFC 00/11] THP swap: Delay splitting THP during swapping out

2016-08-16 Thread Huang, Ying

Hi, Kim,

Minchan Kim  writes:

> Hello Huang,
>
> On Tue, Aug 09, 2016 at 09:37:42AM -0700, Huang, Ying wrote:
>> From: Huang Ying 
>> 
>> This patchset is based on 8/4 head of mmotm/master.
>> 
>> This is the first step for Transparent Huge Page (THP) swap support.
>> The plan is to delaying splitting THP step by step and avoid splitting
>> THP finally during THP swapping out and swapping in.
>
> What does it mean "delay splitting THP on swapping-in"?

Sorry for my poor English.  We will only delay splitting the THP during
swapping out.  The final target is to avoid splitting the THP during
swapping out, and to swap the THP out and in directly.  Thanks for pointing
that out.  I will revise the patch description in the next version.

>> 
>> The advantages of THP swap support are:
>> 
>> - Batch swap operations for THP to reduce lock acquiring/releasing,
>>   including allocating/freeing swap space, adding/deleting to/from swap
>>   cache, and writing/reading swap space, etc.
>> 
>> - THP swap space read/write will be 2M sequence IO.  It is particularly
>>   helpful for swap read, which usually are 4k random IO.
>> 
>> - It will help memory fragmentation, especially when THP is heavily used
>>   by the applications.  2M continuous pages will be free up after THP
>>   swapping out.
>
> Could we take the benefit for normal pages as well as THP page?

This patchset benefits THP swap only.  It has no effect on normal pages.

> I think Tim and me discussed about that a few weeks ago.

I work closely with Tim on swap optimization.  This patchset is part
of our swap optimization plan.

> Please search below topics.
>
> [1] mm: Batch page reclamation under shink_page_list
> [2] mm: Cleanup - Reorganize the shrink_page_list code into smaller functions
>
> It's different with yours which focused on THP swapping while the suggestion
> would be more general if we can do so it's worth to try it, I think.

I think the general optimization above will benefit both normal pages
and THP, at least for now.  And I think there is no hard conflict
between the two patchsets.

THP swap has more opportunity for optimization, because we can batch
512 operations together more easily.  With full THP swap support, unmapping
a THP could be more efficient, with only one swap count operation instead
of 512.  The same goes for many other operations, such as adding to and
removing from the swap cache with a multi-order radix tree.  It will also
help with memory fragmentation: a THP can be kept intact across swap out/in,
so it does not need to be rebuilt via khugepaged.

But not all pages are huge, so swap optimization for normal pages is
necessary and good anyway.

> Anyway, I hope [1/11] should be merged regardless of the patchset because
> I believe anyone doesn't feel comfortable with cluser_info functions. ;-)

Thanks,

Best Regards,
Huang, Ying

[snip]
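
The figure of 512 in the reply above follows from the page sizes quoted in the
thread. A minimal sketch of that arithmetic, assuming 4 KiB base pages and
2 MiB huge pages; the DEMO_* macros and demo_* functions are hypothetical
names for illustration, not kernel symbols:

```c
#include <stddef.h>

#define DEMO_BASE_PAGE_SIZE (4UL * 1024)        /* 4 KiB base page, assumed */
#define DEMO_THP_SIZE       (2UL * 1024 * 1024) /* 2 MiB huge page, assumed */

/* Number of base pages that make up one huge page. */
static unsigned long demo_pages_per_thp(void)
{
	return DEMO_THP_SIZE / DEMO_BASE_PAGE_SIZE;
}

/* Per-page operations needed when each THP is split before swap out. */
static unsigned long demo_ops_split(unsigned long nr_thp)
{
	return nr_thp * demo_pages_per_thp();
}

/* Operations needed when each THP is handled as a single unit. */
static unsigned long demo_ops_batched(unsigned long nr_thp)
{
	return nr_thp;
}
```

Under these assumptions one huge page stands for 512 base pages, so an
operation that is done once per THP rather than once per base page is done
512 times less often.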

Re: [PATCH v5 02/13] kexec_file: Change kexec_add_buffer to take kexec_buf as argument.

2016-08-16 Thread Balbir Singh



On 17/08/16 04:49, Thiago Jung Bauermann wrote:
> Am Dienstag, 16 August 2016, 16:15:55 schrieb Balbir Singh:
>> On 16/08/16 00:49, Thiago Jung Bauermann wrote:
>>> Am Montag, 15 August 2016, 17:30:49 schrieb Balbir Singh:
>>>> On Thu, Aug 11, 2016 at 08:08:07PM -0300, Thiago Jung Bauermann wrote:
>>>>> Adapt all callers to the new function prototype.
>>>>
>>>> Could you please expand on this?
>>>
>>> Is the following better?
>>>
>>> Adapt all callers to set up a kexec_buf to pass to kexec_add_buffer.
>>
>> Yes and the reason for doing so? Consolidation/clarity of implementation?
> 
> Indeed. What about this commit message?
> 
> Subject: [PATCH v5 02/13] kexec_file: Change kexec_add_buffer to take kexec_buf as argument.
> 
> This is done to simplify the kexec_add_buffer argument list.
> Adapt all callers to set up a kexec_buf to pass to kexec_add_buffer.
> 
> In addition, change the type of kexec_buf.buffer from char * to void *.
> There is no particular reason for it to be a char *, and the change
> allows us to get rid of 3 existing casts to char * in the code.
> 
> Signed-off-by: Thiago Jung Bauermann 
> Acked-by: Dave Young 
> Acked-by: Balbir Singh 
> 


Looks good

Balbir
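
The refactoring pattern discussed in this thread (one descriptor struct
instead of a long argument list, with a void * buffer so callers need no
casts) can be sketched in userspace C. Everything here is hypothetical:
struct demo_buf and demo_add_buffer() are illustrative names, their fields
and return values are assumptions, and the sketch does not reproduce what the
real kexec_add_buffer() does.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical descriptor, only loosely inspired by the idea of kexec_buf. */
struct demo_buf {
	void *buffer;   /* void * accepts any object pointer without a cast */
	size_t bufsz;   /* bytes valid in buffer */
	size_t memsz;   /* bytes to reserve at the destination */
	size_t copied;  /* filled in by demo_add_buffer() */
};

/*
 * Copy the described buffer into dst and zero-fill the rest up to memsz.
 * Returns 0 on success, -1 if the description is inconsistent.
 */
static int demo_add_buffer(struct demo_buf *kbuf, unsigned char *dst,
			   size_t dst_len)
{
	if (!kbuf || !kbuf->buffer || kbuf->bufsz > kbuf->memsz ||
	    kbuf->memsz > dst_len)
		return -1;

	memcpy(dst, kbuf->buffer, kbuf->bufsz);
	memset(dst + kbuf->bufsz, 0, kbuf->memsz - kbuf->bufsz);
	kbuf->copied = kbuf->bufsz;
	return 0;
}
```

A caller fills in a struct demo_buf once and passes its address, so adding a
field later does not change every call site, and a pointer to any object type
can be stored in buffer directly.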

Re: [PATCH 3.16 102/305] xfs: xfs_iflush_cluster fails to abort on error

2016-08-16 Thread Dave Chinner

On Tue, Aug 16, 2016 at 08:45:02PM +0100, Ben Hutchings wrote:
> On Sun, 2016-08-14 at 09:36 +1000, Dave Chinner wrote:
> > On Sat, Aug 13, 2016 at 06:42:51PM +0100, Ben Hutchings wrote:
> > > 
> > > > 3.16.37-rc1 review patch.  If anyone has any objections, please let me know.
> > > > 
> > > > --
> > > > 
> > > > From: Dave Chinner 
> > > 
> > > commit b1438f477934f5a4d5a44df26f3079a7575d5946 upstream.
> > > 
> > > When a failure due to an inode buffer occurs, the error handling
> > > fails to abort the inode writeback correctly. This can result in the
> > > inode being reclaimed whilst still in the AIL, leading to
> > > use-after-free situations as well as filesystems that cannot be
> > > unmounted as the inode log items left in the AIL never get removed.
> > > 
> > > Fix this by ensuring fatal errors from xfs_imap_to_bp() result in
> > > the inode flush being aborted correctly.
> > 
> > > 
> > >  
> > > > >   /*
> > > > > -  * Get the buffer containing the on-disk inode.
> > > > > +  * Get the buffer containing the on-disk inode. We are doing a 
> > > > > try-lock
> > > > > +  * operation here, so we may get  an EAGAIN error. In that 
> > > > > case, we
> > > > > +  * simply want to return with the inode still dirty.
> > > > > +  *
> > > > > +  * If we get any other error, we effectively have a corruption 
> > > > > situation
> > > > > +  * and we cannot flush the inode, so we treat it the same as 
> > > > > failing
> > > > > +  * xfs_iflush_int().
> > > > >    */
> > > > >   error = xfs_imap_to_bp(mp, NULL, >i_imap, , , 
> > > > > XBF_TRYLOCK,
> > > > >      0);
> > > > > - if (error || !bp) {
> > > > > + if (error == -EAGAIN) {
> > 
> > Wrong. As was pointed out for other -stable trees after users
> > reported regressions, the error signs in XFS changed from positive
> > to negative in 3.17-rc1.
> 
> OK, so do I just need to delete the minus sign there?

Yes.

-Dave.
-- 
Dave Chinner
dchin...@redhat.com
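
The sign problem pointed out above can be shown with a small userspace sketch.
The demo_* helpers are hypothetical and only model the two conventions
described in the thread (positive error values in XFS before 3.17-rc1,
negative afterwards); they are not XFS functions.

```c
#include <errno.h>

/* Hypothetical try-lock helper using the positive-errno convention. */
static int demo_trylock_positive(int busy)
{
	return busy ? EAGAIN : 0;
}

/* The same helper using the negative-errno convention. */
static int demo_trylock_negative(int busy)
{
	return busy ? -EAGAIN : 0;
}

/* Check suited to a positive-errno tree: 1 means "retry later". */
static int demo_should_retry_positive(int error)
{
	return error == EAGAIN;
}

/* Check suited to a negative-errno tree: 1 means "retry later". */
static int demo_should_retry_negative(int error)
{
	return error == -EAGAIN;
}
```

Feeding the result of demo_trylock_positive() into
demo_should_retry_negative() never matches, which is the kind of mismatch a
backport hits when a check written for one convention is applied unchanged to
a tree that uses the other.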
