Re: [PATCH] security/keys: Adding newline after declaration

2017-09-06 Thread Joe Perches
On Thu, 2017-09-07 at 11:11 +0530, Pushkar Jambhlekar wrote:
> Fixing checkpatch warning to add newline after declaration
[]
> diff --git a/security/keys/big_key.c b/security/keys/big_key.c
[]
> @@ -93,6 +93,7 @@ static int big_key_crypt(enum big_key_op op, u8 *data, 
> size_t datalen, u8 *key)
>  {
>   int ret = -EINVAL;
>   struct scatterlist sgio;
> +
>   SKCIPHER_REQUEST_ON_STACK(req, big_key_skcipher);

checkpatch doesn't know that SKCIPHER_REQUEST_ON_STACK
is actually 2 declarations.



Re: [PATCH] security/keys: Adding newline after declaration

2017-09-06 Thread Joe Perches
On Thu, 2017-09-07 at 11:11 +0530, Pushkar Jambhlekar wrote:
> Fixing checkpatch warning to add newline after declaration
[]
> diff --git a/security/keys/big_key.c b/security/keys/big_key.c
[]
> @@ -93,6 +93,7 @@ static int big_key_crypt(enum big_key_op op, u8 *data, 
> size_t datalen, u8 *key)
>  {
>   int ret = -EINVAL;
>   struct scatterlist sgio;
> +
>   SKCIPHER_REQUEST_ON_STACK(req, big_key_skcipher);

checkpatch doesn't know that SKCIPHER_REQUEST_ON_STACK
is actually 2 declarations.



Re: [PATCH 4/4][RFC v2] x86/apic: Spread the vectors by choosing the idlest CPU

2017-09-06 Thread Thomas Gleixner
On Thu, 7 Sep 2017, Yu Chen wrote:
> On Wed, Sep 06, 2017 at 10:03:58AM +0200, Thomas Gleixner wrote:
> > Can you please apply the debug patch below, boot the machine and right
> > after login provide the output of
> > 
> > # cat /sys/kernel/debug/tracing/trace
> >
>  kworker/0:2-303   [000]  9.135467: msi_domain_alloc_irqs: dev: 
> :bb:00.0 nvec 1 virq 34
>  kworker/0:2-303   [000]  9.135476: msi_domain_alloc_irqs: dev: 
> :bb:00.0 nvec 1 virq 35
>  kworker/0:2-303   [000]  9.135484: msi_domain_alloc_irqs: dev: 
> :bb:00.0 nvec 1 virq 36



>  kworker/0:2-303   [000]  9.762268: msi_domain_alloc_irqs: dev: 
> :bb:00.3 nvec 1 virq 331
>  kworker/0:2-303   [000]  9.762278: msi_domain_alloc_irqs: dev: 
> :bb:00.3 nvec 1 virq 332
>  kworker/0:2-303   [000]  9.762288: msi_domain_alloc_irqs: dev: 
> :bb:00.3 nvec 1 virq 333

That's 300 vectors.

>  bb:00.[0-3] Ethernet controller: Intel Corporation Device 37d0 (rev 03)
> 
> -+-[:b2]-+-00.0-[b3-bc]00.0-[b4-bc]--+-00.0-[b5-b6]00.0
>  |   |   +-01.0-[b7-b8]00.0
>  |   |   +-02.0-[b9-ba]00.0
>  |   |   \-03.0-[bb-bc]--+-00.0
>  |   |   +-00.1
>  |   |   +-00.2
>  |   |   \-00.3
> 
> and they are using i40e driver, the vectors should be reserved by:
> i40e_probe() ->
>   i40e_init_interrupt_scheme() ->
> i40e_init_msix() ->
>   i40e_reserve_msix_vectors() ->
> pci_enable_msix_range()
> 
> # ls /sys/kernel/debug/irq/irqs
> 0  10   11  13  142  184  217  259  292  31  33
> 337  339  340  342  344  346  348  350  352  354  356
> 358  360  362  364  366  368  370  372  374  376  378
> 380  382  384  386  388  390  392  394  4  6   7  9
> 1  109  12  14  15   224   26   332  335
> 338  34   341  343  345  347  349  351  353  355  357
> 359  361  363  365  367  369  371  373  375  377  379
> 381  383  385  387  389  391  393  395  5  67  8

Out of these 300 interrupts exactly 8 randomly selected ones are actively
used. And the other 292 interrupts are just there because it might need
them in the future when the 32 CPU machine gets magically upgraded to 4096
cores at runtime?

Can the i40e people @intel please fix this waste of resources and sanitize
their interrupt allocation scheme?

Please switch it over to managed interrupts so the affinity spreading
happens in a sane way and the interrupts are properly managed on CPU
hotplug.

Thanks,

tglx


Re: [PATCH 4/4][RFC v2] x86/apic: Spread the vectors by choosing the idlest CPU

2017-09-06 Thread Thomas Gleixner
On Thu, 7 Sep 2017, Yu Chen wrote:
> On Wed, Sep 06, 2017 at 10:03:58AM +0200, Thomas Gleixner wrote:
> > Can you please apply the debug patch below, boot the machine and right
> > after login provide the output of
> > 
> > # cat /sys/kernel/debug/tracing/trace
> >
>  kworker/0:2-303   [000]  9.135467: msi_domain_alloc_irqs: dev: 
> :bb:00.0 nvec 1 virq 34
>  kworker/0:2-303   [000]  9.135476: msi_domain_alloc_irqs: dev: 
> :bb:00.0 nvec 1 virq 35
>  kworker/0:2-303   [000]  9.135484: msi_domain_alloc_irqs: dev: 
> :bb:00.0 nvec 1 virq 36



>  kworker/0:2-303   [000]  9.762268: msi_domain_alloc_irqs: dev: 
> :bb:00.3 nvec 1 virq 331
>  kworker/0:2-303   [000]  9.762278: msi_domain_alloc_irqs: dev: 
> :bb:00.3 nvec 1 virq 332
>  kworker/0:2-303   [000]  9.762288: msi_domain_alloc_irqs: dev: 
> :bb:00.3 nvec 1 virq 333

That's 300 vectors.

>  bb:00.[0-3] Ethernet controller: Intel Corporation Device 37d0 (rev 03)
> 
> -+-[:b2]-+-00.0-[b3-bc]00.0-[b4-bc]--+-00.0-[b5-b6]00.0
>  |   |   +-01.0-[b7-b8]00.0
>  |   |   +-02.0-[b9-ba]00.0
>  |   |   \-03.0-[bb-bc]--+-00.0
>  |   |   +-00.1
>  |   |   +-00.2
>  |   |   \-00.3
> 
> and they are using i40e driver, the vectors should be reserved by:
> i40e_probe() ->
>   i40e_init_interrupt_scheme() ->
> i40e_init_msix() ->
>   i40e_reserve_msix_vectors() ->
> pci_enable_msix_range()
> 
> # ls /sys/kernel/debug/irq/irqs
> 0  10   11  13  142  184  217  259  292  31  33
> 337  339  340  342  344  346  348  350  352  354  356
> 358  360  362  364  366  368  370  372  374  376  378
> 380  382  384  386  388  390  392  394  4  6   7  9
> 1  109  12  14  15   224   26   332  335
> 338  34   341  343  345  347  349  351  353  355  357
> 359  361  363  365  367  369  371  373  375  377  379
> 381  383  385  387  389  391  393  395  5  67  8

Out of these 300 interrupts exactly 8 randomly selected ones are actively
used. And the other 292 interrupts are just there because it might need
them in the future when the 32 CPU machine gets magically upgraded to 4096
cores at runtime?

Can the i40e people @intel please fix this waste of resources and sanitize
their interrupt allocation scheme?

Please switch it over to managed interrupts so the affinity spreading
happens in a sane way and the interrupts are properly managed on CPU
hotplug.

Thanks,

tglx


Re: [GIT] Networking

2017-09-06 Thread Linus Torvalds
On Wed, Sep 6, 2017 at 10:40 PM, Luca Coelho  wrote:
>
> This patch is not very important (unless you really like blinking lights
> -- maybe I'll change my mind when the holidays approach :P). so it is
> fine if you just want to revert it for now.
>
> In any case, I'll send a patch fixing this problem soon.

No need to revert if we can get this fixed quickly enough.

I'll leave the fw-31 on my laptop, so that I can continue to use it for now.

Thanks,

   Linus


Re: [GIT] Networking

2017-09-06 Thread Linus Torvalds
On Wed, Sep 6, 2017 at 10:40 PM, Luca Coelho  wrote:
>
> This patch is not very important (unless you really like blinking lights
> -- maybe I'll change my mind when the holidays approach :P). so it is
> fine if you just want to revert it for now.
>
> In any case, I'll send a patch fixing this problem soon.

No need to revert if we can get this fixed quickly enough.

I'll leave the fw-31 on my laptop, so that I can continue to use it for now.

Thanks,

   Linus


[PATCH] security/keys: Adding newline after declaration

2017-09-06 Thread Pushkar Jambhlekar
Fixing checkpatch warning to add newline after declaration

Signed-off-by: Pushkar Jambhlekar 
---
 security/keys/big_key.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/security/keys/big_key.c b/security/keys/big_key.c
index 835c1ab..bc4c6eb 100644
--- a/security/keys/big_key.c
+++ b/security/keys/big_key.c
@@ -93,6 +93,7 @@ static int big_key_crypt(enum big_key_op op, u8 *data, size_t 
datalen, u8 *key)
 {
int ret = -EINVAL;
struct scatterlist sgio;
+
SKCIPHER_REQUEST_ON_STACK(req, big_key_skcipher);
 
if (crypto_skcipher_setkey(big_key_skcipher, key, ENC_KEY_SIZE)) {
-- 
2.7.4



[PATCH] security/keys: Adding newline after declaration

2017-09-06 Thread Pushkar Jambhlekar
Fixing checkpatch warning to add newline after declaration

Signed-off-by: Pushkar Jambhlekar 
---
 security/keys/big_key.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/security/keys/big_key.c b/security/keys/big_key.c
index 835c1ab..bc4c6eb 100644
--- a/security/keys/big_key.c
+++ b/security/keys/big_key.c
@@ -93,6 +93,7 @@ static int big_key_crypt(enum big_key_op op, u8 *data, size_t 
datalen, u8 *key)
 {
int ret = -EINVAL;
struct scatterlist sgio;
+
SKCIPHER_REQUEST_ON_STACK(req, big_key_skcipher);
 
if (crypto_skcipher_setkey(big_key_skcipher, key, ENC_KEY_SIZE)) {
-- 
2.7.4



Re: [GIT] Networking

2017-09-06 Thread Luca Coelho
On Thu, 2017-09-07 at 05:04 +, Coelho, Luciano wrote:
> On Wed, 2017-09-06 at 21:57 -0700, Linus Torvalds wrote:
> > On Wed, Sep 6, 2017 at 9:11 PM, Coelho, Luciano
> >  wrote:
> > > 
> > > This seems to be a problem with backwards-compatibility with FW version
> > > 27.  We are now in version 31[1] and upgrading will probably fix that.
> > 
> > I can confirm that fw version 31 works.
> 
> Great, so I know for sure that this is a backwards-compatibility issue
> with the FW API.
> 
> 
> > > But obviously the driver should not fail miserably like this with
> > > version 27, because it claims to support it still.
> > 
> > Not just "claims to support it", but if it's what is shipped with a
> > fairly recent distro like an up-to-date version of F26, I would really
> > hope that the driver can still work with it.
> 
> I totally agree, we support a bunch of older versions for that exact
> reason.  We just don't really test all the supported versions very
> often.  We should probably change that.
> 
> I'll make sure it still works with version 27.

Okay, I found the offending patch:

commit 7089ae634c50544b29b31faf1a751e8765c8de3b
Author: Johannes Berg 
AuthorDate: Wed Jun 28 16:19:49 2017 +0200
Commit: Luca Coelho 
CommitDate: Wed Aug 9 09:15:32 2017 +0300

iwlwifi: mvm: use firmware LED command where applicable

On devices starting from 8000 series, the host can no longer toggle
the LED through the CSR_LED_REG register, but must do it via the
firmware instead. Add support for this. Note that this means that
the LED cannot be turned on while the firmware is off, so using an
arbitrary LED trigger may not work as expected.

Fixes: 503ab8c56ca0 ("iwlwifi: Add 8000 HW family support")
Signed-off-by: Johannes Berg 
Signed-off-by: Luca Coelho 

Reverting it solves the problem.  We introduced a new command to control
the LED lights and assumed it was available in older FW versions as
well, which turned out not to be the case.

This patch is not very important (unless you really like blinking lights
-- maybe I'll change my mind when the holidays approach :P). so it is
fine if you just want to revert it for now.

In any case, I'll send a patch fixing this problem soon.


--
Cheers,
Luca.


Re: [GIT] Networking

2017-09-06 Thread Luca Coelho
On Thu, 2017-09-07 at 05:04 +, Coelho, Luciano wrote:
> On Wed, 2017-09-06 at 21:57 -0700, Linus Torvalds wrote:
> > On Wed, Sep 6, 2017 at 9:11 PM, Coelho, Luciano
> >  wrote:
> > > 
> > > This seems to be a problem with backwards-compatibility with FW version
> > > 27.  We are now in version 31[1] and upgrading will probably fix that.
> > 
> > I can confirm that fw version 31 works.
> 
> Great, so I know for sure that this is a backwards-compatibility issue
> with the FW API.
> 
> 
> > > But obviously the driver should not fail miserably like this with
> > > version 27, because it claims to support it still.
> > 
> > Not just "claims to support it", but if it's what is shipped with a
> > fairly recent distro like an up-to-date version of F26, I would really
> > hope that the driver can still work with it.
> 
> I totally agree, we support a bunch of older versions for that exact
> reason.  We just don't really test all the supported versions very
> often.  We should probably change that.
> 
> I'll make sure it still works with version 27.

Okay, I found the offending patch:

commit 7089ae634c50544b29b31faf1a751e8765c8de3b
Author: Johannes Berg 
AuthorDate: Wed Jun 28 16:19:49 2017 +0200
Commit: Luca Coelho 
CommitDate: Wed Aug 9 09:15:32 2017 +0300

iwlwifi: mvm: use firmware LED command where applicable

On devices starting from 8000 series, the host can no longer toggle
the LED through the CSR_LED_REG register, but must do it via the
firmware instead. Add support for this. Note that this means that
the LED cannot be turned on while the firmware is off, so using an
arbitrary LED trigger may not work as expected.

Fixes: 503ab8c56ca0 ("iwlwifi: Add 8000 HW family support")
Signed-off-by: Johannes Berg 
Signed-off-by: Luca Coelho 

Reverting it solves the problem.  We introduced a new command to control
the LED lights and assumed it was available in older FW versions as
well, which turned out not to be the case.

This patch is not very important (unless you really like blinking lights
-- maybe I'll change my mind when the holidays approach :P). so it is
fine if you just want to revert it for now.

In any case, I'll send a patch fixing this problem soon.


--
Cheers,
Luca.


[PATCH] arm64: KVM: VHE: reset PSTATE.UAO when switch to host

2017-09-06 Thread Dongjiu Geng
In VHE mode, host kernel runs in the EL2 and can enable
'User Access Override' when fs==KERNEL_DS so that it can
access kernel memory. However, PSTATE.UAO is set to 0 on
an exception taken from EL1 to EL2. Thus when VHE is used
and exception taken from a guest UAO will be disabled and
host will use the incorrect PSTATE.UAO. So check and reset
the PSTATE.UAO when switching to host.

Move the reset PSTATE.PAN on entry to EL2 together with
PSTATE.UAO reset.

Signed-off-by: Dongjiu Geng 
Signed-off-by: Haibin Zhang 
Tested-by: Dongjiu Geng 
---
 arch/arm64/kvm/hyp/entry.S  |  2 --
 arch/arm64/kvm/hyp/switch.c | 12 
 2 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kvm/hyp/entry.S b/arch/arm64/kvm/hyp/entry.S
index 12ee62d..7662ef5 100644
--- a/arch/arm64/kvm/hyp/entry.S
+++ b/arch/arm64/kvm/hyp/entry.S
@@ -96,8 +96,6 @@ ENTRY(__guest_exit)
 
add x1, x1, #VCPU_CONTEXT
 
-   ALTERNATIVE(nop, SET_PSTATE_PAN(1), ARM64_HAS_PAN, CONFIG_ARM64_PAN)
-
// Store the guest regs x2 and x3
stp x2, x3,   [x1, #CPU_XREG_OFFSET(2)]
 
diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
index a733461..715b3941 100644
--- a/arch/arm64/kvm/hyp/switch.c
+++ b/arch/arm64/kvm/hyp/switch.c
@@ -22,6 +22,7 @@
 #include 
 #include 
 #include 
+#include 
 
 static bool __hyp_text __fpsimd_enabled_nvhe(void)
 {
@@ -399,6 +400,17 @@ int __hyp_text __kvm_vcpu_run(struct kvm_vcpu *vcpu)
 
__sysreg_restore_host_state(host_ctxt);
 
+   if (has_vhe()) {
+   /*
+* PSTATE was not saved over guest enter/exit, re-enable
+* any detecte features that might not have been set
+* correctly.
+*/
+   uao_thread_switch(current);
+   asm(ALTERNATIVE("nop", SET_PSTATE_PAN(1),
+   ARM64_HAS_PAN, CONFIG_ARM64_PAN));
+   }
+
if (fp_enabled) {
__fpsimd_save_state(_ctxt->gp_regs.fp_regs);
__fpsimd_restore_state(_ctxt->gp_regs.fp_regs);
-- 
1.8.3.1



[PATCH] arm64: KVM: VHE: reset PSTATE.UAO when switch to host

2017-09-06 Thread Dongjiu Geng
In VHE mode, host kernel runs in the EL2 and can enable
'User Access Override' when fs==KERNEL_DS so that it can
access kernel memory. However, PSTATE.UAO is set to 0 on
an exception taken from EL1 to EL2. Thus when VHE is used
and exception taken from a guest UAO will be disabled and
host will use the incorrect PSTATE.UAO. So check and reset
the PSTATE.UAO when switching to host.

Move the reset PSTATE.PAN on entry to EL2 together with
PSTATE.UAO reset.

Signed-off-by: Dongjiu Geng 
Signed-off-by: Haibin Zhang 
Tested-by: Dongjiu Geng 
---
 arch/arm64/kvm/hyp/entry.S  |  2 --
 arch/arm64/kvm/hyp/switch.c | 12 
 2 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kvm/hyp/entry.S b/arch/arm64/kvm/hyp/entry.S
index 12ee62d..7662ef5 100644
--- a/arch/arm64/kvm/hyp/entry.S
+++ b/arch/arm64/kvm/hyp/entry.S
@@ -96,8 +96,6 @@ ENTRY(__guest_exit)
 
add x1, x1, #VCPU_CONTEXT
 
-   ALTERNATIVE(nop, SET_PSTATE_PAN(1), ARM64_HAS_PAN, CONFIG_ARM64_PAN)
-
// Store the guest regs x2 and x3
stp x2, x3,   [x1, #CPU_XREG_OFFSET(2)]
 
diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
index a733461..715b3941 100644
--- a/arch/arm64/kvm/hyp/switch.c
+++ b/arch/arm64/kvm/hyp/switch.c
@@ -22,6 +22,7 @@
 #include 
 #include 
 #include 
+#include 
 
 static bool __hyp_text __fpsimd_enabled_nvhe(void)
 {
@@ -399,6 +400,17 @@ int __hyp_text __kvm_vcpu_run(struct kvm_vcpu *vcpu)
 
__sysreg_restore_host_state(host_ctxt);
 
+   if (has_vhe()) {
+   /*
+* PSTATE was not saved over guest enter/exit, re-enable
+* any detecte features that might not have been set
+* correctly.
+*/
+   uao_thread_switch(current);
+   asm(ALTERNATIVE("nop", SET_PSTATE_PAN(1),
+   ARM64_HAS_PAN, CONFIG_ARM64_PAN));
+   }
+
if (fp_enabled) {
__fpsimd_save_state(_ctxt->gp_regs.fp_regs);
__fpsimd_restore_state(_ctxt->gp_regs.fp_regs);
-- 
1.8.3.1



Re: [PATCH] [RESEND] drm/stm: fix warning about multiplication in condition

2017-09-06 Thread Archit Taneja

Hi Benjamin,

This should be pushed to drm-misc by you, right?

Thanks,
Archit

On 09/06/2017 06:43 PM, Arnd Bergmann wrote:

gcc-7 complains about multiplying within a condition being
suspicious:

drivers/gpu/drm/stm/dw_mipi_dsi-stm.c: In function 'dsi_pll_get_clkout_khz':
drivers/gpu/drm/stm/dw_mipi_dsi-stm.c:117:10: error: '*' in boolean context, suggest 
'&&' instead [-Werror=int-in-bool-context]

The code here is correct, but can be easily rephrased to make
that more obvious. I also swap out the error handling and the normal
code path for clarity.

Fixes: b0f09a3c69d9 ("drm/stm: Add STM32 DSI controller driver")
Acked-by: Philippe Cornu 
Tested-by: Philippe Cornu 
Signed-off-by: Arnd Bergmann 
---
Originally sent on July 25, but never made it into linux-next.
The warning is currently disabled in mainline, but this seems
to be a legitimate instance, and we may want to turn it back
on in the future.
---
  drivers/gpu/drm/stm/dw_mipi_dsi-stm.c | 8 +---
  1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/stm/dw_mipi_dsi-stm.c 
b/drivers/gpu/drm/stm/dw_mipi_dsi-stm.c
index 568c5d0461ea..e5b6310240fe 100644
--- a/drivers/gpu/drm/stm/dw_mipi_dsi-stm.c
+++ b/drivers/gpu/drm/stm/dw_mipi_dsi-stm.c
@@ -113,11 +113,13 @@ static enum dsi_color dsi_color_from_mipi(enum 
mipi_dsi_pixel_format fmt)
  
  static int dsi_pll_get_clkout_khz(int clkin_khz, int idf, int ndiv, int odf)

  {
+   int divisor = idf * odf;
+
/* prevent from division by 0 */
-   if (idf * odf)
-   return DIV_ROUND_CLOSEST(clkin_khz * ndiv, idf * odf);
+   if (!divisor)
+   return 0;
  
-	return 0;

+   return DIV_ROUND_CLOSEST(clkin_khz * ndiv, divisor);
  }
  
  static int dsi_pll_get_params(int clkin_khz, int clkout_khz,




--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: [PATCH] [RESEND] drm/stm: fix warning about multiplication in condition

2017-09-06 Thread Archit Taneja

Hi Benjamin,

This should be pushed to drm-misc by you, right?

Thanks,
Archit

On 09/06/2017 06:43 PM, Arnd Bergmann wrote:

gcc-7 complains about multiplying within a condition being
suspicious:

drivers/gpu/drm/stm/dw_mipi_dsi-stm.c: In function 'dsi_pll_get_clkout_khz':
drivers/gpu/drm/stm/dw_mipi_dsi-stm.c:117:10: error: '*' in boolean context, suggest 
'&&' instead [-Werror=int-in-bool-context]

The code here is correct, but can be easily rephrased to make
that more obvious. I also swap out the error handling and the normal
code path for clarity.

Fixes: b0f09a3c69d9 ("drm/stm: Add STM32 DSI controller driver")
Acked-by: Philippe Cornu 
Tested-by: Philippe Cornu 
Signed-off-by: Arnd Bergmann 
---
Originally sent on July 25, but never made it into linux-next.
The warning is currently disabled in mainline, but this seems
to be a legitimate instance, and we may want to turn it back
on in the future.
---
  drivers/gpu/drm/stm/dw_mipi_dsi-stm.c | 8 +---
  1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/stm/dw_mipi_dsi-stm.c 
b/drivers/gpu/drm/stm/dw_mipi_dsi-stm.c
index 568c5d0461ea..e5b6310240fe 100644
--- a/drivers/gpu/drm/stm/dw_mipi_dsi-stm.c
+++ b/drivers/gpu/drm/stm/dw_mipi_dsi-stm.c
@@ -113,11 +113,13 @@ static enum dsi_color dsi_color_from_mipi(enum 
mipi_dsi_pixel_format fmt)
  
  static int dsi_pll_get_clkout_khz(int clkin_khz, int idf, int ndiv, int odf)

  {
+   int divisor = idf * odf;
+
/* prevent from division by 0 */
-   if (idf * odf)
-   return DIV_ROUND_CLOSEST(clkin_khz * ndiv, idf * odf);
+   if (!divisor)
+   return 0;
  
-	return 0;

+   return DIV_ROUND_CLOSEST(clkin_khz * ndiv, divisor);
  }
  
  static int dsi_pll_get_params(int clkin_khz, int clkout_khz,




--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: linux-next: build failure after merge of the akpm-current tree

2017-09-06 Thread Stephen Rothwell
Hi all,

On Wed, 2 Aug 2017 16:31:45 +1000 Stephen Rothwell  
wrote:
>
> On Wed, 2 Aug 2017 15:45:54 +1000 Stephen Rothwell  
> wrote:
> >
> > On Tue, 01 Aug 2017 09:08:01 -0400 "Zi Yan"  wrote:  
> > >
> > > I found two possible fixes.
> > > 
> > > 1. This uses C++ zero initializer, GCC is OK with it.
> > > I tested with GCC 4.9.3 (has the initialization bug) and GCC 6.4.0.
> > > 
> > > --- a/include/linux/swapops.h~a
> > > +++ a/include/linux/swapops.h
> > > @@ -217,7 +217,7 @@ static inline swp_entry_t pmd_to_swp_ent
> > > 
> > >  static inline pmd_t swp_entry_to_pmd(swp_entry_t entry)
> > >  {
> > > - return (pmd_t){ 0 };
> > > + return (pmd_t){};
> > >  }
> > 
> > I have done that for today ... please decide which is best (or find
> > something better - maybe every platform really needs to have a __pmd()
> > definition) and submit a real fix patch to Andrew.  
> 
> OK, that failed for my compiler (gcc 5.2.0) like this:
> 
> In file included from mm/vmscan.c:55:0:
> include/linux/swapops.h: In function 'swp_entry_to_pmd':
> include/linux/swapops.h:226:16: error: empty scalar initializer
>   return (pmd_t){};
> ^
> include/linux/swapops.h:226:16: note: (near initialization for '(anonymous)')

This has reappeared today :-( (for the arm multi_v7_defconfig build at
least)

> So I used the other idea (on top of Andrew's current tree):

The fix patch now looks like this:

From: Stephen Rothwell 
Date: Wed, 2 Aug 2017 15:55:02 +1000
Subject: [PATCH] mm-thp-enable-thp-migration-in-generic-path-fix-fix

Signed-off-by: Stephen Rothwell 
---
 include/linux/swapops.h | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/include/linux/swapops.h b/include/linux/swapops.h
index 45b092aa6419..61cffa148a79 100644
--- a/include/linux/swapops.h
+++ b/include/linux/swapops.h
@@ -223,7 +223,9 @@ static inline swp_entry_t pmd_to_swp_entry(pmd_t pmd)
 
 static inline pmd_t swp_entry_to_pmd(swp_entry_t entry)
 {
-   return (pmd_t){};
+   pmd_t e;
+   memset(, 0, sizeof(pmd_t));
+   return e;
 }
 
 static inline int is_pmd_migration_entry(pmd_t pmd)

-- 
Cheers,
Stephen Rothwell


Re: linux-next: build failure after merge of the akpm-current tree

2017-09-06 Thread Stephen Rothwell
Hi all,

On Wed, 2 Aug 2017 16:31:45 +1000 Stephen Rothwell  
wrote:
>
> On Wed, 2 Aug 2017 15:45:54 +1000 Stephen Rothwell  
> wrote:
> >
> > On Tue, 01 Aug 2017 09:08:01 -0400 "Zi Yan"  wrote:  
> > >
> > > I found two possible fixes.
> > > 
> > > 1. This uses C++ zero initializer, GCC is OK with it.
> > > I tested with GCC 4.9.3 (has the initialization bug) and GCC 6.4.0.
> > > 
> > > --- a/include/linux/swapops.h~a
> > > +++ a/include/linux/swapops.h
> > > @@ -217,7 +217,7 @@ static inline swp_entry_t pmd_to_swp_ent
> > > 
> > >  static inline pmd_t swp_entry_to_pmd(swp_entry_t entry)
> > >  {
> > > - return (pmd_t){ 0 };
> > > + return (pmd_t){};
> > >  }
> > 
> > I have done that for today ... please decide which is best (or find
> > something better - maybe every platform really needs to have a __pmd()
> > definition) and submit a real fix patch to Andrew.  
> 
> OK, that failed for my compiler (gcc 5.2.0) like this:
> 
> In file included from mm/vmscan.c:55:0:
> include/linux/swapops.h: In function 'swp_entry_to_pmd':
> include/linux/swapops.h:226:16: error: empty scalar initializer
>   return (pmd_t){};
> ^
> include/linux/swapops.h:226:16: note: (near initialization for '(anonymous)')

This has reappeared today :-( (for the arm multi_v7_defconfig build at
least)

> So I used the other idea (on top of Andrew's current tree):

The fix patch now looks like this:

From: Stephen Rothwell 
Date: Wed, 2 Aug 2017 15:55:02 +1000
Subject: [PATCH] mm-thp-enable-thp-migration-in-generic-path-fix-fix

Signed-off-by: Stephen Rothwell 
---
 include/linux/swapops.h | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/include/linux/swapops.h b/include/linux/swapops.h
index 45b092aa6419..61cffa148a79 100644
--- a/include/linux/swapops.h
+++ b/include/linux/swapops.h
@@ -223,7 +223,9 @@ static inline swp_entry_t pmd_to_swp_entry(pmd_t pmd)
 
 static inline pmd_t swp_entry_to_pmd(swp_entry_t entry)
 {
-   return (pmd_t){};
+   pmd_t e;
+   memset(, 0, sizeof(pmd_t));
+   return e;
 }
 
 static inline int is_pmd_migration_entry(pmd_t pmd)

-- 
Cheers,
Stephen Rothwell


Re: [PATCH v8 01/13] x86/apic: Construct a selector for the interrupt delivery mode

2017-09-06 Thread Baoquan He
On 09/07/17 at 12:19pm, Dou Liyang wrote:
> Hi Baoquan
> 
> I am wordy one ah:
> our target is checking if BIOS supports APIC, no matter what
> type(separated/integrated) it is. if not, go to PIC mode.
> 
> Let‘s discuss the original logic and the smp_found_config,
> then take about your code.
> 
> The existing logic is:
> 
>   if (!boot_cpu_has(X86_FEATURE_APIC) && !smp_found_config) ...(1)
>   return -1;
> 
>   if (!boot_cpu_has(X86_FEATURE_APIC) &&
>   APIC_INTEGRATED(boot_cpu_apic_version)) { ...(2)
>   pr_err();
> 
> why smp_found_config has to be checked in (1)?
> 
> Because, In case of discrete (pretty old) apics we may not set
> X86_FEATURE_APIC bit in cpuid, with 82489DX we can't rely on apic
> feature bit retrieved via cpuid(boot_cpu_has(X86_FEATURE_APIC)).[1]
> So we assume that if SMP configuration is found from MP table
> (smp_found_config = 1) in above case, there maybe a separated
> chip in our pc.
> 
> After passing the check of (1), we in (2), check whether local APIC
> is detected or not, If we have a BIOS bug.
> 
> [1] Commit 8312136fa8b0("x86, apic: Fix missed handling of discrete apics")

Hmm, sounds reasonable. Just a sentence to describe it could be better.

> 
> At 09/06/2017 06:17 PM, Baoquan He wrote:
> > Hi Dou,
> > 
> > On 08/28/17 at 11:20am, Dou Liyang wrote:
> > > +static int __init apic_intr_mode_select(void)
> > > +{
> > > + /* Check kernel option */
> > > + if (disable_apic) {
> > > + pr_info("APIC disabled via kernel command line\n");
> > > + return APIC_PIC;
> > > + }
> > > +
> > 
> > I am not very familiar with cpu registers, not sure if we can adjust
> > below code flow as:
> > 
> > /* If APIC is integrated, check local APIC only */
> > if (lapic_is_integrated() && !boot_cpu_has(X86_FEATURE_APIC)) {
> > disable_apic = 1;
> > pr_info("APIC disabled by BIOS\n");
> > return APIC_PIC;
> > }
> > 
> > /* If APIC is on a separate chip, check if smp_found_config is found*/
> > if (!lapic_is_integrated() && !smp_found_config) {
> > disable_apic = 1;
> > return APIC_PIC;
> > }
> 
> Yes, Awesome, we first consider it from APIC register space, then
> the BOIS and software configration. let me do more investigation.
> 
> I rewrite it based on you, any comments will welcome.
> 
>   /* If APIC is not integrated, check if SMP configuration is
>* found from MP table. If not too, no 82489DX. switch to
>* PIC mode
>*
>* Else APIC is integrated, check if the BIOS allows local APIC
>*
>*/
>   if (!lapic_is_integrated()) {
>   if (!smp_found_config) {
>   disable_apic = 1;
>   return APIC_PIC;
>   }
>   } else if(!boot_cpu_has(X86_FEATURE_APIC)) {
>   disable_apic = 1;
>   pr_info("APIC disabled by BIOS\n");
>   return APIC_PIC;
>   }
>   }

Yeah, it's fine to me. At least the logic looks more understandable.

> 
> BTW, As the macro APIC_INTEGRATED(x) has already wrapped by
> CONFIG_X86_32, I will cleanup the lapic_is_integrated() for readablity
> like that:

Yes, looks good. There's duplicate judgement of X86_64 in
lapic_is_integrated.

> 
> diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
> index 7834f73..63b3ae9 100644
> --- a/arch/x86/kernel/apic/apic.c
> +++ b/arch/x86/kernel/apic/apic.c
> @@ -211,11 +211,7 @@ static inline int lapic_get_version(void)
>   */
>  static inline int lapic_is_integrated(void)
>  {
> -#ifdef CONFIG_X86_64
> -   return 1;
> -#else
> return APIC_INTEGRATED(lapic_get_version());
> -#endif
>  }
> 
> 
> Do you think so. ;-)
> 
> 
> Thanks,
>   dou.
> 
> 
> >  Now, I haven't think of why smp_found_config has to be
> >  checked here.
> > 
> > In this way, we don't need the CONFIG_X86_64 checking since it's
> > contained in lapic_is_integrated() already. And the checking is obvious
> > for understanding. Just not very sure if the checking is adequate.
> > 
> > Just my personal opinion.
> > 
> > > + /* Check BIOS */
> > > +#ifdef CONFIG_X86_64
> > > + /* On 64-bit, the APIC must be integrated, Check local APIC only */
> > > + if (!boot_cpu_has(X86_FEATURE_APIC)) {
> > > + disable_apic = 1;
> > > + pr_info("APIC disabled by BIOS\n");
> > > + return APIC_PIC;
> > > + }
> > > +#else
> > > + /*
> > > +  * On 32-bit, check whether there is a separate chip or integrated
> > > +  * APIC
> > > +  */
> > > +
> > > + /* Has a separate chip ? */
> > > + if (!boot_cpu_has(X86_FEATURE_APIC) && !smp_found_config) {
> > > + disable_apic = 1;
> > > +
> > > + return APIC_PIC;
> > > + }
> > > +
> > > + /* Has a local APIC ? */
> > > + if (!boot_cpu_has(X86_FEATURE_APIC) &&
> > > + APIC_INTEGRATED(boot_cpu_apic_version)) {
> > > +  

Re: [PATCH v8 01/13] x86/apic: Construct a selector for the interrupt delivery mode

2017-09-06 Thread Baoquan He
On 09/07/17 at 12:19pm, Dou Liyang wrote:
> Hi Baoquan
> 
> I am wordy one ah:
> our target is checking if BIOS supports APIC, no matter what
> type(separated/integrated) it is. if not, go to PIC mode.
> 
> Let‘s discuss the original logic and the smp_found_config,
> then take about your code.
> 
> The existing logic is:
> 
>   if (!boot_cpu_has(X86_FEATURE_APIC) && !smp_found_config) ...(1)
>   return -1;
> 
>   if (!boot_cpu_has(X86_FEATURE_APIC) &&
>   APIC_INTEGRATED(boot_cpu_apic_version)) { ...(2)
>   pr_err();
> 
> why smp_found_config has to be checked in (1)?
> 
> Because, In case of discrete (pretty old) apics we may not set
> X86_FEATURE_APIC bit in cpuid, with 82489DX we can't rely on apic
> feature bit retrieved via cpuid(boot_cpu_has(X86_FEATURE_APIC)).[1]
> So we assume that if SMP configuration is found from MP table
> (smp_found_config = 1) in above case, there maybe a separated
> chip in our pc.
> 
> After passing the check of (1), we in (2), check whether local APIC
> is detected or not, If we have a BIOS bug.
> 
> [1] Commit 8312136fa8b0("x86, apic: Fix missed handling of discrete apics")

Hmm, sounds reasonable. Just a sentence to describe it could be better.

> 
> At 09/06/2017 06:17 PM, Baoquan He wrote:
> > Hi Dou,
> > 
> > On 08/28/17 at 11:20am, Dou Liyang wrote:
> > > +static int __init apic_intr_mode_select(void)
> > > +{
> > > + /* Check kernel option */
> > > + if (disable_apic) {
> > > + pr_info("APIC disabled via kernel command line\n");
> > > + return APIC_PIC;
> > > + }
> > > +
> > 
> > I am not very familiar with cpu registers, not sure if we can adjust
> > below code flow as:
> > 
> > /* If APIC is integrated, check local APIC only */
> > if (lapic_is_integrated() && !boot_cpu_has(X86_FEATURE_APIC)) {
> > disable_apic = 1;
> > pr_info("APIC disabled by BIOS\n");
> > return APIC_PIC;
> > }
> > 
> > /* If APIC is on a separate chip, check if smp_found_config is found*/
> > if (!lapic_is_integrated() && !smp_found_config) {
> > disable_apic = 1;
> > return APIC_PIC;
> > }
> 
> Yes, Awesome, we first consider it from APIC register space, then
> the BOIS and software configration. let me do more investigation.
> 
> I rewrite it based on you, any comments will welcome.
> 
>   /* If APIC is not integrated, check if SMP configuration is
>* found from MP table. If not too, no 82489DX. switch to
>* PIC mode
>*
>* Else APIC is integrated, check if the BIOS allows local APIC
>*
>*/
>   if (!lapic_is_integrated()) {
>   if (!smp_found_config) {
>   disable_apic = 1;
>   return APIC_PIC;
>   }
>   } else if(!boot_cpu_has(X86_FEATURE_APIC)) {
>   disable_apic = 1;
>   pr_info("APIC disabled by BIOS\n");
>   return APIC_PIC;
>   }
>   }

Yeah, it's fine to me. At least the logic looks more understandable.

> 
> BTW, As the macro APIC_INTEGRATED(x) has already wrapped by
> CONFIG_X86_32, I will cleanup the lapic_is_integrated() for readablity
> like that:

Yes, looks good. There's duplicate judgement of X86_64 in
lapic_is_integrated.

> 
> diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
> index 7834f73..63b3ae9 100644
> --- a/arch/x86/kernel/apic/apic.c
> +++ b/arch/x86/kernel/apic/apic.c
> @@ -211,11 +211,7 @@ static inline int lapic_get_version(void)
>   */
>  static inline int lapic_is_integrated(void)
>  {
> -#ifdef CONFIG_X86_64
> -   return 1;
> -#else
> return APIC_INTEGRATED(lapic_get_version());
> -#endif
>  }
> 
> 
> Do you think so. ;-)
> 
> 
> Thanks,
>   dou.
> 
> 
> >  Now, I haven't think of why smp_found_config has to be
> >  checked here.
> > 
> > In this way, we don't need the CONFIG_X86_64 checking since it's
> > contained in lapic_is_integrated() already. And the checking is obvious
> > for understanding. Just not very sure if the checking is adequate.
> > 
> > Just my personal opinion.
> > 
> > > + /* Check BIOS */
> > > +#ifdef CONFIG_X86_64
> > > + /* On 64-bit, the APIC must be integrated, Check local APIC only */
> > > + if (!boot_cpu_has(X86_FEATURE_APIC)) {
> > > + disable_apic = 1;
> > > + pr_info("APIC disabled by BIOS\n");
> > > + return APIC_PIC;
> > > + }
> > > +#else
> > > + /*
> > > +  * On 32-bit, check whether there is a separate chip or integrated
> > > +  * APIC
> > > +  */
> > > +
> > > + /* Has a separate chip ? */
> > > + if (!boot_cpu_has(X86_FEATURE_APIC) && !smp_found_config) {
> > > + disable_apic = 1;
> > > +
> > > + return APIC_PIC;
> > > + }
> > > +
> > > + /* Has a local APIC ? */
> > > + if (!boot_cpu_has(X86_FEATURE_APIC) &&
> > > + APIC_INTEGRATED(boot_cpu_apic_version)) {
> > > +  

Re: [PATCH 3/3] drm/exynos/gsc: Add rotation hardware limits of gscaler

2017-09-06 Thread Hoegeun Kwon

On 09/04/2017 03:19 PM, Hoegeun Kwon wrote:

On 09/01/2017 04:31 PM, Marek Szyprowski wrote:

Hi Hoegeun,

On 2017-09-01 03:47, Hoegeun Kwon wrote:

The gscaler has hardware rotation limits that need to be imported from
dts. Parse them and add them to the property list.

The rotation hardware limits are related to the cropped source size.
When swap occurs, use rot_max size instead of crop_max size.

Also the scaling limits are related to post size, use pos size to
check the limits.

Signed-off-by: Hoegeun Kwon 
---
  drivers/gpu/drm/exynos/exynos_drm_gsc.c | 63 
+

  include/uapi/drm/exynos_drm.h   |  2 ++
  2 files changed, 42 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/exynos/exynos_drm_gsc.c 
b/drivers/gpu/drm/exynos/exynos_drm_gsc.c

index 0506b2b..dd9b057 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_gsc.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_gsc.c
@@ -1401,6 +1401,23 @@ static int gsc_ippdrv_check_property(struct 
device *dev,

  bool swap;
  int i;
  +config = >config[EXYNOS_DRM_OPS_DST];
+
+/* check for degree */
+switch (config->degree) {
+case EXYNOS_DRM_DEGREE_90:
+case EXYNOS_DRM_DEGREE_270:
+swap = true;
+break;
+case EXYNOS_DRM_DEGREE_0:
+case EXYNOS_DRM_DEGREE_180:
+swap = false;
+break;
+default:
+DRM_ERROR("invalid degree.\n");
+goto err_property;
+}
+
  for_each_ipp_ops(i) {
  if ((i == EXYNOS_DRM_OPS_SRC) &&
  (property->cmd == IPP_CMD_WB))
@@ -1416,21 +1433,6 @@ static int gsc_ippdrv_check_property(struct 
device *dev,

  goto err_property;
  }
  -/* check for degree */
-switch (config->degree) {
-case EXYNOS_DRM_DEGREE_90:
-case EXYNOS_DRM_DEGREE_270:
-swap = true;
-break;
-case EXYNOS_DRM_DEGREE_0:
-case EXYNOS_DRM_DEGREE_180:
-swap = false;
-break;
-default:
-DRM_ERROR("invalid degree.\n");
-goto err_property;
-}
-
  /* check for buffer bound */
  if ((pos->x + pos->w > sz->hsize) ||
  (pos->y + pos->h > sz->vsize)) {
@@ -1438,21 +1440,27 @@ static int gsc_ippdrv_check_property(struct 
device *dev,

  goto err_property;
  }
  +/*
+ * The rotation hardware limits are related to the cropped
+ * source size. So use rot_max size to check the limits when
+ * swap happens. And also the scaling limits are related to 
pos

+ * size, use pos size to check the limits.
+ */
  /* check for crop */
  if ((i == EXYNOS_DRM_OPS_SRC) && (pp->crop)) {
  if (swap) {
  if ((pos->h < pp->crop_min.hsize) ||
-(sz->vsize > pp->crop_max.hsize) ||
+(pos->h > pp->rot_max.hsize) ||
  (pos->w < pp->crop_min.vsize) ||
-(sz->hsize > pp->crop_max.vsize)) {
+(pos->w > pp->rot_max.vsize)) {
  DRM_ERROR("out of crop size.\n");
  goto err_property;
  }
  } else {
  if ((pos->w < pp->crop_min.hsize) ||
-(sz->hsize > pp->crop_max.hsize) ||
+(pos->w > pp->crop_max.hsize) ||
  (pos->h < pp->crop_min.vsize) ||
-(sz->vsize > pp->crop_max.vsize)) {
+(pos->h > pp->crop_max.vsize)) {
  DRM_ERROR("out of crop size.\n");
  goto err_property;
  }
@@ -1463,17 +1471,17 @@ static int gsc_ippdrv_check_property(struct 
device *dev,

  if ((i == EXYNOS_DRM_OPS_DST) && (pp->scale)) {
  if (swap) {
  if ((pos->h < pp->scale_min.hsize) ||
-(sz->vsize > pp->scale_max.hsize) ||
+(pos->h > pp->scale_max.hsize) ||
  (pos->w < pp->scale_min.vsize) ||
-(sz->hsize > pp->scale_max.vsize)) {
+(pos->w > pp->scale_max.vsize)) {
  DRM_ERROR("out of scale size.\n");
  goto err_property;
  }
  } else {
  if ((pos->w < pp->scale_min.hsize) ||
-(sz->hsize > pp->scale_max.hsize) ||
+(pos->w > pp->scale_max.hsize) ||
  (pos->h < pp->scale_min.vsize) ||
-(sz->vsize > pp->scale_max.vsize)) {
+(pos->h > pp->scale_max.vsize)) {
  DRM_ERROR("out of scale size.\n");
  goto err_property;
  }
@@ -1676,6 +1684,15 @@ static int gsc_probe(struct platform_device 
*pdev)

  dev_warn(dev, "failed to get system register.\n");
  ctx->sysreg = NULL;

Re: [PATCH 3/3] drm/exynos/gsc: Add rotation hardware limits of gscaler

2017-09-06 Thread Hoegeun Kwon

On 09/04/2017 03:19 PM, Hoegeun Kwon wrote:

On 09/01/2017 04:31 PM, Marek Szyprowski wrote:

Hi Hoegeun,

On 2017-09-01 03:47, Hoegeun Kwon wrote:

The gscaler has hardware rotation limits that need to be imported from
dts. Parse them and add them to the property list.

The rotation hardware limits are related to the cropped source size.
When swap occurs, use rot_max size instead of crop_max size.

Also the scaling limits are related to post size, use pos size to
check the limits.

Signed-off-by: Hoegeun Kwon 
---
  drivers/gpu/drm/exynos/exynos_drm_gsc.c | 63 
+

  include/uapi/drm/exynos_drm.h   |  2 ++
  2 files changed, 42 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/exynos/exynos_drm_gsc.c 
b/drivers/gpu/drm/exynos/exynos_drm_gsc.c

index 0506b2b..dd9b057 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_gsc.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_gsc.c
@@ -1401,6 +1401,23 @@ static int gsc_ippdrv_check_property(struct 
device *dev,

  bool swap;
  int i;
  +config = >config[EXYNOS_DRM_OPS_DST];
+
+/* check for degree */
+switch (config->degree) {
+case EXYNOS_DRM_DEGREE_90:
+case EXYNOS_DRM_DEGREE_270:
+swap = true;
+break;
+case EXYNOS_DRM_DEGREE_0:
+case EXYNOS_DRM_DEGREE_180:
+swap = false;
+break;
+default:
+DRM_ERROR("invalid degree.\n");
+goto err_property;
+}
+
  for_each_ipp_ops(i) {
  if ((i == EXYNOS_DRM_OPS_SRC) &&
  (property->cmd == IPP_CMD_WB))
@@ -1416,21 +1433,6 @@ static int gsc_ippdrv_check_property(struct 
device *dev,

  goto err_property;
  }
  -/* check for degree */
-switch (config->degree) {
-case EXYNOS_DRM_DEGREE_90:
-case EXYNOS_DRM_DEGREE_270:
-swap = true;
-break;
-case EXYNOS_DRM_DEGREE_0:
-case EXYNOS_DRM_DEGREE_180:
-swap = false;
-break;
-default:
-DRM_ERROR("invalid degree.\n");
-goto err_property;
-}
-
  /* check for buffer bound */
  if ((pos->x + pos->w > sz->hsize) ||
  (pos->y + pos->h > sz->vsize)) {
@@ -1438,21 +1440,27 @@ static int gsc_ippdrv_check_property(struct 
device *dev,

  goto err_property;
  }
  +/*
+ * The rotation hardware limits are related to the cropped
+ * source size. So use rot_max size to check the limits when
+ * swap happens. And also the scaling limits are related to 
pos

+ * size, use pos size to check the limits.
+ */
  /* check for crop */
  if ((i == EXYNOS_DRM_OPS_SRC) && (pp->crop)) {
  if (swap) {
  if ((pos->h < pp->crop_min.hsize) ||
-(sz->vsize > pp->crop_max.hsize) ||
+(pos->h > pp->rot_max.hsize) ||
  (pos->w < pp->crop_min.vsize) ||
-(sz->hsize > pp->crop_max.vsize)) {
+(pos->w > pp->rot_max.vsize)) {
  DRM_ERROR("out of crop size.\n");
  goto err_property;
  }
  } else {
  if ((pos->w < pp->crop_min.hsize) ||
-(sz->hsize > pp->crop_max.hsize) ||
+(pos->w > pp->crop_max.hsize) ||
  (pos->h < pp->crop_min.vsize) ||
-(sz->vsize > pp->crop_max.vsize)) {
+(pos->h > pp->crop_max.vsize)) {
  DRM_ERROR("out of crop size.\n");
  goto err_property;
  }
@@ -1463,17 +1471,17 @@ static int gsc_ippdrv_check_property(struct 
device *dev,

  if ((i == EXYNOS_DRM_OPS_DST) && (pp->scale)) {
  if (swap) {
  if ((pos->h < pp->scale_min.hsize) ||
-(sz->vsize > pp->scale_max.hsize) ||
+(pos->h > pp->scale_max.hsize) ||
  (pos->w < pp->scale_min.vsize) ||
-(sz->hsize > pp->scale_max.vsize)) {
+(pos->w > pp->scale_max.vsize)) {
  DRM_ERROR("out of scale size.\n");
  goto err_property;
  }
  } else {
  if ((pos->w < pp->scale_min.hsize) ||
-(sz->hsize > pp->scale_max.hsize) ||
+(pos->w > pp->scale_max.hsize) ||
  (pos->h < pp->scale_min.vsize) ||
-(sz->vsize > pp->scale_max.vsize)) {
+(pos->h > pp->scale_max.vsize)) {
  DRM_ERROR("out of scale size.\n");
  goto err_property;
  }
@@ -1676,6 +1684,15 @@ static int gsc_probe(struct platform_device 
*pdev)

  dev_warn(dev, "failed to get system register.\n");
  ctx->sysreg = NULL;
  }
+
+ret = 

Re: [PATCH v4 1/3] arm: npcm: add basic support for Nuvoton BMCs

2017-09-06 Thread Brendan Higgins
On Wed, Sep 6, 2017 at 2:04 PM, Avi Fishman  wrote:
> Sorry for sending again some mailing list rejected it due to HTML
> involved (althoug I tried Plain Text), Trying again.
>
> 2017-09-06 11:07 GMT+03:00 Brendan Higgins :
>> Adds basic support for the Nuvoton NPCM750 BMC.
>>
>> Signed-off-by: Brendan Higgins 
>> ---
>>  arch/arm/Kconfig |  2 +
>>  arch/arm/Makefile|  1 +
>>  arch/arm/mach-npcm/Kconfig   | 50 +
>>  arch/arm/mach-npcm/Makefile  |  3 ++
>>  arch/arm/mach-npcm/headsmp.S | 17 +
>>  arch/arm/mach-npcm/npcm7xx.c | 34 +
>>  arch/arm/mach-npcm/platsmp.c | 89 
>> 
>>  7 files changed, 196 insertions(+)
>>  create mode 100644 arch/arm/mach-npcm/Kconfig
>>  create mode 100644 arch/arm/mach-npcm/Makefile
>>  create mode 100644 arch/arm/mach-npcm/headsmp.S
>>  create mode 100644 arch/arm/mach-npcm/npcm7xx.c
>>  create mode 100644 arch/arm/mach-npcm/platsmp.c
>>
>> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
>> index 61a0cb15067e..05543f1cfbde 100644
>> --- a/arch/arm/Kconfig
>> +++ b/arch/arm/Kconfig
>> @@ -782,6 +782,8 @@ source "arch/arm/mach-netx/Kconfig"
>>
>>  source "arch/arm/mach-nomadik/Kconfig"
>>
>> +source "arch/arm/mach-npcm/Kconfig"
>> +
>>  source "arch/arm/mach-nspire/Kconfig"
>>
>>  source "arch/arm/plat-omap/Kconfig"
>> diff --git a/arch/arm/Makefile b/arch/arm/Makefile
>> index 47d3a1ab08d2..60ca50c7d762 100644
>> --- a/arch/arm/Makefile
>> +++ b/arch/arm/Makefile
>> @@ -191,6 +191,7 @@ machine-$(CONFIG_ARCH_MEDIATEK) += mediatek
>>  machine-$(CONFIG_ARCH_MXS) += mxs
>>  machine-$(CONFIG_ARCH_NETX)+= netx
>>  machine-$(CONFIG_ARCH_NOMADIK) += nomadik
>> +machine-$(CONFIG_ARCH_NPCM)+= npcm
>>  machine-$(CONFIG_ARCH_NSPIRE)  += nspire
>>  machine-$(CONFIG_ARCH_OXNAS)   += oxnas
>>  machine-$(CONFIG_ARCH_OMAP1)   += omap1
>> diff --git a/arch/arm/mach-npcm/Kconfig b/arch/arm/mach-npcm/Kconfig
>> new file mode 100644
>> index ..d47061855439
>> --- /dev/null
>> +++ b/arch/arm/mach-npcm/Kconfig
>> @@ -0,0 +1,50 @@
>> +menuconfig ARCH_NPCM
>> +   bool "Nuvoton NPCM Architecture"
>> +   select ARCH_REQUIRE_GPIOLIB
>> +   select USE_OF
>> +   select PINCTRL
>> +   select PINCTRL_NPCM7XX
>> +
>> +if ARCH_NPCM
>> +
>> +comment "NPCMX50 CPU type"
>> +
>> +config CPU_NPCM750
>> +   depends on ARCH_NPCM && ARCH_MULTI_V7
>> +   bool "Support for NPCM750 BMC CPU (Poleg)"
>> +   select CACHE_L2X0
>> +   select CPU_V7
>> +   select ARM_GIC
>> +   select HAVE_SMP
>> +   select SMP
>> +   select SMP_ON_UP
>> +   select HAVE_ARM_SCU
>> +   select HAVE_ARM_TWD if SMP
>> +   select ARM_ERRATA_458693
>> +   select ARM_ERRATA_720789
>> +   select ARM_ERRATA_742231
>> +   select ARM_ERRATA_754322
>> +   select ARM_ERRATA_764369
>> +   select ARM_ERRATA_794072
>> +   select PL310_ERRATA_588369
>> +   select PL310_ERRATA_727915
>> +   select USB_EHCI_ROOT_HUB_TT
>> +   select USB_ARCH_HAS_HCD
>> +   select USB_ARCH_HAS_EHCI
>> +   select USB_EHCI_HCD
>> +   select USB_ARCH_HAS_OHCI
>> +   select USB_OHCI_HCD
>> +   select USB
>> +   select FIQ
>> +   select CPU_USE_DOMAINS
>> +   select GENERIC_CLOCKEVENTS
>> +   select CLKDEV_LOOKUP
>> +   select COMMON_CLK if OF
>> +   select NPCM750_TIMER
>> +   select MFD_SYSCON
>> +   help
>> + Support for NPCM750 BMC CPU (Poleg).
>> +
>> + Nuvoton NPCM750 BMC based on the Cortex A9.
>> +
>> +endif
>> diff --git a/arch/arm/mach-npcm/Makefile b/arch/arm/mach-npcm/Makefile
>> new file mode 100644
>> index ..78416055b854
>> --- /dev/null
>> +++ b/arch/arm/mach-npcm/Makefile
>> @@ -0,0 +1,3 @@
>> +AFLAGS_headsmp.o   += -march=armv7-a
>> +
>> +obj-$(CONFIG_CPU_NPCM750)  += npcm7xx.o platsmp.o headsmp.o
>> diff --git a/arch/arm/mach-npcm/headsmp.S b/arch/arm/mach-npcm/headsmp.S
>> new file mode 100644
>> index ..9fccbbd49ed4
>> --- /dev/null
>> +++ b/arch/arm/mach-npcm/headsmp.S
>> @@ -0,0 +1,17 @@
>> +/*
>> + * Copyright 2017 Google, Inc.
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License version 2 as
>> + * published by the Free Software Foundation.
>> + */
>> +
>> +#include 
>> +#include 
>> +#include 
>> +
>> +ENTRY(npcm7xx_secondary_startup)
>> +   safe_svcmode_maskall r0
>
> I saw you  answered to Florian Fainelli that the BootRom is not
> starting the secondary CPU in SVC mode. Are you sure? In our
> engineering npcm7xx revision, Z1, the BootRom indeed started to run it
> in IRQ mode but we fixed it in the production version, A1, (quite long
> time ago), I hope you didn't get an EB with 

Re: [PATCH v4 1/3] arm: npcm: add basic support for Nuvoton BMCs

2017-09-06 Thread Brendan Higgins
On Wed, Sep 6, 2017 at 2:04 PM, Avi Fishman  wrote:
> Sorry for sending again some mailing list rejected it due to HTML
> involved (althoug I tried Plain Text), Trying again.
>
> 2017-09-06 11:07 GMT+03:00 Brendan Higgins :
>> Adds basic support for the Nuvoton NPCM750 BMC.
>>
>> Signed-off-by: Brendan Higgins 
>> ---
>>  arch/arm/Kconfig |  2 +
>>  arch/arm/Makefile|  1 +
>>  arch/arm/mach-npcm/Kconfig   | 50 +
>>  arch/arm/mach-npcm/Makefile  |  3 ++
>>  arch/arm/mach-npcm/headsmp.S | 17 +
>>  arch/arm/mach-npcm/npcm7xx.c | 34 +
>>  arch/arm/mach-npcm/platsmp.c | 89 
>> 
>>  7 files changed, 196 insertions(+)
>>  create mode 100644 arch/arm/mach-npcm/Kconfig
>>  create mode 100644 arch/arm/mach-npcm/Makefile
>>  create mode 100644 arch/arm/mach-npcm/headsmp.S
>>  create mode 100644 arch/arm/mach-npcm/npcm7xx.c
>>  create mode 100644 arch/arm/mach-npcm/platsmp.c
>>
>> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
>> index 61a0cb15067e..05543f1cfbde 100644
>> --- a/arch/arm/Kconfig
>> +++ b/arch/arm/Kconfig
>> @@ -782,6 +782,8 @@ source "arch/arm/mach-netx/Kconfig"
>>
>>  source "arch/arm/mach-nomadik/Kconfig"
>>
>> +source "arch/arm/mach-npcm/Kconfig"
>> +
>>  source "arch/arm/mach-nspire/Kconfig"
>>
>>  source "arch/arm/plat-omap/Kconfig"
>> diff --git a/arch/arm/Makefile b/arch/arm/Makefile
>> index 47d3a1ab08d2..60ca50c7d762 100644
>> --- a/arch/arm/Makefile
>> +++ b/arch/arm/Makefile
>> @@ -191,6 +191,7 @@ machine-$(CONFIG_ARCH_MEDIATEK) += mediatek
>>  machine-$(CONFIG_ARCH_MXS) += mxs
>>  machine-$(CONFIG_ARCH_NETX)+= netx
>>  machine-$(CONFIG_ARCH_NOMADIK) += nomadik
>> +machine-$(CONFIG_ARCH_NPCM)+= npcm
>>  machine-$(CONFIG_ARCH_NSPIRE)  += nspire
>>  machine-$(CONFIG_ARCH_OXNAS)   += oxnas
>>  machine-$(CONFIG_ARCH_OMAP1)   += omap1
>> diff --git a/arch/arm/mach-npcm/Kconfig b/arch/arm/mach-npcm/Kconfig
>> new file mode 100644
>> index ..d47061855439
>> --- /dev/null
>> +++ b/arch/arm/mach-npcm/Kconfig
>> @@ -0,0 +1,50 @@
>> +menuconfig ARCH_NPCM
>> +   bool "Nuvoton NPCM Architecture"
>> +   select ARCH_REQUIRE_GPIOLIB
>> +   select USE_OF
>> +   select PINCTRL
>> +   select PINCTRL_NPCM7XX
>> +
>> +if ARCH_NPCM
>> +
>> +comment "NPCMX50 CPU type"
>> +
>> +config CPU_NPCM750
>> +   depends on ARCH_NPCM && ARCH_MULTI_V7
>> +   bool "Support for NPCM750 BMC CPU (Poleg)"
>> +   select CACHE_L2X0
>> +   select CPU_V7
>> +   select ARM_GIC
>> +   select HAVE_SMP
>> +   select SMP
>> +   select SMP_ON_UP
>> +   select HAVE_ARM_SCU
>> +   select HAVE_ARM_TWD if SMP
>> +   select ARM_ERRATA_458693
>> +   select ARM_ERRATA_720789
>> +   select ARM_ERRATA_742231
>> +   select ARM_ERRATA_754322
>> +   select ARM_ERRATA_764369
>> +   select ARM_ERRATA_794072
>> +   select PL310_ERRATA_588369
>> +   select PL310_ERRATA_727915
>> +   select USB_EHCI_ROOT_HUB_TT
>> +   select USB_ARCH_HAS_HCD
>> +   select USB_ARCH_HAS_EHCI
>> +   select USB_EHCI_HCD
>> +   select USB_ARCH_HAS_OHCI
>> +   select USB_OHCI_HCD
>> +   select USB
>> +   select FIQ
>> +   select CPU_USE_DOMAINS
>> +   select GENERIC_CLOCKEVENTS
>> +   select CLKDEV_LOOKUP
>> +   select COMMON_CLK if OF
>> +   select NPCM750_TIMER
>> +   select MFD_SYSCON
>> +   help
>> + Support for NPCM750 BMC CPU (Poleg).
>> +
>> + Nuvoton NPCM750 BMC based on the Cortex A9.
>> +
>> +endif
>> diff --git a/arch/arm/mach-npcm/Makefile b/arch/arm/mach-npcm/Makefile
>> new file mode 100644
>> index ..78416055b854
>> --- /dev/null
>> +++ b/arch/arm/mach-npcm/Makefile
>> @@ -0,0 +1,3 @@
>> +AFLAGS_headsmp.o   += -march=armv7-a
>> +
>> +obj-$(CONFIG_CPU_NPCM750)  += npcm7xx.o platsmp.o headsmp.o
>> diff --git a/arch/arm/mach-npcm/headsmp.S b/arch/arm/mach-npcm/headsmp.S
>> new file mode 100644
>> index ..9fccbbd49ed4
>> --- /dev/null
>> +++ b/arch/arm/mach-npcm/headsmp.S
>> @@ -0,0 +1,17 @@
>> +/*
>> + * Copyright 2017 Google, Inc.
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License version 2 as
>> + * published by the Free Software Foundation.
>> + */
>> +
>> +#include 
>> +#include 
>> +#include 
>> +
>> +ENTRY(npcm7xx_secondary_startup)
>> +   safe_svcmode_maskall r0
>
> I saw you  answered to Florian Fainelli that the BootRom is not
> starting the secondary CPU in SVC mode. Are you sure? In our
> engineering npcm7xx revision, Z1, the BootRom indeed started to run it
> in IRQ mode but we fixed it in the production version, A1, (quite long
> time ago), I hope you didn't get an EB with Z1 you can check that in
> BootBlock console print:
>> ADC CLK is set to 

Re: [PATCH 1/2] pidmap(2)

2017-09-06 Thread Djalal Harouni
Hi Alexey,

On Thu, Sep 7, 2017 at 4:04 AM, Andy Lutomirski  wrote:
> On Wed, Sep 6, 2017 at 2:04 AM, Alexey Dobriyan  wrote:
>> On 9/6/17, Randy Dunlap  wrote:
>>> On 09/05/17 15:53, Andrew Morton wrote:
[...]
>>>
>>> also, I expect that the tiny kernel people will want kconfig options for
>>> these syscalls.
>>
>> We'll add it but the question if it is a good idea. Ideally these system 
>> calls
>> should be mandatory and /proc optional.
>>
>> $ size kernel/pidmap.o fs/fdmap.o
>>textdata bss dec hex filename
>> 560   0   0 560 230 kernel/pidmap.o
>> 617   0   0 617 269 fs/fdmap.o
>
> After much discussion at LPC/KS last year, I thought the idea was to
> try to speed up /proc rather than replacing it outright.  The two
> specific ideas I recall were:
>
> 1. Add a syscall like readfileat() that you can use to, in a single
> operation, open, read, and close a /proc file (or other file).  This
> should vastly reduce locking and RCU overhead.
>
> 2. Add a /proc file that has a nice binary format for task info.  (nl_attr?)
>
> I don't see why pidmap() deserves to be significantly faster than getdents().
>
> Also, a pidmap() syscall like this inherently bypasses any security
> restrictions implied by the way that /proc is mounted.  It can respect
> hidepid, but hidepid (as a per-namespace concept) is an enormous turd
> that badly needs to be deprecated, and Djalal is working on exactly
> that.

Yes as noted by Andy, me and Alexey Gladkov are working on modernizing
procfs [1] and to reduce/remove ties within pid namespaces which has lot
of problems now.

We just picked the task again, and this was the result of discussion
with Andy some months ago, on how to improve hidepid, but also how to
improve procfs in general, so we can add other mechanisms to hide or return
NULL on other /proc/_file_not_needed_by_containers_  or
/proc/_specific_module_files_ everything that is not virtualized , or mount only
some specific view of the whole /proc API this will also be used by containers.
This also should make it hard for attackers since we are planning to have
a backward compatible options on how to better treat some of these files in
regard of some namespaces.

The syscall or readfileat() for one operation is a nice addition
definitively. But
in general it would be better to treat /proc as a filesystem and not add other
specific interfaces that may abstract it with pidns, as it is the situation now
which make it from userspace perspective: hard to use especially for security
context.

Alexey, could you please Cc'us on future, thank you very much!


[1] https://lkml.org/lkml/2017/4/25/282

-- 
tixxdz


Re: [PATCH 1/2] pidmap(2)

2017-09-06 Thread Djalal Harouni
Hi Alexey,

On Thu, Sep 7, 2017 at 4:04 AM, Andy Lutomirski  wrote:
> On Wed, Sep 6, 2017 at 2:04 AM, Alexey Dobriyan  wrote:
>> On 9/6/17, Randy Dunlap  wrote:
>>> On 09/05/17 15:53, Andrew Morton wrote:
[...]
>>>
>>> also, I expect that the tiny kernel people will want kconfig options for
>>> these syscalls.
>>
>> We'll add it but the question if it is a good idea. Ideally these system 
>> calls
>> should be mandatory and /proc optional.
>>
>> $ size kernel/pidmap.o fs/fdmap.o
>>textdata bss dec hex filename
>> 560   0   0 560 230 kernel/pidmap.o
>> 617   0   0 617 269 fs/fdmap.o
>
> After much discussion at LPC/KS last year, I thought the idea was to
> try to speed up /proc rather than replacing it outright.  The two
> specific ideas I recall were:
>
> 1. Add a syscall like readfileat() that you can use to, in a single
> operation, open, read, and close a /proc file (or other file).  This
> should vastly reduce locking and RCU overhead.
>
> 2. Add a /proc file that has a nice binary format for task info.  (nl_attr?)
>
> I don't see why pidmap() deserves to be significantly faster than getdents().
>
> Also, a pidmap() syscall like this inherently bypasses any security
> restrictions implied by the way that /proc is mounted.  It can respect
> hidepid, but hidepid (as a per-namespace concept) is an enormous turd
> that badly needs to be deprecated, and Djalal is working on exactly
> that.

Yes as noted by Andy, me and Alexey Gladkov are working on modernizing
procfs [1] and to reduce/remove ties within pid namespaces which has lot
of problems now.

We just picked the task again, and this was the result of discussion
with Andy some months ago, on how to improve hidepid, but also how to
improve procfs in general, so we can add other mechanisms to hide or return
NULL on other /proc/_file_not_needed_by_containers_  or
/proc/_specific_module_files_ everything that is not virtualized , or mount only
some specific view of the whole /proc API this will also be used by containers.
This also should make it hard for attackers since we are planning to have
a backward compatible options on how to better treat some of these files in
regard of some namespaces.

The syscall or readfileat() for one operation is a nice addition
definitively. But
in general it would be better to treat /proc as a filesystem and not add other
specific interfaces that may abstract it with pidns, as it is the situation now
which make it from userspace perspective: hard to use especially for security
context.

Alexey, could you please Cc'us on future, thank you very much!


[1] https://lkml.org/lkml/2017/4/25/282

-- 
tixxdz


Re: [GIT] Networking

2017-09-06 Thread Coelho, Luciano
On Wed, 2017-09-06 at 21:57 -0700, Linus Torvalds wrote:
> On Wed, Sep 6, 2017 at 9:11 PM, Coelho, Luciano
>  wrote:
> > 
> > This seems to be a problem with backwards-compatibility with FW version
> > 27.  We are now in version 31[1] and upgrading will probably fix that.
> 
> I can confirm that fw version 31 works.

Great, so I know for sure that this is a backwards-compatibility issue
with the FW API.


> > But obviously the driver should not fail miserably like this with
> > version 27, because it claims to support it still.
> 
> Not just "claims to support it", but if it's what is shipped with a
> fairly recent distro like an up-to-date version of F26, I would really
> hope that the driver can still work with it.

I totally agree, we support a bunch of older versions for that exact
reason.  We just don't really test all the supported versions very
often.  We should probably change that.

I'll make sure it still works with version 27.

--
Cheers,
Luca.

Re: [GIT] Networking

2017-09-06 Thread Coelho, Luciano
On Wed, 2017-09-06 at 21:57 -0700, Linus Torvalds wrote:
> On Wed, Sep 6, 2017 at 9:11 PM, Coelho, Luciano
>  wrote:
> > 
> > This seems to be a problem with backwards-compatibility with FW version
> > 27.  We are now in version 31[1] and upgrading will probably fix that.
> 
> I can confirm that fw version 31 works.

Great, so I know for sure that this is a backwards-compatibility issue
with the FW API.


> > But obviously the driver should not fail miserably like this with
> > version 27, because it claims to support it still.
> 
> Not just "claims to support it", but if it's what is shipped with a
> fairly recent distro like an up-to-date version of F26, I would really
> hope that the driver can still work with it.

I totally agree, we support a bunch of older versions for that exact
reason.  We just don't really test all the supported versions very
often.  We should probably change that.

I'll make sure it still works with version 27.

--
Cheers,
Luca.

Re: [GIT] Networking

2017-09-06 Thread Linus Torvalds
On Wed, Sep 6, 2017 at 9:11 PM, Coelho, Luciano
 wrote:
>
> This seems to be a problem with backwards-compatibility with FW version
> 27.  We are now in version 31[1] and upgrading will probably fix that.

I can confirm that fw version 31 works.

> But obviously the driver should not fail miserably like this with
> version 27, because it claims to support it still.

Not just "claims to support it", but if it's what is shipped with a
fairly recent distro like an up-to-date version of F26, I would really
hope that the driver can still work with it.

> I'm looking into this now and will provide a fix asap.

Thanks,

  Linus


Re: [GIT] Networking

2017-09-06 Thread Linus Torvalds
On Wed, Sep 6, 2017 at 9:11 PM, Coelho, Luciano
 wrote:
>
> This seems to be a problem with backwards-compatibility with FW version
> 27.  We are now in version 31[1] and upgrading will probably fix that.

I can confirm that fw version 31 works.

> But obviously the driver should not fail miserably like this with
> version 27, because it claims to support it still.

Not just "claims to support it", but if it's what is shipped with a
fairly recent distro like an up-to-date version of F26, I would really
hope that the driver can still work with it.

> I'm looking into this now and will provide a fix asap.

Thanks,

  Linus


[PATCH] drivers/staging/rtl8188eu: cleanup crc32_init logic

2017-09-06 Thread Pushkar Jambhlekar
crc32_init is using unnecessary else condition. Cleaningup this function

Signed-off-by: Pushkar Jambhlekar 
---
 drivers/staging/rtl8188eu/core/rtw_security.c | 42 +--
 1 file changed, 21 insertions(+), 21 deletions(-)

diff --git a/drivers/staging/rtl8188eu/core/rtw_security.c 
b/drivers/staging/rtl8188eu/core/rtw_security.c
index b283a490..9d17204 100644
--- a/drivers/staging/rtl8188eu/core/rtw_security.c
+++ b/drivers/staging/rtl8188eu/core/rtw_security.c
@@ -94,29 +94,29 @@ static u8 crc32_reverseBit(u8 data)
 
 static void crc32_init(void)
 {
-   if (bcrc32initialized == 1) {
+   if (bcrc32initialized == 1)
return;
-   } else {
-   int i, j;
-   u32 c;
-   u8 *p = (u8 *), *p1;
-   u8 k;
-
-   c = 0x1234;
-
-   for (i = 0; i < 256; ++i) {
-   k = crc32_reverseBit((u8)i);
-   for (c = ((u32)k) << 24, j = 8; j > 0; --j)
-   c = c & 0x8000 ? (c << 1) ^ CRC32_POLY : (c 
<< 1);
-   p1 = (u8 *)_table[i];
-
-   p1[0] = crc32_reverseBit(p[3]);
-   p1[1] = crc32_reverseBit(p[2]);
-   p1[2] = crc32_reverseBit(p[1]);
-   p1[3] = crc32_reverseBit(p[0]);
-   }
-   bcrc32initialized = 1;
+
+   int i, j;
+   u32 c;
+   u8 *p = (u8 *), *p1;
+   u8 k;
+
+   c = 0x1234;
+
+   for (i = 0; i < 256; ++i) {
+   k = crc32_reverseBit((u8)i);
+   for (c = ((u32)k) << 24, j = 8; j > 0; --j)
+   c = c & 0x8000 ? (c << 1) ^ CRC32_POLY : (c << 1);
+   p1 = (u8 *)_table[i];
+
+   p1[0] = crc32_reverseBit(p[3]);
+   p1[1] = crc32_reverseBit(p[2]);
+   p1[2] = crc32_reverseBit(p[1]);
+   p1[3] = crc32_reverseBit(p[0]);
}
+
+   bcrc32initialized = 1;
 }
 
 static __le32 getcrc32(u8 *buf, int len)
-- 
2.7.4



[PATCH] drivers/staging/rtl8188eu: cleanup crc32_init logic

2017-09-06 Thread Pushkar Jambhlekar
crc32_init is using unnecessary else condition. Cleaningup this function

Signed-off-by: Pushkar Jambhlekar 
---
 drivers/staging/rtl8188eu/core/rtw_security.c | 42 +--
 1 file changed, 21 insertions(+), 21 deletions(-)

diff --git a/drivers/staging/rtl8188eu/core/rtw_security.c 
b/drivers/staging/rtl8188eu/core/rtw_security.c
index b283a490..9d17204 100644
--- a/drivers/staging/rtl8188eu/core/rtw_security.c
+++ b/drivers/staging/rtl8188eu/core/rtw_security.c
@@ -94,29 +94,29 @@ static u8 crc32_reverseBit(u8 data)
 
 static void crc32_init(void)
 {
-   if (bcrc32initialized == 1) {
+   if (bcrc32initialized == 1)
return;
-   } else {
-   int i, j;
-   u32 c;
-   u8 *p = (u8 *), *p1;
-   u8 k;
-
-   c = 0x1234;
-
-   for (i = 0; i < 256; ++i) {
-   k = crc32_reverseBit((u8)i);
-   for (c = ((u32)k) << 24, j = 8; j > 0; --j)
-   c = c & 0x8000 ? (c << 1) ^ CRC32_POLY : (c 
<< 1);
-   p1 = (u8 *)_table[i];
-
-   p1[0] = crc32_reverseBit(p[3]);
-   p1[1] = crc32_reverseBit(p[2]);
-   p1[2] = crc32_reverseBit(p[1]);
-   p1[3] = crc32_reverseBit(p[0]);
-   }
-   bcrc32initialized = 1;
+
+   int i, j;
+   u32 c;
+   u8 *p = (u8 *), *p1;
+   u8 k;
+
+   c = 0x1234;
+
+   for (i = 0; i < 256; ++i) {
+   k = crc32_reverseBit((u8)i);
+   for (c = ((u32)k) << 24, j = 8; j > 0; --j)
+   c = c & 0x8000 ? (c << 1) ^ CRC32_POLY : (c << 1);
+   p1 = (u8 *)_table[i];
+
+   p1[0] = crc32_reverseBit(p[3]);
+   p1[1] = crc32_reverseBit(p[2]);
+   p1[2] = crc32_reverseBit(p[1]);
+   p1[3] = crc32_reverseBit(p[0]);
}
+
+   bcrc32initialized = 1;
 }
 
 static __le32 getcrc32(u8 *buf, int len)
-- 
2.7.4



Re: [PATCH 06/14] md/bcache: Use %pS printk format for direct addresses

2017-09-06 Thread Coly Li
On 2017/9/7 上午4:27, Helge Deller wrote:
> Use the %pS instead of the %pF printk format specifier for printing symbols
> from direct addresses. This is needed for the ia64, ppc64 and parisc64
> architectures.
> 
> Signed-off-by: Helge Deller 
> Cc: linux-bca...@vger.kernel.org
> Cc: linux-r...@vger.kernel.org
> ---
>  drivers/md/bcache/closure.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/md/bcache/closure.c b/drivers/md/bcache/closure.c
> index 864e673..0b0c9bc 100644
> --- a/drivers/md/bcache/closure.c
> +++ b/drivers/md/bcache/closure.c
> @@ -175,7 +175,7 @@ static int debug_seq_show(struct seq_file *f, void *data)
>   list_for_each_entry(cl, _list, all) {
>   int r = atomic_read(>remaining);
>  
> - seq_printf(f, "%p: %pF -> %pf p %p r %i ",
> + seq_printf(f, "%p: %pS -> %pf p %p r %i ",
>  cl, (void *) cl->ip, cl->fn, cl->parent,
>  r & CLOSURE_REMAINING_MASK);
>  
> @@ -187,7 +187,7 @@ static int debug_seq_show(struct seq_file *f, void *data)
>  r & CLOSURE_SLEEPING ? "Sl" : "");
>  
>   if (r & CLOSURE_WAITING)
> - seq_printf(f, " W %pF\n",
> + seq_printf(f, " W %pS\n",
>  (void *) cl->waiting_on);
>  
>   seq_printf(f, "\n");
> 

It is unclear to me, that if %pF is used, on ia64/ppc64/parisc64 a
function descriptor conversion happens, what negative effect exactly
takes place ? Or you just want to unify the out put format, and get rid
of extra conversion information ?

Thanks.

-- 
Coly Li


Re: [PATCH 06/14] md/bcache: Use %pS printk format for direct addresses

2017-09-06 Thread Coly Li
On 2017/9/7 上午4:27, Helge Deller wrote:
> Use the %pS instead of the %pF printk format specifier for printing symbols
> from direct addresses. This is needed for the ia64, ppc64 and parisc64
> architectures.
> 
> Signed-off-by: Helge Deller 
> Cc: linux-bca...@vger.kernel.org
> Cc: linux-r...@vger.kernel.org
> ---
>  drivers/md/bcache/closure.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/md/bcache/closure.c b/drivers/md/bcache/closure.c
> index 864e673..0b0c9bc 100644
> --- a/drivers/md/bcache/closure.c
> +++ b/drivers/md/bcache/closure.c
> @@ -175,7 +175,7 @@ static int debug_seq_show(struct seq_file *f, void *data)
>   list_for_each_entry(cl, _list, all) {
>   int r = atomic_read(>remaining);
>  
> - seq_printf(f, "%p: %pF -> %pf p %p r %i ",
> + seq_printf(f, "%p: %pS -> %pf p %p r %i ",
>  cl, (void *) cl->ip, cl->fn, cl->parent,
>  r & CLOSURE_REMAINING_MASK);
>  
> @@ -187,7 +187,7 @@ static int debug_seq_show(struct seq_file *f, void *data)
>  r & CLOSURE_SLEEPING ? "Sl" : "");
>  
>   if (r & CLOSURE_WAITING)
> - seq_printf(f, " W %pF\n",
> + seq_printf(f, " W %pS\n",
>  (void *) cl->waiting_on);
>  
>   seq_printf(f, "\n");
> 

It is unclear to me, that if %pF is used, on ia64/ppc64/parisc64 a
function descriptor conversion happens, what negative effect exactly
takes place ? Or you just want to unify the out put format, and get rid
of extra conversion information ?

Thanks.

-- 
Coly Li


Re: [PATCH 3/3] f2fs: support flexible inline xattr size

2017-09-06 Thread Jaegeuk Kim
On 09/06, Chao Yu wrote:
> Hi Jaegeuk,
> 
> Do we have time to test and stabilize this new feature before merge window?

Let's try this in the next merge window. It's too tight to test now. :)

> 
> Thanks,
> 
> On 2017/9/4 18:58, Chao Yu wrote:
> > Now, in product, more and more features based on file encryption were
> > introduced, their demand of xattr space is increasing, however, inline
> > xattr has fixed-size of 200 bytes, once inline xattr space is full, new
> > increased xattr data would occupy additional xattr block which may bring
> > us more space usage and performance regression during persisting.
> > 
> > In order to resolve above issue, it's better to expand inline xattr size
> > flexibly according to user's requirement.
> > 
> > So this patch introduces new filesystem feature 'flexible inline xattr',
> > and new mount option 'inline_xattr_size=%u', once mkfs enables the
> > feature, we can use the option to make f2fs supporting flexible inline
> > xattr size.
> > 
> > To support this feature, we add extra attribute i_inline_xattr_size in
> > inode layout, indicating that how many space inline xattr borrows from
> > block address mapping space in inode layout, by this, we can easily
> > locate and store flexible-sized inline xattr data in inode.
> > 
> > Inode disk layout:
> >   +--+
> >   | .i_mode  |
> >   | ...  |
> >   | .i_ext   |
> >   +--+
> >   | .i_extra_isize   |
> >   | .i_inline_xattr_size |---+
> >   | ...  |   |
> >   +--+   |
> >   | .i_addr  |   |
> >   |  - block address or  |   |
> >   |  - inline data   |   |
> >   +--+<---+  v
> >   |inline xattr  |+---inline xattr range
> >   +--+<---+
> >   | .i_nid   |
> >   +--+
> >   |   node_footer|
> >   | (nid, ino, offset)   |
> >   +--+
> > 
> > Signed-off-by: Chao Yu 
> > ---
> >  fs/f2fs/f2fs.h  | 43 ++-
> >  fs/f2fs/inode.c | 12 
> >  fs/f2fs/namei.c |  6 ++
> >  fs/f2fs/node.c  |  4 ++--
> >  fs/f2fs/super.c | 32 +++-
> >  fs/f2fs/sysfs.c |  7 +++
> >  fs/f2fs/xattr.c | 18 +-
> >  include/linux/f2fs_fs.h |  5 +++--
> >  8 files changed, 100 insertions(+), 27 deletions(-)
> > 
> > diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
> > index 1f24ad4ca1bb..168ad51b7fb9 100644
> > --- a/fs/f2fs/f2fs.h
> > +++ b/fs/f2fs/f2fs.h
> > @@ -93,6 +93,7 @@ extern char *fault_name[FAULT_MAX];
> >  #define F2FS_MOUNT_GRPQUOTA0x0010
> >  #define F2FS_MOUNT_PRJQUOTA0x0020
> >  #define F2FS_MOUNT_QUOTA   0x0040
> > +#define F2FS_MOUNT_INLINE_XATTR_SIZE   0x0080
> >  
> >  #define clear_opt(sbi, option) ((sbi)->mount_opt.opt &= 
> > ~F2FS_MOUNT_##option)
> >  #define set_opt(sbi, option)   ((sbi)->mount_opt.opt |= 
> > F2FS_MOUNT_##option)
> > @@ -118,6 +119,7 @@ struct f2fs_mount_info {
> >  #define F2FS_FEATURE_EXTRA_ATTR0x0008
> >  #define F2FS_FEATURE_PRJQUOTA  0x0010
> >  #define F2FS_FEATURE_INODE_CHKSUM  0x0020
> > +#define F2FS_FEATURE_FLEXIBLE_INLINE_XATTR 0x0040
> >  
> >  #define F2FS_HAS_FEATURE(sb, mask) \
> > ((F2FS_SB(sb)->raw_super->feature & cpu_to_le32(mask)) != 0)
> > @@ -379,11 +381,14 @@ struct f2fs_flush_device {
> >  
> >  /* for inline stuff */
> >  #define DEF_INLINE_RESERVED_SIZE   1
> > +#define DEF_MIN_INLINE_SIZE1
> >  static inline int get_extra_isize(struct inode *inode);
> > -#define MAX_INLINE_DATA(inode) (sizeof(__le32) * \
> > -   (CUR_ADDRS_PER_INODE(inode) - \
> > -   DEF_INLINE_RESERVED_SIZE - \
> > -   F2FS_INLINE_XATTR_ADDRS))
> > +static inline int get_inline_xattr_addrs(struct inode *inode);
> > +#define F2FS_INLINE_XATTR_ADDRS(inode) get_inline_xattr_addrs(inode)
> > +#define MAX_INLINE_DATA(inode) (sizeof(__le32) *   
> > \
> > +   (CUR_ADDRS_PER_INODE(inode) -   \
> > +   F2FS_INLINE_XATTR_ADDRS(inode) -\
> > +   DEF_INLINE_RESERVED_SIZE))
> >  
> >  /* for inline dir */
> >  #define NR_INLINE_DENTRY(inode)(MAX_INLINE_DATA(inode) * BITS_PER_BYTE 
> > / \
> > @@ -592,6 +597,7 @@ struct f2fs_inode_info {
> >  
> > int i_extra_isize;  /* size of extra space located in 
> > i_addr */
> > kprojid_t i_projid; /* id for project quota */
> > +   int i_inline_xattr_size;/* inline xattr size */
> >  };
> >  
> >  static inline void get_extent_info(struct extent_info 

Re: [PATCH 3/3] f2fs: support flexible inline xattr size

2017-09-06 Thread Jaegeuk Kim
On 09/06, Chao Yu wrote:
> Hi Jaegeuk,
> 
> Do we have time to test and stabilize this new feature before merge window?

Let's try this in the next merge window. It's too tight to test now. :)

> 
> Thanks,
> 
> On 2017/9/4 18:58, Chao Yu wrote:
> > Now, in product, more and more features based on file encryption were
> > introduced, their demand of xattr space is increasing, however, inline
> > xattr has fixed-size of 200 bytes, once inline xattr space is full, new
> > increased xattr data would occupy additional xattr block which may bring
> > us more space usage and performance regression during persisting.
> > 
> > In order to resolve above issue, it's better to expand inline xattr size
> > flexibly according to user's requirement.
> > 
> > So this patch introduces new filesystem feature 'flexible inline xattr',
> > and new mount option 'inline_xattr_size=%u', once mkfs enables the
> > feature, we can use the option to make f2fs supporting flexible inline
> > xattr size.
> > 
> > To support this feature, we add extra attribute i_inline_xattr_size in
> > inode layout, indicating that how many space inline xattr borrows from
> > block address mapping space in inode layout, by this, we can easily
> > locate and store flexible-sized inline xattr data in inode.
> > 
> > Inode disk layout:
> >   +--+
> >   | .i_mode  |
> >   | ...  |
> >   | .i_ext   |
> >   +--+
> >   | .i_extra_isize   |
> >   | .i_inline_xattr_size |---+
> >   | ...  |   |
> >   +--+   |
> >   | .i_addr  |   |
> >   |  - block address or  |   |
> >   |  - inline data   |   |
> >   +--+<---+  v
> >   |inline xattr  |+---inline xattr range
> >   +--+<---+
> >   | .i_nid   |
> >   +--+
> >   |   node_footer|
> >   | (nid, ino, offset)   |
> >   +--+
> > 
> > Signed-off-by: Chao Yu 
> > ---
> >  fs/f2fs/f2fs.h  | 43 ++-
> >  fs/f2fs/inode.c | 12 
> >  fs/f2fs/namei.c |  6 ++
> >  fs/f2fs/node.c  |  4 ++--
> >  fs/f2fs/super.c | 32 +++-
> >  fs/f2fs/sysfs.c |  7 +++
> >  fs/f2fs/xattr.c | 18 +-
> >  include/linux/f2fs_fs.h |  5 +++--
> >  8 files changed, 100 insertions(+), 27 deletions(-)
> > 
> > diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
> > index 1f24ad4ca1bb..168ad51b7fb9 100644
> > --- a/fs/f2fs/f2fs.h
> > +++ b/fs/f2fs/f2fs.h
> > @@ -93,6 +93,7 @@ extern char *fault_name[FAULT_MAX];
> >  #define F2FS_MOUNT_GRPQUOTA0x0010
> >  #define F2FS_MOUNT_PRJQUOTA0x0020
> >  #define F2FS_MOUNT_QUOTA   0x0040
> > +#define F2FS_MOUNT_INLINE_XATTR_SIZE   0x0080
> >  
> >  #define clear_opt(sbi, option) ((sbi)->mount_opt.opt &= 
> > ~F2FS_MOUNT_##option)
> >  #define set_opt(sbi, option)   ((sbi)->mount_opt.opt |= 
> > F2FS_MOUNT_##option)
> > @@ -118,6 +119,7 @@ struct f2fs_mount_info {
> >  #define F2FS_FEATURE_EXTRA_ATTR0x0008
> >  #define F2FS_FEATURE_PRJQUOTA  0x0010
> >  #define F2FS_FEATURE_INODE_CHKSUM  0x0020
> > +#define F2FS_FEATURE_FLEXIBLE_INLINE_XATTR 0x0040
> >  
> >  #define F2FS_HAS_FEATURE(sb, mask) \
> > ((F2FS_SB(sb)->raw_super->feature & cpu_to_le32(mask)) != 0)
> > @@ -379,11 +381,14 @@ struct f2fs_flush_device {
> >  
> >  /* for inline stuff */
> >  #define DEF_INLINE_RESERVED_SIZE   1
> > +#define DEF_MIN_INLINE_SIZE1
> >  static inline int get_extra_isize(struct inode *inode);
> > -#define MAX_INLINE_DATA(inode) (sizeof(__le32) * \
> > -   (CUR_ADDRS_PER_INODE(inode) - \
> > -   DEF_INLINE_RESERVED_SIZE - \
> > -   F2FS_INLINE_XATTR_ADDRS))
> > +static inline int get_inline_xattr_addrs(struct inode *inode);
> > +#define F2FS_INLINE_XATTR_ADDRS(inode) get_inline_xattr_addrs(inode)
> > +#define MAX_INLINE_DATA(inode) (sizeof(__le32) *   
> > \
> > +   (CUR_ADDRS_PER_INODE(inode) -   \
> > +   F2FS_INLINE_XATTR_ADDRS(inode) -\
> > +   DEF_INLINE_RESERVED_SIZE))
> >  
> >  /* for inline dir */
> >  #define NR_INLINE_DENTRY(inode)(MAX_INLINE_DATA(inode) * BITS_PER_BYTE 
> > / \
> > @@ -592,6 +597,7 @@ struct f2fs_inode_info {
> >  
> > int i_extra_isize;  /* size of extra space located in 
> > i_addr */
> > kprojid_t i_projid; /* id for project quota */
> > +   int i_inline_xattr_size;/* inline xattr size */
> >  };
> >  
> >  static inline void get_extent_info(struct extent_info *ext,
> > @@ -1043,6 

[PATCH] f2fs: make get_lock_data_page to handle encrypted inode

2017-09-06 Thread Jaegeuk Kim
This patch refactors get_lock_data_page() to handle encryption case directly.
In order to do that, it introduces common f2fs_submit_page_read().

Signed-off-by: Jaegeuk Kim 
---
 fs/f2fs/data.c | 109 +++--
 1 file changed, 51 insertions(+), 58 deletions(-)

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index ee6801fdbdec..95f30fb6 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -456,6 +456,53 @@ int f2fs_submit_page_write(struct f2fs_io_info *fio)
return err;
 }
 
+static struct bio *f2fs_grab_read_bio(struct inode *inode, block_t blkaddr,
+unsigned nr_pages)
+{
+   struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
+   struct fscrypt_ctx *ctx = NULL;
+   struct bio *bio;
+
+   if (f2fs_encrypted_file(inode)) {
+   ctx = fscrypt_get_ctx(inode, GFP_NOFS);
+   if (IS_ERR(ctx))
+   return ERR_CAST(ctx);
+
+   /* wait the page to be moved by cleaning */
+   f2fs_wait_on_block_writeback(sbi, blkaddr);
+   }
+
+   bio = bio_alloc(GFP_KERNEL, min_t(int, nr_pages, BIO_MAX_PAGES));
+   if (!bio) {
+   if (ctx)
+   fscrypt_release_ctx(ctx);
+   return ERR_PTR(-ENOMEM);
+   }
+   f2fs_target_device(sbi, blkaddr, bio);
+   bio->bi_end_io = f2fs_read_end_io;
+   bio->bi_private = ctx;
+   bio_set_op_attrs(bio, REQ_OP_READ, 0);
+
+   return bio;
+}
+
+/* This can handle encryption stuffs */
+static int f2fs_submit_page_read(struct inode *inode, struct page *page,
+   block_t blkaddr)
+{
+   struct bio *bio = f2fs_grab_read_bio(inode, blkaddr, 1);
+
+   if (IS_ERR(bio))
+   return PTR_ERR(bio);
+
+   if (bio_add_page(bio, page, PAGE_SIZE, 0) < PAGE_SIZE) {
+   bio_put(bio);
+   return -EFAULT;
+   }
+   __submit_bio(F2FS_I_SB(inode), bio, DATA);
+   return 0;
+}
+
 static void __set_data_blkaddr(struct dnode_of_data *dn)
 {
struct f2fs_node *rn = F2FS_NODE(dn->node_page);
@@ -573,16 +620,6 @@ struct page *get_read_data_page(struct inode *inode, 
pgoff_t index,
struct page *page;
struct extent_info ei = {0,0,0};
int err;
-   struct f2fs_io_info fio = {
-   .sbi = F2FS_I_SB(inode),
-   .type = DATA,
-   .op = REQ_OP_READ,
-   .op_flags = op_flags,
-   .encrypted_page = NULL,
-   };
-
-   if (f2fs_encrypted_file(inode))
-   return read_mapping_page(mapping, index, NULL);
 
page = f2fs_grab_cache_page(mapping, index, for_write);
if (!page)
@@ -623,9 +660,7 @@ struct page *get_read_data_page(struct inode *inode, 
pgoff_t index,
return page;
}
 
-   fio.new_blkaddr = fio.old_blkaddr = dn.data_blkaddr;
-   fio.page = page;
-   err = f2fs_submit_page_bio();
+   err = f2fs_submit_page_read(inode, page, dn.data_blkaddr);
if (err)
goto put_err;
return page;
@@ -1150,35 +1185,6 @@ int f2fs_fiemap(struct inode *inode, struct 
fiemap_extent_info *fieinfo,
return ret;
 }
 
-static struct bio *f2fs_grab_bio(struct inode *inode, block_t blkaddr,
-unsigned nr_pages)
-{
-   struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
-   struct fscrypt_ctx *ctx = NULL;
-   struct bio *bio;
-
-   if (f2fs_encrypted_file(inode)) {
-   ctx = fscrypt_get_ctx(inode, GFP_NOFS);
-   if (IS_ERR(ctx))
-   return ERR_CAST(ctx);
-
-   /* wait the page to be moved by cleaning */
-   f2fs_wait_on_block_writeback(sbi, blkaddr);
-   }
-
-   bio = bio_alloc(GFP_KERNEL, min_t(int, nr_pages, BIO_MAX_PAGES));
-   if (!bio) {
-   if (ctx)
-   fscrypt_release_ctx(ctx);
-   return ERR_PTR(-ENOMEM);
-   }
-   f2fs_target_device(sbi, blkaddr, bio);
-   bio->bi_end_io = f2fs_read_end_io;
-   bio->bi_private = ctx;
-
-   return bio;
-}
-
 /*
  * This function was originally taken from fs/mpage.c, and customized for f2fs.
  * Major change was from block_size == page_size in f2fs by default.
@@ -1275,12 +1281,11 @@ static int f2fs_mpage_readpages(struct address_space 
*mapping,
bio = NULL;
}
if (bio == NULL) {
-   bio = f2fs_grab_bio(inode, block_nr, nr_pages);
+   bio = f2fs_grab_read_bio(inode, block_nr, nr_pages);
if (IS_ERR(bio)) {
bio = NULL;
goto set_error_page;
}
-   bio_set_op_attrs(bio, REQ_OP_READ, 0);
}
 
if 

[PATCH] f2fs: make get_lock_data_page to handle encrypted inode

2017-09-06 Thread Jaegeuk Kim
This patch refactors get_lock_data_page() to handle encryption case directly.
In order to do that, it introduces common f2fs_submit_page_read().

Signed-off-by: Jaegeuk Kim 
---
 fs/f2fs/data.c | 109 +++--
 1 file changed, 51 insertions(+), 58 deletions(-)

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index ee6801fdbdec..95f30fb6 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -456,6 +456,53 @@ int f2fs_submit_page_write(struct f2fs_io_info *fio)
return err;
 }
 
+static struct bio *f2fs_grab_read_bio(struct inode *inode, block_t blkaddr,
+unsigned nr_pages)
+{
+   struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
+   struct fscrypt_ctx *ctx = NULL;
+   struct bio *bio;
+
+   if (f2fs_encrypted_file(inode)) {
+   ctx = fscrypt_get_ctx(inode, GFP_NOFS);
+   if (IS_ERR(ctx))
+   return ERR_CAST(ctx);
+
+   /* wait the page to be moved by cleaning */
+   f2fs_wait_on_block_writeback(sbi, blkaddr);
+   }
+
+   bio = bio_alloc(GFP_KERNEL, min_t(int, nr_pages, BIO_MAX_PAGES));
+   if (!bio) {
+   if (ctx)
+   fscrypt_release_ctx(ctx);
+   return ERR_PTR(-ENOMEM);
+   }
+   f2fs_target_device(sbi, blkaddr, bio);
+   bio->bi_end_io = f2fs_read_end_io;
+   bio->bi_private = ctx;
+   bio_set_op_attrs(bio, REQ_OP_READ, 0);
+
+   return bio;
+}
+
+/* This can handle encryption stuffs */
+static int f2fs_submit_page_read(struct inode *inode, struct page *page,
+   block_t blkaddr)
+{
+   struct bio *bio = f2fs_grab_read_bio(inode, blkaddr, 1);
+
+   if (IS_ERR(bio))
+   return PTR_ERR(bio);
+
+   if (bio_add_page(bio, page, PAGE_SIZE, 0) < PAGE_SIZE) {
+   bio_put(bio);
+   return -EFAULT;
+   }
+   __submit_bio(F2FS_I_SB(inode), bio, DATA);
+   return 0;
+}
+
 static void __set_data_blkaddr(struct dnode_of_data *dn)
 {
struct f2fs_node *rn = F2FS_NODE(dn->node_page);
@@ -573,16 +620,6 @@ struct page *get_read_data_page(struct inode *inode, 
pgoff_t index,
struct page *page;
struct extent_info ei = {0,0,0};
int err;
-   struct f2fs_io_info fio = {
-   .sbi = F2FS_I_SB(inode),
-   .type = DATA,
-   .op = REQ_OP_READ,
-   .op_flags = op_flags,
-   .encrypted_page = NULL,
-   };
-
-   if (f2fs_encrypted_file(inode))
-   return read_mapping_page(mapping, index, NULL);
 
page = f2fs_grab_cache_page(mapping, index, for_write);
if (!page)
@@ -623,9 +660,7 @@ struct page *get_read_data_page(struct inode *inode, 
pgoff_t index,
return page;
}
 
-   fio.new_blkaddr = fio.old_blkaddr = dn.data_blkaddr;
-   fio.page = page;
-   err = f2fs_submit_page_bio();
+   err = f2fs_submit_page_read(inode, page, dn.data_blkaddr);
if (err)
goto put_err;
return page;
@@ -1150,35 +1185,6 @@ int f2fs_fiemap(struct inode *inode, struct 
fiemap_extent_info *fieinfo,
return ret;
 }
 
-static struct bio *f2fs_grab_bio(struct inode *inode, block_t blkaddr,
-unsigned nr_pages)
-{
-   struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
-   struct fscrypt_ctx *ctx = NULL;
-   struct bio *bio;
-
-   if (f2fs_encrypted_file(inode)) {
-   ctx = fscrypt_get_ctx(inode, GFP_NOFS);
-   if (IS_ERR(ctx))
-   return ERR_CAST(ctx);
-
-   /* wait the page to be moved by cleaning */
-   f2fs_wait_on_block_writeback(sbi, blkaddr);
-   }
-
-   bio = bio_alloc(GFP_KERNEL, min_t(int, nr_pages, BIO_MAX_PAGES));
-   if (!bio) {
-   if (ctx)
-   fscrypt_release_ctx(ctx);
-   return ERR_PTR(-ENOMEM);
-   }
-   f2fs_target_device(sbi, blkaddr, bio);
-   bio->bi_end_io = f2fs_read_end_io;
-   bio->bi_private = ctx;
-
-   return bio;
-}
-
 /*
  * This function was originally taken from fs/mpage.c, and customized for f2fs.
  * Major change was from block_size == page_size in f2fs by default.
@@ -1275,12 +1281,11 @@ static int f2fs_mpage_readpages(struct address_space 
*mapping,
bio = NULL;
}
if (bio == NULL) {
-   bio = f2fs_grab_bio(inode, block_nr, nr_pages);
+   bio = f2fs_grab_read_bio(inode, block_nr, nr_pages);
if (IS_ERR(bio)) {
bio = NULL;
goto set_error_page;
}
-   bio_set_op_attrs(bio, REQ_OP_READ, 0);
}
 
if (bio_add_page(bio, page, 

Re: iov_iter_pipe warning.

2017-09-06 Thread Al Viro
On Wed, Sep 06, 2017 at 11:48:35PM -0400, Dave Jones wrote:

>  > That doesn't seem like an XFS problem - it indicates the pipe we are
>  > filling in generic_file_splice_read() is not being emptied by
>  > whatever we are splicing the file data to
> 
> The puzzling part is this runs for a day on ext4 or btrfs, whereas I can
> make xfs fall over pretty quickly.  As Darrick pointed out though, this
> could be due to xfs being the only user of iomap_dio_rw.
> 
> I'm juggling a few other things right now, so probably not going to
> have much time to dig further on this until after plumbers + 1 wk.

I'll look into that tomorrow...


Re: iov_iter_pipe warning.

2017-09-06 Thread Al Viro
On Wed, Sep 06, 2017 at 11:48:35PM -0400, Dave Jones wrote:

>  > That doesn't seem like an XFS problem - it indicates the pipe we are
>  > filling in generic_file_splice_read() is not being emptied by
>  > whatever we are splicing the file data to
> 
> The puzzling part is this runs for a day on ext4 or btrfs, whereas I can
> make xfs fall over pretty quickly.  As Darrick pointed out though, this
> could be due to xfs being the only user of iomap_dio_rw.
> 
> I'm juggling a few other things right now, so probably not going to
> have much time to dig further on this until after plumbers + 1 wk.

I'll look into that tomorrow...


Re: [PATCH v8 01/13] x86/apic: Construct a selector for the interrupt delivery mode

2017-09-06 Thread Dou Liyang

Hi Baoquan

I am wordy one ah:
our target is checking if BIOS supports APIC, no matter what
type(separated/integrated) it is. if not, go to PIC mode.

Let‘s discuss the original logic and the smp_found_config,
then take about your code.

The existing logic is:

if (!boot_cpu_has(X86_FEATURE_APIC) && !smp_found_config) ...(1)
return -1;

if (!boot_cpu_has(X86_FEATURE_APIC) &&
APIC_INTEGRATED(boot_cpu_apic_version)) { ...(2)
pr_err();

why smp_found_config has to be checked in (1)?

Because, In case of discrete (pretty old) apics we may not set
X86_FEATURE_APIC bit in cpuid, with 82489DX we can't rely on apic
feature bit retrieved via cpuid(boot_cpu_has(X86_FEATURE_APIC)).[1]
So we assume that if SMP configuration is found from MP table
(smp_found_config = 1) in above case, there maybe a separated
chip in our pc.

After passing the check of (1), we in (2), check whether local APIC
is detected or not, If we have a BIOS bug.

[1] Commit 8312136fa8b0("x86, apic: Fix missed handling of discrete apics")

At 09/06/2017 06:17 PM, Baoquan He wrote:

Hi Dou,

On 08/28/17 at 11:20am, Dou Liyang wrote:

+static int __init apic_intr_mode_select(void)
+{
+   /* Check kernel option */
+   if (disable_apic) {
+   pr_info("APIC disabled via kernel command line\n");
+   return APIC_PIC;
+   }
+


I am not very familiar with cpu registers, not sure if we can adjust
below code flow as:

/* If APIC is integrated, check local APIC only */
if (lapic_is_integrated() && !boot_cpu_has(X86_FEATURE_APIC)) {
disable_apic = 1;
pr_info("APIC disabled by BIOS\n");
return APIC_PIC;
}

/* If APIC is on a separate chip, check if smp_found_config is found*/
if (!lapic_is_integrated() && !smp_found_config) {
disable_apic = 1;
return APIC_PIC;
}


Yes, Awesome, we first consider it from APIC register space, then
the BOIS and software configration. let me do more investigation.

I rewrite it based on you, any comments will welcome.

/* If APIC is not integrated, check if SMP configuration is
 * found from MP table. If not too, no 82489DX. switch to
 * PIC mode
 *
 * Else APIC is integrated, check if the BIOS allows local APIC
 *
 */
if (!lapic_is_integrated()) {
if (!smp_found_config) {
disable_apic = 1;
return APIC_PIC;
}
} else if(!boot_cpu_has(X86_FEATURE_APIC)) {
disable_apic = 1;
pr_info("APIC disabled by BIOS\n");
return APIC_PIC;
}
}

BTW, As the macro APIC_INTEGRATED(x) has already wrapped by
CONFIG_X86_32, I will cleanup the lapic_is_integrated() for readablity
like that:

diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index 7834f73..63b3ae9 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -211,11 +211,7 @@ static inline int lapic_get_version(void)
  */
 static inline int lapic_is_integrated(void)
 {
-#ifdef CONFIG_X86_64
-   return 1;
-#else
return APIC_INTEGRATED(lapic_get_version());
-#endif
 }


Do you think so. ;-)


Thanks,
dou.



 Now, I haven't think of why smp_found_config has to be
 checked here.

In this way, we don't need the CONFIG_X86_64 checking since it's
contained in lapic_is_integrated() already. And the checking is obvious
for understanding. Just not very sure if the checking is adequate.

Just my personal opinion.


+   /* Check BIOS */
+#ifdef CONFIG_X86_64
+   /* On 64-bit, the APIC must be integrated, Check local APIC only */
+   if (!boot_cpu_has(X86_FEATURE_APIC)) {
+   disable_apic = 1;
+   pr_info("APIC disabled by BIOS\n");
+   return APIC_PIC;
+   }
+#else
+   /*
+* On 32-bit, check whether there is a separate chip or integrated
+* APIC
+*/
+
+   /* Has a separate chip ? */
+   if (!boot_cpu_has(X86_FEATURE_APIC) && !smp_found_config) {
+   disable_apic = 1;
+
+   return APIC_PIC;
+   }
+
+   /* Has a local APIC ? */
+   if (!boot_cpu_has(X86_FEATURE_APIC) &&
+   APIC_INTEGRATED(boot_cpu_apic_version)) {
+   disable_apic = 1;
+   pr_err(FW_BUG "Local APIC %d not detected, force emulation\n",
+  boot_cpu_physical_apicid);
+
+   return APIC_PIC;
+   }
+#endif
+
+   /* Check MP table or ACPI MADT configuration */
+   if (!smp_found_config) {
+   disable_ioapic_support();
+
+   if (!acpi_lapic)
+   pr_info("APIC: ACPI MADT or MP tables are not 
detected\n");
+
+   

Re: [PATCH v8 01/13] x86/apic: Construct a selector for the interrupt delivery mode

2017-09-06 Thread Dou Liyang

Hi Baoquan

I am wordy one ah:
our target is checking if BIOS supports APIC, no matter what
type(separated/integrated) it is. if not, go to PIC mode.

Let‘s discuss the original logic and the smp_found_config,
then take about your code.

The existing logic is:

if (!boot_cpu_has(X86_FEATURE_APIC) && !smp_found_config) ...(1)
return -1;

if (!boot_cpu_has(X86_FEATURE_APIC) &&
APIC_INTEGRATED(boot_cpu_apic_version)) { ...(2)
pr_err();

why smp_found_config has to be checked in (1)?

Because, In case of discrete (pretty old) apics we may not set
X86_FEATURE_APIC bit in cpuid, with 82489DX we can't rely on apic
feature bit retrieved via cpuid(boot_cpu_has(X86_FEATURE_APIC)).[1]
So we assume that if SMP configuration is found from MP table
(smp_found_config = 1) in above case, there maybe a separated
chip in our pc.

After passing the check of (1), we in (2), check whether local APIC
is detected or not, If we have a BIOS bug.

[1] Commit 8312136fa8b0("x86, apic: Fix missed handling of discrete apics")

At 09/06/2017 06:17 PM, Baoquan He wrote:

Hi Dou,

On 08/28/17 at 11:20am, Dou Liyang wrote:

+static int __init apic_intr_mode_select(void)
+{
+   /* Check kernel option */
+   if (disable_apic) {
+   pr_info("APIC disabled via kernel command line\n");
+   return APIC_PIC;
+   }
+


I am not very familiar with cpu registers, not sure if we can adjust
below code flow as:

/* If APIC is integrated, check local APIC only */
if (lapic_is_integrated() && !boot_cpu_has(X86_FEATURE_APIC)) {
disable_apic = 1;
pr_info("APIC disabled by BIOS\n");
return APIC_PIC;
}

/* If APIC is on a separate chip, check if smp_found_config is found*/
if (!lapic_is_integrated() && !smp_found_config) {
disable_apic = 1;
return APIC_PIC;
}


Yes, Awesome, we first consider it from APIC register space, then
the BOIS and software configration. let me do more investigation.

I rewrite it based on you, any comments will welcome.

/* If APIC is not integrated, check if SMP configuration is
 * found from MP table. If not too, no 82489DX. switch to
 * PIC mode
 *
 * Else APIC is integrated, check if the BIOS allows local APIC
 *
 */
if (!lapic_is_integrated()) {
if (!smp_found_config) {
disable_apic = 1;
return APIC_PIC;
}
} else if(!boot_cpu_has(X86_FEATURE_APIC)) {
disable_apic = 1;
pr_info("APIC disabled by BIOS\n");
return APIC_PIC;
}
}

BTW, As the macro APIC_INTEGRATED(x) has already wrapped by
CONFIG_X86_32, I will cleanup the lapic_is_integrated() for readablity
like that:

diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index 7834f73..63b3ae9 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -211,11 +211,7 @@ static inline int lapic_get_version(void)
  */
 static inline int lapic_is_integrated(void)
 {
-#ifdef CONFIG_X86_64
-   return 1;
-#else
return APIC_INTEGRATED(lapic_get_version());
-#endif
 }


Do you think so. ;-)


Thanks,
dou.



 Now, I haven't think of why smp_found_config has to be
 checked here.

In this way, we don't need the CONFIG_X86_64 checking since it's
contained in lapic_is_integrated() already. And the checking is obvious
for understanding. Just not very sure if the checking is adequate.

Just my personal opinion.


+   /* Check BIOS */
+#ifdef CONFIG_X86_64
+   /* On 64-bit, the APIC must be integrated, Check local APIC only */
+   if (!boot_cpu_has(X86_FEATURE_APIC)) {
+   disable_apic = 1;
+   pr_info("APIC disabled by BIOS\n");
+   return APIC_PIC;
+   }
+#else
+   /*
+* On 32-bit, check whether there is a separate chip or integrated
+* APIC
+*/
+
+   /* Has a separate chip ? */
+   if (!boot_cpu_has(X86_FEATURE_APIC) && !smp_found_config) {
+   disable_apic = 1;
+
+   return APIC_PIC;
+   }
+
+   /* Has a local APIC ? */
+   if (!boot_cpu_has(X86_FEATURE_APIC) &&
+   APIC_INTEGRATED(boot_cpu_apic_version)) {
+   disable_apic = 1;
+   pr_err(FW_BUG "Local APIC %d not detected, force emulation\n",
+  boot_cpu_physical_apicid);
+
+   return APIC_PIC;
+   }
+#endif
+
+   /* Check MP table or ACPI MADT configuration */
+   if (!smp_found_config) {
+   disable_ioapic_support();
+
+   if (!acpi_lapic)
+   pr_info("APIC: ACPI MADT or MP tables are not 
detected\n");
+
+   

Re: [PATCH v2 1/3] arm64/ras: support sea error recovery

2017-09-06 Thread Xie XiuQi
Hi Borislav,

On 2017/9/6 18:12, Borislav Petkov wrote:
> On Tue, Sep 05, 2017 at 07:06:04PM +0800, Xie XiuQi wrote:
>> With ARM v8.2 RAS Extension, SEA are usually triggered when memory errors
>> are consumed. In some cases, if the error address is in a clean page or a
>> read-only page, there is a chance to recover. Such as error occurs in a
>> instruction page, we can reread this page from disk instead of killing 
>> process.
>>
>> Because memory_failure() may sleep, we can not call it directly in SEA 
>> exception
>> context. So we saved faulting physical address associated with a process in 
>> the
>> ghes handler and set __TIF_SEA_NOTIFY. When we return from SEA exception 
>> context
>> and get into do_notify_resume() before the process running, we could check it
>> and call memory_failure() to do recovery. It's safe, because we are in 
>> process
>> context.
>>
>> Signed-off-by: Xie XiuQi 
>> Signed-off-by: Wang Xiongfeng 
>> ---
>>  arch/arm64/Kconfig   |  11 +++
>>  arch/arm64/include/asm/ras.h |  36 +
>>  arch/arm64/include/asm/thread_info.h |   4 +-
>>  arch/arm64/kernel/Makefile   |   1 +
>>  arch/arm64/kernel/ras.c  | 143 
>> +++
>>  arch/arm64/kernel/signal.c   |   8 ++
>>  arch/arm64/mm/fault.c|  27 +--
>>  drivers/acpi/apei/ghes.c |   4 +-
>>  8 files changed, 223 insertions(+), 11 deletions(-)
>>  create mode 100644 arch/arm64/include/asm/ras.h
>>  create mode 100644 arch/arm64/kernel/ras.c
> 
> Please integrate scripts/checkpatch.pl into your patch creation workflow
> and run all patches through it before submitting:

Sorry for my mistake. I'll fix it, thanks.

> 
> ERROR: code indent should use tabs where possible
> #200: FILE: arch/arm64/kernel/ras.c:42:
> +atomic_tinuse;$
> 
> WARNING: please, no spaces at the start of a line
> #200: FILE: arch/arm64/kernel/ras.c:42:
> +atomic_tinuse;$
> 
> ERROR: code indent should use tabs where possible
> #201: FILE: arch/arm64/kernel/ras.c:43:
> +struct task_struct  *t;$
> 
> WARNING: please, no spaces at the start of a line
> #201: FILE: arch/arm64/kernel/ras.c:43:
> +struct task_struct  *t;$
> 
> ERROR: code indent should use tabs where possible
> #202: FILE: arch/arm64/kernel/ras.c:44:
> +__u64   paddr;$
> 
> WARNING: please, no spaces at the start of a line
> #202: FILE: arch/arm64/kernel/ras.c:44:
> +__u64   paddr;$
> 
> ERROR: code indent should use tabs where possible
> #207: FILE: arch/arm64/kernel/ras.c:49:
> +struct sea_info *si;$
> 
> WARNING: please, no spaces at the start of a line
> #207: FILE: arch/arm64/kernel/ras.c:49:
> +struct sea_info *si;$
> 
> ERROR: code indent should use tabs where possible
> #209: FILE: arch/arm64/kernel/ras.c:51:
> +for (si = sea_info; si < _info[SEA_INFO_MAX]; si++) {$
> 
> WARNING: please, no spaces at the start of a line
> #209: FILE: arch/arm64/kernel/ras.c:51:
> +for (si = sea_info; si < _info[SEA_INFO_MAX]; si++) {$
> 
> ERROR: code indent should use tabs where possible
> #210: FILE: arch/arm64/kernel/ras.c:52:
> +if (atomic_cmpxchg(>inuse, 0, 1) == 0) {$
> 
> WARNING: please, no spaces at the start of a line
> #210: FILE: arch/arm64/kernel/ras.c:52:
> +if (atomic_cmpxchg(>inuse, 0, 1) == 0) {$
> 
> ERROR: code indent should use tabs where possible
> #211: FILE: arch/arm64/kernel/ras.c:53:
> +si->t = current;$
> 
> WARNING: please, no spaces at the start of a line
> #211: FILE: arch/arm64/kernel/ras.c:53:
> +si->t = current;$
> 
> ERROR: code indent should use tabs where possible
> #212: FILE: arch/arm64/kernel/ras.c:54:
> +si->paddr = addr;$
> 
> WARNING: please, no spaces at the start of a line
> #212: FILE: arch/arm64/kernel/ras.c:54:
> +si->paddr = addr;$
> 
> ERROR: code indent should use tabs where possible
> #213: FILE: arch/arm64/kernel/ras.c:55:
> +return true;$
> 
> WARNING: please, no spaces at the start of a line
> #213: FILE: arch/arm64/kernel/ras.c:55:
> +return true;$
> 
> ERROR: code indent should use tabs where possible
> #214: FILE: arch/arm64/kernel/ras.c:56:
> +}$
> 
> WARNING: please, no spaces at the start of a line
> #214: FILE: arch/arm64/kernel/ras.c:56:
> +}$
> 
> ERROR: code indent should use tabs where possible
> #215: FILE: arch/arm64/kernel/ras.c:57:
> +}$
> 
> WARNING: please, no spaces at the start of a line
> #215: FILE: arch/arm64/kernel/ras.c:57:
> +}$
> 
> ERROR: code indent should use tabs where possible
> #223: FILE: arch/arm64/kernel/ras.c:65:
> +struct sea_info *si;$
> 
> WARNING: please, no spaces at the start of 

Re: [PATCH v2 1/3] arm64/ras: support sea error recovery

2017-09-06 Thread Xie XiuQi
Hi Borislav,

On 2017/9/6 18:12, Borislav Petkov wrote:
> On Tue, Sep 05, 2017 at 07:06:04PM +0800, Xie XiuQi wrote:
>> With ARM v8.2 RAS Extension, SEA are usually triggered when memory errors
>> are consumed. In some cases, if the error address is in a clean page or a
>> read-only page, there is a chance to recover. Such as error occurs in a
>> instruction page, we can reread this page from disk instead of killing 
>> process.
>>
>> Because memory_failure() may sleep, we can not call it directly in SEA 
>> exception
>> context. So we saved faulting physical address associated with a process in 
>> the
>> ghes handler and set __TIF_SEA_NOTIFY. When we return from SEA exception 
>> context
>> and get into do_notify_resume() before the process running, we could check it
>> and call memory_failure() to do recovery. It's safe, because we are in 
>> process
>> context.
>>
>> Signed-off-by: Xie XiuQi 
>> Signed-off-by: Wang Xiongfeng 
>> ---
>>  arch/arm64/Kconfig   |  11 +++
>>  arch/arm64/include/asm/ras.h |  36 +
>>  arch/arm64/include/asm/thread_info.h |   4 +-
>>  arch/arm64/kernel/Makefile   |   1 +
>>  arch/arm64/kernel/ras.c  | 143 
>> +++
>>  arch/arm64/kernel/signal.c   |   8 ++
>>  arch/arm64/mm/fault.c|  27 +--
>>  drivers/acpi/apei/ghes.c |   4 +-
>>  8 files changed, 223 insertions(+), 11 deletions(-)
>>  create mode 100644 arch/arm64/include/asm/ras.h
>>  create mode 100644 arch/arm64/kernel/ras.c
> 
> Please integrate scripts/checkpatch.pl into your patch creation workflow
> and run all patches through it before submitting:

Sorry for my mistake. I'll fix it, thanks.

> 
> ERROR: code indent should use tabs where possible
> #200: FILE: arch/arm64/kernel/ras.c:42:
> +atomic_tinuse;$
> 
> WARNING: please, no spaces at the start of a line
> #200: FILE: arch/arm64/kernel/ras.c:42:
> +atomic_tinuse;$
> 
> ERROR: code indent should use tabs where possible
> #201: FILE: arch/arm64/kernel/ras.c:43:
> +struct task_struct  *t;$
> 
> WARNING: please, no spaces at the start of a line
> #201: FILE: arch/arm64/kernel/ras.c:43:
> +struct task_struct  *t;$
> 
> ERROR: code indent should use tabs where possible
> #202: FILE: arch/arm64/kernel/ras.c:44:
> +__u64   paddr;$
> 
> WARNING: please, no spaces at the start of a line
> #202: FILE: arch/arm64/kernel/ras.c:44:
> +__u64   paddr;$
> 
> ERROR: code indent should use tabs where possible
> #207: FILE: arch/arm64/kernel/ras.c:49:
> +struct sea_info *si;$
> 
> WARNING: please, no spaces at the start of a line
> #207: FILE: arch/arm64/kernel/ras.c:49:
> +struct sea_info *si;$
> 
> ERROR: code indent should use tabs where possible
> #209: FILE: arch/arm64/kernel/ras.c:51:
> +for (si = sea_info; si < _info[SEA_INFO_MAX]; si++) {$
> 
> WARNING: please, no spaces at the start of a line
> #209: FILE: arch/arm64/kernel/ras.c:51:
> +for (si = sea_info; si < _info[SEA_INFO_MAX]; si++) {$
> 
> ERROR: code indent should use tabs where possible
> #210: FILE: arch/arm64/kernel/ras.c:52:
> +if (atomic_cmpxchg(>inuse, 0, 1) == 0) {$
> 
> WARNING: please, no spaces at the start of a line
> #210: FILE: arch/arm64/kernel/ras.c:52:
> +if (atomic_cmpxchg(>inuse, 0, 1) == 0) {$
> 
> ERROR: code indent should use tabs where possible
> #211: FILE: arch/arm64/kernel/ras.c:53:
> +si->t = current;$
> 
> WARNING: please, no spaces at the start of a line
> #211: FILE: arch/arm64/kernel/ras.c:53:
> +si->t = current;$
> 
> ERROR: code indent should use tabs where possible
> #212: FILE: arch/arm64/kernel/ras.c:54:
> +si->paddr = addr;$
> 
> WARNING: please, no spaces at the start of a line
> #212: FILE: arch/arm64/kernel/ras.c:54:
> +si->paddr = addr;$
> 
> ERROR: code indent should use tabs where possible
> #213: FILE: arch/arm64/kernel/ras.c:55:
> +return true;$
> 
> WARNING: please, no spaces at the start of a line
> #213: FILE: arch/arm64/kernel/ras.c:55:
> +return true;$
> 
> ERROR: code indent should use tabs where possible
> #214: FILE: arch/arm64/kernel/ras.c:56:
> +}$
> 
> WARNING: please, no spaces at the start of a line
> #214: FILE: arch/arm64/kernel/ras.c:56:
> +}$
> 
> ERROR: code indent should use tabs where possible
> #215: FILE: arch/arm64/kernel/ras.c:57:
> +}$
> 
> WARNING: please, no spaces at the start of a line
> #215: FILE: arch/arm64/kernel/ras.c:57:
> +}$
> 
> ERROR: code indent should use tabs where possible
> #223: FILE: arch/arm64/kernel/ras.c:65:
> +struct sea_info *si;$
> 
> WARNING: please, no spaces at the start of a line
> #223: FILE: arch/arm64/kernel/ras.c:65:

Re: [PATCH v2] sched: check user input value of sysctl_sched_time_avg

2017-09-06 Thread Ethan Zhao



On 2017/9/7 3:50, Luis R. Rodriguez wrote:

On Mon, Sep 04, 2017 at 03:54:23PM +0800, Ethan Zhao wrote:

Peter,


On 2017/9/4 15:49, Peter Zijlstra wrote:

On Sat, Sep 02, 2017 at 02:57:32PM +0800, Ethan Zhao wrote:

diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 6648fbb..609bed2 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -367,7 +367,7 @@ static int sysrq_sysctl_handler(struct ctl_table *table, 
int write,
.data   = _sched_time_avg,
.maxlen = sizeof(unsigned int),
.mode   = 0644,
-   .proc_handler   = proc_dointvec,
+   .proc_handler   = sched_time_avg_handler,

*sigh*, what's wrong with the below? Too easy?

  :),  seems I walked zigzag several cycles to get the right point.

It sounds like negative values are not possible too, right? If so then
you can use proc_douintvec_minmax().

 V4 has been sent days ago, please see v4.

 Thank,
 Ethan


   Luis





Re: [PATCH v2] sched: check user input value of sysctl_sched_time_avg

2017-09-06 Thread Ethan Zhao



On 2017/9/7 3:50, Luis R. Rodriguez wrote:

On Mon, Sep 04, 2017 at 03:54:23PM +0800, Ethan Zhao wrote:

Peter,


On 2017/9/4 15:49, Peter Zijlstra wrote:

On Sat, Sep 02, 2017 at 02:57:32PM +0800, Ethan Zhao wrote:

diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 6648fbb..609bed2 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -367,7 +367,7 @@ static int sysrq_sysctl_handler(struct ctl_table *table, 
int write,
.data   = _sched_time_avg,
.maxlen = sizeof(unsigned int),
.mode   = 0644,
-   .proc_handler   = proc_dointvec,
+   .proc_handler   = sched_time_avg_handler,

*sigh*, what's wrong with the below? Too easy?

  :),  seems I walked zigzag several cycles to get the right point.

It sounds like negative values are not possible too, right? If so then
you can use proc_douintvec_minmax().

 V4 has been sent days ago, please see v4.

 Thank,
 Ethan


   Luis





Re: Donation

2017-09-06 Thread Mavis Wanczyk Foundation
Greetings To You,

My Name is Mavis wanczyk , the winner of the Power ball jackpot of $ $758.7 
million  in the AUGUST 24, 2017, My jackpot was a gift from God to me hence my 
Entire family/foundation has AGREED to do this. My foundation is donating 
$500,000.00USD to you. please contac maviswanczyk...@gmail.com for full details 
and please accept this token as a gift from me and my family.

Read more: 
http://money.cnn.com/2017/08/23/news/powerball-700-million-jackpot/index.html

 Best Regards,
 Mavis Wanczyk


Re: Donation

2017-09-06 Thread Mavis Wanczyk Foundation
Greetings To You,

My Name is Mavis wanczyk , the winner of the Power ball jackpot of $ $758.7 
million  in the AUGUST 24, 2017, My jackpot was a gift from God to me hence my 
Entire family/foundation has AGREED to do this. My foundation is donating 
$500,000.00USD to you. please contac maviswanczyk...@gmail.com for full details 
and please accept this token as a gift from me and my family.

Read more: 
http://money.cnn.com/2017/08/23/news/powerball-700-million-jackpot/index.html

 Best Regards,
 Mavis Wanczyk


Re: [PATCH 0/2] Fix resume failure due to PCID

2017-09-06 Thread Andy Lutomirski


> On Sep 6, 2017, at 8:25 PM, Linus Torvalds  
> wrote:
> 
>> On Wed, Sep 6, 2017 at 7:54 PM, Andy Lutomirski  wrote:
>> Patch 1 is the fix.  Patch 2 is a comment that would have kept me from
>> chasing down a false lead.
> 
> Yes, this seems to fix things for me. Thanks.
> 
> Of course, right now that laptop has no working wifi with tip-of-tree
> due to some issues with the networking tree, but that's an independent
> thing and I could suspend and resume with this. So applied and pushed
> out,
> 
>Linus


Great!

FWIW, there's still a possible glitch where doing EFI calls could corrupt ASID 
0.  I figured that fixing that could wait for tomorrow.

Re: [PATCH 0/2] Fix resume failure due to PCID

2017-09-06 Thread Andy Lutomirski


> On Sep 6, 2017, at 8:25 PM, Linus Torvalds  
> wrote:
> 
>> On Wed, Sep 6, 2017 at 7:54 PM, Andy Lutomirski  wrote:
>> Patch 1 is the fix.  Patch 2 is a comment that would have kept me from
>> chasing down a false lead.
> 
> Yes, this seems to fix things for me. Thanks.
> 
> Of course, right now that laptop has no working wifi with tip-of-tree
> due to some issues with the networking tree, but that's an independent
> thing and I could suspend and resume with this. So applied and pushed
> out,
> 
>Linus


Great!

FWIW, there's still a possible glitch where doing EFI calls could corrupt ASID 
0.  I figured that fixing that could wait for tomorrow.

Re: [PATCH] dt-binding: phy: don't confuse with Ethernet phy properties

2017-09-06 Thread David Miller
From: Baruch Siach 
Date: Sun,  3 Sep 2017 17:32:16 +0300

> The generic PHY 'phys' property sometime appears in the same node with
> the Ethernet PHY 'phy' or 'phy-handle' properties. Add a warning in
> phy-bindings.txt to reduce confusion.
> 
> Signed-off-by: Baruch Siach 

Applied.


Re: [PATCH] dt-binding: phy: don't confuse with Ethernet phy properties

2017-09-06 Thread David Miller
From: Baruch Siach 
Date: Sun,  3 Sep 2017 17:32:16 +0300

> The generic PHY 'phys' property sometime appears in the same node with
> the Ethernet PHY 'phy' or 'phy-handle' properties. Add a warning in
> phy-bindings.txt to reduce confusion.
> 
> Signed-off-by: Baruch Siach 

Applied.


Re: [GIT] Networking

2017-09-06 Thread Coelho, Luciano
On Wed, 2017-09-06 at 16:27 -0700, Linus Torvalds wrote:
> This pull request completely breaks Intel wireless for me.
> 
> This is my trusty old XPS 13 (9350), using Intel Wireless 8260 (rev 3a).
> 
> That remains a very standard Intel machine with absolutely zero odd
> things going on.
> 
> The firmware is iwlwifi-8000C-28.ucode from
> iwl7260-firmware-25.30.13.0-75.fc26.noarch, and the kernel reports
> 
>   iwlwifi :3a:00.0: loaded firmware version 27.455470.0 op_mode iwlmvm
> 
> the thing starts acting badly with this:
> 
>   iwlwifi :3a:00.0: FW Error notification: type 0x cmd_id 0x04
>   iwlwifi :3a:00.0: FW Error notification: seq 0x service 0x0004
>   iwlwifi :3a:00.0: FW Error notification: timestamp 0x5D84
>   iwlwifi :3a:00.0: Microcode SW error detected.  Restarting 0x200.
>   iwlwifi :3a:00.0: Start IWL Error Log Dump:
>   iwlwifi :3a:00.0: Status: 0x0100, count: 6
>   iwlwifi :3a:00.0: Loaded firmware version: 27.455470.0
>   iwlwifi :3a:00.0: 0x0038 | BAD_COMMAND
>   iwlwifi :3a:00.0: 0x00A002F0 | trm_hw_status0
>   ...
>   iwlwifi :3a:00.0: 0x | isr status reg
>   ieee80211 phy0: Hardware restart was requested
>   iwlwifi :3a:00.0: FW error in SYNC CMD MAC_CONTEXT_CMD

This seems to be a problem with backwards-compatibility with FW version
27.  We are now in version 31[1] and upgrading will probably fix that.

But obviously the driver should not fail miserably like this with
version 27, because it claims to support it still.

I'm looking into this now and will provide a fix asap.

[1] 
https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/tree/iwlwifi-8000C-31.ucode


--
Cheers,
Luca.

Re: [GIT] Networking

2017-09-06 Thread Coelho, Luciano
On Wed, 2017-09-06 at 16:27 -0700, Linus Torvalds wrote:
> This pull request completely breaks Intel wireless for me.
> 
> This is my trusty old XPS 13 (9350), using Intel Wireless 8260 (rev 3a).
> 
> That remains a very standard Intel machine with absolutely zero odd
> things going on.
> 
> The firmware is iwlwifi-8000C-28.ucode from
> iwl7260-firmware-25.30.13.0-75.fc26.noarch, and the kernel reports
> 
>   iwlwifi :3a:00.0: loaded firmware version 27.455470.0 op_mode iwlmvm
> 
> the thing starts acting badly with this:
> 
>   iwlwifi :3a:00.0: FW Error notification: type 0x cmd_id 0x04
>   iwlwifi :3a:00.0: FW Error notification: seq 0x service 0x0004
>   iwlwifi :3a:00.0: FW Error notification: timestamp 0x5D84
>   iwlwifi :3a:00.0: Microcode SW error detected.  Restarting 0x200.
>   iwlwifi :3a:00.0: Start IWL Error Log Dump:
>   iwlwifi :3a:00.0: Status: 0x0100, count: 6
>   iwlwifi :3a:00.0: Loaded firmware version: 27.455470.0
>   iwlwifi :3a:00.0: 0x0038 | BAD_COMMAND
>   iwlwifi :3a:00.0: 0x00A002F0 | trm_hw_status0
>   ...
>   iwlwifi :3a:00.0: 0x | isr status reg
>   ieee80211 phy0: Hardware restart was requested
>   iwlwifi :3a:00.0: FW error in SYNC CMD MAC_CONTEXT_CMD

This seems to be a problem with backwards-compatibility with FW version
27.  We are now in version 31[1] and upgrading will probably fix that.

But obviously the driver should not fail miserably like this with
version 27, because it claims to support it still.

I'm looking into this now and will provide a fix asap.

[1] 
https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/tree/iwlwifi-8000C-31.ucode


--
Cheers,
Luca.

Re: iov_iter_pipe warning.

2017-09-06 Thread Dave Jones
On Thu, Sep 07, 2017 at 09:46:17AM +1000, Dave Chinner wrote:
 > On Wed, Sep 06, 2017 at 04:03:37PM -0400, Dave Jones wrote:
 > > On Mon, Aug 28, 2017 at 09:25:42PM -0700, Darrick J. Wong wrote:
 > >  > On Mon, Aug 28, 2017 at 04:31:30PM -0400, Dave Jones wrote:
 > >  > > I'm still trying to narrow down an exact reproducer, but it seems 
 > > having
 > >  > > trinity do a combination of sendfile & writev, with pipes and regular
 > >  > > files as fd's is the best repro.
 > >  > > 
 > >  > > Is this a real problem, or am I chasing ghosts ?  That it doesn't 
 > > happen
 > >  > > on ext4 or btrfs is making me wonder...
 > >  > 
 > >  >  I haven't heard of any problems w/ directio xfs lately, but OTOH
 > >  > I think it's the only filesystem that uses iomap_dio_rw, which would
 > >  > explain why ext4/btrfs don't have this problem.
 > > 
 > > Another warning, from likely the same root cause.
 > > 
 > > WARNING: CPU: 3 PID: 572 at lib/iov_iter.c:962 iov_iter_pipe+0xe2/0xf0
 > 
 >  WARN_ON(pipe->nrbufs == pipe->buffers);
 > 
 >  *  @nrbufs: the number of non-empty pipe buffers in this pipe
 >  *  @buffers: total number of buffers (should be a power of 2)
 > 
 > So that's warning that the pipe buffer is already full before we
 > try to read from the filesystem?
 > 
 > That doesn't seem like an XFS problem - it indicates the pipe we are
 > filling in generic_file_splice_read() is not being emptied by
 > whatever we are splicing the file data to

The puzzling part is this runs for a day on ext4 or btrfs, whereas I can
make xfs fall over pretty quickly.  As Darrick pointed out though, this
could be due to xfs being the only user of iomap_dio_rw.

I'm juggling a few other things right now, so probably not going to
have much time to dig further on this until after plumbers + 1 wk.

Dave



Re: iov_iter_pipe warning.

2017-09-06 Thread Dave Jones
On Thu, Sep 07, 2017 at 09:46:17AM +1000, Dave Chinner wrote:
 > On Wed, Sep 06, 2017 at 04:03:37PM -0400, Dave Jones wrote:
 > > On Mon, Aug 28, 2017 at 09:25:42PM -0700, Darrick J. Wong wrote:
 > >  > On Mon, Aug 28, 2017 at 04:31:30PM -0400, Dave Jones wrote:
 > >  > > I'm still trying to narrow down an exact reproducer, but it seems 
 > > having
 > >  > > trinity do a combination of sendfile & writev, with pipes and regular
 > >  > > files as fd's is the best repro.
 > >  > > 
 > >  > > Is this a real problem, or am I chasing ghosts ?  That it doesn't 
 > > happen
 > >  > > on ext4 or btrfs is making me wonder...
 > >  > 
 > >  >  I haven't heard of any problems w/ directio xfs lately, but OTOH
 > >  > I think it's the only filesystem that uses iomap_dio_rw, which would
 > >  > explain why ext4/btrfs don't have this problem.
 > > 
 > > Another warning, from likely the same root cause.
 > > 
 > > WARNING: CPU: 3 PID: 572 at lib/iov_iter.c:962 iov_iter_pipe+0xe2/0xf0
 > 
 >  WARN_ON(pipe->nrbufs == pipe->buffers);
 > 
 >  *  @nrbufs: the number of non-empty pipe buffers in this pipe
 >  *  @buffers: total number of buffers (should be a power of 2)
 > 
 > So that's warning that the pipe buffer is already full before we
 > try to read from the filesystem?
 > 
 > That doesn't seem like an XFS problem - it indicates the pipe we are
 > filling in generic_file_splice_read() is not being emptied by
 > whatever we are splicing the file data to

The puzzling part is this runs for a day on ext4 or btrfs, whereas I can
make xfs fall over pretty quickly.  As Darrick pointed out though, this
could be due to xfs being the only user of iomap_dio_rw.

I'm juggling a few other things right now, so probably not going to
have much time to dig further on this until after plumbers + 1 wk.

Dave



Re: [PATCH 1/2] dt-bindings: spi: Add Spreadtrum ADI controller documentation

2017-09-06 Thread Baolin Wang
Hi Mark,

On 6 September 2017 at 22:59, Mark Brown  wrote:
> On Wed, Sep 06, 2017 at 02:10:43PM +0800, Baolin Wang wrote:
>
>> +- hwlocks: Reference to a phandle of a hwlock provider node.
>> +- hwlock-names: Reference to hwlock name strings defined in the same order
>> + as the hwlocks.
>
> What are these hwlocks protecting, and what names are expected?

I made one explanation in above sentence, I assume it is not clear.
Since we have multi-subsystems will use ADI to access analog chip,
when one system is reading/writing data by ADI, which should be under
one hardware spinlock protection to prevent other systems from
reading/writing data by ADI at the same time, or two parallel routine
of setting ADI registers will get incorrect results.

The hwspinlock name should be "adi", and I will make it clear in next version.

>
>> +Optional properties:
>> +- sprd,hw-channels: Specify the hardware channel number and mapped address
>> + for hardware channel accessing.
>
> What do these mean and how are the numbers and how will the binding be
> interpreted?

I also gave one explanation in above sentence, is it not clear? I try again.

ADI controller has 50 channels including 2 software read/write
channels and 48 hardware channels to access analog chip. For 2
software read/write channels, which means we should set ADI registers
to access analog chip. But For hardware channels, we can just mapped
one analog chip address to one hardware channel, then user can access
analog chip by hardware channel without setting ADI registers.

For this "sprd,hw-channels" property, the first value specifies the
channel id, and the second value specifies the address which is mapped
into analog chip space.

-- 
Baolin.wang
Best Regards


Re: [PATCH 1/2] dt-bindings: spi: Add Spreadtrum ADI controller documentation

2017-09-06 Thread Baolin Wang
Hi Mark,

On 6 September 2017 at 22:59, Mark Brown  wrote:
> On Wed, Sep 06, 2017 at 02:10:43PM +0800, Baolin Wang wrote:
>
>> +- hwlocks: Reference to a phandle of a hwlock provider node.
>> +- hwlock-names: Reference to hwlock name strings defined in the same order
>> + as the hwlocks.
>
> What are these hwlocks protecting, and what names are expected?

I made one explanation in above sentence, I assume it is not clear.
Since we have multi-subsystems will use ADI to access analog chip,
when one system is reading/writing data by ADI, which should be under
one hardware spinlock protection to prevent other systems from
reading/writing data by ADI at the same time, or two parallel routine
of setting ADI registers will get incorrect results.

The hwspinlock name should be "adi", and I will make it clear in next version.

>
>> +Optional properties:
>> +- sprd,hw-channels: Specify the hardware channel number and mapped address
>> + for hardware channel accessing.
>
> What do these mean and how are the numbers and how will the binding be
> interpreted?

I also gave one explanation in above sentence, is it not clear? I try again.

ADI controller has 50 channels including 2 software read/write
channels and 48 hardware channels to access analog chip. For 2
software read/write channels, which means we should set ADI registers
to access analog chip. But For hardware channels, we can just mapped
one analog chip address to one hardware channel, then user can access
analog chip by hardware channel without setting ADI registers.

For this "sprd,hw-channels" property, the first value specifies the
channel id, and the second value specifies the address which is mapped
into analog chip space.

-- 
Baolin.wang
Best Regards


Re: [PATCH 0/2] Fix resume failure due to PCID

2017-09-06 Thread Linus Torvalds
On Wed, Sep 6, 2017 at 7:54 PM, Andy Lutomirski  wrote:
> Patch 1 is the fix.  Patch 2 is a comment that would have kept me from
> chasing down a false lead.

Yes, this seems to fix things for me. Thanks.

Of course, right now that laptop has no working wifi with tip-of-tree
due to some issues with the networking tree, but that's an independent
thing and I could suspend and resume with this. So applied and pushed
out,

Linus


Re: [PATCH 0/2] Fix resume failure due to PCID

2017-09-06 Thread Linus Torvalds
On Wed, Sep 6, 2017 at 7:54 PM, Andy Lutomirski  wrote:
> Patch 1 is the fix.  Patch 2 is a comment that would have kept me from
> chasing down a false lead.

Yes, this seems to fix things for me. Thanks.

Of course, right now that laptop has no working wifi with tip-of-tree
due to some issues with the networking tree, but that's an independent
thing and I could suspend and resume with this. So applied and pushed
out,

Linus


[PATCH v5 3/3] perf config: Allow creating empty config set for config file autogeneration

2017-09-06 Thread Taeung Song
When there isn't a config file (e.g. ~/.perfconfig)
or it has nothing, the config set wasn't created.
If the config set not exists, a config file can't be autogenerated.

So allow creating a empty config set in the above case,
then we can support the config file autogeneration.

Before:

  $ rm -f ~/.perfconfig
  $ perf config --user report.children=false

  $ cat ~/.perfconfig
  cat: /root/.perfconfig: No such file or directory

But I think it should work even if there isn't a config file.

After:

  $ rm -f ~/.perfconfig
  $ perf config --user report.children=false

  $ cat ~/.perfconfig
  # this file is auto-generated.
  [report]
  children = false

NOTE:
As a result, if perf_config_set__init() is failed,
it seems that the config set isn't freed. But it isn't a problem.
Because the config set  will be freed by perf_config_set__delete()
at the end of cmd_config().

Cc: Jiri Olsa 
Cc: Namhyung Kim 
Signed-off-by: Taeung Song 
---
 tools/perf/util/config.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/tools/perf/util/config.c b/tools/perf/util/config.c
index bc75596..d2b6983 100644
--- a/tools/perf/util/config.c
+++ b/tools/perf/util/config.c
@@ -700,10 +700,7 @@ struct perf_config_set *perf_config_set__new(void)
 
if (set) {
INIT_LIST_HEAD(>sections);
-   if (perf_config_set__init(set) < 0) {
-   perf_config_set__delete(set);
-   set = NULL;
-   }
+   perf_config_set__init(set);
}
 
return set;
-- 
2.7.4



[PATCH v5 1/3] perf config: Check not only section->from_system_config but also item's

2017-09-06 Thread Taeung Song
Currently only section->from_system_config is being checked multiple times.
items->from_system_config should be also checked, so fix it.

Cc: Jiri Olsa 
Cc: Namhyung Kim 
Signed-off-by: Taeung Song 
---
 tools/perf/builtin-config.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/builtin-config.c b/tools/perf/builtin-config.c
index 3ddcc6e..a1d82e3 100644
--- a/tools/perf/builtin-config.c
+++ b/tools/perf/builtin-config.c
@@ -59,7 +59,7 @@ static int set_config(struct perf_config_set *set, const char 
*file_name,
fprintf(fp, "[%s]\n", section->name);
 
perf_config_items__for_each_entry(>items, item) {
-   if (!use_system_config && section->from_system_config)
+   if (!use_system_config && item->from_system_config)
continue;
if (item->value)
fprintf(fp, "\t%s = %s\n",
-- 
2.7.4



[PATCH v5 3/3] perf config: Allow creating empty config set for config file autogeneration

2017-09-06 Thread Taeung Song
When there isn't a config file (e.g. ~/.perfconfig)
or it has nothing, the config set wasn't created.
If the config set not exists, a config file can't be autogenerated.

So allow creating a empty config set in the above case,
then we can support the config file autogeneration.

Before:

  $ rm -f ~/.perfconfig
  $ perf config --user report.children=false

  $ cat ~/.perfconfig
  cat: /root/.perfconfig: No such file or directory

But I think it should work even if there isn't a config file.

After:

  $ rm -f ~/.perfconfig
  $ perf config --user report.children=false

  $ cat ~/.perfconfig
  # this file is auto-generated.
  [report]
  children = false

NOTE:
As a result, if perf_config_set__init() is failed,
it seems that the config set isn't freed. But it isn't a problem.
Because the config set  will be freed by perf_config_set__delete()
at the end of cmd_config().

Cc: Jiri Olsa 
Cc: Namhyung Kim 
Signed-off-by: Taeung Song 
---
 tools/perf/util/config.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/tools/perf/util/config.c b/tools/perf/util/config.c
index bc75596..d2b6983 100644
--- a/tools/perf/util/config.c
+++ b/tools/perf/util/config.c
@@ -700,10 +700,7 @@ struct perf_config_set *perf_config_set__new(void)
 
if (set) {
INIT_LIST_HEAD(>sections);
-   if (perf_config_set__init(set) < 0) {
-   perf_config_set__delete(set);
-   set = NULL;
-   }
+   perf_config_set__init(set);
}
 
return set;
-- 
2.7.4



[PATCH v5 1/3] perf config: Check not only section->from_system_config but also item's

2017-09-06 Thread Taeung Song
Currently only section->from_system_config is being checked multiple times.
items->from_system_config should be also checked, so fix it.

Cc: Jiri Olsa 
Cc: Namhyung Kim 
Signed-off-by: Taeung Song 
---
 tools/perf/builtin-config.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/builtin-config.c b/tools/perf/builtin-config.c
index 3ddcc6e..a1d82e3 100644
--- a/tools/perf/builtin-config.c
+++ b/tools/perf/builtin-config.c
@@ -59,7 +59,7 @@ static int set_config(struct perf_config_set *set, const char 
*file_name,
fprintf(fp, "[%s]\n", section->name);
 
perf_config_items__for_each_entry(>items, item) {
-   if (!use_system_config && section->from_system_config)
+   if (!use_system_config && item->from_system_config)
continue;
if (item->value)
fprintf(fp, "\t%s = %s\n",
-- 
2.7.4



[PATCH v5 0/3] perf config: Simple bugfixes & refactoring

2017-09-06 Thread Taeung Song
Hi all,

This is simple patchset for perf-config
to fix small bugs and refactor code.

I'd appreciate some feedback on this patchset.

The code is also available at 'config/refactoring-v5' branch on

  git://github.com/taeung/linux-perf.git


Thanks,
Taeung

v5:
- rebase on current acme/perf/core
- adjust commit log messages to be more proper

v4:
- rebase on current acme/perf/core
- simplify commit log messages
- remove needless two patches

v3:
- fix a bug of no checked 'ret' in the loop in cmd_config() (Arnaldo)
- modify commit log messages to be more clear (Aranaldo)
- return -1 if show_spec_config() cannot show the config
- initialize 'ret' with -1 instead of 0 for more compact code in cmd_config()
- Add a error message when perf_config_set__new() failed in cmd_config()

v2:
- there is no need to consider empty config file (Arnaldo)

Taeung Song (3):
  perf config: Check not only section->from_system_config but also
item's
  perf config: Once write a config file in the end, not a repeat
  perf config: Allow creating empty config set for config file
autogeneration

 tools/perf/builtin-config.c | 24 +---
 tools/perf/util/config.c|  5 +
 2 files changed, 18 insertions(+), 11 deletions(-)

-- 
2.7.4



[PATCH v5 2/3] perf config: Once write a config file in the end, not a repeat

2017-09-06 Thread Taeung Song
Currently set_config() can be repeatedly called for each
input config on the below case:

  $ perf config kmem.default=slab report.children=false ...

But it's a waste, so only once write a config file
gathering all given config key=value pairs

Cc: Jiri Olsa 
Cc: Namhyung Kim 
Signed-off-by: Taeung Song 
---
 tools/perf/builtin-config.c | 22 --
 1 file changed, 16 insertions(+), 6 deletions(-)

diff --git a/tools/perf/builtin-config.c b/tools/perf/builtin-config.c
index a1d82e3..b89417d 100644
--- a/tools/perf/builtin-config.c
+++ b/tools/perf/builtin-config.c
@@ -34,8 +34,7 @@ static struct option config_options[] = {
OPT_END()
 };
 
-static int set_config(struct perf_config_set *set, const char *file_name,
- const char *var, const char *value)
+static int set_config(struct perf_config_set *set, const char *file_name)
 {
struct perf_config_section *section = NULL;
struct perf_config_item *item = NULL;
@@ -49,7 +48,6 @@ static int set_config(struct perf_config_set *set, const char 
*file_name,
if (!fp)
return -1;
 
-   perf_config_set__collect(set, file_name, var, value);
fprintf(fp, "%s\n", first_line);
 
/* overwrite configvariables */
@@ -161,6 +159,7 @@ int cmd_config(int argc, const char **argv)
struct perf_config_set *set;
char *user_config = mkpath("%s/.perfconfig", getenv("HOME"));
const char *config_filename;
+   bool changed = false;
 
argc = parse_options(argc, argv, config_options, config_usage,
 PARSE_OPT_STOP_AT_NON_OPTION);
@@ -231,15 +230,26 @@ int cmd_config(int argc, const char **argv)
goto out_err;
}
} else {
-   if (set_config(set, config_filename, var, 
value) < 0) {
-   pr_err("Failed to set '%s=%s' on %s\n",
-  var, value, config_filename);
+   if (perf_config_set__collect(set, 
config_filename,
+var, value) < 0) {
+   pr_err("Failed to add '%s=%s'\n",
+  var, value);
free(arg);
goto out_err;
}
+   changed = true;
}
free(arg);
}
+
+   if (!changed)
+   break;
+
+   if (set_config(set, config_filename) < 0) {
+   pr_err("Failed to set the configs on %s\n",
+  config_filename);
+   goto out_err;
+   }
}
 
ret = 0;
-- 
2.7.4



[PATCH v5 2/3] perf config: Once write a config file in the end, not a repeat

2017-09-06 Thread Taeung Song
Currently set_config() can be repeatedly called for each
input config on the below case:

  $ perf config kmem.default=slab report.children=false ...

But it's a waste, so only once write a config file
gathering all given config key=value pairs

Cc: Jiri Olsa 
Cc: Namhyung Kim 
Signed-off-by: Taeung Song 
---
 tools/perf/builtin-config.c | 22 --
 1 file changed, 16 insertions(+), 6 deletions(-)

diff --git a/tools/perf/builtin-config.c b/tools/perf/builtin-config.c
index a1d82e3..b89417d 100644
--- a/tools/perf/builtin-config.c
+++ b/tools/perf/builtin-config.c
@@ -34,8 +34,7 @@ static struct option config_options[] = {
OPT_END()
 };
 
-static int set_config(struct perf_config_set *set, const char *file_name,
- const char *var, const char *value)
+static int set_config(struct perf_config_set *set, const char *file_name)
 {
struct perf_config_section *section = NULL;
struct perf_config_item *item = NULL;
@@ -49,7 +48,6 @@ static int set_config(struct perf_config_set *set, const char 
*file_name,
if (!fp)
return -1;
 
-   perf_config_set__collect(set, file_name, var, value);
fprintf(fp, "%s\n", first_line);
 
/* overwrite configvariables */
@@ -161,6 +159,7 @@ int cmd_config(int argc, const char **argv)
struct perf_config_set *set;
char *user_config = mkpath("%s/.perfconfig", getenv("HOME"));
const char *config_filename;
+   bool changed = false;
 
argc = parse_options(argc, argv, config_options, config_usage,
 PARSE_OPT_STOP_AT_NON_OPTION);
@@ -231,15 +230,26 @@ int cmd_config(int argc, const char **argv)
goto out_err;
}
} else {
-   if (set_config(set, config_filename, var, 
value) < 0) {
-   pr_err("Failed to set '%s=%s' on %s\n",
-  var, value, config_filename);
+   if (perf_config_set__collect(set, 
config_filename,
+var, value) < 0) {
+   pr_err("Failed to add '%s=%s'\n",
+  var, value);
free(arg);
goto out_err;
}
+   changed = true;
}
free(arg);
}
+
+   if (!changed)
+   break;
+
+   if (set_config(set, config_filename) < 0) {
+   pr_err("Failed to set the configs on %s\n",
+  config_filename);
+   goto out_err;
+   }
}
 
ret = 0;
-- 
2.7.4



[PATCH v5 0/3] perf config: Simple bugfixes & refactoring

2017-09-06 Thread Taeung Song
Hi all,

This is simple patchset for perf-config
to fix small bugs and refactor code.

I'd appreciate some feedback on this patchset.

The code is also available at 'config/refactoring-v5' branch on

  git://github.com/taeung/linux-perf.git


Thanks,
Taeung

v5:
- rebase on current acme/perf/core
- adjust commit log messages to be more proper

v4:
- rebase on current acme/perf/core
- simplify commit log messages
- remove needless two patches

v3:
- fix a bug of no checked 'ret' in the loop in cmd_config() (Arnaldo)
- modify commit log messages to be more clear (Aranaldo)
- return -1 if show_spec_config() cannot show the config
- initialize 'ret' with -1 instead of 0 for more compact code in cmd_config()
- Add a error message when perf_config_set__new() failed in cmd_config()

v2:
- there is no need to consider empty config file (Arnaldo)

Taeung Song (3):
  perf config: Check not only section->from_system_config but also
item's
  perf config: Once write a config file in the end, not a repeat
  perf config: Allow creating empty config set for config file
autogeneration

 tools/perf/builtin-config.c | 24 +---
 tools/perf/util/config.c|  5 +
 2 files changed, 18 insertions(+), 11 deletions(-)

-- 
2.7.4



Re: [PATCH 2/2] spi: Add ADI driver for Spreadtrum platform

2017-09-06 Thread Baolin Wang
Hi Mark,

On 6 September 2017 at 23:04, Mark Brown  wrote:
> On Wed, Sep 06, 2017 at 02:10:44PM +0800, Baolin Wang wrote:
>
> This looks like a nice, clean driver.  A few fairly small issues though:
>
>> +config SPI_SPRD_ADI
>> + tristate "Spreadtrum ADI controller"
>> + depends on ARCH_SPRD
>> + help
>> +   ADI driver based on SPI for Spreadtrum SoCs.
>> +
>
> I can't see any hard dependencies on the architecture - can you add an
> || COMPILE_TEST for more coverage?

Yes.

>
>> + ret = devm_spi_register_controller(>dev, ctlr);
>> + if (ret) {
>> + dev_err(>dev, "failed to register SPI controller\n");
>> + goto free_hwlock;
>> + }
>
>> + spi_controller_put(ctlr);
>
> You used devm_ to register the controller, no need to free it
> explicitly.

Indeed.

>
>> +static int __init sprd_adi_init(void)
>> +{
>> + return platform_driver_register(_adi_driver);
>> +}
>> +subsys_initcall(sprd_adi_init);
>
> Why is this subsys_initcall() and not module_platform_driver()?

Since ADI is one very fundamental driver for our SoC, many drivers
such as regulator need depend on ADI, and regulator need to regulate
core voltage as earlier as possible.
Very appreciated for your comments.

-- 
Baolin.wang
Best Regards


Re: [PATCH 2/2] spi: Add ADI driver for Spreadtrum platform

2017-09-06 Thread Baolin Wang
Hi Mark,

On 6 September 2017 at 23:04, Mark Brown  wrote:
> On Wed, Sep 06, 2017 at 02:10:44PM +0800, Baolin Wang wrote:
>
> This looks like a nice, clean driver.  A few fairly small issues though:
>
>> +config SPI_SPRD_ADI
>> + tristate "Spreadtrum ADI controller"
>> + depends on ARCH_SPRD
>> + help
>> +   ADI driver based on SPI for Spreadtrum SoCs.
>> +
>
> I can't see any hard dependencies on the architecture - can you add an
> || COMPILE_TEST for more coverage?

Yes.

>
>> + ret = devm_spi_register_controller(>dev, ctlr);
>> + if (ret) {
>> + dev_err(>dev, "failed to register SPI controller\n");
>> + goto free_hwlock;
>> + }
>
>> + spi_controller_put(ctlr);
>
> You used devm_ to register the controller, no need to free it
> explicitly.

Indeed.

>
>> +static int __init sprd_adi_init(void)
>> +{
>> + return platform_driver_register(_adi_driver);
>> +}
>> +subsys_initcall(sprd_adi_init);
>
> Why is this subsys_initcall() and not module_platform_driver()?

Since ADI is one very fundamental driver for our SoC, many drivers
such as regulator need depend on ADI, and regulator need to regulate
core voltage as earlier as possible.
Very appreciated for your comments.

-- 
Baolin.wang
Best Regards


[PATCH] ipv4: Namespaceify tcp_max_orphans knob

2017-09-06 Thread Haishuang Yan
Different namespace application might require different maximal number
of TCP sockets independently of the host.

Signed-off-by: Haishuang Yan 
---
 include/net/netns/ipv4.h   |  1 +
 include/net/tcp.h  |  5 +++--
 net/ipv4/sysctl_net_ipv4.c | 14 +++---
 net/ipv4/tcp.c |  3 ---
 net/ipv4/tcp_input.c   |  1 -
 net/ipv4/tcp_ipv4.c|  1 +
 6 files changed, 12 insertions(+), 13 deletions(-)

diff --git a/include/net/netns/ipv4.h b/include/net/netns/ipv4.h
index 20d061c..305e031 100644
--- a/include/net/netns/ipv4.h
+++ b/include/net/netns/ipv4.h
@@ -127,6 +127,7 @@ struct netns_ipv4 {
int sysctl_tcp_timestamps;
struct inet_timewait_death_row tcp_death_row;
int sysctl_max_syn_backlog;
+   int sysctl_tcp_max_orphans;
 
 #ifdef CONFIG_NET_L3_MASTER_DEV
int sysctl_udp_l3mdev_accept;
diff --git a/include/net/tcp.h b/include/net/tcp.h
index b510f28..ac2d998 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -320,10 +320,11 @@ static inline bool tcp_too_many_orphans(struct sock *sk, 
int shift)
 {
struct percpu_counter *ocp = sk->sk_prot->orphan_count;
int orphans = percpu_counter_read_positive(ocp);
+   int tcp_max_orphans = sock_net(sk)->ipv4.sysctl_tcp_max_orphans;
 
-   if (orphans << shift > sysctl_tcp_max_orphans) {
+   if (orphans << shift > tcp_max_orphans) {
orphans = percpu_counter_sum_positive(ocp);
-   if (orphans << shift > sysctl_tcp_max_orphans)
+   if (orphans << shift > tcp_max_orphans)
return true;
}
return false;
diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
index 0d3c038..4f26c8d3 100644
--- a/net/ipv4/sysctl_net_ipv4.c
+++ b/net/ipv4/sysctl_net_ipv4.c
@@ -394,13 +394,6 @@ static int proc_tcp_available_ulp(struct ctl_table *ctl,
.proc_handler   = proc_dointvec
},
{
-   .procname   = "tcp_max_orphans",
-   .data   = _tcp_max_orphans,
-   .maxlen = sizeof(int),
-   .mode   = 0644,
-   .proc_handler   = proc_dointvec
-   },
-   {
.procname   = "tcp_fastopen",
.data   = _tcp_fastopen,
.maxlen = sizeof(int),
@@ -1085,6 +1078,13 @@ static int proc_tcp_available_ulp(struct ctl_table *ctl,
.mode   = 0644,
.proc_handler   = proc_dointvec
},
+   {
+   .procname   = "tcp_max_orphans",
+   .data   = _net.ipv4.sysctl_tcp_max_orphans,
+   .maxlen = sizeof(int),
+   .mode   = 0644,
+   .proc_handler   = proc_dointvec
+   },
 #ifdef CONFIG_IP_ROUTE_MULTIPATH
{
.procname   = "fib_multipath_use_neigh",
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 5091402..39187ac 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -3522,9 +3522,6 @@ void __init tcp_init(void)
}
 
 
-   cnt = tcp_hashinfo.ehash_mask + 1;
-   sysctl_tcp_max_orphans = cnt / 2;
-
tcp_init_mem();
/* Set per-socket limits to no more than 1/128 the pressure threshold */
limit = nr_free_buffer_pages() << (PAGE_SHIFT - 7);
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index c5d7656..0230509 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -88,7 +88,6 @@
 
 int sysctl_tcp_stdurg __read_mostly;
 int sysctl_tcp_rfc1337 __read_mostly;
-int sysctl_tcp_max_orphans __read_mostly = NR_FILE;
 int sysctl_tcp_frto __read_mostly = 2;
 int sysctl_tcp_min_rtt_wlen __read_mostly = 300;
 int sysctl_tcp_moderate_rcvbuf __read_mostly = 1;
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index a63486a..4b17a91 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -2468,6 +2468,7 @@ static int __net_init tcp_sk_init(struct net *net)
net->ipv4.tcp_death_row.hashinfo = _hashinfo;
 
net->ipv4.sysctl_max_syn_backlog = max(128, cnt / 256);
+   net->ipv4.sysctl_tcp_max_orphans = cnt / 2;
net->ipv4.sysctl_tcp_sack = 1;
net->ipv4.sysctl_tcp_window_scaling = 1;
net->ipv4.sysctl_tcp_timestamps = 1;
-- 
1.8.3.1





[PATCH] ipv4: Namespaceify tcp_max_orphans knob

2017-09-06 Thread Haishuang Yan
Different namespace application might require different maximal number
of TCP sockets independently of the host.

Signed-off-by: Haishuang Yan 
---
 include/net/netns/ipv4.h   |  1 +
 include/net/tcp.h  |  5 +++--
 net/ipv4/sysctl_net_ipv4.c | 14 +++---
 net/ipv4/tcp.c |  3 ---
 net/ipv4/tcp_input.c   |  1 -
 net/ipv4/tcp_ipv4.c|  1 +
 6 files changed, 12 insertions(+), 13 deletions(-)

diff --git a/include/net/netns/ipv4.h b/include/net/netns/ipv4.h
index 20d061c..305e031 100644
--- a/include/net/netns/ipv4.h
+++ b/include/net/netns/ipv4.h
@@ -127,6 +127,7 @@ struct netns_ipv4 {
int sysctl_tcp_timestamps;
struct inet_timewait_death_row tcp_death_row;
int sysctl_max_syn_backlog;
+   int sysctl_tcp_max_orphans;
 
 #ifdef CONFIG_NET_L3_MASTER_DEV
int sysctl_udp_l3mdev_accept;
diff --git a/include/net/tcp.h b/include/net/tcp.h
index b510f28..ac2d998 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -320,10 +320,11 @@ static inline bool tcp_too_many_orphans(struct sock *sk, 
int shift)
 {
struct percpu_counter *ocp = sk->sk_prot->orphan_count;
int orphans = percpu_counter_read_positive(ocp);
+   int tcp_max_orphans = sock_net(sk)->ipv4.sysctl_tcp_max_orphans;
 
-   if (orphans << shift > sysctl_tcp_max_orphans) {
+   if (orphans << shift > tcp_max_orphans) {
orphans = percpu_counter_sum_positive(ocp);
-   if (orphans << shift > sysctl_tcp_max_orphans)
+   if (orphans << shift > tcp_max_orphans)
return true;
}
return false;
diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
index 0d3c038..4f26c8d3 100644
--- a/net/ipv4/sysctl_net_ipv4.c
+++ b/net/ipv4/sysctl_net_ipv4.c
@@ -394,13 +394,6 @@ static int proc_tcp_available_ulp(struct ctl_table *ctl,
.proc_handler   = proc_dointvec
},
{
-   .procname   = "tcp_max_orphans",
-   .data   = _tcp_max_orphans,
-   .maxlen = sizeof(int),
-   .mode   = 0644,
-   .proc_handler   = proc_dointvec
-   },
-   {
.procname   = "tcp_fastopen",
.data   = _tcp_fastopen,
.maxlen = sizeof(int),
@@ -1085,6 +1078,13 @@ static int proc_tcp_available_ulp(struct ctl_table *ctl,
.mode   = 0644,
.proc_handler   = proc_dointvec
},
+   {
+   .procname   = "tcp_max_orphans",
+   .data   = _net.ipv4.sysctl_tcp_max_orphans,
+   .maxlen = sizeof(int),
+   .mode   = 0644,
+   .proc_handler   = proc_dointvec
+   },
 #ifdef CONFIG_IP_ROUTE_MULTIPATH
{
.procname   = "fib_multipath_use_neigh",
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 5091402..39187ac 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -3522,9 +3522,6 @@ void __init tcp_init(void)
}
 
 
-   cnt = tcp_hashinfo.ehash_mask + 1;
-   sysctl_tcp_max_orphans = cnt / 2;
-
tcp_init_mem();
/* Set per-socket limits to no more than 1/128 the pressure threshold */
limit = nr_free_buffer_pages() << (PAGE_SHIFT - 7);
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index c5d7656..0230509 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -88,7 +88,6 @@
 
 int sysctl_tcp_stdurg __read_mostly;
 int sysctl_tcp_rfc1337 __read_mostly;
-int sysctl_tcp_max_orphans __read_mostly = NR_FILE;
 int sysctl_tcp_frto __read_mostly = 2;
 int sysctl_tcp_min_rtt_wlen __read_mostly = 300;
 int sysctl_tcp_moderate_rcvbuf __read_mostly = 1;
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index a63486a..4b17a91 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -2468,6 +2468,7 @@ static int __net_init tcp_sk_init(struct net *net)
net->ipv4.tcp_death_row.hashinfo = _hashinfo;
 
net->ipv4.sysctl_max_syn_backlog = max(128, cnt / 256);
+   net->ipv4.sysctl_tcp_max_orphans = cnt / 2;
net->ipv4.sysctl_tcp_sack = 1;
net->ipv4.sysctl_tcp_window_scaling = 1;
net->ipv4.sysctl_tcp_timestamps = 1;
-- 
1.8.3.1





Re: [PATCH v3 2/5] dt-bindings: input: Add document bindings for mtk-pmic-keys

2017-09-06 Thread Chen Zhong
On Tue, 2017-09-05 at 11:05 -0500, Rob Herring wrote:
> On Fri, Sep 1, 2017 at 9:16 PM, Chen Zhong  wrote:
> > On Thu, 2017-08-31 at 14:52 -0500, Rob Herring wrote:
> >> On Fri, Aug 25, 2017 at 02:32:30PM +0800, Chen Zhong wrote:
> >> > This patch adds the device tree binding documentation for the MediaTek
> >> > pmic keys found on PMIC MT6397/MT6323.
> >> >
> >> > Signed-off-by: Chen Zhong 
> >> > ---
> >> >  .../devicetree/bindings/input/mtk-pmic-keys.txt|   38 
> >> > 
> >> >  1 file changed, 38 insertions(+)
> >> >  create mode 100644 
> >> > Documentation/devicetree/bindings/input/mtk-pmic-keys.txt
> >> >
> >> > diff --git a/Documentation/devicetree/bindings/input/mtk-pmic-keys.txt 
> >> > b/Documentation/devicetree/bindings/input/mtk-pmic-keys.txt
> >> > new file mode 100644
> >> > index 000..100ec44
> >> > --- /dev/null
> >> > +++ b/Documentation/devicetree/bindings/input/mtk-pmic-keys.txt
> >> > @@ -0,0 +1,38 @@
> >> > +MediaTek MT6397/MT6323 PMIC Keys Device Driver
> >> > +
> >> > +There are two key functions provided by MT6397/MT6323 PMIC, pwrkey
> >> > +and homekey. The key functions are defined as the subnode of the 
> >> > function
> >> > +node provided by MT6397/MT6323 PMIC that is being defined as one kind
> >> > +of Muti-Function Device (MFD)
> >> > +
> >> > +For MT6397/MT6323 MFD bindings see:
> >> > +Documentation/devicetree/bindings/mfd/mt6397.txt
> >> > +
> >> > +Required properties:
> >> > +- compatible: "mediatek,mt6397-keys" or "mediatek,mt6323-keys"
> >> > +- linux,keycodes: Specifies the numeric keycode values to
> >> > +   be used for reporting keys presses. The array can
> >> > +   contain up to 2 entries.
> >> > +
> >> > +Optional Properties:
> >> > +- wakeup-source: each key can be used as a wakeup source.
> >>
> >> wakeup-source is defined as a boolean.
> >
> > Hi Rob,
> >
> > Could I modify it as this?
> >
> > mediatek,wakeup-keys = <1>, <0>;
> > wakeup-source;
> 
> What do the values and index signify? The power key is index 0 and the
> value 1 means enable wakeup? Or each value is the raw key (i.e.
> indexes in linux,keycode) that wakeup is enabled for?
> 
> I don't think this should be in DT really. It's really up to the user
> (i.e. userspace) to decide what keys cause wakeup (or maybe that's
> only suspend). If you default to the power key causes wakeup, do you
> really need to support different options?
> 
> If we do put this in DT, then it should be a common binding to specify
> keys that cause wake-up.
> 
> Rob

Hi Rob,

Yes, we want to describe that power key is index 0 and is a wakeup
source, homekey is index 1 and not a wakeup source.

Since power key and homekey are two real HW keys, customer can decide
which key to be the wakeup source or both due to their hw design, so we
put this in DT and can be different for different boards.

Thank you.

Chen




Re: [PATCH v3 2/5] dt-bindings: input: Add document bindings for mtk-pmic-keys

2017-09-06 Thread Chen Zhong
On Tue, 2017-09-05 at 11:05 -0500, Rob Herring wrote:
> On Fri, Sep 1, 2017 at 9:16 PM, Chen Zhong  wrote:
> > On Thu, 2017-08-31 at 14:52 -0500, Rob Herring wrote:
> >> On Fri, Aug 25, 2017 at 02:32:30PM +0800, Chen Zhong wrote:
> >> > This patch adds the device tree binding documentation for the MediaTek
> >> > pmic keys found on PMIC MT6397/MT6323.
> >> >
> >> > Signed-off-by: Chen Zhong 
> >> > ---
> >> >  .../devicetree/bindings/input/mtk-pmic-keys.txt|   38 
> >> > 
> >> >  1 file changed, 38 insertions(+)
> >> >  create mode 100644 
> >> > Documentation/devicetree/bindings/input/mtk-pmic-keys.txt
> >> >
> >> > diff --git a/Documentation/devicetree/bindings/input/mtk-pmic-keys.txt 
> >> > b/Documentation/devicetree/bindings/input/mtk-pmic-keys.txt
> >> > new file mode 100644
> >> > index 000..100ec44
> >> > --- /dev/null
> >> > +++ b/Documentation/devicetree/bindings/input/mtk-pmic-keys.txt
> >> > @@ -0,0 +1,38 @@
> >> > +MediaTek MT6397/MT6323 PMIC Keys Device Driver
> >> > +
> >> > +There are two key functions provided by MT6397/MT6323 PMIC, pwrkey
> >> > +and homekey. The key functions are defined as the subnode of the 
> >> > function
> >> > +node provided by MT6397/MT6323 PMIC that is being defined as one kind
> >> > +of Muti-Function Device (MFD)
> >> > +
> >> > +For MT6397/MT6323 MFD bindings see:
> >> > +Documentation/devicetree/bindings/mfd/mt6397.txt
> >> > +
> >> > +Required properties:
> >> > +- compatible: "mediatek,mt6397-keys" or "mediatek,mt6323-keys"
> >> > +- linux,keycodes: Specifies the numeric keycode values to
> >> > +   be used for reporting keys presses. The array can
> >> > +   contain up to 2 entries.
> >> > +
> >> > +Optional Properties:
> >> > +- wakeup-source: each key can be used as a wakeup source.
> >>
> >> wakeup-source is defined as a boolean.
> >
> > Hi Rob,
> >
> > Could I modify it as this?
> >
> > mediatek,wakeup-keys = <1>, <0>;
> > wakeup-source;
> 
> What do the values and index signify? The power key is index 0 and the
> value 1 means enable wakeup? Or each value is the raw key (i.e.
> indexes in linux,keycode) that wakeup is enabled for?
> 
> I don't think this should be in DT really. It's really up to the user
> (i.e. userspace) to decide what keys cause wakeup (or maybe that's
> only suspend). If you default to the power key causes wakeup, do you
> really need to support different options?
> 
> If we do put this in DT, then it should be a common binding to specify
> keys that cause wake-up.
> 
> Rob

Hi Rob,

Yes, we want to describe that power key is index 0 and is a wakeup
source, homekey is index 1 and not a wakeup source.

Since power key and homekey are two real HW keys, customer can decide
which key to be the wakeup source or both due to their hw design, so we
put this in DT and can be different for different boards.

Thank you.

Chen




Re: [PATCH v2 2/2] dma: sprd: Add Spreadtrum DMA driver

2017-09-06 Thread Baolin Wang
On 7 September 2017 at 10:48, Baolin Wang  wrote:
> Hi Vinod,
>
> On 7 September 2017 at 00:22, Vinod Koul  wrote:
>> On Tue, Sep 05, 2017 at 07:48:43PM +0800, Baolin Wang wrote:
>>
>>> > > +/* DMA global registers definition */
>>> > > +#define DMA_GLB_PAUSE0x0
>>> > > +#define DMA_GLB_FRAG_WAIT0x4
>>> > > +#define DMA_GLB_REQ_PEND0_EN 0x8
>>> > > +#define DMA_GLB_REQ_PEND1_EN 0xc
>>> > > +#define DMA_GLB_INT_RAW_STS  0x10
>>> > > +#define DMA_GLB_INT_MSK_STS  0x14
>>> > > +#define DMA_GLB_REQ_STS  0x18
>>> > > +#define DMA_GLB_CHN_EN_STS   0x1c
>>> > > +#define DMA_GLB_DEBUG_STS0x20
>>> > > +#define DMA_GLB_ARB_SEL_STS  0x24
>>> > > +#define DMA_GLB_CHN_START_CHN_CFG1   0x28
>>> > > +#define DMA_GLB_CHN_START_CHN_CFG2   0x2c
>>> > > +#define DMA_CHN_LLIST_OFFSET 0x10
>>> > > +#define DMA_GLB_REQ_CID_OFFSET   0x2000
>>> > > +#define DMA_GLB_REQ_CID(uid) (0x4 * ((uid) - 1))
>>> >
>>> > namespaced properly please, here and rest..
>>>
>>> Could you elaborate which name need to named properly?
>>> I guess DMA_GLB_REQ_CID is not very clear, can I change
>>> to DMA_GLB_REQ_UID(uid) which is used to define the UID
>>> registers?
>>
>> I meant something like:
>>
>> SPRD_DMA_
>>
>> DMA_GLB is very generic term and might cause collisions in future so lets
>> make these future proof..
>
> Make sense.
>
>>
>>> > > +static int sprd_dma_enable(struct sprd_dma_dev *sdev)
>>> > > +{
>>> > > + int ret;
>>> > > +
>>> > > + ret = clk_prepare_enable(sdev->clk);
>>> > > + if (ret)
>>> > > + return ret;
>>> > > +
>>> > > + if (!IS_ERR(sdev->ashb_clk))
>>> >
>>> > that looks odd, can you explain this?
>>>
>>> Since the ashb_clk is optional and only for AGCP DMA controller,
>>> so here we add one condition to check if the ashb_clk need enable.
>>
>> it be worth documenting this..
>
> OK.
>
>>
>>> > > +static void sprd_dma_free_chan_resources(struct dma_chan *chan)
>>> > > +{
>>> > > + struct sprd_dma_chn *schan = to_sprd_dma_chan(chan);
>>> > > + unsigned long flags;
>>> > > +
>>> > > + spin_lock_irqsave(>vc.lock, flags);
>>> > > + sprd_dma_stop(schan);
>>> > > + spin_unlock_irqrestore(>vc.lock, flags);
>>> > > +
>>> > > + vchan_free_chan_resources(>vc);
>>> > > + pm_runtime_put_sync(chan->device->dev);
>>> >
>>> > why put_sync()
>>>
>>> Since we will get pm counter to resume DMA when allocating resources, then
>>> we should put pm counter in sprd_dma_free_chan_resources() to try to suspend
>>> DMA if the pm counter is 0.
>>
>> runtime_pm part is fine, you could have used pm_runtime_put(), why the
>> sync() variant here...
>
> After checking again, I agree pm_runtime_put() is enough here and I
> will fix it in next version.
>
>>>
>>> >
>>> > > +static enum dma_status sprd_dma_tx_status(struct dma_chan *chan,
>>> > > +   dma_cookie_t cookie,
>>> > > +   struct dma_tx_state *txstate)
>>> > > +{
>>> > > + struct sprd_dma_chn *schan = to_sprd_dma_chan(chan);
>>> > > + unsigned long flags;
>>> > > + enum dma_status ret;
>>> > > +
>>> > > + ret = dma_cookie_status(chan, cookie, txstate);
>>> > > +
>>> > > + spin_lock_irqsave(>vc.lock, flags);
>>> > > + txstate->residue = sprd_dma_get_dst_addr(schan);
>>> >
>>> > I dont think this is correct, the residue needs to be read only for 
>>> > current
>>> > cookie and the query might be for one which is not even submitted
>>>
>>> We have one scenario for our audio driver, the audio driver need to get
>>> the destination address to check if their transfer is done in no irq mode.
>>
>> Yes but for the descriptor requested but not any. Audio maybe working as
>> period count maybe lesser...

Sorry for missing one comment. I know what is your meaning, now I
understand the residue needs to be read only for current cookie and I
will fix this in next version.

>>
>>> > > +static int sprd_dma_probe(struct platform_device *pdev)
>>> > > +{
>>> > > + struct device_node *np = pdev->dev.of_node;
>>> > > + struct sprd_dma_dev *sdev;
>>> > > + struct sprd_dma_chn *dma_chn;
>>> > > + struct resource *res;
>>> > > + u32 chn_count;
>>> > > + int ret, i;
>>> > > +
>>> > > + ret = of_property_read_u32(np, "#dma-channels", _count);
>>> > > + if (ret) {
>>> > > + dev_err(>dev, "get dma channels count failed\n");
>>> > > + return ret;
>>> > > + }
>>> > > +
>>> > > + sdev = devm_kzalloc(>dev, (sizeof(*sdev) +
>>> > > + (sizeof(struct sprd_dma_chn) * chn_count)),
>>> > > + GFP_KERNEL);
>>> > > + if (!sdev)
>>> > > + return -ENOMEM;
>>> > > +
>>> > > + sdev->clk = devm_clk_get(>dev, "enable");
>>> > > + if (IS_ERR(sdev->clk)) {
>>> > > + dev_err(>dev, "get enable clock failed\n");
>>> > > + return PTR_ERR(sdev->clk);
>>> > > + }
>>> > > +
>>> 

Re: [PATCH v2 2/2] dma: sprd: Add Spreadtrum DMA driver

2017-09-06 Thread Baolin Wang
On 7 September 2017 at 10:48, Baolin Wang  wrote:
> Hi Vinod,
>
> On 7 September 2017 at 00:22, Vinod Koul  wrote:
>> On Tue, Sep 05, 2017 at 07:48:43PM +0800, Baolin Wang wrote:
>>
>>> > > +/* DMA global registers definition */
>>> > > +#define DMA_GLB_PAUSE0x0
>>> > > +#define DMA_GLB_FRAG_WAIT0x4
>>> > > +#define DMA_GLB_REQ_PEND0_EN 0x8
>>> > > +#define DMA_GLB_REQ_PEND1_EN 0xc
>>> > > +#define DMA_GLB_INT_RAW_STS  0x10
>>> > > +#define DMA_GLB_INT_MSK_STS  0x14
>>> > > +#define DMA_GLB_REQ_STS  0x18
>>> > > +#define DMA_GLB_CHN_EN_STS   0x1c
>>> > > +#define DMA_GLB_DEBUG_STS0x20
>>> > > +#define DMA_GLB_ARB_SEL_STS  0x24
>>> > > +#define DMA_GLB_CHN_START_CHN_CFG1   0x28
>>> > > +#define DMA_GLB_CHN_START_CHN_CFG2   0x2c
>>> > > +#define DMA_CHN_LLIST_OFFSET 0x10
>>> > > +#define DMA_GLB_REQ_CID_OFFSET   0x2000
>>> > > +#define DMA_GLB_REQ_CID(uid) (0x4 * ((uid) - 1))
>>> >
>>> > namespaced properly please, here and rest..
>>>
>>> Could you elaborate which name need to named properly?
>>> I guess DMA_GLB_REQ_CID is not very clear, can I change
>>> to DMA_GLB_REQ_UID(uid) which is used to define the UID
>>> registers?
>>
>> I meant something like:
>>
>> SPRD_DMA_
>>
>> DMA_GLB is very generic term and might cause collisions in future so lets
>> make these future proof..
>
> Make sense.
>
>>
>>> > > +static int sprd_dma_enable(struct sprd_dma_dev *sdev)
>>> > > +{
>>> > > + int ret;
>>> > > +
>>> > > + ret = clk_prepare_enable(sdev->clk);
>>> > > + if (ret)
>>> > > + return ret;
>>> > > +
>>> > > + if (!IS_ERR(sdev->ashb_clk))
>>> >
>>> > that looks odd, can you explain this?
>>>
>>> Since the ashb_clk is optional and only for AGCP DMA controller,
>>> so here we add one condition to check if the ashb_clk need enable.
>>
>> it be worth documenting this..
>
> OK.
>
>>
>>> > > +static void sprd_dma_free_chan_resources(struct dma_chan *chan)
>>> > > +{
>>> > > + struct sprd_dma_chn *schan = to_sprd_dma_chan(chan);
>>> > > + unsigned long flags;
>>> > > +
>>> > > + spin_lock_irqsave(>vc.lock, flags);
>>> > > + sprd_dma_stop(schan);
>>> > > + spin_unlock_irqrestore(>vc.lock, flags);
>>> > > +
>>> > > + vchan_free_chan_resources(>vc);
>>> > > + pm_runtime_put_sync(chan->device->dev);
>>> >
>>> > why put_sync()
>>>
>>> Since we will get pm counter to resume DMA when allocating resources, then
>>> we should put pm counter in sprd_dma_free_chan_resources() to try to suspend
>>> DMA if the pm counter is 0.
>>
>> runtime_pm part is fine, you could have used pm_runtime_put(), why the
>> sync() variant here...
>
> After checking again, I agree pm_runtime_put() is enough here and I
> will fix it in next version.
>
>>>
>>> >
>>> > > +static enum dma_status sprd_dma_tx_status(struct dma_chan *chan,
>>> > > +   dma_cookie_t cookie,
>>> > > +   struct dma_tx_state *txstate)
>>> > > +{
>>> > > + struct sprd_dma_chn *schan = to_sprd_dma_chan(chan);
>>> > > + unsigned long flags;
>>> > > + enum dma_status ret;
>>> > > +
>>> > > + ret = dma_cookie_status(chan, cookie, txstate);
>>> > > +
>>> > > + spin_lock_irqsave(>vc.lock, flags);
>>> > > + txstate->residue = sprd_dma_get_dst_addr(schan);
>>> >
>>> > I dont think this is correct, the residue needs to be read only for 
>>> > current
>>> > cookie and the query might be for one which is not even submitted
>>>
>>> We have one scenario for our audio driver, the audio driver need to get
>>> the destination address to check if their transfer is done in no irq mode.
>>
>> Yes but for the descriptor requested but not any. Audio maybe working as
>> period count maybe lesser...

Sorry for missing one comment. I know what is your meaning, now I
understand the residue needs to be read only for current cookie and I
will fix this in next version.

>>
>>> > > +static int sprd_dma_probe(struct platform_device *pdev)
>>> > > +{
>>> > > + struct device_node *np = pdev->dev.of_node;
>>> > > + struct sprd_dma_dev *sdev;
>>> > > + struct sprd_dma_chn *dma_chn;
>>> > > + struct resource *res;
>>> > > + u32 chn_count;
>>> > > + int ret, i;
>>> > > +
>>> > > + ret = of_property_read_u32(np, "#dma-channels", _count);
>>> > > + if (ret) {
>>> > > + dev_err(>dev, "get dma channels count failed\n");
>>> > > + return ret;
>>> > > + }
>>> > > +
>>> > > + sdev = devm_kzalloc(>dev, (sizeof(*sdev) +
>>> > > + (sizeof(struct sprd_dma_chn) * chn_count)),
>>> > > + GFP_KERNEL);
>>> > > + if (!sdev)
>>> > > + return -ENOMEM;
>>> > > +
>>> > > + sdev->clk = devm_clk_get(>dev, "enable");
>>> > > + if (IS_ERR(sdev->clk)) {
>>> > > + dev_err(>dev, "get enable clock failed\n");
>>> > > + return PTR_ERR(sdev->clk);
>>> > > + }
>>> > > +
>>> > > + /* ashb clock is optional for AGCP DMA 

[PATCH v1] [media] uvcvideo: mark buffer error where overflow

2017-09-06 Thread Baoyou Xie
Some cameras post inaccurate frame where next frame data overlap
it. this results in screen flicker, and it need to be prevented.

So this patch marks the buffer error to discard the frame where
buffer overflow.

Signed-off-by: Baoyou Xie 
---
 drivers/media/usb/uvc/uvc_video.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/media/usb/uvc/uvc_video.c 
b/drivers/media/usb/uvc/uvc_video.c
index fb86d6a..81a3530 100644
--- a/drivers/media/usb/uvc/uvc_video.c
+++ b/drivers/media/usb/uvc/uvc_video.c
@@ -1077,6 +1077,7 @@ static void uvc_video_decode_data(struct uvc_streaming 
*stream,
/* Complete the current frame if the buffer size was exceeded. */
if (len > maxlen) {
uvc_trace(UVC_TRACE_FRAME, "Frame complete (overflow).\n");
+   buf->error = 1;
buf->state = UVC_BUF_STATE_READY;
}
 }
-- 
2.7.4



[PATCH v1] [media] uvcvideo: mark buffer error where overflow

2017-09-06 Thread Baoyou Xie
Some cameras post inaccurate frame where next frame data overlap
it. this results in screen flicker, and it need to be prevented.

So this patch marks the buffer error to discard the frame where
buffer overflow.

Signed-off-by: Baoyou Xie 
---
 drivers/media/usb/uvc/uvc_video.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/media/usb/uvc/uvc_video.c 
b/drivers/media/usb/uvc/uvc_video.c
index fb86d6a..81a3530 100644
--- a/drivers/media/usb/uvc/uvc_video.c
+++ b/drivers/media/usb/uvc/uvc_video.c
@@ -1077,6 +1077,7 @@ static void uvc_video_decode_data(struct uvc_streaming 
*stream,
/* Complete the current frame if the buffer size was exceeded. */
if (len > maxlen) {
uvc_trace(UVC_TRACE_FRAME, "Frame complete (overflow).\n");
+   buf->error = 1;
buf->state = UVC_BUF_STATE_READY;
}
 }
-- 
2.7.4



Re: [PATCH 4/4][RFC v2] x86/apic: Spread the vectors by choosing the idlest CPU

2017-09-06 Thread Yu Chen
On Wed, Sep 06, 2017 at 10:46:17AM -0700, Dan Williams wrote:
> On Tue, Sep 5, 2017 at 11:15 PM, Christoph Hellwig  wrote:
> > On Wed, Sep 06, 2017 at 12:13:38PM +0800, Yu Chen wrote:
> >> I agree, the driver could be rewritten, but it might take some time, so
> >> meanwhile I'm looking at also other possible optimization.
> >
> > Which driver are we talking about anyway?  Let's start looking at it
> > and fix the issue there.
> 
> As far as I understand, it's already fixed there:
> 
> commit 7c9ae7f053e9e896c24fd23595ba369a5fe322e1
> Author: Carolyn Wyborny 
> Date:   Tue Jun 20 15:16:53 2017 -0700
> 
> i40e: Fix for trace found with S4 state
> 
> This patch fixes a problem found in systems when entering
> S4 state.  This patch fixes the problem by ensuring that
> the misc vector's IRQ is disabled as well.  Without this
> patch a stack trace can be seen upon entering S4 state.
> 
> However this seems like something that should be handled generically
> in the irq-core especially since commit c5cb83bb337c
> "genirq/cpuhotplug: Handle managed IRQs on CPU hotplug" was headed in
> that direction. It's otherwise non-obvious when a driver needs to
> release and re-acquire interrupts or be reworked to use managed
> interrupts.
Yes, thanks for the explaination! I did not notice this patch has
been merged already.
I'm using the normal CPU hotplug to reproduce the issue:
#!/bin/bash

n=1

while [ $n -le 31 ]
do
echo 0 > /sys/devices/system/cpu/cpu${n}/online
n=$(( n+1 ))
done

Thanks,
Yu


Re: [PATCH 4/4][RFC v2] x86/apic: Spread the vectors by choosing the idlest CPU

2017-09-06 Thread Yu Chen
On Wed, Sep 06, 2017 at 10:46:17AM -0700, Dan Williams wrote:
> On Tue, Sep 5, 2017 at 11:15 PM, Christoph Hellwig  wrote:
> > On Wed, Sep 06, 2017 at 12:13:38PM +0800, Yu Chen wrote:
> >> I agree, the driver could be rewritten, but it might take some time, so
> >> meanwhile I'm looking at also other possible optimization.
> >
> > Which driver are we talking about anyway?  Let's start looking at it
> > and fix the issue there.
> 
> As far as I understand, it's already fixed there:
> 
> commit 7c9ae7f053e9e896c24fd23595ba369a5fe322e1
> Author: Carolyn Wyborny 
> Date:   Tue Jun 20 15:16:53 2017 -0700
> 
> i40e: Fix for trace found with S4 state
> 
> This patch fixes a problem found in systems when entering
> S4 state.  This patch fixes the problem by ensuring that
> the misc vector's IRQ is disabled as well.  Without this
> patch a stack trace can be seen upon entering S4 state.
> 
> However this seems like something that should be handled generically
> in the irq-core especially since commit c5cb83bb337c
> "genirq/cpuhotplug: Handle managed IRQs on CPU hotplug" was headed in
> that direction. It's otherwise non-obvious when a driver needs to
> release and re-acquire interrupts or be reworked to use managed
> interrupts.
Yes, thanks for the explaination! I did not notice this patch has
been merged already.
I'm using the normal CPU hotplug to reproduce the issue:
#!/bin/bash

n=1

while [ $n -le 31 ]
do
echo 0 > /sys/devices/system/cpu/cpu${n}/online
n=$(( n+1 ))
done

Thanks,
Yu


[PATCH 0/2] Fix resume failure due to PCID

2017-09-06 Thread Andy Lutomirski
Patch 1 is the fix.  Patch 2 is a comment that would have kept me from
chasing down a false lead.

I've tested patch 2 using CPU hotplug and suspend/resume.  I haven't
tested hibernation or kexec because I don't know how.  (If I do
systemctl hibernate on my laptop, it happily writes out a hiberation
image somewhere and then it equally happily ignores it on the next
boot.  I don't know how to test kexec.)

I haven't tested the 32-bit version.  I'll try to get to that
tomorrow.

Andy Lutomirski (2):
  x86/mm: Reinitialize TLB state on hotplug and resume
  x86/mm: Document how CR4.PCIDE restore works

 arch/x86/include/asm/tlbflush.h |  2 ++
 arch/x86/kernel/cpu/common.c| 15 ++
 arch/x86/mm/tlb.c   | 44 +
 arch/x86/power/cpu.c|  1 +
 4 files changed, 62 insertions(+)

-- 
2.13.5



[PATCH 0/2] Fix resume failure due to PCID

2017-09-06 Thread Andy Lutomirski
Patch 1 is the fix.  Patch 2 is a comment that would have kept me from
chasing down a false lead.

I've tested patch 2 using CPU hotplug and suspend/resume.  I haven't
tested hibernation or kexec because I don't know how.  (If I do
systemctl hibernate on my laptop, it happily writes out a hiberation
image somewhere and then it equally happily ignores it on the next
boot.  I don't know how to test kexec.)

I haven't tested the 32-bit version.  I'll try to get to that
tomorrow.

Andy Lutomirski (2):
  x86/mm: Reinitialize TLB state on hotplug and resume
  x86/mm: Document how CR4.PCIDE restore works

 arch/x86/include/asm/tlbflush.h |  2 ++
 arch/x86/kernel/cpu/common.c| 15 ++
 arch/x86/mm/tlb.c   | 44 +
 arch/x86/power/cpu.c|  1 +
 4 files changed, 62 insertions(+)

-- 
2.13.5



[PATCH 2/2] x86/mm: Document how CR4.PCIDE restore works

2017-09-06 Thread Andy Lutomirski
While debugging a problem, I thought that using
cr4_set_bits_and_update_boot() to restore CR4.PCIDE would be
helpful.  It turns out to be counterproductive.

Add a comment documenting how this works.

Signed-off-by: Andy Lutomirski 
---
 arch/x86/kernel/cpu/common.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 40cb4d0a5982..4c31a6585333 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -333,6 +333,19 @@ static void setup_pcid(struct cpuinfo_x86 *c)
 {
if (cpu_has(c, X86_FEATURE_PCID)) {
if (cpu_has(c, X86_FEATURE_PGE)) {
+   /*
+* We'd like to use cr4_set_bits_and_update_boot(),
+* but we can't.  CR4.PCIDE is special and can only
+* be set in long mode, and the early CPU init code
+* doesn't know this and would try to restore CR4.PCIDE
+* prior to entering long mode.
+*
+* Instead, we rely on the fact that hotplug, resume,
+* etc all fully restore CR4 before they write anything
+* that could have nonzero PCID bits to CR3.  CR4.PCIDE
+* has no effect on the page tables themselves, so we
+* don't need it top be restored early.
+*/
cr4_set_bits(X86_CR4_PCIDE);
} else {
/*
-- 
2.13.5



[PATCH 1/2] x86/mm: Reinitialize TLB state on hotplug and resume

2017-09-06 Thread Andy Lutomirski
When Linux brings a CPU down and back up, it switches to init_mm and then
loads swapper_pg_dir into CR3.  With PCID enabled, this has the side effect
of masking off the ASID bits in CR3.

This can result in some confusion in the TLB handling code.  If we
bring a CPU down and back up with any ASID other than 0, we end up
with the wrong ASID active on the CPU after resume.  This could
cause our internal state to become corrupt, although major
corruption is unlikely because init_mm doesn't have any user pages.
More obviously, if CONFIG_DEBUG_VM=y, we'll trip over an assertion
in the next context switch.  The result of *that* is a failure to
resume from suspend with probability 1 - 1/6^(cpus-1).

Fix it by reinitializing cpu_tlbstate on resume and CPU bringup.

Reported-by: Linus Torvalds 
Reported-by: Jiri Kosina 
Fixes: 10af6235e0d3 ("x86/mm: Implement PCID based optimization: try to 
preserve old TLB entries using PCID")
Signed-off-by: Andy Lutomirski 
---
 arch/x86/include/asm/tlbflush.h |  2 ++
 arch/x86/kernel/cpu/common.c|  2 ++
 arch/x86/mm/tlb.c   | 44 +
 arch/x86/power/cpu.c|  1 +
 4 files changed, 49 insertions(+)

diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
index d23e61dc0640..4893abf7f74f 100644
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -198,6 +198,8 @@ static inline void cr4_set_bits_and_update_boot(unsigned 
long mask)
cr4_set_bits(mask);
 }
 
+extern void initialize_tlbstate_and_flush(void);
+
 static inline void __native_flush_tlb(void)
 {
/*
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index efba8e3da3e2..40cb4d0a5982 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1583,6 +1583,7 @@ void cpu_init(void)
mmgrab(_mm);
me->active_mm = _mm;
BUG_ON(me->mm);
+   initialize_tlbstate_and_flush();
enter_lazy_tlb(_mm, me);
 
load_sp0(t, >thread);
@@ -1637,6 +1638,7 @@ void cpu_init(void)
mmgrab(_mm);
curr->active_mm = _mm;
BUG_ON(curr->mm);
+   initialize_tlbstate_and_flush();
enter_lazy_tlb(_mm, curr);
 
load_sp0(t, thread);
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index ce104b962a17..dbbcfd59726a 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -214,6 +214,50 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct 
mm_struct *next,
 }
 
 /*
+ * Call this when reinitializing a CPU.  It fixes the following potential
+ * problems:
+ *
+ * - The ASID changed from what cpu_tlbstate thinks it is (most likely
+ *   because the CPU was taken down and came back up with CR3's PCID
+ *   bits clear.  CPU hotplug can do this.
+ *
+ * - The TLB contains junk in slots corresponding to inactive ASIDs.
+ *
+ * - The CPU went so far out to lunch that it may have missed a TLB
+ *   flush.
+ */
+void initialize_tlbstate_and_flush(void)
+{
+   int i;
+   struct mm_struct *mm = this_cpu_read(cpu_tlbstate.loaded_mm);
+   u64 tlb_gen = atomic64_read(_mm.context.tlb_gen);
+   unsigned long cr3 = __read_cr3();
+
+   /* Assert that CR3 already references the right mm. */
+   WARN_ON((cr3 & CR3_ADDR_MASK) != __pa(mm->pgd));
+
+   /*
+* Assert that CR4.PCIDE is set if needed.  (CR4.PCIDE initialization
+* doesn't work like other CR4 bits because it can only be set from
+* long mode.)
+*/
+   WARN_ON(boot_cpu_has(X86_CR4_PCIDE) &&
+   !(cr4_read_shadow() & X86_CR4_PCIDE));
+
+   /* Force ASID 0 and force a TLB flush. */
+   write_cr3(cr3 & ~CR3_PCID_MASK);
+
+   /* Reinitialize tlbstate. */
+   this_cpu_write(cpu_tlbstate.loaded_mm_asid, 0);
+   this_cpu_write(cpu_tlbstate.next_asid, 1);
+   this_cpu_write(cpu_tlbstate.ctxs[0].ctx_id, mm->context.ctx_id);
+   this_cpu_write(cpu_tlbstate.ctxs[0].tlb_gen, tlb_gen);
+
+   for (i = 1; i < TLB_NR_DYN_ASIDS; i++)
+   this_cpu_write(cpu_tlbstate.ctxs[i].ctx_id, 0);
+}
+
+/*
  * flush_tlb_func_common()'s memory ordering requirement is that any
  * TLB fills that happen after we flush the TLB are ordered after we
  * read active_mm's tlb_gen.  We don't need any explicit barriers
diff --git a/arch/x86/power/cpu.c b/arch/x86/power/cpu.c
index 78459a6d455a..4d68d59f457d 100644
--- a/arch/x86/power/cpu.c
+++ b/arch/x86/power/cpu.c
@@ -181,6 +181,7 @@ static void fix_processor_context(void)
 #endif
load_TR_desc(); /* This does ltr */
load_mm_ldt(current->active_mm);/* This does lldt */
+   initialize_tlbstate_and_flush();
 
fpu__resume_cpu();
 
-- 
2.13.5



[PATCH 2/2] x86/mm: Document how CR4.PCIDE restore works

2017-09-06 Thread Andy Lutomirski
While debugging a problem, I thought that using
cr4_set_bits_and_update_boot() to restore CR4.PCIDE would be
helpful.  It turns out to be counterproductive.

Add a comment documenting how this works.

Signed-off-by: Andy Lutomirski 
---
 arch/x86/kernel/cpu/common.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 40cb4d0a5982..4c31a6585333 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -333,6 +333,19 @@ static void setup_pcid(struct cpuinfo_x86 *c)
 {
if (cpu_has(c, X86_FEATURE_PCID)) {
if (cpu_has(c, X86_FEATURE_PGE)) {
+   /*
+* We'd like to use cr4_set_bits_and_update_boot(),
+* but we can't.  CR4.PCIDE is special and can only
+* be set in long mode, and the early CPU init code
+* doesn't know this and would try to restore CR4.PCIDE
+* prior to entering long mode.
+*
+* Instead, we rely on the fact that hotplug, resume,
+* etc all fully restore CR4 before they write anything
+* that could have nonzero PCID bits to CR3.  CR4.PCIDE
+* has no effect on the page tables themselves, so we
+* don't need it top be restored early.
+*/
cr4_set_bits(X86_CR4_PCIDE);
} else {
/*
-- 
2.13.5



[PATCH 1/2] x86/mm: Reinitialize TLB state on hotplug and resume

2017-09-06 Thread Andy Lutomirski
When Linux brings a CPU down and back up, it switches to init_mm and then
loads swapper_pg_dir into CR3.  With PCID enabled, this has the side effect
of masking off the ASID bits in CR3.

This can result in some confusion in the TLB handling code.  If we
bring a CPU down and back up with any ASID other than 0, we end up
with the wrong ASID active on the CPU after resume.  This could
cause our internal state to become corrupt, although major
corruption is unlikely because init_mm doesn't have any user pages.
More obviously, if CONFIG_DEBUG_VM=y, we'll trip over an assertion
in the next context switch.  The result of *that* is a failure to
resume from suspend with probability 1 - 1/6^(cpus-1).

Fix it by reinitializing cpu_tlbstate on resume and CPU bringup.

Reported-by: Linus Torvalds 
Reported-by: Jiri Kosina 
Fixes: 10af6235e0d3 ("x86/mm: Implement PCID based optimization: try to 
preserve old TLB entries using PCID")
Signed-off-by: Andy Lutomirski 
---
 arch/x86/include/asm/tlbflush.h |  2 ++
 arch/x86/kernel/cpu/common.c|  2 ++
 arch/x86/mm/tlb.c   | 44 +
 arch/x86/power/cpu.c|  1 +
 4 files changed, 49 insertions(+)

diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
index d23e61dc0640..4893abf7f74f 100644
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -198,6 +198,8 @@ static inline void cr4_set_bits_and_update_boot(unsigned 
long mask)
cr4_set_bits(mask);
 }
 
+extern void initialize_tlbstate_and_flush(void);
+
 static inline void __native_flush_tlb(void)
 {
/*
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index efba8e3da3e2..40cb4d0a5982 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1583,6 +1583,7 @@ void cpu_init(void)
mmgrab(_mm);
me->active_mm = _mm;
BUG_ON(me->mm);
+   initialize_tlbstate_and_flush();
enter_lazy_tlb(_mm, me);
 
load_sp0(t, >thread);
@@ -1637,6 +1638,7 @@ void cpu_init(void)
mmgrab(_mm);
curr->active_mm = _mm;
BUG_ON(curr->mm);
+   initialize_tlbstate_and_flush();
enter_lazy_tlb(_mm, curr);
 
load_sp0(t, thread);
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index ce104b962a17..dbbcfd59726a 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -214,6 +214,50 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct 
mm_struct *next,
 }
 
 /*
+ * Call this when reinitializing a CPU.  It fixes the following potential
+ * problems:
+ *
+ * - The ASID changed from what cpu_tlbstate thinks it is (most likely
+ *   because the CPU was taken down and came back up with CR3's PCID
+ *   bits clear.  CPU hotplug can do this.
+ *
+ * - The TLB contains junk in slots corresponding to inactive ASIDs.
+ *
+ * - The CPU went so far out to lunch that it may have missed a TLB
+ *   flush.
+ */
+void initialize_tlbstate_and_flush(void)
+{
+   int i;
+   struct mm_struct *mm = this_cpu_read(cpu_tlbstate.loaded_mm);
+   u64 tlb_gen = atomic64_read(_mm.context.tlb_gen);
+   unsigned long cr3 = __read_cr3();
+
+   /* Assert that CR3 already references the right mm. */
+   WARN_ON((cr3 & CR3_ADDR_MASK) != __pa(mm->pgd));
+
+   /*
+* Assert that CR4.PCIDE is set if needed.  (CR4.PCIDE initialization
+* doesn't work like other CR4 bits because it can only be set from
+* long mode.)
+*/
+   WARN_ON(boot_cpu_has(X86_CR4_PCIDE) &&
+   !(cr4_read_shadow() & X86_CR4_PCIDE));
+
+   /* Force ASID 0 and force a TLB flush. */
+   write_cr3(cr3 & ~CR3_PCID_MASK);
+
+   /* Reinitialize tlbstate. */
+   this_cpu_write(cpu_tlbstate.loaded_mm_asid, 0);
+   this_cpu_write(cpu_tlbstate.next_asid, 1);
+   this_cpu_write(cpu_tlbstate.ctxs[0].ctx_id, mm->context.ctx_id);
+   this_cpu_write(cpu_tlbstate.ctxs[0].tlb_gen, tlb_gen);
+
+   for (i = 1; i < TLB_NR_DYN_ASIDS; i++)
+   this_cpu_write(cpu_tlbstate.ctxs[i].ctx_id, 0);
+}
+
+/*
  * flush_tlb_func_common()'s memory ordering requirement is that any
  * TLB fills that happen after we flush the TLB are ordered after we
  * read active_mm's tlb_gen.  We don't need any explicit barriers
diff --git a/arch/x86/power/cpu.c b/arch/x86/power/cpu.c
index 78459a6d455a..4d68d59f457d 100644
--- a/arch/x86/power/cpu.c
+++ b/arch/x86/power/cpu.c
@@ -181,6 +181,7 @@ static void fix_processor_context(void)
 #endif
load_TR_desc(); /* This does ltr */
load_mm_ldt(current->active_mm);/* This does lldt */
+   initialize_tlbstate_and_flush();
 
fpu__resume_cpu();
 
-- 
2.13.5



Re: [PATCH 4/4][RFC v2] x86/apic: Spread the vectors by choosing the idlest CPU

2017-09-06 Thread Yu Chen
On Wed, Sep 06, 2017 at 10:03:58AM +0200, Thomas Gleixner wrote:
> On Wed, 6 Sep 2017, Yu Chen wrote:
> > On Wed, Sep 06, 2017 at 12:57:41AM +0200, Thomas Gleixner wrote:
> > > I have a hard time to figure out how the 133 vectors on CPU31 are now
> > > magically fitting in the empty space on CPU0, which is 204 - 133 = 71. In
> > > my limited understanding of math 133 is greater than 71, but your patch
> > > might make that magically be wrong.
> > >
> > The problem is reproduced when the network cable is not plugged in,
> > because this driver looks like this:
> > 
> > step 1. Reserved enough irq vectors and corresponding IRQs.
> > step 2. If the network is activated, invoke request_irq() to
> > register the handler.
> > step 3. Invoke set_affinity() to spread the IRQs onto different
> > CPUs, thus to spread the vectors too.
> > 
> > Here's my understanding for why spreading vectors might help for this
> > special case: 
> > As step 2 will not get invoked, the IRQs of this driver
> > has not been enabled, thus in migrate_one_irq() this IRQ
> > will not be considered because there is a check of
> > irqd_is_started(d), thus there should only be 8 vectors
> > allocated by this driver on CPU0, and 8 vectors left on
> > CPU31, and the 8 vectors on CPU31 will not be migrated
> > to CPU0 neither, so there is room for other 'valid' vectors
> > to be migrated to CPU0.
> 
> Can you please spare me repeating your theories, as long as you don't have
> hard facts to back them up? The network cable is changing the symptoms,
> but the underlying root cause is definitely something different.
> 
> > # cat /sys/kernel/debug/irq/domains/*
> > name:   VECTOR
> >  size:   0
> >  mapped: 388
> >  flags:  0x0041
> 
> So we have 388 vectors mapped in total. And those are just device vectors
> because system vectors are not accounted there.
> 
> > name:   IO-APIC-0
> >  size:   24
> >  mapped: 16
> 
> That's the legacy space
> 
> > name:   IO-APIC-1
> >  size:   8
> >  mapped: 2
> 
> > name:   IO-APIC-2
> >  size:   8
> >  mapped: 0
> 
> > name:   IO-APIC-3
> >  size:   8
> >  mapped: 0
> 
> > name:   IO-APIC-4
> >  size:   8
> >  mapped: 5
> 
> And a few GSIs: Total GSIs = 16 + 2 + 5 = 23
> 
> > name:   PCI-MSI-2
> >  size:   0
> >  mapped: 365
> 
> Plus 365 PCI-MSI vectors allocated.
> 
> >  flags:  0x0051
> >  parent: VECTOR
> > name:   VECTOR
> >  size:   0
> >  mapped: 388
> 
> Which nicely sums up to 388
> 
> > # ls /sys/kernel/debug/irq/irqs
> > ls /sys/kernel/debug/irq/irqs
> > 0  10   11  13  142  184  217  259  292  31  33   337  339
> > 340  342  344  346  348  350  352  354  356  358  360  362
> > 364  366  368  370  372  374  376  378  380  382  384  386
> > 388  390  392  394  4  6   7  9  1  109  12  14  15   2
> > 24   26   332  335  338  34   341  343  345  347  349
> > 351  353  355  357  359  361  363  365  367  369  371  373
> > 375  377  379  381  383  385  387  389  391  393  395  5
> > 67  8
> 
> That are all interrupts which are active. That's a total of 89. Can you
> explain where the delta of 299 vectors comes from?
> 
> 299 allocated, vector mapped, but unused interrupts?
> 
> That's where your problem is, not in the vector spreading. You have a
> massive leak.
> 
> > BTW, do we have sysfs to display how much vectors used on each CPUs?
> 
> Not yet.
> 
> Can you please apply the debug patch below, boot the machine and right
> after login provide the output of
> 
> # cat /sys/kernel/debug/tracing/trace
>
Ok, I've tested on top of 4.13, here're the results:

# cat /sys/kernel/debug/tracing/trace

# tracer: nop
#
#  _-=> irqs-off
# / _=> need-resched
#| / _---=> hardirq/softirq
#|| / _--=> preempt-depth
#||| / delay
#   TASK-PID   CPU#  TIMESTAMP  FUNCTION
#  | |   |      | |
 kworker/0:0-3 [000]  7.373461: msi_domain_alloc_irqs: dev: 
:00:1c.0 nvec 1 virq 24
 kworker/0:0-3 [000]  7.373666: msi_domain_alloc_irqs: dev: 
:b2:00.0 nvec 1 virq 26
 kworker/0:0-3 [000]  7.578980: msi_domain_alloc_irqs: dev: 
:00:11.5 nvec 1 virq 31
 kworker/0:0-3 [000]  7.676938: msi_domain_alloc_irqs: dev: 
:00:17.0 nvec 1 virq 32
 kworker/0:0-3 [000]  8.017664: msi_domain_alloc_irqs: dev: 
:00:14.0 nvec 1 virq 33
 kworker/0:2-303   [000]  9.135467: msi_domain_alloc_irqs: dev: 
:bb:00.0 nvec 1 virq 34
 kworker/0:2-303   [000]  9.135476: msi_domain_alloc_irqs: dev: 
:bb:00.0 nvec 1 virq 35
 kworker/0:2-303   [000]  9.135484: msi_domain_alloc_irqs: dev: 
:bb:00.0 nvec 1 virq 36
 kworker/0:2-303   [000]  9.135492: msi_domain_alloc_irqs: dev: 
:bb:00.0 nvec 1 virq 37
 kworker/0:2-303   [000]  9.135499: msi_domain_alloc_irqs: 

Re: [PATCH 4/4][RFC v2] x86/apic: Spread the vectors by choosing the idlest CPU

2017-09-06 Thread Yu Chen
On Wed, Sep 06, 2017 at 10:03:58AM +0200, Thomas Gleixner wrote:
> On Wed, 6 Sep 2017, Yu Chen wrote:
> > On Wed, Sep 06, 2017 at 12:57:41AM +0200, Thomas Gleixner wrote:
> > > I have a hard time to figure out how the 133 vectors on CPU31 are now
> > > magically fitting in the empty space on CPU0, which is 204 - 133 = 71. In
> > > my limited understanding of math 133 is greater than 71, but your patch
> > > might make that magically be wrong.
> > >
> > The problem is reproduced when the network cable is not plugged in,
> > because this driver looks like this:
> > 
> > step 1. Reserved enough irq vectors and corresponding IRQs.
> > step 2. If the network is activated, invoke request_irq() to
> > register the handler.
> > step 3. Invoke set_affinity() to spread the IRQs onto different
> > CPUs, thus to spread the vectors too.
> > 
> > Here's my understanding for why spreading vectors might help for this
> > special case: 
> > As step 2 will not get invoked, the IRQs of this driver
> > has not been enabled, thus in migrate_one_irq() this IRQ
> > will not be considered because there is a check of
> > irqd_is_started(d), thus there should only be 8 vectors
> > allocated by this driver on CPU0, and 8 vectors left on
> > CPU31, and the 8 vectors on CPU31 will not be migrated
> > to CPU0 neither, so there is room for other 'valid' vectors
> > to be migrated to CPU0.
> 
> Can you please spare me repeating your theories, as long as you don't have
> hard facts to back them up? The network cable is changing the symptoms,
> but the underlying root cause is definitely something different.
> 
> > # cat /sys/kernel/debug/irq/domains/*
> > name:   VECTOR
> >  size:   0
> >  mapped: 388
> >  flags:  0x0041
> 
> So we have 388 vectors mapped in total. And those are just device vectors
> because system vectors are not accounted there.
> 
> > name:   IO-APIC-0
> >  size:   24
> >  mapped: 16
> 
> That's the legacy space
> 
> > name:   IO-APIC-1
> >  size:   8
> >  mapped: 2
> 
> > name:   IO-APIC-2
> >  size:   8
> >  mapped: 0
> 
> > name:   IO-APIC-3
> >  size:   8
> >  mapped: 0
> 
> > name:   IO-APIC-4
> >  size:   8
> >  mapped: 5
> 
> And a few GSIs: Total GSIs = 16 + 2 + 5 = 23
> 
> > name:   PCI-MSI-2
> >  size:   0
> >  mapped: 365
> 
> Plus 365 PCI-MSI vectors allocated.
> 
> >  flags:  0x0051
> >  parent: VECTOR
> > name:   VECTOR
> >  size:   0
> >  mapped: 388
> 
> Which nicely sums up to 388
> 
> > # ls /sys/kernel/debug/irq/irqs
> > ls /sys/kernel/debug/irq/irqs
> > 0  10   11  13  142  184  217  259  292  31  33   337  339
> > 340  342  344  346  348  350  352  354  356  358  360  362
> > 364  366  368  370  372  374  376  378  380  382  384  386
> > 388  390  392  394  4  6   7  9  1  109  12  14  15   2
> > 24   26   332  335  338  34   341  343  345  347  349
> > 351  353  355  357  359  361  363  365  367  369  371  373
> > 375  377  379  381  383  385  387  389  391  393  395  5
> > 67  8
> 
> That are all interrupts which are active. That's a total of 89. Can you
> explain where the delta of 299 vectors comes from?
> 
> 299 allocated, vector mapped, but unused interrupts?
> 
> That's where your problem is, not in the vector spreading. You have a
> massive leak.
> 
> > BTW, do we have sysfs to display how much vectors used on each CPUs?
> 
> Not yet.
> 
> Can you please apply the debug patch below, boot the machine and right
> after login provide the output of
> 
> # cat /sys/kernel/debug/tracing/trace
>
Ok, I've tested on top of 4.13, here're the results:

# cat /sys/kernel/debug/tracing/trace

# tracer: nop
#
#  _-=> irqs-off
# / _=> need-resched
#| / _---=> hardirq/softirq
#|| / _--=> preempt-depth
#||| / delay
#   TASK-PID   CPU#  TIMESTAMP  FUNCTION
#  | |   |      | |
 kworker/0:0-3 [000]  7.373461: msi_domain_alloc_irqs: dev: 
:00:1c.0 nvec 1 virq 24
 kworker/0:0-3 [000]  7.373666: msi_domain_alloc_irqs: dev: 
:b2:00.0 nvec 1 virq 26
 kworker/0:0-3 [000]  7.578980: msi_domain_alloc_irqs: dev: 
:00:11.5 nvec 1 virq 31
 kworker/0:0-3 [000]  7.676938: msi_domain_alloc_irqs: dev: 
:00:17.0 nvec 1 virq 32
 kworker/0:0-3 [000]  8.017664: msi_domain_alloc_irqs: dev: 
:00:14.0 nvec 1 virq 33
 kworker/0:2-303   [000]  9.135467: msi_domain_alloc_irqs: dev: 
:bb:00.0 nvec 1 virq 34
 kworker/0:2-303   [000]  9.135476: msi_domain_alloc_irqs: dev: 
:bb:00.0 nvec 1 virq 35
 kworker/0:2-303   [000]  9.135484: msi_domain_alloc_irqs: dev: 
:bb:00.0 nvec 1 virq 36
 kworker/0:2-303   [000]  9.135492: msi_domain_alloc_irqs: dev: 
:bb:00.0 nvec 1 virq 37
 kworker/0:2-303   [000]  9.135499: msi_domain_alloc_irqs: 

Re: [PATCH v2 2/2] dma: sprd: Add Spreadtrum DMA driver

2017-09-06 Thread Baolin Wang
Hi Vinod,

On 7 September 2017 at 00:22, Vinod Koul  wrote:
> On Tue, Sep 05, 2017 at 07:48:43PM +0800, Baolin Wang wrote:
>
>> > > +/* DMA global registers definition */
>> > > +#define DMA_GLB_PAUSE0x0
>> > > +#define DMA_GLB_FRAG_WAIT0x4
>> > > +#define DMA_GLB_REQ_PEND0_EN 0x8
>> > > +#define DMA_GLB_REQ_PEND1_EN 0xc
>> > > +#define DMA_GLB_INT_RAW_STS  0x10
>> > > +#define DMA_GLB_INT_MSK_STS  0x14
>> > > +#define DMA_GLB_REQ_STS  0x18
>> > > +#define DMA_GLB_CHN_EN_STS   0x1c
>> > > +#define DMA_GLB_DEBUG_STS0x20
>> > > +#define DMA_GLB_ARB_SEL_STS  0x24
>> > > +#define DMA_GLB_CHN_START_CHN_CFG1   0x28
>> > > +#define DMA_GLB_CHN_START_CHN_CFG2   0x2c
>> > > +#define DMA_CHN_LLIST_OFFSET 0x10
>> > > +#define DMA_GLB_REQ_CID_OFFSET   0x2000
>> > > +#define DMA_GLB_REQ_CID(uid) (0x4 * ((uid) - 1))
>> >
>> > namespaced properly please, here and rest..
>>
>> Could you elaborate which name need to named properly?
>> I guess DMA_GLB_REQ_CID is not very clear, can I change
>> to DMA_GLB_REQ_UID(uid) which is used to define the UID
>> registers?
>
> I meant something like:
>
> SPRD_DMA_
>
> DMA_GLB is very generic term and might cause collisions in future so lets
> make these future proof..

Make sense.

>
>> > > +static int sprd_dma_enable(struct sprd_dma_dev *sdev)
>> > > +{
>> > > + int ret;
>> > > +
>> > > + ret = clk_prepare_enable(sdev->clk);
>> > > + if (ret)
>> > > + return ret;
>> > > +
>> > > + if (!IS_ERR(sdev->ashb_clk))
>> >
>> > that looks odd, can you explain this?
>>
>> Since the ashb_clk is optional and only for AGCP DMA controller,
>> so here we add one condition to check if the ashb_clk need enable.
>
> it be worth documenting this..

OK.

>
>> > > +static void sprd_dma_free_chan_resources(struct dma_chan *chan)
>> > > +{
>> > > + struct sprd_dma_chn *schan = to_sprd_dma_chan(chan);
>> > > + unsigned long flags;
>> > > +
>> > > + spin_lock_irqsave(>vc.lock, flags);
>> > > + sprd_dma_stop(schan);
>> > > + spin_unlock_irqrestore(>vc.lock, flags);
>> > > +
>> > > + vchan_free_chan_resources(>vc);
>> > > + pm_runtime_put_sync(chan->device->dev);
>> >
>> > why put_sync()
>>
>> Since we will get pm counter to resume DMA when allocating resources, then
>> we should put pm counter in sprd_dma_free_chan_resources() to try to suspend
>> DMA if the pm counter is 0.
>
> runtime_pm part is fine, you could have used pm_runtime_put(), why the
> sync() variant here...

After checking again, I agree pm_runtime_put() is enough here and I
will fix it in next version.

>>
>> >
>> > > +static enum dma_status sprd_dma_tx_status(struct dma_chan *chan,
>> > > +   dma_cookie_t cookie,
>> > > +   struct dma_tx_state *txstate)
>> > > +{
>> > > + struct sprd_dma_chn *schan = to_sprd_dma_chan(chan);
>> > > + unsigned long flags;
>> > > + enum dma_status ret;
>> > > +
>> > > + ret = dma_cookie_status(chan, cookie, txstate);
>> > > +
>> > > + spin_lock_irqsave(>vc.lock, flags);
>> > > + txstate->residue = sprd_dma_get_dst_addr(schan);
>> >
>> > I dont think this is correct, the residue needs to be read only for current
>> > cookie and the query might be for one which is not even submitted
>>
>> We have one scenario for our audio driver, the audio driver need to get
>> the destination address to check if their transfer is done in no irq mode.
>
> Yes but for the descriptor requested but not any. Audio maybe working as
> period count maybe lesser...
>
>> > > +static int sprd_dma_probe(struct platform_device *pdev)
>> > > +{
>> > > + struct device_node *np = pdev->dev.of_node;
>> > > + struct sprd_dma_dev *sdev;
>> > > + struct sprd_dma_chn *dma_chn;
>> > > + struct resource *res;
>> > > + u32 chn_count;
>> > > + int ret, i;
>> > > +
>> > > + ret = of_property_read_u32(np, "#dma-channels", _count);
>> > > + if (ret) {
>> > > + dev_err(>dev, "get dma channels count failed\n");
>> > > + return ret;
>> > > + }
>> > > +
>> > > + sdev = devm_kzalloc(>dev, (sizeof(*sdev) +
>> > > + (sizeof(struct sprd_dma_chn) * chn_count)),
>> > > + GFP_KERNEL);
>> > > + if (!sdev)
>> > > + return -ENOMEM;
>> > > +
>> > > + sdev->clk = devm_clk_get(>dev, "enable");
>> > > + if (IS_ERR(sdev->clk)) {
>> > > + dev_err(>dev, "get enable clock failed\n");
>> > > + return PTR_ERR(sdev->clk);
>> > > + }
>> > > +
>> > > + /* ashb clock is optional for AGCP DMA */
>> > > + sdev->ashb_clk = devm_clk_get(>dev, "ashb_eb");
>> > > + if (IS_ERR(sdev->ashb_clk))
>> > > + dev_warn(>dev, "no optional ashb eb clock\n");
>> > > +
>> > > + sdev->irq = platform_get_irq(pdev, 0);
>> > > + if (sdev->irq > 0) {
>> > > + ret = devm_request_irq(>dev, sdev->irq, dma_irq_handle,
>> > > + 

Re: [PATCH v2 2/2] dma: sprd: Add Spreadtrum DMA driver

2017-09-06 Thread Baolin Wang
Hi Vinod,

On 7 September 2017 at 00:22, Vinod Koul  wrote:
> On Tue, Sep 05, 2017 at 07:48:43PM +0800, Baolin Wang wrote:
>
>> > > +/* DMA global registers definition */
>> > > +#define DMA_GLB_PAUSE0x0
>> > > +#define DMA_GLB_FRAG_WAIT0x4
>> > > +#define DMA_GLB_REQ_PEND0_EN 0x8
>> > > +#define DMA_GLB_REQ_PEND1_EN 0xc
>> > > +#define DMA_GLB_INT_RAW_STS  0x10
>> > > +#define DMA_GLB_INT_MSK_STS  0x14
>> > > +#define DMA_GLB_REQ_STS  0x18
>> > > +#define DMA_GLB_CHN_EN_STS   0x1c
>> > > +#define DMA_GLB_DEBUG_STS0x20
>> > > +#define DMA_GLB_ARB_SEL_STS  0x24
>> > > +#define DMA_GLB_CHN_START_CHN_CFG1   0x28
>> > > +#define DMA_GLB_CHN_START_CHN_CFG2   0x2c
>> > > +#define DMA_CHN_LLIST_OFFSET 0x10
>> > > +#define DMA_GLB_REQ_CID_OFFSET   0x2000
>> > > +#define DMA_GLB_REQ_CID(uid) (0x4 * ((uid) - 1))
>> >
>> > namespaced properly please, here and rest..
>>
>> Could you elaborate which name need to named properly?
>> I guess DMA_GLB_REQ_CID is not very clear, can I change
>> to DMA_GLB_REQ_UID(uid) which is used to define the UID
>> registers?
>
> I meant something like:
>
> SPRD_DMA_
>
> DMA_GLB is very generic term and might cause collisions in future so lets
> make these future proof..

Make sense.

>
>> > > +static int sprd_dma_enable(struct sprd_dma_dev *sdev)
>> > > +{
>> > > + int ret;
>> > > +
>> > > + ret = clk_prepare_enable(sdev->clk);
>> > > + if (ret)
>> > > + return ret;
>> > > +
>> > > + if (!IS_ERR(sdev->ashb_clk))
>> >
>> > that looks odd, can you explain this?
>>
>> Since the ashb_clk is optional and only for AGCP DMA controller,
>> so here we add one condition to check if the ashb_clk need enable.
>
> it be worth documenting this..

OK.

>
>> > > +static void sprd_dma_free_chan_resources(struct dma_chan *chan)
>> > > +{
>> > > + struct sprd_dma_chn *schan = to_sprd_dma_chan(chan);
>> > > + unsigned long flags;
>> > > +
>> > > + spin_lock_irqsave(>vc.lock, flags);
>> > > + sprd_dma_stop(schan);
>> > > + spin_unlock_irqrestore(>vc.lock, flags);
>> > > +
>> > > + vchan_free_chan_resources(>vc);
>> > > + pm_runtime_put_sync(chan->device->dev);
>> >
>> > why put_sync()
>>
>> Since we will get pm counter to resume DMA when allocating resources, then
>> we should put pm counter in sprd_dma_free_chan_resources() to try to suspend
>> DMA if the pm counter is 0.
>
> runtime_pm part is fine, you could have used pm_runtime_put(), why the
> sync() variant here...

After checking again, I agree pm_runtime_put() is enough here and I
will fix it in next version.

>>
>> >
>> > > +static enum dma_status sprd_dma_tx_status(struct dma_chan *chan,
>> > > +   dma_cookie_t cookie,
>> > > +   struct dma_tx_state *txstate)
>> > > +{
>> > > + struct sprd_dma_chn *schan = to_sprd_dma_chan(chan);
>> > > + unsigned long flags;
>> > > + enum dma_status ret;
>> > > +
>> > > + ret = dma_cookie_status(chan, cookie, txstate);
>> > > +
>> > > + spin_lock_irqsave(>vc.lock, flags);
>> > > + txstate->residue = sprd_dma_get_dst_addr(schan);
>> >
>> > I dont think this is correct, the residue needs to be read only for current
>> > cookie and the query might be for one which is not even submitted
>>
>> We have one scenario for our audio driver, the audio driver need to get
>> the destination address to check if their transfer is done in no irq mode.
>
> Yes but for the descriptor requested but not any. Audio maybe working as
> period count maybe lesser...
>
>> > > +static int sprd_dma_probe(struct platform_device *pdev)
>> > > +{
>> > > + struct device_node *np = pdev->dev.of_node;
>> > > + struct sprd_dma_dev *sdev;
>> > > + struct sprd_dma_chn *dma_chn;
>> > > + struct resource *res;
>> > > + u32 chn_count;
>> > > + int ret, i;
>> > > +
>> > > + ret = of_property_read_u32(np, "#dma-channels", _count);
>> > > + if (ret) {
>> > > + dev_err(>dev, "get dma channels count failed\n");
>> > > + return ret;
>> > > + }
>> > > +
>> > > + sdev = devm_kzalloc(>dev, (sizeof(*sdev) +
>> > > + (sizeof(struct sprd_dma_chn) * chn_count)),
>> > > + GFP_KERNEL);
>> > > + if (!sdev)
>> > > + return -ENOMEM;
>> > > +
>> > > + sdev->clk = devm_clk_get(>dev, "enable");
>> > > + if (IS_ERR(sdev->clk)) {
>> > > + dev_err(>dev, "get enable clock failed\n");
>> > > + return PTR_ERR(sdev->clk);
>> > > + }
>> > > +
>> > > + /* ashb clock is optional for AGCP DMA */
>> > > + sdev->ashb_clk = devm_clk_get(>dev, "ashb_eb");
>> > > + if (IS_ERR(sdev->ashb_clk))
>> > > + dev_warn(>dev, "no optional ashb eb clock\n");
>> > > +
>> > > + sdev->irq = platform_get_irq(pdev, 0);
>> > > + if (sdev->irq > 0) {
>> > > + ret = devm_request_irq(>dev, sdev->irq, dma_irq_handle,
>> > > +0, 

[PATCH] crc32-pclmul: remove useless relative addressing

2017-09-06 Thread Mikulas Patocka
In 32-bit mode, the x86 architecture can hold full 32-bit pointers.
Therefore, the code that copies the current address to the %ecx register
and uses %ecx-relative addressing is useless, we could just use absolute
addressing.

The processors have a stack of return addresses for branch prediction. If 
we use a call instruction and pop the return address, it desynchronizes 
the return stack and causes branch prediction misses.

This patch also moves the data to the .rodata section.

Signed-off-by: Mikulas Patocka 

---
 arch/x86/crypto/crc32-pclmul_asm.S |   17 ++---
 1 file changed, 6 insertions(+), 11 deletions(-)

Index: linux-2.6/arch/x86/crypto/crc32-pclmul_asm.S
===
--- linux-2.6.orig/arch/x86/crypto/crc32-pclmul_asm.S
+++ linux-2.6/arch/x86/crypto/crc32-pclmul_asm.S
@@ -41,6 +41,7 @@
 #include 
 
 
+.section .rodata
 .align 16
 /*
  * [x4*128+32 mod P(x) << 32)]'  << 1   = 0x154442bd4
@@ -111,19 +112,13 @@ ENTRY(crc32_pclmul_le_16) /* buffer and
pxorCONSTANT, %xmm1
sub $0x40, LEN
add $0x40, BUF
-#ifndef __x86_64__
-   /* This is for position independent code(-fPIC) support for 32bit */
-   calldelta
-delta:
-   pop %ecx
-#endif
cmp $0x40, LEN
jb  less_64
 
 #ifdef __x86_64__
movdqa .Lconstant_R2R1(%rip), CONSTANT
 #else
-   movdqa .Lconstant_R2R1 - delta(%ecx), CONSTANT
+   movdqa .Lconstant_R2R1, CONSTANT
 #endif
 
 loop_64:/*  64 bytes Full cache line folding */
@@ -172,7 +167,7 @@ less_64:/*  Folding cache line into 128b
 #ifdef __x86_64__
movdqa  .Lconstant_R4R3(%rip), CONSTANT
 #else
-   movdqa  .Lconstant_R4R3 - delta(%ecx), CONSTANT
+   movdqa  .Lconstant_R4R3, CONSTANT
 #endif
prefetchnta (BUF)
 
@@ -220,8 +215,8 @@ fold_64:
movdqa  .Lconstant_R5(%rip), CONSTANT
movdqa  .Lconstant_mask32(%rip), %xmm3
 #else
-   movdqa  .Lconstant_R5 - delta(%ecx), CONSTANT
-   movdqa  .Lconstant_mask32 - delta(%ecx), %xmm3
+   movdqa  .Lconstant_R5, CONSTANT
+   movdqa  .Lconstant_mask32, %xmm3
 #endif
psrldq  $0x04, %xmm2
pand%xmm3, %xmm1
@@ -232,7 +227,7 @@ fold_64:
 #ifdef __x86_64__
movdqa  .Lconstant_RUpoly(%rip), CONSTANT
 #else
-   movdqa  .Lconstant_RUpoly - delta(%ecx), CONSTANT
+   movdqa  .Lconstant_RUpoly, CONSTANT
 #endif
movdqa  %xmm1, %xmm2
pand%xmm3, %xmm1


[PATCH] crc32-pclmul: remove useless relative addressing

2017-09-06 Thread Mikulas Patocka
In 32-bit mode, the x86 architecture can hold full 32-bit pointers.
Therefore, the code that copies the current address to the %ecx register
and uses %ecx-relative addressing is useless, we could just use absolute
addressing.

The processors have a stack of return addresses for branch prediction. If 
we use a call instruction and pop the return address, it desynchronizes 
the return stack and causes branch prediction misses.

This patch also moves the data to the .rodata section.

Signed-off-by: Mikulas Patocka 

---
 arch/x86/crypto/crc32-pclmul_asm.S |   17 ++---
 1 file changed, 6 insertions(+), 11 deletions(-)

Index: linux-2.6/arch/x86/crypto/crc32-pclmul_asm.S
===
--- linux-2.6.orig/arch/x86/crypto/crc32-pclmul_asm.S
+++ linux-2.6/arch/x86/crypto/crc32-pclmul_asm.S
@@ -41,6 +41,7 @@
 #include 
 
 
+.section .rodata
 .align 16
 /*
  * [x4*128+32 mod P(x) << 32)]'  << 1   = 0x154442bd4
@@ -111,19 +112,13 @@ ENTRY(crc32_pclmul_le_16) /* buffer and
pxorCONSTANT, %xmm1
sub $0x40, LEN
add $0x40, BUF
-#ifndef __x86_64__
-   /* This is for position independent code(-fPIC) support for 32bit */
-   calldelta
-delta:
-   pop %ecx
-#endif
cmp $0x40, LEN
jb  less_64
 
 #ifdef __x86_64__
movdqa .Lconstant_R2R1(%rip), CONSTANT
 #else
-   movdqa .Lconstant_R2R1 - delta(%ecx), CONSTANT
+   movdqa .Lconstant_R2R1, CONSTANT
 #endif
 
 loop_64:/*  64 bytes Full cache line folding */
@@ -172,7 +167,7 @@ less_64:/*  Folding cache line into 128b
 #ifdef __x86_64__
movdqa  .Lconstant_R4R3(%rip), CONSTANT
 #else
-   movdqa  .Lconstant_R4R3 - delta(%ecx), CONSTANT
+   movdqa  .Lconstant_R4R3, CONSTANT
 #endif
prefetchnta (BUF)
 
@@ -220,8 +215,8 @@ fold_64:
movdqa  .Lconstant_R5(%rip), CONSTANT
movdqa  .Lconstant_mask32(%rip), %xmm3
 #else
-   movdqa  .Lconstant_R5 - delta(%ecx), CONSTANT
-   movdqa  .Lconstant_mask32 - delta(%ecx), %xmm3
+   movdqa  .Lconstant_R5, CONSTANT
+   movdqa  .Lconstant_mask32, %xmm3
 #endif
psrldq  $0x04, %xmm2
pand%xmm3, %xmm1
@@ -232,7 +227,7 @@ fold_64:
 #ifdef __x86_64__
movdqa  .Lconstant_RUpoly(%rip), CONSTANT
 #else
-   movdqa  .Lconstant_RUpoly - delta(%ecx), CONSTANT
+   movdqa  .Lconstant_RUpoly, CONSTANT
 #endif
movdqa  %xmm1, %xmm2
pand%xmm3, %xmm1


Re: [PATCH v8 06/13] x86/apic: Mark the apic_intr_mode extern for sanity check cleanup

2017-09-06 Thread Baoquan He
On 09/07/17 at 10:27am, Dou Liyang wrote:
> 
> 
> At 09/06/2017 04:03 PM, Baoquan He wrote:
> > On 09/06/17 at 01:41pm, Dou Liyang wrote:
> > > Hi Baoquan,
> > > 
> > > At 09/06/2017 01:26 PM, Baoquan He wrote:
> > > [...]
> > > diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
> > > index 4f63afc..9f8479c 100644
> > > --- a/arch/x86/kernel/smpboot.c
> > > +++ b/arch/x86/kernel/smpboot.c
> > > @@ -1260,8 +1260,10 @@ static void __init smp_get_logical_apicid(void)
> > >  }
> > > 
> > >  /*
> > > - * Prepare for SMP bootup.  The MP table or ACPI has been read
> > > - * earlier.  Just do some sanity checking here and enable APIC mode.
> > > + * Prepare for SMP bootup.
> > > + *
> > > + * @max_cpus: configured maximum number of CPUs
> > > + * It don't be used, but other arch also have this hook, so keep it.
> 
> How about:
> 
> @max_cpus: configured maximum number of CPUs
> It is a legacy parameter for common interface support

Looks good to me.

> 
> > 
> > Yeah, makes sense. Above words can be reconsidered.
> > 
> 
> 
> 


Re: [PATCH v8 06/13] x86/apic: Mark the apic_intr_mode extern for sanity check cleanup

2017-09-06 Thread Baoquan He
On 09/07/17 at 10:27am, Dou Liyang wrote:
> 
> 
> At 09/06/2017 04:03 PM, Baoquan He wrote:
> > On 09/06/17 at 01:41pm, Dou Liyang wrote:
> > > Hi Baoquan,
> > > 
> > > At 09/06/2017 01:26 PM, Baoquan He wrote:
> > > [...]
> > > diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
> > > index 4f63afc..9f8479c 100644
> > > --- a/arch/x86/kernel/smpboot.c
> > > +++ b/arch/x86/kernel/smpboot.c
> > > @@ -1260,8 +1260,10 @@ static void __init smp_get_logical_apicid(void)
> > >  }
> > > 
> > >  /*
> > > - * Prepare for SMP bootup.  The MP table or ACPI has been read
> > > - * earlier.  Just do some sanity checking here and enable APIC mode.
> > > + * Prepare for SMP bootup.
> > > + *
> > > + * @max_cpus: configured maximum number of CPUs
> > > + * It don't be used, but other arch also have this hook, so keep it.
> 
> How about:
> 
> @max_cpus: configured maximum number of CPUs
> It is a legacy parameter for common interface support

Looks good to me.

> 
> > 
> > Yeah, makes sense. Above words can be reconsidered.
> > 
> 
> 
> 


linux-next: build warning after merge of the tpmdd tree

2017-09-06 Thread Stephen Rothwell
Hi Jarkko,

After merging the tpmdd tree, today's linux-next build (x86_64
allmodconfig) produced this warning:

drivers/char/tpm/tpm_tis_core.c: In function 'probe_itpm':
drivers/char/tpm/tpm_tis_core.c:469:31: warning: passing argument 2 of 
'tpm_tis_send_data' discards 'const' qualifier from pointer target type 
[-Wdiscarded-qualifiers]
  rc = tpm_tis_send_data(chip, cmd_getticks, len);
   ^
drivers/char/tpm/tpm_tis_core.c:255:12: note: expected 'u8 * {aka unsigned char 
*}' but argument is of type 'const u8 * {aka const unsigned char *}'
 static int tpm_tis_send_data(struct tpm_chip *chip, u8 *buf, size_t len)
^
drivers/char/tpm/tpm_tis_core.c:477:31: warning: passing argument 2 of 
'tpm_tis_send_data' discards 'const' qualifier from pointer target type 
[-Wdiscarded-qualifiers]
  rc = tpm_tis_send_data(chip, cmd_getticks, len);
   ^
drivers/char/tpm/tpm_tis_core.c:255:12: note: expected 'u8 * {aka unsigned char 
*}' but argument is of type 'const u8 * {aka const unsigned char *}'
 static int tpm_tis_send_data(struct tpm_chip *chip, u8 *buf, size_t len)
^

Introduced by commit

  5e35bd8e06b9 ("tpm_tis: make array cmd_getticks static const to shink object 
code size")

-- 
Cheers,
Stephen Rothwell


linux-next: build warning after merge of the tpmdd tree

2017-09-06 Thread Stephen Rothwell
Hi Jarkko,

After merging the tpmdd tree, today's linux-next build (x86_64
allmodconfig) produced this warning:

drivers/char/tpm/tpm_tis_core.c: In function 'probe_itpm':
drivers/char/tpm/tpm_tis_core.c:469:31: warning: passing argument 2 of 
'tpm_tis_send_data' discards 'const' qualifier from pointer target type 
[-Wdiscarded-qualifiers]
  rc = tpm_tis_send_data(chip, cmd_getticks, len);
   ^
drivers/char/tpm/tpm_tis_core.c:255:12: note: expected 'u8 * {aka unsigned char 
*}' but argument is of type 'const u8 * {aka const unsigned char *}'
 static int tpm_tis_send_data(struct tpm_chip *chip, u8 *buf, size_t len)
^
drivers/char/tpm/tpm_tis_core.c:477:31: warning: passing argument 2 of 
'tpm_tis_send_data' discards 'const' qualifier from pointer target type 
[-Wdiscarded-qualifiers]
  rc = tpm_tis_send_data(chip, cmd_getticks, len);
   ^
drivers/char/tpm/tpm_tis_core.c:255:12: note: expected 'u8 * {aka unsigned char 
*}' but argument is of type 'const u8 * {aka const unsigned char *}'
 static int tpm_tis_send_data(struct tpm_chip *chip, u8 *buf, size_t len)
^

Introduced by commit

  5e35bd8e06b9 ("tpm_tis: make array cmd_getticks static const to shink object 
code size")

-- 
Cheers,
Stephen Rothwell


Re: [PATCH v8 06/13] x86/apic: Mark the apic_intr_mode extern for sanity check cleanup

2017-09-06 Thread Dou Liyang



At 09/06/2017 04:03 PM, Baoquan He wrote:

On 09/06/17 at 01:41pm, Dou Liyang wrote:

Hi Baoquan,

At 09/06/2017 01:26 PM, Baoquan He wrote:
[...]
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 4f63afc..9f8479c 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1260,8 +1260,10 @@ static void __init smp_get_logical_apicid(void)
 }

 /*
- * Prepare for SMP bootup.  The MP table or ACPI has been read
- * earlier.  Just do some sanity checking here and enable APIC mode.
+ * Prepare for SMP bootup.
+ *
+ * @max_cpus: configured maximum number of CPUs
+ * It don't be used, but other arch also have this hook, so keep it.


How about:

@max_cpus: configured maximum number of CPUs
It is a legacy parameter for common interface support



Yeah, makes sense. Above words can be reconsidered.







Re: [PATCH v8 06/13] x86/apic: Mark the apic_intr_mode extern for sanity check cleanup

2017-09-06 Thread Dou Liyang



At 09/06/2017 04:03 PM, Baoquan He wrote:

On 09/06/17 at 01:41pm, Dou Liyang wrote:

Hi Baoquan,

At 09/06/2017 01:26 PM, Baoquan He wrote:
[...]
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 4f63afc..9f8479c 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1260,8 +1260,10 @@ static void __init smp_get_logical_apicid(void)
 }

 /*
- * Prepare for SMP bootup.  The MP table or ACPI has been read
- * earlier.  Just do some sanity checking here and enable APIC mode.
+ * Prepare for SMP bootup.
+ *
+ * @max_cpus: configured maximum number of CPUs
+ * It don't be used, but other arch also have this hook, so keep it.


How about:

@max_cpus: configured maximum number of CPUs
It is a legacy parameter for common interface support



Yeah, makes sense. Above words can be reconsidered.







Re: [PATCH] hwmon: pmbus: Make reg check and clear faults functions return errors

2017-09-06 Thread Guenter Roeck

On 09/06/2017 04:32 PM, Andrew Jeffery wrote:

On Wed, 2017-09-06 at 15:51 -0700, Guenter Roeck wrote:

On Wed, Sep 06, 2017 at 10:23:37AM +1000, Andrew Jeffery wrote:

On Tue, 2017-09-05 at 10:00 -0700, Guenter Roeck wrote:

On Tue, Sep 05, 2017 at 05:01:32PM +1000, Andrew Jeffery wrote:

Some functions exposed by pmbus core conflated errors that occurred when
setting the page to access with errors that occurred when accessing
registers in a page. In some cases, this caused legitimate errors to be
hidden under the guise of the register not being supported.
  
  
I'll have to look into this. If I recall correctly, the reason for not returning

errors from the "clear faults" command was that some early chips did not support
it, or did not support it on all pages. Those chips would now always fail.
  
Yes, that is a concern.
  
However, shouldn't the lack of support for CLEAR_FAULTS be a described

property of the chip or page?
  
Regardless, the issue here is the PAGE command is sometimes failing

(more below). This means we won't have issued a CLEAR_FAULTS against
the page even if the page supports CLEAR_FAULTS. That to me is
something that should be propagated back up the call chain, as faults
that can be cleared will not have been.
  
  
We could also consider

- retry


I guess that leads to the usual retries problem: How many? Oneshot? More?

My expectation is that oneshot would be enough, but we'd still need to consider
what to do if that wasn't successful.

Does the retry policy need to be in the kernel? I guess if it's not we would
need all operations in the path to the failure to be idempotent.


- Add a marker indicating that faults (specifically command errors)
   are unreliable after clearing faults failed


What kind of marker were you thinking? Something in dmesg? If it's anything
else we'd probably want something to respond to the condition, which nothing
would do currently.

  
If we can't reliably clear faults, all bets are off.


Yeah, it's a messy problem. However even if we resolve our issues in hardware
(i.e. it's discovered that it is a design failure or something), I don't think
we should drop a resolution to the problem highlighted by this patch.

  
  
On a higher level, I am not sure if it is a good idea to return an error

from a function intended to clear faults (and nothing else). That seems
counterproductive.
  
Is this a problem you have actually observed ?
  
Unfortunately yes. I was trying to reproduce some issues that we are

seeing in userspace but wasn't having much luck (getting -EIO on
reading a hwmon attribute). I ended up instrumenting our I2C bus
driver, and found that the problem was more prevalent than what the
read failures in userspace were indicating. The paths that were
triggering these unreported conditions all traced through the check
register and clear fault functions, which on analysis were swallowing
the error.
  

If so, what is the chip ?
  
Well, the chip that I'm *communicating* with is the Maxim MAX31785

Intelligent Fan Controller. As to whether that's what is *causing* the
PAGE command to fail is still up in the air. I would not be surprised
if we have other issues in the hardware design.
  
I haven't sent a follow-up to the series introducing the MAX31785

driver because I've been chasing down communication issues. I felt it
was important to understand whether the device has quirks that need to
be worked around in the driver, or if our hardware design has bigger
problems that should be handled in other ways. I've been in touch with
Maxim who have asserted that some of the problems we're seeing cannot
be caused by their chip.
  
  
Guess I need to dig up my eval board and see if I can reproduce the problem.

Seems you are saying that the problem is always seen when issuing a sequence
of "clear faults" commands on multiple pages ?


Yeah. We're also seeing bad behaviour under other command sequences as well,
which lead to this hack of a work-around patch[1].

I'd be very interested in the results of testing against the eval board. I
don't have access to one and it seems Maxim have discontinued them.

[1] https://patchwork.kernel.org/patch/9876083/

  
The MAX31785 is microcode based, so I would not be entirely surprised if

it sometimes fails to reply if a sequence of 
commands is executed.


Have you seen this behaviour in other microcode-based chips?



ltc2978 (commit e04d1ce9bbb) and most of the Zilker Labs chips (zl6100.c).

You could try the approach I used in the zl6100 driver: Add a minimum time
between accesses to the chip. I would probably no longer use udelay(), though,
but usleeep_range(), if I were to implement the code today.

  
If possible, you might try reducing the i2c clock. I have not seen that with

Maxim chips, but some other PMBus chips don't operate reliably at 400 KHz.


It's already running at 100kHz, which is the maximum clock rate the device is
specified for. I've even underclocked the bus to 75kHz and still observed

Re: [PATCH] hwmon: pmbus: Make reg check and clear faults functions return errors

2017-09-06 Thread Guenter Roeck

On 09/06/2017 04:32 PM, Andrew Jeffery wrote:

On Wed, 2017-09-06 at 15:51 -0700, Guenter Roeck wrote:

On Wed, Sep 06, 2017 at 10:23:37AM +1000, Andrew Jeffery wrote:

On Tue, 2017-09-05 at 10:00 -0700, Guenter Roeck wrote:

On Tue, Sep 05, 2017 at 05:01:32PM +1000, Andrew Jeffery wrote:

Some functions exposed by pmbus core conflated errors that occurred when
setting the page to access with errors that occurred when accessing
registers in a page. In some cases, this caused legitimate errors to be
hidden under the guise of the register not being supported.
  
  
I'll have to look into this. If I recall correctly, the reason for not returning

errors from the "clear faults" command was that some early chips did not support
it, or did not support it on all pages. Those chips would now always fail.
  
Yes, that is a concern.
  
However, shouldn't the lack of support for CLEAR_FAULTS be a described

property of the chip or page?
  
Regardless, the issue here is the PAGE command is sometimes failing

(more below). This means we won't have issued a CLEAR_FAULTS against
the page even if the page supports CLEAR_FAULTS. That to me is
something that should be propagated back up the call chain, as faults
that can be cleared will not have been.
  
  
We could also consider

- retry


I guess that leads to the usual retries problem: How many? Oneshot? More?

My expectation is that oneshot would be enough, but we'd still need to consider
what to do if that wasn't successful.

Does the retry policy need to be in the kernel? I guess if it's not we would
need all operations in the path to the failure to be idempotent.


- Add a marker indicating that faults (specifically command errors)
   are unreliable after clearing faults failed


What kind of marker were you thinking? Something in dmesg? If it's anything
else we'd probably want something to respond to the condition, which nothing
would do currently.

  
If we can't reliably clear faults, all bets are off.


Yeah, it's a messy problem. However even if we resolve our issues in hardware
(i.e. it's discovered that it is a design failure or something), I don't think
we should drop a resolution to the problem highlighted by this patch.

  
  
On a higher level, I am not sure if it is a good idea to return an error

from a function intended to clear faults (and nothing else). That seems
counterproductive.
  
Is this a problem you have actually observed ?
  
Unfortunately yes. I was trying to reproduce some issues that we are

seeing in userspace but wasn't having much luck (getting -EIO on
reading a hwmon attribute). I ended up instrumenting our I2C bus
driver, and found that the problem was more prevalent than what the
read failures in userspace were indicating. The paths that were
triggering these unreported conditions all traced through the check
register and clear fault functions, which on analysis were swallowing
the error.
  

If so, what is the chip ?
  
Well, the chip that I'm *communicating* with is the Maxim MAX31785

Intelligent Fan Controller. As to whether that's what is *causing* the
PAGE command to fail is still up in the air. I would not be surprised
if we have other issues in the hardware design.
  
I haven't sent a follow-up to the series introducing the MAX31785

driver because I've been chasing down communication issues. I felt it
was important to understand whether the device has quirks that need to
be worked around in the driver, or if our hardware design has bigger
problems that should be handled in other ways. I've been in touch with
Maxim who have asserted that some of the problems we're seeing cannot
be caused by their chip.
  
  
Guess I need to dig up my eval board and see if I can reproduce the problem.

Seems you are saying that the problem is always seen when issuing a sequence
of "clear faults" commands on multiple pages ?


Yeah. We're also seeing bad behaviour under other command sequences as well,
which lead to this hack of a work-around patch[1].

I'd be very interested in the results of testing against the eval board. I
don't have access to one and it seems Maxim have discontinued them.

[1] https://patchwork.kernel.org/patch/9876083/

  
The MAX31785 is microcode based, so I would not be entirely surprised if

it sometimes fails to reply if a sequence of 
commands is executed.


Have you seen this behaviour in other microcode-based chips?



ltc2978 (commit e04d1ce9bbb) and most of the Zilker Labs chips (zl6100.c).

You could try the approach I used in the zl6100 driver: Add a minimum time
between accesses to the chip. I would probably no longer use udelay(), though,
but usleeep_range(), if I were to implement the code today.

  
If possible, you might try reducing the i2c clock. I have not seen that with

Maxim chips, but some other PMBus chips don't operate reliably at 400 KHz.


It's already running at 100kHz, which is the maximum clock rate the device is
specified for. I've even underclocked the bus to 75kHz and still observed

Re: [HMM-v25 19/19] mm/hmm: add new helper to hotplug CDM memory region v3

2017-09-06 Thread Bob Liu
On 2017/9/6 10:12, Jerome Glisse wrote:
> On Wed, Sep 06, 2017 at 09:25:36AM +0800, Bob Liu wrote:
>> On 2017/9/6 2:54, Ross Zwisler wrote:
>>> On Mon, Sep 04, 2017 at 10:38:27PM -0400, Jerome Glisse wrote:
 On Tue, Sep 05, 2017 at 09:13:24AM +0800, Bob Liu wrote:
> On 2017/9/4 23:51, Jerome Glisse wrote:
>> On Mon, Sep 04, 2017 at 11:09:14AM +0800, Bob Liu wrote:
>>> On 2017/8/17 8:05, Jérôme Glisse wrote:
 Unlike unaddressable memory, coherent device memory has a real
 resource associated with it on the system (as CPU can address
 it). Add a new helper to hotplug such memory within the HMM
 framework.

>>>
>>> Got an new question, coherent device( e.g CCIX) memory are likely 
>>> reported to OS 
>>> through ACPI and recognized as NUMA memory node.
>>> Then how can their memory be captured and managed by HMM framework?
>>>
>>
>> Only platform that has such memory today is powerpc and it is not 
>> reported
>> as regular memory by the firmware hence why they need this helper.
>>
>> I don't think anyone has defined anything yet for x86 and acpi. As this 
>> is
>
> Not yet, but now the ACPI spec has Heterogeneous Memory Attribute
> Table (HMAT) table defined in ACPI 6.2.
> The HMAT can cover CPU-addressable memory types(though not non-cache
> coherent on-device memory).
>
> Ross from Intel already done some work on this, see:
> https://lwn.net/Articles/724562/
>
> arm64 supports APCI also, there is likely more this kind of device when 
> CCIX
> is out (should be very soon if on schedule).

 HMAT is not for the same thing, AFAIK HMAT is for deep "hierarchy" memory 
 ie
 when you have several kind of memory each with different characteristics:
   - HBM very fast (latency) and high bandwidth, non persistent, somewhat
 small (ie few giga bytes)
   - Persistent memory, slower (both latency and bandwidth) big (tera bytes)
   - DDR (good old memory) well characteristics are between HBM and 
 persistent

 So AFAICT this has nothing to do with what HMM is for, ie device memory. 
 Note
 that device memory can have a hierarchy of memory themself (HBM, GDDR and 
 in
 maybe even persistent memory).

>> memory on PCIE like interface then i don't expect it to be reported as 
>> NUMA
>> memory node but as io range like any regular PCIE resources. Device 
>> driver
>> through capabilities flags would then figure out if the link between the
>> device and CPU is CCIX capable if so it can use this helper to hotplug it
>> as device memory.
>>
>
> From my point of view,  Cache coherent device memory will popular soon and
> reported through ACPI/UEFI. Extending NUMA policy still sounds more 
> reasonable
> to me.

 Cache coherent device will be reported through standard mecanisms defined 
 by
 the bus standard they are using. To my knowledge all the standard are 
 either
 on top of PCIE or are similar to PCIE.

 It is true that on many platform PCIE resource is manage/initialize by the
 bios (UEFI) but it is platform specific. In some case we reprogram what the
 bios pick.

 So like i was saying i don't expect the BIOS/UEFI to report device memory 
 as
 regular memory. It will be reported as a regular PCIE resources and then 
 the
 device driver will be able to determine through some flags if the link 
 between
 the CPU(s) and the device is cache coherent or not. At that point the 
 device
 driver can use register it with HMM helper.


 The whole NUMA discussion happen several time in the past i suggest looking
 on mm list archive for them. But it was rule out for several reasons. Top 
 of
 my head:
   - people hate CPU less node and device memory is inherently CPU less
>>>
>>> With the introduction of the HMAT in ACPI 6.2 one of the things that was 
>>> added
>>> was the ability to have an ACPI proximity domain that isn't associated with 
>>> a
>>> CPU.  This can be seen in the changes in the text of the "Proximity Domain"
>>> field in table 5-73 which describes the "Memory Affinity Structure".  One of
>>> the major features of the HMAT was the separation of "Initiator" proximity
>>> domains (CPUs, devices that initiate memory transfers), and "target" 
>>> proximity
>>> domains (memory regions, be they attached to a CPU or some other device).
>>>
>>> ACPI proximity domains map directly to Linux NUMA nodes, so I think we're
>>> already in a place where we have to support CPU-less NUMA nodes.
>>>
   - device driver want total control over memory and thus to be isolated 
 from
 mm mecanism and doing all those special cases was not welcome
>>>
>>> I agree that the kernel doesn't have enough information to be able to
>>> accurately handle 

Re: [HMM-v25 19/19] mm/hmm: add new helper to hotplug CDM memory region v3

2017-09-06 Thread Bob Liu
On 2017/9/6 10:12, Jerome Glisse wrote:
> On Wed, Sep 06, 2017 at 09:25:36AM +0800, Bob Liu wrote:
>> On 2017/9/6 2:54, Ross Zwisler wrote:
>>> On Mon, Sep 04, 2017 at 10:38:27PM -0400, Jerome Glisse wrote:
 On Tue, Sep 05, 2017 at 09:13:24AM +0800, Bob Liu wrote:
> On 2017/9/4 23:51, Jerome Glisse wrote:
>> On Mon, Sep 04, 2017 at 11:09:14AM +0800, Bob Liu wrote:
>>> On 2017/8/17 8:05, Jérôme Glisse wrote:
 Unlike unaddressable memory, coherent device memory has a real
 resource associated with it on the system (as CPU can address
 it). Add a new helper to hotplug such memory within the HMM
 framework.

>>>
>>> Got an new question, coherent device( e.g CCIX) memory are likely 
>>> reported to OS 
>>> through ACPI and recognized as NUMA memory node.
>>> Then how can their memory be captured and managed by HMM framework?
>>>
>>
>> Only platform that has such memory today is powerpc and it is not 
>> reported
>> as regular memory by the firmware hence why they need this helper.
>>
>> I don't think anyone has defined anything yet for x86 and acpi. As this 
>> is
>
> Not yet, but now the ACPI spec has Heterogeneous Memory Attribute
> Table (HMAT) table defined in ACPI 6.2.
> The HMAT can cover CPU-addressable memory types(though not non-cache
> coherent on-device memory).
>
> Ross from Intel already done some work on this, see:
> https://lwn.net/Articles/724562/
>
> arm64 supports APCI also, there is likely more this kind of device when 
> CCIX
> is out (should be very soon if on schedule).

 HMAT is not for the same thing, AFAIK HMAT is for deep "hierarchy" memory 
 ie
 when you have several kind of memory each with different characteristics:
   - HBM very fast (latency) and high bandwidth, non persistent, somewhat
 small (ie few giga bytes)
   - Persistent memory, slower (both latency and bandwidth) big (tera bytes)
   - DDR (good old memory) well characteristics are between HBM and 
 persistent

 So AFAICT this has nothing to do with what HMM is for, ie device memory. 
 Note
 that device memory can have a hierarchy of memory themself (HBM, GDDR and 
 in
 maybe even persistent memory).

>> memory on PCIE like interface then i don't expect it to be reported as 
>> NUMA
>> memory node but as io range like any regular PCIE resources. Device 
>> driver
>> through capabilities flags would then figure out if the link between the
>> device and CPU is CCIX capable if so it can use this helper to hotplug it
>> as device memory.
>>
>
> From my point of view,  Cache coherent device memory will popular soon and
> reported through ACPI/UEFI. Extending NUMA policy still sounds more 
> reasonable
> to me.

 Cache coherent device will be reported through standard mecanisms defined 
 by
 the bus standard they are using. To my knowledge all the standard are 
 either
 on top of PCIE or are similar to PCIE.

 It is true that on many platform PCIE resource is manage/initialize by the
 bios (UEFI) but it is platform specific. In some case we reprogram what the
 bios pick.

 So like i was saying i don't expect the BIOS/UEFI to report device memory 
 as
 regular memory. It will be reported as a regular PCIE resources and then 
 the
 device driver will be able to determine through some flags if the link 
 between
 the CPU(s) and the device is cache coherent or not. At that point the 
 device
 driver can use register it with HMM helper.


 The whole NUMA discussion happen several time in the past i suggest looking
 on mm list archive for them. But it was rule out for several reasons. Top 
 of
 my head:
   - people hate CPU less node and device memory is inherently CPU less
>>>
>>> With the introduction of the HMAT in ACPI 6.2 one of the things that was 
>>> added
>>> was the ability to have an ACPI proximity domain that isn't associated with 
>>> a
>>> CPU.  This can be seen in the changes in the text of the "Proximity Domain"
>>> field in table 5-73 which describes the "Memory Affinity Structure".  One of
>>> the major features of the HMAT was the separation of "Initiator" proximity
>>> domains (CPUs, devices that initiate memory transfers), and "target" 
>>> proximity
>>> domains (memory regions, be they attached to a CPU or some other device).
>>>
>>> ACPI proximity domains map directly to Linux NUMA nodes, so I think we're
>>> already in a place where we have to support CPU-less NUMA nodes.
>>>
   - device driver want total control over memory and thus to be isolated 
 from
 mm mecanism and doing all those special cases was not welcome
>>>
>>> I agree that the kernel doesn't have enough information to be able to
>>> accurately handle 

  1   2   3   4   5   6   7   8   9   10   >