Re: [PATCH] security/keys: Adding newline after declaration
On Thu, 2017-09-07 at 11:11 +0530, Pushkar Jambhlekar wrote: > Fixing checkpatch warning to add newline after declaration [] > diff --git a/security/keys/big_key.c b/security/keys/big_key.c [] > @@ -93,6 +93,7 @@ static int big_key_crypt(enum big_key_op op, u8 *data, > size_t datalen, u8 *key) > { > int ret = -EINVAL; > struct scatterlist sgio; > + > SKCIPHER_REQUEST_ON_STACK(req, big_key_skcipher); checkpatch doesn't know that SKCIPHER_REQUEST_ON_STACK is actually 2 declarations.
Re: [PATCH] security/keys: Adding newline after declaration
On Thu, 2017-09-07 at 11:11 +0530, Pushkar Jambhlekar wrote: > Fixing checkpatch warning to add newline after declaration [] > diff --git a/security/keys/big_key.c b/security/keys/big_key.c [] > @@ -93,6 +93,7 @@ static int big_key_crypt(enum big_key_op op, u8 *data, > size_t datalen, u8 *key) > { > int ret = -EINVAL; > struct scatterlist sgio; > + > SKCIPHER_REQUEST_ON_STACK(req, big_key_skcipher); checkpatch doesn't know that SKCIPHER_REQUEST_ON_STACK is actually 2 declarations.
Re: [PATCH 4/4][RFC v2] x86/apic: Spread the vectors by choosing the idlest CPU
On Thu, 7 Sep 2017, Yu Chen wrote: > On Wed, Sep 06, 2017 at 10:03:58AM +0200, Thomas Gleixner wrote: > > Can you please apply the debug patch below, boot the machine and right > > after login provide the output of > > > > # cat /sys/kernel/debug/tracing/trace > > > kworker/0:2-303 [000] 9.135467: msi_domain_alloc_irqs: dev: > :bb:00.0 nvec 1 virq 34 > kworker/0:2-303 [000] 9.135476: msi_domain_alloc_irqs: dev: > :bb:00.0 nvec 1 virq 35 > kworker/0:2-303 [000] 9.135484: msi_domain_alloc_irqs: dev: > :bb:00.0 nvec 1 virq 36 > kworker/0:2-303 [000] 9.762268: msi_domain_alloc_irqs: dev: > :bb:00.3 nvec 1 virq 331 > kworker/0:2-303 [000] 9.762278: msi_domain_alloc_irqs: dev: > :bb:00.3 nvec 1 virq 332 > kworker/0:2-303 [000] 9.762288: msi_domain_alloc_irqs: dev: > :bb:00.3 nvec 1 virq 333 That's 300 vectors. > bb:00.[0-3] Ethernet controller: Intel Corporation Device 37d0 (rev 03) > > -+-[:b2]-+-00.0-[b3-bc]00.0-[b4-bc]--+-00.0-[b5-b6]00.0 > | | +-01.0-[b7-b8]00.0 > | | +-02.0-[b9-ba]00.0 > | | \-03.0-[bb-bc]--+-00.0 > | | +-00.1 > | | +-00.2 > | | \-00.3 > > and they are using i40e driver, the vectors should be reserved by: > i40e_probe() -> > i40e_init_interrupt_scheme() -> > i40e_init_msix() -> > i40e_reserve_msix_vectors() -> > pci_enable_msix_range() > > # ls /sys/kernel/debug/irq/irqs > 0 10 11 13 142 184 217 259 292 31 33 > 337 339 340 342 344 346 348 350 352 354 356 > 358 360 362 364 366 368 370 372 374 376 378 > 380 382 384 386 388 390 392 394 4 6 7 9 > 1 109 12 14 15 224 26 332 335 > 338 34 341 343 345 347 349 351 353 355 357 > 359 361 363 365 367 369 371 373 375 377 379 > 381 383 385 387 389 391 393 395 5 67 8 Out of these 300 interrupts exactly 8 randomly selected ones are actively used. And the other 292 interrupts are just there because it might need them in the future when the 32 CPU machine gets magically upgraded to 4096 cores at runtime? Can the i40e people @intel please fix this waste of resources and sanitize their interrupt allocation scheme? Please switch it over to managed interrupts so the affinity spreading happens in a sane way and the interrupts are properly managed on CPU hotplug. Thanks, tglx
Re: [PATCH 4/4][RFC v2] x86/apic: Spread the vectors by choosing the idlest CPU
On Thu, 7 Sep 2017, Yu Chen wrote: > On Wed, Sep 06, 2017 at 10:03:58AM +0200, Thomas Gleixner wrote: > > Can you please apply the debug patch below, boot the machine and right > > after login provide the output of > > > > # cat /sys/kernel/debug/tracing/trace > > > kworker/0:2-303 [000] 9.135467: msi_domain_alloc_irqs: dev: > :bb:00.0 nvec 1 virq 34 > kworker/0:2-303 [000] 9.135476: msi_domain_alloc_irqs: dev: > :bb:00.0 nvec 1 virq 35 > kworker/0:2-303 [000] 9.135484: msi_domain_alloc_irqs: dev: > :bb:00.0 nvec 1 virq 36 > kworker/0:2-303 [000] 9.762268: msi_domain_alloc_irqs: dev: > :bb:00.3 nvec 1 virq 331 > kworker/0:2-303 [000] 9.762278: msi_domain_alloc_irqs: dev: > :bb:00.3 nvec 1 virq 332 > kworker/0:2-303 [000] 9.762288: msi_domain_alloc_irqs: dev: > :bb:00.3 nvec 1 virq 333 That's 300 vectors. > bb:00.[0-3] Ethernet controller: Intel Corporation Device 37d0 (rev 03) > > -+-[:b2]-+-00.0-[b3-bc]00.0-[b4-bc]--+-00.0-[b5-b6]00.0 > | | +-01.0-[b7-b8]00.0 > | | +-02.0-[b9-ba]00.0 > | | \-03.0-[bb-bc]--+-00.0 > | | +-00.1 > | | +-00.2 > | | \-00.3 > > and they are using i40e driver, the vectors should be reserved by: > i40e_probe() -> > i40e_init_interrupt_scheme() -> > i40e_init_msix() -> > i40e_reserve_msix_vectors() -> > pci_enable_msix_range() > > # ls /sys/kernel/debug/irq/irqs > 0 10 11 13 142 184 217 259 292 31 33 > 337 339 340 342 344 346 348 350 352 354 356 > 358 360 362 364 366 368 370 372 374 376 378 > 380 382 384 386 388 390 392 394 4 6 7 9 > 1 109 12 14 15 224 26 332 335 > 338 34 341 343 345 347 349 351 353 355 357 > 359 361 363 365 367 369 371 373 375 377 379 > 381 383 385 387 389 391 393 395 5 67 8 Out of these 300 interrupts exactly 8 randomly selected ones are actively used. And the other 292 interrupts are just there because it might need them in the future when the 32 CPU machine gets magically upgraded to 4096 cores at runtime? Can the i40e people @intel please fix this waste of resources and sanitize their interrupt allocation scheme? Please switch it over to managed interrupts so the affinity spreading happens in a sane way and the interrupts are properly managed on CPU hotplug. Thanks, tglx
Re: [GIT] Networking
On Wed, Sep 6, 2017 at 10:40 PM, Luca Coelhowrote: > > This patch is not very important (unless you really like blinking lights > -- maybe I'll change my mind when the holidays approach :P). so it is > fine if you just want to revert it for now. > > In any case, I'll send a patch fixing this problem soon. No need to revert if we can get this fixed quickly enough. I'll leave the fw-31 on my laptop, so that I can continue to use it for now. Thanks, Linus
Re: [GIT] Networking
On Wed, Sep 6, 2017 at 10:40 PM, Luca Coelho wrote: > > This patch is not very important (unless you really like blinking lights > -- maybe I'll change my mind when the holidays approach :P). so it is > fine if you just want to revert it for now. > > In any case, I'll send a patch fixing this problem soon. No need to revert if we can get this fixed quickly enough. I'll leave the fw-31 on my laptop, so that I can continue to use it for now. Thanks, Linus
[PATCH] security/keys: Adding newline after declaration
Fixing checkpatch warning to add newline after declaration Signed-off-by: Pushkar Jambhlekar--- security/keys/big_key.c | 1 + 1 file changed, 1 insertion(+) diff --git a/security/keys/big_key.c b/security/keys/big_key.c index 835c1ab..bc4c6eb 100644 --- a/security/keys/big_key.c +++ b/security/keys/big_key.c @@ -93,6 +93,7 @@ static int big_key_crypt(enum big_key_op op, u8 *data, size_t datalen, u8 *key) { int ret = -EINVAL; struct scatterlist sgio; + SKCIPHER_REQUEST_ON_STACK(req, big_key_skcipher); if (crypto_skcipher_setkey(big_key_skcipher, key, ENC_KEY_SIZE)) { -- 2.7.4
[PATCH] security/keys: Adding newline after declaration
Fixing checkpatch warning to add newline after declaration Signed-off-by: Pushkar Jambhlekar --- security/keys/big_key.c | 1 + 1 file changed, 1 insertion(+) diff --git a/security/keys/big_key.c b/security/keys/big_key.c index 835c1ab..bc4c6eb 100644 --- a/security/keys/big_key.c +++ b/security/keys/big_key.c @@ -93,6 +93,7 @@ static int big_key_crypt(enum big_key_op op, u8 *data, size_t datalen, u8 *key) { int ret = -EINVAL; struct scatterlist sgio; + SKCIPHER_REQUEST_ON_STACK(req, big_key_skcipher); if (crypto_skcipher_setkey(big_key_skcipher, key, ENC_KEY_SIZE)) { -- 2.7.4
Re: [GIT] Networking
On Thu, 2017-09-07 at 05:04 +, Coelho, Luciano wrote: > On Wed, 2017-09-06 at 21:57 -0700, Linus Torvalds wrote: > > On Wed, Sep 6, 2017 at 9:11 PM, Coelho, Luciano > >wrote: > > > > > > This seems to be a problem with backwards-compatibility with FW version > > > 27. We are now in version 31[1] and upgrading will probably fix that. > > > > I can confirm that fw version 31 works. > > Great, so I know for sure that this is a backwards-compatibility issue > with the FW API. > > > > > But obviously the driver should not fail miserably like this with > > > version 27, because it claims to support it still. > > > > Not just "claims to support it", but if it's what is shipped with a > > fairly recent distro like an up-to-date version of F26, I would really > > hope that the driver can still work with it. > > I totally agree, we support a bunch of older versions for that exact > reason. We just don't really test all the supported versions very > often. We should probably change that. > > I'll make sure it still works with version 27. Okay, I found the offending patch: commit 7089ae634c50544b29b31faf1a751e8765c8de3b Author: Johannes Berg AuthorDate: Wed Jun 28 16:19:49 2017 +0200 Commit: Luca Coelho CommitDate: Wed Aug 9 09:15:32 2017 +0300 iwlwifi: mvm: use firmware LED command where applicable On devices starting from 8000 series, the host can no longer toggle the LED through the CSR_LED_REG register, but must do it via the firmware instead. Add support for this. Note that this means that the LED cannot be turned on while the firmware is off, so using an arbitrary LED trigger may not work as expected. Fixes: 503ab8c56ca0 ("iwlwifi: Add 8000 HW family support") Signed-off-by: Johannes Berg Signed-off-by: Luca Coelho Reverting it solves the problem. We introduced a new command to control the LED lights and assumed it was available in older FW versions as well, which turned out not to be the case. This patch is not very important (unless you really like blinking lights -- maybe I'll change my mind when the holidays approach :P). so it is fine if you just want to revert it for now. In any case, I'll send a patch fixing this problem soon. -- Cheers, Luca.
Re: [GIT] Networking
On Thu, 2017-09-07 at 05:04 +, Coelho, Luciano wrote: > On Wed, 2017-09-06 at 21:57 -0700, Linus Torvalds wrote: > > On Wed, Sep 6, 2017 at 9:11 PM, Coelho, Luciano > > wrote: > > > > > > This seems to be a problem with backwards-compatibility with FW version > > > 27. We are now in version 31[1] and upgrading will probably fix that. > > > > I can confirm that fw version 31 works. > > Great, so I know for sure that this is a backwards-compatibility issue > with the FW API. > > > > > But obviously the driver should not fail miserably like this with > > > version 27, because it claims to support it still. > > > > Not just "claims to support it", but if it's what is shipped with a > > fairly recent distro like an up-to-date version of F26, I would really > > hope that the driver can still work with it. > > I totally agree, we support a bunch of older versions for that exact > reason. We just don't really test all the supported versions very > often. We should probably change that. > > I'll make sure it still works with version 27. Okay, I found the offending patch: commit 7089ae634c50544b29b31faf1a751e8765c8de3b Author: Johannes Berg AuthorDate: Wed Jun 28 16:19:49 2017 +0200 Commit: Luca Coelho CommitDate: Wed Aug 9 09:15:32 2017 +0300 iwlwifi: mvm: use firmware LED command where applicable On devices starting from 8000 series, the host can no longer toggle the LED through the CSR_LED_REG register, but must do it via the firmware instead. Add support for this. Note that this means that the LED cannot be turned on while the firmware is off, so using an arbitrary LED trigger may not work as expected. Fixes: 503ab8c56ca0 ("iwlwifi: Add 8000 HW family support") Signed-off-by: Johannes Berg Signed-off-by: Luca Coelho Reverting it solves the problem. We introduced a new command to control the LED lights and assumed it was available in older FW versions as well, which turned out not to be the case. This patch is not very important (unless you really like blinking lights -- maybe I'll change my mind when the holidays approach :P). so it is fine if you just want to revert it for now. In any case, I'll send a patch fixing this problem soon. -- Cheers, Luca.
[PATCH] arm64: KVM: VHE: reset PSTATE.UAO when switch to host
In VHE mode, host kernel runs in the EL2 and can enable 'User Access Override' when fs==KERNEL_DS so that it can access kernel memory. However, PSTATE.UAO is set to 0 on an exception taken from EL1 to EL2. Thus when VHE is used and exception taken from a guest UAO will be disabled and host will use the incorrect PSTATE.UAO. So check and reset the PSTATE.UAO when switching to host. Move the reset PSTATE.PAN on entry to EL2 together with PSTATE.UAO reset. Signed-off-by: Dongjiu GengSigned-off-by: Haibin Zhang Tested-by: Dongjiu Geng --- arch/arm64/kvm/hyp/entry.S | 2 -- arch/arm64/kvm/hyp/switch.c | 12 2 files changed, 12 insertions(+), 2 deletions(-) diff --git a/arch/arm64/kvm/hyp/entry.S b/arch/arm64/kvm/hyp/entry.S index 12ee62d..7662ef5 100644 --- a/arch/arm64/kvm/hyp/entry.S +++ b/arch/arm64/kvm/hyp/entry.S @@ -96,8 +96,6 @@ ENTRY(__guest_exit) add x1, x1, #VCPU_CONTEXT - ALTERNATIVE(nop, SET_PSTATE_PAN(1), ARM64_HAS_PAN, CONFIG_ARM64_PAN) - // Store the guest regs x2 and x3 stp x2, x3, [x1, #CPU_XREG_OFFSET(2)] diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c index a733461..715b3941 100644 --- a/arch/arm64/kvm/hyp/switch.c +++ b/arch/arm64/kvm/hyp/switch.c @@ -22,6 +22,7 @@ #include #include #include +#include static bool __hyp_text __fpsimd_enabled_nvhe(void) { @@ -399,6 +400,17 @@ int __hyp_text __kvm_vcpu_run(struct kvm_vcpu *vcpu) __sysreg_restore_host_state(host_ctxt); + if (has_vhe()) { + /* +* PSTATE was not saved over guest enter/exit, re-enable +* any detecte features that might not have been set +* correctly. +*/ + uao_thread_switch(current); + asm(ALTERNATIVE("nop", SET_PSTATE_PAN(1), + ARM64_HAS_PAN, CONFIG_ARM64_PAN)); + } + if (fp_enabled) { __fpsimd_save_state(_ctxt->gp_regs.fp_regs); __fpsimd_restore_state(_ctxt->gp_regs.fp_regs); -- 1.8.3.1
[PATCH] arm64: KVM: VHE: reset PSTATE.UAO when switch to host
In VHE mode, host kernel runs in the EL2 and can enable 'User Access Override' when fs==KERNEL_DS so that it can access kernel memory. However, PSTATE.UAO is set to 0 on an exception taken from EL1 to EL2. Thus when VHE is used and exception taken from a guest UAO will be disabled and host will use the incorrect PSTATE.UAO. So check and reset the PSTATE.UAO when switching to host. Move the reset PSTATE.PAN on entry to EL2 together with PSTATE.UAO reset. Signed-off-by: Dongjiu Geng Signed-off-by: Haibin Zhang Tested-by: Dongjiu Geng --- arch/arm64/kvm/hyp/entry.S | 2 -- arch/arm64/kvm/hyp/switch.c | 12 2 files changed, 12 insertions(+), 2 deletions(-) diff --git a/arch/arm64/kvm/hyp/entry.S b/arch/arm64/kvm/hyp/entry.S index 12ee62d..7662ef5 100644 --- a/arch/arm64/kvm/hyp/entry.S +++ b/arch/arm64/kvm/hyp/entry.S @@ -96,8 +96,6 @@ ENTRY(__guest_exit) add x1, x1, #VCPU_CONTEXT - ALTERNATIVE(nop, SET_PSTATE_PAN(1), ARM64_HAS_PAN, CONFIG_ARM64_PAN) - // Store the guest regs x2 and x3 stp x2, x3, [x1, #CPU_XREG_OFFSET(2)] diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c index a733461..715b3941 100644 --- a/arch/arm64/kvm/hyp/switch.c +++ b/arch/arm64/kvm/hyp/switch.c @@ -22,6 +22,7 @@ #include #include #include +#include static bool __hyp_text __fpsimd_enabled_nvhe(void) { @@ -399,6 +400,17 @@ int __hyp_text __kvm_vcpu_run(struct kvm_vcpu *vcpu) __sysreg_restore_host_state(host_ctxt); + if (has_vhe()) { + /* +* PSTATE was not saved over guest enter/exit, re-enable +* any detecte features that might not have been set +* correctly. +*/ + uao_thread_switch(current); + asm(ALTERNATIVE("nop", SET_PSTATE_PAN(1), + ARM64_HAS_PAN, CONFIG_ARM64_PAN)); + } + if (fp_enabled) { __fpsimd_save_state(_ctxt->gp_regs.fp_regs); __fpsimd_restore_state(_ctxt->gp_regs.fp_regs); -- 1.8.3.1
Re: [PATCH] [RESEND] drm/stm: fix warning about multiplication in condition
Hi Benjamin, This should be pushed to drm-misc by you, right? Thanks, Archit On 09/06/2017 06:43 PM, Arnd Bergmann wrote: gcc-7 complains about multiplying within a condition being suspicious: drivers/gpu/drm/stm/dw_mipi_dsi-stm.c: In function 'dsi_pll_get_clkout_khz': drivers/gpu/drm/stm/dw_mipi_dsi-stm.c:117:10: error: '*' in boolean context, suggest '&&' instead [-Werror=int-in-bool-context] The code here is correct, but can be easily rephrased to make that more obvious. I also swap out the error handling and the normal code path for clarity. Fixes: b0f09a3c69d9 ("drm/stm: Add STM32 DSI controller driver") Acked-by: Philippe CornuTested-by: Philippe Cornu Signed-off-by: Arnd Bergmann --- Originally sent on July 25, but never made it into linux-next. The warning is currently disabled in mainline, but this seems to be a legitimate instance, and we may want to turn it back on in the future. --- drivers/gpu/drm/stm/dw_mipi_dsi-stm.c | 8 +--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/stm/dw_mipi_dsi-stm.c b/drivers/gpu/drm/stm/dw_mipi_dsi-stm.c index 568c5d0461ea..e5b6310240fe 100644 --- a/drivers/gpu/drm/stm/dw_mipi_dsi-stm.c +++ b/drivers/gpu/drm/stm/dw_mipi_dsi-stm.c @@ -113,11 +113,13 @@ static enum dsi_color dsi_color_from_mipi(enum mipi_dsi_pixel_format fmt) static int dsi_pll_get_clkout_khz(int clkin_khz, int idf, int ndiv, int odf) { + int divisor = idf * odf; + /* prevent from division by 0 */ - if (idf * odf) - return DIV_ROUND_CLOSEST(clkin_khz * ndiv, idf * odf); + if (!divisor) + return 0; - return 0; + return DIV_ROUND_CLOSEST(clkin_khz * ndiv, divisor); } static int dsi_pll_get_params(int clkin_khz, int clkout_khz, -- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
Re: [PATCH] [RESEND] drm/stm: fix warning about multiplication in condition
Hi Benjamin, This should be pushed to drm-misc by you, right? Thanks, Archit On 09/06/2017 06:43 PM, Arnd Bergmann wrote: gcc-7 complains about multiplying within a condition being suspicious: drivers/gpu/drm/stm/dw_mipi_dsi-stm.c: In function 'dsi_pll_get_clkout_khz': drivers/gpu/drm/stm/dw_mipi_dsi-stm.c:117:10: error: '*' in boolean context, suggest '&&' instead [-Werror=int-in-bool-context] The code here is correct, but can be easily rephrased to make that more obvious. I also swap out the error handling and the normal code path for clarity. Fixes: b0f09a3c69d9 ("drm/stm: Add STM32 DSI controller driver") Acked-by: Philippe Cornu Tested-by: Philippe Cornu Signed-off-by: Arnd Bergmann --- Originally sent on July 25, but never made it into linux-next. The warning is currently disabled in mainline, but this seems to be a legitimate instance, and we may want to turn it back on in the future. --- drivers/gpu/drm/stm/dw_mipi_dsi-stm.c | 8 +--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/stm/dw_mipi_dsi-stm.c b/drivers/gpu/drm/stm/dw_mipi_dsi-stm.c index 568c5d0461ea..e5b6310240fe 100644 --- a/drivers/gpu/drm/stm/dw_mipi_dsi-stm.c +++ b/drivers/gpu/drm/stm/dw_mipi_dsi-stm.c @@ -113,11 +113,13 @@ static enum dsi_color dsi_color_from_mipi(enum mipi_dsi_pixel_format fmt) static int dsi_pll_get_clkout_khz(int clkin_khz, int idf, int ndiv, int odf) { + int divisor = idf * odf; + /* prevent from division by 0 */ - if (idf * odf) - return DIV_ROUND_CLOSEST(clkin_khz * ndiv, idf * odf); + if (!divisor) + return 0; - return 0; + return DIV_ROUND_CLOSEST(clkin_khz * ndiv, divisor); } static int dsi_pll_get_params(int clkin_khz, int clkout_khz, -- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
Re: linux-next: build failure after merge of the akpm-current tree
Hi all, On Wed, 2 Aug 2017 16:31:45 +1000 Stephen Rothwellwrote: > > On Wed, 2 Aug 2017 15:45:54 +1000 Stephen Rothwell > wrote: > > > > On Tue, 01 Aug 2017 09:08:01 -0400 "Zi Yan" wrote: > > > > > > I found two possible fixes. > > > > > > 1. This uses C++ zero initializer, GCC is OK with it. > > > I tested with GCC 4.9.3 (has the initialization bug) and GCC 6.4.0. > > > > > > --- a/include/linux/swapops.h~a > > > +++ a/include/linux/swapops.h > > > @@ -217,7 +217,7 @@ static inline swp_entry_t pmd_to_swp_ent > > > > > > static inline pmd_t swp_entry_to_pmd(swp_entry_t entry) > > > { > > > - return (pmd_t){ 0 }; > > > + return (pmd_t){}; > > > } > > > > I have done that for today ... please decide which is best (or find > > something better - maybe every platform really needs to have a __pmd() > > definition) and submit a real fix patch to Andrew. > > OK, that failed for my compiler (gcc 5.2.0) like this: > > In file included from mm/vmscan.c:55:0: > include/linux/swapops.h: In function 'swp_entry_to_pmd': > include/linux/swapops.h:226:16: error: empty scalar initializer > return (pmd_t){}; > ^ > include/linux/swapops.h:226:16: note: (near initialization for '(anonymous)') This has reappeared today :-( (for the arm multi_v7_defconfig build at least) > So I used the other idea (on top of Andrew's current tree): The fix patch now looks like this: From: Stephen Rothwell Date: Wed, 2 Aug 2017 15:55:02 +1000 Subject: [PATCH] mm-thp-enable-thp-migration-in-generic-path-fix-fix Signed-off-by: Stephen Rothwell --- include/linux/swapops.h | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/include/linux/swapops.h b/include/linux/swapops.h index 45b092aa6419..61cffa148a79 100644 --- a/include/linux/swapops.h +++ b/include/linux/swapops.h @@ -223,7 +223,9 @@ static inline swp_entry_t pmd_to_swp_entry(pmd_t pmd) static inline pmd_t swp_entry_to_pmd(swp_entry_t entry) { - return (pmd_t){}; + pmd_t e; + memset(, 0, sizeof(pmd_t)); + return e; } static inline int is_pmd_migration_entry(pmd_t pmd) -- Cheers, Stephen Rothwell
Re: linux-next: build failure after merge of the akpm-current tree
Hi all, On Wed, 2 Aug 2017 16:31:45 +1000 Stephen Rothwell wrote: > > On Wed, 2 Aug 2017 15:45:54 +1000 Stephen Rothwell > wrote: > > > > On Tue, 01 Aug 2017 09:08:01 -0400 "Zi Yan" wrote: > > > > > > I found two possible fixes. > > > > > > 1. This uses C++ zero initializer, GCC is OK with it. > > > I tested with GCC 4.9.3 (has the initialization bug) and GCC 6.4.0. > > > > > > --- a/include/linux/swapops.h~a > > > +++ a/include/linux/swapops.h > > > @@ -217,7 +217,7 @@ static inline swp_entry_t pmd_to_swp_ent > > > > > > static inline pmd_t swp_entry_to_pmd(swp_entry_t entry) > > > { > > > - return (pmd_t){ 0 }; > > > + return (pmd_t){}; > > > } > > > > I have done that for today ... please decide which is best (or find > > something better - maybe every platform really needs to have a __pmd() > > definition) and submit a real fix patch to Andrew. > > OK, that failed for my compiler (gcc 5.2.0) like this: > > In file included from mm/vmscan.c:55:0: > include/linux/swapops.h: In function 'swp_entry_to_pmd': > include/linux/swapops.h:226:16: error: empty scalar initializer > return (pmd_t){}; > ^ > include/linux/swapops.h:226:16: note: (near initialization for '(anonymous)') This has reappeared today :-( (for the arm multi_v7_defconfig build at least) > So I used the other idea (on top of Andrew's current tree): The fix patch now looks like this: From: Stephen Rothwell Date: Wed, 2 Aug 2017 15:55:02 +1000 Subject: [PATCH] mm-thp-enable-thp-migration-in-generic-path-fix-fix Signed-off-by: Stephen Rothwell --- include/linux/swapops.h | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/include/linux/swapops.h b/include/linux/swapops.h index 45b092aa6419..61cffa148a79 100644 --- a/include/linux/swapops.h +++ b/include/linux/swapops.h @@ -223,7 +223,9 @@ static inline swp_entry_t pmd_to_swp_entry(pmd_t pmd) static inline pmd_t swp_entry_to_pmd(swp_entry_t entry) { - return (pmd_t){}; + pmd_t e; + memset(, 0, sizeof(pmd_t)); + return e; } static inline int is_pmd_migration_entry(pmd_t pmd) -- Cheers, Stephen Rothwell
Re: [PATCH v8 01/13] x86/apic: Construct a selector for the interrupt delivery mode
On 09/07/17 at 12:19pm, Dou Liyang wrote: > Hi Baoquan > > I am wordy one ah: > our target is checking if BIOS supports APIC, no matter what > type(separated/integrated) it is. if not, go to PIC mode. > > Let‘s discuss the original logic and the smp_found_config, > then take about your code. > > The existing logic is: > > if (!boot_cpu_has(X86_FEATURE_APIC) && !smp_found_config) ...(1) > return -1; > > if (!boot_cpu_has(X86_FEATURE_APIC) && > APIC_INTEGRATED(boot_cpu_apic_version)) { ...(2) > pr_err(); > > why smp_found_config has to be checked in (1)? > > Because, In case of discrete (pretty old) apics we may not set > X86_FEATURE_APIC bit in cpuid, with 82489DX we can't rely on apic > feature bit retrieved via cpuid(boot_cpu_has(X86_FEATURE_APIC)).[1] > So we assume that if SMP configuration is found from MP table > (smp_found_config = 1) in above case, there maybe a separated > chip in our pc. > > After passing the check of (1), we in (2), check whether local APIC > is detected or not, If we have a BIOS bug. > > [1] Commit 8312136fa8b0("x86, apic: Fix missed handling of discrete apics") Hmm, sounds reasonable. Just a sentence to describe it could be better. > > At 09/06/2017 06:17 PM, Baoquan He wrote: > > Hi Dou, > > > > On 08/28/17 at 11:20am, Dou Liyang wrote: > > > +static int __init apic_intr_mode_select(void) > > > +{ > > > + /* Check kernel option */ > > > + if (disable_apic) { > > > + pr_info("APIC disabled via kernel command line\n"); > > > + return APIC_PIC; > > > + } > > > + > > > > I am not very familiar with cpu registers, not sure if we can adjust > > below code flow as: > > > > /* If APIC is integrated, check local APIC only */ > > if (lapic_is_integrated() && !boot_cpu_has(X86_FEATURE_APIC)) { > > disable_apic = 1; > > pr_info("APIC disabled by BIOS\n"); > > return APIC_PIC; > > } > > > > /* If APIC is on a separate chip, check if smp_found_config is found*/ > > if (!lapic_is_integrated() && !smp_found_config) { > > disable_apic = 1; > > return APIC_PIC; > > } > > Yes, Awesome, we first consider it from APIC register space, then > the BOIS and software configration. let me do more investigation. > > I rewrite it based on you, any comments will welcome. > > /* If APIC is not integrated, check if SMP configuration is >* found from MP table. If not too, no 82489DX. switch to >* PIC mode >* >* Else APIC is integrated, check if the BIOS allows local APIC >* >*/ > if (!lapic_is_integrated()) { > if (!smp_found_config) { > disable_apic = 1; > return APIC_PIC; > } > } else if(!boot_cpu_has(X86_FEATURE_APIC)) { > disable_apic = 1; > pr_info("APIC disabled by BIOS\n"); > return APIC_PIC; > } > } Yeah, it's fine to me. At least the logic looks more understandable. > > BTW, As the macro APIC_INTEGRATED(x) has already wrapped by > CONFIG_X86_32, I will cleanup the lapic_is_integrated() for readablity > like that: Yes, looks good. There's duplicate judgement of X86_64 in lapic_is_integrated. > > diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c > index 7834f73..63b3ae9 100644 > --- a/arch/x86/kernel/apic/apic.c > +++ b/arch/x86/kernel/apic/apic.c > @@ -211,11 +211,7 @@ static inline int lapic_get_version(void) > */ > static inline int lapic_is_integrated(void) > { > -#ifdef CONFIG_X86_64 > - return 1; > -#else > return APIC_INTEGRATED(lapic_get_version()); > -#endif > } > > > Do you think so. ;-) > > > Thanks, > dou. > > > > Now, I haven't think of why smp_found_config has to be > > checked here. > > > > In this way, we don't need the CONFIG_X86_64 checking since it's > > contained in lapic_is_integrated() already. And the checking is obvious > > for understanding. Just not very sure if the checking is adequate. > > > > Just my personal opinion. > > > > > + /* Check BIOS */ > > > +#ifdef CONFIG_X86_64 > > > + /* On 64-bit, the APIC must be integrated, Check local APIC only */ > > > + if (!boot_cpu_has(X86_FEATURE_APIC)) { > > > + disable_apic = 1; > > > + pr_info("APIC disabled by BIOS\n"); > > > + return APIC_PIC; > > > + } > > > +#else > > > + /* > > > + * On 32-bit, check whether there is a separate chip or integrated > > > + * APIC > > > + */ > > > + > > > + /* Has a separate chip ? */ > > > + if (!boot_cpu_has(X86_FEATURE_APIC) && !smp_found_config) { > > > + disable_apic = 1; > > > + > > > + return APIC_PIC; > > > + } > > > + > > > + /* Has a local APIC ? */ > > > + if (!boot_cpu_has(X86_FEATURE_APIC) && > > > + APIC_INTEGRATED(boot_cpu_apic_version)) { > > > +
Re: [PATCH v8 01/13] x86/apic: Construct a selector for the interrupt delivery mode
On 09/07/17 at 12:19pm, Dou Liyang wrote: > Hi Baoquan > > I am wordy one ah: > our target is checking if BIOS supports APIC, no matter what > type(separated/integrated) it is. if not, go to PIC mode. > > Let‘s discuss the original logic and the smp_found_config, > then take about your code. > > The existing logic is: > > if (!boot_cpu_has(X86_FEATURE_APIC) && !smp_found_config) ...(1) > return -1; > > if (!boot_cpu_has(X86_FEATURE_APIC) && > APIC_INTEGRATED(boot_cpu_apic_version)) { ...(2) > pr_err(); > > why smp_found_config has to be checked in (1)? > > Because, In case of discrete (pretty old) apics we may not set > X86_FEATURE_APIC bit in cpuid, with 82489DX we can't rely on apic > feature bit retrieved via cpuid(boot_cpu_has(X86_FEATURE_APIC)).[1] > So we assume that if SMP configuration is found from MP table > (smp_found_config = 1) in above case, there maybe a separated > chip in our pc. > > After passing the check of (1), we in (2), check whether local APIC > is detected or not, If we have a BIOS bug. > > [1] Commit 8312136fa8b0("x86, apic: Fix missed handling of discrete apics") Hmm, sounds reasonable. Just a sentence to describe it could be better. > > At 09/06/2017 06:17 PM, Baoquan He wrote: > > Hi Dou, > > > > On 08/28/17 at 11:20am, Dou Liyang wrote: > > > +static int __init apic_intr_mode_select(void) > > > +{ > > > + /* Check kernel option */ > > > + if (disable_apic) { > > > + pr_info("APIC disabled via kernel command line\n"); > > > + return APIC_PIC; > > > + } > > > + > > > > I am not very familiar with cpu registers, not sure if we can adjust > > below code flow as: > > > > /* If APIC is integrated, check local APIC only */ > > if (lapic_is_integrated() && !boot_cpu_has(X86_FEATURE_APIC)) { > > disable_apic = 1; > > pr_info("APIC disabled by BIOS\n"); > > return APIC_PIC; > > } > > > > /* If APIC is on a separate chip, check if smp_found_config is found*/ > > if (!lapic_is_integrated() && !smp_found_config) { > > disable_apic = 1; > > return APIC_PIC; > > } > > Yes, Awesome, we first consider it from APIC register space, then > the BOIS and software configration. let me do more investigation. > > I rewrite it based on you, any comments will welcome. > > /* If APIC is not integrated, check if SMP configuration is >* found from MP table. If not too, no 82489DX. switch to >* PIC mode >* >* Else APIC is integrated, check if the BIOS allows local APIC >* >*/ > if (!lapic_is_integrated()) { > if (!smp_found_config) { > disable_apic = 1; > return APIC_PIC; > } > } else if(!boot_cpu_has(X86_FEATURE_APIC)) { > disable_apic = 1; > pr_info("APIC disabled by BIOS\n"); > return APIC_PIC; > } > } Yeah, it's fine to me. At least the logic looks more understandable. > > BTW, As the macro APIC_INTEGRATED(x) has already wrapped by > CONFIG_X86_32, I will cleanup the lapic_is_integrated() for readablity > like that: Yes, looks good. There's duplicate judgement of X86_64 in lapic_is_integrated. > > diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c > index 7834f73..63b3ae9 100644 > --- a/arch/x86/kernel/apic/apic.c > +++ b/arch/x86/kernel/apic/apic.c > @@ -211,11 +211,7 @@ static inline int lapic_get_version(void) > */ > static inline int lapic_is_integrated(void) > { > -#ifdef CONFIG_X86_64 > - return 1; > -#else > return APIC_INTEGRATED(lapic_get_version()); > -#endif > } > > > Do you think so. ;-) > > > Thanks, > dou. > > > > Now, I haven't think of why smp_found_config has to be > > checked here. > > > > In this way, we don't need the CONFIG_X86_64 checking since it's > > contained in lapic_is_integrated() already. And the checking is obvious > > for understanding. Just not very sure if the checking is adequate. > > > > Just my personal opinion. > > > > > + /* Check BIOS */ > > > +#ifdef CONFIG_X86_64 > > > + /* On 64-bit, the APIC must be integrated, Check local APIC only */ > > > + if (!boot_cpu_has(X86_FEATURE_APIC)) { > > > + disable_apic = 1; > > > + pr_info("APIC disabled by BIOS\n"); > > > + return APIC_PIC; > > > + } > > > +#else > > > + /* > > > + * On 32-bit, check whether there is a separate chip or integrated > > > + * APIC > > > + */ > > > + > > > + /* Has a separate chip ? */ > > > + if (!boot_cpu_has(X86_FEATURE_APIC) && !smp_found_config) { > > > + disable_apic = 1; > > > + > > > + return APIC_PIC; > > > + } > > > + > > > + /* Has a local APIC ? */ > > > + if (!boot_cpu_has(X86_FEATURE_APIC) && > > > + APIC_INTEGRATED(boot_cpu_apic_version)) { > > > +
Re: [PATCH 3/3] drm/exynos/gsc: Add rotation hardware limits of gscaler
On 09/04/2017 03:19 PM, Hoegeun Kwon wrote: On 09/01/2017 04:31 PM, Marek Szyprowski wrote: Hi Hoegeun, On 2017-09-01 03:47, Hoegeun Kwon wrote: The gscaler has hardware rotation limits that need to be imported from dts. Parse them and add them to the property list. The rotation hardware limits are related to the cropped source size. When swap occurs, use rot_max size instead of crop_max size. Also the scaling limits are related to post size, use pos size to check the limits. Signed-off-by: Hoegeun Kwon--- drivers/gpu/drm/exynos/exynos_drm_gsc.c | 63 + include/uapi/drm/exynos_drm.h | 2 ++ 2 files changed, 42 insertions(+), 23 deletions(-) diff --git a/drivers/gpu/drm/exynos/exynos_drm_gsc.c b/drivers/gpu/drm/exynos/exynos_drm_gsc.c index 0506b2b..dd9b057 100644 --- a/drivers/gpu/drm/exynos/exynos_drm_gsc.c +++ b/drivers/gpu/drm/exynos/exynos_drm_gsc.c @@ -1401,6 +1401,23 @@ static int gsc_ippdrv_check_property(struct device *dev, bool swap; int i; +config = >config[EXYNOS_DRM_OPS_DST]; + +/* check for degree */ +switch (config->degree) { +case EXYNOS_DRM_DEGREE_90: +case EXYNOS_DRM_DEGREE_270: +swap = true; +break; +case EXYNOS_DRM_DEGREE_0: +case EXYNOS_DRM_DEGREE_180: +swap = false; +break; +default: +DRM_ERROR("invalid degree.\n"); +goto err_property; +} + for_each_ipp_ops(i) { if ((i == EXYNOS_DRM_OPS_SRC) && (property->cmd == IPP_CMD_WB)) @@ -1416,21 +1433,6 @@ static int gsc_ippdrv_check_property(struct device *dev, goto err_property; } -/* check for degree */ -switch (config->degree) { -case EXYNOS_DRM_DEGREE_90: -case EXYNOS_DRM_DEGREE_270: -swap = true; -break; -case EXYNOS_DRM_DEGREE_0: -case EXYNOS_DRM_DEGREE_180: -swap = false; -break; -default: -DRM_ERROR("invalid degree.\n"); -goto err_property; -} - /* check for buffer bound */ if ((pos->x + pos->w > sz->hsize) || (pos->y + pos->h > sz->vsize)) { @@ -1438,21 +1440,27 @@ static int gsc_ippdrv_check_property(struct device *dev, goto err_property; } +/* + * The rotation hardware limits are related to the cropped + * source size. So use rot_max size to check the limits when + * swap happens. And also the scaling limits are related to pos + * size, use pos size to check the limits. + */ /* check for crop */ if ((i == EXYNOS_DRM_OPS_SRC) && (pp->crop)) { if (swap) { if ((pos->h < pp->crop_min.hsize) || -(sz->vsize > pp->crop_max.hsize) || +(pos->h > pp->rot_max.hsize) || (pos->w < pp->crop_min.vsize) || -(sz->hsize > pp->crop_max.vsize)) { +(pos->w > pp->rot_max.vsize)) { DRM_ERROR("out of crop size.\n"); goto err_property; } } else { if ((pos->w < pp->crop_min.hsize) || -(sz->hsize > pp->crop_max.hsize) || +(pos->w > pp->crop_max.hsize) || (pos->h < pp->crop_min.vsize) || -(sz->vsize > pp->crop_max.vsize)) { +(pos->h > pp->crop_max.vsize)) { DRM_ERROR("out of crop size.\n"); goto err_property; } @@ -1463,17 +1471,17 @@ static int gsc_ippdrv_check_property(struct device *dev, if ((i == EXYNOS_DRM_OPS_DST) && (pp->scale)) { if (swap) { if ((pos->h < pp->scale_min.hsize) || -(sz->vsize > pp->scale_max.hsize) || +(pos->h > pp->scale_max.hsize) || (pos->w < pp->scale_min.vsize) || -(sz->hsize > pp->scale_max.vsize)) { +(pos->w > pp->scale_max.vsize)) { DRM_ERROR("out of scale size.\n"); goto err_property; } } else { if ((pos->w < pp->scale_min.hsize) || -(sz->hsize > pp->scale_max.hsize) || +(pos->w > pp->scale_max.hsize) || (pos->h < pp->scale_min.vsize) || -(sz->vsize > pp->scale_max.vsize)) { +(pos->h > pp->scale_max.vsize)) { DRM_ERROR("out of scale size.\n"); goto err_property; } @@ -1676,6 +1684,15 @@ static int gsc_probe(struct platform_device *pdev) dev_warn(dev, "failed to get system register.\n"); ctx->sysreg = NULL;
Re: [PATCH 3/3] drm/exynos/gsc: Add rotation hardware limits of gscaler
On 09/04/2017 03:19 PM, Hoegeun Kwon wrote: On 09/01/2017 04:31 PM, Marek Szyprowski wrote: Hi Hoegeun, On 2017-09-01 03:47, Hoegeun Kwon wrote: The gscaler has hardware rotation limits that need to be imported from dts. Parse them and add them to the property list. The rotation hardware limits are related to the cropped source size. When swap occurs, use rot_max size instead of crop_max size. Also the scaling limits are related to post size, use pos size to check the limits. Signed-off-by: Hoegeun Kwon --- drivers/gpu/drm/exynos/exynos_drm_gsc.c | 63 + include/uapi/drm/exynos_drm.h | 2 ++ 2 files changed, 42 insertions(+), 23 deletions(-) diff --git a/drivers/gpu/drm/exynos/exynos_drm_gsc.c b/drivers/gpu/drm/exynos/exynos_drm_gsc.c index 0506b2b..dd9b057 100644 --- a/drivers/gpu/drm/exynos/exynos_drm_gsc.c +++ b/drivers/gpu/drm/exynos/exynos_drm_gsc.c @@ -1401,6 +1401,23 @@ static int gsc_ippdrv_check_property(struct device *dev, bool swap; int i; +config = >config[EXYNOS_DRM_OPS_DST]; + +/* check for degree */ +switch (config->degree) { +case EXYNOS_DRM_DEGREE_90: +case EXYNOS_DRM_DEGREE_270: +swap = true; +break; +case EXYNOS_DRM_DEGREE_0: +case EXYNOS_DRM_DEGREE_180: +swap = false; +break; +default: +DRM_ERROR("invalid degree.\n"); +goto err_property; +} + for_each_ipp_ops(i) { if ((i == EXYNOS_DRM_OPS_SRC) && (property->cmd == IPP_CMD_WB)) @@ -1416,21 +1433,6 @@ static int gsc_ippdrv_check_property(struct device *dev, goto err_property; } -/* check for degree */ -switch (config->degree) { -case EXYNOS_DRM_DEGREE_90: -case EXYNOS_DRM_DEGREE_270: -swap = true; -break; -case EXYNOS_DRM_DEGREE_0: -case EXYNOS_DRM_DEGREE_180: -swap = false; -break; -default: -DRM_ERROR("invalid degree.\n"); -goto err_property; -} - /* check for buffer bound */ if ((pos->x + pos->w > sz->hsize) || (pos->y + pos->h > sz->vsize)) { @@ -1438,21 +1440,27 @@ static int gsc_ippdrv_check_property(struct device *dev, goto err_property; } +/* + * The rotation hardware limits are related to the cropped + * source size. So use rot_max size to check the limits when + * swap happens. And also the scaling limits are related to pos + * size, use pos size to check the limits. + */ /* check for crop */ if ((i == EXYNOS_DRM_OPS_SRC) && (pp->crop)) { if (swap) { if ((pos->h < pp->crop_min.hsize) || -(sz->vsize > pp->crop_max.hsize) || +(pos->h > pp->rot_max.hsize) || (pos->w < pp->crop_min.vsize) || -(sz->hsize > pp->crop_max.vsize)) { +(pos->w > pp->rot_max.vsize)) { DRM_ERROR("out of crop size.\n"); goto err_property; } } else { if ((pos->w < pp->crop_min.hsize) || -(sz->hsize > pp->crop_max.hsize) || +(pos->w > pp->crop_max.hsize) || (pos->h < pp->crop_min.vsize) || -(sz->vsize > pp->crop_max.vsize)) { +(pos->h > pp->crop_max.vsize)) { DRM_ERROR("out of crop size.\n"); goto err_property; } @@ -1463,17 +1471,17 @@ static int gsc_ippdrv_check_property(struct device *dev, if ((i == EXYNOS_DRM_OPS_DST) && (pp->scale)) { if (swap) { if ((pos->h < pp->scale_min.hsize) || -(sz->vsize > pp->scale_max.hsize) || +(pos->h > pp->scale_max.hsize) || (pos->w < pp->scale_min.vsize) || -(sz->hsize > pp->scale_max.vsize)) { +(pos->w > pp->scale_max.vsize)) { DRM_ERROR("out of scale size.\n"); goto err_property; } } else { if ((pos->w < pp->scale_min.hsize) || -(sz->hsize > pp->scale_max.hsize) || +(pos->w > pp->scale_max.hsize) || (pos->h < pp->scale_min.vsize) || -(sz->vsize > pp->scale_max.vsize)) { +(pos->h > pp->scale_max.vsize)) { DRM_ERROR("out of scale size.\n"); goto err_property; } @@ -1676,6 +1684,15 @@ static int gsc_probe(struct platform_device *pdev) dev_warn(dev, "failed to get system register.\n"); ctx->sysreg = NULL; } + +ret =
Re: [PATCH v4 1/3] arm: npcm: add basic support for Nuvoton BMCs
On Wed, Sep 6, 2017 at 2:04 PM, Avi Fishmanwrote: > Sorry for sending again some mailing list rejected it due to HTML > involved (althoug I tried Plain Text), Trying again. > > 2017-09-06 11:07 GMT+03:00 Brendan Higgins : >> Adds basic support for the Nuvoton NPCM750 BMC. >> >> Signed-off-by: Brendan Higgins >> --- >> arch/arm/Kconfig | 2 + >> arch/arm/Makefile| 1 + >> arch/arm/mach-npcm/Kconfig | 50 + >> arch/arm/mach-npcm/Makefile | 3 ++ >> arch/arm/mach-npcm/headsmp.S | 17 + >> arch/arm/mach-npcm/npcm7xx.c | 34 + >> arch/arm/mach-npcm/platsmp.c | 89 >> >> 7 files changed, 196 insertions(+) >> create mode 100644 arch/arm/mach-npcm/Kconfig >> create mode 100644 arch/arm/mach-npcm/Makefile >> create mode 100644 arch/arm/mach-npcm/headsmp.S >> create mode 100644 arch/arm/mach-npcm/npcm7xx.c >> create mode 100644 arch/arm/mach-npcm/platsmp.c >> >> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig >> index 61a0cb15067e..05543f1cfbde 100644 >> --- a/arch/arm/Kconfig >> +++ b/arch/arm/Kconfig >> @@ -782,6 +782,8 @@ source "arch/arm/mach-netx/Kconfig" >> >> source "arch/arm/mach-nomadik/Kconfig" >> >> +source "arch/arm/mach-npcm/Kconfig" >> + >> source "arch/arm/mach-nspire/Kconfig" >> >> source "arch/arm/plat-omap/Kconfig" >> diff --git a/arch/arm/Makefile b/arch/arm/Makefile >> index 47d3a1ab08d2..60ca50c7d762 100644 >> --- a/arch/arm/Makefile >> +++ b/arch/arm/Makefile >> @@ -191,6 +191,7 @@ machine-$(CONFIG_ARCH_MEDIATEK) += mediatek >> machine-$(CONFIG_ARCH_MXS) += mxs >> machine-$(CONFIG_ARCH_NETX)+= netx >> machine-$(CONFIG_ARCH_NOMADIK) += nomadik >> +machine-$(CONFIG_ARCH_NPCM)+= npcm >> machine-$(CONFIG_ARCH_NSPIRE) += nspire >> machine-$(CONFIG_ARCH_OXNAS) += oxnas >> machine-$(CONFIG_ARCH_OMAP1) += omap1 >> diff --git a/arch/arm/mach-npcm/Kconfig b/arch/arm/mach-npcm/Kconfig >> new file mode 100644 >> index ..d47061855439 >> --- /dev/null >> +++ b/arch/arm/mach-npcm/Kconfig >> @@ -0,0 +1,50 @@ >> +menuconfig ARCH_NPCM >> + bool "Nuvoton NPCM Architecture" >> + select ARCH_REQUIRE_GPIOLIB >> + select USE_OF >> + select PINCTRL >> + select PINCTRL_NPCM7XX >> + >> +if ARCH_NPCM >> + >> +comment "NPCMX50 CPU type" >> + >> +config CPU_NPCM750 >> + depends on ARCH_NPCM && ARCH_MULTI_V7 >> + bool "Support for NPCM750 BMC CPU (Poleg)" >> + select CACHE_L2X0 >> + select CPU_V7 >> + select ARM_GIC >> + select HAVE_SMP >> + select SMP >> + select SMP_ON_UP >> + select HAVE_ARM_SCU >> + select HAVE_ARM_TWD if SMP >> + select ARM_ERRATA_458693 >> + select ARM_ERRATA_720789 >> + select ARM_ERRATA_742231 >> + select ARM_ERRATA_754322 >> + select ARM_ERRATA_764369 >> + select ARM_ERRATA_794072 >> + select PL310_ERRATA_588369 >> + select PL310_ERRATA_727915 >> + select USB_EHCI_ROOT_HUB_TT >> + select USB_ARCH_HAS_HCD >> + select USB_ARCH_HAS_EHCI >> + select USB_EHCI_HCD >> + select USB_ARCH_HAS_OHCI >> + select USB_OHCI_HCD >> + select USB >> + select FIQ >> + select CPU_USE_DOMAINS >> + select GENERIC_CLOCKEVENTS >> + select CLKDEV_LOOKUP >> + select COMMON_CLK if OF >> + select NPCM750_TIMER >> + select MFD_SYSCON >> + help >> + Support for NPCM750 BMC CPU (Poleg). >> + >> + Nuvoton NPCM750 BMC based on the Cortex A9. >> + >> +endif >> diff --git a/arch/arm/mach-npcm/Makefile b/arch/arm/mach-npcm/Makefile >> new file mode 100644 >> index ..78416055b854 >> --- /dev/null >> +++ b/arch/arm/mach-npcm/Makefile >> @@ -0,0 +1,3 @@ >> +AFLAGS_headsmp.o += -march=armv7-a >> + >> +obj-$(CONFIG_CPU_NPCM750) += npcm7xx.o platsmp.o headsmp.o >> diff --git a/arch/arm/mach-npcm/headsmp.S b/arch/arm/mach-npcm/headsmp.S >> new file mode 100644 >> index ..9fccbbd49ed4 >> --- /dev/null >> +++ b/arch/arm/mach-npcm/headsmp.S >> @@ -0,0 +1,17 @@ >> +/* >> + * Copyright 2017 Google, Inc. >> + * >> + * This program is free software; you can redistribute it and/or modify >> + * it under the terms of the GNU General Public License version 2 as >> + * published by the Free Software Foundation. >> + */ >> + >> +#include >> +#include >> +#include >> + >> +ENTRY(npcm7xx_secondary_startup) >> + safe_svcmode_maskall r0 > > I saw you answered to Florian Fainelli that the BootRom is not > starting the secondary CPU in SVC mode. Are you sure? In our > engineering npcm7xx revision, Z1, the BootRom indeed started to run it > in IRQ mode but we fixed it in the production version, A1, (quite long > time ago), I hope you didn't get an EB with
Re: [PATCH v4 1/3] arm: npcm: add basic support for Nuvoton BMCs
On Wed, Sep 6, 2017 at 2:04 PM, Avi Fishman wrote: > Sorry for sending again some mailing list rejected it due to HTML > involved (althoug I tried Plain Text), Trying again. > > 2017-09-06 11:07 GMT+03:00 Brendan Higgins : >> Adds basic support for the Nuvoton NPCM750 BMC. >> >> Signed-off-by: Brendan Higgins >> --- >> arch/arm/Kconfig | 2 + >> arch/arm/Makefile| 1 + >> arch/arm/mach-npcm/Kconfig | 50 + >> arch/arm/mach-npcm/Makefile | 3 ++ >> arch/arm/mach-npcm/headsmp.S | 17 + >> arch/arm/mach-npcm/npcm7xx.c | 34 + >> arch/arm/mach-npcm/platsmp.c | 89 >> >> 7 files changed, 196 insertions(+) >> create mode 100644 arch/arm/mach-npcm/Kconfig >> create mode 100644 arch/arm/mach-npcm/Makefile >> create mode 100644 arch/arm/mach-npcm/headsmp.S >> create mode 100644 arch/arm/mach-npcm/npcm7xx.c >> create mode 100644 arch/arm/mach-npcm/platsmp.c >> >> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig >> index 61a0cb15067e..05543f1cfbde 100644 >> --- a/arch/arm/Kconfig >> +++ b/arch/arm/Kconfig >> @@ -782,6 +782,8 @@ source "arch/arm/mach-netx/Kconfig" >> >> source "arch/arm/mach-nomadik/Kconfig" >> >> +source "arch/arm/mach-npcm/Kconfig" >> + >> source "arch/arm/mach-nspire/Kconfig" >> >> source "arch/arm/plat-omap/Kconfig" >> diff --git a/arch/arm/Makefile b/arch/arm/Makefile >> index 47d3a1ab08d2..60ca50c7d762 100644 >> --- a/arch/arm/Makefile >> +++ b/arch/arm/Makefile >> @@ -191,6 +191,7 @@ machine-$(CONFIG_ARCH_MEDIATEK) += mediatek >> machine-$(CONFIG_ARCH_MXS) += mxs >> machine-$(CONFIG_ARCH_NETX)+= netx >> machine-$(CONFIG_ARCH_NOMADIK) += nomadik >> +machine-$(CONFIG_ARCH_NPCM)+= npcm >> machine-$(CONFIG_ARCH_NSPIRE) += nspire >> machine-$(CONFIG_ARCH_OXNAS) += oxnas >> machine-$(CONFIG_ARCH_OMAP1) += omap1 >> diff --git a/arch/arm/mach-npcm/Kconfig b/arch/arm/mach-npcm/Kconfig >> new file mode 100644 >> index ..d47061855439 >> --- /dev/null >> +++ b/arch/arm/mach-npcm/Kconfig >> @@ -0,0 +1,50 @@ >> +menuconfig ARCH_NPCM >> + bool "Nuvoton NPCM Architecture" >> + select ARCH_REQUIRE_GPIOLIB >> + select USE_OF >> + select PINCTRL >> + select PINCTRL_NPCM7XX >> + >> +if ARCH_NPCM >> + >> +comment "NPCMX50 CPU type" >> + >> +config CPU_NPCM750 >> + depends on ARCH_NPCM && ARCH_MULTI_V7 >> + bool "Support for NPCM750 BMC CPU (Poleg)" >> + select CACHE_L2X0 >> + select CPU_V7 >> + select ARM_GIC >> + select HAVE_SMP >> + select SMP >> + select SMP_ON_UP >> + select HAVE_ARM_SCU >> + select HAVE_ARM_TWD if SMP >> + select ARM_ERRATA_458693 >> + select ARM_ERRATA_720789 >> + select ARM_ERRATA_742231 >> + select ARM_ERRATA_754322 >> + select ARM_ERRATA_764369 >> + select ARM_ERRATA_794072 >> + select PL310_ERRATA_588369 >> + select PL310_ERRATA_727915 >> + select USB_EHCI_ROOT_HUB_TT >> + select USB_ARCH_HAS_HCD >> + select USB_ARCH_HAS_EHCI >> + select USB_EHCI_HCD >> + select USB_ARCH_HAS_OHCI >> + select USB_OHCI_HCD >> + select USB >> + select FIQ >> + select CPU_USE_DOMAINS >> + select GENERIC_CLOCKEVENTS >> + select CLKDEV_LOOKUP >> + select COMMON_CLK if OF >> + select NPCM750_TIMER >> + select MFD_SYSCON >> + help >> + Support for NPCM750 BMC CPU (Poleg). >> + >> + Nuvoton NPCM750 BMC based on the Cortex A9. >> + >> +endif >> diff --git a/arch/arm/mach-npcm/Makefile b/arch/arm/mach-npcm/Makefile >> new file mode 100644 >> index ..78416055b854 >> --- /dev/null >> +++ b/arch/arm/mach-npcm/Makefile >> @@ -0,0 +1,3 @@ >> +AFLAGS_headsmp.o += -march=armv7-a >> + >> +obj-$(CONFIG_CPU_NPCM750) += npcm7xx.o platsmp.o headsmp.o >> diff --git a/arch/arm/mach-npcm/headsmp.S b/arch/arm/mach-npcm/headsmp.S >> new file mode 100644 >> index ..9fccbbd49ed4 >> --- /dev/null >> +++ b/arch/arm/mach-npcm/headsmp.S >> @@ -0,0 +1,17 @@ >> +/* >> + * Copyright 2017 Google, Inc. >> + * >> + * This program is free software; you can redistribute it and/or modify >> + * it under the terms of the GNU General Public License version 2 as >> + * published by the Free Software Foundation. >> + */ >> + >> +#include >> +#include >> +#include >> + >> +ENTRY(npcm7xx_secondary_startup) >> + safe_svcmode_maskall r0 > > I saw you answered to Florian Fainelli that the BootRom is not > starting the secondary CPU in SVC mode. Are you sure? In our > engineering npcm7xx revision, Z1, the BootRom indeed started to run it > in IRQ mode but we fixed it in the production version, A1, (quite long > time ago), I hope you didn't get an EB with Z1 you can check that in > BootBlock console print: >> ADC CLK is set to
Re: [PATCH 1/2] pidmap(2)
Hi Alexey, On Thu, Sep 7, 2017 at 4:04 AM, Andy Lutomirskiwrote: > On Wed, Sep 6, 2017 at 2:04 AM, Alexey Dobriyan wrote: >> On 9/6/17, Randy Dunlap wrote: >>> On 09/05/17 15:53, Andrew Morton wrote: [...] >>> >>> also, I expect that the tiny kernel people will want kconfig options for >>> these syscalls. >> >> We'll add it but the question if it is a good idea. Ideally these system >> calls >> should be mandatory and /proc optional. >> >> $ size kernel/pidmap.o fs/fdmap.o >>textdata bss dec hex filename >> 560 0 0 560 230 kernel/pidmap.o >> 617 0 0 617 269 fs/fdmap.o > > After much discussion at LPC/KS last year, I thought the idea was to > try to speed up /proc rather than replacing it outright. The two > specific ideas I recall were: > > 1. Add a syscall like readfileat() that you can use to, in a single > operation, open, read, and close a /proc file (or other file). This > should vastly reduce locking and RCU overhead. > > 2. Add a /proc file that has a nice binary format for task info. (nl_attr?) > > I don't see why pidmap() deserves to be significantly faster than getdents(). > > Also, a pidmap() syscall like this inherently bypasses any security > restrictions implied by the way that /proc is mounted. It can respect > hidepid, but hidepid (as a per-namespace concept) is an enormous turd > that badly needs to be deprecated, and Djalal is working on exactly > that. Yes as noted by Andy, me and Alexey Gladkov are working on modernizing procfs [1] and to reduce/remove ties within pid namespaces which has lot of problems now. We just picked the task again, and this was the result of discussion with Andy some months ago, on how to improve hidepid, but also how to improve procfs in general, so we can add other mechanisms to hide or return NULL on other /proc/_file_not_needed_by_containers_ or /proc/_specific_module_files_ everything that is not virtualized , or mount only some specific view of the whole /proc API this will also be used by containers. This also should make it hard for attackers since we are planning to have a backward compatible options on how to better treat some of these files in regard of some namespaces. The syscall or readfileat() for one operation is a nice addition definitively. But in general it would be better to treat /proc as a filesystem and not add other specific interfaces that may abstract it with pidns, as it is the situation now which make it from userspace perspective: hard to use especially for security context. Alexey, could you please Cc'us on future, thank you very much! [1] https://lkml.org/lkml/2017/4/25/282 -- tixxdz
Re: [PATCH 1/2] pidmap(2)
Hi Alexey, On Thu, Sep 7, 2017 at 4:04 AM, Andy Lutomirski wrote: > On Wed, Sep 6, 2017 at 2:04 AM, Alexey Dobriyan wrote: >> On 9/6/17, Randy Dunlap wrote: >>> On 09/05/17 15:53, Andrew Morton wrote: [...] >>> >>> also, I expect that the tiny kernel people will want kconfig options for >>> these syscalls. >> >> We'll add it but the question if it is a good idea. Ideally these system >> calls >> should be mandatory and /proc optional. >> >> $ size kernel/pidmap.o fs/fdmap.o >>textdata bss dec hex filename >> 560 0 0 560 230 kernel/pidmap.o >> 617 0 0 617 269 fs/fdmap.o > > After much discussion at LPC/KS last year, I thought the idea was to > try to speed up /proc rather than replacing it outright. The two > specific ideas I recall were: > > 1. Add a syscall like readfileat() that you can use to, in a single > operation, open, read, and close a /proc file (or other file). This > should vastly reduce locking and RCU overhead. > > 2. Add a /proc file that has a nice binary format for task info. (nl_attr?) > > I don't see why pidmap() deserves to be significantly faster than getdents(). > > Also, a pidmap() syscall like this inherently bypasses any security > restrictions implied by the way that /proc is mounted. It can respect > hidepid, but hidepid (as a per-namespace concept) is an enormous turd > that badly needs to be deprecated, and Djalal is working on exactly > that. Yes as noted by Andy, me and Alexey Gladkov are working on modernizing procfs [1] and to reduce/remove ties within pid namespaces which has lot of problems now. We just picked the task again, and this was the result of discussion with Andy some months ago, on how to improve hidepid, but also how to improve procfs in general, so we can add other mechanisms to hide or return NULL on other /proc/_file_not_needed_by_containers_ or /proc/_specific_module_files_ everything that is not virtualized , or mount only some specific view of the whole /proc API this will also be used by containers. This also should make it hard for attackers since we are planning to have a backward compatible options on how to better treat some of these files in regard of some namespaces. The syscall or readfileat() for one operation is a nice addition definitively. But in general it would be better to treat /proc as a filesystem and not add other specific interfaces that may abstract it with pidns, as it is the situation now which make it from userspace perspective: hard to use especially for security context. Alexey, could you please Cc'us on future, thank you very much! [1] https://lkml.org/lkml/2017/4/25/282 -- tixxdz
Re: [GIT] Networking
On Wed, 2017-09-06 at 21:57 -0700, Linus Torvalds wrote: > On Wed, Sep 6, 2017 at 9:11 PM, Coelho, Luciano >wrote: > > > > This seems to be a problem with backwards-compatibility with FW version > > 27. We are now in version 31[1] and upgrading will probably fix that. > > I can confirm that fw version 31 works. Great, so I know for sure that this is a backwards-compatibility issue with the FW API. > > But obviously the driver should not fail miserably like this with > > version 27, because it claims to support it still. > > Not just "claims to support it", but if it's what is shipped with a > fairly recent distro like an up-to-date version of F26, I would really > hope that the driver can still work with it. I totally agree, we support a bunch of older versions for that exact reason. We just don't really test all the supported versions very often. We should probably change that. I'll make sure it still works with version 27. -- Cheers, Luca.
Re: [GIT] Networking
On Wed, 2017-09-06 at 21:57 -0700, Linus Torvalds wrote: > On Wed, Sep 6, 2017 at 9:11 PM, Coelho, Luciano > wrote: > > > > This seems to be a problem with backwards-compatibility with FW version > > 27. We are now in version 31[1] and upgrading will probably fix that. > > I can confirm that fw version 31 works. Great, so I know for sure that this is a backwards-compatibility issue with the FW API. > > But obviously the driver should not fail miserably like this with > > version 27, because it claims to support it still. > > Not just "claims to support it", but if it's what is shipped with a > fairly recent distro like an up-to-date version of F26, I would really > hope that the driver can still work with it. I totally agree, we support a bunch of older versions for that exact reason. We just don't really test all the supported versions very often. We should probably change that. I'll make sure it still works with version 27. -- Cheers, Luca.
Re: [GIT] Networking
On Wed, Sep 6, 2017 at 9:11 PM, Coelho, Lucianowrote: > > This seems to be a problem with backwards-compatibility with FW version > 27. We are now in version 31[1] and upgrading will probably fix that. I can confirm that fw version 31 works. > But obviously the driver should not fail miserably like this with > version 27, because it claims to support it still. Not just "claims to support it", but if it's what is shipped with a fairly recent distro like an up-to-date version of F26, I would really hope that the driver can still work with it. > I'm looking into this now and will provide a fix asap. Thanks, Linus
Re: [GIT] Networking
On Wed, Sep 6, 2017 at 9:11 PM, Coelho, Luciano wrote: > > This seems to be a problem with backwards-compatibility with FW version > 27. We are now in version 31[1] and upgrading will probably fix that. I can confirm that fw version 31 works. > But obviously the driver should not fail miserably like this with > version 27, because it claims to support it still. Not just "claims to support it", but if it's what is shipped with a fairly recent distro like an up-to-date version of F26, I would really hope that the driver can still work with it. > I'm looking into this now and will provide a fix asap. Thanks, Linus
[PATCH] drivers/staging/rtl8188eu: cleanup crc32_init logic
crc32_init is using unnecessary else condition. Cleaningup this function Signed-off-by: Pushkar Jambhlekar--- drivers/staging/rtl8188eu/core/rtw_security.c | 42 +-- 1 file changed, 21 insertions(+), 21 deletions(-) diff --git a/drivers/staging/rtl8188eu/core/rtw_security.c b/drivers/staging/rtl8188eu/core/rtw_security.c index b283a490..9d17204 100644 --- a/drivers/staging/rtl8188eu/core/rtw_security.c +++ b/drivers/staging/rtl8188eu/core/rtw_security.c @@ -94,29 +94,29 @@ static u8 crc32_reverseBit(u8 data) static void crc32_init(void) { - if (bcrc32initialized == 1) { + if (bcrc32initialized == 1) return; - } else { - int i, j; - u32 c; - u8 *p = (u8 *), *p1; - u8 k; - - c = 0x1234; - - for (i = 0; i < 256; ++i) { - k = crc32_reverseBit((u8)i); - for (c = ((u32)k) << 24, j = 8; j > 0; --j) - c = c & 0x8000 ? (c << 1) ^ CRC32_POLY : (c << 1); - p1 = (u8 *)_table[i]; - - p1[0] = crc32_reverseBit(p[3]); - p1[1] = crc32_reverseBit(p[2]); - p1[2] = crc32_reverseBit(p[1]); - p1[3] = crc32_reverseBit(p[0]); - } - bcrc32initialized = 1; + + int i, j; + u32 c; + u8 *p = (u8 *), *p1; + u8 k; + + c = 0x1234; + + for (i = 0; i < 256; ++i) { + k = crc32_reverseBit((u8)i); + for (c = ((u32)k) << 24, j = 8; j > 0; --j) + c = c & 0x8000 ? (c << 1) ^ CRC32_POLY : (c << 1); + p1 = (u8 *)_table[i]; + + p1[0] = crc32_reverseBit(p[3]); + p1[1] = crc32_reverseBit(p[2]); + p1[2] = crc32_reverseBit(p[1]); + p1[3] = crc32_reverseBit(p[0]); } + + bcrc32initialized = 1; } static __le32 getcrc32(u8 *buf, int len) -- 2.7.4
[PATCH] drivers/staging/rtl8188eu: cleanup crc32_init logic
crc32_init is using unnecessary else condition. Cleaningup this function Signed-off-by: Pushkar Jambhlekar --- drivers/staging/rtl8188eu/core/rtw_security.c | 42 +-- 1 file changed, 21 insertions(+), 21 deletions(-) diff --git a/drivers/staging/rtl8188eu/core/rtw_security.c b/drivers/staging/rtl8188eu/core/rtw_security.c index b283a490..9d17204 100644 --- a/drivers/staging/rtl8188eu/core/rtw_security.c +++ b/drivers/staging/rtl8188eu/core/rtw_security.c @@ -94,29 +94,29 @@ static u8 crc32_reverseBit(u8 data) static void crc32_init(void) { - if (bcrc32initialized == 1) { + if (bcrc32initialized == 1) return; - } else { - int i, j; - u32 c; - u8 *p = (u8 *), *p1; - u8 k; - - c = 0x1234; - - for (i = 0; i < 256; ++i) { - k = crc32_reverseBit((u8)i); - for (c = ((u32)k) << 24, j = 8; j > 0; --j) - c = c & 0x8000 ? (c << 1) ^ CRC32_POLY : (c << 1); - p1 = (u8 *)_table[i]; - - p1[0] = crc32_reverseBit(p[3]); - p1[1] = crc32_reverseBit(p[2]); - p1[2] = crc32_reverseBit(p[1]); - p1[3] = crc32_reverseBit(p[0]); - } - bcrc32initialized = 1; + + int i, j; + u32 c; + u8 *p = (u8 *), *p1; + u8 k; + + c = 0x1234; + + for (i = 0; i < 256; ++i) { + k = crc32_reverseBit((u8)i); + for (c = ((u32)k) << 24, j = 8; j > 0; --j) + c = c & 0x8000 ? (c << 1) ^ CRC32_POLY : (c << 1); + p1 = (u8 *)_table[i]; + + p1[0] = crc32_reverseBit(p[3]); + p1[1] = crc32_reverseBit(p[2]); + p1[2] = crc32_reverseBit(p[1]); + p1[3] = crc32_reverseBit(p[0]); } + + bcrc32initialized = 1; } static __le32 getcrc32(u8 *buf, int len) -- 2.7.4
Re: [PATCH 06/14] md/bcache: Use %pS printk format for direct addresses
On 2017/9/7 上午4:27, Helge Deller wrote: > Use the %pS instead of the %pF printk format specifier for printing symbols > from direct addresses. This is needed for the ia64, ppc64 and parisc64 > architectures. > > Signed-off-by: Helge Deller> Cc: linux-bca...@vger.kernel.org > Cc: linux-r...@vger.kernel.org > --- > drivers/md/bcache/closure.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/drivers/md/bcache/closure.c b/drivers/md/bcache/closure.c > index 864e673..0b0c9bc 100644 > --- a/drivers/md/bcache/closure.c > +++ b/drivers/md/bcache/closure.c > @@ -175,7 +175,7 @@ static int debug_seq_show(struct seq_file *f, void *data) > list_for_each_entry(cl, _list, all) { > int r = atomic_read(>remaining); > > - seq_printf(f, "%p: %pF -> %pf p %p r %i ", > + seq_printf(f, "%p: %pS -> %pf p %p r %i ", > cl, (void *) cl->ip, cl->fn, cl->parent, > r & CLOSURE_REMAINING_MASK); > > @@ -187,7 +187,7 @@ static int debug_seq_show(struct seq_file *f, void *data) > r & CLOSURE_SLEEPING ? "Sl" : ""); > > if (r & CLOSURE_WAITING) > - seq_printf(f, " W %pF\n", > + seq_printf(f, " W %pS\n", > (void *) cl->waiting_on); > > seq_printf(f, "\n"); > It is unclear to me, that if %pF is used, on ia64/ppc64/parisc64 a function descriptor conversion happens, what negative effect exactly takes place ? Or you just want to unify the out put format, and get rid of extra conversion information ? Thanks. -- Coly Li
Re: [PATCH 06/14] md/bcache: Use %pS printk format for direct addresses
On 2017/9/7 上午4:27, Helge Deller wrote: > Use the %pS instead of the %pF printk format specifier for printing symbols > from direct addresses. This is needed for the ia64, ppc64 and parisc64 > architectures. > > Signed-off-by: Helge Deller > Cc: linux-bca...@vger.kernel.org > Cc: linux-r...@vger.kernel.org > --- > drivers/md/bcache/closure.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/drivers/md/bcache/closure.c b/drivers/md/bcache/closure.c > index 864e673..0b0c9bc 100644 > --- a/drivers/md/bcache/closure.c > +++ b/drivers/md/bcache/closure.c > @@ -175,7 +175,7 @@ static int debug_seq_show(struct seq_file *f, void *data) > list_for_each_entry(cl, _list, all) { > int r = atomic_read(>remaining); > > - seq_printf(f, "%p: %pF -> %pf p %p r %i ", > + seq_printf(f, "%p: %pS -> %pf p %p r %i ", > cl, (void *) cl->ip, cl->fn, cl->parent, > r & CLOSURE_REMAINING_MASK); > > @@ -187,7 +187,7 @@ static int debug_seq_show(struct seq_file *f, void *data) > r & CLOSURE_SLEEPING ? "Sl" : ""); > > if (r & CLOSURE_WAITING) > - seq_printf(f, " W %pF\n", > + seq_printf(f, " W %pS\n", > (void *) cl->waiting_on); > > seq_printf(f, "\n"); > It is unclear to me, that if %pF is used, on ia64/ppc64/parisc64 a function descriptor conversion happens, what negative effect exactly takes place ? Or you just want to unify the out put format, and get rid of extra conversion information ? Thanks. -- Coly Li
Re: [PATCH 3/3] f2fs: support flexible inline xattr size
On 09/06, Chao Yu wrote: > Hi Jaegeuk, > > Do we have time to test and stabilize this new feature before merge window? Let's try this in the next merge window. It's too tight to test now. :) > > Thanks, > > On 2017/9/4 18:58, Chao Yu wrote: > > Now, in product, more and more features based on file encryption were > > introduced, their demand of xattr space is increasing, however, inline > > xattr has fixed-size of 200 bytes, once inline xattr space is full, new > > increased xattr data would occupy additional xattr block which may bring > > us more space usage and performance regression during persisting. > > > > In order to resolve above issue, it's better to expand inline xattr size > > flexibly according to user's requirement. > > > > So this patch introduces new filesystem feature 'flexible inline xattr', > > and new mount option 'inline_xattr_size=%u', once mkfs enables the > > feature, we can use the option to make f2fs supporting flexible inline > > xattr size. > > > > To support this feature, we add extra attribute i_inline_xattr_size in > > inode layout, indicating that how many space inline xattr borrows from > > block address mapping space in inode layout, by this, we can easily > > locate and store flexible-sized inline xattr data in inode. > > > > Inode disk layout: > > +--+ > > | .i_mode | > > | ... | > > | .i_ext | > > +--+ > > | .i_extra_isize | > > | .i_inline_xattr_size |---+ > > | ... | | > > +--+ | > > | .i_addr | | > > | - block address or | | > > | - inline data | | > > +--+<---+ v > > |inline xattr |+---inline xattr range > > +--+<---+ > > | .i_nid | > > +--+ > > | node_footer| > > | (nid, ino, offset) | > > +--+ > > > > Signed-off-by: Chao Yu> > --- > > fs/f2fs/f2fs.h | 43 ++- > > fs/f2fs/inode.c | 12 > > fs/f2fs/namei.c | 6 ++ > > fs/f2fs/node.c | 4 ++-- > > fs/f2fs/super.c | 32 +++- > > fs/f2fs/sysfs.c | 7 +++ > > fs/f2fs/xattr.c | 18 +- > > include/linux/f2fs_fs.h | 5 +++-- > > 8 files changed, 100 insertions(+), 27 deletions(-) > > > > diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h > > index 1f24ad4ca1bb..168ad51b7fb9 100644 > > --- a/fs/f2fs/f2fs.h > > +++ b/fs/f2fs/f2fs.h > > @@ -93,6 +93,7 @@ extern char *fault_name[FAULT_MAX]; > > #define F2FS_MOUNT_GRPQUOTA0x0010 > > #define F2FS_MOUNT_PRJQUOTA0x0020 > > #define F2FS_MOUNT_QUOTA 0x0040 > > +#define F2FS_MOUNT_INLINE_XATTR_SIZE 0x0080 > > > > #define clear_opt(sbi, option) ((sbi)->mount_opt.opt &= > > ~F2FS_MOUNT_##option) > > #define set_opt(sbi, option) ((sbi)->mount_opt.opt |= > > F2FS_MOUNT_##option) > > @@ -118,6 +119,7 @@ struct f2fs_mount_info { > > #define F2FS_FEATURE_EXTRA_ATTR0x0008 > > #define F2FS_FEATURE_PRJQUOTA 0x0010 > > #define F2FS_FEATURE_INODE_CHKSUM 0x0020 > > +#define F2FS_FEATURE_FLEXIBLE_INLINE_XATTR 0x0040 > > > > #define F2FS_HAS_FEATURE(sb, mask) \ > > ((F2FS_SB(sb)->raw_super->feature & cpu_to_le32(mask)) != 0) > > @@ -379,11 +381,14 @@ struct f2fs_flush_device { > > > > /* for inline stuff */ > > #define DEF_INLINE_RESERVED_SIZE 1 > > +#define DEF_MIN_INLINE_SIZE1 > > static inline int get_extra_isize(struct inode *inode); > > -#define MAX_INLINE_DATA(inode) (sizeof(__le32) * \ > > - (CUR_ADDRS_PER_INODE(inode) - \ > > - DEF_INLINE_RESERVED_SIZE - \ > > - F2FS_INLINE_XATTR_ADDRS)) > > +static inline int get_inline_xattr_addrs(struct inode *inode); > > +#define F2FS_INLINE_XATTR_ADDRS(inode) get_inline_xattr_addrs(inode) > > +#define MAX_INLINE_DATA(inode) (sizeof(__le32) * > > \ > > + (CUR_ADDRS_PER_INODE(inode) - \ > > + F2FS_INLINE_XATTR_ADDRS(inode) -\ > > + DEF_INLINE_RESERVED_SIZE)) > > > > /* for inline dir */ > > #define NR_INLINE_DENTRY(inode)(MAX_INLINE_DATA(inode) * BITS_PER_BYTE > > / \ > > @@ -592,6 +597,7 @@ struct f2fs_inode_info { > > > > int i_extra_isize; /* size of extra space located in > > i_addr */ > > kprojid_t i_projid; /* id for project quota */ > > + int i_inline_xattr_size;/* inline xattr size */ > > }; > > > > static inline void get_extent_info(struct extent_info
Re: [PATCH 3/3] f2fs: support flexible inline xattr size
On 09/06, Chao Yu wrote: > Hi Jaegeuk, > > Do we have time to test and stabilize this new feature before merge window? Let's try this in the next merge window. It's too tight to test now. :) > > Thanks, > > On 2017/9/4 18:58, Chao Yu wrote: > > Now, in product, more and more features based on file encryption were > > introduced, their demand of xattr space is increasing, however, inline > > xattr has fixed-size of 200 bytes, once inline xattr space is full, new > > increased xattr data would occupy additional xattr block which may bring > > us more space usage and performance regression during persisting. > > > > In order to resolve above issue, it's better to expand inline xattr size > > flexibly according to user's requirement. > > > > So this patch introduces new filesystem feature 'flexible inline xattr', > > and new mount option 'inline_xattr_size=%u', once mkfs enables the > > feature, we can use the option to make f2fs supporting flexible inline > > xattr size. > > > > To support this feature, we add extra attribute i_inline_xattr_size in > > inode layout, indicating that how many space inline xattr borrows from > > block address mapping space in inode layout, by this, we can easily > > locate and store flexible-sized inline xattr data in inode. > > > > Inode disk layout: > > +--+ > > | .i_mode | > > | ... | > > | .i_ext | > > +--+ > > | .i_extra_isize | > > | .i_inline_xattr_size |---+ > > | ... | | > > +--+ | > > | .i_addr | | > > | - block address or | | > > | - inline data | | > > +--+<---+ v > > |inline xattr |+---inline xattr range > > +--+<---+ > > | .i_nid | > > +--+ > > | node_footer| > > | (nid, ino, offset) | > > +--+ > > > > Signed-off-by: Chao Yu > > --- > > fs/f2fs/f2fs.h | 43 ++- > > fs/f2fs/inode.c | 12 > > fs/f2fs/namei.c | 6 ++ > > fs/f2fs/node.c | 4 ++-- > > fs/f2fs/super.c | 32 +++- > > fs/f2fs/sysfs.c | 7 +++ > > fs/f2fs/xattr.c | 18 +- > > include/linux/f2fs_fs.h | 5 +++-- > > 8 files changed, 100 insertions(+), 27 deletions(-) > > > > diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h > > index 1f24ad4ca1bb..168ad51b7fb9 100644 > > --- a/fs/f2fs/f2fs.h > > +++ b/fs/f2fs/f2fs.h > > @@ -93,6 +93,7 @@ extern char *fault_name[FAULT_MAX]; > > #define F2FS_MOUNT_GRPQUOTA0x0010 > > #define F2FS_MOUNT_PRJQUOTA0x0020 > > #define F2FS_MOUNT_QUOTA 0x0040 > > +#define F2FS_MOUNT_INLINE_XATTR_SIZE 0x0080 > > > > #define clear_opt(sbi, option) ((sbi)->mount_opt.opt &= > > ~F2FS_MOUNT_##option) > > #define set_opt(sbi, option) ((sbi)->mount_opt.opt |= > > F2FS_MOUNT_##option) > > @@ -118,6 +119,7 @@ struct f2fs_mount_info { > > #define F2FS_FEATURE_EXTRA_ATTR0x0008 > > #define F2FS_FEATURE_PRJQUOTA 0x0010 > > #define F2FS_FEATURE_INODE_CHKSUM 0x0020 > > +#define F2FS_FEATURE_FLEXIBLE_INLINE_XATTR 0x0040 > > > > #define F2FS_HAS_FEATURE(sb, mask) \ > > ((F2FS_SB(sb)->raw_super->feature & cpu_to_le32(mask)) != 0) > > @@ -379,11 +381,14 @@ struct f2fs_flush_device { > > > > /* for inline stuff */ > > #define DEF_INLINE_RESERVED_SIZE 1 > > +#define DEF_MIN_INLINE_SIZE1 > > static inline int get_extra_isize(struct inode *inode); > > -#define MAX_INLINE_DATA(inode) (sizeof(__le32) * \ > > - (CUR_ADDRS_PER_INODE(inode) - \ > > - DEF_INLINE_RESERVED_SIZE - \ > > - F2FS_INLINE_XATTR_ADDRS)) > > +static inline int get_inline_xattr_addrs(struct inode *inode); > > +#define F2FS_INLINE_XATTR_ADDRS(inode) get_inline_xattr_addrs(inode) > > +#define MAX_INLINE_DATA(inode) (sizeof(__le32) * > > \ > > + (CUR_ADDRS_PER_INODE(inode) - \ > > + F2FS_INLINE_XATTR_ADDRS(inode) -\ > > + DEF_INLINE_RESERVED_SIZE)) > > > > /* for inline dir */ > > #define NR_INLINE_DENTRY(inode)(MAX_INLINE_DATA(inode) * BITS_PER_BYTE > > / \ > > @@ -592,6 +597,7 @@ struct f2fs_inode_info { > > > > int i_extra_isize; /* size of extra space located in > > i_addr */ > > kprojid_t i_projid; /* id for project quota */ > > + int i_inline_xattr_size;/* inline xattr size */ > > }; > > > > static inline void get_extent_info(struct extent_info *ext, > > @@ -1043,6
[PATCH] f2fs: make get_lock_data_page to handle encrypted inode
This patch refactors get_lock_data_page() to handle encryption case directly. In order to do that, it introduces common f2fs_submit_page_read(). Signed-off-by: Jaegeuk Kim--- fs/f2fs/data.c | 109 +++-- 1 file changed, 51 insertions(+), 58 deletions(-) diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c index ee6801fdbdec..95f30fb6 100644 --- a/fs/f2fs/data.c +++ b/fs/f2fs/data.c @@ -456,6 +456,53 @@ int f2fs_submit_page_write(struct f2fs_io_info *fio) return err; } +static struct bio *f2fs_grab_read_bio(struct inode *inode, block_t blkaddr, +unsigned nr_pages) +{ + struct f2fs_sb_info *sbi = F2FS_I_SB(inode); + struct fscrypt_ctx *ctx = NULL; + struct bio *bio; + + if (f2fs_encrypted_file(inode)) { + ctx = fscrypt_get_ctx(inode, GFP_NOFS); + if (IS_ERR(ctx)) + return ERR_CAST(ctx); + + /* wait the page to be moved by cleaning */ + f2fs_wait_on_block_writeback(sbi, blkaddr); + } + + bio = bio_alloc(GFP_KERNEL, min_t(int, nr_pages, BIO_MAX_PAGES)); + if (!bio) { + if (ctx) + fscrypt_release_ctx(ctx); + return ERR_PTR(-ENOMEM); + } + f2fs_target_device(sbi, blkaddr, bio); + bio->bi_end_io = f2fs_read_end_io; + bio->bi_private = ctx; + bio_set_op_attrs(bio, REQ_OP_READ, 0); + + return bio; +} + +/* This can handle encryption stuffs */ +static int f2fs_submit_page_read(struct inode *inode, struct page *page, + block_t blkaddr) +{ + struct bio *bio = f2fs_grab_read_bio(inode, blkaddr, 1); + + if (IS_ERR(bio)) + return PTR_ERR(bio); + + if (bio_add_page(bio, page, PAGE_SIZE, 0) < PAGE_SIZE) { + bio_put(bio); + return -EFAULT; + } + __submit_bio(F2FS_I_SB(inode), bio, DATA); + return 0; +} + static void __set_data_blkaddr(struct dnode_of_data *dn) { struct f2fs_node *rn = F2FS_NODE(dn->node_page); @@ -573,16 +620,6 @@ struct page *get_read_data_page(struct inode *inode, pgoff_t index, struct page *page; struct extent_info ei = {0,0,0}; int err; - struct f2fs_io_info fio = { - .sbi = F2FS_I_SB(inode), - .type = DATA, - .op = REQ_OP_READ, - .op_flags = op_flags, - .encrypted_page = NULL, - }; - - if (f2fs_encrypted_file(inode)) - return read_mapping_page(mapping, index, NULL); page = f2fs_grab_cache_page(mapping, index, for_write); if (!page) @@ -623,9 +660,7 @@ struct page *get_read_data_page(struct inode *inode, pgoff_t index, return page; } - fio.new_blkaddr = fio.old_blkaddr = dn.data_blkaddr; - fio.page = page; - err = f2fs_submit_page_bio(); + err = f2fs_submit_page_read(inode, page, dn.data_blkaddr); if (err) goto put_err; return page; @@ -1150,35 +1185,6 @@ int f2fs_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo, return ret; } -static struct bio *f2fs_grab_bio(struct inode *inode, block_t blkaddr, -unsigned nr_pages) -{ - struct f2fs_sb_info *sbi = F2FS_I_SB(inode); - struct fscrypt_ctx *ctx = NULL; - struct bio *bio; - - if (f2fs_encrypted_file(inode)) { - ctx = fscrypt_get_ctx(inode, GFP_NOFS); - if (IS_ERR(ctx)) - return ERR_CAST(ctx); - - /* wait the page to be moved by cleaning */ - f2fs_wait_on_block_writeback(sbi, blkaddr); - } - - bio = bio_alloc(GFP_KERNEL, min_t(int, nr_pages, BIO_MAX_PAGES)); - if (!bio) { - if (ctx) - fscrypt_release_ctx(ctx); - return ERR_PTR(-ENOMEM); - } - f2fs_target_device(sbi, blkaddr, bio); - bio->bi_end_io = f2fs_read_end_io; - bio->bi_private = ctx; - - return bio; -} - /* * This function was originally taken from fs/mpage.c, and customized for f2fs. * Major change was from block_size == page_size in f2fs by default. @@ -1275,12 +1281,11 @@ static int f2fs_mpage_readpages(struct address_space *mapping, bio = NULL; } if (bio == NULL) { - bio = f2fs_grab_bio(inode, block_nr, nr_pages); + bio = f2fs_grab_read_bio(inode, block_nr, nr_pages); if (IS_ERR(bio)) { bio = NULL; goto set_error_page; } - bio_set_op_attrs(bio, REQ_OP_READ, 0); } if
[PATCH] f2fs: make get_lock_data_page to handle encrypted inode
This patch refactors get_lock_data_page() to handle encryption case directly. In order to do that, it introduces common f2fs_submit_page_read(). Signed-off-by: Jaegeuk Kim --- fs/f2fs/data.c | 109 +++-- 1 file changed, 51 insertions(+), 58 deletions(-) diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c index ee6801fdbdec..95f30fb6 100644 --- a/fs/f2fs/data.c +++ b/fs/f2fs/data.c @@ -456,6 +456,53 @@ int f2fs_submit_page_write(struct f2fs_io_info *fio) return err; } +static struct bio *f2fs_grab_read_bio(struct inode *inode, block_t blkaddr, +unsigned nr_pages) +{ + struct f2fs_sb_info *sbi = F2FS_I_SB(inode); + struct fscrypt_ctx *ctx = NULL; + struct bio *bio; + + if (f2fs_encrypted_file(inode)) { + ctx = fscrypt_get_ctx(inode, GFP_NOFS); + if (IS_ERR(ctx)) + return ERR_CAST(ctx); + + /* wait the page to be moved by cleaning */ + f2fs_wait_on_block_writeback(sbi, blkaddr); + } + + bio = bio_alloc(GFP_KERNEL, min_t(int, nr_pages, BIO_MAX_PAGES)); + if (!bio) { + if (ctx) + fscrypt_release_ctx(ctx); + return ERR_PTR(-ENOMEM); + } + f2fs_target_device(sbi, blkaddr, bio); + bio->bi_end_io = f2fs_read_end_io; + bio->bi_private = ctx; + bio_set_op_attrs(bio, REQ_OP_READ, 0); + + return bio; +} + +/* This can handle encryption stuffs */ +static int f2fs_submit_page_read(struct inode *inode, struct page *page, + block_t blkaddr) +{ + struct bio *bio = f2fs_grab_read_bio(inode, blkaddr, 1); + + if (IS_ERR(bio)) + return PTR_ERR(bio); + + if (bio_add_page(bio, page, PAGE_SIZE, 0) < PAGE_SIZE) { + bio_put(bio); + return -EFAULT; + } + __submit_bio(F2FS_I_SB(inode), bio, DATA); + return 0; +} + static void __set_data_blkaddr(struct dnode_of_data *dn) { struct f2fs_node *rn = F2FS_NODE(dn->node_page); @@ -573,16 +620,6 @@ struct page *get_read_data_page(struct inode *inode, pgoff_t index, struct page *page; struct extent_info ei = {0,0,0}; int err; - struct f2fs_io_info fio = { - .sbi = F2FS_I_SB(inode), - .type = DATA, - .op = REQ_OP_READ, - .op_flags = op_flags, - .encrypted_page = NULL, - }; - - if (f2fs_encrypted_file(inode)) - return read_mapping_page(mapping, index, NULL); page = f2fs_grab_cache_page(mapping, index, for_write); if (!page) @@ -623,9 +660,7 @@ struct page *get_read_data_page(struct inode *inode, pgoff_t index, return page; } - fio.new_blkaddr = fio.old_blkaddr = dn.data_blkaddr; - fio.page = page; - err = f2fs_submit_page_bio(); + err = f2fs_submit_page_read(inode, page, dn.data_blkaddr); if (err) goto put_err; return page; @@ -1150,35 +1185,6 @@ int f2fs_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo, return ret; } -static struct bio *f2fs_grab_bio(struct inode *inode, block_t blkaddr, -unsigned nr_pages) -{ - struct f2fs_sb_info *sbi = F2FS_I_SB(inode); - struct fscrypt_ctx *ctx = NULL; - struct bio *bio; - - if (f2fs_encrypted_file(inode)) { - ctx = fscrypt_get_ctx(inode, GFP_NOFS); - if (IS_ERR(ctx)) - return ERR_CAST(ctx); - - /* wait the page to be moved by cleaning */ - f2fs_wait_on_block_writeback(sbi, blkaddr); - } - - bio = bio_alloc(GFP_KERNEL, min_t(int, nr_pages, BIO_MAX_PAGES)); - if (!bio) { - if (ctx) - fscrypt_release_ctx(ctx); - return ERR_PTR(-ENOMEM); - } - f2fs_target_device(sbi, blkaddr, bio); - bio->bi_end_io = f2fs_read_end_io; - bio->bi_private = ctx; - - return bio; -} - /* * This function was originally taken from fs/mpage.c, and customized for f2fs. * Major change was from block_size == page_size in f2fs by default. @@ -1275,12 +1281,11 @@ static int f2fs_mpage_readpages(struct address_space *mapping, bio = NULL; } if (bio == NULL) { - bio = f2fs_grab_bio(inode, block_nr, nr_pages); + bio = f2fs_grab_read_bio(inode, block_nr, nr_pages); if (IS_ERR(bio)) { bio = NULL; goto set_error_page; } - bio_set_op_attrs(bio, REQ_OP_READ, 0); } if (bio_add_page(bio, page,
Re: iov_iter_pipe warning.
On Wed, Sep 06, 2017 at 11:48:35PM -0400, Dave Jones wrote: > > That doesn't seem like an XFS problem - it indicates the pipe we are > > filling in generic_file_splice_read() is not being emptied by > > whatever we are splicing the file data to > > The puzzling part is this runs for a day on ext4 or btrfs, whereas I can > make xfs fall over pretty quickly. As Darrick pointed out though, this > could be due to xfs being the only user of iomap_dio_rw. > > I'm juggling a few other things right now, so probably not going to > have much time to dig further on this until after plumbers + 1 wk. I'll look into that tomorrow...
Re: iov_iter_pipe warning.
On Wed, Sep 06, 2017 at 11:48:35PM -0400, Dave Jones wrote: > > That doesn't seem like an XFS problem - it indicates the pipe we are > > filling in generic_file_splice_read() is not being emptied by > > whatever we are splicing the file data to > > The puzzling part is this runs for a day on ext4 or btrfs, whereas I can > make xfs fall over pretty quickly. As Darrick pointed out though, this > could be due to xfs being the only user of iomap_dio_rw. > > I'm juggling a few other things right now, so probably not going to > have much time to dig further on this until after plumbers + 1 wk. I'll look into that tomorrow...
Re: [PATCH v8 01/13] x86/apic: Construct a selector for the interrupt delivery mode
Hi Baoquan I am wordy one ah: our target is checking if BIOS supports APIC, no matter what type(separated/integrated) it is. if not, go to PIC mode. Let‘s discuss the original logic and the smp_found_config, then take about your code. The existing logic is: if (!boot_cpu_has(X86_FEATURE_APIC) && !smp_found_config) ...(1) return -1; if (!boot_cpu_has(X86_FEATURE_APIC) && APIC_INTEGRATED(boot_cpu_apic_version)) { ...(2) pr_err(); why smp_found_config has to be checked in (1)? Because, In case of discrete (pretty old) apics we may not set X86_FEATURE_APIC bit in cpuid, with 82489DX we can't rely on apic feature bit retrieved via cpuid(boot_cpu_has(X86_FEATURE_APIC)).[1] So we assume that if SMP configuration is found from MP table (smp_found_config = 1) in above case, there maybe a separated chip in our pc. After passing the check of (1), we in (2), check whether local APIC is detected or not, If we have a BIOS bug. [1] Commit 8312136fa8b0("x86, apic: Fix missed handling of discrete apics") At 09/06/2017 06:17 PM, Baoquan He wrote: Hi Dou, On 08/28/17 at 11:20am, Dou Liyang wrote: +static int __init apic_intr_mode_select(void) +{ + /* Check kernel option */ + if (disable_apic) { + pr_info("APIC disabled via kernel command line\n"); + return APIC_PIC; + } + I am not very familiar with cpu registers, not sure if we can adjust below code flow as: /* If APIC is integrated, check local APIC only */ if (lapic_is_integrated() && !boot_cpu_has(X86_FEATURE_APIC)) { disable_apic = 1; pr_info("APIC disabled by BIOS\n"); return APIC_PIC; } /* If APIC is on a separate chip, check if smp_found_config is found*/ if (!lapic_is_integrated() && !smp_found_config) { disable_apic = 1; return APIC_PIC; } Yes, Awesome, we first consider it from APIC register space, then the BOIS and software configration. let me do more investigation. I rewrite it based on you, any comments will welcome. /* If APIC is not integrated, check if SMP configuration is * found from MP table. If not too, no 82489DX. switch to * PIC mode * * Else APIC is integrated, check if the BIOS allows local APIC * */ if (!lapic_is_integrated()) { if (!smp_found_config) { disable_apic = 1; return APIC_PIC; } } else if(!boot_cpu_has(X86_FEATURE_APIC)) { disable_apic = 1; pr_info("APIC disabled by BIOS\n"); return APIC_PIC; } } BTW, As the macro APIC_INTEGRATED(x) has already wrapped by CONFIG_X86_32, I will cleanup the lapic_is_integrated() for readablity like that: diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c index 7834f73..63b3ae9 100644 --- a/arch/x86/kernel/apic/apic.c +++ b/arch/x86/kernel/apic/apic.c @@ -211,11 +211,7 @@ static inline int lapic_get_version(void) */ static inline int lapic_is_integrated(void) { -#ifdef CONFIG_X86_64 - return 1; -#else return APIC_INTEGRATED(lapic_get_version()); -#endif } Do you think so. ;-) Thanks, dou. Now, I haven't think of why smp_found_config has to be checked here. In this way, we don't need the CONFIG_X86_64 checking since it's contained in lapic_is_integrated() already. And the checking is obvious for understanding. Just not very sure if the checking is adequate. Just my personal opinion. + /* Check BIOS */ +#ifdef CONFIG_X86_64 + /* On 64-bit, the APIC must be integrated, Check local APIC only */ + if (!boot_cpu_has(X86_FEATURE_APIC)) { + disable_apic = 1; + pr_info("APIC disabled by BIOS\n"); + return APIC_PIC; + } +#else + /* +* On 32-bit, check whether there is a separate chip or integrated +* APIC +*/ + + /* Has a separate chip ? */ + if (!boot_cpu_has(X86_FEATURE_APIC) && !smp_found_config) { + disable_apic = 1; + + return APIC_PIC; + } + + /* Has a local APIC ? */ + if (!boot_cpu_has(X86_FEATURE_APIC) && + APIC_INTEGRATED(boot_cpu_apic_version)) { + disable_apic = 1; + pr_err(FW_BUG "Local APIC %d not detected, force emulation\n", + boot_cpu_physical_apicid); + + return APIC_PIC; + } +#endif + + /* Check MP table or ACPI MADT configuration */ + if (!smp_found_config) { + disable_ioapic_support(); + + if (!acpi_lapic) + pr_info("APIC: ACPI MADT or MP tables are not detected\n"); + +
Re: [PATCH v8 01/13] x86/apic: Construct a selector for the interrupt delivery mode
Hi Baoquan I am wordy one ah: our target is checking if BIOS supports APIC, no matter what type(separated/integrated) it is. if not, go to PIC mode. Let‘s discuss the original logic and the smp_found_config, then take about your code. The existing logic is: if (!boot_cpu_has(X86_FEATURE_APIC) && !smp_found_config) ...(1) return -1; if (!boot_cpu_has(X86_FEATURE_APIC) && APIC_INTEGRATED(boot_cpu_apic_version)) { ...(2) pr_err(); why smp_found_config has to be checked in (1)? Because, In case of discrete (pretty old) apics we may not set X86_FEATURE_APIC bit in cpuid, with 82489DX we can't rely on apic feature bit retrieved via cpuid(boot_cpu_has(X86_FEATURE_APIC)).[1] So we assume that if SMP configuration is found from MP table (smp_found_config = 1) in above case, there maybe a separated chip in our pc. After passing the check of (1), we in (2), check whether local APIC is detected or not, If we have a BIOS bug. [1] Commit 8312136fa8b0("x86, apic: Fix missed handling of discrete apics") At 09/06/2017 06:17 PM, Baoquan He wrote: Hi Dou, On 08/28/17 at 11:20am, Dou Liyang wrote: +static int __init apic_intr_mode_select(void) +{ + /* Check kernel option */ + if (disable_apic) { + pr_info("APIC disabled via kernel command line\n"); + return APIC_PIC; + } + I am not very familiar with cpu registers, not sure if we can adjust below code flow as: /* If APIC is integrated, check local APIC only */ if (lapic_is_integrated() && !boot_cpu_has(X86_FEATURE_APIC)) { disable_apic = 1; pr_info("APIC disabled by BIOS\n"); return APIC_PIC; } /* If APIC is on a separate chip, check if smp_found_config is found*/ if (!lapic_is_integrated() && !smp_found_config) { disable_apic = 1; return APIC_PIC; } Yes, Awesome, we first consider it from APIC register space, then the BOIS and software configration. let me do more investigation. I rewrite it based on you, any comments will welcome. /* If APIC is not integrated, check if SMP configuration is * found from MP table. If not too, no 82489DX. switch to * PIC mode * * Else APIC is integrated, check if the BIOS allows local APIC * */ if (!lapic_is_integrated()) { if (!smp_found_config) { disable_apic = 1; return APIC_PIC; } } else if(!boot_cpu_has(X86_FEATURE_APIC)) { disable_apic = 1; pr_info("APIC disabled by BIOS\n"); return APIC_PIC; } } BTW, As the macro APIC_INTEGRATED(x) has already wrapped by CONFIG_X86_32, I will cleanup the lapic_is_integrated() for readablity like that: diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c index 7834f73..63b3ae9 100644 --- a/arch/x86/kernel/apic/apic.c +++ b/arch/x86/kernel/apic/apic.c @@ -211,11 +211,7 @@ static inline int lapic_get_version(void) */ static inline int lapic_is_integrated(void) { -#ifdef CONFIG_X86_64 - return 1; -#else return APIC_INTEGRATED(lapic_get_version()); -#endif } Do you think so. ;-) Thanks, dou. Now, I haven't think of why smp_found_config has to be checked here. In this way, we don't need the CONFIG_X86_64 checking since it's contained in lapic_is_integrated() already. And the checking is obvious for understanding. Just not very sure if the checking is adequate. Just my personal opinion. + /* Check BIOS */ +#ifdef CONFIG_X86_64 + /* On 64-bit, the APIC must be integrated, Check local APIC only */ + if (!boot_cpu_has(X86_FEATURE_APIC)) { + disable_apic = 1; + pr_info("APIC disabled by BIOS\n"); + return APIC_PIC; + } +#else + /* +* On 32-bit, check whether there is a separate chip or integrated +* APIC +*/ + + /* Has a separate chip ? */ + if (!boot_cpu_has(X86_FEATURE_APIC) && !smp_found_config) { + disable_apic = 1; + + return APIC_PIC; + } + + /* Has a local APIC ? */ + if (!boot_cpu_has(X86_FEATURE_APIC) && + APIC_INTEGRATED(boot_cpu_apic_version)) { + disable_apic = 1; + pr_err(FW_BUG "Local APIC %d not detected, force emulation\n", + boot_cpu_physical_apicid); + + return APIC_PIC; + } +#endif + + /* Check MP table or ACPI MADT configuration */ + if (!smp_found_config) { + disable_ioapic_support(); + + if (!acpi_lapic) + pr_info("APIC: ACPI MADT or MP tables are not detected\n"); + +
Re: [PATCH v2 1/3] arm64/ras: support sea error recovery
Hi Borislav, On 2017/9/6 18:12, Borislav Petkov wrote: > On Tue, Sep 05, 2017 at 07:06:04PM +0800, Xie XiuQi wrote: >> With ARM v8.2 RAS Extension, SEA are usually triggered when memory errors >> are consumed. In some cases, if the error address is in a clean page or a >> read-only page, there is a chance to recover. Such as error occurs in a >> instruction page, we can reread this page from disk instead of killing >> process. >> >> Because memory_failure() may sleep, we can not call it directly in SEA >> exception >> context. So we saved faulting physical address associated with a process in >> the >> ghes handler and set __TIF_SEA_NOTIFY. When we return from SEA exception >> context >> and get into do_notify_resume() before the process running, we could check it >> and call memory_failure() to do recovery. It's safe, because we are in >> process >> context. >> >> Signed-off-by: Xie XiuQi>> Signed-off-by: Wang Xiongfeng >> --- >> arch/arm64/Kconfig | 11 +++ >> arch/arm64/include/asm/ras.h | 36 + >> arch/arm64/include/asm/thread_info.h | 4 +- >> arch/arm64/kernel/Makefile | 1 + >> arch/arm64/kernel/ras.c | 143 >> +++ >> arch/arm64/kernel/signal.c | 8 ++ >> arch/arm64/mm/fault.c| 27 +-- >> drivers/acpi/apei/ghes.c | 4 +- >> 8 files changed, 223 insertions(+), 11 deletions(-) >> create mode 100644 arch/arm64/include/asm/ras.h >> create mode 100644 arch/arm64/kernel/ras.c > > Please integrate scripts/checkpatch.pl into your patch creation workflow > and run all patches through it before submitting: Sorry for my mistake. I'll fix it, thanks. > > ERROR: code indent should use tabs where possible > #200: FILE: arch/arm64/kernel/ras.c:42: > +atomic_tinuse;$ > > WARNING: please, no spaces at the start of a line > #200: FILE: arch/arm64/kernel/ras.c:42: > +atomic_tinuse;$ > > ERROR: code indent should use tabs where possible > #201: FILE: arch/arm64/kernel/ras.c:43: > +struct task_struct *t;$ > > WARNING: please, no spaces at the start of a line > #201: FILE: arch/arm64/kernel/ras.c:43: > +struct task_struct *t;$ > > ERROR: code indent should use tabs where possible > #202: FILE: arch/arm64/kernel/ras.c:44: > +__u64 paddr;$ > > WARNING: please, no spaces at the start of a line > #202: FILE: arch/arm64/kernel/ras.c:44: > +__u64 paddr;$ > > ERROR: code indent should use tabs where possible > #207: FILE: arch/arm64/kernel/ras.c:49: > +struct sea_info *si;$ > > WARNING: please, no spaces at the start of a line > #207: FILE: arch/arm64/kernel/ras.c:49: > +struct sea_info *si;$ > > ERROR: code indent should use tabs where possible > #209: FILE: arch/arm64/kernel/ras.c:51: > +for (si = sea_info; si < _info[SEA_INFO_MAX]; si++) {$ > > WARNING: please, no spaces at the start of a line > #209: FILE: arch/arm64/kernel/ras.c:51: > +for (si = sea_info; si < _info[SEA_INFO_MAX]; si++) {$ > > ERROR: code indent should use tabs where possible > #210: FILE: arch/arm64/kernel/ras.c:52: > +if (atomic_cmpxchg(>inuse, 0, 1) == 0) {$ > > WARNING: please, no spaces at the start of a line > #210: FILE: arch/arm64/kernel/ras.c:52: > +if (atomic_cmpxchg(>inuse, 0, 1) == 0) {$ > > ERROR: code indent should use tabs where possible > #211: FILE: arch/arm64/kernel/ras.c:53: > +si->t = current;$ > > WARNING: please, no spaces at the start of a line > #211: FILE: arch/arm64/kernel/ras.c:53: > +si->t = current;$ > > ERROR: code indent should use tabs where possible > #212: FILE: arch/arm64/kernel/ras.c:54: > +si->paddr = addr;$ > > WARNING: please, no spaces at the start of a line > #212: FILE: arch/arm64/kernel/ras.c:54: > +si->paddr = addr;$ > > ERROR: code indent should use tabs where possible > #213: FILE: arch/arm64/kernel/ras.c:55: > +return true;$ > > WARNING: please, no spaces at the start of a line > #213: FILE: arch/arm64/kernel/ras.c:55: > +return true;$ > > ERROR: code indent should use tabs where possible > #214: FILE: arch/arm64/kernel/ras.c:56: > +}$ > > WARNING: please, no spaces at the start of a line > #214: FILE: arch/arm64/kernel/ras.c:56: > +}$ > > ERROR: code indent should use tabs where possible > #215: FILE: arch/arm64/kernel/ras.c:57: > +}$ > > WARNING: please, no spaces at the start of a line > #215: FILE: arch/arm64/kernel/ras.c:57: > +}$ > > ERROR: code indent should use tabs where possible > #223: FILE: arch/arm64/kernel/ras.c:65: > +struct sea_info *si;$ > > WARNING: please, no spaces at the start of
Re: [PATCH v2 1/3] arm64/ras: support sea error recovery
Hi Borislav, On 2017/9/6 18:12, Borislav Petkov wrote: > On Tue, Sep 05, 2017 at 07:06:04PM +0800, Xie XiuQi wrote: >> With ARM v8.2 RAS Extension, SEA are usually triggered when memory errors >> are consumed. In some cases, if the error address is in a clean page or a >> read-only page, there is a chance to recover. Such as error occurs in a >> instruction page, we can reread this page from disk instead of killing >> process. >> >> Because memory_failure() may sleep, we can not call it directly in SEA >> exception >> context. So we saved faulting physical address associated with a process in >> the >> ghes handler and set __TIF_SEA_NOTIFY. When we return from SEA exception >> context >> and get into do_notify_resume() before the process running, we could check it >> and call memory_failure() to do recovery. It's safe, because we are in >> process >> context. >> >> Signed-off-by: Xie XiuQi >> Signed-off-by: Wang Xiongfeng >> --- >> arch/arm64/Kconfig | 11 +++ >> arch/arm64/include/asm/ras.h | 36 + >> arch/arm64/include/asm/thread_info.h | 4 +- >> arch/arm64/kernel/Makefile | 1 + >> arch/arm64/kernel/ras.c | 143 >> +++ >> arch/arm64/kernel/signal.c | 8 ++ >> arch/arm64/mm/fault.c| 27 +-- >> drivers/acpi/apei/ghes.c | 4 +- >> 8 files changed, 223 insertions(+), 11 deletions(-) >> create mode 100644 arch/arm64/include/asm/ras.h >> create mode 100644 arch/arm64/kernel/ras.c > > Please integrate scripts/checkpatch.pl into your patch creation workflow > and run all patches through it before submitting: Sorry for my mistake. I'll fix it, thanks. > > ERROR: code indent should use tabs where possible > #200: FILE: arch/arm64/kernel/ras.c:42: > +atomic_tinuse;$ > > WARNING: please, no spaces at the start of a line > #200: FILE: arch/arm64/kernel/ras.c:42: > +atomic_tinuse;$ > > ERROR: code indent should use tabs where possible > #201: FILE: arch/arm64/kernel/ras.c:43: > +struct task_struct *t;$ > > WARNING: please, no spaces at the start of a line > #201: FILE: arch/arm64/kernel/ras.c:43: > +struct task_struct *t;$ > > ERROR: code indent should use tabs where possible > #202: FILE: arch/arm64/kernel/ras.c:44: > +__u64 paddr;$ > > WARNING: please, no spaces at the start of a line > #202: FILE: arch/arm64/kernel/ras.c:44: > +__u64 paddr;$ > > ERROR: code indent should use tabs where possible > #207: FILE: arch/arm64/kernel/ras.c:49: > +struct sea_info *si;$ > > WARNING: please, no spaces at the start of a line > #207: FILE: arch/arm64/kernel/ras.c:49: > +struct sea_info *si;$ > > ERROR: code indent should use tabs where possible > #209: FILE: arch/arm64/kernel/ras.c:51: > +for (si = sea_info; si < _info[SEA_INFO_MAX]; si++) {$ > > WARNING: please, no spaces at the start of a line > #209: FILE: arch/arm64/kernel/ras.c:51: > +for (si = sea_info; si < _info[SEA_INFO_MAX]; si++) {$ > > ERROR: code indent should use tabs where possible > #210: FILE: arch/arm64/kernel/ras.c:52: > +if (atomic_cmpxchg(>inuse, 0, 1) == 0) {$ > > WARNING: please, no spaces at the start of a line > #210: FILE: arch/arm64/kernel/ras.c:52: > +if (atomic_cmpxchg(>inuse, 0, 1) == 0) {$ > > ERROR: code indent should use tabs where possible > #211: FILE: arch/arm64/kernel/ras.c:53: > +si->t = current;$ > > WARNING: please, no spaces at the start of a line > #211: FILE: arch/arm64/kernel/ras.c:53: > +si->t = current;$ > > ERROR: code indent should use tabs where possible > #212: FILE: arch/arm64/kernel/ras.c:54: > +si->paddr = addr;$ > > WARNING: please, no spaces at the start of a line > #212: FILE: arch/arm64/kernel/ras.c:54: > +si->paddr = addr;$ > > ERROR: code indent should use tabs where possible > #213: FILE: arch/arm64/kernel/ras.c:55: > +return true;$ > > WARNING: please, no spaces at the start of a line > #213: FILE: arch/arm64/kernel/ras.c:55: > +return true;$ > > ERROR: code indent should use tabs where possible > #214: FILE: arch/arm64/kernel/ras.c:56: > +}$ > > WARNING: please, no spaces at the start of a line > #214: FILE: arch/arm64/kernel/ras.c:56: > +}$ > > ERROR: code indent should use tabs where possible > #215: FILE: arch/arm64/kernel/ras.c:57: > +}$ > > WARNING: please, no spaces at the start of a line > #215: FILE: arch/arm64/kernel/ras.c:57: > +}$ > > ERROR: code indent should use tabs where possible > #223: FILE: arch/arm64/kernel/ras.c:65: > +struct sea_info *si;$ > > WARNING: please, no spaces at the start of a line > #223: FILE: arch/arm64/kernel/ras.c:65:
Re: [PATCH v2] sched: check user input value of sysctl_sched_time_avg
On 2017/9/7 3:50, Luis R. Rodriguez wrote: On Mon, Sep 04, 2017 at 03:54:23PM +0800, Ethan Zhao wrote: Peter, On 2017/9/4 15:49, Peter Zijlstra wrote: On Sat, Sep 02, 2017 at 02:57:32PM +0800, Ethan Zhao wrote: diff --git a/kernel/sysctl.c b/kernel/sysctl.c index 6648fbb..609bed2 100644 --- a/kernel/sysctl.c +++ b/kernel/sysctl.c @@ -367,7 +367,7 @@ static int sysrq_sysctl_handler(struct ctl_table *table, int write, .data = _sched_time_avg, .maxlen = sizeof(unsigned int), .mode = 0644, - .proc_handler = proc_dointvec, + .proc_handler = sched_time_avg_handler, *sigh*, what's wrong with the below? Too easy? :), seems I walked zigzag several cycles to get the right point. It sounds like negative values are not possible too, right? If so then you can use proc_douintvec_minmax(). V4 has been sent days ago, please see v4. Thank, Ethan Luis
Re: [PATCH v2] sched: check user input value of sysctl_sched_time_avg
On 2017/9/7 3:50, Luis R. Rodriguez wrote: On Mon, Sep 04, 2017 at 03:54:23PM +0800, Ethan Zhao wrote: Peter, On 2017/9/4 15:49, Peter Zijlstra wrote: On Sat, Sep 02, 2017 at 02:57:32PM +0800, Ethan Zhao wrote: diff --git a/kernel/sysctl.c b/kernel/sysctl.c index 6648fbb..609bed2 100644 --- a/kernel/sysctl.c +++ b/kernel/sysctl.c @@ -367,7 +367,7 @@ static int sysrq_sysctl_handler(struct ctl_table *table, int write, .data = _sched_time_avg, .maxlen = sizeof(unsigned int), .mode = 0644, - .proc_handler = proc_dointvec, + .proc_handler = sched_time_avg_handler, *sigh*, what's wrong with the below? Too easy? :), seems I walked zigzag several cycles to get the right point. It sounds like negative values are not possible too, right? If so then you can use proc_douintvec_minmax(). V4 has been sent days ago, please see v4. Thank, Ethan Luis
Re: Donation
Greetings To You, My Name is Mavis wanczyk , the winner of the Power ball jackpot of $ $758.7 million in the AUGUST 24, 2017, My jackpot was a gift from God to me hence my Entire family/foundation has AGREED to do this. My foundation is donating $500,000.00USD to you. please contac maviswanczyk...@gmail.com for full details and please accept this token as a gift from me and my family. Read more: http://money.cnn.com/2017/08/23/news/powerball-700-million-jackpot/index.html Best Regards, Mavis Wanczyk
Re: Donation
Greetings To You, My Name is Mavis wanczyk , the winner of the Power ball jackpot of $ $758.7 million in the AUGUST 24, 2017, My jackpot was a gift from God to me hence my Entire family/foundation has AGREED to do this. My foundation is donating $500,000.00USD to you. please contac maviswanczyk...@gmail.com for full details and please accept this token as a gift from me and my family. Read more: http://money.cnn.com/2017/08/23/news/powerball-700-million-jackpot/index.html Best Regards, Mavis Wanczyk
Re: [PATCH 0/2] Fix resume failure due to PCID
> On Sep 6, 2017, at 8:25 PM, Linus Torvalds> wrote: > >> On Wed, Sep 6, 2017 at 7:54 PM, Andy Lutomirski wrote: >> Patch 1 is the fix. Patch 2 is a comment that would have kept me from >> chasing down a false lead. > > Yes, this seems to fix things for me. Thanks. > > Of course, right now that laptop has no working wifi with tip-of-tree > due to some issues with the networking tree, but that's an independent > thing and I could suspend and resume with this. So applied and pushed > out, > >Linus Great! FWIW, there's still a possible glitch where doing EFI calls could corrupt ASID 0. I figured that fixing that could wait for tomorrow.
Re: [PATCH 0/2] Fix resume failure due to PCID
> On Sep 6, 2017, at 8:25 PM, Linus Torvalds > wrote: > >> On Wed, Sep 6, 2017 at 7:54 PM, Andy Lutomirski wrote: >> Patch 1 is the fix. Patch 2 is a comment that would have kept me from >> chasing down a false lead. > > Yes, this seems to fix things for me. Thanks. > > Of course, right now that laptop has no working wifi with tip-of-tree > due to some issues with the networking tree, but that's an independent > thing and I could suspend and resume with this. So applied and pushed > out, > >Linus Great! FWIW, there's still a possible glitch where doing EFI calls could corrupt ASID 0. I figured that fixing that could wait for tomorrow.
Re: [PATCH] dt-binding: phy: don't confuse with Ethernet phy properties
From: Baruch SiachDate: Sun, 3 Sep 2017 17:32:16 +0300 > The generic PHY 'phys' property sometime appears in the same node with > the Ethernet PHY 'phy' or 'phy-handle' properties. Add a warning in > phy-bindings.txt to reduce confusion. > > Signed-off-by: Baruch Siach Applied.
Re: [PATCH] dt-binding: phy: don't confuse with Ethernet phy properties
From: Baruch Siach Date: Sun, 3 Sep 2017 17:32:16 +0300 > The generic PHY 'phys' property sometime appears in the same node with > the Ethernet PHY 'phy' or 'phy-handle' properties. Add a warning in > phy-bindings.txt to reduce confusion. > > Signed-off-by: Baruch Siach Applied.
Re: [GIT] Networking
On Wed, 2017-09-06 at 16:27 -0700, Linus Torvalds wrote: > This pull request completely breaks Intel wireless for me. > > This is my trusty old XPS 13 (9350), using Intel Wireless 8260 (rev 3a). > > That remains a very standard Intel machine with absolutely zero odd > things going on. > > The firmware is iwlwifi-8000C-28.ucode from > iwl7260-firmware-25.30.13.0-75.fc26.noarch, and the kernel reports > > iwlwifi :3a:00.0: loaded firmware version 27.455470.0 op_mode iwlmvm > > the thing starts acting badly with this: > > iwlwifi :3a:00.0: FW Error notification: type 0x cmd_id 0x04 > iwlwifi :3a:00.0: FW Error notification: seq 0x service 0x0004 > iwlwifi :3a:00.0: FW Error notification: timestamp 0x5D84 > iwlwifi :3a:00.0: Microcode SW error detected. Restarting 0x200. > iwlwifi :3a:00.0: Start IWL Error Log Dump: > iwlwifi :3a:00.0: Status: 0x0100, count: 6 > iwlwifi :3a:00.0: Loaded firmware version: 27.455470.0 > iwlwifi :3a:00.0: 0x0038 | BAD_COMMAND > iwlwifi :3a:00.0: 0x00A002F0 | trm_hw_status0 > ... > iwlwifi :3a:00.0: 0x | isr status reg > ieee80211 phy0: Hardware restart was requested > iwlwifi :3a:00.0: FW error in SYNC CMD MAC_CONTEXT_CMD This seems to be a problem with backwards-compatibility with FW version 27. We are now in version 31[1] and upgrading will probably fix that. But obviously the driver should not fail miserably like this with version 27, because it claims to support it still. I'm looking into this now and will provide a fix asap. [1] https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/tree/iwlwifi-8000C-31.ucode -- Cheers, Luca.
Re: [GIT] Networking
On Wed, 2017-09-06 at 16:27 -0700, Linus Torvalds wrote: > This pull request completely breaks Intel wireless for me. > > This is my trusty old XPS 13 (9350), using Intel Wireless 8260 (rev 3a). > > That remains a very standard Intel machine with absolutely zero odd > things going on. > > The firmware is iwlwifi-8000C-28.ucode from > iwl7260-firmware-25.30.13.0-75.fc26.noarch, and the kernel reports > > iwlwifi :3a:00.0: loaded firmware version 27.455470.0 op_mode iwlmvm > > the thing starts acting badly with this: > > iwlwifi :3a:00.0: FW Error notification: type 0x cmd_id 0x04 > iwlwifi :3a:00.0: FW Error notification: seq 0x service 0x0004 > iwlwifi :3a:00.0: FW Error notification: timestamp 0x5D84 > iwlwifi :3a:00.0: Microcode SW error detected. Restarting 0x200. > iwlwifi :3a:00.0: Start IWL Error Log Dump: > iwlwifi :3a:00.0: Status: 0x0100, count: 6 > iwlwifi :3a:00.0: Loaded firmware version: 27.455470.0 > iwlwifi :3a:00.0: 0x0038 | BAD_COMMAND > iwlwifi :3a:00.0: 0x00A002F0 | trm_hw_status0 > ... > iwlwifi :3a:00.0: 0x | isr status reg > ieee80211 phy0: Hardware restart was requested > iwlwifi :3a:00.0: FW error in SYNC CMD MAC_CONTEXT_CMD This seems to be a problem with backwards-compatibility with FW version 27. We are now in version 31[1] and upgrading will probably fix that. But obviously the driver should not fail miserably like this with version 27, because it claims to support it still. I'm looking into this now and will provide a fix asap. [1] https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/tree/iwlwifi-8000C-31.ucode -- Cheers, Luca.
Re: iov_iter_pipe warning.
On Thu, Sep 07, 2017 at 09:46:17AM +1000, Dave Chinner wrote: > On Wed, Sep 06, 2017 at 04:03:37PM -0400, Dave Jones wrote: > > On Mon, Aug 28, 2017 at 09:25:42PM -0700, Darrick J. Wong wrote: > > > On Mon, Aug 28, 2017 at 04:31:30PM -0400, Dave Jones wrote: > > > > I'm still trying to narrow down an exact reproducer, but it seems > > having > > > > trinity do a combination of sendfile & writev, with pipes and regular > > > > files as fd's is the best repro. > > > > > > > > Is this a real problem, or am I chasing ghosts ? That it doesn't > > happen > > > > on ext4 or btrfs is making me wonder... > > > > > > I haven't heard of any problems w/ directio xfs lately, but OTOH > > > I think it's the only filesystem that uses iomap_dio_rw, which would > > > explain why ext4/btrfs don't have this problem. > > > > Another warning, from likely the same root cause. > > > > WARNING: CPU: 3 PID: 572 at lib/iov_iter.c:962 iov_iter_pipe+0xe2/0xf0 > > WARN_ON(pipe->nrbufs == pipe->buffers); > > * @nrbufs: the number of non-empty pipe buffers in this pipe > * @buffers: total number of buffers (should be a power of 2) > > So that's warning that the pipe buffer is already full before we > try to read from the filesystem? > > That doesn't seem like an XFS problem - it indicates the pipe we are > filling in generic_file_splice_read() is not being emptied by > whatever we are splicing the file data to The puzzling part is this runs for a day on ext4 or btrfs, whereas I can make xfs fall over pretty quickly. As Darrick pointed out though, this could be due to xfs being the only user of iomap_dio_rw. I'm juggling a few other things right now, so probably not going to have much time to dig further on this until after plumbers + 1 wk. Dave
Re: iov_iter_pipe warning.
On Thu, Sep 07, 2017 at 09:46:17AM +1000, Dave Chinner wrote: > On Wed, Sep 06, 2017 at 04:03:37PM -0400, Dave Jones wrote: > > On Mon, Aug 28, 2017 at 09:25:42PM -0700, Darrick J. Wong wrote: > > > On Mon, Aug 28, 2017 at 04:31:30PM -0400, Dave Jones wrote: > > > > I'm still trying to narrow down an exact reproducer, but it seems > > having > > > > trinity do a combination of sendfile & writev, with pipes and regular > > > > files as fd's is the best repro. > > > > > > > > Is this a real problem, or am I chasing ghosts ? That it doesn't > > happen > > > > on ext4 or btrfs is making me wonder... > > > > > > I haven't heard of any problems w/ directio xfs lately, but OTOH > > > I think it's the only filesystem that uses iomap_dio_rw, which would > > > explain why ext4/btrfs don't have this problem. > > > > Another warning, from likely the same root cause. > > > > WARNING: CPU: 3 PID: 572 at lib/iov_iter.c:962 iov_iter_pipe+0xe2/0xf0 > > WARN_ON(pipe->nrbufs == pipe->buffers); > > * @nrbufs: the number of non-empty pipe buffers in this pipe > * @buffers: total number of buffers (should be a power of 2) > > So that's warning that the pipe buffer is already full before we > try to read from the filesystem? > > That doesn't seem like an XFS problem - it indicates the pipe we are > filling in generic_file_splice_read() is not being emptied by > whatever we are splicing the file data to The puzzling part is this runs for a day on ext4 or btrfs, whereas I can make xfs fall over pretty quickly. As Darrick pointed out though, this could be due to xfs being the only user of iomap_dio_rw. I'm juggling a few other things right now, so probably not going to have much time to dig further on this until after plumbers + 1 wk. Dave
Re: [PATCH 1/2] dt-bindings: spi: Add Spreadtrum ADI controller documentation
Hi Mark, On 6 September 2017 at 22:59, Mark Brownwrote: > On Wed, Sep 06, 2017 at 02:10:43PM +0800, Baolin Wang wrote: > >> +- hwlocks: Reference to a phandle of a hwlock provider node. >> +- hwlock-names: Reference to hwlock name strings defined in the same order >> + as the hwlocks. > > What are these hwlocks protecting, and what names are expected? I made one explanation in above sentence, I assume it is not clear. Since we have multi-subsystems will use ADI to access analog chip, when one system is reading/writing data by ADI, which should be under one hardware spinlock protection to prevent other systems from reading/writing data by ADI at the same time, or two parallel routine of setting ADI registers will get incorrect results. The hwspinlock name should be "adi", and I will make it clear in next version. > >> +Optional properties: >> +- sprd,hw-channels: Specify the hardware channel number and mapped address >> + for hardware channel accessing. > > What do these mean and how are the numbers and how will the binding be > interpreted? I also gave one explanation in above sentence, is it not clear? I try again. ADI controller has 50 channels including 2 software read/write channels and 48 hardware channels to access analog chip. For 2 software read/write channels, which means we should set ADI registers to access analog chip. But For hardware channels, we can just mapped one analog chip address to one hardware channel, then user can access analog chip by hardware channel without setting ADI registers. For this "sprd,hw-channels" property, the first value specifies the channel id, and the second value specifies the address which is mapped into analog chip space. -- Baolin.wang Best Regards
Re: [PATCH 1/2] dt-bindings: spi: Add Spreadtrum ADI controller documentation
Hi Mark, On 6 September 2017 at 22:59, Mark Brown wrote: > On Wed, Sep 06, 2017 at 02:10:43PM +0800, Baolin Wang wrote: > >> +- hwlocks: Reference to a phandle of a hwlock provider node. >> +- hwlock-names: Reference to hwlock name strings defined in the same order >> + as the hwlocks. > > What are these hwlocks protecting, and what names are expected? I made one explanation in above sentence, I assume it is not clear. Since we have multi-subsystems will use ADI to access analog chip, when one system is reading/writing data by ADI, which should be under one hardware spinlock protection to prevent other systems from reading/writing data by ADI at the same time, or two parallel routine of setting ADI registers will get incorrect results. The hwspinlock name should be "adi", and I will make it clear in next version. > >> +Optional properties: >> +- sprd,hw-channels: Specify the hardware channel number and mapped address >> + for hardware channel accessing. > > What do these mean and how are the numbers and how will the binding be > interpreted? I also gave one explanation in above sentence, is it not clear? I try again. ADI controller has 50 channels including 2 software read/write channels and 48 hardware channels to access analog chip. For 2 software read/write channels, which means we should set ADI registers to access analog chip. But For hardware channels, we can just mapped one analog chip address to one hardware channel, then user can access analog chip by hardware channel without setting ADI registers. For this "sprd,hw-channels" property, the first value specifies the channel id, and the second value specifies the address which is mapped into analog chip space. -- Baolin.wang Best Regards
Re: [PATCH 0/2] Fix resume failure due to PCID
On Wed, Sep 6, 2017 at 7:54 PM, Andy Lutomirskiwrote: > Patch 1 is the fix. Patch 2 is a comment that would have kept me from > chasing down a false lead. Yes, this seems to fix things for me. Thanks. Of course, right now that laptop has no working wifi with tip-of-tree due to some issues with the networking tree, but that's an independent thing and I could suspend and resume with this. So applied and pushed out, Linus
Re: [PATCH 0/2] Fix resume failure due to PCID
On Wed, Sep 6, 2017 at 7:54 PM, Andy Lutomirski wrote: > Patch 1 is the fix. Patch 2 is a comment that would have kept me from > chasing down a false lead. Yes, this seems to fix things for me. Thanks. Of course, right now that laptop has no working wifi with tip-of-tree due to some issues with the networking tree, but that's an independent thing and I could suspend and resume with this. So applied and pushed out, Linus
[PATCH v5 3/3] perf config: Allow creating empty config set for config file autogeneration
When there isn't a config file (e.g. ~/.perfconfig) or it has nothing, the config set wasn't created. If the config set not exists, a config file can't be autogenerated. So allow creating a empty config set in the above case, then we can support the config file autogeneration. Before: $ rm -f ~/.perfconfig $ perf config --user report.children=false $ cat ~/.perfconfig cat: /root/.perfconfig: No such file or directory But I think it should work even if there isn't a config file. After: $ rm -f ~/.perfconfig $ perf config --user report.children=false $ cat ~/.perfconfig # this file is auto-generated. [report] children = false NOTE: As a result, if perf_config_set__init() is failed, it seems that the config set isn't freed. But it isn't a problem. Because the config set will be freed by perf_config_set__delete() at the end of cmd_config(). Cc: Jiri OlsaCc: Namhyung Kim Signed-off-by: Taeung Song --- tools/perf/util/config.c | 5 + 1 file changed, 1 insertion(+), 4 deletions(-) diff --git a/tools/perf/util/config.c b/tools/perf/util/config.c index bc75596..d2b6983 100644 --- a/tools/perf/util/config.c +++ b/tools/perf/util/config.c @@ -700,10 +700,7 @@ struct perf_config_set *perf_config_set__new(void) if (set) { INIT_LIST_HEAD(>sections); - if (perf_config_set__init(set) < 0) { - perf_config_set__delete(set); - set = NULL; - } + perf_config_set__init(set); } return set; -- 2.7.4
[PATCH v5 1/3] perf config: Check not only section->from_system_config but also item's
Currently only section->from_system_config is being checked multiple times. items->from_system_config should be also checked, so fix it. Cc: Jiri OlsaCc: Namhyung Kim Signed-off-by: Taeung Song --- tools/perf/builtin-config.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tools/perf/builtin-config.c b/tools/perf/builtin-config.c index 3ddcc6e..a1d82e3 100644 --- a/tools/perf/builtin-config.c +++ b/tools/perf/builtin-config.c @@ -59,7 +59,7 @@ static int set_config(struct perf_config_set *set, const char *file_name, fprintf(fp, "[%s]\n", section->name); perf_config_items__for_each_entry(>items, item) { - if (!use_system_config && section->from_system_config) + if (!use_system_config && item->from_system_config) continue; if (item->value) fprintf(fp, "\t%s = %s\n", -- 2.7.4
[PATCH v5 3/3] perf config: Allow creating empty config set for config file autogeneration
When there isn't a config file (e.g. ~/.perfconfig) or it has nothing, the config set wasn't created. If the config set not exists, a config file can't be autogenerated. So allow creating a empty config set in the above case, then we can support the config file autogeneration. Before: $ rm -f ~/.perfconfig $ perf config --user report.children=false $ cat ~/.perfconfig cat: /root/.perfconfig: No such file or directory But I think it should work even if there isn't a config file. After: $ rm -f ~/.perfconfig $ perf config --user report.children=false $ cat ~/.perfconfig # this file is auto-generated. [report] children = false NOTE: As a result, if perf_config_set__init() is failed, it seems that the config set isn't freed. But it isn't a problem. Because the config set will be freed by perf_config_set__delete() at the end of cmd_config(). Cc: Jiri Olsa Cc: Namhyung Kim Signed-off-by: Taeung Song --- tools/perf/util/config.c | 5 + 1 file changed, 1 insertion(+), 4 deletions(-) diff --git a/tools/perf/util/config.c b/tools/perf/util/config.c index bc75596..d2b6983 100644 --- a/tools/perf/util/config.c +++ b/tools/perf/util/config.c @@ -700,10 +700,7 @@ struct perf_config_set *perf_config_set__new(void) if (set) { INIT_LIST_HEAD(>sections); - if (perf_config_set__init(set) < 0) { - perf_config_set__delete(set); - set = NULL; - } + perf_config_set__init(set); } return set; -- 2.7.4
[PATCH v5 1/3] perf config: Check not only section->from_system_config but also item's
Currently only section->from_system_config is being checked multiple times. items->from_system_config should be also checked, so fix it. Cc: Jiri Olsa Cc: Namhyung Kim Signed-off-by: Taeung Song --- tools/perf/builtin-config.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tools/perf/builtin-config.c b/tools/perf/builtin-config.c index 3ddcc6e..a1d82e3 100644 --- a/tools/perf/builtin-config.c +++ b/tools/perf/builtin-config.c @@ -59,7 +59,7 @@ static int set_config(struct perf_config_set *set, const char *file_name, fprintf(fp, "[%s]\n", section->name); perf_config_items__for_each_entry(>items, item) { - if (!use_system_config && section->from_system_config) + if (!use_system_config && item->from_system_config) continue; if (item->value) fprintf(fp, "\t%s = %s\n", -- 2.7.4
[PATCH v5 0/3] perf config: Simple bugfixes & refactoring
Hi all, This is simple patchset for perf-config to fix small bugs and refactor code. I'd appreciate some feedback on this patchset. The code is also available at 'config/refactoring-v5' branch on git://github.com/taeung/linux-perf.git Thanks, Taeung v5: - rebase on current acme/perf/core - adjust commit log messages to be more proper v4: - rebase on current acme/perf/core - simplify commit log messages - remove needless two patches v3: - fix a bug of no checked 'ret' in the loop in cmd_config() (Arnaldo) - modify commit log messages to be more clear (Aranaldo) - return -1 if show_spec_config() cannot show the config - initialize 'ret' with -1 instead of 0 for more compact code in cmd_config() - Add a error message when perf_config_set__new() failed in cmd_config() v2: - there is no need to consider empty config file (Arnaldo) Taeung Song (3): perf config: Check not only section->from_system_config but also item's perf config: Once write a config file in the end, not a repeat perf config: Allow creating empty config set for config file autogeneration tools/perf/builtin-config.c | 24 +--- tools/perf/util/config.c| 5 + 2 files changed, 18 insertions(+), 11 deletions(-) -- 2.7.4
[PATCH v5 2/3] perf config: Once write a config file in the end, not a repeat
Currently set_config() can be repeatedly called for each input config on the below case: $ perf config kmem.default=slab report.children=false ... But it's a waste, so only once write a config file gathering all given config key=value pairs Cc: Jiri OlsaCc: Namhyung Kim Signed-off-by: Taeung Song --- tools/perf/builtin-config.c | 22 -- 1 file changed, 16 insertions(+), 6 deletions(-) diff --git a/tools/perf/builtin-config.c b/tools/perf/builtin-config.c index a1d82e3..b89417d 100644 --- a/tools/perf/builtin-config.c +++ b/tools/perf/builtin-config.c @@ -34,8 +34,7 @@ static struct option config_options[] = { OPT_END() }; -static int set_config(struct perf_config_set *set, const char *file_name, - const char *var, const char *value) +static int set_config(struct perf_config_set *set, const char *file_name) { struct perf_config_section *section = NULL; struct perf_config_item *item = NULL; @@ -49,7 +48,6 @@ static int set_config(struct perf_config_set *set, const char *file_name, if (!fp) return -1; - perf_config_set__collect(set, file_name, var, value); fprintf(fp, "%s\n", first_line); /* overwrite configvariables */ @@ -161,6 +159,7 @@ int cmd_config(int argc, const char **argv) struct perf_config_set *set; char *user_config = mkpath("%s/.perfconfig", getenv("HOME")); const char *config_filename; + bool changed = false; argc = parse_options(argc, argv, config_options, config_usage, PARSE_OPT_STOP_AT_NON_OPTION); @@ -231,15 +230,26 @@ int cmd_config(int argc, const char **argv) goto out_err; } } else { - if (set_config(set, config_filename, var, value) < 0) { - pr_err("Failed to set '%s=%s' on %s\n", - var, value, config_filename); + if (perf_config_set__collect(set, config_filename, +var, value) < 0) { + pr_err("Failed to add '%s=%s'\n", + var, value); free(arg); goto out_err; } + changed = true; } free(arg); } + + if (!changed) + break; + + if (set_config(set, config_filename) < 0) { + pr_err("Failed to set the configs on %s\n", + config_filename); + goto out_err; + } } ret = 0; -- 2.7.4
[PATCH v5 2/3] perf config: Once write a config file in the end, not a repeat
Currently set_config() can be repeatedly called for each input config on the below case: $ perf config kmem.default=slab report.children=false ... But it's a waste, so only once write a config file gathering all given config key=value pairs Cc: Jiri Olsa Cc: Namhyung Kim Signed-off-by: Taeung Song --- tools/perf/builtin-config.c | 22 -- 1 file changed, 16 insertions(+), 6 deletions(-) diff --git a/tools/perf/builtin-config.c b/tools/perf/builtin-config.c index a1d82e3..b89417d 100644 --- a/tools/perf/builtin-config.c +++ b/tools/perf/builtin-config.c @@ -34,8 +34,7 @@ static struct option config_options[] = { OPT_END() }; -static int set_config(struct perf_config_set *set, const char *file_name, - const char *var, const char *value) +static int set_config(struct perf_config_set *set, const char *file_name) { struct perf_config_section *section = NULL; struct perf_config_item *item = NULL; @@ -49,7 +48,6 @@ static int set_config(struct perf_config_set *set, const char *file_name, if (!fp) return -1; - perf_config_set__collect(set, file_name, var, value); fprintf(fp, "%s\n", first_line); /* overwrite configvariables */ @@ -161,6 +159,7 @@ int cmd_config(int argc, const char **argv) struct perf_config_set *set; char *user_config = mkpath("%s/.perfconfig", getenv("HOME")); const char *config_filename; + bool changed = false; argc = parse_options(argc, argv, config_options, config_usage, PARSE_OPT_STOP_AT_NON_OPTION); @@ -231,15 +230,26 @@ int cmd_config(int argc, const char **argv) goto out_err; } } else { - if (set_config(set, config_filename, var, value) < 0) { - pr_err("Failed to set '%s=%s' on %s\n", - var, value, config_filename); + if (perf_config_set__collect(set, config_filename, +var, value) < 0) { + pr_err("Failed to add '%s=%s'\n", + var, value); free(arg); goto out_err; } + changed = true; } free(arg); } + + if (!changed) + break; + + if (set_config(set, config_filename) < 0) { + pr_err("Failed to set the configs on %s\n", + config_filename); + goto out_err; + } } ret = 0; -- 2.7.4
[PATCH v5 0/3] perf config: Simple bugfixes & refactoring
Hi all, This is simple patchset for perf-config to fix small bugs and refactor code. I'd appreciate some feedback on this patchset. The code is also available at 'config/refactoring-v5' branch on git://github.com/taeung/linux-perf.git Thanks, Taeung v5: - rebase on current acme/perf/core - adjust commit log messages to be more proper v4: - rebase on current acme/perf/core - simplify commit log messages - remove needless two patches v3: - fix a bug of no checked 'ret' in the loop in cmd_config() (Arnaldo) - modify commit log messages to be more clear (Aranaldo) - return -1 if show_spec_config() cannot show the config - initialize 'ret' with -1 instead of 0 for more compact code in cmd_config() - Add a error message when perf_config_set__new() failed in cmd_config() v2: - there is no need to consider empty config file (Arnaldo) Taeung Song (3): perf config: Check not only section->from_system_config but also item's perf config: Once write a config file in the end, not a repeat perf config: Allow creating empty config set for config file autogeneration tools/perf/builtin-config.c | 24 +--- tools/perf/util/config.c| 5 + 2 files changed, 18 insertions(+), 11 deletions(-) -- 2.7.4
Re: [PATCH 2/2] spi: Add ADI driver for Spreadtrum platform
Hi Mark, On 6 September 2017 at 23:04, Mark Brownwrote: > On Wed, Sep 06, 2017 at 02:10:44PM +0800, Baolin Wang wrote: > > This looks like a nice, clean driver. A few fairly small issues though: > >> +config SPI_SPRD_ADI >> + tristate "Spreadtrum ADI controller" >> + depends on ARCH_SPRD >> + help >> + ADI driver based on SPI for Spreadtrum SoCs. >> + > > I can't see any hard dependencies on the architecture - can you add an > || COMPILE_TEST for more coverage? Yes. > >> + ret = devm_spi_register_controller(>dev, ctlr); >> + if (ret) { >> + dev_err(>dev, "failed to register SPI controller\n"); >> + goto free_hwlock; >> + } > >> + spi_controller_put(ctlr); > > You used devm_ to register the controller, no need to free it > explicitly. Indeed. > >> +static int __init sprd_adi_init(void) >> +{ >> + return platform_driver_register(_adi_driver); >> +} >> +subsys_initcall(sprd_adi_init); > > Why is this subsys_initcall() and not module_platform_driver()? Since ADI is one very fundamental driver for our SoC, many drivers such as regulator need depend on ADI, and regulator need to regulate core voltage as earlier as possible. Very appreciated for your comments. -- Baolin.wang Best Regards
Re: [PATCH 2/2] spi: Add ADI driver for Spreadtrum platform
Hi Mark, On 6 September 2017 at 23:04, Mark Brown wrote: > On Wed, Sep 06, 2017 at 02:10:44PM +0800, Baolin Wang wrote: > > This looks like a nice, clean driver. A few fairly small issues though: > >> +config SPI_SPRD_ADI >> + tristate "Spreadtrum ADI controller" >> + depends on ARCH_SPRD >> + help >> + ADI driver based on SPI for Spreadtrum SoCs. >> + > > I can't see any hard dependencies on the architecture - can you add an > || COMPILE_TEST for more coverage? Yes. > >> + ret = devm_spi_register_controller(>dev, ctlr); >> + if (ret) { >> + dev_err(>dev, "failed to register SPI controller\n"); >> + goto free_hwlock; >> + } > >> + spi_controller_put(ctlr); > > You used devm_ to register the controller, no need to free it > explicitly. Indeed. > >> +static int __init sprd_adi_init(void) >> +{ >> + return platform_driver_register(_adi_driver); >> +} >> +subsys_initcall(sprd_adi_init); > > Why is this subsys_initcall() and not module_platform_driver()? Since ADI is one very fundamental driver for our SoC, many drivers such as regulator need depend on ADI, and regulator need to regulate core voltage as earlier as possible. Very appreciated for your comments. -- Baolin.wang Best Regards
[PATCH] ipv4: Namespaceify tcp_max_orphans knob
Different namespace application might require different maximal number of TCP sockets independently of the host. Signed-off-by: Haishuang Yan--- include/net/netns/ipv4.h | 1 + include/net/tcp.h | 5 +++-- net/ipv4/sysctl_net_ipv4.c | 14 +++--- net/ipv4/tcp.c | 3 --- net/ipv4/tcp_input.c | 1 - net/ipv4/tcp_ipv4.c| 1 + 6 files changed, 12 insertions(+), 13 deletions(-) diff --git a/include/net/netns/ipv4.h b/include/net/netns/ipv4.h index 20d061c..305e031 100644 --- a/include/net/netns/ipv4.h +++ b/include/net/netns/ipv4.h @@ -127,6 +127,7 @@ struct netns_ipv4 { int sysctl_tcp_timestamps; struct inet_timewait_death_row tcp_death_row; int sysctl_max_syn_backlog; + int sysctl_tcp_max_orphans; #ifdef CONFIG_NET_L3_MASTER_DEV int sysctl_udp_l3mdev_accept; diff --git a/include/net/tcp.h b/include/net/tcp.h index b510f28..ac2d998 100644 --- a/include/net/tcp.h +++ b/include/net/tcp.h @@ -320,10 +320,11 @@ static inline bool tcp_too_many_orphans(struct sock *sk, int shift) { struct percpu_counter *ocp = sk->sk_prot->orphan_count; int orphans = percpu_counter_read_positive(ocp); + int tcp_max_orphans = sock_net(sk)->ipv4.sysctl_tcp_max_orphans; - if (orphans << shift > sysctl_tcp_max_orphans) { + if (orphans << shift > tcp_max_orphans) { orphans = percpu_counter_sum_positive(ocp); - if (orphans << shift > sysctl_tcp_max_orphans) + if (orphans << shift > tcp_max_orphans) return true; } return false; diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c index 0d3c038..4f26c8d3 100644 --- a/net/ipv4/sysctl_net_ipv4.c +++ b/net/ipv4/sysctl_net_ipv4.c @@ -394,13 +394,6 @@ static int proc_tcp_available_ulp(struct ctl_table *ctl, .proc_handler = proc_dointvec }, { - .procname = "tcp_max_orphans", - .data = _tcp_max_orphans, - .maxlen = sizeof(int), - .mode = 0644, - .proc_handler = proc_dointvec - }, - { .procname = "tcp_fastopen", .data = _tcp_fastopen, .maxlen = sizeof(int), @@ -1085,6 +1078,13 @@ static int proc_tcp_available_ulp(struct ctl_table *ctl, .mode = 0644, .proc_handler = proc_dointvec }, + { + .procname = "tcp_max_orphans", + .data = _net.ipv4.sysctl_tcp_max_orphans, + .maxlen = sizeof(int), + .mode = 0644, + .proc_handler = proc_dointvec + }, #ifdef CONFIG_IP_ROUTE_MULTIPATH { .procname = "fib_multipath_use_neigh", diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index 5091402..39187ac 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -3522,9 +3522,6 @@ void __init tcp_init(void) } - cnt = tcp_hashinfo.ehash_mask + 1; - sysctl_tcp_max_orphans = cnt / 2; - tcp_init_mem(); /* Set per-socket limits to no more than 1/128 the pressure threshold */ limit = nr_free_buffer_pages() << (PAGE_SHIFT - 7); diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index c5d7656..0230509 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -88,7 +88,6 @@ int sysctl_tcp_stdurg __read_mostly; int sysctl_tcp_rfc1337 __read_mostly; -int sysctl_tcp_max_orphans __read_mostly = NR_FILE; int sysctl_tcp_frto __read_mostly = 2; int sysctl_tcp_min_rtt_wlen __read_mostly = 300; int sysctl_tcp_moderate_rcvbuf __read_mostly = 1; diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c index a63486a..4b17a91 100644 --- a/net/ipv4/tcp_ipv4.c +++ b/net/ipv4/tcp_ipv4.c @@ -2468,6 +2468,7 @@ static int __net_init tcp_sk_init(struct net *net) net->ipv4.tcp_death_row.hashinfo = _hashinfo; net->ipv4.sysctl_max_syn_backlog = max(128, cnt / 256); + net->ipv4.sysctl_tcp_max_orphans = cnt / 2; net->ipv4.sysctl_tcp_sack = 1; net->ipv4.sysctl_tcp_window_scaling = 1; net->ipv4.sysctl_tcp_timestamps = 1; -- 1.8.3.1
[PATCH] ipv4: Namespaceify tcp_max_orphans knob
Different namespace application might require different maximal number of TCP sockets independently of the host. Signed-off-by: Haishuang Yan --- include/net/netns/ipv4.h | 1 + include/net/tcp.h | 5 +++-- net/ipv4/sysctl_net_ipv4.c | 14 +++--- net/ipv4/tcp.c | 3 --- net/ipv4/tcp_input.c | 1 - net/ipv4/tcp_ipv4.c| 1 + 6 files changed, 12 insertions(+), 13 deletions(-) diff --git a/include/net/netns/ipv4.h b/include/net/netns/ipv4.h index 20d061c..305e031 100644 --- a/include/net/netns/ipv4.h +++ b/include/net/netns/ipv4.h @@ -127,6 +127,7 @@ struct netns_ipv4 { int sysctl_tcp_timestamps; struct inet_timewait_death_row tcp_death_row; int sysctl_max_syn_backlog; + int sysctl_tcp_max_orphans; #ifdef CONFIG_NET_L3_MASTER_DEV int sysctl_udp_l3mdev_accept; diff --git a/include/net/tcp.h b/include/net/tcp.h index b510f28..ac2d998 100644 --- a/include/net/tcp.h +++ b/include/net/tcp.h @@ -320,10 +320,11 @@ static inline bool tcp_too_many_orphans(struct sock *sk, int shift) { struct percpu_counter *ocp = sk->sk_prot->orphan_count; int orphans = percpu_counter_read_positive(ocp); + int tcp_max_orphans = sock_net(sk)->ipv4.sysctl_tcp_max_orphans; - if (orphans << shift > sysctl_tcp_max_orphans) { + if (orphans << shift > tcp_max_orphans) { orphans = percpu_counter_sum_positive(ocp); - if (orphans << shift > sysctl_tcp_max_orphans) + if (orphans << shift > tcp_max_orphans) return true; } return false; diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c index 0d3c038..4f26c8d3 100644 --- a/net/ipv4/sysctl_net_ipv4.c +++ b/net/ipv4/sysctl_net_ipv4.c @@ -394,13 +394,6 @@ static int proc_tcp_available_ulp(struct ctl_table *ctl, .proc_handler = proc_dointvec }, { - .procname = "tcp_max_orphans", - .data = _tcp_max_orphans, - .maxlen = sizeof(int), - .mode = 0644, - .proc_handler = proc_dointvec - }, - { .procname = "tcp_fastopen", .data = _tcp_fastopen, .maxlen = sizeof(int), @@ -1085,6 +1078,13 @@ static int proc_tcp_available_ulp(struct ctl_table *ctl, .mode = 0644, .proc_handler = proc_dointvec }, + { + .procname = "tcp_max_orphans", + .data = _net.ipv4.sysctl_tcp_max_orphans, + .maxlen = sizeof(int), + .mode = 0644, + .proc_handler = proc_dointvec + }, #ifdef CONFIG_IP_ROUTE_MULTIPATH { .procname = "fib_multipath_use_neigh", diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index 5091402..39187ac 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -3522,9 +3522,6 @@ void __init tcp_init(void) } - cnt = tcp_hashinfo.ehash_mask + 1; - sysctl_tcp_max_orphans = cnt / 2; - tcp_init_mem(); /* Set per-socket limits to no more than 1/128 the pressure threshold */ limit = nr_free_buffer_pages() << (PAGE_SHIFT - 7); diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index c5d7656..0230509 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -88,7 +88,6 @@ int sysctl_tcp_stdurg __read_mostly; int sysctl_tcp_rfc1337 __read_mostly; -int sysctl_tcp_max_orphans __read_mostly = NR_FILE; int sysctl_tcp_frto __read_mostly = 2; int sysctl_tcp_min_rtt_wlen __read_mostly = 300; int sysctl_tcp_moderate_rcvbuf __read_mostly = 1; diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c index a63486a..4b17a91 100644 --- a/net/ipv4/tcp_ipv4.c +++ b/net/ipv4/tcp_ipv4.c @@ -2468,6 +2468,7 @@ static int __net_init tcp_sk_init(struct net *net) net->ipv4.tcp_death_row.hashinfo = _hashinfo; net->ipv4.sysctl_max_syn_backlog = max(128, cnt / 256); + net->ipv4.sysctl_tcp_max_orphans = cnt / 2; net->ipv4.sysctl_tcp_sack = 1; net->ipv4.sysctl_tcp_window_scaling = 1; net->ipv4.sysctl_tcp_timestamps = 1; -- 1.8.3.1
Re: [PATCH v3 2/5] dt-bindings: input: Add document bindings for mtk-pmic-keys
On Tue, 2017-09-05 at 11:05 -0500, Rob Herring wrote: > On Fri, Sep 1, 2017 at 9:16 PM, Chen Zhongwrote: > > On Thu, 2017-08-31 at 14:52 -0500, Rob Herring wrote: > >> On Fri, Aug 25, 2017 at 02:32:30PM +0800, Chen Zhong wrote: > >> > This patch adds the device tree binding documentation for the MediaTek > >> > pmic keys found on PMIC MT6397/MT6323. > >> > > >> > Signed-off-by: Chen Zhong > >> > --- > >> > .../devicetree/bindings/input/mtk-pmic-keys.txt| 38 > >> > > >> > 1 file changed, 38 insertions(+) > >> > create mode 100644 > >> > Documentation/devicetree/bindings/input/mtk-pmic-keys.txt > >> > > >> > diff --git a/Documentation/devicetree/bindings/input/mtk-pmic-keys.txt > >> > b/Documentation/devicetree/bindings/input/mtk-pmic-keys.txt > >> > new file mode 100644 > >> > index 000..100ec44 > >> > --- /dev/null > >> > +++ b/Documentation/devicetree/bindings/input/mtk-pmic-keys.txt > >> > @@ -0,0 +1,38 @@ > >> > +MediaTek MT6397/MT6323 PMIC Keys Device Driver > >> > + > >> > +There are two key functions provided by MT6397/MT6323 PMIC, pwrkey > >> > +and homekey. The key functions are defined as the subnode of the > >> > function > >> > +node provided by MT6397/MT6323 PMIC that is being defined as one kind > >> > +of Muti-Function Device (MFD) > >> > + > >> > +For MT6397/MT6323 MFD bindings see: > >> > +Documentation/devicetree/bindings/mfd/mt6397.txt > >> > + > >> > +Required properties: > >> > +- compatible: "mediatek,mt6397-keys" or "mediatek,mt6323-keys" > >> > +- linux,keycodes: Specifies the numeric keycode values to > >> > + be used for reporting keys presses. The array can > >> > + contain up to 2 entries. > >> > + > >> > +Optional Properties: > >> > +- wakeup-source: each key can be used as a wakeup source. > >> > >> wakeup-source is defined as a boolean. > > > > Hi Rob, > > > > Could I modify it as this? > > > > mediatek,wakeup-keys = <1>, <0>; > > wakeup-source; > > What do the values and index signify? The power key is index 0 and the > value 1 means enable wakeup? Or each value is the raw key (i.e. > indexes in linux,keycode) that wakeup is enabled for? > > I don't think this should be in DT really. It's really up to the user > (i.e. userspace) to decide what keys cause wakeup (or maybe that's > only suspend). If you default to the power key causes wakeup, do you > really need to support different options? > > If we do put this in DT, then it should be a common binding to specify > keys that cause wake-up. > > Rob Hi Rob, Yes, we want to describe that power key is index 0 and is a wakeup source, homekey is index 1 and not a wakeup source. Since power key and homekey are two real HW keys, customer can decide which key to be the wakeup source or both due to their hw design, so we put this in DT and can be different for different boards. Thank you. Chen
Re: [PATCH v3 2/5] dt-bindings: input: Add document bindings for mtk-pmic-keys
On Tue, 2017-09-05 at 11:05 -0500, Rob Herring wrote: > On Fri, Sep 1, 2017 at 9:16 PM, Chen Zhong wrote: > > On Thu, 2017-08-31 at 14:52 -0500, Rob Herring wrote: > >> On Fri, Aug 25, 2017 at 02:32:30PM +0800, Chen Zhong wrote: > >> > This patch adds the device tree binding documentation for the MediaTek > >> > pmic keys found on PMIC MT6397/MT6323. > >> > > >> > Signed-off-by: Chen Zhong > >> > --- > >> > .../devicetree/bindings/input/mtk-pmic-keys.txt| 38 > >> > > >> > 1 file changed, 38 insertions(+) > >> > create mode 100644 > >> > Documentation/devicetree/bindings/input/mtk-pmic-keys.txt > >> > > >> > diff --git a/Documentation/devicetree/bindings/input/mtk-pmic-keys.txt > >> > b/Documentation/devicetree/bindings/input/mtk-pmic-keys.txt > >> > new file mode 100644 > >> > index 000..100ec44 > >> > --- /dev/null > >> > +++ b/Documentation/devicetree/bindings/input/mtk-pmic-keys.txt > >> > @@ -0,0 +1,38 @@ > >> > +MediaTek MT6397/MT6323 PMIC Keys Device Driver > >> > + > >> > +There are two key functions provided by MT6397/MT6323 PMIC, pwrkey > >> > +and homekey. The key functions are defined as the subnode of the > >> > function > >> > +node provided by MT6397/MT6323 PMIC that is being defined as one kind > >> > +of Muti-Function Device (MFD) > >> > + > >> > +For MT6397/MT6323 MFD bindings see: > >> > +Documentation/devicetree/bindings/mfd/mt6397.txt > >> > + > >> > +Required properties: > >> > +- compatible: "mediatek,mt6397-keys" or "mediatek,mt6323-keys" > >> > +- linux,keycodes: Specifies the numeric keycode values to > >> > + be used for reporting keys presses. The array can > >> > + contain up to 2 entries. > >> > + > >> > +Optional Properties: > >> > +- wakeup-source: each key can be used as a wakeup source. > >> > >> wakeup-source is defined as a boolean. > > > > Hi Rob, > > > > Could I modify it as this? > > > > mediatek,wakeup-keys = <1>, <0>; > > wakeup-source; > > What do the values and index signify? The power key is index 0 and the > value 1 means enable wakeup? Or each value is the raw key (i.e. > indexes in linux,keycode) that wakeup is enabled for? > > I don't think this should be in DT really. It's really up to the user > (i.e. userspace) to decide what keys cause wakeup (or maybe that's > only suspend). If you default to the power key causes wakeup, do you > really need to support different options? > > If we do put this in DT, then it should be a common binding to specify > keys that cause wake-up. > > Rob Hi Rob, Yes, we want to describe that power key is index 0 and is a wakeup source, homekey is index 1 and not a wakeup source. Since power key and homekey are two real HW keys, customer can decide which key to be the wakeup source or both due to their hw design, so we put this in DT and can be different for different boards. Thank you. Chen
Re: [PATCH v2 2/2] dma: sprd: Add Spreadtrum DMA driver
On 7 September 2017 at 10:48, Baolin Wangwrote: > Hi Vinod, > > On 7 September 2017 at 00:22, Vinod Koul wrote: >> On Tue, Sep 05, 2017 at 07:48:43PM +0800, Baolin Wang wrote: >> >>> > > +/* DMA global registers definition */ >>> > > +#define DMA_GLB_PAUSE0x0 >>> > > +#define DMA_GLB_FRAG_WAIT0x4 >>> > > +#define DMA_GLB_REQ_PEND0_EN 0x8 >>> > > +#define DMA_GLB_REQ_PEND1_EN 0xc >>> > > +#define DMA_GLB_INT_RAW_STS 0x10 >>> > > +#define DMA_GLB_INT_MSK_STS 0x14 >>> > > +#define DMA_GLB_REQ_STS 0x18 >>> > > +#define DMA_GLB_CHN_EN_STS 0x1c >>> > > +#define DMA_GLB_DEBUG_STS0x20 >>> > > +#define DMA_GLB_ARB_SEL_STS 0x24 >>> > > +#define DMA_GLB_CHN_START_CHN_CFG1 0x28 >>> > > +#define DMA_GLB_CHN_START_CHN_CFG2 0x2c >>> > > +#define DMA_CHN_LLIST_OFFSET 0x10 >>> > > +#define DMA_GLB_REQ_CID_OFFSET 0x2000 >>> > > +#define DMA_GLB_REQ_CID(uid) (0x4 * ((uid) - 1)) >>> > >>> > namespaced properly please, here and rest.. >>> >>> Could you elaborate which name need to named properly? >>> I guess DMA_GLB_REQ_CID is not very clear, can I change >>> to DMA_GLB_REQ_UID(uid) which is used to define the UID >>> registers? >> >> I meant something like: >> >> SPRD_DMA_ >> >> DMA_GLB is very generic term and might cause collisions in future so lets >> make these future proof.. > > Make sense. > >> >>> > > +static int sprd_dma_enable(struct sprd_dma_dev *sdev) >>> > > +{ >>> > > + int ret; >>> > > + >>> > > + ret = clk_prepare_enable(sdev->clk); >>> > > + if (ret) >>> > > + return ret; >>> > > + >>> > > + if (!IS_ERR(sdev->ashb_clk)) >>> > >>> > that looks odd, can you explain this? >>> >>> Since the ashb_clk is optional and only for AGCP DMA controller, >>> so here we add one condition to check if the ashb_clk need enable. >> >> it be worth documenting this.. > > OK. > >> >>> > > +static void sprd_dma_free_chan_resources(struct dma_chan *chan) >>> > > +{ >>> > > + struct sprd_dma_chn *schan = to_sprd_dma_chan(chan); >>> > > + unsigned long flags; >>> > > + >>> > > + spin_lock_irqsave(>vc.lock, flags); >>> > > + sprd_dma_stop(schan); >>> > > + spin_unlock_irqrestore(>vc.lock, flags); >>> > > + >>> > > + vchan_free_chan_resources(>vc); >>> > > + pm_runtime_put_sync(chan->device->dev); >>> > >>> > why put_sync() >>> >>> Since we will get pm counter to resume DMA when allocating resources, then >>> we should put pm counter in sprd_dma_free_chan_resources() to try to suspend >>> DMA if the pm counter is 0. >> >> runtime_pm part is fine, you could have used pm_runtime_put(), why the >> sync() variant here... > > After checking again, I agree pm_runtime_put() is enough here and I > will fix it in next version. > >>> >>> > >>> > > +static enum dma_status sprd_dma_tx_status(struct dma_chan *chan, >>> > > + dma_cookie_t cookie, >>> > > + struct dma_tx_state *txstate) >>> > > +{ >>> > > + struct sprd_dma_chn *schan = to_sprd_dma_chan(chan); >>> > > + unsigned long flags; >>> > > + enum dma_status ret; >>> > > + >>> > > + ret = dma_cookie_status(chan, cookie, txstate); >>> > > + >>> > > + spin_lock_irqsave(>vc.lock, flags); >>> > > + txstate->residue = sprd_dma_get_dst_addr(schan); >>> > >>> > I dont think this is correct, the residue needs to be read only for >>> > current >>> > cookie and the query might be for one which is not even submitted >>> >>> We have one scenario for our audio driver, the audio driver need to get >>> the destination address to check if their transfer is done in no irq mode. >> >> Yes but for the descriptor requested but not any. Audio maybe working as >> period count maybe lesser... Sorry for missing one comment. I know what is your meaning, now I understand the residue needs to be read only for current cookie and I will fix this in next version. >> >>> > > +static int sprd_dma_probe(struct platform_device *pdev) >>> > > +{ >>> > > + struct device_node *np = pdev->dev.of_node; >>> > > + struct sprd_dma_dev *sdev; >>> > > + struct sprd_dma_chn *dma_chn; >>> > > + struct resource *res; >>> > > + u32 chn_count; >>> > > + int ret, i; >>> > > + >>> > > + ret = of_property_read_u32(np, "#dma-channels", _count); >>> > > + if (ret) { >>> > > + dev_err(>dev, "get dma channels count failed\n"); >>> > > + return ret; >>> > > + } >>> > > + >>> > > + sdev = devm_kzalloc(>dev, (sizeof(*sdev) + >>> > > + (sizeof(struct sprd_dma_chn) * chn_count)), >>> > > + GFP_KERNEL); >>> > > + if (!sdev) >>> > > + return -ENOMEM; >>> > > + >>> > > + sdev->clk = devm_clk_get(>dev, "enable"); >>> > > + if (IS_ERR(sdev->clk)) { >>> > > + dev_err(>dev, "get enable clock failed\n"); >>> > > + return PTR_ERR(sdev->clk); >>> > > + } >>> > > + >>>
Re: [PATCH v2 2/2] dma: sprd: Add Spreadtrum DMA driver
On 7 September 2017 at 10:48, Baolin Wang wrote: > Hi Vinod, > > On 7 September 2017 at 00:22, Vinod Koul wrote: >> On Tue, Sep 05, 2017 at 07:48:43PM +0800, Baolin Wang wrote: >> >>> > > +/* DMA global registers definition */ >>> > > +#define DMA_GLB_PAUSE0x0 >>> > > +#define DMA_GLB_FRAG_WAIT0x4 >>> > > +#define DMA_GLB_REQ_PEND0_EN 0x8 >>> > > +#define DMA_GLB_REQ_PEND1_EN 0xc >>> > > +#define DMA_GLB_INT_RAW_STS 0x10 >>> > > +#define DMA_GLB_INT_MSK_STS 0x14 >>> > > +#define DMA_GLB_REQ_STS 0x18 >>> > > +#define DMA_GLB_CHN_EN_STS 0x1c >>> > > +#define DMA_GLB_DEBUG_STS0x20 >>> > > +#define DMA_GLB_ARB_SEL_STS 0x24 >>> > > +#define DMA_GLB_CHN_START_CHN_CFG1 0x28 >>> > > +#define DMA_GLB_CHN_START_CHN_CFG2 0x2c >>> > > +#define DMA_CHN_LLIST_OFFSET 0x10 >>> > > +#define DMA_GLB_REQ_CID_OFFSET 0x2000 >>> > > +#define DMA_GLB_REQ_CID(uid) (0x4 * ((uid) - 1)) >>> > >>> > namespaced properly please, here and rest.. >>> >>> Could you elaborate which name need to named properly? >>> I guess DMA_GLB_REQ_CID is not very clear, can I change >>> to DMA_GLB_REQ_UID(uid) which is used to define the UID >>> registers? >> >> I meant something like: >> >> SPRD_DMA_ >> >> DMA_GLB is very generic term and might cause collisions in future so lets >> make these future proof.. > > Make sense. > >> >>> > > +static int sprd_dma_enable(struct sprd_dma_dev *sdev) >>> > > +{ >>> > > + int ret; >>> > > + >>> > > + ret = clk_prepare_enable(sdev->clk); >>> > > + if (ret) >>> > > + return ret; >>> > > + >>> > > + if (!IS_ERR(sdev->ashb_clk)) >>> > >>> > that looks odd, can you explain this? >>> >>> Since the ashb_clk is optional and only for AGCP DMA controller, >>> so here we add one condition to check if the ashb_clk need enable. >> >> it be worth documenting this.. > > OK. > >> >>> > > +static void sprd_dma_free_chan_resources(struct dma_chan *chan) >>> > > +{ >>> > > + struct sprd_dma_chn *schan = to_sprd_dma_chan(chan); >>> > > + unsigned long flags; >>> > > + >>> > > + spin_lock_irqsave(>vc.lock, flags); >>> > > + sprd_dma_stop(schan); >>> > > + spin_unlock_irqrestore(>vc.lock, flags); >>> > > + >>> > > + vchan_free_chan_resources(>vc); >>> > > + pm_runtime_put_sync(chan->device->dev); >>> > >>> > why put_sync() >>> >>> Since we will get pm counter to resume DMA when allocating resources, then >>> we should put pm counter in sprd_dma_free_chan_resources() to try to suspend >>> DMA if the pm counter is 0. >> >> runtime_pm part is fine, you could have used pm_runtime_put(), why the >> sync() variant here... > > After checking again, I agree pm_runtime_put() is enough here and I > will fix it in next version. > >>> >>> > >>> > > +static enum dma_status sprd_dma_tx_status(struct dma_chan *chan, >>> > > + dma_cookie_t cookie, >>> > > + struct dma_tx_state *txstate) >>> > > +{ >>> > > + struct sprd_dma_chn *schan = to_sprd_dma_chan(chan); >>> > > + unsigned long flags; >>> > > + enum dma_status ret; >>> > > + >>> > > + ret = dma_cookie_status(chan, cookie, txstate); >>> > > + >>> > > + spin_lock_irqsave(>vc.lock, flags); >>> > > + txstate->residue = sprd_dma_get_dst_addr(schan); >>> > >>> > I dont think this is correct, the residue needs to be read only for >>> > current >>> > cookie and the query might be for one which is not even submitted >>> >>> We have one scenario for our audio driver, the audio driver need to get >>> the destination address to check if their transfer is done in no irq mode. >> >> Yes but for the descriptor requested but not any. Audio maybe working as >> period count maybe lesser... Sorry for missing one comment. I know what is your meaning, now I understand the residue needs to be read only for current cookie and I will fix this in next version. >> >>> > > +static int sprd_dma_probe(struct platform_device *pdev) >>> > > +{ >>> > > + struct device_node *np = pdev->dev.of_node; >>> > > + struct sprd_dma_dev *sdev; >>> > > + struct sprd_dma_chn *dma_chn; >>> > > + struct resource *res; >>> > > + u32 chn_count; >>> > > + int ret, i; >>> > > + >>> > > + ret = of_property_read_u32(np, "#dma-channels", _count); >>> > > + if (ret) { >>> > > + dev_err(>dev, "get dma channels count failed\n"); >>> > > + return ret; >>> > > + } >>> > > + >>> > > + sdev = devm_kzalloc(>dev, (sizeof(*sdev) + >>> > > + (sizeof(struct sprd_dma_chn) * chn_count)), >>> > > + GFP_KERNEL); >>> > > + if (!sdev) >>> > > + return -ENOMEM; >>> > > + >>> > > + sdev->clk = devm_clk_get(>dev, "enable"); >>> > > + if (IS_ERR(sdev->clk)) { >>> > > + dev_err(>dev, "get enable clock failed\n"); >>> > > + return PTR_ERR(sdev->clk); >>> > > + } >>> > > + >>> > > + /* ashb clock is optional for AGCP DMA
[PATCH v1] [media] uvcvideo: mark buffer error where overflow
Some cameras post inaccurate frame where next frame data overlap it. this results in screen flicker, and it need to be prevented. So this patch marks the buffer error to discard the frame where buffer overflow. Signed-off-by: Baoyou Xie--- drivers/media/usb/uvc/uvc_video.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/media/usb/uvc/uvc_video.c b/drivers/media/usb/uvc/uvc_video.c index fb86d6a..81a3530 100644 --- a/drivers/media/usb/uvc/uvc_video.c +++ b/drivers/media/usb/uvc/uvc_video.c @@ -1077,6 +1077,7 @@ static void uvc_video_decode_data(struct uvc_streaming *stream, /* Complete the current frame if the buffer size was exceeded. */ if (len > maxlen) { uvc_trace(UVC_TRACE_FRAME, "Frame complete (overflow).\n"); + buf->error = 1; buf->state = UVC_BUF_STATE_READY; } } -- 2.7.4
[PATCH v1] [media] uvcvideo: mark buffer error where overflow
Some cameras post inaccurate frame where next frame data overlap it. this results in screen flicker, and it need to be prevented. So this patch marks the buffer error to discard the frame where buffer overflow. Signed-off-by: Baoyou Xie --- drivers/media/usb/uvc/uvc_video.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/media/usb/uvc/uvc_video.c b/drivers/media/usb/uvc/uvc_video.c index fb86d6a..81a3530 100644 --- a/drivers/media/usb/uvc/uvc_video.c +++ b/drivers/media/usb/uvc/uvc_video.c @@ -1077,6 +1077,7 @@ static void uvc_video_decode_data(struct uvc_streaming *stream, /* Complete the current frame if the buffer size was exceeded. */ if (len > maxlen) { uvc_trace(UVC_TRACE_FRAME, "Frame complete (overflow).\n"); + buf->error = 1; buf->state = UVC_BUF_STATE_READY; } } -- 2.7.4
Re: [PATCH 4/4][RFC v2] x86/apic: Spread the vectors by choosing the idlest CPU
On Wed, Sep 06, 2017 at 10:46:17AM -0700, Dan Williams wrote: > On Tue, Sep 5, 2017 at 11:15 PM, Christoph Hellwigwrote: > > On Wed, Sep 06, 2017 at 12:13:38PM +0800, Yu Chen wrote: > >> I agree, the driver could be rewritten, but it might take some time, so > >> meanwhile I'm looking at also other possible optimization. > > > > Which driver are we talking about anyway? Let's start looking at it > > and fix the issue there. > > As far as I understand, it's already fixed there: > > commit 7c9ae7f053e9e896c24fd23595ba369a5fe322e1 > Author: Carolyn Wyborny > Date: Tue Jun 20 15:16:53 2017 -0700 > > i40e: Fix for trace found with S4 state > > This patch fixes a problem found in systems when entering > S4 state. This patch fixes the problem by ensuring that > the misc vector's IRQ is disabled as well. Without this > patch a stack trace can be seen upon entering S4 state. > > However this seems like something that should be handled generically > in the irq-core especially since commit c5cb83bb337c > "genirq/cpuhotplug: Handle managed IRQs on CPU hotplug" was headed in > that direction. It's otherwise non-obvious when a driver needs to > release and re-acquire interrupts or be reworked to use managed > interrupts. Yes, thanks for the explaination! I did not notice this patch has been merged already. I'm using the normal CPU hotplug to reproduce the issue: #!/bin/bash n=1 while [ $n -le 31 ] do echo 0 > /sys/devices/system/cpu/cpu${n}/online n=$(( n+1 )) done Thanks, Yu
Re: [PATCH 4/4][RFC v2] x86/apic: Spread the vectors by choosing the idlest CPU
On Wed, Sep 06, 2017 at 10:46:17AM -0700, Dan Williams wrote: > On Tue, Sep 5, 2017 at 11:15 PM, Christoph Hellwig wrote: > > On Wed, Sep 06, 2017 at 12:13:38PM +0800, Yu Chen wrote: > >> I agree, the driver could be rewritten, but it might take some time, so > >> meanwhile I'm looking at also other possible optimization. > > > > Which driver are we talking about anyway? Let's start looking at it > > and fix the issue there. > > As far as I understand, it's already fixed there: > > commit 7c9ae7f053e9e896c24fd23595ba369a5fe322e1 > Author: Carolyn Wyborny > Date: Tue Jun 20 15:16:53 2017 -0700 > > i40e: Fix for trace found with S4 state > > This patch fixes a problem found in systems when entering > S4 state. This patch fixes the problem by ensuring that > the misc vector's IRQ is disabled as well. Without this > patch a stack trace can be seen upon entering S4 state. > > However this seems like something that should be handled generically > in the irq-core especially since commit c5cb83bb337c > "genirq/cpuhotplug: Handle managed IRQs on CPU hotplug" was headed in > that direction. It's otherwise non-obvious when a driver needs to > release and re-acquire interrupts or be reworked to use managed > interrupts. Yes, thanks for the explaination! I did not notice this patch has been merged already. I'm using the normal CPU hotplug to reproduce the issue: #!/bin/bash n=1 while [ $n -le 31 ] do echo 0 > /sys/devices/system/cpu/cpu${n}/online n=$(( n+1 )) done Thanks, Yu
[PATCH 0/2] Fix resume failure due to PCID
Patch 1 is the fix. Patch 2 is a comment that would have kept me from chasing down a false lead. I've tested patch 2 using CPU hotplug and suspend/resume. I haven't tested hibernation or kexec because I don't know how. (If I do systemctl hibernate on my laptop, it happily writes out a hiberation image somewhere and then it equally happily ignores it on the next boot. I don't know how to test kexec.) I haven't tested the 32-bit version. I'll try to get to that tomorrow. Andy Lutomirski (2): x86/mm: Reinitialize TLB state on hotplug and resume x86/mm: Document how CR4.PCIDE restore works arch/x86/include/asm/tlbflush.h | 2 ++ arch/x86/kernel/cpu/common.c| 15 ++ arch/x86/mm/tlb.c | 44 + arch/x86/power/cpu.c| 1 + 4 files changed, 62 insertions(+) -- 2.13.5
[PATCH 0/2] Fix resume failure due to PCID
Patch 1 is the fix. Patch 2 is a comment that would have kept me from chasing down a false lead. I've tested patch 2 using CPU hotplug and suspend/resume. I haven't tested hibernation or kexec because I don't know how. (If I do systemctl hibernate on my laptop, it happily writes out a hiberation image somewhere and then it equally happily ignores it on the next boot. I don't know how to test kexec.) I haven't tested the 32-bit version. I'll try to get to that tomorrow. Andy Lutomirski (2): x86/mm: Reinitialize TLB state on hotplug and resume x86/mm: Document how CR4.PCIDE restore works arch/x86/include/asm/tlbflush.h | 2 ++ arch/x86/kernel/cpu/common.c| 15 ++ arch/x86/mm/tlb.c | 44 + arch/x86/power/cpu.c| 1 + 4 files changed, 62 insertions(+) -- 2.13.5
[PATCH 2/2] x86/mm: Document how CR4.PCIDE restore works
While debugging a problem, I thought that using cr4_set_bits_and_update_boot() to restore CR4.PCIDE would be helpful. It turns out to be counterproductive. Add a comment documenting how this works. Signed-off-by: Andy Lutomirski--- arch/x86/kernel/cpu/common.c | 13 + 1 file changed, 13 insertions(+) diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c index 40cb4d0a5982..4c31a6585333 100644 --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -333,6 +333,19 @@ static void setup_pcid(struct cpuinfo_x86 *c) { if (cpu_has(c, X86_FEATURE_PCID)) { if (cpu_has(c, X86_FEATURE_PGE)) { + /* +* We'd like to use cr4_set_bits_and_update_boot(), +* but we can't. CR4.PCIDE is special and can only +* be set in long mode, and the early CPU init code +* doesn't know this and would try to restore CR4.PCIDE +* prior to entering long mode. +* +* Instead, we rely on the fact that hotplug, resume, +* etc all fully restore CR4 before they write anything +* that could have nonzero PCID bits to CR3. CR4.PCIDE +* has no effect on the page tables themselves, so we +* don't need it top be restored early. +*/ cr4_set_bits(X86_CR4_PCIDE); } else { /* -- 2.13.5
[PATCH 1/2] x86/mm: Reinitialize TLB state on hotplug and resume
When Linux brings a CPU down and back up, it switches to init_mm and then loads swapper_pg_dir into CR3. With PCID enabled, this has the side effect of masking off the ASID bits in CR3. This can result in some confusion in the TLB handling code. If we bring a CPU down and back up with any ASID other than 0, we end up with the wrong ASID active on the CPU after resume. This could cause our internal state to become corrupt, although major corruption is unlikely because init_mm doesn't have any user pages. More obviously, if CONFIG_DEBUG_VM=y, we'll trip over an assertion in the next context switch. The result of *that* is a failure to resume from suspend with probability 1 - 1/6^(cpus-1). Fix it by reinitializing cpu_tlbstate on resume and CPU bringup. Reported-by: Linus TorvaldsReported-by: Jiri Kosina Fixes: 10af6235e0d3 ("x86/mm: Implement PCID based optimization: try to preserve old TLB entries using PCID") Signed-off-by: Andy Lutomirski --- arch/x86/include/asm/tlbflush.h | 2 ++ arch/x86/kernel/cpu/common.c| 2 ++ arch/x86/mm/tlb.c | 44 + arch/x86/power/cpu.c| 1 + 4 files changed, 49 insertions(+) diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h index d23e61dc0640..4893abf7f74f 100644 --- a/arch/x86/include/asm/tlbflush.h +++ b/arch/x86/include/asm/tlbflush.h @@ -198,6 +198,8 @@ static inline void cr4_set_bits_and_update_boot(unsigned long mask) cr4_set_bits(mask); } +extern void initialize_tlbstate_and_flush(void); + static inline void __native_flush_tlb(void) { /* diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c index efba8e3da3e2..40cb4d0a5982 100644 --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -1583,6 +1583,7 @@ void cpu_init(void) mmgrab(_mm); me->active_mm = _mm; BUG_ON(me->mm); + initialize_tlbstate_and_flush(); enter_lazy_tlb(_mm, me); load_sp0(t, >thread); @@ -1637,6 +1638,7 @@ void cpu_init(void) mmgrab(_mm); curr->active_mm = _mm; BUG_ON(curr->mm); + initialize_tlbstate_and_flush(); enter_lazy_tlb(_mm, curr); load_sp0(t, thread); diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c index ce104b962a17..dbbcfd59726a 100644 --- a/arch/x86/mm/tlb.c +++ b/arch/x86/mm/tlb.c @@ -214,6 +214,50 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next, } /* + * Call this when reinitializing a CPU. It fixes the following potential + * problems: + * + * - The ASID changed from what cpu_tlbstate thinks it is (most likely + * because the CPU was taken down and came back up with CR3's PCID + * bits clear. CPU hotplug can do this. + * + * - The TLB contains junk in slots corresponding to inactive ASIDs. + * + * - The CPU went so far out to lunch that it may have missed a TLB + * flush. + */ +void initialize_tlbstate_and_flush(void) +{ + int i; + struct mm_struct *mm = this_cpu_read(cpu_tlbstate.loaded_mm); + u64 tlb_gen = atomic64_read(_mm.context.tlb_gen); + unsigned long cr3 = __read_cr3(); + + /* Assert that CR3 already references the right mm. */ + WARN_ON((cr3 & CR3_ADDR_MASK) != __pa(mm->pgd)); + + /* +* Assert that CR4.PCIDE is set if needed. (CR4.PCIDE initialization +* doesn't work like other CR4 bits because it can only be set from +* long mode.) +*/ + WARN_ON(boot_cpu_has(X86_CR4_PCIDE) && + !(cr4_read_shadow() & X86_CR4_PCIDE)); + + /* Force ASID 0 and force a TLB flush. */ + write_cr3(cr3 & ~CR3_PCID_MASK); + + /* Reinitialize tlbstate. */ + this_cpu_write(cpu_tlbstate.loaded_mm_asid, 0); + this_cpu_write(cpu_tlbstate.next_asid, 1); + this_cpu_write(cpu_tlbstate.ctxs[0].ctx_id, mm->context.ctx_id); + this_cpu_write(cpu_tlbstate.ctxs[0].tlb_gen, tlb_gen); + + for (i = 1; i < TLB_NR_DYN_ASIDS; i++) + this_cpu_write(cpu_tlbstate.ctxs[i].ctx_id, 0); +} + +/* * flush_tlb_func_common()'s memory ordering requirement is that any * TLB fills that happen after we flush the TLB are ordered after we * read active_mm's tlb_gen. We don't need any explicit barriers diff --git a/arch/x86/power/cpu.c b/arch/x86/power/cpu.c index 78459a6d455a..4d68d59f457d 100644 --- a/arch/x86/power/cpu.c +++ b/arch/x86/power/cpu.c @@ -181,6 +181,7 @@ static void fix_processor_context(void) #endif load_TR_desc(); /* This does ltr */ load_mm_ldt(current->active_mm);/* This does lldt */ + initialize_tlbstate_and_flush(); fpu__resume_cpu(); -- 2.13.5
[PATCH 2/2] x86/mm: Document how CR4.PCIDE restore works
While debugging a problem, I thought that using cr4_set_bits_and_update_boot() to restore CR4.PCIDE would be helpful. It turns out to be counterproductive. Add a comment documenting how this works. Signed-off-by: Andy Lutomirski --- arch/x86/kernel/cpu/common.c | 13 + 1 file changed, 13 insertions(+) diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c index 40cb4d0a5982..4c31a6585333 100644 --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -333,6 +333,19 @@ static void setup_pcid(struct cpuinfo_x86 *c) { if (cpu_has(c, X86_FEATURE_PCID)) { if (cpu_has(c, X86_FEATURE_PGE)) { + /* +* We'd like to use cr4_set_bits_and_update_boot(), +* but we can't. CR4.PCIDE is special and can only +* be set in long mode, and the early CPU init code +* doesn't know this and would try to restore CR4.PCIDE +* prior to entering long mode. +* +* Instead, we rely on the fact that hotplug, resume, +* etc all fully restore CR4 before they write anything +* that could have nonzero PCID bits to CR3. CR4.PCIDE +* has no effect on the page tables themselves, so we +* don't need it top be restored early. +*/ cr4_set_bits(X86_CR4_PCIDE); } else { /* -- 2.13.5
[PATCH 1/2] x86/mm: Reinitialize TLB state on hotplug and resume
When Linux brings a CPU down and back up, it switches to init_mm and then loads swapper_pg_dir into CR3. With PCID enabled, this has the side effect of masking off the ASID bits in CR3. This can result in some confusion in the TLB handling code. If we bring a CPU down and back up with any ASID other than 0, we end up with the wrong ASID active on the CPU after resume. This could cause our internal state to become corrupt, although major corruption is unlikely because init_mm doesn't have any user pages. More obviously, if CONFIG_DEBUG_VM=y, we'll trip over an assertion in the next context switch. The result of *that* is a failure to resume from suspend with probability 1 - 1/6^(cpus-1). Fix it by reinitializing cpu_tlbstate on resume and CPU bringup. Reported-by: Linus Torvalds Reported-by: Jiri Kosina Fixes: 10af6235e0d3 ("x86/mm: Implement PCID based optimization: try to preserve old TLB entries using PCID") Signed-off-by: Andy Lutomirski --- arch/x86/include/asm/tlbflush.h | 2 ++ arch/x86/kernel/cpu/common.c| 2 ++ arch/x86/mm/tlb.c | 44 + arch/x86/power/cpu.c| 1 + 4 files changed, 49 insertions(+) diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h index d23e61dc0640..4893abf7f74f 100644 --- a/arch/x86/include/asm/tlbflush.h +++ b/arch/x86/include/asm/tlbflush.h @@ -198,6 +198,8 @@ static inline void cr4_set_bits_and_update_boot(unsigned long mask) cr4_set_bits(mask); } +extern void initialize_tlbstate_and_flush(void); + static inline void __native_flush_tlb(void) { /* diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c index efba8e3da3e2..40cb4d0a5982 100644 --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -1583,6 +1583,7 @@ void cpu_init(void) mmgrab(_mm); me->active_mm = _mm; BUG_ON(me->mm); + initialize_tlbstate_and_flush(); enter_lazy_tlb(_mm, me); load_sp0(t, >thread); @@ -1637,6 +1638,7 @@ void cpu_init(void) mmgrab(_mm); curr->active_mm = _mm; BUG_ON(curr->mm); + initialize_tlbstate_and_flush(); enter_lazy_tlb(_mm, curr); load_sp0(t, thread); diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c index ce104b962a17..dbbcfd59726a 100644 --- a/arch/x86/mm/tlb.c +++ b/arch/x86/mm/tlb.c @@ -214,6 +214,50 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next, } /* + * Call this when reinitializing a CPU. It fixes the following potential + * problems: + * + * - The ASID changed from what cpu_tlbstate thinks it is (most likely + * because the CPU was taken down and came back up with CR3's PCID + * bits clear. CPU hotplug can do this. + * + * - The TLB contains junk in slots corresponding to inactive ASIDs. + * + * - The CPU went so far out to lunch that it may have missed a TLB + * flush. + */ +void initialize_tlbstate_and_flush(void) +{ + int i; + struct mm_struct *mm = this_cpu_read(cpu_tlbstate.loaded_mm); + u64 tlb_gen = atomic64_read(_mm.context.tlb_gen); + unsigned long cr3 = __read_cr3(); + + /* Assert that CR3 already references the right mm. */ + WARN_ON((cr3 & CR3_ADDR_MASK) != __pa(mm->pgd)); + + /* +* Assert that CR4.PCIDE is set if needed. (CR4.PCIDE initialization +* doesn't work like other CR4 bits because it can only be set from +* long mode.) +*/ + WARN_ON(boot_cpu_has(X86_CR4_PCIDE) && + !(cr4_read_shadow() & X86_CR4_PCIDE)); + + /* Force ASID 0 and force a TLB flush. */ + write_cr3(cr3 & ~CR3_PCID_MASK); + + /* Reinitialize tlbstate. */ + this_cpu_write(cpu_tlbstate.loaded_mm_asid, 0); + this_cpu_write(cpu_tlbstate.next_asid, 1); + this_cpu_write(cpu_tlbstate.ctxs[0].ctx_id, mm->context.ctx_id); + this_cpu_write(cpu_tlbstate.ctxs[0].tlb_gen, tlb_gen); + + for (i = 1; i < TLB_NR_DYN_ASIDS; i++) + this_cpu_write(cpu_tlbstate.ctxs[i].ctx_id, 0); +} + +/* * flush_tlb_func_common()'s memory ordering requirement is that any * TLB fills that happen after we flush the TLB are ordered after we * read active_mm's tlb_gen. We don't need any explicit barriers diff --git a/arch/x86/power/cpu.c b/arch/x86/power/cpu.c index 78459a6d455a..4d68d59f457d 100644 --- a/arch/x86/power/cpu.c +++ b/arch/x86/power/cpu.c @@ -181,6 +181,7 @@ static void fix_processor_context(void) #endif load_TR_desc(); /* This does ltr */ load_mm_ldt(current->active_mm);/* This does lldt */ + initialize_tlbstate_and_flush(); fpu__resume_cpu(); -- 2.13.5
Re: [PATCH 4/4][RFC v2] x86/apic: Spread the vectors by choosing the idlest CPU
On Wed, Sep 06, 2017 at 10:03:58AM +0200, Thomas Gleixner wrote: > On Wed, 6 Sep 2017, Yu Chen wrote: > > On Wed, Sep 06, 2017 at 12:57:41AM +0200, Thomas Gleixner wrote: > > > I have a hard time to figure out how the 133 vectors on CPU31 are now > > > magically fitting in the empty space on CPU0, which is 204 - 133 = 71. In > > > my limited understanding of math 133 is greater than 71, but your patch > > > might make that magically be wrong. > > > > > The problem is reproduced when the network cable is not plugged in, > > because this driver looks like this: > > > > step 1. Reserved enough irq vectors and corresponding IRQs. > > step 2. If the network is activated, invoke request_irq() to > > register the handler. > > step 3. Invoke set_affinity() to spread the IRQs onto different > > CPUs, thus to spread the vectors too. > > > > Here's my understanding for why spreading vectors might help for this > > special case: > > As step 2 will not get invoked, the IRQs of this driver > > has not been enabled, thus in migrate_one_irq() this IRQ > > will not be considered because there is a check of > > irqd_is_started(d), thus there should only be 8 vectors > > allocated by this driver on CPU0, and 8 vectors left on > > CPU31, and the 8 vectors on CPU31 will not be migrated > > to CPU0 neither, so there is room for other 'valid' vectors > > to be migrated to CPU0. > > Can you please spare me repeating your theories, as long as you don't have > hard facts to back them up? The network cable is changing the symptoms, > but the underlying root cause is definitely something different. > > > # cat /sys/kernel/debug/irq/domains/* > > name: VECTOR > > size: 0 > > mapped: 388 > > flags: 0x0041 > > So we have 388 vectors mapped in total. And those are just device vectors > because system vectors are not accounted there. > > > name: IO-APIC-0 > > size: 24 > > mapped: 16 > > That's the legacy space > > > name: IO-APIC-1 > > size: 8 > > mapped: 2 > > > name: IO-APIC-2 > > size: 8 > > mapped: 0 > > > name: IO-APIC-3 > > size: 8 > > mapped: 0 > > > name: IO-APIC-4 > > size: 8 > > mapped: 5 > > And a few GSIs: Total GSIs = 16 + 2 + 5 = 23 > > > name: PCI-MSI-2 > > size: 0 > > mapped: 365 > > Plus 365 PCI-MSI vectors allocated. > > > flags: 0x0051 > > parent: VECTOR > > name: VECTOR > > size: 0 > > mapped: 388 > > Which nicely sums up to 388 > > > # ls /sys/kernel/debug/irq/irqs > > ls /sys/kernel/debug/irq/irqs > > 0 10 11 13 142 184 217 259 292 31 33 337 339 > > 340 342 344 346 348 350 352 354 356 358 360 362 > > 364 366 368 370 372 374 376 378 380 382 384 386 > > 388 390 392 394 4 6 7 9 1 109 12 14 15 2 > > 24 26 332 335 338 34 341 343 345 347 349 > > 351 353 355 357 359 361 363 365 367 369 371 373 > > 375 377 379 381 383 385 387 389 391 393 395 5 > > 67 8 > > That are all interrupts which are active. That's a total of 89. Can you > explain where the delta of 299 vectors comes from? > > 299 allocated, vector mapped, but unused interrupts? > > That's where your problem is, not in the vector spreading. You have a > massive leak. > > > BTW, do we have sysfs to display how much vectors used on each CPUs? > > Not yet. > > Can you please apply the debug patch below, boot the machine and right > after login provide the output of > > # cat /sys/kernel/debug/tracing/trace > Ok, I've tested on top of 4.13, here're the results: # cat /sys/kernel/debug/tracing/trace # tracer: nop # # _-=> irqs-off # / _=> need-resched #| / _---=> hardirq/softirq #|| / _--=> preempt-depth #||| / delay # TASK-PID CPU# TIMESTAMP FUNCTION # | | | | | kworker/0:0-3 [000] 7.373461: msi_domain_alloc_irqs: dev: :00:1c.0 nvec 1 virq 24 kworker/0:0-3 [000] 7.373666: msi_domain_alloc_irqs: dev: :b2:00.0 nvec 1 virq 26 kworker/0:0-3 [000] 7.578980: msi_domain_alloc_irqs: dev: :00:11.5 nvec 1 virq 31 kworker/0:0-3 [000] 7.676938: msi_domain_alloc_irqs: dev: :00:17.0 nvec 1 virq 32 kworker/0:0-3 [000] 8.017664: msi_domain_alloc_irqs: dev: :00:14.0 nvec 1 virq 33 kworker/0:2-303 [000] 9.135467: msi_domain_alloc_irqs: dev: :bb:00.0 nvec 1 virq 34 kworker/0:2-303 [000] 9.135476: msi_domain_alloc_irqs: dev: :bb:00.0 nvec 1 virq 35 kworker/0:2-303 [000] 9.135484: msi_domain_alloc_irqs: dev: :bb:00.0 nvec 1 virq 36 kworker/0:2-303 [000] 9.135492: msi_domain_alloc_irqs: dev: :bb:00.0 nvec 1 virq 37 kworker/0:2-303 [000] 9.135499: msi_domain_alloc_irqs:
Re: [PATCH 4/4][RFC v2] x86/apic: Spread the vectors by choosing the idlest CPU
On Wed, Sep 06, 2017 at 10:03:58AM +0200, Thomas Gleixner wrote: > On Wed, 6 Sep 2017, Yu Chen wrote: > > On Wed, Sep 06, 2017 at 12:57:41AM +0200, Thomas Gleixner wrote: > > > I have a hard time to figure out how the 133 vectors on CPU31 are now > > > magically fitting in the empty space on CPU0, which is 204 - 133 = 71. In > > > my limited understanding of math 133 is greater than 71, but your patch > > > might make that magically be wrong. > > > > > The problem is reproduced when the network cable is not plugged in, > > because this driver looks like this: > > > > step 1. Reserved enough irq vectors and corresponding IRQs. > > step 2. If the network is activated, invoke request_irq() to > > register the handler. > > step 3. Invoke set_affinity() to spread the IRQs onto different > > CPUs, thus to spread the vectors too. > > > > Here's my understanding for why spreading vectors might help for this > > special case: > > As step 2 will not get invoked, the IRQs of this driver > > has not been enabled, thus in migrate_one_irq() this IRQ > > will not be considered because there is a check of > > irqd_is_started(d), thus there should only be 8 vectors > > allocated by this driver on CPU0, and 8 vectors left on > > CPU31, and the 8 vectors on CPU31 will not be migrated > > to CPU0 neither, so there is room for other 'valid' vectors > > to be migrated to CPU0. > > Can you please spare me repeating your theories, as long as you don't have > hard facts to back them up? The network cable is changing the symptoms, > but the underlying root cause is definitely something different. > > > # cat /sys/kernel/debug/irq/domains/* > > name: VECTOR > > size: 0 > > mapped: 388 > > flags: 0x0041 > > So we have 388 vectors mapped in total. And those are just device vectors > because system vectors are not accounted there. > > > name: IO-APIC-0 > > size: 24 > > mapped: 16 > > That's the legacy space > > > name: IO-APIC-1 > > size: 8 > > mapped: 2 > > > name: IO-APIC-2 > > size: 8 > > mapped: 0 > > > name: IO-APIC-3 > > size: 8 > > mapped: 0 > > > name: IO-APIC-4 > > size: 8 > > mapped: 5 > > And a few GSIs: Total GSIs = 16 + 2 + 5 = 23 > > > name: PCI-MSI-2 > > size: 0 > > mapped: 365 > > Plus 365 PCI-MSI vectors allocated. > > > flags: 0x0051 > > parent: VECTOR > > name: VECTOR > > size: 0 > > mapped: 388 > > Which nicely sums up to 388 > > > # ls /sys/kernel/debug/irq/irqs > > ls /sys/kernel/debug/irq/irqs > > 0 10 11 13 142 184 217 259 292 31 33 337 339 > > 340 342 344 346 348 350 352 354 356 358 360 362 > > 364 366 368 370 372 374 376 378 380 382 384 386 > > 388 390 392 394 4 6 7 9 1 109 12 14 15 2 > > 24 26 332 335 338 34 341 343 345 347 349 > > 351 353 355 357 359 361 363 365 367 369 371 373 > > 375 377 379 381 383 385 387 389 391 393 395 5 > > 67 8 > > That are all interrupts which are active. That's a total of 89. Can you > explain where the delta of 299 vectors comes from? > > 299 allocated, vector mapped, but unused interrupts? > > That's where your problem is, not in the vector spreading. You have a > massive leak. > > > BTW, do we have sysfs to display how much vectors used on each CPUs? > > Not yet. > > Can you please apply the debug patch below, boot the machine and right > after login provide the output of > > # cat /sys/kernel/debug/tracing/trace > Ok, I've tested on top of 4.13, here're the results: # cat /sys/kernel/debug/tracing/trace # tracer: nop # # _-=> irqs-off # / _=> need-resched #| / _---=> hardirq/softirq #|| / _--=> preempt-depth #||| / delay # TASK-PID CPU# TIMESTAMP FUNCTION # | | | | | kworker/0:0-3 [000] 7.373461: msi_domain_alloc_irqs: dev: :00:1c.0 nvec 1 virq 24 kworker/0:0-3 [000] 7.373666: msi_domain_alloc_irqs: dev: :b2:00.0 nvec 1 virq 26 kworker/0:0-3 [000] 7.578980: msi_domain_alloc_irqs: dev: :00:11.5 nvec 1 virq 31 kworker/0:0-3 [000] 7.676938: msi_domain_alloc_irqs: dev: :00:17.0 nvec 1 virq 32 kworker/0:0-3 [000] 8.017664: msi_domain_alloc_irqs: dev: :00:14.0 nvec 1 virq 33 kworker/0:2-303 [000] 9.135467: msi_domain_alloc_irqs: dev: :bb:00.0 nvec 1 virq 34 kworker/0:2-303 [000] 9.135476: msi_domain_alloc_irqs: dev: :bb:00.0 nvec 1 virq 35 kworker/0:2-303 [000] 9.135484: msi_domain_alloc_irqs: dev: :bb:00.0 nvec 1 virq 36 kworker/0:2-303 [000] 9.135492: msi_domain_alloc_irqs: dev: :bb:00.0 nvec 1 virq 37 kworker/0:2-303 [000] 9.135499: msi_domain_alloc_irqs:
Re: [PATCH v2 2/2] dma: sprd: Add Spreadtrum DMA driver
Hi Vinod, On 7 September 2017 at 00:22, Vinod Koulwrote: > On Tue, Sep 05, 2017 at 07:48:43PM +0800, Baolin Wang wrote: > >> > > +/* DMA global registers definition */ >> > > +#define DMA_GLB_PAUSE0x0 >> > > +#define DMA_GLB_FRAG_WAIT0x4 >> > > +#define DMA_GLB_REQ_PEND0_EN 0x8 >> > > +#define DMA_GLB_REQ_PEND1_EN 0xc >> > > +#define DMA_GLB_INT_RAW_STS 0x10 >> > > +#define DMA_GLB_INT_MSK_STS 0x14 >> > > +#define DMA_GLB_REQ_STS 0x18 >> > > +#define DMA_GLB_CHN_EN_STS 0x1c >> > > +#define DMA_GLB_DEBUG_STS0x20 >> > > +#define DMA_GLB_ARB_SEL_STS 0x24 >> > > +#define DMA_GLB_CHN_START_CHN_CFG1 0x28 >> > > +#define DMA_GLB_CHN_START_CHN_CFG2 0x2c >> > > +#define DMA_CHN_LLIST_OFFSET 0x10 >> > > +#define DMA_GLB_REQ_CID_OFFSET 0x2000 >> > > +#define DMA_GLB_REQ_CID(uid) (0x4 * ((uid) - 1)) >> > >> > namespaced properly please, here and rest.. >> >> Could you elaborate which name need to named properly? >> I guess DMA_GLB_REQ_CID is not very clear, can I change >> to DMA_GLB_REQ_UID(uid) which is used to define the UID >> registers? > > I meant something like: > > SPRD_DMA_ > > DMA_GLB is very generic term and might cause collisions in future so lets > make these future proof.. Make sense. > >> > > +static int sprd_dma_enable(struct sprd_dma_dev *sdev) >> > > +{ >> > > + int ret; >> > > + >> > > + ret = clk_prepare_enable(sdev->clk); >> > > + if (ret) >> > > + return ret; >> > > + >> > > + if (!IS_ERR(sdev->ashb_clk)) >> > >> > that looks odd, can you explain this? >> >> Since the ashb_clk is optional and only for AGCP DMA controller, >> so here we add one condition to check if the ashb_clk need enable. > > it be worth documenting this.. OK. > >> > > +static void sprd_dma_free_chan_resources(struct dma_chan *chan) >> > > +{ >> > > + struct sprd_dma_chn *schan = to_sprd_dma_chan(chan); >> > > + unsigned long flags; >> > > + >> > > + spin_lock_irqsave(>vc.lock, flags); >> > > + sprd_dma_stop(schan); >> > > + spin_unlock_irqrestore(>vc.lock, flags); >> > > + >> > > + vchan_free_chan_resources(>vc); >> > > + pm_runtime_put_sync(chan->device->dev); >> > >> > why put_sync() >> >> Since we will get pm counter to resume DMA when allocating resources, then >> we should put pm counter in sprd_dma_free_chan_resources() to try to suspend >> DMA if the pm counter is 0. > > runtime_pm part is fine, you could have used pm_runtime_put(), why the > sync() variant here... After checking again, I agree pm_runtime_put() is enough here and I will fix it in next version. >> >> > >> > > +static enum dma_status sprd_dma_tx_status(struct dma_chan *chan, >> > > + dma_cookie_t cookie, >> > > + struct dma_tx_state *txstate) >> > > +{ >> > > + struct sprd_dma_chn *schan = to_sprd_dma_chan(chan); >> > > + unsigned long flags; >> > > + enum dma_status ret; >> > > + >> > > + ret = dma_cookie_status(chan, cookie, txstate); >> > > + >> > > + spin_lock_irqsave(>vc.lock, flags); >> > > + txstate->residue = sprd_dma_get_dst_addr(schan); >> > >> > I dont think this is correct, the residue needs to be read only for current >> > cookie and the query might be for one which is not even submitted >> >> We have one scenario for our audio driver, the audio driver need to get >> the destination address to check if their transfer is done in no irq mode. > > Yes but for the descriptor requested but not any. Audio maybe working as > period count maybe lesser... > >> > > +static int sprd_dma_probe(struct platform_device *pdev) >> > > +{ >> > > + struct device_node *np = pdev->dev.of_node; >> > > + struct sprd_dma_dev *sdev; >> > > + struct sprd_dma_chn *dma_chn; >> > > + struct resource *res; >> > > + u32 chn_count; >> > > + int ret, i; >> > > + >> > > + ret = of_property_read_u32(np, "#dma-channels", _count); >> > > + if (ret) { >> > > + dev_err(>dev, "get dma channels count failed\n"); >> > > + return ret; >> > > + } >> > > + >> > > + sdev = devm_kzalloc(>dev, (sizeof(*sdev) + >> > > + (sizeof(struct sprd_dma_chn) * chn_count)), >> > > + GFP_KERNEL); >> > > + if (!sdev) >> > > + return -ENOMEM; >> > > + >> > > + sdev->clk = devm_clk_get(>dev, "enable"); >> > > + if (IS_ERR(sdev->clk)) { >> > > + dev_err(>dev, "get enable clock failed\n"); >> > > + return PTR_ERR(sdev->clk); >> > > + } >> > > + >> > > + /* ashb clock is optional for AGCP DMA */ >> > > + sdev->ashb_clk = devm_clk_get(>dev, "ashb_eb"); >> > > + if (IS_ERR(sdev->ashb_clk)) >> > > + dev_warn(>dev, "no optional ashb eb clock\n"); >> > > + >> > > + sdev->irq = platform_get_irq(pdev, 0); >> > > + if (sdev->irq > 0) { >> > > + ret = devm_request_irq(>dev, sdev->irq, dma_irq_handle, >> > > +
Re: [PATCH v2 2/2] dma: sprd: Add Spreadtrum DMA driver
Hi Vinod, On 7 September 2017 at 00:22, Vinod Koul wrote: > On Tue, Sep 05, 2017 at 07:48:43PM +0800, Baolin Wang wrote: > >> > > +/* DMA global registers definition */ >> > > +#define DMA_GLB_PAUSE0x0 >> > > +#define DMA_GLB_FRAG_WAIT0x4 >> > > +#define DMA_GLB_REQ_PEND0_EN 0x8 >> > > +#define DMA_GLB_REQ_PEND1_EN 0xc >> > > +#define DMA_GLB_INT_RAW_STS 0x10 >> > > +#define DMA_GLB_INT_MSK_STS 0x14 >> > > +#define DMA_GLB_REQ_STS 0x18 >> > > +#define DMA_GLB_CHN_EN_STS 0x1c >> > > +#define DMA_GLB_DEBUG_STS0x20 >> > > +#define DMA_GLB_ARB_SEL_STS 0x24 >> > > +#define DMA_GLB_CHN_START_CHN_CFG1 0x28 >> > > +#define DMA_GLB_CHN_START_CHN_CFG2 0x2c >> > > +#define DMA_CHN_LLIST_OFFSET 0x10 >> > > +#define DMA_GLB_REQ_CID_OFFSET 0x2000 >> > > +#define DMA_GLB_REQ_CID(uid) (0x4 * ((uid) - 1)) >> > >> > namespaced properly please, here and rest.. >> >> Could you elaborate which name need to named properly? >> I guess DMA_GLB_REQ_CID is not very clear, can I change >> to DMA_GLB_REQ_UID(uid) which is used to define the UID >> registers? > > I meant something like: > > SPRD_DMA_ > > DMA_GLB is very generic term and might cause collisions in future so lets > make these future proof.. Make sense. > >> > > +static int sprd_dma_enable(struct sprd_dma_dev *sdev) >> > > +{ >> > > + int ret; >> > > + >> > > + ret = clk_prepare_enable(sdev->clk); >> > > + if (ret) >> > > + return ret; >> > > + >> > > + if (!IS_ERR(sdev->ashb_clk)) >> > >> > that looks odd, can you explain this? >> >> Since the ashb_clk is optional and only for AGCP DMA controller, >> so here we add one condition to check if the ashb_clk need enable. > > it be worth documenting this.. OK. > >> > > +static void sprd_dma_free_chan_resources(struct dma_chan *chan) >> > > +{ >> > > + struct sprd_dma_chn *schan = to_sprd_dma_chan(chan); >> > > + unsigned long flags; >> > > + >> > > + spin_lock_irqsave(>vc.lock, flags); >> > > + sprd_dma_stop(schan); >> > > + spin_unlock_irqrestore(>vc.lock, flags); >> > > + >> > > + vchan_free_chan_resources(>vc); >> > > + pm_runtime_put_sync(chan->device->dev); >> > >> > why put_sync() >> >> Since we will get pm counter to resume DMA when allocating resources, then >> we should put pm counter in sprd_dma_free_chan_resources() to try to suspend >> DMA if the pm counter is 0. > > runtime_pm part is fine, you could have used pm_runtime_put(), why the > sync() variant here... After checking again, I agree pm_runtime_put() is enough here and I will fix it in next version. >> >> > >> > > +static enum dma_status sprd_dma_tx_status(struct dma_chan *chan, >> > > + dma_cookie_t cookie, >> > > + struct dma_tx_state *txstate) >> > > +{ >> > > + struct sprd_dma_chn *schan = to_sprd_dma_chan(chan); >> > > + unsigned long flags; >> > > + enum dma_status ret; >> > > + >> > > + ret = dma_cookie_status(chan, cookie, txstate); >> > > + >> > > + spin_lock_irqsave(>vc.lock, flags); >> > > + txstate->residue = sprd_dma_get_dst_addr(schan); >> > >> > I dont think this is correct, the residue needs to be read only for current >> > cookie and the query might be for one which is not even submitted >> >> We have one scenario for our audio driver, the audio driver need to get >> the destination address to check if their transfer is done in no irq mode. > > Yes but for the descriptor requested but not any. Audio maybe working as > period count maybe lesser... > >> > > +static int sprd_dma_probe(struct platform_device *pdev) >> > > +{ >> > > + struct device_node *np = pdev->dev.of_node; >> > > + struct sprd_dma_dev *sdev; >> > > + struct sprd_dma_chn *dma_chn; >> > > + struct resource *res; >> > > + u32 chn_count; >> > > + int ret, i; >> > > + >> > > + ret = of_property_read_u32(np, "#dma-channels", _count); >> > > + if (ret) { >> > > + dev_err(>dev, "get dma channels count failed\n"); >> > > + return ret; >> > > + } >> > > + >> > > + sdev = devm_kzalloc(>dev, (sizeof(*sdev) + >> > > + (sizeof(struct sprd_dma_chn) * chn_count)), >> > > + GFP_KERNEL); >> > > + if (!sdev) >> > > + return -ENOMEM; >> > > + >> > > + sdev->clk = devm_clk_get(>dev, "enable"); >> > > + if (IS_ERR(sdev->clk)) { >> > > + dev_err(>dev, "get enable clock failed\n"); >> > > + return PTR_ERR(sdev->clk); >> > > + } >> > > + >> > > + /* ashb clock is optional for AGCP DMA */ >> > > + sdev->ashb_clk = devm_clk_get(>dev, "ashb_eb"); >> > > + if (IS_ERR(sdev->ashb_clk)) >> > > + dev_warn(>dev, "no optional ashb eb clock\n"); >> > > + >> > > + sdev->irq = platform_get_irq(pdev, 0); >> > > + if (sdev->irq > 0) { >> > > + ret = devm_request_irq(>dev, sdev->irq, dma_irq_handle, >> > > +0,
[PATCH] crc32-pclmul: remove useless relative addressing
In 32-bit mode, the x86 architecture can hold full 32-bit pointers. Therefore, the code that copies the current address to the %ecx register and uses %ecx-relative addressing is useless, we could just use absolute addressing. The processors have a stack of return addresses for branch prediction. If we use a call instruction and pop the return address, it desynchronizes the return stack and causes branch prediction misses. This patch also moves the data to the .rodata section. Signed-off-by: Mikulas Patocka--- arch/x86/crypto/crc32-pclmul_asm.S | 17 ++--- 1 file changed, 6 insertions(+), 11 deletions(-) Index: linux-2.6/arch/x86/crypto/crc32-pclmul_asm.S === --- linux-2.6.orig/arch/x86/crypto/crc32-pclmul_asm.S +++ linux-2.6/arch/x86/crypto/crc32-pclmul_asm.S @@ -41,6 +41,7 @@ #include +.section .rodata .align 16 /* * [x4*128+32 mod P(x) << 32)]' << 1 = 0x154442bd4 @@ -111,19 +112,13 @@ ENTRY(crc32_pclmul_le_16) /* buffer and pxorCONSTANT, %xmm1 sub $0x40, LEN add $0x40, BUF -#ifndef __x86_64__ - /* This is for position independent code(-fPIC) support for 32bit */ - calldelta -delta: - pop %ecx -#endif cmp $0x40, LEN jb less_64 #ifdef __x86_64__ movdqa .Lconstant_R2R1(%rip), CONSTANT #else - movdqa .Lconstant_R2R1 - delta(%ecx), CONSTANT + movdqa .Lconstant_R2R1, CONSTANT #endif loop_64:/* 64 bytes Full cache line folding */ @@ -172,7 +167,7 @@ less_64:/* Folding cache line into 128b #ifdef __x86_64__ movdqa .Lconstant_R4R3(%rip), CONSTANT #else - movdqa .Lconstant_R4R3 - delta(%ecx), CONSTANT + movdqa .Lconstant_R4R3, CONSTANT #endif prefetchnta (BUF) @@ -220,8 +215,8 @@ fold_64: movdqa .Lconstant_R5(%rip), CONSTANT movdqa .Lconstant_mask32(%rip), %xmm3 #else - movdqa .Lconstant_R5 - delta(%ecx), CONSTANT - movdqa .Lconstant_mask32 - delta(%ecx), %xmm3 + movdqa .Lconstant_R5, CONSTANT + movdqa .Lconstant_mask32, %xmm3 #endif psrldq $0x04, %xmm2 pand%xmm3, %xmm1 @@ -232,7 +227,7 @@ fold_64: #ifdef __x86_64__ movdqa .Lconstant_RUpoly(%rip), CONSTANT #else - movdqa .Lconstant_RUpoly - delta(%ecx), CONSTANT + movdqa .Lconstant_RUpoly, CONSTANT #endif movdqa %xmm1, %xmm2 pand%xmm3, %xmm1
[PATCH] crc32-pclmul: remove useless relative addressing
In 32-bit mode, the x86 architecture can hold full 32-bit pointers. Therefore, the code that copies the current address to the %ecx register and uses %ecx-relative addressing is useless, we could just use absolute addressing. The processors have a stack of return addresses for branch prediction. If we use a call instruction and pop the return address, it desynchronizes the return stack and causes branch prediction misses. This patch also moves the data to the .rodata section. Signed-off-by: Mikulas Patocka --- arch/x86/crypto/crc32-pclmul_asm.S | 17 ++--- 1 file changed, 6 insertions(+), 11 deletions(-) Index: linux-2.6/arch/x86/crypto/crc32-pclmul_asm.S === --- linux-2.6.orig/arch/x86/crypto/crc32-pclmul_asm.S +++ linux-2.6/arch/x86/crypto/crc32-pclmul_asm.S @@ -41,6 +41,7 @@ #include +.section .rodata .align 16 /* * [x4*128+32 mod P(x) << 32)]' << 1 = 0x154442bd4 @@ -111,19 +112,13 @@ ENTRY(crc32_pclmul_le_16) /* buffer and pxorCONSTANT, %xmm1 sub $0x40, LEN add $0x40, BUF -#ifndef __x86_64__ - /* This is for position independent code(-fPIC) support for 32bit */ - calldelta -delta: - pop %ecx -#endif cmp $0x40, LEN jb less_64 #ifdef __x86_64__ movdqa .Lconstant_R2R1(%rip), CONSTANT #else - movdqa .Lconstant_R2R1 - delta(%ecx), CONSTANT + movdqa .Lconstant_R2R1, CONSTANT #endif loop_64:/* 64 bytes Full cache line folding */ @@ -172,7 +167,7 @@ less_64:/* Folding cache line into 128b #ifdef __x86_64__ movdqa .Lconstant_R4R3(%rip), CONSTANT #else - movdqa .Lconstant_R4R3 - delta(%ecx), CONSTANT + movdqa .Lconstant_R4R3, CONSTANT #endif prefetchnta (BUF) @@ -220,8 +215,8 @@ fold_64: movdqa .Lconstant_R5(%rip), CONSTANT movdqa .Lconstant_mask32(%rip), %xmm3 #else - movdqa .Lconstant_R5 - delta(%ecx), CONSTANT - movdqa .Lconstant_mask32 - delta(%ecx), %xmm3 + movdqa .Lconstant_R5, CONSTANT + movdqa .Lconstant_mask32, %xmm3 #endif psrldq $0x04, %xmm2 pand%xmm3, %xmm1 @@ -232,7 +227,7 @@ fold_64: #ifdef __x86_64__ movdqa .Lconstant_RUpoly(%rip), CONSTANT #else - movdqa .Lconstant_RUpoly - delta(%ecx), CONSTANT + movdqa .Lconstant_RUpoly, CONSTANT #endif movdqa %xmm1, %xmm2 pand%xmm3, %xmm1
Re: [PATCH v8 06/13] x86/apic: Mark the apic_intr_mode extern for sanity check cleanup
On 09/07/17 at 10:27am, Dou Liyang wrote: > > > At 09/06/2017 04:03 PM, Baoquan He wrote: > > On 09/06/17 at 01:41pm, Dou Liyang wrote: > > > Hi Baoquan, > > > > > > At 09/06/2017 01:26 PM, Baoquan He wrote: > > > [...] > > > diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c > > > index 4f63afc..9f8479c 100644 > > > --- a/arch/x86/kernel/smpboot.c > > > +++ b/arch/x86/kernel/smpboot.c > > > @@ -1260,8 +1260,10 @@ static void __init smp_get_logical_apicid(void) > > > } > > > > > > /* > > > - * Prepare for SMP bootup. The MP table or ACPI has been read > > > - * earlier. Just do some sanity checking here and enable APIC mode. > > > + * Prepare for SMP bootup. > > > + * > > > + * @max_cpus: configured maximum number of CPUs > > > + * It don't be used, but other arch also have this hook, so keep it. > > How about: > > @max_cpus: configured maximum number of CPUs > It is a legacy parameter for common interface support Looks good to me. > > > > > Yeah, makes sense. Above words can be reconsidered. > > > > >
Re: [PATCH v8 06/13] x86/apic: Mark the apic_intr_mode extern for sanity check cleanup
On 09/07/17 at 10:27am, Dou Liyang wrote: > > > At 09/06/2017 04:03 PM, Baoquan He wrote: > > On 09/06/17 at 01:41pm, Dou Liyang wrote: > > > Hi Baoquan, > > > > > > At 09/06/2017 01:26 PM, Baoquan He wrote: > > > [...] > > > diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c > > > index 4f63afc..9f8479c 100644 > > > --- a/arch/x86/kernel/smpboot.c > > > +++ b/arch/x86/kernel/smpboot.c > > > @@ -1260,8 +1260,10 @@ static void __init smp_get_logical_apicid(void) > > > } > > > > > > /* > > > - * Prepare for SMP bootup. The MP table or ACPI has been read > > > - * earlier. Just do some sanity checking here and enable APIC mode. > > > + * Prepare for SMP bootup. > > > + * > > > + * @max_cpus: configured maximum number of CPUs > > > + * It don't be used, but other arch also have this hook, so keep it. > > How about: > > @max_cpus: configured maximum number of CPUs > It is a legacy parameter for common interface support Looks good to me. > > > > > Yeah, makes sense. Above words can be reconsidered. > > > > >
linux-next: build warning after merge of the tpmdd tree
Hi Jarkko, After merging the tpmdd tree, today's linux-next build (x86_64 allmodconfig) produced this warning: drivers/char/tpm/tpm_tis_core.c: In function 'probe_itpm': drivers/char/tpm/tpm_tis_core.c:469:31: warning: passing argument 2 of 'tpm_tis_send_data' discards 'const' qualifier from pointer target type [-Wdiscarded-qualifiers] rc = tpm_tis_send_data(chip, cmd_getticks, len); ^ drivers/char/tpm/tpm_tis_core.c:255:12: note: expected 'u8 * {aka unsigned char *}' but argument is of type 'const u8 * {aka const unsigned char *}' static int tpm_tis_send_data(struct tpm_chip *chip, u8 *buf, size_t len) ^ drivers/char/tpm/tpm_tis_core.c:477:31: warning: passing argument 2 of 'tpm_tis_send_data' discards 'const' qualifier from pointer target type [-Wdiscarded-qualifiers] rc = tpm_tis_send_data(chip, cmd_getticks, len); ^ drivers/char/tpm/tpm_tis_core.c:255:12: note: expected 'u8 * {aka unsigned char *}' but argument is of type 'const u8 * {aka const unsigned char *}' static int tpm_tis_send_data(struct tpm_chip *chip, u8 *buf, size_t len) ^ Introduced by commit 5e35bd8e06b9 ("tpm_tis: make array cmd_getticks static const to shink object code size") -- Cheers, Stephen Rothwell
linux-next: build warning after merge of the tpmdd tree
Hi Jarkko, After merging the tpmdd tree, today's linux-next build (x86_64 allmodconfig) produced this warning: drivers/char/tpm/tpm_tis_core.c: In function 'probe_itpm': drivers/char/tpm/tpm_tis_core.c:469:31: warning: passing argument 2 of 'tpm_tis_send_data' discards 'const' qualifier from pointer target type [-Wdiscarded-qualifiers] rc = tpm_tis_send_data(chip, cmd_getticks, len); ^ drivers/char/tpm/tpm_tis_core.c:255:12: note: expected 'u8 * {aka unsigned char *}' but argument is of type 'const u8 * {aka const unsigned char *}' static int tpm_tis_send_data(struct tpm_chip *chip, u8 *buf, size_t len) ^ drivers/char/tpm/tpm_tis_core.c:477:31: warning: passing argument 2 of 'tpm_tis_send_data' discards 'const' qualifier from pointer target type [-Wdiscarded-qualifiers] rc = tpm_tis_send_data(chip, cmd_getticks, len); ^ drivers/char/tpm/tpm_tis_core.c:255:12: note: expected 'u8 * {aka unsigned char *}' but argument is of type 'const u8 * {aka const unsigned char *}' static int tpm_tis_send_data(struct tpm_chip *chip, u8 *buf, size_t len) ^ Introduced by commit 5e35bd8e06b9 ("tpm_tis: make array cmd_getticks static const to shink object code size") -- Cheers, Stephen Rothwell
Re: [PATCH v8 06/13] x86/apic: Mark the apic_intr_mode extern for sanity check cleanup
At 09/06/2017 04:03 PM, Baoquan He wrote: On 09/06/17 at 01:41pm, Dou Liyang wrote: Hi Baoquan, At 09/06/2017 01:26 PM, Baoquan He wrote: [...] diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c index 4f63afc..9f8479c 100644 --- a/arch/x86/kernel/smpboot.c +++ b/arch/x86/kernel/smpboot.c @@ -1260,8 +1260,10 @@ static void __init smp_get_logical_apicid(void) } /* - * Prepare for SMP bootup. The MP table or ACPI has been read - * earlier. Just do some sanity checking here and enable APIC mode. + * Prepare for SMP bootup. + * + * @max_cpus: configured maximum number of CPUs + * It don't be used, but other arch also have this hook, so keep it. How about: @max_cpus: configured maximum number of CPUs It is a legacy parameter for common interface support Yeah, makes sense. Above words can be reconsidered.
Re: [PATCH v8 06/13] x86/apic: Mark the apic_intr_mode extern for sanity check cleanup
At 09/06/2017 04:03 PM, Baoquan He wrote: On 09/06/17 at 01:41pm, Dou Liyang wrote: Hi Baoquan, At 09/06/2017 01:26 PM, Baoquan He wrote: [...] diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c index 4f63afc..9f8479c 100644 --- a/arch/x86/kernel/smpboot.c +++ b/arch/x86/kernel/smpboot.c @@ -1260,8 +1260,10 @@ static void __init smp_get_logical_apicid(void) } /* - * Prepare for SMP bootup. The MP table or ACPI has been read - * earlier. Just do some sanity checking here and enable APIC mode. + * Prepare for SMP bootup. + * + * @max_cpus: configured maximum number of CPUs + * It don't be used, but other arch also have this hook, so keep it. How about: @max_cpus: configured maximum number of CPUs It is a legacy parameter for common interface support Yeah, makes sense. Above words can be reconsidered.
Re: [PATCH] hwmon: pmbus: Make reg check and clear faults functions return errors
On 09/06/2017 04:32 PM, Andrew Jeffery wrote: On Wed, 2017-09-06 at 15:51 -0700, Guenter Roeck wrote: On Wed, Sep 06, 2017 at 10:23:37AM +1000, Andrew Jeffery wrote: On Tue, 2017-09-05 at 10:00 -0700, Guenter Roeck wrote: On Tue, Sep 05, 2017 at 05:01:32PM +1000, Andrew Jeffery wrote: Some functions exposed by pmbus core conflated errors that occurred when setting the page to access with errors that occurred when accessing registers in a page. In some cases, this caused legitimate errors to be hidden under the guise of the register not being supported. I'll have to look into this. If I recall correctly, the reason for not returning errors from the "clear faults" command was that some early chips did not support it, or did not support it on all pages. Those chips would now always fail. Yes, that is a concern. However, shouldn't the lack of support for CLEAR_FAULTS be a described property of the chip or page? Regardless, the issue here is the PAGE command is sometimes failing (more below). This means we won't have issued a CLEAR_FAULTS against the page even if the page supports CLEAR_FAULTS. That to me is something that should be propagated back up the call chain, as faults that can be cleared will not have been. We could also consider - retry I guess that leads to the usual retries problem: How many? Oneshot? More? My expectation is that oneshot would be enough, but we'd still need to consider what to do if that wasn't successful. Does the retry policy need to be in the kernel? I guess if it's not we would need all operations in the path to the failure to be idempotent. - Add a marker indicating that faults (specifically command errors) are unreliable after clearing faults failed What kind of marker were you thinking? Something in dmesg? If it's anything else we'd probably want something to respond to the condition, which nothing would do currently. If we can't reliably clear faults, all bets are off. Yeah, it's a messy problem. However even if we resolve our issues in hardware (i.e. it's discovered that it is a design failure or something), I don't think we should drop a resolution to the problem highlighted by this patch. On a higher level, I am not sure if it is a good idea to return an error from a function intended to clear faults (and nothing else). That seems counterproductive. Is this a problem you have actually observed ? Unfortunately yes. I was trying to reproduce some issues that we are seeing in userspace but wasn't having much luck (getting -EIO on reading a hwmon attribute). I ended up instrumenting our I2C bus driver, and found that the problem was more prevalent than what the read failures in userspace were indicating. The paths that were triggering these unreported conditions all traced through the check register and clear fault functions, which on analysis were swallowing the error. If so, what is the chip ? Well, the chip that I'm *communicating* with is the Maxim MAX31785 Intelligent Fan Controller. As to whether that's what is *causing* the PAGE command to fail is still up in the air. I would not be surprised if we have other issues in the hardware design. I haven't sent a follow-up to the series introducing the MAX31785 driver because I've been chasing down communication issues. I felt it was important to understand whether the device has quirks that need to be worked around in the driver, or if our hardware design has bigger problems that should be handled in other ways. I've been in touch with Maxim who have asserted that some of the problems we're seeing cannot be caused by their chip. Guess I need to dig up my eval board and see if I can reproduce the problem. Seems you are saying that the problem is always seen when issuing a sequence of "clear faults" commands on multiple pages ? Yeah. We're also seeing bad behaviour under other command sequences as well, which lead to this hack of a work-around patch[1]. I'd be very interested in the results of testing against the eval board. I don't have access to one and it seems Maxim have discontinued them. [1] https://patchwork.kernel.org/patch/9876083/ The MAX31785 is microcode based, so I would not be entirely surprised if it sometimes fails to reply if a sequence of commands is executed. Have you seen this behaviour in other microcode-based chips? ltc2978 (commit e04d1ce9bbb) and most of the Zilker Labs chips (zl6100.c). You could try the approach I used in the zl6100 driver: Add a minimum time between accesses to the chip. I would probably no longer use udelay(), though, but usleeep_range(), if I were to implement the code today. If possible, you might try reducing the i2c clock. I have not seen that with Maxim chips, but some other PMBus chips don't operate reliably at 400 KHz. It's already running at 100kHz, which is the maximum clock rate the device is specified for. I've even underclocked the bus to 75kHz and still observed
Re: [PATCH] hwmon: pmbus: Make reg check and clear faults functions return errors
On 09/06/2017 04:32 PM, Andrew Jeffery wrote: On Wed, 2017-09-06 at 15:51 -0700, Guenter Roeck wrote: On Wed, Sep 06, 2017 at 10:23:37AM +1000, Andrew Jeffery wrote: On Tue, 2017-09-05 at 10:00 -0700, Guenter Roeck wrote: On Tue, Sep 05, 2017 at 05:01:32PM +1000, Andrew Jeffery wrote: Some functions exposed by pmbus core conflated errors that occurred when setting the page to access with errors that occurred when accessing registers in a page. In some cases, this caused legitimate errors to be hidden under the guise of the register not being supported. I'll have to look into this. If I recall correctly, the reason for not returning errors from the "clear faults" command was that some early chips did not support it, or did not support it on all pages. Those chips would now always fail. Yes, that is a concern. However, shouldn't the lack of support for CLEAR_FAULTS be a described property of the chip or page? Regardless, the issue here is the PAGE command is sometimes failing (more below). This means we won't have issued a CLEAR_FAULTS against the page even if the page supports CLEAR_FAULTS. That to me is something that should be propagated back up the call chain, as faults that can be cleared will not have been. We could also consider - retry I guess that leads to the usual retries problem: How many? Oneshot? More? My expectation is that oneshot would be enough, but we'd still need to consider what to do if that wasn't successful. Does the retry policy need to be in the kernel? I guess if it's not we would need all operations in the path to the failure to be idempotent. - Add a marker indicating that faults (specifically command errors) are unreliable after clearing faults failed What kind of marker were you thinking? Something in dmesg? If it's anything else we'd probably want something to respond to the condition, which nothing would do currently. If we can't reliably clear faults, all bets are off. Yeah, it's a messy problem. However even if we resolve our issues in hardware (i.e. it's discovered that it is a design failure or something), I don't think we should drop a resolution to the problem highlighted by this patch. On a higher level, I am not sure if it is a good idea to return an error from a function intended to clear faults (and nothing else). That seems counterproductive. Is this a problem you have actually observed ? Unfortunately yes. I was trying to reproduce some issues that we are seeing in userspace but wasn't having much luck (getting -EIO on reading a hwmon attribute). I ended up instrumenting our I2C bus driver, and found that the problem was more prevalent than what the read failures in userspace were indicating. The paths that were triggering these unreported conditions all traced through the check register and clear fault functions, which on analysis were swallowing the error. If so, what is the chip ? Well, the chip that I'm *communicating* with is the Maxim MAX31785 Intelligent Fan Controller. As to whether that's what is *causing* the PAGE command to fail is still up in the air. I would not be surprised if we have other issues in the hardware design. I haven't sent a follow-up to the series introducing the MAX31785 driver because I've been chasing down communication issues. I felt it was important to understand whether the device has quirks that need to be worked around in the driver, or if our hardware design has bigger problems that should be handled in other ways. I've been in touch with Maxim who have asserted that some of the problems we're seeing cannot be caused by their chip. Guess I need to dig up my eval board and see if I can reproduce the problem. Seems you are saying that the problem is always seen when issuing a sequence of "clear faults" commands on multiple pages ? Yeah. We're also seeing bad behaviour under other command sequences as well, which lead to this hack of a work-around patch[1]. I'd be very interested in the results of testing against the eval board. I don't have access to one and it seems Maxim have discontinued them. [1] https://patchwork.kernel.org/patch/9876083/ The MAX31785 is microcode based, so I would not be entirely surprised if it sometimes fails to reply if a sequence of commands is executed. Have you seen this behaviour in other microcode-based chips? ltc2978 (commit e04d1ce9bbb) and most of the Zilker Labs chips (zl6100.c). You could try the approach I used in the zl6100 driver: Add a minimum time between accesses to the chip. I would probably no longer use udelay(), though, but usleeep_range(), if I were to implement the code today. If possible, you might try reducing the i2c clock. I have not seen that with Maxim chips, but some other PMBus chips don't operate reliably at 400 KHz. It's already running at 100kHz, which is the maximum clock rate the device is specified for. I've even underclocked the bus to 75kHz and still observed
Re: [HMM-v25 19/19] mm/hmm: add new helper to hotplug CDM memory region v3
On 2017/9/6 10:12, Jerome Glisse wrote: > On Wed, Sep 06, 2017 at 09:25:36AM +0800, Bob Liu wrote: >> On 2017/9/6 2:54, Ross Zwisler wrote: >>> On Mon, Sep 04, 2017 at 10:38:27PM -0400, Jerome Glisse wrote: On Tue, Sep 05, 2017 at 09:13:24AM +0800, Bob Liu wrote: > On 2017/9/4 23:51, Jerome Glisse wrote: >> On Mon, Sep 04, 2017 at 11:09:14AM +0800, Bob Liu wrote: >>> On 2017/8/17 8:05, Jérôme Glisse wrote: Unlike unaddressable memory, coherent device memory has a real resource associated with it on the system (as CPU can address it). Add a new helper to hotplug such memory within the HMM framework. >>> >>> Got an new question, coherent device( e.g CCIX) memory are likely >>> reported to OS >>> through ACPI and recognized as NUMA memory node. >>> Then how can their memory be captured and managed by HMM framework? >>> >> >> Only platform that has such memory today is powerpc and it is not >> reported >> as regular memory by the firmware hence why they need this helper. >> >> I don't think anyone has defined anything yet for x86 and acpi. As this >> is > > Not yet, but now the ACPI spec has Heterogeneous Memory Attribute > Table (HMAT) table defined in ACPI 6.2. > The HMAT can cover CPU-addressable memory types(though not non-cache > coherent on-device memory). > > Ross from Intel already done some work on this, see: > https://lwn.net/Articles/724562/ > > arm64 supports APCI also, there is likely more this kind of device when > CCIX > is out (should be very soon if on schedule). HMAT is not for the same thing, AFAIK HMAT is for deep "hierarchy" memory ie when you have several kind of memory each with different characteristics: - HBM very fast (latency) and high bandwidth, non persistent, somewhat small (ie few giga bytes) - Persistent memory, slower (both latency and bandwidth) big (tera bytes) - DDR (good old memory) well characteristics are between HBM and persistent So AFAICT this has nothing to do with what HMM is for, ie device memory. Note that device memory can have a hierarchy of memory themself (HBM, GDDR and in maybe even persistent memory). >> memory on PCIE like interface then i don't expect it to be reported as >> NUMA >> memory node but as io range like any regular PCIE resources. Device >> driver >> through capabilities flags would then figure out if the link between the >> device and CPU is CCIX capable if so it can use this helper to hotplug it >> as device memory. >> > > From my point of view, Cache coherent device memory will popular soon and > reported through ACPI/UEFI. Extending NUMA policy still sounds more > reasonable > to me. Cache coherent device will be reported through standard mecanisms defined by the bus standard they are using. To my knowledge all the standard are either on top of PCIE or are similar to PCIE. It is true that on many platform PCIE resource is manage/initialize by the bios (UEFI) but it is platform specific. In some case we reprogram what the bios pick. So like i was saying i don't expect the BIOS/UEFI to report device memory as regular memory. It will be reported as a regular PCIE resources and then the device driver will be able to determine through some flags if the link between the CPU(s) and the device is cache coherent or not. At that point the device driver can use register it with HMM helper. The whole NUMA discussion happen several time in the past i suggest looking on mm list archive for them. But it was rule out for several reasons. Top of my head: - people hate CPU less node and device memory is inherently CPU less >>> >>> With the introduction of the HMAT in ACPI 6.2 one of the things that was >>> added >>> was the ability to have an ACPI proximity domain that isn't associated with >>> a >>> CPU. This can be seen in the changes in the text of the "Proximity Domain" >>> field in table 5-73 which describes the "Memory Affinity Structure". One of >>> the major features of the HMAT was the separation of "Initiator" proximity >>> domains (CPUs, devices that initiate memory transfers), and "target" >>> proximity >>> domains (memory regions, be they attached to a CPU or some other device). >>> >>> ACPI proximity domains map directly to Linux NUMA nodes, so I think we're >>> already in a place where we have to support CPU-less NUMA nodes. >>> - device driver want total control over memory and thus to be isolated from mm mecanism and doing all those special cases was not welcome >>> >>> I agree that the kernel doesn't have enough information to be able to >>> accurately handle
Re: [HMM-v25 19/19] mm/hmm: add new helper to hotplug CDM memory region v3
On 2017/9/6 10:12, Jerome Glisse wrote: > On Wed, Sep 06, 2017 at 09:25:36AM +0800, Bob Liu wrote: >> On 2017/9/6 2:54, Ross Zwisler wrote: >>> On Mon, Sep 04, 2017 at 10:38:27PM -0400, Jerome Glisse wrote: On Tue, Sep 05, 2017 at 09:13:24AM +0800, Bob Liu wrote: > On 2017/9/4 23:51, Jerome Glisse wrote: >> On Mon, Sep 04, 2017 at 11:09:14AM +0800, Bob Liu wrote: >>> On 2017/8/17 8:05, Jérôme Glisse wrote: Unlike unaddressable memory, coherent device memory has a real resource associated with it on the system (as CPU can address it). Add a new helper to hotplug such memory within the HMM framework. >>> >>> Got an new question, coherent device( e.g CCIX) memory are likely >>> reported to OS >>> through ACPI and recognized as NUMA memory node. >>> Then how can their memory be captured and managed by HMM framework? >>> >> >> Only platform that has such memory today is powerpc and it is not >> reported >> as regular memory by the firmware hence why they need this helper. >> >> I don't think anyone has defined anything yet for x86 and acpi. As this >> is > > Not yet, but now the ACPI spec has Heterogeneous Memory Attribute > Table (HMAT) table defined in ACPI 6.2. > The HMAT can cover CPU-addressable memory types(though not non-cache > coherent on-device memory). > > Ross from Intel already done some work on this, see: > https://lwn.net/Articles/724562/ > > arm64 supports APCI also, there is likely more this kind of device when > CCIX > is out (should be very soon if on schedule). HMAT is not for the same thing, AFAIK HMAT is for deep "hierarchy" memory ie when you have several kind of memory each with different characteristics: - HBM very fast (latency) and high bandwidth, non persistent, somewhat small (ie few giga bytes) - Persistent memory, slower (both latency and bandwidth) big (tera bytes) - DDR (good old memory) well characteristics are between HBM and persistent So AFAICT this has nothing to do with what HMM is for, ie device memory. Note that device memory can have a hierarchy of memory themself (HBM, GDDR and in maybe even persistent memory). >> memory on PCIE like interface then i don't expect it to be reported as >> NUMA >> memory node but as io range like any regular PCIE resources. Device >> driver >> through capabilities flags would then figure out if the link between the >> device and CPU is CCIX capable if so it can use this helper to hotplug it >> as device memory. >> > > From my point of view, Cache coherent device memory will popular soon and > reported through ACPI/UEFI. Extending NUMA policy still sounds more > reasonable > to me. Cache coherent device will be reported through standard mecanisms defined by the bus standard they are using. To my knowledge all the standard are either on top of PCIE or are similar to PCIE. It is true that on many platform PCIE resource is manage/initialize by the bios (UEFI) but it is platform specific. In some case we reprogram what the bios pick. So like i was saying i don't expect the BIOS/UEFI to report device memory as regular memory. It will be reported as a regular PCIE resources and then the device driver will be able to determine through some flags if the link between the CPU(s) and the device is cache coherent or not. At that point the device driver can use register it with HMM helper. The whole NUMA discussion happen several time in the past i suggest looking on mm list archive for them. But it was rule out for several reasons. Top of my head: - people hate CPU less node and device memory is inherently CPU less >>> >>> With the introduction of the HMAT in ACPI 6.2 one of the things that was >>> added >>> was the ability to have an ACPI proximity domain that isn't associated with >>> a >>> CPU. This can be seen in the changes in the text of the "Proximity Domain" >>> field in table 5-73 which describes the "Memory Affinity Structure". One of >>> the major features of the HMAT was the separation of "Initiator" proximity >>> domains (CPUs, devices that initiate memory transfers), and "target" >>> proximity >>> domains (memory regions, be they attached to a CPU or some other device). >>> >>> ACPI proximity domains map directly to Linux NUMA nodes, so I think we're >>> already in a place where we have to support CPU-less NUMA nodes. >>> - device driver want total control over memory and thus to be isolated from mm mecanism and doing all those special cases was not welcome >>> >>> I agree that the kernel doesn't have enough information to be able to >>> accurately handle