date:20201225

Re: [PATCH] arch/x86: Propagate $(CLANG_FLAGS) to $(REALMODE_FLAGS)

2020-12-25 Thread Nathan Chancellor

On Fri, Dec 25, 2020 at 11:35:28PM -0800, h...@zytor.com wrote:
> On December 25, 2020 11:29:30 PM PST, John Millikin  
> wrote:
> >When compiling with Clang, the `$(CLANG_FLAGS)' variable contains
> >additional flags needed to cross-compile C and Assembly sources:

I am not sure how or if others agree but it took me a second to realize
the purpose of this change was cross compiling even though I read the
commit message so I think it should be called out a bit more. I would
argue that it is not very common to see x86 cross compiled (I know
Stephen Rothwell does) :) x86 is one of the most tested architectures
for building with clang and we have see no recent failures in the boot
or realmode code.

> >* `-no-integrated-as' tells clang to assemble with GNU Assembler
> >  instead of its built-in LLVM assembler. This flag is set by default
> >  unless `LLVM_IAS=1' is set, because the LLVM assembler can't yet
> >  parse certain GNU extensions.
> >
> >* `--target' sets the target architecture when cross-compiling. This
> >  flag must be set for both compilation and assembly (`KBUILD_AFLAGS')
> >  to support architecture-specific assembler directives.
> >
> >Signed-off-by: John Millikin 
> >---
> > arch/x86/Makefile | 5 +
> > 1 file changed, 5 insertions(+)
> >
> >diff --git a/arch/x86/Makefile b/arch/x86/Makefile
> >index 7116da3980be..725c65532482 100644
> >--- a/arch/x86/Makefile
> >+++ b/arch/x86/Makefile
> >@@ -33,6 +33,11 @@ REALMODE_CFLAGS += -ffreestanding
> > REALMODE_CFLAGS += -fno-stack-protector
> > REALMODE_CFLAGS += $(call __cc-option, $(CC), $(REALMODE_CFLAGS),
> >-Wno-address-of-packed-member)
> > REALMODE_CFLAGS += $(call __cc-option, $(CC), $(REALMODE_CFLAGS),
> >$(cc_stack_align4))
> >+
> >+ifdef CONFIG_CC_IS_CLANG
> >+REALMODE_CFLAGS += $(CLANG_FLAGS)
> >+endif
> >+
> > export REALMODE_CFLAGS
> > 
> > # BITS is used as extension for files which are available in a 32 bit
> 
> Why is CLANG_FLAGS non-null when unused? It would be better to centralize 
> that.

It isn't.

$ rg "CLANG_FLAGS :=" Makefile
507:CLANG_FLAGS :=

$ rg "CLANG_FLAGS\t" Makefile
564:CLANG_FLAGS += --target=$(notdir $(CROSS_COMPILE:%-=%))
566:CLANG_FLAGS += --prefix=$(GCC_TOOLCHAIN_DIR)$(notdir $(CROSS_COMPILE))
570:CLANG_FLAGS += --gcc-toolchain=$(GCC_TOOLCHAIN)
573:CLANG_FLAGS += -no-integrated-as
575:CLANG_FLAGS += -Werror=unknown-warning-option

The ifdef can be dropped and unconditonally add CLANG_FLAGS to
REALMODE_CFLAGS, as is done in a few arch directories:

$ rg "\(CLANG_FLAGS\)" arch | cat
arch/s390/purgatory/Makefile:KBUILD_CFLAGS += $(CLANG_FLAGS)
arch/s390/Makefile:KBUILD_AFLAGS_DECOMPRESSOR := $(CLANG_FLAGS) -m64 
-D__ASSEMBLY__
arch/s390/Makefile:KBUILD_CFLAGS_DECOMPRESSOR := $(CLANG_FLAGS) -m64 -O2
arch/powerpc/boot/Makefile:BOOTCFLAGS += $(CLANG_FLAGS)
arch/powerpc/boot/Makefile:BOOTAFLAGS += $(CLANG_FLAGS)

Cheers,
Nathan

Re: [PATCH] arch/x86: Propagate $(CLANG_FLAGS) to $(REALMODE_FLAGS)

2020-12-25 Thread John Millikin

On 12/26/20 16:35, h...@zytor.com wrote:
> Why is CLANG_FLAGS non-null when unused? It would be better to centralize 
> that.
CLANG_FLAGS normally propagates through inclusion in the default KBUILD_CFLAGS 
and KBUILD_AFLAGS, set in `/Makefile':

    # Makefile
    KBUILD_CFLAGS += $(CLANG_FLAGS)
    KBUILD_AFLAGS += $(CLANG_FLAGS)
    export CLANG_FLAGS

This default can be overridden by explicit assignment, as is done in some of 
the arch/x86 makefiles:

    # arch/x86/realmode/rm/Makefile
    KBUILD_CFLAGS    := $(REALMODE_CFLAGS) -D_SETUP -D_WAKEUP \
               -I$(srctree)/arch/x86/boot
    KBUILD_AFLAGS    := $(KBUILD_CFLAGS) -D__ASSEMBLY__
    KBUILD_CFLAGS    += -fno-asynchronous-unwind-tables

Since REALMODE_CFLAGS is being built up from a plain assignment, the Clang 
flags get lost. As a result Clang fails to compile the real-mode code when 
cross-compiling for an x86 target.

    arch/x86/realmode/rm/header.S:36:1: error: unknown directive
    .type real_mode_header STT_OBJECT ; .size real_mode_header, 
.-real_mode_header
    ^
    arch/x86/realmode/rm/header.S:36:37: error: unknown directive
    .type real_mode_header STT_OBJECT ; .size real_mode_header, 
.-real_mode_header
            ^
    arch/x86/realmode/rm/header.S:41:62: error: unknown directive
    .globl end_signature ; ; end_signature: ; .long 0x65a22c82 ; .type 
end_signature STT_OBJECT ; .size end_signature, .-end_signature
         ^
    arch/x86/realmode/rm/header.S:41:95: error: unknown directive
    .globl end_signature ; ; end_signature: ; .long 0x65a22c82 ; .type 
end_signature STT_OBJECT ; .size end_signature, .-end_signature
    
    ^

This patch allows the Clang-specific flags to propagate through the 
REALMODE_CFLAGS variable set in `arch/x86/Makefile' and consumed by certain 
arch/x86 targets, which fixes cross-compilation of x86 kernels.

Re: [PATCH] ALSA: hda: Resume codec for system suspend if LED is controlled by codec

2020-12-25 Thread Takashi Iwai

On Fri, 25 Dec 2020 17:47:23 +0100,
Kai-Heng Feng wrote:
> 
> Laptop with codec controlled LEDs takes a very long time to suspend
> after commit 215a22ed31a1 ("ALSA: hda: Refactor codec PM to use
> direct-complete optimization"):
> [   90.065964] PM: suspend entry (s2idle)
> [   90.067337] Filesystems sync: 0.001 seconds
> [   90.185758] Freezing user space processes ... (elapsed 0.002 seconds) done.
> [   90.188713] OOM killer disabled.
> [   90.188714] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) 
> done.
> [   90.190024] printk: Suspending console(s) (use no_console_suspend to debug)
> [   90.904912] intel_pch_thermal :00:12.0: CPU-PCH is cool [49C], 
> continue to suspend
> [  321.262505] snd_hda_codec_realtek ehdaudio0D0: Unable to sync register 
> 0x2b8000. -5
> [  328.426919] snd_hda_codec_realtek ehdaudio0D0: Unable to sync register 
> 0x2b8000. -5
> [  329.490933] ACPI: EC: interrupt blocked
> 
> That commit keeps codec suspended during the system suspend. This
> doesn't play well with codec controlled mute and micmute LEDs, because
> LED core will use codec registers to turn off those LEDs.
> 
> Currently, all users of create_mute_led_cdev() use codec to control
> LED, so unconditionally runtime resume those codecs before system
> suspend to solve the problem.
> 
> Fixes: 215a22ed31a1 ("ALSA: hda: Refactor codec PM to use direct-complete 
> optimization")
> Signed-off-by: Kai-Heng Feng 

A puzzling point is that it's applied only to the cases with the led
cdev.  Or does it basically apply for the ASoC controller?
That is, if you run with legacy HDA (snd-intel-dspcfg.dsp_driver=1),
does the issue appear as well on those machines?


thanks,

Takashi

> ---
>  include/sound/hda_codec.h   | 1 +
>  sound/pci/hda/hda_codec.c   | 7 +++
>  sound/pci/hda/hda_generic.c | 1 +
>  3 files changed, 9 insertions(+)
> 
> diff --git a/include/sound/hda_codec.h b/include/sound/hda_codec.h
> index 2e8d51937acd..b01d76abe008 100644
> --- a/include/sound/hda_codec.h
> +++ b/include/sound/hda_codec.h
> @@ -255,6 +255,7 @@ struct hda_codec {
>   unsigned int relaxed_resume:1;  /* don't resume forcibly for jack */
>   unsigned int forced_resume:1; /* forced resume for jack */
>   unsigned int mst_no_extra_pcms:1; /* no backup PCMs for DP-MST */
> + unsigned int resume_for_sleep:1;  /* runtime resume for system sleep */
>  
>  #ifdef CONFIG_PM
>   unsigned long power_on_acct;
> diff --git a/sound/pci/hda/hda_codec.c b/sound/pci/hda/hda_codec.c
> index 687216e74526..b890d9b4339e 100644
> --- a/sound/pci/hda/hda_codec.c
> +++ b/sound/pci/hda/hda_codec.c
> @@ -2983,6 +2983,13 @@ static int hda_codec_runtime_resume(struct device *dev)
>  #ifdef CONFIG_PM_SLEEP
>  static int hda_codec_pm_prepare(struct device *dev)
>  {
> + struct hda_codec *codec = dev_to_hda_codec(dev);
> +
> + if (codec->resume_for_sleep) {
> + pm_runtime_resume(dev);
> + return 0;
> + }
> +
>   return pm_runtime_suspended(dev);
>  }
>  
> diff --git a/sound/pci/hda/hda_generic.c b/sound/pci/hda/hda_generic.c
> index 8060cc86dfea..6d259d5bb5bb 100644
> --- a/sound/pci/hda/hda_generic.c
> +++ b/sound/pci/hda/hda_generic.c
> @@ -3913,6 +3913,7 @@ static int create_mute_led_cdev(struct hda_codec *codec,
>   cdev->brightness_set_blocking = callback;
>   cdev->brightness = ledtrig_audio_get(micmute ? LED_AUDIO_MICMUTE : 
> LED_AUDIO_MUTE);
>   cdev->flags = LED_CORE_SUSPENDRESUME;
> + codec->resume_for_sleep = 1;
>  
>   return devm_led_classdev_register(&codec->core.dev, cdev);
>  }
> -- 
> 2.29.2
>

[PATCH] RDMA/usnic: Fix memleak in find_free_vf_and_create_qp_grp

2020-12-25 Thread Dinghao Liu

If usnic_ib_qp_grp_create() fails at the first call, dev_list
will not be freed on error, which leads to memleak.

Signed-off-by: Dinghao Liu 
---
 drivers/infiniband/hw/usnic/usnic_ib_verbs.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/infiniband/hw/usnic/usnic_ib_verbs.c 
b/drivers/infiniband/hw/usnic/usnic_ib_verbs.c
index 38a37770c016..3705c6b8b223 100644
--- a/drivers/infiniband/hw/usnic/usnic_ib_verbs.c
+++ b/drivers/infiniband/hw/usnic/usnic_ib_verbs.c
@@ -214,6 +214,7 @@ find_free_vf_and_create_qp_grp(struct usnic_ib_dev 
*us_ibdev,
 
}
usnic_uiom_free_dev_list(dev_list);
+   dev_list = NULL;
}
 
/* Try to find resources on an unused vf */
@@ -239,6 +240,8 @@ find_free_vf_and_create_qp_grp(struct usnic_ib_dev 
*us_ibdev,
 qp_grp_check:
if (IS_ERR_OR_NULL(qp_grp)) {
usnic_err("Failed to allocate qp_grp\n");
+   if (usnic_ib_share_vf)
+   usnic_uiom_free_dev_list(dev_list);
return ERR_PTR(qp_grp ? PTR_ERR(qp_grp) : -ENOMEM);
}
return qp_grp;
-- 
2.17.1

Re: [PATCH] arch/x86: Propagate $(CLANG_FLAGS) to $(REALMODE_FLAGS)

2020-12-25 Thread hpa

On December 25, 2020 11:29:30 PM PST, John Millikin  wrote:
>When compiling with Clang, the `$(CLANG_FLAGS)' variable contains
>additional flags needed to cross-compile C and Assembly sources:
>
>* `-no-integrated-as' tells clang to assemble with GNU Assembler
>  instead of its built-in LLVM assembler. This flag is set by default
>  unless `LLVM_IAS=1' is set, because the LLVM assembler can't yet
>  parse certain GNU extensions.
>
>* `--target' sets the target architecture when cross-compiling. This
>  flag must be set for both compilation and assembly (`KBUILD_AFLAGS')
>  to support architecture-specific assembler directives.
>
>Signed-off-by: John Millikin 
>---
> arch/x86/Makefile | 5 +
> 1 file changed, 5 insertions(+)
>
>diff --git a/arch/x86/Makefile b/arch/x86/Makefile
>index 7116da3980be..725c65532482 100644
>--- a/arch/x86/Makefile
>+++ b/arch/x86/Makefile
>@@ -33,6 +33,11 @@ REALMODE_CFLAGS += -ffreestanding
> REALMODE_CFLAGS += -fno-stack-protector
> REALMODE_CFLAGS += $(call __cc-option, $(CC), $(REALMODE_CFLAGS),
>-Wno-address-of-packed-member)
> REALMODE_CFLAGS += $(call __cc-option, $(CC), $(REALMODE_CFLAGS),
>$(cc_stack_align4))
>+
>+ifdef CONFIG_CC_IS_CLANG
>+REALMODE_CFLAGS += $(CLANG_FLAGS)
>+endif
>+
> export REALMODE_CFLAGS
> 
> # BITS is used as extension for files which are available in a 32 bit

Why is CLANG_FLAGS non-null when unused? It would be better to centralize that.
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.

[PATCH] arch/x86: Propagate $(CLANG_FLAGS) to $(REALMODE_FLAGS)

2020-12-25 Thread John Millikin

When compiling with Clang, the `$(CLANG_FLAGS)' variable contains
additional flags needed to cross-compile C and Assembly sources:

* `-no-integrated-as' tells clang to assemble with GNU Assembler
  instead of its built-in LLVM assembler. This flag is set by default
  unless `LLVM_IAS=1' is set, because the LLVM assembler can't yet
  parse certain GNU extensions.

* `--target' sets the target architecture when cross-compiling. This
  flag must be set for both compilation and assembly (`KBUILD_AFLAGS')
  to support architecture-specific assembler directives.

Signed-off-by: John Millikin 
---
 arch/x86/Makefile | 5 +
 1 file changed, 5 insertions(+)

diff --git a/arch/x86/Makefile b/arch/x86/Makefile
index 7116da3980be..725c65532482 100644
--- a/arch/x86/Makefile
+++ b/arch/x86/Makefile
@@ -33,6 +33,11 @@ REALMODE_CFLAGS += -ffreestanding
 REALMODE_CFLAGS += -fno-stack-protector
 REALMODE_CFLAGS += $(call __cc-option, $(CC), $(REALMODE_CFLAGS), 
-Wno-address-of-packed-member)
 REALMODE_CFLAGS += $(call __cc-option, $(CC), $(REALMODE_CFLAGS), 
$(cc_stack_align4))
+
+ifdef CONFIG_CC_IS_CLANG
+REALMODE_CFLAGS += $(CLANG_FLAGS)
+endif
+
 export REALMODE_CFLAGS
 
 # BITS is used as extension for files which are available in a 32 bit
-- 
2.29.2

[PATCH] habanalabs: Fix memleak in hl_device_reset

2020-12-25 Thread Dinghao Liu

When kzalloc() fails, we should execute hl_mmu_fini()
to release the MMU module. It's the same when
hl_ctx_init() fails.

Signed-off-by: Dinghao Liu 
---
 drivers/misc/habanalabs/common/device.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/misc/habanalabs/common/device.c 
b/drivers/misc/habanalabs/common/device.c
index 5871162a8442..e5028890ead1 100644
--- a/drivers/misc/habanalabs/common/device.c
+++ b/drivers/misc/habanalabs/common/device.c
@@ -1092,6 +1092,7 @@ int hl_device_reset(struct hl_device *hdev, bool 
hard_reset,
GFP_KERNEL);
if (!hdev->kernel_ctx) {
rc = -ENOMEM;
+   hl_mmu_fini(hdev);
goto out_err;
}
 
@@ -1103,6 +1104,7 @@ int hl_device_reset(struct hl_device *hdev, bool 
hard_reset,
"failed to init kernel ctx in hard reset\n");
kfree(hdev->kernel_ctx);
hdev->kernel_ctx = NULL;
+   hl_mmu_fini(hdev);
goto out_err;
}
}
-- 
2.17.1

Re: [PATCH] mfd: ab8500-debugfs: Remove extraneous curly brace

2020-12-25 Thread Joe Perches

On Fri, 2020-12-25 at 18:35 -0700, Nathan Chancellor wrote:
> Clang errors:
> 
> ../drivers/mfd/ab8500-debugfs.c:1526:2: error: non-void function does
> not return a value [-Werror,-Wreturn-type]
> }
> ^
> ../drivers/mfd/ab8500-debugfs.c:1528:2: error: expected identifier or '('
> return 0;
> ^
> ../drivers/mfd/ab8500-debugfs.c:1529:1: error: extraneous closing brace ('}')
> }
> ^
> 3 errors generated.
> 
> The cleanup in ab8500_interrupts_show left a curly brace around, remove
> it to fix the error.
> 
> Fixes: 886c8121659d ("mfd: ab8500-debugfs: Remove the racy fiddling with 
> irq_desc")
> Signed-off-by: Nathan Chancellor 
> ---
>  drivers/mfd/ab8500-debugfs.c | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/drivers/mfd/ab8500-debugfs.c b/drivers/mfd/ab8500-debugfs.c
[]
> @@ -1521,7 +1521,6 @@ static int ab8500_interrupts_show(struct seq_file *s, 
> void *p)
>  line + irq_first,
>  num_interrupts[line],
>  num_wake_interrupts[line]);
> - }
>   seq_putc(s, '\n');

It looks as if this seq_putc should be removed as well
as the seq_printf above already ends in a newline.

>   }
>  
> 
> 
> base-commit: 61d791365b72a89062fbbea69aa61479476da946

[PATCH 5/5] gpio: xilinx: Add extra check if sum of widths exceed 64

2020-12-25 Thread Syed Nayyar Waris

Add extra check to see if sum of widths does not exceed 64. If it
exceeds then return -EINVAL alongwith appropriate error message.

Cc: Michal Simek 
Signed-off-by: Syed Nayyar Waris 
---
 drivers/gpio/gpio-xilinx.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/gpio/gpio-xilinx.c b/drivers/gpio/gpio-xilinx.c
index d565fbf128b7..c9d740ac711b 100644
--- a/drivers/gpio/gpio-xilinx.c
+++ b/drivers/gpio/gpio-xilinx.c
@@ -319,6 +319,12 @@ static int xgpio_probe(struct platform_device *pdev)
 
chip->gc.base = -1;
chip->gc.ngpio = chip->gpio_width[0] + chip->gpio_width[1];
+
+   if (chip->gc.ngpio > 64) {
+   dev_err(&pdev->dev, "invalid configuration: number of GPIO is 
greater than 64");
+   return -EINVAL;
+   }
+
chip->gc.parent = &pdev->dev;
chip->gc.direction_input = xgpio_dir_in;
chip->gc.direction_output = xgpio_dir_out;
-- 
2.29.0

[PATCH 2/5] lib/test_bitmap.c: Add for_each_set_clump test cases

2020-12-25 Thread Syed Nayyar Waris

The introduction of the generic for_each_set_clump macro need test
cases to verify the implementation. This patch adds test cases for
scenarios in which clump sizes are 8 bits, 24 bits, 30 bits and 6 bits.
The cases contain situations where clump is getting split at the word
boundary and also when zeroes are present in the start and middle of
bitmap.

Cc: Andy Shevchenko 
Cc: William Breathitt Gray 
Signed-off-by: Syed Nayyar Waris 
---
 lib/test_bitmap.c | 144 ++
 1 file changed, 144 insertions(+)

diff --git a/lib/test_bitmap.c b/lib/test_bitmap.c
index 4425a1dd4ef1..c5b5fb98c9dd 100644
--- a/lib/test_bitmap.c
+++ b/lib/test_bitmap.c
@@ -13,6 +13,7 @@
 #include 
 #include 
 #include 
+#include <../drivers/gpio/clump_bits.h>
 
 #include "../tools/testing/selftests/kselftest_module.h"
 
@@ -155,6 +156,37 @@ static bool __init __check_eq_clump8(const char *srcfile, 
unsigned int line,
return true;
 }
 
+static bool __init __check_eq_clump(const char *srcfile, unsigned int line,
+   const unsigned int offset,
+   const unsigned int size,
+   const unsigned long *const clump_exp,
+   const unsigned long *const clump,
+   const unsigned long clump_size)
+{
+   unsigned long exp;
+
+   if (offset >= size) {
+   pr_warn("[%s:%u] bit offset for clump out-of-bounds: expected 
less than %u, got %u\n",
+   srcfile, line, size, offset);
+   return false;
+   }
+
+   exp = clump_exp[offset / clump_size];
+   if (!exp) {
+   pr_warn("[%s:%u] bit offset for zero clump: expected nonzero 
clump, got bit offset %u with clump value 0",
+   srcfile, line, offset);
+   return false;
+   }
+
+   if (*clump != exp) {
+   pr_warn("[%s:%u] expected clump value of 0x%lX, got clump value 
of 0x%lX",
+   srcfile, line, exp, *clump);
+   return false;
+   }
+
+   return true;
+}
+
 #define __expect_eq(suffix, ...)   \
({  \
int result = 0; \
@@ -172,6 +204,7 @@ static bool __init __check_eq_clump8(const char *srcfile, 
unsigned int line,
 #define expect_eq_pbl(...) __expect_eq(pbl, ##__VA_ARGS__)
 #define expect_eq_u32_array(...)   __expect_eq(u32_array, ##__VA_ARGS__)
 #define expect_eq_clump8(...)  __expect_eq(clump8, ##__VA_ARGS__)
+#define expect_eq_clump(...)   __expect_eq(clump, ##__VA_ARGS__)
 
 static void __init test_zero_clear(void)
 {
@@ -530,6 +563,28 @@ static void noinline __init test_mem_optimisations(void)
}
 }
 
+static const unsigned long clump_bitmap_data[] __initconst = {
+   0x38000201,
+   0x05ff0f38,
+   0xeffedcba,
+   0xabcd,
+   0x00aa,
+   0x00aa,
+   0x00ff,
+   0xaa00,
+   0xff00,
+   0x00aa,
+   0x,
+   0x,
+   0x,
+   0x0f00,
+   0x00ff,
+   0xaa00,
+   0xff00,
+   0x00aa,
+   0x0ac0,
+};
+
 static const unsigned char clump_exp[] __initconst = {
0x01,   /* 1 bit set */
0x02,   /* non-edge 1 bit set */
@@ -541,6 +596,94 @@ static const unsigned char clump_exp[] __initconst = {
0x05,   /* non-adjacent 2 bits set */
 };
 
+static const unsigned long clump_exp1[] __initconst = {
+   0x01,   /* 1 bit set */
+   0x02,   /* non-edge 1 bit set */
+   0x00,   /* zero bits set */
+   0x38,   /* 3 bits set across 4-bit boundary */
+   0x38,   /* Repeated clump */
+   0x0F,   /* 4 bits set */
+   0xFF,   /* all bits set */
+   0x05,   /* non-adjacent 2 bits set */
+};
+
+static const unsigned long clump_exp2[] __initconst = {
+   0xfedcba,   /* 24 bits */
+   0xabcdef,
+   0xaa,   /* Clump split between 2 words */
+   0x00,   /* zeroes in between */
+   0xaa,
+   0x00,
+   0xff,
+   0xaa,
+   0x00,
+   0xff,
+};
+
+static const unsigned long clump_exp3[] __initconst = {
+   0x, /* starting with 0s*/
+   0x, /* All 0s */
+   0x,
+   0x,
+   0x3f0f, /* Non zero set */
+   0x2aa80003,
+   0x0aaa,
+   0x3fc0,
+};
+
+static const unsigned long clump_exp4[] __initconst = {
+   0x00,
+   0x2b,
+};
+
+struct clump_test_data_params {
+   DECLARE_BITMAP(data, 256);
+   unsigned long count;
+   unsigned long offset;
+   unsigned long limit;
+   unsigned long clump_size;
+   unsigned long const *exp;
+};
+
+static struct clump_test_data_params clump_test

[PATCH 4/5] gpio: xilinx: Utilize generic bitmap_get_value and _set_value

2020-12-25 Thread Syed Nayyar Waris

This patch reimplements the xgpio_set_multiple() function in
drivers/gpio/gpio-xilinx.c to use the new generic functions:
bitmap_get_value() and bitmap_set_value(). The code is now simpler
to read and understand. Moreover, instead of looping for each bit
in xgpio_set_multiple() function, now we can check each channel at
a time and save cycles.

Cc: William Breathitt Gray 
Cc: Bartosz Golaszewski 
Cc: Michal Simek 
Signed-off-by: Syed Nayyar Waris 
---
 drivers/gpio/gpio-xilinx.c | 66 +++---
 1 file changed, 33 insertions(+), 33 deletions(-)

diff --git a/drivers/gpio/gpio-xilinx.c b/drivers/gpio/gpio-xilinx.c
index 67f9f82e0db0..d565fbf128b7 100644
--- a/drivers/gpio/gpio-xilinx.c
+++ b/drivers/gpio/gpio-xilinx.c
@@ -14,6 +14,7 @@
 #include 
 #include 
 #include 
+#include <../drivers/gpio/clump_bits.h>
 
 /* Register Offset Definitions */
 #define XGPIO_DATA_OFFSET   (0x0)  /* Data register  */
@@ -138,37 +139,37 @@ static void xgpio_set_multiple(struct gpio_chip *gc, 
unsigned long *mask,
 {
unsigned long flags;
struct xgpio_instance *chip = gpiochip_get_data(gc);
-   int index = xgpio_index(chip, 0);
-   int offset, i;
-
-   spin_lock_irqsave(&chip->gpio_lock[index], flags);
-
-   /* Write to GPIO signals */
-   for (i = 0; i < gc->ngpio; i++) {
-   if (*mask == 0)
-   break;
-   /* Once finished with an index write it out to the register */
-   if (index !=  xgpio_index(chip, i)) {
-   xgpio_writereg(chip->regs + XGPIO_DATA_OFFSET +
-  index * XGPIO_CHANNEL_OFFSET,
-  chip->gpio_state[index]);
-   spin_unlock_irqrestore(&chip->gpio_lock[index], flags);
-   index =  xgpio_index(chip, i);
-   spin_lock_irqsave(&chip->gpio_lock[index], flags);
-   }
-   if (__test_and_clear_bit(i, mask)) {
-   offset =  xgpio_offset(chip, i);
-   if (test_bit(i, bits))
-   chip->gpio_state[index] |= BIT(offset);
-   else
-   chip->gpio_state[index] &= ~BIT(offset);
-   }
-   }
-
-   xgpio_writereg(chip->regs + XGPIO_DATA_OFFSET +
-  index * XGPIO_CHANNEL_OFFSET, chip->gpio_state[index]);
-
-   spin_unlock_irqrestore(&chip->gpio_lock[index], flags);
+   u32 *const state = chip->gpio_state;
+   unsigned int *const width = chip->gpio_width;
+
+   DECLARE_BITMAP(old, 64);
+   DECLARE_BITMAP(new, 64);
+   DECLARE_BITMAP(changed, 64);
+
+   spin_lock_irqsave(&chip->gpio_lock[0], flags);
+   spin_lock(&chip->gpio_lock[1]);
+
+   bitmap_set_value(old, 64, state[0], width[0], 0);
+   bitmap_set_value(old, 64, state[1], width[1], width[0]);
+   bitmap_replace(new, old, bits, mask, gc->ngpio);
+
+   bitmap_set_value(old, 64, state[0], 32, 0);
+   bitmap_set_value(old, 64, state[1], 32, 32);
+   state[0] = bitmap_get_value(new, 0, width[0]);
+   state[1] = bitmap_get_value(new, width[0], width[1]);
+   bitmap_set_value(new, 64, state[0], 32, 0);
+   bitmap_set_value(new, 64, state[1], 32, 32);
+   bitmap_xor(changed, old, new, 64);
+
+   if (((u32 *)changed)[0])
+   xgpio_writereg(chip->regs + XGPIO_DATA_OFFSET,
+   state[0]);
+   if (((u32 *)changed)[1])
+   xgpio_writereg(chip->regs + XGPIO_DATA_OFFSET +
+   XGPIO_CHANNEL_OFFSET, state[1]);
+
+   spin_unlock(&chip->gpio_lock[1]);
+   spin_unlock_irqrestore(&chip->gpio_lock[0], flags);
 }
 
 /**
@@ -292,6 +293,7 @@ static int xgpio_probe(struct platform_device *pdev)
chip->gpio_width[0] = 32;
 
spin_lock_init(&chip->gpio_lock[0]);
+   spin_lock_init(&chip->gpio_lock[1]);
 
if (of_property_read_u32(np, "xlnx,is-dual", &is_dual))
is_dual = 0;
@@ -313,8 +315,6 @@ static int xgpio_probe(struct platform_device *pdev)
if (of_property_read_u32(np, "xlnx,gpio2-width",
 &chip->gpio_width[1]))
chip->gpio_width[1] = 32;
-
-   spin_lock_init(&chip->gpio_lock[1]);
}
 
chip->gc.base = -1;
-- 
2.29.0

[PATCH 3/5] gpio: thunderx: Utilize for_each_set_clump macro

2020-12-25 Thread Syed Nayyar Waris

This patch reimplements the thunderx_gpio_set_multiple function in
drivers/gpio/gpio-thunderx.c to use the new for_each_set_clump macro.
Instead of looping for each bank in thunderx_gpio_set_multiple
function, now we can skip bank which is not set and save cycles.

Cc: William Breathitt Gray 
Cc: Robert Richter 
Cc: Bartosz Golaszewski 
Signed-off-by: Syed Nayyar Waris 
---
 drivers/gpio/gpio-thunderx.c | 12 
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/drivers/gpio/gpio-thunderx.c b/drivers/gpio/gpio-thunderx.c
index 9f66deab46ea..716b75ba7df6 100644
--- a/drivers/gpio/gpio-thunderx.c
+++ b/drivers/gpio/gpio-thunderx.c
@@ -16,6 +16,7 @@
 #include 
 #include 
 #include 
+#include <../drivers/gpio/clump_bits.h>
 
 
 #define GPIO_RX_DAT0x0
@@ -275,12 +276,15 @@ static void thunderx_gpio_set_multiple(struct gpio_chip 
*chip,
   unsigned long *bits)
 {
int bank;
-   u64 set_bits, clear_bits;
+   unsigned long set_bits, clear_bits, gpio_mask;
+   unsigned long offset;
+
struct thunderx_gpio *txgpio = gpiochip_get_data(chip);
 
-   for (bank = 0; bank <= chip->ngpio / 64; bank++) {
-   set_bits = bits[bank] & mask[bank];
-   clear_bits = ~bits[bank] & mask[bank];
+   for_each_set_clump(offset, gpio_mask, mask, chip->ngpio, 64) {
+   bank = offset / 64;
+   set_bits = bits[bank] & gpio_mask;
+   clear_bits = ~bits[bank] & gpio_mask;
writeq(set_bits, txgpio->register_base + (bank * GPIO_2ND_BANK) 
+ GPIO_TX_SET);
writeq(clear_bits, txgpio->register_base + (bank * 
GPIO_2ND_BANK) + GPIO_TX_CLR);
}
-- 
2.29.0

[PATCH 1/5] clump_bits: Introduce the for_each_set_clump macro

2020-12-25 Thread Syed Nayyar Waris

This macro iterates for each group of bits (clump) with set bits,
within a bitmap memory region. For each iteration, "start" is set to
the bit offset of the found clump, while the respective clump value is
stored to the location pointed by "clump". Additionally, the
bitmap_get_value() and bitmap_set_value() functions are introduced to
respectively get and set a value of n-bits in a bitmap memory region.
The n-bits can have any size from 1 to BITS_PER_LONG. size less
than 1 or more than BITS_PER_LONG causes undefined behaviour.
Moreover, during setting value of n-bit in bitmap, if a situation arise
that the width of next n-bit is exceeding the word boundary, then it
will divide itself such that some portion of it is stored in that word,
while the remaining portion is stored in the next higher word. Similar
situation occurs while retrieving the value from bitmap.

GCC gives warning in bitmap_set_value(): https://godbolt.org/z/rjx34r
Add explicit check to see if the value being written into the bitmap
does not fall outside the bitmap.
The situation that it is falling outside would never be possible in the
code because the boundaries are required to be correct before the
function is called. The responsibility is on the caller for ensuring the
boundaries are correct.
The code change is simply to silence the GCC warning messages
because GCC is not aware that the boundaries have already been checked.
As such, we're better off using __builtin_unreachable() here because we
can avoid the latency of the conditional check entirely.

Cc: Linus Walleij 
Cc: Arnd Bergmann 
Cc: William Breathitt Gray 
Cc: Andy Shevchenko 
Signed-off-by: Syed Nayyar Waris 
---
 drivers/gpio/clump_bits.h | 101 ++
 1 file changed, 101 insertions(+)
 create mode 100644 drivers/gpio/clump_bits.h

diff --git a/drivers/gpio/clump_bits.h b/drivers/gpio/clump_bits.h
new file mode 100644
index ..72ef772b83c8
--- /dev/null
+++ b/drivers/gpio/clump_bits.h
@@ -0,0 +1,101 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef __CLUMP_BITS_H
+#define __CLUMP_BITS_H
+
+/**
+ * find_next_clump - find next clump with set bits in a memory region
+ * @clump: location to store copy of found clump
+ * @addr: address to base the search on
+ * @size: bitmap size in number of bits
+ * @offset: bit offset at which to start searching
+ * @clump_size: clump size in bits
+ *
+ * Returns the bit offset for the next set clump; the found clump value is
+ * copied to the location pointed by @clump. If no bits are set, returns @size.
+ */
+extern unsigned long find_next_clump(unsigned long *clump,
+ const unsigned long *addr,
+ unsigned long size, unsigned long offset,
+ unsigned long clump_size);
+
+#define find_first_clump(clump, bits, size, clump_size) \
+   find_next_clump((clump), (bits), (size), 0, (clump_size))
+
+/**
+ * bitmap_get_value - get a value of n-bits from the memory region
+ * @map: address to the bitmap memory region
+ * @start: bit offset of the n-bit value
+ * @nbits: size of value in bits (must be between 1 and BITS_PER_LONG 
inclusive).
+ *
+ * Returns value of nbits located at the @start bit offset within the @map
+ * memory region.
+ */
+static inline unsigned long bitmap_get_value(const unsigned long *map,
+ unsigned long start,
+ unsigned long nbits)
+{
+   const size_t index = BIT_WORD(start);
+   const unsigned long offset = start % BITS_PER_LONG;
+   const unsigned long ceiling = round_up(start + 1, BITS_PER_LONG);
+   const unsigned long space = ceiling - start;
+   unsigned long value_low, value_high;
+
+   if (space >= nbits)
+   return (map[index] >> offset) & GENMASK(nbits - 1, 0);
+   else {
+   value_low = map[index] & BITMAP_FIRST_WORD_MASK(start);
+   value_high = map[index + 1] & BITMAP_LAST_WORD_MASK(start + 
nbits);
+   return (value_low >> offset) | (value_high << space);
+   }
+}
+
+/**
+ * bitmap_set_value - set value within a memory region
+ * @map: address to the bitmap memory region
+ * @nbits: size of map in bits
+ * @value: value of clump
+ * @value_width: size of value in bits (must be between 1 and BITS_PER_LONG 
inclusive)
+ * @start: bit offset of the value
+ */
+static inline void bitmap_set_value(unsigned long *map, unsigned long nbits,
+   unsigned long value, unsigned long 
value_width,
+   unsigned long start)
+{
+   const unsigned long index = BIT_WORD(start);
+   const unsigned long length = BIT_WORD(nbits);
+   const unsigned long offset = start % BITS_PER_LONG;
+   const unsigned long ceiling = round_up(start + 1, BITS_PER_LONG);
+   const unsigned long space = ceiling - start;
+
+   value &= GENMASK(v

[PATCH 0/5] Introduce the for_each_set_clump macro

2020-12-25 Thread Syed Nayyar Waris

Hello Linus,

Since this patchset primarily affects GPIO drivers, would you like
to pick it up through your GPIO tree?

(Note: Patchset resent with the new macro and relevant
functions shifted to a new header clump_bits.h [Linus Torvalds])

Michal,
What do you think of [PATCH 5/5]? Is the conditional check needed? And
also does returning -EINVAL look good?

This patchset introduces a new generic version of for_each_set_clump.
The previous version of for_each_set_clump8 used a fixed size 8-bit
clump, but the new generic version can work with clump of any size but
less than or equal to BITS_PER_LONG. The patchset utilizes the new macro
in several GPIO drivers.

The earlier 8-bit for_each_set_clump8 facilitated a
for-loop syntax that iterates over a memory region entire groups of set
bits at a time.

For example, suppose you would like to iterate over a 32-bit integer 8
bits at a time, skipping over 8-bit groups with no set bit, where
 represents the current 8-bit group:

Example:1010   00110011
First loop: 1010   
Second loop:1010   00110011
Third loop:    00110011

Each iteration of the loop returns the next 8-bit group that has at
least one set bit.

But with the new for_each_set_clump the clump size can be different from 8 bits.
Moreover, the clump can be split at word boundary in situations where word
size is not multiple of clump size. Following are examples showing the working
of new macro for clump sizes of 24 bits and 6 bits.

Example 1:
clump size: 24 bits, Number of clumps (or ports): 10
bitmap stores the bit information from where successive clumps are retrieved.

 /* bitmap memory region */
0x00aaff00;  /* Most significant bits */
0xaaff;
0x00aa00aa;
0xabcdeffedcba;  /* Least significant bits */

Different iterations of for_each_set_clump:-
'offset' is the bit position and 'clump' is the 24 bit clump from the
above bitmap.
Iteration first:offset: 0 clump: 0xfedcba
Iteration second:   offset: 24 clump: 0xabcdef
Iteration third:offset: 48 clump: 0xaa
Iteration fourth:   offset: 96 clump: 0xaa
Iteration fifth:offset: 144 clump: 0xff
Iteration sixth:offset: 168 clump: 0xaa
Iteration seventh:  offset: 216 clump: 0xff
Loop breaks because in the end the remaining bits (0x00aa) size was less
than clump size of 24 bits.

In above example it can be seen that in iteration third, the 24 bit clump
that was retrieved was split between bitmap[0] and bitmap[1]. This example
also shows that 24 bit zeroes if present in between, were skipped (preserving
the previous for_each_set_macro8 behaviour).

Example 2:
clump size = 6 bits, Number of clumps (or ports) = 3.

 /* bitmap memory region */
0x00aaff00;  /* Most significant bits */
0xaaff;
0x0f00;
0x0ac0;  /* Least significant bits */

Different iterations of for_each_set_clump:
'offset' is the bit position and 'clump' is the 6 bit clump from the
above bitmap.
Iteration first:offset: 6 clump: 0x2b
Loop breaks because 6 * 3 = 18 bits traversed in bitmap.
Here 6 * 3 is clump size * no. of clumps.

GCC gives warning in bitmap_set_value(): https://godbolt.org/z/rjx34r
Add explicit check to see if the value being written into the bitmap
does not fall outside the bitmap.
The situation that it is falling outside would never be possible in the
code because the boundaries are required to be correct before the
function is called. The responsibility is on the caller for ensuring the
boundaries are correct.
The code change is simply to silence the GCC warning messages
because GCC is not aware that the boundaries have already been checked.
As such, we're better off using __builtin_unreachable() here because we
can avoid the latency of the conditional check entirely.

Syed Nayyar Waris (5):
  clump_bits: Introduce the for_each_set_clump macro
  lib/test_bitmap.c: Add for_each_set_clump test cases
  gpio: thunderx: Utilize for_each_set_clump macro
  gpio: xilinx: Utilize generic bitmap_get_value and _set_value
  gpio: xilinx: Add extra check if sum of widths exceed 64

 drivers/gpio/clump_bits.h| 101 
 drivers/gpio/gpio-thunderx.c |  12 ++-
 drivers/gpio/gpio-xilinx.c   |  72 ++
 lib/test_bitmap.c| 144 +++
 4 files changed, 292 insertions(+), 37 deletions(-)
 create mode 100644 drivers/gpio/clump_bits.h


base-commit: bbe2ba04c5a92a49db8a42c850a5a2f6481e47eb
-- 
2.29.0

Re: [PATCH v4 2/4] spi: ls7a: Add YAML schemas

2020-12-25 Thread zhangqing


Hi，Sergei

Thank you for your reply and suggestion .

I will use extra indentation levels and send v5 in the soon.

Thanks,

-Qing


On 12/25/2020 08:19 PM, Sergei Shtylyov wrote:

On 12/25/20 1:35 PM, Qing Zhang wrote:


Switch the DT binding to a YAML schema to enable the DT validation.

Signed-off-by: Qing Zhang 
---

v4: fix warnings/errors about running 'make dt_binding_check'

---
  .../devicetree/bindings/spi/loongson,spi-ls7a.yaml | 46 ++
  1 file changed, 46 insertions(+)
  create mode 100644 
Documentation/devicetree/bindings/spi/loongson,spi-ls7a.yaml

diff --git a/Documentation/devicetree/bindings/spi/loongson,spi-ls7a.yaml 
b/Documentation/devicetree/bindings/spi/loongson,spi-ls7a.yaml
new file mode 100644
index 000..8cc9bc5
--- /dev/null
+++ b/Documentation/devicetree/bindings/spi/loongson,spi-ls7a.yaml
@@ -0,0 +1,46 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/spi/loongson,spi-ls7a.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Loongson LS7A PCH SPI Controller
+
+maintainers:
+  - Qing Zhang 
+
+description: |
+  This controller can be found in Loongson-3 systems with LS7A PCH.
+
+properties:
+  compatible:
+const: loongson,ls7a-spi
+
+  reg:
+maxItems: 1
+
+required:
+  - compatible
+  - reg
+  - num-chipselects
+
+additionalProperties: false
+
+examples:
+  - |
+pci {
+#address-cells = <3>;
+#size-cells = <2>;
+
+spi@16,0 {
+compatible = "pci0014,7a0b.0",
+"pci0014,7a0b",
+"pciclass088000",
+"pciclass0800";
+
+reg = <0xb000 0x0 0x0 0x0 0x0>;
+num-chipselects = <0>;

 The above lines after { need extra indentation level.


+};
+};
+
+...

MBR, Sergei

Re: [PATCH v4 1/4] spi: LS7A: Add Loongson LS7A SPI controller driver support

2020-12-25 Thread zhangqing


Hi，Huacai

Thank you for your reply and suggestion .

I will remove blank lines and  send v5 in the soon.

Thanks,

-Qing


On 12/26/2020 12:28 PM, Huacai Chen wrote:

Hi, Qing

On Fri, Dec 25, 2020 at 6:40 PM Qing Zhang  wrote:

The SPI controller has the following characteristics:

- Full-duplex synchronous serial data transmission
- Support up to 4 variable length byte transmission
- Main mode support
- Mode failure generates an error flag and issues an interrupt request
- Double buffer receiver
- Serial clock with programmable polarity and phase
- SPI can be controlled in wait mode
- Support boot from SPI

Use mtd_debug tool to earse/write/read /dev/mtd0 on development.

eg:

[root@linux mtd-utils-1.0.0]# mtd_debug erase /dev/mtd0 0x2 0x4
Erased 262144 bytes from address 0x0002 in flash
[root@linux mtd-utils-1.0.0]# mtd_debug write /dev/mtd0 0x2 13 1.img
Copied 13 bytes from 1.img to address 0x0002 in flash
[root@linux mtd-utils-1.0.0]# mtd_debug read /dev/mtd0 0x2 13 2.img
Copied 13 bytes from address 0x0002 in flash to 2.img
[root@linux mtd-utils-1.0.0]# cmp -l 1.img 2.img

Signed-off-by: Qing Zhang 
---

v2:
- keep Kconfig and Makefile sorted
- make the entire comment a C++ one so things look more intentional
- Fix unclear indentation
- make conditional statements to improve legibility
- Don't use static inline
- the core handle message queue
- Add a new binding document
- Fix probe part mixed pdev and PCI

v3:
- expose set_cs to the core and let it handle things
- replace transfer_one_message to transfer_one
- replace spi_alloc_master to devm_spi_alloc_master
- split out into prepare/unprepare_message
- releases pci regions before unregister master

v4:
- names used in the manual
- rename ls7a_spi_do_transfer to ls7a_spi_setup_transfer
- change read the spcr and sper outside of this function
- mode configuration moved to prepare instead
- remove redundancy code about unprepare/prepare_message
- used 0x4 instead of 0x1,WFEMPTY instead of RFEMPTY

---
  drivers/spi/Kconfig|   7 ++
  drivers/spi/Makefile   |   1 +
  drivers/spi/spi-ls7a.c | 283 +
  3 files changed, 291 insertions(+)
  create mode 100644 drivers/spi/spi-ls7a.c

diff --git a/drivers/spi/Kconfig b/drivers/spi/Kconfig
index aadaea0..af7c0d4 100644
--- a/drivers/spi/Kconfig
+++ b/drivers/spi/Kconfig
@@ -413,6 +413,13 @@ config SPI_LP8841_RTC
   Say N here unless you plan to run the kernel on an ICP DAS
   LP-8x4x industrial computer.

+config SPI_LS7A
+   tristate "Loongson LS7A SPI Controller Support"
+   depends on CPU_LOONGSON64 || COMPILE_TEST
+   help
+ This drivers supports the Loongson LS7A SPI controller in master
+ SPI mode.
+
  config SPI_MPC52xx
 tristate "Freescale MPC52xx SPI (non-PSC) controller support"
 depends on PPC_MPC52xx
diff --git a/drivers/spi/Makefile b/drivers/spi/Makefile
index 6fea582..d015cf2 100644
--- a/drivers/spi/Makefile
+++ b/drivers/spi/Makefile
@@ -61,6 +61,7 @@ obj-$(CONFIG_SPI_LANTIQ_SSC)  += spi-lantiq-ssc.o
  obj-$(CONFIG_SPI_JCORE)+= spi-jcore.o
  obj-$(CONFIG_SPI_LM70_LLP) += spi-lm70llp.o
  obj-$(CONFIG_SPI_LP8841_RTC)   += spi-lp8841-rtc.o
+obj-$(CONFIG_SPI_LS7A) += spi-ls7a.o
  obj-$(CONFIG_SPI_MESON_SPICC)  += spi-meson-spicc.o
  obj-$(CONFIG_SPI_MESON_SPIFC)  += spi-meson-spifc.o
  obj-$(CONFIG_SPI_MPC512x_PSC)  += spi-mpc512x-psc.o
diff --git a/drivers/spi/spi-ls7a.c b/drivers/spi/spi-ls7a.c
new file mode 100644
index 000..d2be370
--- /dev/null
+++ b/drivers/spi/spi-ls7a.c
@@ -0,0 +1,283 @@
+// SPDX-License-Identifier: GPL-2.0-only
+//
+// Loongson LS7A SPI Controller driver
+//
+// Copyright (C) 2020 Loongson Technology Corporation Limited.
+//
+
+#include 
+#include 
+#include 
+
+/* define spi register */
+#defineSPCR0x00
+#defineSPSR0x01
+#defineFIFO0x02
+#defineSPER0x03
+#defineSFC_PARAM   0x04
+#defineSFC_SOFTCS  0x05
+#defineSFC_TIMING  0x06
+
+struct ls7a_spi {
+   struct spi_master *master;
+   void __iomem *base;
+   unsigned int hz;
+   unsigned int mode;
+};
+
+static void ls7a_spi_write_reg(struct ls7a_spi *spi,
+  unsigned char reg,
+  unsigned char data)
+{
+   writeb(data, spi->base + reg);
+}
+
+static char ls7a_spi_read_reg(struct ls7a_spi *spi, unsigned char reg)
+{
+   return readb(spi->base + reg);
+}
+
+static int ls7a_spi_prepare_message(struct spi_master *master,
+   struct spi_message *msg)
+{
+   struct ls7a_spi *ls7a_spi;
+   struct spi_device *spi = msg->spi;
+   unsigned char val;
+
+   ls7a_spi = spi_master_get_devdata(master);
+
+   if (ls7a_spi->mode != spi->mode) {
+   val

Re: [PATCH v4 2/4] spi: ls7a: Add YAML schemas

2020-12-25 Thread zhangqing


Hi，Jiaxun

Thank you for your reply.

I will also delete the num chipelects attribute in the device tree.

And I will send v5 in the soon.

Thanks,

-Qing

On 12/26/2020 11:31 AM, Jiaxun Yang wrote:

在 2020/12/25 下午6:35, Qing Zhang 写道:

Switch the DT binding to a YAML schema to enable the DT validation.

Signed-off-by: Qing Zhang 
---

v4: fix warnings/errors about running 'make dt_binding_check'

---
  .../devicetree/bindings/spi/loongson,spi-ls7a.yaml | 46 
++

  1 file changed, 46 insertions(+)
  create mode 100644 
Documentation/devicetree/bindings/spi/loongson,spi-ls7a.yaml


diff --git 
a/Documentation/devicetree/bindings/spi/loongson,spi-ls7a.yaml 
b/Documentation/devicetree/bindings/spi/loongson,spi-ls7a.yaml

new file mode 100644
index 000..8cc9bc5
--- /dev/null
+++ b/Documentation/devicetree/bindings/spi/loongson,spi-ls7a.yaml
@@ -0,0 +1,46 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/spi/loongson,spi-ls7a.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Loongson LS7A PCH SPI Controller
+
+maintainers:
+  - Qing Zhang 
+
+description: |
+  This controller can be found in Loongson-3 systems with LS7A PCH.
+
+properties:
+  compatible:
+const: loongson,ls7a-spi
+
+  reg:
+maxItems: 1
+
+required:
+  - compatible
+  - reg
+  - num-chipselects


num-chipselects never parsed in code?

Thanks.

- Jiaxun




+
+additionalProperties: false
+
+examples:
+  - |
+pci {
+#address-cells = <3>;
+#size-cells = <2>;
+
+spi@16,0 {
+compatible = "pci0014,7a0b.0",
+"pci0014,7a0b",
+"pciclass088000",
+"pciclass0800";
+
+reg = <0xb000 0x0 0x0 0x0 0x0>;
+num-chipselects = <0>;
+};
+};
+
+...

[PATCH] [v2] scsi: scsi_debug: Fix memleak in scsi_debug_init

2020-12-25 Thread Dinghao Liu

When sdeb_zbc_model does not match BLK_ZONED_NONE,
BLK_ZONED_HA or BLK_ZONED_HM, we should free sdebug_q_arr
to prevent memleak. Also there is no need to execute
sdebug_erase_store() on failure of sdeb_zbc_model_str().

Signed-off-by: Dinghao Liu 
---

Changelog:

v2: - Add missed assignment statement for ret.
---
 drivers/scsi/scsi_debug.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
index 24c0f7ec0351..4a08c450b756 100644
--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -6740,7 +6740,7 @@ static int __init scsi_debug_init(void)
k = sdeb_zbc_model_str(sdeb_zbc_model_s);
if (k < 0) {
ret = k;
-   goto free_vm;
+   goto free_q_arr;
}
sdeb_zbc_model = k;
switch (sdeb_zbc_model) {
@@ -6753,7 +6753,8 @@ static int __init scsi_debug_init(void)
break;
default:
pr_err("Invalid ZBC model\n");
-   return -EINVAL;
+   ret = -EINVAL;
+   goto free_q_arr;
}
}
if (sdeb_zbc_model != BLK_ZONED_NONE) {
-- 
2.17.1

[PATCH] scsi: scsi_debug: Fix memleak in scsi_debug_init

2020-12-25 Thread Dinghao Liu

When sdeb_zbc_model does not match BLK_ZONED_NONE,
BLK_ZONED_HA or BLK_ZONED_HM, we should free sdebug_q_arr
to prevent memleak. Also there is no need to execute
sdebug_erase_store() on failure of sdeb_zbc_model_str().

Signed-off-by: Dinghao Liu 
---
 drivers/scsi/scsi_debug.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
index 24c0f7ec0351..88e785da03ba 100644
--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -6740,7 +6740,7 @@ static int __init scsi_debug_init(void)
k = sdeb_zbc_model_str(sdeb_zbc_model_s);
if (k < 0) {
ret = k;
-   goto free_vm;
+   goto free_q_arr;
}
sdeb_zbc_model = k;
switch (sdeb_zbc_model) {
@@ -6753,7 +6753,7 @@ static int __init scsi_debug_init(void)
break;
default:
pr_err("Invalid ZBC model\n");
-   return -EINVAL;
+   goto free_q_arr;
}
}
if (sdeb_zbc_model != BLK_ZONED_NONE) {
-- 
2.17.1

[PATCH] iommu: check for the deferred attach when attaching a device

2020-12-25 Thread Lianbo Jiang

Currently, because domain attach allows to be deferred from iommu
driver to device driver, and when iommu initializes, the devices
on the bus will be scanned and the default groups will be allocated.

Due to the above changes, some devices could be added to the same
group as below:

[3.859417] pci :01:00.0: Adding to iommu group 16
[3.864572] pci :01:00.1: Adding to iommu group 16
[3.869738] pci :02:00.0: Adding to iommu group 17
[3.874892] pci :02:00.1: Adding to iommu group 17

But when attaching these devices, it doesn't allow that a group has
more than one device, otherwise it will return an error. This conflicts
with the deferred attaching. Unfortunately, it has two devices in the
same group for my side, for example:

[9.627014] iommu_group_device_count(): device name[0]::01:00.0
[9.633545] iommu_group_device_count(): device name[1]::01:00.1
...
[   10.255609] iommu_group_device_count(): device name[0]::02:00.0
[   10.262144] iommu_group_device_count(): device name[1]::02:00.1

Finally, which caused the failure of tg3 driver when tg3 driver calls
the dma_alloc_coherent() to allocate coherent memory in the tg3_test_dma().

[9.660310] tg3 :01:00.0: DMA engine test failed, aborting
[9.754085] tg3: probe of :01:00.0 failed with error -12
[9.997512] tg3 :01:00.1: DMA engine test failed, aborting
[   10.043053] tg3: probe of :01:00.1 failed with error -12
[   10.288905] tg3 :02:00.0: DMA engine test failed, aborting
[   10.334070] tg3: probe of :02:00.0 failed with error -12
[   10.578303] tg3 :02:00.1: DMA engine test failed, aborting
[   10.622629] tg3: probe of :02:00.1 failed with error -12

In addition, the similar situations also occur in other drivers such
as the bnxt_en driver. That can be reproduced easily in kdump kernel
when SME is active.

Add a check for the deferred attach in the iommu_attach_device() and
allow to attach the deferred device regardless of how many devices
are in a group.

Signed-off-by: Lianbo Jiang 
---
 drivers/iommu/iommu.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index ffeebda8d6de..dccab7b133fb 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1967,8 +1967,11 @@ int iommu_attach_device(struct iommu_domain *domain, 
struct device *dev)
 */
mutex_lock(&group->mutex);
ret = -EINVAL;
-   if (iommu_group_device_count(group) != 1)
+   if (!iommu_is_attach_deferred(domain, dev) &&
+   iommu_group_device_count(group) != 1) {
+   dev_err_ratelimited(dev, "Group has more than one device\n");
goto out_unlock;
+   }
 
ret = __iommu_attach_group(domain, group);
 
-- 
2.17.1

[PATCH] mm: page-flags.h: Typo fix (It -> If)

2020-12-25 Thread guoren

From: Guo Ren 

The "If" was wrongly spelled as "It".

Signed-off-by: Guo Ren 
Cc: Andrew Morton 
Cc: Oscar Salvador 
Cc: Alexander Duyck 
Cc: David Hildenbrand 
Cc: Steven Price 

---
 include/linux/page-flags.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index 4f6ba93..e7b4242 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -824,7 +824,7 @@ static inline void ClearPageSlabPfmemalloc(struct page 
*page)
 
 /*
  * Flags checked when a page is freed.  Pages being freed should not have
- * these flags set.  It they are, there is a problem.
+ * these flags set.  If they are, there is a problem.
  */
 #define PAGE_FLAGS_CHECK_AT_FREE   \
(1UL << PG_lru  | 1UL << PG_locked  |   \
@@ -835,7 +835,7 @@ static inline void ClearPageSlabPfmemalloc(struct page 
*page)
 
 /*
  * Flags checked when a page is prepped for return by the page allocator.
- * Pages being prepped should not have these flags set.  It they are set,
+ * Pages being prepped should not have these flags set.  If they are set,
  * there has been a kernel bug or struct page corruption.
  *
  * __PG_HWPOISON is exceptional because it needs to be kept beyond page's
-- 
2.7.4

Re: [PATCH v4 4/4] MIPS: Loongson: Enable Loongson LS7A SPI in loongson3_defconfig

2020-12-25 Thread Huacai Chen

Reviewed-by: Huacai Chen 

On Fri, Dec 25, 2020 at 6:41 PM Qing Zhang  wrote:
>
> This is now supported, enable for Loongson systems.
>
> Signed-off-by: Qing Zhang 
> ---
>
> v2:
>  - Modify CONFIG_SPI_LOONGSON to CONFIG_SPI_LS7A
>
> v3:
>  - No changes
>
> v4:
>  - No changes
>
> ---
>  arch/mips/configs/loongson3_defconfig | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/arch/mips/configs/loongson3_defconfig 
> b/arch/mips/configs/loongson3_defconfig
> index 38a817e..28784cb 100644
> --- a/arch/mips/configs/loongson3_defconfig
> +++ b/arch/mips/configs/loongson3_defconfig
> @@ -271,6 +271,9 @@ CONFIG_HW_RANDOM=y
>  CONFIG_RAW_DRIVER=m
>  CONFIG_I2C_CHARDEV=y
>  CONFIG_I2C_PIIX4=y
> +CONFIG_SPI=y
> +CONFIG_SPI_MASTER=y
> +CONFIG_SPI_LS7A=y
>  CONFIG_GPIO_LOONGSON=y
>  CONFIG_SENSORS_LM75=m
>  CONFIG_SENSORS_LM93=m
> --
> 2.1.0
>

Re: [PATCH v4 1/4] spi: LS7A: Add Loongson LS7A SPI controller driver support

2020-12-25 Thread Huacai Chen

Hi, Qing

On Fri, Dec 25, 2020 at 6:40 PM Qing Zhang  wrote:
>
> The SPI controller has the following characteristics:
>
> - Full-duplex synchronous serial data transmission
> - Support up to 4 variable length byte transmission
> - Main mode support
> - Mode failure generates an error flag and issues an interrupt request
> - Double buffer receiver
> - Serial clock with programmable polarity and phase
> - SPI can be controlled in wait mode
> - Support boot from SPI
>
> Use mtd_debug tool to earse/write/read /dev/mtd0 on development.
>
> eg:
>
> [root@linux mtd-utils-1.0.0]# mtd_debug erase /dev/mtd0 0x2 0x4
> Erased 262144 bytes from address 0x0002 in flash
> [root@linux mtd-utils-1.0.0]# mtd_debug write /dev/mtd0 0x2 13 1.img
> Copied 13 bytes from 1.img to address 0x0002 in flash
> [root@linux mtd-utils-1.0.0]# mtd_debug read /dev/mtd0 0x2 13 2.img
> Copied 13 bytes from address 0x0002 in flash to 2.img
> [root@linux mtd-utils-1.0.0]# cmp -l 1.img 2.img
>
> Signed-off-by: Qing Zhang 
> ---
>
> v2:
> - keep Kconfig and Makefile sorted
> - make the entire comment a C++ one so things look more intentional
> - Fix unclear indentation
> - make conditional statements to improve legibility
> - Don't use static inline
> - the core handle message queue
> - Add a new binding document
> - Fix probe part mixed pdev and PCI
>
> v3:
> - expose set_cs to the core and let it handle things
> - replace transfer_one_message to transfer_one
> - replace spi_alloc_master to devm_spi_alloc_master
> - split out into prepare/unprepare_message
> - releases pci regions before unregister master
>
> v4:
> - names used in the manual
> - rename ls7a_spi_do_transfer to ls7a_spi_setup_transfer
> - change read the spcr and sper outside of this function
> - mode configuration moved to prepare instead
> - remove redundancy code about unprepare/prepare_message
> - used 0x4 instead of 0x1,WFEMPTY instead of RFEMPTY
>
> ---
>  drivers/spi/Kconfig|   7 ++
>  drivers/spi/Makefile   |   1 +
>  drivers/spi/spi-ls7a.c | 283 
> +
>  3 files changed, 291 insertions(+)
>  create mode 100644 drivers/spi/spi-ls7a.c
>
> diff --git a/drivers/spi/Kconfig b/drivers/spi/Kconfig
> index aadaea0..af7c0d4 100644
> --- a/drivers/spi/Kconfig
> +++ b/drivers/spi/Kconfig
> @@ -413,6 +413,13 @@ config SPI_LP8841_RTC
>   Say N here unless you plan to run the kernel on an ICP DAS
>   LP-8x4x industrial computer.
>
> +config SPI_LS7A
> +   tristate "Loongson LS7A SPI Controller Support"
> +   depends on CPU_LOONGSON64 || COMPILE_TEST
> +   help
> + This drivers supports the Loongson LS7A SPI controller in master
> + SPI mode.
> +
>  config SPI_MPC52xx
> tristate "Freescale MPC52xx SPI (non-PSC) controller support"
> depends on PPC_MPC52xx
> diff --git a/drivers/spi/Makefile b/drivers/spi/Makefile
> index 6fea582..d015cf2 100644
> --- a/drivers/spi/Makefile
> +++ b/drivers/spi/Makefile
> @@ -61,6 +61,7 @@ obj-$(CONFIG_SPI_LANTIQ_SSC)  += spi-lantiq-ssc.o
>  obj-$(CONFIG_SPI_JCORE)+= spi-jcore.o
>  obj-$(CONFIG_SPI_LM70_LLP) += spi-lm70llp.o
>  obj-$(CONFIG_SPI_LP8841_RTC)   += spi-lp8841-rtc.o
> +obj-$(CONFIG_SPI_LS7A) += spi-ls7a.o
>  obj-$(CONFIG_SPI_MESON_SPICC)  += spi-meson-spicc.o
>  obj-$(CONFIG_SPI_MESON_SPIFC)  += spi-meson-spifc.o
>  obj-$(CONFIG_SPI_MPC512x_PSC)  += spi-mpc512x-psc.o
> diff --git a/drivers/spi/spi-ls7a.c b/drivers/spi/spi-ls7a.c
> new file mode 100644
> index 000..d2be370
> --- /dev/null
> +++ b/drivers/spi/spi-ls7a.c
> @@ -0,0 +1,283 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +//
> +// Loongson LS7A SPI Controller driver
> +//
> +// Copyright (C) 2020 Loongson Technology Corporation Limited.
> +//
> +
> +#include 
> +#include 
> +#include 
> +
> +/* define spi register */
> +#defineSPCR0x00
> +#defineSPSR0x01
> +#defineFIFO0x02
> +#defineSPER0x03
> +#defineSFC_PARAM   0x04
> +#defineSFC_SOFTCS  0x05
> +#defineSFC_TIMING  0x06
> +
> +struct ls7a_spi {
> +   struct spi_master *master;
> +   void __iomem *base;
> +   unsigned int hz;
> +   unsigned int mode;
> +};
> +
> +static void ls7a_spi_write_reg(struct ls7a_spi *spi,
> +  unsigned char reg,
> +  unsigned char data)
> +{
> +   writeb(data, spi->base + reg);
> +}
> +
> +static char ls7a_spi_read_reg(struct ls7a_spi *spi, unsigned char reg)
> +{
> +   return readb(spi->base + reg);
> +}
> +
> +static int ls7a_spi_prepare_message(struct spi_master *master,
> +   struct spi_message *msg)
> +{
> +   struct ls7a_spi *ls7a_spi;
> +   struct spi_device *spi = msg->spi;
> +   unsigned char val;
> +
> +   ls7a_spi = spi_ma

Re: [PATCH v4] HID: i2c-hid: add polling mode based on connected GPIO chip's pin status

2020-12-25 Thread Coiby Xu

Hi Greg and Barnabás,

On Wed, Dec 09, 2020 at 04:44:35PM +0100, Greg KH wrote:

On Wed, Dec 09, 2020 at 03:38:11PM +, Barnabás Pőcze wrote:

2020. december 9., szerda 8:00 keltezéssel, Greg KH írta:

> On Tue, Dec 08, 2020 at 09:59:20PM +, Barnabás Pőcze wrote:
>
> > 2020.  november 25., szerda 16:07 keltezéssel, Greg KH írta:
> >
> > > [...]
> > >
> > > > +static u8 polling_mode;
> > > > +module_param(polling_mode, byte, 0444);
> > > > +MODULE_PARM_DESC(polling_mode, "How to poll (default=0) - 0 disabled; 1 
based on GPIO pin's status");
> > >
> > > Module parameters are for the 1990's, they are global and horrible to
> > > try to work with. You should provide something on a per-device basis,
> > > as what happens if your system requires different things here for
> > > different devices? You set this for all devices :(
> > > [...]

Thank you for pointing out that.

> >
> > Hi
> > do you think something like what the usbcore has would be better?
> > A module parameter like 
"quirks=::[,::]*"?
>
> Not really, that's just for debugging, and asking users to test
> something, not for a final solution to anything.

This patch is not only for debugging. The primary reason is as a
fallback solution to save the touchpad. The mentioned touchpads will
be fixed by Linux 5.11 which means the enthusiastic Linux users have to
wait for ~10 months to get their touchpads fixed.

My understanding is that this polling mode option is by no means intended
as a final solution, it's purely for debugging/fallback:

"Polling mode could be a fallback solution for enthusiastic Linux users
when they have a new laptop. It also acts like a debugging feature. If
polling mode works for a broken touchpad, we can almost be certain
the root cause is related to the interrupt or power setting."

What would you suggest instead of the module parameter?

a debugfs file?  That means it's only for debugging and you have to be
root to change the value.

Thank you for the helpful discussion and the suggestions!

If we can answer the following two questions, it could help decide
what's the next move.

1. How many machines have two or more I2C HID devices?

For the laptops this patch try to fix, they only have one I2C HID
device, i.e., the touchpad. If it's the case with most machines
running Linux, then we don't need to support per-HID-I2C-device
setting and module parameter is the most user-friendly since the user
doesn't even need to know the / pair.

2. How many I2C HID devices like touchpads could be saved by this
patch in the future?

If polling could potentially save lots of I2C hid devices, we are more
motivated to make it easier to use.

--
Best regards,
Coiby

Re: [GIT PULL] PCI fixes for v5.11

2020-12-25 Thread pr-tracker-bot

The pull request you sent on Fri, 25 Dec 2020 22:11:03 -0600:

> git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git 
> tags/pci-v5.11-fixes-1

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/40f78232f97344afbbeb5b0008615f17c4b93466

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html

[GIT PULL] PCI fixes for v5.11

2020-12-25 Thread Bjorn Helgaas

The following changes since commit 255b2d524884e4ec60333131aa0ca0ef19826dc2:

  Merge branch 'remotes/lorenzo/pci/misc' (2020-12-15 15:11:14 -0600)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git 
tags/pci-v5.11-fixes-1

for you to fetch changes up to 99e629f14b471d852d28ecf554093c4730ed0927:

  PCI: dwc: Fix inverted condition of DMA mask setup warning (2020-12-25 
21:58:42 -0600)


PCI fixes:

  - Fix a tegra enumeration regression (Rob Herring)

  - Fix a designware-host check that warned on *success*, not failure
(Alexander Lobakin)


Alexander Lobakin (1):
  PCI: dwc: Fix inverted condition of DMA mask setup warning

Rob Herring (1):
  PCI: tegra: Fix host link initialization

 drivers/pci/controller/dwc/pcie-designware-host.c |  8 +---
 drivers/pci/controller/dwc/pcie-tegra194.c| 55 ---
 2 files changed, 31 insertions(+), 32 deletions(-)

Re: [PATCH v4 2/4] spi: ls7a: Add YAML schemas

2020-12-25 Thread Jiaxun Yang


在 2020/12/25 下午6:35, Qing Zhang 写道:

Switch the DT binding to a YAML schema to enable the DT validation.

Signed-off-by: Qing Zhang 
---

v4: fix warnings/errors about running 'make dt_binding_check'

---
  .../devicetree/bindings/spi/loongson,spi-ls7a.yaml | 46 ++
  1 file changed, 46 insertions(+)
  create mode 100644 
Documentation/devicetree/bindings/spi/loongson,spi-ls7a.yaml

diff --git a/Documentation/devicetree/bindings/spi/loongson,spi-ls7a.yaml 
b/Documentation/devicetree/bindings/spi/loongson,spi-ls7a.yaml
new file mode 100644
index 000..8cc9bc5
--- /dev/null
+++ b/Documentation/devicetree/bindings/spi/loongson,spi-ls7a.yaml
@@ -0,0 +1,46 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/spi/loongson,spi-ls7a.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Loongson LS7A PCH SPI Controller
+
+maintainers:
+  - Qing Zhang 
+
+description: |
+  This controller can be found in Loongson-3 systems with LS7A PCH.
+
+properties:
+  compatible:
+const: loongson,ls7a-spi
+
+  reg:
+maxItems: 1
+
+required:
+  - compatible
+  - reg
+  - num-chipselects


num-chipselects never parsed in code?

Thanks.

- Jiaxun




+
+additionalProperties: false
+
+examples:
+  - |
+pci {
+#address-cells = <3>;
+#size-cells = <2>;
+
+spi@16,0 {
+compatible = "pci0014,7a0b.0",
+"pci0014,7a0b",
+"pciclass088000",
+"pciclass0800";
+
+reg = <0xb000 0x0 0x0 0x0 0x0>;
+num-chipselects = <0>;
+};
+};
+
+...

[PATCH 2/2] arm64: mm: fix kdump broken with ZONE_DMA reintroduced

2020-12-25 Thread Chen Zhou

If the memory reserved for crash dump kernel falled in ZONE_DMA32,
the devices in crash dump kernel need to use ZONE_DMA will alloc fail.

Fix this by reserving low memory in ZONE_DMA if CONFIG_ZONE_DMA is
enabled, otherwise, reserving in ZONE_DMA32.

Fixes: bff3b04460a8 ("arm64: mm: reserve CMA and crashkernel in ZONE_DMA32")
Signed-off-by: Chen Zhou 
---
 arch/arm64/mm/init.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 7b9809e39927..5074e945f1a6 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -85,7 +85,8 @@ static void __init reserve_crashkernel(void)
 
if (crash_base == 0) {
/* Current arm64 boot protocol requires 2MB alignment */
-   crash_base = memblock_find_in_range(0, arm64_dma32_phys_limit,
+   crash_base = memblock_find_in_range(0,
+   arm64_dma_phys_limit ? : arm64_dma32_phys_limit,
crash_size, SZ_2M);
if (crash_base == 0) {
pr_warn("cannot allocate crashkernel (size:0x%llx)\n",
-- 
2.20.1

[PATCH 1/2] arm64: mm: update the comments about ZONE_DMA

2020-12-25 Thread Chen Zhou

Since patchset "arm64: Default to 32-bit wide ZONE_DMA", ZONE_DMA's
size is fine-tuned. In the absence of addressing limited masters,
ZONE_DMA will span the whole 32-bit address space, otherwise, in
the case of the Raspberry Pi 4, it'll only span the 30-bit address
space. Update the comments.

Signed-off-by: Chen Zhou 
---
 arch/arm64/mm/init.c | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 75addb36354a..7b9809e39927 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -53,10 +53,11 @@ s64 memstart_addr __ro_after_init = -1;
 EXPORT_SYMBOL(memstart_addr);
 
 /*
- * We create both ZONE_DMA and ZONE_DMA32. ZONE_DMA covers the first 1G of
- * memory as some devices, namely the Raspberry Pi 4, have peripherals with
- * this limited view of the memory. ZONE_DMA32 will cover the rest of the 32
- * bit addressable memory area.
+ * We create both ZONE_DMA and ZONE_DMA32. ZONE_DMA's size is fine-tuned.
+ * In the absence of addressing limited masters, ZONE_DMA will span the
+ * whole 32-bit address space, otherwise, in the case of the Raspberry Pi 4,
+ * it'll only span the 30-bit address space. ZONE_DMA32 will cover the rest
+ * of the 32 bit addressable memory area.
  */
 phys_addr_t arm64_dma_phys_limit __ro_after_init;
 static phys_addr_t arm64_dma32_phys_limit __ro_after_init;
-- 
2.20.1

[PATCH 0/2] arm64: mm: fix kdump broken with ZONE_DMA reintroduced

2020-12-25 Thread Chen Zhou

If the memory reserved for crash dump kernel falled in ZONE_DMA32,
the devices in crash dump kernel need to use ZONE_DMA will alloc fail.
   
Fix this by reserving low memory in ZONE_DMA if CONFIG_ZONE_DMA is
enabled, otherwise, reserving in ZONE_DMA32.

Patch 1 updates the comments about the ZONE_DMA.
Patch 2 fix kdump broken.

Chen Zhou (2):
  arm64: mm: update the comments about ZONE_DMA
  arm64: mm: fix kdump broken with ZONE_DMA reintroduced

 arch/arm64/mm/init.c | 12 +++-
 1 file changed, 7 insertions(+), 5 deletions(-)

-- 
2.20.1

ERROR: modpost: ".do_uaccess_flush" undefined!

2020-12-25 Thread kernel test robot

Hi Nicholas,

First bad commit (maybe != root cause):

tree:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 
master
head:   5814bc2d4cc241c1a603fac2b5bf1bd4daa108fc
commit: 9a32a7e78bd0cd9a9b6332cbdc345ee5ffd0c5de powerpc/64s: flush L1D after 
user accesses
date:   5 weeks ago
config: powerpc64-randconfig-s032-20201223 (attached as .config)
compiler: powerpc64-linux-gcc (GCC) 9.3.0
reproduce:
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# apt-get install sparse
# sparse version: v0.6.3-184-g1b896707-dirty
# 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=9a32a7e78bd0cd9a9b6332cbdc345ee5ffd0c5de
git remote add linus 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
git fetch --no-tags linus master
git checkout 9a32a7e78bd0cd9a9b6332cbdc345ee5ffd0c5de
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross C=1 
CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' ARCH=powerpc64 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All errors (new ones prefixed by >>, old ones prefixed by <<):

ERROR: modpost: "._dev_err" [sound/hda/snd-intel-dspcfg.ko] undefined!
ERROR: modpost: ".snd_hdac_bus_init" [sound/hda/ext/snd-hda-ext-core.ko] 
undefined!
ERROR: modpost: ".snd_hdac_bus_init_cmd_io" [sound/hda/ext/snd-hda-ext-core.ko] 
undefined!
ERROR: modpost: ".kmem_cache_alloc_trace" [sound/hda/ext/snd-hda-ext-core.ko] 
undefined!
ERROR: modpost: ".snd_hdac_device_init" [sound/hda/ext/snd-hda-ext-core.ko] 
undefined!
ERROR: modpost: ".snprintf" [sound/hda/ext/snd-hda-ext-core.ko] undefined!
ERROR: modpost: ".sscanf" [sound/hda/ext/snd-hda-ext-core.ko] undefined!
ERROR: modpost: ".put_device" [sound/hda/ext/snd-hda-ext-core.ko] undefined!
ERROR: modpost: ".snd_hdac_stream_stop" [sound/hda/ext/snd-hda-ext-core.ko] 
undefined!
ERROR: modpost: ".snd_hdac_bus_stop_chip" [sound/hda/ext/snd-hda-ext-core.ko] 
undefined!
ERROR: modpost: ".kfree" [sound/hda/ext/snd-hda-ext-core.ko] undefined!
ERROR: modpost: ".trace_hardirqs_on" [sound/hda/ext/snd-hda-ext-core.ko] 
undefined!
ERROR: modpost: ".eeh_check_failure" [sound/hda/ext/snd-hda-ext-core.ko] 
undefined!
ERROR: modpost: ".snd_hdac_stream_assign" [sound/hda/ext/snd-hda-ext-core.ko] 
undefined!
ERROR: modpost: ".snd_hdac_device_unregister" 
[sound/hda/ext/snd-hda-ext-core.ko] undefined!
ERROR: modpost: ".mutex_lock" [sound/hda/ext/snd-hda-ext-core.ko] undefined!
ERROR: modpost: ".driver_register" [sound/hda/ext/snd-hda-ext-core.ko] 
undefined!
ERROR: modpost: ".snd_hdac_device_register" [sound/hda/ext/snd-hda-ext-core.ko] 
undefined!
ERROR: modpost: ".arch_local_irq_restore" [sound/hda/ext/snd-hda-ext-core.ko] 
undefined!
ERROR: modpost: "._raw_spin_lock_irq" [sound/hda/ext/snd-hda-ext-core.ko] 
undefined!
ERROR: modpost: ".snd_hdac_stream_init" [sound/hda/ext/snd-hda-ext-core.ko] 
undefined!
ERROR: modpost: ".__list_add_valid" [sound/hda/ext/snd-hda-ext-core.ko] 
undefined!
ERROR: modpost: ".snd_hdac_stream_release" [sound/hda/ext/snd-hda-ext-core.ko] 
undefined!
ERROR: modpost: ".snd_hdac_bus_stop_cmd_io" [sound/hda/ext/snd-hda-ext-core.ko] 
undefined!
ERROR: modpost: ".udelay" [sound/hda/ext/snd-hda-ext-core.ko] undefined!
ERROR: modpost: ".driver_unregister" [sound/hda/ext/snd-hda-ext-core.ko] 
undefined!
ERROR: modpost: ".snd_hdac_device_exit" [sound/hda/ext/snd-hda-ext-core.ko] 
undefined!
ERROR: modpost: "._dev_err" [sound/hda/ext/snd-hda-ext-core.ko] undefined!
ERROR: modpost: ".snd_hdac_bus_exit" [sound/hda/ext/snd-hda-ext-core.ko] 
undefined!
ERROR: modpost: ".__list_del_entry_valid" [sound/hda/ext/snd-hda-ext-core.ko] 
undefined!
ERROR: modpost: ".__stack_chk_fail" [sound/hda/ext/snd-hda-ext-core.ko] 
undefined!
ERROR: modpost: ".mutex_unlock" [sound/hda/ext/snd-hda-ext-core.ko] undefined!
ERROR: modpost: ".ftrace_likely_update" [sound/hda/ext/snd-hda-ext-core.ko] 
undefined!
ERROR: modpost: ".event_triggers_call" [sound/hda/snd-hda-core.ko] undefined!
ERROR: modpost: ".regmap_write" [sound/hda/snd-hda-core.ko] undefined!
ERROR: modpost: ".__regmap_init" [sound/hda/snd-hda-core.ko] undefined!
ERROR: modpost: ".__warn_printk" [sound/hda/snd-hda-core.ko] undefined!
ERROR: modpost: ".kmem_cache_alloc_trace" [sound/hda/snd-hda-core.ko] undefined!
ERROR: modpost: ".device_add" [sound/hda/snd-hda-core.ko] undefined!
ERROR: modpost: ".regmap_update_bits_base" [sound/hda/snd-hda-core.ko] 
undefined!
ERROR: modpost: ".snprintf" [sound/hda/snd-hda-core.ko] undefined!
ERROR: modpost: ".__mutex_init" [sound/hda/snd-hda-core.ko] undefined!
ERROR: modpost: ".__pm_runtime_set_status" [sound/hda/snd-hda-core.ko] 
undefined!
ERROR: modpost: "._copy_to_user" [sound/hda/snd-hda-core.ko] undefined!
ERROR: modpost: ".regcache_sync" [sound/hda/snd-hda-core.ko] undefi

Re: [PATCH 1/1] ARM: LPAE: use phys_addr_t instead of unsigned long in outercache hooks

2020-12-25 Thread Leizhen (ThunderTown)




On 2020/12/25 19:44, Zhen Lei wrote:
> The outercache of some Hisilicon SOCs support physical addresses wider
> than 32-bits. The unsigned long datatype is not sufficient for mapping
> physical addresses >= 4GB. The commit ad6b9c9d78b9 ("ARM: 6671/1: LPAE:
> use phys_addr_t instead of unsigned long in outercache functions") has
> already modified the outercache functions. But the parameters of the
> outercache hooks are not changed. This patch use phys_addr_t instead of
> unsigned long in outercache hooks: inv_range, clean_range, flush_range.
> 
> To ensure the outercache that does not support LPAE works properly, do
> cast phys_addr_t to unsigned long by adding a middle-tier function.
> For example:
> -static void l2c220_inv_range(unsigned long start, unsigned long end)
> +static void __l2c220_inv_range(unsigned long start, unsigned long end)
>  {
>   ...
>  }
> +static void l2c220_inv_range(phys_addr_t start, phys_addr_t end)
> +{
> +  __l2c220_inv_range(start, end);
> +}
> 
> Note that the outercache functions have been doing this cast before this
> patch. So now, the cast is just moved to the middle-tier function.
> 
> No functional change.

This patch will impact the outercache drivers that have not been merged into
the kernel. They should also update the datatype of the outercache hooks.

Another compatible solution is to add three new outercache hooks, as follows:

diff --git a/arch/arm/include/asm/outercache.h 
b/arch/arm/include/asm/outercache.h
index 3364637755e86aa..83344d0428fa5b6 100644
--- a/arch/arm/include/asm/outercache.h
+++ b/arch/arm/include/asm/outercache.h
@@ -17,6 +17,9 @@ struct outer_cache_fns {
 void (*inv_range)(unsigned long, unsigned long);
 void (*clean_range)(unsigned long, unsigned long);
 void (*flush_range)(unsigned long, unsigned long);
+  void (*lpae_inv_range)(phys_addr_t, phys_addr_t);
+  void (*lpae_clean_range)(phys_addr_t, phys_addr_t);
+  void (*lpae_flush_range)(phys_addr_t, phys_addr_t);
 void (*flush_all)(void);
 void (*disable)(void);
 #ifdef CONFIG_OUTER_CACHE_SYNC
@@ -41,6 +44,8 @@ static inline void outer_inv_range(phys_addr_t start, 
phys_addr_t end)
 {
 if (outer_cache.inv_range)
 outer_cache.inv_range(start, end);
+  else if (outer_cache.lpae_inv_range)
+  outer_cache.lpae_inv_range(start, end);
 }

 /**
@@ -52,6 +57,8 @@ static inline void outer_clean_range(phys_addr_t start, 
phys_addr_t end)
 {
 if (outer_cache.clean_range)
 outer_cache.clean_range(start, end);
+  else if (outer_cache.lpae_clean_range)
+  outer_cache.lpae_clean_range(start, end);
 }

 /**
@@ -63,6 +70,8 @@ static inline void outer_flush_range(phys_addr_t start, 
phys_addr_t end)
 {
 if (outer_cache.flush_range)
 outer_cache.flush_range(start, end);
+  else if (outer_cache.lpae_flush_range)
+  outer_cache.lpae_flush_range(start, end);
 }

 /**



> 
> Signed-off-by: Zhen Lei 
> ---
>  arch/arm/include/asm/outercache.h |  6 +--
>  arch/arm/mm/cache-feroceon-l2.c   | 21 --
>  arch/arm/mm/cache-l2x0.c  | 83 
> ---
>  arch/arm/mm/cache-tauros2.c   | 21 --
>  arch/arm/mm/cache-uniphier.c  |  6 +--
>  arch/arm/mm/cache-xsc3l2.c| 21 --
>  6 files changed, 129 insertions(+), 29 deletions(-)
> 
> diff --git a/arch/arm/include/asm/outercache.h 
> b/arch/arm/include/asm/outercache.h
> index 3364637755e86aa..4cee1ea0c15449a 100644
> --- a/arch/arm/include/asm/outercache.h
> +++ b/arch/arm/include/asm/outercache.h
> @@ -14,9 +14,9 @@
>  struct l2x0_regs;
>  
>  struct outer_cache_fns {
> - void (*inv_range)(unsigned long, unsigned long);
> - void (*clean_range)(unsigned long, unsigned long);
> - void (*flush_range)(unsigned long, unsigned long);
> + void (*inv_range)(phys_addr_t, phys_addr_t);
> + void (*clean_range)(phys_addr_t, phys_addr_t);
> + void (*flush_range)(phys_addr_t, phys_addr_t);
>   void (*flush_all)(void);
>   void (*disable)(void);
>  #ifdef CONFIG_OUTER_CACHE_SYNC
> diff --git a/arch/arm/mm/cache-feroceon-l2.c b/arch/arm/mm/cache-feroceon-l2.c
> index 5c1b7a7b9af6300..ab1d8051bf832c9 100644
> --- a/arch/arm/mm/cache-feroceon-l2.c
> +++ b/arch/arm/mm/cache-feroceon-l2.c
> @@ -168,7 +168,7 @@ static unsigned long calc_range_end(unsigned long start, 
> unsigned long end)
>   return range_end;
>  }
>  
> -static void feroceon_l2_inv_range(unsigned long start, unsigned long end)
> +static void __feroceon_l2_inv_range(unsigned long start, unsigned long end)
>  {
>   /*
>* Clean and invalidate partial first cache line.
> @@ -198,7 +198,12 @@ static void feroceon_l2_inv_range(unsigned long start, 
> unsigned long end)
>   dsb();
>  }
>  
> -static void feroceon_l2_clean_range(unsigned long start, unsigned long end)
> +static void feroceon_l2_inv_range(phys_addr_t start, phys_addr_t end)
> +{
> + __feroceon_l2_in

[PATCH -tip V3 1/8] workqueue: use cpu_possible_mask instead of cpu_active_mask to break affinity

2020-12-25 Thread Lai Jiangshan

From: Lai Jiangshan 

The scheduler won't break affinity for us any more, and we should
"emulate" the same behavior when the scheduler breaks affinity for
us.  The behavior is "changing the cpumask to cpu_possible_mask".

And there might be some other CPUs online later while the worker is
still running with the pending work items.  The worker should be allowed
to use the later online CPUs as before and process the work items ASAP.
If we use cpu_active_mask here, we can't achieve this goal but
using cpu_possible_mask can.

Fixes: 06249738a41a ("workqueue: Manually break affinity on hotplug")
Acked-by: Tejun Heo 
Signed-off-by: Lai Jiangshan 
---
 kernel/workqueue.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index c71da2a59e12..f2b8f3d458d1 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -4910,7 +4910,7 @@ static void unbind_workers(int cpu)
raw_spin_unlock_irq(&pool->lock);
 
for_each_pool_worker(worker, pool)
-   WARN_ON_ONCE(set_cpus_allowed_ptr(worker->task, 
cpu_active_mask) < 0);
+   WARN_ON_ONCE(set_cpus_allowed_ptr(worker->task, 
cpu_possible_mask) < 0);
 
mutex_unlock(&wq_pool_attach_mutex);
 
-- 
2.19.1.6.gb485710b

[PATCH -tip V3 7/8] workqueue: reorganize workqueue_offline_cpu() unbind_workers()

2020-12-25 Thread Lai Jiangshan

From: Lai Jiangshan 

Just move around the code, no functionality changed.
Only wq_pool_attach_mutex protected region becomes a little larger.

It prepares for later patch protecting wq_online_cpumask
in wq_pool_attach_mutex.

Acked-by: Tejun Heo 
Signed-off-by: Lai Jiangshan 
---
 kernel/workqueue.c | 90 +++---
 1 file changed, 45 insertions(+), 45 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 94545e6feda5..dd32398edf55 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -4896,61 +4896,57 @@ void wq_worker_comm(char *buf, size_t size, struct 
task_struct *task)
  * cpu comes back online.
  */
 
-static void unbind_workers(int cpu)
+static void unbind_workers(struct worker_pool *pool)
 {
-   struct worker_pool *pool;
struct worker *worker;
 
-   for_each_cpu_worker_pool(pool, cpu) {
-   mutex_lock(&wq_pool_attach_mutex);
-   raw_spin_lock_irq(&pool->lock);
+   lockdep_assert_held(&wq_pool_attach_mutex);
 
-   /*
-* We've blocked all attach/detach operations. Make all workers
-* unbound and set DISASSOCIATED.  Before this, all workers
-* except for the ones which are still executing works from
-* before the last CPU down must be on the cpu.  After
-* this, they may become diasporas.
-*/
-   for_each_pool_worker(worker, pool)
-   worker->flags |= WORKER_UNBOUND;
+   raw_spin_lock_irq(&pool->lock);
 
-   pool->flags |= POOL_DISASSOCIATED;
+   /*
+* We've blocked all attach/detach operations. Make all workers
+* unbound and set DISASSOCIATED.  Before this, all workers
+* except for the ones which are still executing works from
+* before the last CPU down must be on the cpu.  After
+* this, they may become diasporas.
+*/
+   for_each_pool_worker(worker, pool)
+   worker->flags |= WORKER_UNBOUND;
 
-   raw_spin_unlock_irq(&pool->lock);
+   pool->flags |= POOL_DISASSOCIATED;
 
-   for_each_pool_worker(worker, pool)
-   WARN_ON_ONCE(set_cpus_allowed_ptr(worker->task, 
cpu_possible_mask) < 0);
+   raw_spin_unlock_irq(&pool->lock);
 
-   mutex_unlock(&wq_pool_attach_mutex);
+   for_each_pool_worker(worker, pool)
+   WARN_ON_ONCE(set_cpus_allowed_ptr(worker->task, 
cpu_possible_mask) < 0);
 
-   /*
-* Call schedule() so that we cross rq->lock and thus can
-* guarantee sched callbacks see the %WORKER_UNBOUND flag.
-* This is necessary as scheduler callbacks may be invoked
-* from other cpus.
-*/
-   schedule();
+   /*
+* Call schedule() so that we cross rq->lock and thus can
+* guarantee sched callbacks see the %WORKER_UNBOUND flag.
+* This is necessary as scheduler callbacks may be invoked
+* from other cpus.
+*/
+   schedule();
 
-   /*
-* Sched callbacks are disabled now.  Zap nr_running.
-* After this, nr_running stays zero and need_more_worker()
-* and keep_working() are always true as long as the
-* worklist is not empty.  This pool now behaves as an
-* unbound (in terms of concurrency management) pool which
-* are served by workers tied to the pool.
-*/
-   atomic_set(&pool->nr_running, 0);
+   /*
+* Sched callbacks are disabled now.  Zap nr_running.
+* After this, nr_running stays zero and need_more_worker()
+* and keep_working() are always true as long as the
+* worklist is not empty.  This pool now behaves as an
+* unbound (in terms of concurrency management) pool which
+* are served by workers tied to the pool.
+*/
+   atomic_set(&pool->nr_running, 0);
 
-   /*
-* With concurrency management just turned off, a busy
-* worker blocking could lead to lengthy stalls.  Kick off
-* unbound chain execution of currently pending work items.
-*/
-   raw_spin_lock_irq(&pool->lock);
-   wake_up_worker(pool);
-   raw_spin_unlock_irq(&pool->lock);
-   }
+   /*
+* With concurrency management just turned off, a busy
+* worker blocking could lead to lengthy stalls.  Kick off
+* unbound chain execution of currently pending work items.
+*/
+   raw_spin_lock_irq(&pool->lock);
+   wake_up_worker(pool);
+   raw_spin_unlock_irq(&pool->lock);
 }
 
 /**
@@ -5122,7 +5118,11 @@ int workqueue_offline_cpu(unsigned int cpu)
if (WARN_ON(cpu != smp_processor_id()))
return -1;
 
-   unbind_w

[PATCH -tip V3 6/8] workqueue: reorganize workqueue_online_cpu()

2020-12-25 Thread Lai Jiangshan

From: Lai Jiangshan 

Just move around the code, no functionality changed.

It prepares for later patch protecting wq_online_cpumask
in wq_pool_attach_mutex.

Acked-by: Tejun Heo 
Signed-off-by: Lai Jiangshan 
---
 kernel/workqueue.c | 11 ---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 79cc87df0cda..94545e6feda5 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -5088,12 +5088,17 @@ int workqueue_online_cpu(unsigned int cpu)
mutex_lock(&wq_pool_mutex);
cpumask_set_cpu(cpu, wq_online_cpumask);
 
+   for_each_cpu_worker_pool(pool, cpu) {
+   mutex_lock(&wq_pool_attach_mutex);
+   rebind_workers(pool);
+   mutex_unlock(&wq_pool_attach_mutex);
+   }
+
+   /* update CPU affinity of workers of unbound pools */
for_each_pool(pool, pi) {
mutex_lock(&wq_pool_attach_mutex);
 
-   if (pool->cpu == cpu)
-   rebind_workers(pool);
-   else if (pool->cpu < 0)
+   if (pool->cpu < 0)
update_unbound_workers_cpumask(pool, true, cpu);
 
mutex_unlock(&wq_pool_attach_mutex);
-- 
2.19.1.6.gb485710b

[PATCH -tip V3 5/8] workqueue: Manually break affinity on hotplug for unbound pool

2020-12-25 Thread Lai Jiangshan

From: Lai Jiangshan 

There is possible that a per-node pool/woker's affinity is a single
CPU.  It can happen when the workqueue user changes the cpumask of the
workqueue or when wq_unbound_cpumask is changed by system adim via
/sys/devices/virtual/workqueue/cpumask.  And pool->attrs->cpumask
is workqueue's cpumask & wq_unbound_cpumask & possible_cpumask_of_the_node,
which can be a single CPU and makes the pool's workers to be "per cpu
kthread".

And it can also happen when the cpu is the first online and has been
the only online cpu in pool->attrs->cpumask.  In this case, the worker
task cpumask is single cpu no matter what pool->attrs->cpumask since
commit d945b5e9f0e3 ("workqueue: Fix setting affinity of unbound worker
threads").

And the scheduler won't break affinity on the "per cpu kthread" workers
when the CPU is going down, so we have to do it by our own.

We do it by reusing existing restore_unbound_workers_cpumask() and rename
it to update_unbound_workers_cpumask().  When the number of the online
CPU of the pool goes from 1 to 0, we break the affinity initiatively.

Note here, we even break the affinity for non-per-cpu-kthread workers,
because first, the code path is slow path which is not worth too much to
optimize, second, we don't need to rely on the code/conditions when the
scheduler forces breaking affinity for us.

The way to break affinity is to set the workers' affinity to
cpu_possible_mask, so that we preserve the same behavisor when
the scheduler breaks affinity for us.

Fixes: 06249738a41a ("workqueue: Manually break affinity on hotplug")
Acked-by: Tejun Heo 
Signed-off-by: Lai Jiangshan 
---
 kernel/workqueue.c | 48 ++
 1 file changed, 40 insertions(+), 8 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 0a95ae14d46f..79cc87df0cda 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -5019,16 +5019,18 @@ static void rebind_workers(struct worker_pool *pool)
 }
 
 /**
- * restore_unbound_workers_cpumask - restore cpumask of unbound workers
+ * update_unbound_workers_cpumask - update cpumask of unbound workers
  * @pool: unbound pool of interest
- * @cpu: the CPU which is coming up
+ * @online: whether @cpu is coming up or going down
+ * @cpu: the CPU which is coming up or going down
  *
  * An unbound pool may end up with a cpumask which doesn't have any online
- * CPUs.  When a worker of such pool get scheduled, the scheduler resets
- * its cpus_allowed.  If @cpu is in @pool's cpumask which didn't have any
- * online CPU before, cpus_allowed of all its workers should be restored.
+ * CPUs.  We have to reset workers' cpus_allowed of such pool.  And we
+ * restore the workers' cpus_allowed when the pool's cpumask has online
+ * CPU.
  */
-static void restore_unbound_workers_cpumask(struct worker_pool *pool, int cpu)
+static void update_unbound_workers_cpumask(struct worker_pool *pool,
+  bool online, int cpu)
 {
static cpumask_t cpumask;
struct worker *worker;
@@ -5042,6 +5044,23 @@ static void restore_unbound_workers_cpumask(struct 
worker_pool *pool, int cpu)
 
cpumask_and(&cpumask, pool->attrs->cpumask, wq_online_cpumask);
 
+   if (!online) {
+   if (cpumask_weight(&cpumask) > 0)
+   return;
+   /*
+* All unbound workers can be possibly "per cpu kthread"
+* if this is the only online CPU in pool->attrs->cpumask
+* from the last time it has been brought up until now.
+* And the scheduler won't break affinity on the "per cpu
+* kthread" workers when the CPU is going down, so we have
+* to do it by our own.
+*/
+   for_each_pool_worker(worker, pool)
+   WARN_ON_ONCE(set_cpus_allowed_ptr(worker->task, 
cpu_possible_mask) < 0);
+
+   return;
+   }
+
/* as we're called from CPU_ONLINE, the following shouldn't fail */
for_each_pool_worker(worker, pool)
WARN_ON_ONCE(set_cpus_allowed_ptr(worker->task, &cpumask) < 0);
@@ -5075,7 +5094,7 @@ int workqueue_online_cpu(unsigned int cpu)
if (pool->cpu == cpu)
rebind_workers(pool);
else if (pool->cpu < 0)
-   restore_unbound_workers_cpumask(pool, cpu);
+   update_unbound_workers_cpumask(pool, true, cpu);
 
mutex_unlock(&wq_pool_attach_mutex);
}
@@ -5090,7 +5109,9 @@ int workqueue_online_cpu(unsigned int cpu)
 
 int workqueue_offline_cpu(unsigned int cpu)
 {
+   struct worker_pool *pool;
struct workqueue_struct *wq;
+   int pi;
 
/* unbinding per-cpu workers should happen on the local CPU */
if (WARN_ON(cpu != smp_processor_id()))
@@ -5098,9 +5119,20 @@ int workqueue_offline_cpu(unsigned int cpu)
 
unbind_workers(cpu);
 
-

[PATCH -tip V3 8/8] workqueue: Fix affinity of kworkers when attaching into pool

2020-12-25 Thread Lai Jiangshan

From: Lai Jiangshan 

When worker_attach_to_pool() is called, we should not put the workers
to pool->attrs->cpumask when there is not CPU online in it.

We have to use wq_online_cpumask in worker_attach_to_pool() to check
if pool->attrs->cpumask is valid rather than cpu_online_mask or
cpu_active_mask due to gaps between stages in cpu hot[un]plug.

So for that late-spawned per-CPU kworker case: the outgoing CPU should have
already been cleared from wq_online_cpumask, so it gets its affinity reset
to the possible mask and the subsequent wakeup will ensure it's put on an
active CPU.

To use wq_online_cpumask in worker_attach_to_pool(), we need to protect
wq_online_cpumask in wq_pool_attach_mutex and we modify workqueue_online_cpu()
and workqueue_offline_cpu() to enlarge wq_pool_attach_mutex protected
region. We also put updating wq_online_cpumask and [re|un]bind_workers()
in the same wq_pool_attach_mutex protected region to make the update
for percpu workqueue atomically.

Cc: Qian Cai 
Cc: Peter Zijlstra 
Cc: Vincent Donnefort 
Link: 
https://lore.kernel.org/lkml/20201210163830.21514-3-valentin.schnei...@arm.com/
Reviewed-by: Valentin Schneider 
Acked-by: Tejun Heo 
Signed-off-by: Lai Jiangshan 
---
 kernel/workqueue.c | 32 +++-
 1 file changed, 15 insertions(+), 17 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index dd32398edf55..25d50050257c 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -310,7 +310,7 @@ static bool workqueue_freezing; /* PL: have wqs 
started freezing? */
 /* PL: allowable cpus for unbound wqs and work items */
 static cpumask_var_t wq_unbound_cpumask;
 
-/* PL: online cpus (cpu_online_mask with the going-down cpu cleared) */
+/* PL&A: online cpus (cpu_online_mask with the going-down cpu cleared) */
 static cpumask_var_t wq_online_cpumask;
 
 /* CPU where unbound work was last round robin scheduled from this CPU */
@@ -1848,11 +1848,11 @@ static void worker_attach_to_pool(struct worker *worker,
 {
mutex_lock(&wq_pool_attach_mutex);
 
-   /*
-* set_cpus_allowed_ptr() will fail if the cpumask doesn't have any
-* online CPUs.  It'll be re-applied when any of the CPUs come up.
-*/
-   set_cpus_allowed_ptr(worker->task, pool->attrs->cpumask);
+   /* Is there any cpu in pool->attrs->cpumask online? */
+   if (cpumask_intersects(pool->attrs->cpumask, wq_online_cpumask))
+   WARN_ON_ONCE(set_cpus_allowed_ptr(worker->task, 
pool->attrs->cpumask) < 0);
+   else
+   WARN_ON_ONCE(set_cpus_allowed_ptr(worker->task, 
cpu_possible_mask) < 0);
 
/*
 * The wq_pool_attach_mutex ensures %POOL_DISASSOCIATED remains
@@ -5082,13 +5082,12 @@ int workqueue_online_cpu(unsigned int cpu)
int pi;
 
mutex_lock(&wq_pool_mutex);
-   cpumask_set_cpu(cpu, wq_online_cpumask);
 
-   for_each_cpu_worker_pool(pool, cpu) {
-   mutex_lock(&wq_pool_attach_mutex);
+   mutex_lock(&wq_pool_attach_mutex);
+   cpumask_set_cpu(cpu, wq_online_cpumask);
+   for_each_cpu_worker_pool(pool, cpu)
rebind_workers(pool);
-   mutex_unlock(&wq_pool_attach_mutex);
-   }
+   mutex_unlock(&wq_pool_attach_mutex);
 
/* update CPU affinity of workers of unbound pools */
for_each_pool(pool, pi) {
@@ -5118,14 +5117,13 @@ int workqueue_offline_cpu(unsigned int cpu)
if (WARN_ON(cpu != smp_processor_id()))
return -1;
 
-   for_each_cpu_worker_pool(pool, cpu) {
-   mutex_lock(&wq_pool_attach_mutex);
-   unbind_workers(pool);
-   mutex_unlock(&wq_pool_attach_mutex);
-   }
-
mutex_lock(&wq_pool_mutex);
+
+   mutex_lock(&wq_pool_attach_mutex);
cpumask_clear_cpu(cpu, wq_online_cpumask);
+   for_each_cpu_worker_pool(pool, cpu)
+   unbind_workers(pool);
+   mutex_unlock(&wq_pool_attach_mutex);
 
/* update CPU affinity of workers of unbound pools */
for_each_pool(pool, pi) {
-- 
2.19.1.6.gb485710b

[PATCH -tip V3 3/8] workqueue: introduce wq_online_cpumask

2020-12-25 Thread Lai Jiangshan

From: Lai Jiangshan 

wq_online_cpumask is the cached result of cpu_online_mask with the
going-down cpu cleared.  It is needed for later patches for setting
correct cpumask for workers and break affinity initiatively.

The first usage of wq_online_cpumask is also in this patch.
wq_calc_node_cpumask() and wq_update_unbound_numa() can be simplified
a little.

Acked-by: Tejun Heo 
Signed-off-by: Lai Jiangshan 
---
 kernel/workqueue.c | 34 ++
 1 file changed, 18 insertions(+), 16 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index ccbceacaea1b..6f75f7ebeb17 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -310,6 +310,9 @@ static bool workqueue_freezing; /* PL: have wqs 
started freezing? */
 /* PL: allowable cpus for unbound wqs and work items */
 static cpumask_var_t wq_unbound_cpumask;
 
+/* PL: online cpus (cpu_online_mask with the going-down cpu cleared) */
+static cpumask_var_t wq_online_cpumask;
+
 /* CPU where unbound work was last round robin scheduled from this CPU */
 static DEFINE_PER_CPU(int, wq_rr_cpu_last);
 
@@ -3825,12 +3828,10 @@ static struct pool_workqueue *alloc_unbound_pwq(struct 
workqueue_struct *wq,
  * wq_calc_node_cpumask - calculate a wq_attrs' cpumask for the specified node
  * @attrs: the wq_attrs of the default pwq of the target workqueue
  * @node: the target NUMA node
- * @cpu_going_down: if >= 0, the CPU to consider as offline
  * @cpumask: outarg, the resulting cpumask
  *
- * Calculate the cpumask a workqueue with @attrs should use on @node.  If
- * @cpu_going_down is >= 0, that cpu is considered offline during
- * calculation.  The result is stored in @cpumask.
+ * Calculate the cpumask a workqueue with @attrs should use on @node.
+ * The result is stored in @cpumask.
  *
  * If NUMA affinity is not enabled, @attrs->cpumask is always used.  If
  * enabled and @node has online CPUs requested by @attrs, the returned
@@ -3844,15 +3845,14 @@ static struct pool_workqueue *alloc_unbound_pwq(struct 
workqueue_struct *wq,
  * %false if equal.
  */
 static bool wq_calc_node_cpumask(const struct workqueue_attrs *attrs, int node,
-int cpu_going_down, cpumask_t *cpumask)
+cpumask_t *cpumask)
 {
if (!wq_numa_enabled || attrs->no_numa)
goto use_dfl;
 
/* does @node have any online CPUs @attrs wants? */
cpumask_and(cpumask, cpumask_of_node(node), attrs->cpumask);
-   if (cpu_going_down >= 0)
-   cpumask_clear_cpu(cpu_going_down, cpumask);
+   cpumask_and(cpumask, cpumask, wq_online_cpumask);
 
if (cpumask_empty(cpumask))
goto use_dfl;
@@ -3961,7 +3961,7 @@ apply_wqattrs_prepare(struct workqueue_struct *wq,
goto out_free;
 
for_each_node(node) {
-   if (wq_calc_node_cpumask(new_attrs, node, -1, 
tmp_attrs->cpumask)) {
+   if (wq_calc_node_cpumask(new_attrs, node, tmp_attrs->cpumask)) {
ctx->pwq_tbl[node] = alloc_unbound_pwq(wq, tmp_attrs);
if (!ctx->pwq_tbl[node])
goto out_free;
@@ -4086,7 +4086,6 @@ int apply_workqueue_attrs(struct workqueue_struct *wq,
  * wq_update_unbound_numa - update NUMA affinity of a wq for CPU hot[un]plug
  * @wq: the target workqueue
  * @cpu: the CPU coming up or going down
- * @online: whether @cpu is coming up or going down
  *
  * This function is to be called from %CPU_DOWN_PREPARE, %CPU_ONLINE and
  * %CPU_DOWN_FAILED.  @cpu is being hot[un]plugged, update NUMA affinity of
@@ -4104,11 +4103,9 @@ int apply_workqueue_attrs(struct workqueue_struct *wq,
  * affinity, it's the user's responsibility to flush the work item from
  * CPU_DOWN_PREPARE.
  */
-static void wq_update_unbound_numa(struct workqueue_struct *wq, int cpu,
-  bool online)
+static void wq_update_unbound_numa(struct workqueue_struct *wq, int cpu)
 {
int node = cpu_to_node(cpu);
-   int cpu_off = online ? -1 : cpu;
struct pool_workqueue *old_pwq = NULL, *pwq;
struct workqueue_attrs *target_attrs;
cpumask_t *cpumask;
@@ -4136,7 +4133,7 @@ static void wq_update_unbound_numa(struct 
workqueue_struct *wq, int cpu,
 * and create a new one if they don't match.  If the target cpumask
 * equals the default pwq's, the default pwq should be used.
 */
-   if (wq_calc_node_cpumask(wq->dfl_pwq->pool->attrs, node, cpu_off, 
cpumask)) {
+   if (wq_calc_node_cpumask(wq->dfl_pwq->pool->attrs, node, cpumask)) {
if (cpumask_equal(cpumask, pwq->pool->attrs->cpumask))
return;
} else {
@@ -5069,6 +5066,7 @@ int workqueue_online_cpu(unsigned int cpu)
int pi;
 
mutex_lock(&wq_pool_mutex);
+   cpumask_set_cpu(cpu, wq_online_cpumask);
 
for_each_pool(pool, pi) {
mutex_lock(&wq_pool_attach_

[PATCH -tip V3 2/8] workqueue: Manually break affinity on pool detachment

2020-12-25 Thread Lai Jiangshan

From: Lai Jiangshan 

The pool->attrs->cpumask might be a single CPU and it may go
down after detachment, and the scheduler won't force to break
affinity for us since it is a per-cpu-ktrhead.  So we have to
do it on our own and unbind this worker which can't be unbound
by workqueue_offline_cpu() since it doesn't belong to any pool
after detachment.  Do it unconditionally for there is no harm
to break affinity for non-per-cpu-ktrhead and we don't need to
rely on the scheduler's policy on when to break affinity.

Fixes: 06249738a41a ("workqueue: Manually break affinity on hotplug")
Acked-by: Tejun Heo 
Signed-off-by: Lai Jiangshan 
---
 kernel/workqueue.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index f2b8f3d458d1..ccbceacaea1b 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -1885,6 +1885,19 @@ static void worker_detach_from_pool(struct worker 
*worker)
 
if (list_empty(&pool->workers))
detach_completion = pool->detach_completion;
+
+   /*
+* The pool->attrs->cpumask might be a single CPU and it may go
+* down after detachment, and the scheduler won't force to break
+* affinity for us since it is a per-cpu-ktrhead.  So we have to
+* do it on our own and unbind this worker which can't be unbound
+* by workqueue_offline_cpu() since it doesn't belong to any pool
+* after detachment.  Do it unconditionally for there is no harm
+* to break affinity for non-per-cpu-ktrhead and we don't need to
+* rely on the scheduler's policy on when to break affinity.
+*/
+   set_cpus_allowed_ptr(worker->task, cpu_possible_mask);
+
mutex_unlock(&wq_pool_attach_mutex);
 
/* clear leftover flags without pool->lock after it is detached */
-- 
2.19.1.6.gb485710b

[PATCH -tip V3 4/8] workqueue: use wq_online_cpumask in restore_unbound_workers_cpumask()

2020-12-25 Thread Lai Jiangshan

From: Lai Jiangshan 

restore_unbound_workers_cpumask() is called when CPU_ONLINE, where
wq_online_cpumask equals to cpu_online_mask. So no fucntionality
changed.

Acked-by: Tejun Heo 
Signed-off-by: Lai Jiangshan 
---
 kernel/workqueue.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 6f75f7ebeb17..0a95ae14d46f 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -5033,13 +5033,14 @@ static void restore_unbound_workers_cpumask(struct 
worker_pool *pool, int cpu)
static cpumask_t cpumask;
struct worker *worker;
 
+   lockdep_assert_held(&wq_pool_mutex);
lockdep_assert_held(&wq_pool_attach_mutex);
 
/* is @cpu allowed for @pool? */
if (!cpumask_test_cpu(cpu, pool->attrs->cpumask))
return;
 
-   cpumask_and(&cpumask, pool->attrs->cpumask, cpu_online_mask);
+   cpumask_and(&cpumask, pool->attrs->cpumask, wq_online_cpumask);
 
/* as we're called from CPU_ONLINE, the following shouldn't fail */
for_each_pool_worker(worker, pool)
-- 
2.19.1.6.gb485710b

[PATCH -tip V3 0/8] workqueue: break affinity initiatively

2020-12-25 Thread Lai Jiangshan

From: Lai Jiangshan 

06249738a41a ("workqueue: Manually break affinity on hotplug")
said that scheduler will not force break affinity for us.

But workqueue highly depends on the old behavior. Many parts of the codes
relies on it, 06249738a41a ("workqueue: Manually break affinity on hotplug")
is not enough to change it, and the commit has flaws in itself too.

It doesn't handle for worker detachment.
It doesn't handle for worker attachement, especially worker creation
  which is handled by Valentin Schneider's patch [1].
It doesn't handle for unbound workers which might be possible
per-cpu-kthread.

We need to thoroughly update the way workqueue handles affinity
in cpu hot[un]plug, what is this patchset intends to do and
replace the Valentin Schneider's patch [1].  The equivalent patch
is patch 10.

The patchset is based on tip/master rather than workqueue tree,
because the patchset is a complement for 06249738a41a ("workqueue:
Manually break affinity on hotplug") which is only in tip/master by now.

And TJ acked to route the series through tip.

Changed from V2:
Drop V2's patch4, which causes warning about setting cpumask
online&!active to kthread reported by several people:
Dexuan Cui 
kernel test robot 

Drop V2's patch 1, which can also cause warning about setting
cpumask online&!active to kthread.  restore_unbound_workers_cpumask()
is changed when we are bring cpu online.  And it cause V2's patch7
(V3's patch5) to be changed accordingly.

Marked patch8 Reviewed-by: Valentin Schneider 


Changed from V1:
Add TJ's acked-by for the whole patchset

Add more words to the comments and the changelog, mainly derived
from discussion with Peter.

Update the comments as TJ suggested.

Update a line of code as Valentin suggested.

Add Valentin's ack for patch 10 because "Seems alright to me." and
add Valentin's comments to the changelog which is integral.

[1]: 
https://lore.kernel.org/r/ff62e3ee994efb3620177bf7b19fab16f4866845.ca...@redhat.com
[V1 patchset]: 
https://lore.kernel.org/lkml/20201214155457.3430-1-jiangshan...@gmail.com/
[V2 patchset]: 
https://lore.kernel.org/lkml/20201218170919.2950-1-jiangshan...@gmail.com/

Lai Jiangshan (8):
  workqueue: use cpu_possible_mask instead of cpu_active_mask to break
affinity
  workqueue: Manually break affinity on pool detachment
  workqueue: introduce wq_online_cpumask
  workqueue: use wq_online_cpumask in restore_unbound_workers_cpumask()
  workqueue: Manually break affinity on hotplug for unbound pool
  workqueue: reorganize workqueue_online_cpu()
  workqueue: reorganize workqueue_offline_cpu() unbind_workers()
  workqueue: Fix affinity of kworkers when attaching into pool

 kernel/workqueue.c | 207 -
 1 file changed, 129 insertions(+), 78 deletions(-)

-- 
2.19.1.6.gb485710b

[PATCH] mfd: ab8500-debugfs: Remove extraneous curly brace

2020-12-25 Thread Nathan Chancellor

Clang errors:

../drivers/mfd/ab8500-debugfs.c:1526:2: error: non-void function does
not return a value [-Werror,-Wreturn-type]
}
^
../drivers/mfd/ab8500-debugfs.c:1528:2: error: expected identifier or '('
return 0;
^
../drivers/mfd/ab8500-debugfs.c:1529:1: error: extraneous closing brace ('}')
}
^
3 errors generated.

The cleanup in ab8500_interrupts_show left a curly brace around, remove
it to fix the error.

Fixes: 886c8121659d ("mfd: ab8500-debugfs: Remove the racy fiddling with 
irq_desc")
Signed-off-by: Nathan Chancellor 
---
 drivers/mfd/ab8500-debugfs.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/mfd/ab8500-debugfs.c b/drivers/mfd/ab8500-debugfs.c
index a32039366a93..1cf84562351e 100644
--- a/drivers/mfd/ab8500-debugfs.c
+++ b/drivers/mfd/ab8500-debugfs.c
@@ -1521,7 +1521,6 @@ static int ab8500_interrupts_show(struct seq_file *s, 
void *p)
   line + irq_first,
   num_interrupts[line],
   num_wake_interrupts[line]);
-   }
seq_putc(s, '\n');
}
 

base-commit: 61d791365b72a89062fbbea69aa61479476da946
-- 
2.30.0.rc1

Re: [PATCH v3 4/5] RISC-V: Enable Microchip PolarFire ICICLE SoC

2020-12-25 Thread Aurelien Jarno

On 2020-12-04 00:58, Atish Patra wrote:
> Enable Microchip PolarFire ICICLE soc config in defconfig.
> It allows the default upstream kernel to boot on PolarFire ICICLE board.
> 
> Signed-off-by: Atish Patra 
> Reviewed-by: Anup Patel 
> Reviewed-by: Bin Meng 
> ---
>  arch/riscv/configs/defconfig | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/arch/riscv/configs/defconfig b/arch/riscv/configs/defconfig
> index d222d353d86d..2660fa05451e 100644
> --- a/arch/riscv/configs/defconfig
> +++ b/arch/riscv/configs/defconfig
> @@ -16,6 +16,7 @@ CONFIG_EXPERT=y
>  CONFIG_BPF_SYSCALL=y
>  CONFIG_SOC_SIFIVE=y
>  CONFIG_SOC_VIRT=y
> +CONFIG_SOC_MICROCHIP_POLARFIRE=y
>  CONFIG_SMP=y
>  CONFIG_JUMP_LABEL=y
>  CONFIG_MODULES=y
> @@ -79,6 +80,9 @@ CONFIG_USB_OHCI_HCD=y
>  CONFIG_USB_OHCI_HCD_PLATFORM=y
>  CONFIG_USB_STORAGE=y
>  CONFIG_USB_UAS=y
> +CONFIG_SDHCI=y

I guess this should be CONFIG_MMC_SDHCI=y

> +CONFIG_MMC_SDHCI_PLTFM=y
> +CONFIG_MMC_SDHCI_CADENCE=y
>  CONFIG_MMC=y
>  CONFIG_MMC_SPI=y
>  CONFIG_RTC_CLASS=y

Regards,
Aurelien

-- 
Aurelien Jarno  GPG: 4096R/1DDD8C9B
aurel...@aurel32.net http://www.aurel32.net

Re: [PATCH 1/7] clk: qcom: clk-alpha-pll: Add support for Stromer PLLs

2020-12-25 Thread Konrad Dybcio

Hi, are you going to resubmit this patch? Looks like MDM9607 uses Stromer PLL 
for its CPU clocks and could benefit from it.


Konrad

Re: [PATCH v1 1/2] dt-bindings: drm/bridge: anx7625: add DPI flag and swing setting

2020-12-25 Thread Rob Herring

On Fri, 25 Dec 2020 19:01:09 +0800, Xin Ji wrote:
> Add DPI flag for distinguish MIPI input signal type, DSI or DPI. Add
> swing setting for adjusting DP tx PHY swing
> 
> Signed-off-by: Xin Ji 
> ---
>  .../bindings/display/bridge/analogix,anx7625.yaml | 19 
> +++
>  1 file changed, 19 insertions(+)
> 

My bot found errors running 'make dt_binding_check' on your patch:

yamllint warnings/errors:

dtschema/dtc warnings/errors:
/builds/robherring/linux-dt-review/Documentation/devicetree/bindings/display/bridge/analogix,anx7625.example.dt.yaml:
 encoder@58: anx,swing-setting: 'anyOf' conditional failed, one must be fixed:
[[0, 20], [1, 84], [2, 100], [3, 116], [4, 41], [5, 123], [6, 119], [7, 
91], [8, 127], [12, 32], [13, 96], [16, 96], [18, 64], [19, 96], [20, 20], [21, 
84], [22, 100], [23, 116], [24, 41], [25, 123], [26, 119], [27, 91], [28, 127], 
[32, 32], [33, 96], [36, 96], [38, 64], [39, 96]] is too long
[0, 20] is too long
[1, 84] is too long
[2, 100] is too long
[3, 116] is too long
[4, 41] is too long
[5, 123] is too long
[6, 119] is too long
[7, 91] is too long
[8, 127] is too long
[12, 32] is too long
[13, 96] is too long
[16, 96] is too long
[18, 64] is too long
[19, 96] is too long
[20, 20] is too long
[21, 84] is too long
[22, 100] is too long
[23, 116] is too long
[24, 41] is too long
[25, 123] is too long
[26, 119] is too long
[27, 91] is too long
[28, 127] is too long
[32, 32] is too long
[33, 96] is too long
[36, 96] is too long
[38, 64] is too long
[39, 96] is too long
From schema: 
/builds/robherring/linux-dt-review/Documentation/devicetree/bindings/display/bridge/analogix,anx7625.yaml
/builds/robherring/linux-dt-review/Documentation/devicetree/bindings/display/bridge/analogix,anx7625.example.dt.yaml:
 encoder@58: 'anx,mipi-dpi-in', 'anx,swing-setting' do not match any of the 
regexes: '^#.*', 
'^(at25|bm|devbus|dmacap|dsa|exynos|fsi[ab]|gpio-fan|gpio-key|gpio|gpmc|hdmi|i2c-gpio),.*',
 '^(keypad|m25p|max8952|max8997|max8998|mpmc),.*', 
'^(pinctrl-single|#pinctrl-single|PowerPC),.*', 
'^(pl022|pxa-mmc|rcar_sound|rotary-encoder|s5m8767|sdhci),.*', 
'^(simple-audio-card|st-plgpio|st-spics|ts),.*', '^70mai,.*', '^GEFanuc,.*', 
'^ORCL,.*', '^SUNW,.*', '^[a-zA-Z0-9#_][a-zA-Z0-9+\\-._@]{0,63}$', 
'^[a-zA-Z0-9+\\-._]*@[0-9a-zA-Z,]*$', '^abb,.*', '^abilis,.*', '^abracon,.*', 
'^abt,.*', '^acer,.*', '^acme,.*', '^actions,.*', '^active-semi,.*', '^ad,.*', 
'^adafruit,.*', '^adapteva,.*', '^adaptrum,.*', '^adh,.*', '^adi,.*', 
'^advantech,.*', '^aeroflexgaisler,.*', '^al,.*', '^allegro,.*', '^allo,.*', 
'^allwinner,.*', '^alphascale,.*', '^alps,.*', '^alt,.*', '^altr,.*', 
'^amarula,.*', '^amazon,.*', '^amcc,.*', '^amd,.*', '^amediatech,.*', 
'^amlogic,.*', '^ampere,.*', '^ampire,.*', '^ams,.*', '^amstaos,.*', 
'^analogix,.*', '^andestech,.*', '^anvo,.*', '^apm,.*', '^aptina,.*', 
'^arasan,.*', '^archermind,.*', '^arctic,.*', '^arcx,.*', '^aries,.*', 
'^arm,.*', '^armadeus,.*', '^arrow,.*', '^artesyn,.*', '^asahi-kasei,.*', 
'^asc,.*', '^aspeed,.*', '^asus,.*', '^atlas,.*', '^atmel,.*', '^auo,.*', 
'^auvidea,.*', '^avago,.*', '^avia,.*', '^avic,.*', '^avnet,.*', '^awinic,.*', 
'^axentia,.*', '^axis,.*', '^azoteq,.*', '^azw,.*', '^baikal,.*', 
'^bananapi,.*', '^beacon,.*', '^beagle,.*', '^bhf,.*', '^bitmain,.*', 
'^boe,.*', '^bosch,.*', '^boundary,.*', '^brcm,.*', '^broadmobi,.*', 
'^bticino,.*', '^buffalo,.*', '^bur,.*', '^calaosystems,.*', '^calxeda,.*', 
'^caninos,.*', '^capella,.*', '^cascoda,.*', '^catalyst,.*', '^cavium,.*', 
'^cdns,.*', '^cdtech,.*', '^cellwise,.*', '^ceva,.*', '^checkpoint,.*', 
'^chefree,.*', '^chipidea,.*', '^chipone,.*', '^chipspark,.*', '^chrontel,.*', 
'^chrp,.*', '^chunghwa,.*', '^chuwi,.*', '^ciaa,.*', '^cirrus,.*', 
'^cloudengines,.*', '^cnm,.*', '^cnxt,.*', '^colorfly,.*', '^compulab,.*', 
'^coreriver,.*', '^corpro,.*', '^cortina,.*', '^cosmic,.*', '^crane,.*', 
'^creative,.*', '^crystalfontz,.*', '^csky,.*', '^csq,.*', '^cubietech,.*', 
'^cypress,.*', '^cznic,.*', '^dallas,.*', '^dataimage,.*', '^davicom,.*', 
'^dell,.*', '^delta,.*', '^denx,.*', '^devantech,.*', '^dfi,.*', '^dh,.*', 
'^difrnce,.*', '^digi,.*', '^digilent,.*', '^dioo,.*', '^dlc,.*', '^dlg,.*', 
'^dlink,.*', '^dmo,.*', '^domintech,.*', '^dongwoon,.*', '^dptechnics,.*', 
'^dragino,.*', '^dserve,.*', '^dynaimage,.*', '^ea,.*', '^ebs-systart,.*', 
'^ebv,.*', '^eckelmann,.*', '^edt,.*', '^eeti,.*', '^einfochips,.*', 
'^elan,.*', '^element14,.*', '^elgin,.*', '^elida,.*', '^elimo,.*', 
'^embest,.*', '^emlid,.*', '^emmicro,.*', '^empire-electronix,.*', 
'^emtrion,.*', '^endless,.*', '^ene,.*', '^energymicro,.*', '^engicam,.*', 
'^epcos,.*', '^epfl,.*', '^epson,.*', '^esp,.*', '^est,.*', '^ettus,.*', 
'^eukrea,.*', '^everest,.*', '^everspin,.*', '^ev

[PATCH v4 0/3] drivers/soc/litex: support 32-bit subregisters, 64-bit CPUs

2020-12-25 Thread Gabriel Somlo

This series expands on commit 22447a99c97e ("drivers/soc/litex: add LiteX
SoC Controller driver"), adding support for handling both 8- and 32-bit
LiteX CSR (MMIO) subregisters, on both 32- and 64-bit CPUs.

Notes v4:
- improved "eloquence" of some 3/3 commit blurb paragraphs
- fixed instance of "disgusting" comment style :)
- litex_[get|set]_reg() now using size_t for 'reg_size' argument
- slightly tighter shift calculation in litex_set_reg()

Notes v3:
- split into smaller, more self-explanatory patches
- more detailed commit blurb for "main payload" (3/3) patch
- eliminate compiler warning on 32-bit architectures

Notes v2:
- fix typo (s/u32/u64/) in litex_read64().

Notes v1:
- LITEX_SUBREG_SIZE now provided by Kconfig.
- it's not LITEX_REG_SIZE, but rather LITEX_SUBREG_ALIGN!
- move litex_[get|set]_reg() to include/linux/litex.h and mark
  them as "static inline";
- redo litex_[read|write][8|16|32|64]() using litex_[get|set]_reg()
  (compiler should produce code as efficient as hardcoded shifts,
   but also automatically matching LITEX_SUBREG_SIZE).

Gabriel Somlo (3):
  drivers/soc/litex: move generic accessors to litex.h
  drivers/soc/litex: separate MMIO from subregister offset calculation
  drivers/soc/litex: support 32-bit subregisters, 64-bit CPUs

 drivers/soc/litex/Kconfig  |  12 +++
 drivers/soc/litex/litex_soc_ctrl.c |  76 +--
 include/linux/litex.h  | 151 +++--
 3 files changed, 115 insertions(+), 124 deletions(-)

-- 
2.26.2

[PATCH v4 2/3] drivers/soc/litex: separate MMIO from subregister offset calculation

2020-12-25 Thread Gabriel Somlo

Separate MMIO (read/write) access into _[read|write]_litex_subregister()
static inline functions, leaving existing "READ|WRITE" macros to handle
calculation of the subregister offset only.

NOTE: this is a non-functional change.

Signed-off-by: Gabriel Somlo 
---
 include/linux/litex.h | 16 ++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/include/linux/litex.h b/include/linux/litex.h
index 67c1a18a7425..918bab45243c 100644
--- a/include/linux/litex.h
+++ b/include/linux/litex.h
@@ -24,11 +24,23 @@
 #define LITEX_SUBREG_SIZE  0x1
 #define LITEX_SUBREG_SIZE_BIT   (LITEX_SUBREG_SIZE * 8)
 
+static inline void _write_litex_subregister(u32 val, void __iomem *addr)
+{
+   writel((u32 __force)cpu_to_le32(val), addr);
+}
+
+static inline u32 _read_litex_subregister(void __iomem *addr)
+{
+   return le32_to_cpu((__le32 __force)readl(addr));
+}
+
 #define WRITE_LITEX_SUBREGISTER(val, base_offset, subreg_id) \
-   writel((u32 __force)cpu_to_le32(val), base_offset + (LITEX_REG_SIZE * 
subreg_id))
+   _write_litex_subregister(val, (base_offset) + \
+   LITEX_REG_SIZE * (subreg_id))
 
 #define READ_LITEX_SUBREGISTER(base_offset, subreg_id) \
-   le32_to_cpu((__le32 __force)readl(base_offset + (LITEX_REG_SIZE * 
subreg_id)))
+   _read_litex_subregister((base_offset) + \
+   LITEX_REG_SIZE * (subreg_id))
 
 /*
  * LiteX SoC Generator, depending on the configuration, can split a single
-- 
2.26.2

[PATCH v4 1/3] drivers/soc/litex: move generic accessors to litex.h

2020-12-25 Thread Gabriel Somlo

Move generic LiteX CSR (MMIO) register accessors to litex.h and
declare them as "static inline", in preparation for supporting
32-bit CSR subregisters and 64-bit CPUs.

NOTE: this is a non-functional change.

Signed-off-by: Gabriel Somlo 
---
 drivers/soc/litex/litex_soc_ctrl.c | 73 -
 include/linux/litex.h  | 74 --
 2 files changed, 69 insertions(+), 78 deletions(-)

diff --git a/drivers/soc/litex/litex_soc_ctrl.c 
b/drivers/soc/litex/litex_soc_ctrl.c
index 1217cafdfd4d..65977526d68e 100644
--- a/drivers/soc/litex/litex_soc_ctrl.c
+++ b/drivers/soc/litex/litex_soc_ctrl.c
@@ -16,79 +16,6 @@
 #include 
 #include 
 
-/*
- * LiteX SoC Generator, depending on the configuration, can split a single
- * logical CSR (Control&Status Register) into a series of consecutive physical
- * registers.
- *
- * For example, in the configuration with 8-bit CSR Bus, 32-bit aligned (the
- * default one for 32-bit CPUs) a 32-bit logical CSR will be generated as four
- * 32-bit physical registers, each one containing one byte of meaningful data.
- *
- * For details see: https://github.com/enjoy-digital/litex/wiki/CSR-Bus
- *
- * The purpose of `litex_set_reg`/`litex_get_reg` is to implement the logic
- * of writing to/reading from the LiteX CSR in a single place that can be
- * then reused by all LiteX drivers.
- */
-
-/**
- * litex_set_reg() - Writes the value to the LiteX CSR (Control&Status 
Register)
- * @reg: Address of the CSR
- * @reg_size: The width of the CSR expressed in the number of bytes
- * @val: Value to be written to the CSR
- *
- * In the currently supported LiteX configuration (8-bit CSR Bus, 32-bit 
aligned),
- * a 32-bit LiteX CSR is generated as 4 consecutive 32-bit physical registers,
- * each one containing one byte of meaningful data.
- *
- * This function splits a single possibly multi-byte write into a series of
- * single-byte writes with a proper offset.
- */
-void litex_set_reg(void __iomem *reg, unsigned long reg_size,
-   unsigned long val)
-{
-   unsigned long shifted_data, shift, i;
-
-   for (i = 0; i < reg_size; ++i) {
-   shift = ((reg_size - i - 1) * LITEX_SUBREG_SIZE_BIT);
-   shifted_data = val >> shift;
-
-   WRITE_LITEX_SUBREGISTER(shifted_data, reg, i);
-   }
-}
-EXPORT_SYMBOL_GPL(litex_set_reg);
-
-/**
- * litex_get_reg() - Reads the value of the LiteX CSR (Control&Status Register)
- * @reg: Address of the CSR
- * @reg_size: The width of the CSR expressed in the number of bytes
- *
- * Return: Value read from the CSR
- *
- * In the currently supported LiteX configuration (8-bit CSR Bus, 32-bit 
aligned),
- * a 32-bit LiteX CSR is generated as 4 consecutive 32-bit physical registers,
- * each one containing one byte of meaningful data.
- *
- * This function generates a series of single-byte reads with a proper offset
- * and joins their results into a single multi-byte value.
- */
-unsigned long litex_get_reg(void __iomem *reg, unsigned long reg_size)
-{
-   unsigned long shifted_data, shift, i;
-   unsigned long result = 0;
-
-   for (i = 0; i < reg_size; ++i) {
-   shifted_data = READ_LITEX_SUBREGISTER(reg, i);
-
-   shift = ((reg_size - i - 1) * LITEX_SUBREG_SIZE_BIT);
-   result |= (shifted_data << shift);
-   }
-
-   return result;
-}
-EXPORT_SYMBOL_GPL(litex_get_reg);
-
 #define SCRATCH_REG_OFF 0x04
 #define SCRATCH_REG_VALUE   0x12345678
 #define SCRATCH_TEST_VALUE  0xdeadbeef
diff --git a/include/linux/litex.h b/include/linux/litex.h
index 40f5be503593..67c1a18a7425 100644
--- a/include/linux/litex.h
+++ b/include/linux/litex.h
@@ -3,9 +3,6 @@
  * Common LiteX header providing
  * helper functions for accessing CSRs.
  *
- * Implementation of the functions is provided by
- * the LiteX SoC Controller driver.
- *
  * Copyright (C) 2019-2020 Antmicro 
  */
 
@@ -33,9 +30,76 @@
 #define READ_LITEX_SUBREGISTER(base_offset, subreg_id) \
le32_to_cpu((__le32 __force)readl(base_offset + (LITEX_REG_SIZE * 
subreg_id)))
 
-void litex_set_reg(void __iomem *reg, unsigned long reg_sz, unsigned long val);
+/*
+ * LiteX SoC Generator, depending on the configuration, can split a single
+ * logical CSR (Control&Status Register) into a series of consecutive physical
+ * registers.
+ *
+ * For example, in the configuration with 8-bit CSR Bus, 32-bit aligned (the
+ * default one for 32-bit CPUs) a 32-bit logical CSR will be generated as four
+ * 32-bit physical registers, each one containing one byte of meaningful data.
+ *
+ * For details see: https://github.com/enjoy-digital/litex/wiki/CSR-Bus
+ *
+ * The purpose of `litex_set_reg`/`litex_get_reg` is to implement the logic
+ * of writing to/reading from the LiteX CSR in a single place that can be
+ * then reused by all LiteX drivers.
+ */
+
+/**
+ * litex_set_reg() - Writes the value to the LiteX CSR (Control&Status 
Register)
+ * @reg: Add

[PATCH v4 3/3] drivers/soc/litex: support 32-bit subregisters, 64-bit CPUs

2020-12-25 Thread Gabriel Somlo

Upstream LiteX now defaults to using 32-bit CSR subregisters
(see https://github.com/enjoy-digital/litex/commit/a2b71fde).

This patch expands on commit 22447a99c97e ("drivers/soc/litex: add
LiteX SoC Controller driver"), adding support for handling both 8-
and 32-bit LiteX CSR (MMIO) subregisters, as determined by the
LITEX_SUBREG_SIZE Kconfig option.

NOTE that while LITEX_SUBREG_SIZE could theoretically be a device
tree property, defining it as a compile-time constant allows for
much better optimization of the resulting code. This is further
supported by the low expected usefulness of deploying the same
kernel across LiteX SoCs built with different CSR-Bus data widths.

The constant LITEX_REG_SIZE is renamed to the more descriptive
LITEX_SUBREG_ALIGN (LiteX CSR subregisters are located at 32-bit
aligned MMIO addresses).

Finally, the litex_[read|write][8|16|32|64]() accessors are
redefined in terms of litex_[get|set]_reg(), which, after compiler
optimization, will result in code as efficient as hardcoded shifts,
but with the added benefit of automatically matching the appropriate
LITEX_SUBREG_SIZE.

NOTE that litex_[get|set]_reg() nominally operate on 64-bit data,
but that will also be optimized by the compiler in situations where
narrower data is used from a call site.

Signed-off-by: Gabriel Somlo 
---
 drivers/soc/litex/Kconfig  |  12 +++
 drivers/soc/litex/litex_soc_ctrl.c |   3 +-
 include/linux/litex.h  | 141 -
 3 files changed, 72 insertions(+), 84 deletions(-)

diff --git a/drivers/soc/litex/Kconfig b/drivers/soc/litex/Kconfig
index 7c6b009b6f6c..973f8d2fe1a7 100644
--- a/drivers/soc/litex/Kconfig
+++ b/drivers/soc/litex/Kconfig
@@ -16,4 +16,16 @@ config LITEX_SOC_CONTROLLER
  All drivers that use functions from litex.h must depend on
  LITEX.
 
+config LITEX_SUBREG_SIZE
+   int "Size of a LiteX CSR subregister, in bytes"
+   depends on LITEX
+   range 1 4
+   default 4
+   help
+   LiteX MMIO registers (referred to as Configuration and Status
+   registers, or CSRs) are spread across adjacent 8- or 32-bit
+   subregisters, located at 32-bit aligned MMIO addresses. Use
+   this to select the appropriate size (1 or 4 bytes) matching
+   your particular LiteX build.
+
 endmenu
diff --git a/drivers/soc/litex/litex_soc_ctrl.c 
b/drivers/soc/litex/litex_soc_ctrl.c
index 65977526d68e..da17ba56b795 100644
--- a/drivers/soc/litex/litex_soc_ctrl.c
+++ b/drivers/soc/litex/litex_soc_ctrl.c
@@ -58,7 +58,8 @@ static int litex_check_csr_access(void __iomem *reg_addr)
/* restore original value of the SCRATCH register */
litex_write32(reg_addr + SCRATCH_REG_OFF, SCRATCH_REG_VALUE);
 
-   pr_info("LiteX SoC Controller driver initialized");
+   pr_info("LiteX SoC Controller driver initialized: subreg:%d, align:%d",
+   LITEX_SUBREG_SIZE, LITEX_SUBREG_ALIGN);
 
return 0;
 }
diff --git a/include/linux/litex.h b/include/linux/litex.h
index 918bab45243c..ec969b131f23 100644
--- a/include/linux/litex.h
+++ b/include/linux/litex.h
@@ -10,20 +10,19 @@
 #define _LINUX_LITEX_H
 
 #include 
-#include 
-#include 
 
-/*
- * The parameters below are true for LiteX SoCs configured for 8-bit CSR Bus,
- * 32-bit aligned.
- *
- * Supporting other configurations will require extending the logic in this
- * header and in the LiteX SoC controller driver.
- */
-#define LITEX_REG_SIZE   0x4
-#define LITEX_SUBREG_SIZE  0x1
+/* LiteX SoCs support 8- or 32-bit CSR Bus data width (i.e., subreg. size) */
+#if defined(CONFIG_LITEX_SUBREG_SIZE) && \
+   (CONFIG_LITEX_SUBREG_SIZE == 1 || CONFIG_LITEX_SUBREG_SIZE == 4)
+#define LITEX_SUBREG_SIZE  CONFIG_LITEX_SUBREG_SIZE
+#else
+#error LiteX subregister size (LITEX_SUBREG_SIZE) must be 4 or 1!
+#endif
 #define LITEX_SUBREG_SIZE_BIT   (LITEX_SUBREG_SIZE * 8)
 
+/* LiteX subregisters of any width are always aligned on a 4-byte boundary */
+#define LITEX_SUBREG_ALIGN   0x4
+
 static inline void _write_litex_subregister(u32 val, void __iomem *addr)
 {
writel((u32 __force)cpu_to_le32(val), addr);
@@ -34,25 +33,32 @@ static inline u32 _read_litex_subregister(void __iomem 
*addr)
return le32_to_cpu((__le32 __force)readl(addr));
 }
 
-#define WRITE_LITEX_SUBREGISTER(val, base_offset, subreg_id) \
-   _write_litex_subregister(val, (base_offset) + \
-   LITEX_REG_SIZE * (subreg_id))
-
-#define READ_LITEX_SUBREGISTER(base_offset, subreg_id) \
-   _read_litex_subregister((base_offset) + \
-   LITEX_REG_SIZE * (subreg_id))
-
 /*
  * LiteX SoC Generator, depending on the configuration, can split a single
  * logical CSR (Control&Status Register) into a series of consecutive physical
  * registers.
  *
- * For example, in the configuration with 8-bit CSR Bus, 32-bit aligned (the
- * default one for 32-bit CPUs) a 32-bit logical CSR will be generated as

Re: [PATCH 1/3] dt-bindings: power: Introduce 'assigned-performance-states' property

2020-12-25 Thread Rob Herring

On Thu, 24 Dec 2020 16:42:08 +0530, Roja Rani Yarubandi wrote:
> While most devices within power-domains which support performance states,
> scale the performance state dynamically, some devices might want to
> set a static/default performance state while the device is active.
> These devices typically would also run off a fixed clock and not support
> dynamically scaling the device's performance, also known as DVFS
> techniques.
> 
> Add a property 'assigned-performance-states' which client devices can
> use to set this default performance state on their power-domains.
> 
> Signed-off-by: Roja Rani Yarubandi 
> ---
>  .../bindings/power/power-domain.yaml  | 49 +++
>  1 file changed, 49 insertions(+)
> 

My bot found errors running 'make dt_binding_check' on your patch:

yamllint warnings/errors:
./Documentation/devicetree/bindings/power/power-domain.yaml:72:8: [warning] 
wrong indentation: expected 6 but found 7 (indentation)

dtschema/dtc warnings/errors:

See https://patchwork.ozlabs.org/patch/1420485

This check can fail if there are any dependencies. The base for a patch
series is generally the most recent rc1.

If you already ran 'make dt_binding_check' and didn't see the above
error(s), then make sure 'yamllint' is installed and dt-schema is up to
date:

pip3 install dtschema --upgrade

Please check and re-submit.

Re: [PATCH 1/2] dt-bindings: Convert Arm Mali Valhall GPU to DT schema

2020-12-25 Thread Rob Herring

On Thu, 24 Dec 2020 20:31:18 +0800, Nick Fan wrote:
> Convert the Arm Valhall GPU binding to DT schema format.
> 
> Define a compatible string for the Mali Valhall GPU
> for Mediatek's SoC platform.
> 
> Signed-off-by: Nick Fan 
> ---
>  .../bindings/gpu/arm,mali-valhall.yaml| 252 ++
>  1 file changed, 252 insertions(+)
>  create mode 100644 
> Documentation/devicetree/bindings/gpu/arm,mali-valhall.yaml
> 

My bot found errors running 'make dt_binding_check' on your patch:

yamllint warnings/errors:

dtschema/dtc warnings/errors:
./Documentation/devicetree/bindings/gpu/arm,mali-valhall.yaml: $id: relative 
path/filename doesn't match actual path or filename
expected: http://devicetree.org/schemas/gpu/arm,mali-valhall.yaml#

See https://patchwork.ozlabs.org/patch/1420519

This check can fail if there are any dependencies. The base for a patch
series is generally the most recent rc1.

If you already ran 'make dt_binding_check' and didn't see the above
error(s), then make sure 'yamllint' is installed and dt-schema is up to
date:

pip3 install dtschema --upgrade

Please check and re-submit.

[PATCH v7 5/5] counter: 104-quad-8: Add IRQ support for the ACCES 104-QUAD-8

2020-12-25 Thread William Breathitt Gray

The LSI/CSI LS7266R1 chip provides programmable output via the FLG pins.
When interrupts are enabled on the ACCES 104-QUAD-8, they occur whenever
FLG1 is active. Four functions are available for the FLG1 signal: Carry,
Compare, Carry-Borrow, and Index.

Carry:
Interrupt generated on active low Carry signal. Carry
signal toggles every time the respective channel's
counter overflows.

Compare:
Interrupt generated on active low Compare signal.
Compare signal toggles every time respective channel's
preset register is equal to the respective channel's
counter.

Carry-Borrow:
Interrupt generated on active low Carry signal and
active low Borrow signal. Carry signal toggles every
time the respective channel's counter overflows. Borrow
signal toggles every time the respective channel's
counter underflows.

Index:
Interrupt generated on active high Index signal.

The irq_trigger Count extension is introduced to allow the selection of
the desired IRQ trigger function per channel. Interrupts push Counter
events to event channel X, where 'X' is the respective channel whose
FLG1 activated.

This patch adds IRQ support for the ACCES 104-QUAD-8. The interrupt line
numbers for the devices may be configured via the irq array module
parameter.

Reviewed-by: Syed Nayyar Waris 
Signed-off-by: William Breathitt Gray 
---
 .../ABI/testing/sysfs-bus-counter-104-quad-8  |  25 ++
 drivers/counter/104-quad-8.c  | 318 ++
 drivers/counter/Kconfig   |   6 +-
 3 files changed, 276 insertions(+), 73 deletions(-)

diff --git a/Documentation/ABI/testing/sysfs-bus-counter-104-quad-8 
b/Documentation/ABI/testing/sysfs-bus-counter-104-quad-8
index eac32180c40d..0ecba24d43aa 100644
--- a/Documentation/ABI/testing/sysfs-bus-counter-104-quad-8
+++ b/Documentation/ABI/testing/sysfs-bus-counter-104-quad-8
@@ -1,3 +1,28 @@
+What:  /sys/bus/counter/devices/counterX/countY/irq_trigger
+KernelVersion: 5.12
+Contact:   linux-...@vger.kernel.org
+Description:
+   IRQ trigger function for channel Y. Four trigger functions are
+   available: carry, compare, carry-borrow, and index.
+
+   carry:
+   Interrupt generated on active low Carry signal. Carry
+   signal toggles every time channel Y counter overflows.
+
+   compare:
+   Interrupt generated on active low Compare signal.
+   Compare signal toggles every time channel Y preset
+   register is equal to channel Y counter.
+
+   carry-borrow:
+   Interrupt generated on active low Carry signal and
+   active low Borrow signal. Carry signal toggles every
+   time channel Y counter overflows. Borrow signal toggles
+   every time channel Y counter underflows.
+
+   index:
+   Interrupt generated on active high Index signal.
+
 What:  /sys/bus/counter/devices/counterX/signalY/cable_fault
 KernelVersion: 5.7
 Contact:   linux-...@vger.kernel.org
diff --git a/drivers/counter/104-quad-8.c b/drivers/counter/104-quad-8.c
index f4fb36b751c4..7537575568d0 100644
--- a/drivers/counter/104-quad-8.c
+++ b/drivers/counter/104-quad-8.c
@@ -13,23 +13,30 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
 #include 
 #include 
+#include 
 
 #define QUAD8_EXTENT 32
 
 static unsigned int base[max_num_isa_dev(QUAD8_EXTENT)];
 static unsigned int num_quad8;
-module_param_array(base, uint, &num_quad8, 0);
+module_param_hw_array(base, uint, ioport, &num_quad8, 0);
 MODULE_PARM_DESC(base, "ACCES 104-QUAD-8 base addresses");
 
+static unsigned int irq[max_num_isa_dev(QUAD8_EXTENT)];
+module_param_hw_array(irq, uint, irq, NULL, 0);
+MODULE_PARM_DESC(irq, "ACCES 104-QUAD-8 interrupt line numbers");
+
 #define QUAD8_NUM_COUNTERS 8
 
 /**
  * struct quad8_iio - IIO device private data structure
+ * @lock:  synchronization lock to prevent I/O race conditions
  * @counter:   instance of the counter_device
  * @fck_prescaler: array of filter clock prescaler configurations
  * @preset:array of preset values
@@ -38,13 +45,14 @@ MODULE_PARM_DESC(base, "ACCES 104-QUAD-8 base addresses");
  * @quadrature_scale:  array of quadrature mode scale configurations
  * @ab_enable: array of A and B inputs enable configurations
  * @preset_enable: array of set_to_preset_on_index attribute configurations
+ * @irq_trigger:   array of interrupt trigger function configurations
  * @synchronous_mode:  array of index function synchronous mode configurations
  * @index_polarity:array of index functi

[PATCH v7 4/5] docs: counter: Document character device interface

2020-12-25 Thread William Breathitt Gray

This patch adds high-level documentation about the Counter subsystem
character device interface.

Signed-off-by: William Breathitt Gray 
---
 Documentation/ABI/testing/sysfs-bus-counter   |   9 +
 Documentation/driver-api/generic-counter.rst  | 236 +++---
 .../userspace-api/ioctl/ioctl-number.rst  |   1 +
 3 files changed, 205 insertions(+), 41 deletions(-)

diff --git a/Documentation/ABI/testing/sysfs-bus-counter 
b/Documentation/ABI/testing/sysfs-bus-counter
index 1820ce2f9183..8f6ea0a50b75 100644
--- a/Documentation/ABI/testing/sysfs-bus-counter
+++ b/Documentation/ABI/testing/sysfs-bus-counter
@@ -99,6 +99,15 @@ Description:
Read-only attribute that indicates whether excessive noise is
present at the channel Y counter inputs.
 
+What:  /sys/bus/counter/devices/counterX/countY/extensionZ_name
+What:  /sys/bus/counter/devices/counterX/extensionZ_name
+What:  /sys/bus/counter/devices/counterX/signalY/extensionZ_name
+KernelVersion: 5.12
+Contact:   linux-...@vger.kernel.org
+Description:
+   Read-only attribute that indicates the component name of
+   Extension Z.
+
 What:  /sys/bus/counter/devices/counterX/countY/function
 KernelVersion: 5.2
 Contact:   linux-...@vger.kernel.org
diff --git a/Documentation/driver-api/generic-counter.rst 
b/Documentation/driver-api/generic-counter.rst
index b842ddbbd8a0..4775dcaff557 100644
--- a/Documentation/driver-api/generic-counter.rst
+++ b/Documentation/driver-api/generic-counter.rst
@@ -223,19 +223,6 @@ whether an input line is differential or single-ended) and 
instead focus
 on the core idea of what the data and process represent (e.g. position
 as interpreted from quadrature encoding data).
 
-Userspace Interface
-===
-
-Several sysfs attributes are generated by the Generic Counter interface,
-and reside under the /sys/bus/counter/devices/counterX directory, where
-counterX refers to the respective counter device. Please see
-Documentation/ABI/testing/sysfs-bus-counter for detailed
-information on each Generic Counter interface sysfs attribute.
-
-Through these sysfs attributes, programs and scripts may interact with
-the Generic Counter paradigm Counts, Signals, and Synapses of respective
-counter devices.
-
 Driver API
 ==
 
@@ -387,16 +374,16 @@ userspace interface components::
 / driver callbacks /
 ---
 |
-+---+
-|
-V
-++
-| Counter sysfs  |
-++
-| Translates to the  |
-| standard Counter   |
-| sysfs output   |
-++
++---+---+
+|   |
+V   V
+++  +-+
+| Counter sysfs  |  | Counter chrdev  |
+++  +-+
+| Translates to the  |  | Translates to the   |
+| standard Counter   |  | standard Counter|
+| sysfs output   |  | character device|
+++  +-+
 
 Thereafter, data can be transferred directly between the Counter device
 driver and Counter userspace interface::
@@ -427,23 +414,30 @@ driver and Counter userspace interface::
 / u64 /
 --
 |
-+---+
-|
-V
-++
-| Counter sysfs  |
-++
-| Translates to the  |
-| standard Counter   |
-| sysfs output   |
-||
-| Type: const char * |
-| Value: "42"|
-++
-|
- ---
-/ const char * /
----
++---+---+
+|   |
+V   V
+++  +-+
+| Counter sysfs  |  | Counter chrdev  |
+++  +-+
+| Translates to the  |  | Translates to the   |
+| standard Counter   |  | standard Counter|
+| sysfs output   |  | character device|
+||  |-|
+| Type: const char * |  | Type: u64   |
+| Value: "42"|  | Value: 42   |
+++  +--

[PATCH v7 3/5] counter: Add character device interface

2020-12-25 Thread William Breathitt Gray

This patch introduces a character device interface for the Counter
subsystem. Device data is exposed through standard character device read
operations. Device data is gathered when a Counter event is pushed by
the respective Counter device driver. Configuration is handled via ioctl
operations on the respective Counter character device node.

Cc: David Lechner 
Cc: Gwendal Grignou 
Cc: Dan Carpenter 
Signed-off-by: William Breathitt Gray 
---
 MAINTAINERS  |   1 +
 drivers/counter/Makefile |   2 +-
 drivers/counter/counter-chrdev.c | 490 +++
 drivers/counter/counter-chrdev.h |  16 +
 drivers/counter/counter-core.c   |  37 ++-
 drivers/counter/counter-sysfs.c  |  51 +++-
 include/linux/counter.h  |  87 +++---
 include/uapi/linux/counter.h | 123 
 8 files changed, 751 insertions(+), 56 deletions(-)
 create mode 100644 drivers/counter/counter-chrdev.c
 create mode 100644 drivers/counter/counter-chrdev.h
 create mode 100644 include/uapi/linux/counter.h

diff --git a/MAINTAINERS b/MAINTAINERS
index b64fa49d5796..3a240f70bdd4 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -4494,6 +4494,7 @@ F:Documentation/ABI/testing/sysfs-bus-counter*
 F: Documentation/driver-api/generic-counter.rst
 F: drivers/counter/
 F: include/linux/counter.h
+F: include/uapi/linux/counter.h
 
 CPMAC ETHERNET DRIVER
 M: Florian Fainelli 
diff --git a/drivers/counter/Makefile b/drivers/counter/Makefile
index cbe1d06af6a9..c4870eb5b1dd 100644
--- a/drivers/counter/Makefile
+++ b/drivers/counter/Makefile
@@ -4,7 +4,7 @@
 #
 
 obj-$(CONFIG_COUNTER) += counter.o
-counter-y := counter-core.o counter-sysfs.o
+counter-y := counter-core.o counter-sysfs.o counter-chrdev.o
 
 obj-$(CONFIG_104_QUAD_8)   += 104-quad-8.o
 obj-$(CONFIG_STM32_TIMER_CNT)  += stm32-timer-cnt.o
diff --git a/drivers/counter/counter-chrdev.c b/drivers/counter/counter-chrdev.c
new file mode 100644
index ..61b11989546a
--- /dev/null
+++ b/drivers/counter/counter-chrdev.c
@@ -0,0 +1,490 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Generic Counter character device interface
+ * Copyright (C) 2020 William Breathitt Gray
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "counter-chrdev.h"
+
+struct counter_comp_node {
+   struct list_head l;
+   struct counter_component component;
+   struct counter_comp comp;
+   void *parent;
+};
+
+static ssize_t counter_chrdev_read(struct file *filp, char __user *buf,
+  size_t len, loff_t *f_ps)
+{
+   struct counter_device *const counter = filp->private_data;
+   int err;
+   unsigned int copied;
+
+   if (len < sizeof(struct counter_event))
+   return -EINVAL;
+
+   do {
+   if (kfifo_is_empty(&counter->events)) {
+   if (filp->f_flags & O_NONBLOCK)
+   return -EAGAIN;
+
+   err = wait_event_interruptible(counter->events_wait,
+   !kfifo_is_empty(&counter->events));
+   if (err < 0)
+   return err;
+   }
+
+   if (mutex_lock_interruptible(&counter->events_lock))
+   return -ERESTARTSYS;
+   err = kfifo_to_user(&counter->events, buf, len, &copied);
+   mutex_unlock(&counter->events_lock);
+   if (err < 0)
+   return err;
+   } while (!copied);
+
+   return copied;
+}
+
+static __poll_t counter_chrdev_poll(struct file *filp,
+   struct poll_table_struct *pollt)
+{
+   struct counter_device *const counter = filp->private_data;
+   __poll_t events = 0;
+
+   poll_wait(filp, &counter->events_wait, pollt);
+
+   if (!kfifo_is_empty(&counter->events))
+   events = EPOLLIN | EPOLLRDNORM;
+
+   return events;
+}
+
+static void counter_events_list_free(struct list_head *const events_list)
+{
+   struct counter_event_node *p, *n;
+   struct counter_comp_node *q, *o;
+
+   list_for_each_entry_safe(p, n, events_list, l) {
+   /* Free associated component nodes */
+   list_for_each_entry_safe(q, o, &p->comp_list, l) {
+   list_del(&q->l);
+   kfree(q);
+   }
+
+   /* Free event node */
+   list_del(&p->l);
+   kfree(p);
+   }
+}
+
+static int counter_set_event_node(struct counter_device *const counter,
+ struct counter_watch *const watch,
+ const struct counter_comp_node *const cfg)
+{
+   struct counter_event_node *event_node;
+   struct counter_comp_node *

[PATCH v7 0/5] Introduce the Counter character device interface

2020-12-25 Thread William Breathitt Gray

Changes in v7:
 - Implemented u32 enums; enum types can now be used directly for
   callbacks and values
 - Fixed refcount underflow bug
 - Refactored all err check to check for err < 0; this should help
   prevent future oversights on valid positive return valids
 - Use mutex instead of raw_spin_lock in counter_chrdev_read();
   kifo_to_user() can now safely sleep
 - Renamed COUNTER_COMPONENT_DUMMY to COUNTER_COMPONENT_NONE to make
   purpose more obvious
 - Introduced a watch_validate() callback
 - Consolidated the functionality to clear the watches to the
   counter_clear_watches() function
 - Reimplemented counter_push_event() as a void function; on error,
   errno is returned via struct counter_event so that it can be handled
   by userspace (because interrupt handlers can't do much for this)
 - Renamed the events_config() callback to events_configure();
   "events_config" could be confused as a read callback when this is
   actually intended to configure the device for the requested events
 - Reimplemented 104-QUAD-8 driver to use events_configure() and
   watch_validate() callbacks; irq_trigger_enable sysfs attribute
   removed because events_configure() now serves this purpose

The changes for this revision were much simpler compared to the previous
revisions. I don't have any further questions for this patchset, so it's
really just a search for possible bugs or regressions now. Please test
and verify this patchset on your systems, and ACK appropriately.

To summarize the main points: there are no changes to the existing
Counter sysfs userspace interface; a Counter character device interface
is introduced that allows Counter events and associated data to be
read() by userspace; the events_configure() and watch_validate() driver
callbacks are introduced to support Counter events; and IRQ support is
added to the 104-QUAD-8 driver, serving as an example of how to support
the new Counter events functionality.

William Breathitt Gray (5):
  counter: Internalize sysfs interface code
  docs: counter: Update to reflect sysfs internalization
  counter: Add character device interface
  docs: counter: Document character device interface
  counter: 104-quad-8: Add IRQ support for the ACCES 104-QUAD-8

 Documentation/ABI/testing/sysfs-bus-counter   |   18 +-
 .../ABI/testing/sysfs-bus-counter-104-quad-8  |   25 +
 Documentation/driver-api/generic-counter.rst  |  416 -
 .../userspace-api/ioctl/ioctl-number.rst  |1 +
 MAINTAINERS   |2 +-
 drivers/counter/104-quad-8.c  |  799 +
 drivers/counter/Kconfig   |6 +-
 drivers/counter/Makefile  |1 +
 drivers/counter/counter-chrdev.c  |  490 ++
 drivers/counter/counter-chrdev.h  |   16 +
 drivers/counter/counter-core.c|  182 ++
 drivers/counter/counter-sysfs.c   |  868 ++
 drivers/counter/counter-sysfs.h   |   13 +
 drivers/counter/counter.c | 1496 -
 drivers/counter/ftm-quaddec.c |   61 +-
 drivers/counter/microchip-tcb-capture.c   |  103 +-
 drivers/counter/stm32-lptimer-cnt.c   |  181 +-
 drivers/counter/stm32-timer-cnt.c |  149 +-
 drivers/counter/ti-eqep.c |  224 +--
 include/linux/counter.h   |  716 
 include/linux/counter_enum.h  |   45 -
 include/uapi/linux/counter.h  |  123 ++
 22 files changed, 3259 insertions(+), 2676 deletions(-)
 create mode 100644 drivers/counter/counter-chrdev.c
 create mode 100644 drivers/counter/counter-chrdev.h
 create mode 100644 drivers/counter/counter-core.c
 create mode 100644 drivers/counter/counter-sysfs.c
 create mode 100644 drivers/counter/counter-sysfs.h
 delete mode 100644 drivers/counter/counter.c
 delete mode 100644 include/linux/counter_enum.h
 create mode 100644 include/uapi/linux/counter.h

-- 
2.29.2

[PATCH v7 2/5] docs: counter: Update to reflect sysfs internalization

2020-12-25 Thread William Breathitt Gray

The Counter subsystem architecture and driver implementations have
changed in order to handle Counter sysfs interactions in a more
consistent way. This patch updates the Generic Counter interface
documentation to reflect the changes.

Signed-off-by: William Breathitt Gray 
---
 Documentation/ABI/testing/sysfs-bus-counter  |   9 +-
 Documentation/driver-api/generic-counter.rst | 242 ++-
 2 files changed, 184 insertions(+), 67 deletions(-)

diff --git a/Documentation/ABI/testing/sysfs-bus-counter 
b/Documentation/ABI/testing/sysfs-bus-counter
index 566bd99fe0a5..1820ce2f9183 100644
--- a/Documentation/ABI/testing/sysfs-bus-counter
+++ b/Documentation/ABI/testing/sysfs-bus-counter
@@ -219,7 +219,14 @@ What:  
/sys/bus/counter/devices/counterX/signalY/signal
 KernelVersion: 5.2
 Contact:   linux-...@vger.kernel.org
 Description:
-   Signal data of Signal Y represented as a string.
+   Signal level state of Signal Y. The following signal level
+   states are available:
+
+   low:
+   Low level state.
+
+   high:
+   High level state.
 
 What:  /sys/bus/counter/devices/counterX/signalY/name
 KernelVersion: 5.2
diff --git a/Documentation/driver-api/generic-counter.rst 
b/Documentation/driver-api/generic-counter.rst
index b02c52cd69d6..b842ddbbd8a0 100644
--- a/Documentation/driver-api/generic-counter.rst
+++ b/Documentation/driver-api/generic-counter.rst
@@ -250,8 +250,8 @@ for defining a counter device.
 .. kernel-doc:: drivers/counter/counter.c
:export:
 
-Implementation
-==
+Driver Implementation
+=
 
 To support a counter device, a driver must first allocate the available
 Counter Signals via counter_signal structures. These Signals should
@@ -267,25 +267,59 @@ respective counter_count structure. These counter_count 
structures are
 set to the counts array member of an allocated counter_device structure
 before the Counter is registered to the system.
 
-Driver callbacks should be provided to the counter_device structure via
-a constant counter_ops structure in order to communicate with the
-device: to read and write various Signals and Counts, and to set and get
-the "action mode" and "function mode" for various Synapses and Counts
-respectively.
+Driver callbacks must be provided to the counter_device structure in
+order to communicate with the device: to read and write various Signals
+and Counts, and to set and get the "action mode" and "function mode" for
+various Synapses and Counts respectively.
 
 A defined counter_device structure may be registered to the system by
 passing it to the counter_register function, and unregistered by passing
 it to the counter_unregister function. Similarly, the
-devm_counter_register and devm_counter_unregister functions may be used
-if device memory-managed registration is desired.
-
-Extension sysfs attributes can be created for auxiliary functionality
-and data by passing in defined counter_device_ext, counter_count_ext,
-and counter_signal_ext structures. In these cases, the
-counter_device_ext structure is used for global/miscellaneous exposure
-and configuration of the respective Counter device, while the
-counter_count_ext and counter_signal_ext structures allow for auxiliary
-exposure and configuration of a specific Count or Signal respectively.
+devm_counter_register function may be used if device memory-managed
+registration is desired.
+
+The struct counter_comp structure is used to define counter extensions
+for Signals, Synapses, and Counts.
+
+The "type" member specifies the type of high-level data (e.g. BOOL,
+COUNT_DIRECTION, etc.) handled by this extension. The "`*_read`" and
+"`*_write`" members can then be set by the counter device driver with
+callbacks to handle that data using native C data types (i.e. u8, u64,
+etc.).
+
+Convenience macros such as `COUNTER_COMP_COUNT_U64` are provided for use
+by driver authors. In particular, driver authors are expected to use
+the provided macros for standard Counter subsystem attributes in order
+to maintain a consistent interface for userspace. For example, a counter
+device driver may define several standard attributes like so::
+
+struct counter_comp count_ext[] = {
+COUNTER_COMP_DIRECTION(count_direction_read),
+COUNTER_COMP_ENABLE(count_enable_read, count_enable_write),
+COUNTER_COMP_CEILING(count_ceiling_read, count_ceiling_write),
+};
+
+This makes it simple to see, add, and modify the attributes that are
+supported by this driver ("direction", "enable", and "ceiling") and to
+maintain this code without getting lost in a web of struct braces.
+
+Callbacks must match the function type expected for the respective
+component or extension. These function types are defined in the struct
+counter_comp structure as the "`*_read`" and "`*_write`" union members.
+
+The correspo

Re: [PATCH] nfp: remove h from printk format specifier

2020-12-25 Thread Joe Perches

On Fri, 2020-12-25 at 14:13 -0800, Tom Rix wrote:
> On 12/25/20 9:06 AM, Joe Perches wrote:
> > On Fri, 2020-12-25 at 06:56 -0800, Tom Rix wrote:
> > > On 12/24/20 2:39 PM, Joe Perches wrote:
> > []
> > > > Kernel code doesn't use a signed char or short with %hx or %hu very 
> > > > often
> > > > but in case you didn't already know, any signed char/short emitted with
> > > > anything like %hx or %hu needs to be left alone as sign extension 
> > > > occurs so:
> > > Yes, this would also effect checkpatch.
> > Of course but checkpatch is stupid and doesn't know types
> > so it just assumes that the type argument is not signed.
> > 
> > In general, that's a reasonable but imperfect assumption.
> > 
> > coccinelle could probably do this properly as it's a much
> > better parser.  clang-tidy should be able to as well.
> > 
> Ok.
> 
> But types not matching the format string is a larger problem.
> 
> Has there been an effort to clean these up ?

Not really no.  __printf already does a reasonable job for that.

The biggest issue for format type mismatches is the %p extensions.

__printf can only verify that the argument is a pointer, not
necessarily the 'right' type of pointed to object.

There are overflow possibilities like '"%*ph", len, pointer'
where pointer may not have len bytes available and, for instance,
mismatched uses of %pI4 and %pI6 where %pI4 expects a pointer to
4 bytes and %pI6 expects a pointer to 16 bytes.

Anyway it's not that easy a problem to analyze.

Re: [PATCH 03/15] perf: Add build id data in mmap2 event

2020-12-25 Thread Jiri Olsa

On Tue, Dec 15, 2020 at 11:01:51PM +0100, Daniel Borkmann wrote:
> Hey Arnaldo,
> 
> On 12/15/20 4:52 PM, Arnaldo Carvalho de Melo wrote:
> > Em Mon, Dec 14, 2020 at 11:54:45AM +0100, Jiri Olsa escreveu:
> > > Adding support to carry build id data in mmap2 event.
> > > 
> > > The build id data replaces maj/min/ino/ino_generation
> > > fields, which are also used to identify map's binary,
> > > so it's ok to replace them with build id data:
> > > 
> > >union {
> > >struct {
> > >u32   maj;
> > >u32   min;
> > >u64   ino;
> > >u64   ino_generation;
> > >};
> > >struct {
> > >u8build_id_size;
> > >u8__reserved_1;
> > >u16   __reserved_2;
> > >u8build_id[20];
> > >};
> > >};
> > 
> > Alexei/Daniel, this one depends on BPFs build id routines to be exported
> > for use by the perf kernel subsys, PeterZ already acked this, so can you
> > guys consider getting the first three patches in this series via the bpf
> > tree?
> > 
> > The BPF bits were acked by Song.
> 
> All the net-next and therefore also bpf-next bits for 5.11 were just merged
> by Linus into his tree. If you need the first 3 from [0] to land for this 
> merge
> window, it's probably easiest if you take them in and send them via perf tree
> directly in case you didn't send out a pull-req yet.. (alternatively I'll ping
> David/Jakub if they plan to make a 2nd net-next pull-req end of this week).
> 
> Thanks,
> Daniel
> 
>  [0] https://lore.kernel.org/lkml/20201214105457.543111-1-jo...@kernel.org/
> 

hi,
I don't see them (first 3 from [0]) in any tree so far ;-) please let
me know if there's anything I can do from my side to get this merged

thanks,
jirka


[0] https://lore.kernel.org/lkml/20201214105457.543111-1-jo...@kernel.org/

[PATCH] perf tools: Detect when pipe is passed as perf data

2020-12-25 Thread Jiri Olsa

Currently we allow pipe input/output only through '-' string
being passed to '-o' or '-i' options, like:

  # mkfifo perf.pipe
  # perf record --no-buffering -e 'sched:sched_switch' -o - > perf.pipe &
  [1] 354406
  # cat perf.pipe | ./perf --no-pager script -i - | head -3
perf 354406 [000] 168190.164921: sched:sched_switch: perf:354406..
 migration/012 [000] 168190.164928: sched:sched_switch: migration/0:..
perf 354406 [001] 168190.164981: sched:sched_switch: perf:354406..
  ...

This patch detects if given path is pipe and set the perf data
object accordingly, so it's possible now to do above with:

  # mkfifo perf.pipe
  # perf record --no-buffering -e 'sched:sched_switch' -o perf.pipe &
  [1] 360188
  # perf --no-pager script -i ./perf.pipe | head -3
perf 354442 [000] 168275.464895: sched:sched_switch: perf:354442..
 migration/012 [000] 168275.464902: sched:sched_switch: migration/0:..
perf 354442 [001] 168275.464953: sched:sched_switch: perf:354442..

It's of course possible to combine any of above ways.

Signed-off-by: Jiri Olsa 
---
 tools/perf/util/data.c | 27 +--
 1 file changed, 21 insertions(+), 6 deletions(-)

diff --git a/tools/perf/util/data.c b/tools/perf/util/data.c
index f29af4fc3d09..767b6c903cf5 100644
--- a/tools/perf/util/data.c
+++ b/tools/perf/util/data.c
@@ -159,7 +159,7 @@ int perf_data__update_dir(struct perf_data *data)
return 0;
 }
 
-static bool check_pipe(struct perf_data *data)
+static int check_pipe(struct perf_data *data)
 {
struct stat st;
bool is_pipe = false;
@@ -172,6 +172,15 @@ static bool check_pipe(struct perf_data *data)
} else {
if (!strcmp(data->path, "-"))
is_pipe = true;
+   else if (!stat(data->path, &st) && S_ISFIFO(st.st_mode)) {
+   int flags = perf_data__is_read(data) ?
+   O_RDONLY : O_WRONLY|O_CREAT|O_TRUNC;
+
+   fd = open(data->path, flags);
+   if (fd < 0)
+   return -EINVAL;
+   is_pipe = true;
+   }
}
 
if (is_pipe) {
@@ -190,7 +199,8 @@ static bool check_pipe(struct perf_data *data)
}
}
 
-   return data->is_pipe = is_pipe;
+   data->is_pipe = is_pipe;
+   return 0;
 }
 
 static int check_backup(struct perf_data *data)
@@ -344,8 +354,11 @@ static int open_dir(struct perf_data *data)
 
 int perf_data__open(struct perf_data *data)
 {
-   if (check_pipe(data))
-   return 0;
+   int err;
+
+   err = check_pipe(data);
+   if (err || data->is_pipe)
+   return err;
 
/* currently it allows stdio for pipe only */
data->use_stdio = false;
@@ -410,8 +423,10 @@ int perf_data__switch(struct perf_data *data,
 {
int ret;
 
-   if (check_pipe(data))
-   return -EINVAL;
+   ret = check_pipe(data);
+   if (ret || data->is_pipe)
+   return ret;
+
if (perf_data__is_read(data))
return -EINVAL;
 
-- 
2.26.2

Re: [PATCH] nfp: remove h from printk format specifier

2020-12-25 Thread Tom Rix



On 12/25/20 9:06 AM, Joe Perches wrote:
> On Fri, 2020-12-25 at 06:56 -0800, Tom Rix wrote:
>> On 12/24/20 2:39 PM, Joe Perches wrote:
> []
>>> Kernel code doesn't use a signed char or short with %hx or %hu very often
>>> but in case you didn't already know, any signed char/short emitted with
>>> anything like %hx or %hu needs to be left alone as sign extension occurs so:
>> Yes, this would also effect checkpatch.
> Of course but checkpatch is stupid and doesn't know types
> so it just assumes that the type argument is not signed.
>
> In general, that's a reasonable but imperfect assumption.
>
> coccinelle could probably do this properly as it's a much
> better parser.  clang-tidy should be able to as well.
>
Ok.

But types not matching the format string is a larger problem.

Has there been an effort to clean these up ?

Tom

Re: [PATCH v3 3/3] drivers/soc/litex: support 32-bit subregisters, 64-bit CPUs

2020-12-25 Thread Stafford Horne

On Fri, Dec 25, 2020 at 09:21:20AM -0500, Gabriel Somlo wrote:
> Upstream LiteX now defaults to using 32-bit CSR subregisters
> (see https://github.com/enjoy-digital/litex/commit/a2b71fde).
> 
> This patch expands on commit 22447a99c97e ("drivers/soc/litex: add
> LiteX SoC Controller driver"), adding support for handling both 8-
> and 32-bit LiteX CSR (MMIO) subregisters, as determined by the
> LITEX_SUBREG_SIZE Kconfig option.
> 
> NOTE that while LITEX_SUBREG_SIZE could theoretically be a device
> tree property, defining it as a compile-time constant allows for
> much better optimization of the resulting code. This is further
> supported by the low expected usefulness of deploying the same
> kernel across LiteX SoCs built with different CSR-Bus data widths.
> 
> The constant LITEX_SUBREG_SIZE is renamed to the more descriptive
> LITEX_SUBREG_ALIGN (LiteX CSR subregisters are located at 32-bit
> aligned MMIO addresses).
> 
> Finally, the litex_[read|write][8|16|32|64]() accessors are
> redefined in terms of litex_[get|set]_reg(), which, after compiler
> optimization, will result in code as efficient as hardcoded shifts,
> but with the added benefit of automatically matching the appropriate
> LITEX_SUBREG_SIZE.
> 
> NOTE that litex_[get|set]_reg() nominally operate 64-bit data, but
> that too will be optimized away by the compiler in situations where
> narrower data is used.

Hello, thank you for taking the time to split and expand on the commit
message it makes it a lot easier to review.

> Signed-off-by: Gabriel Somlo 
> ---
>  drivers/soc/litex/Kconfig  |  12 +++
>  drivers/soc/litex/litex_soc_ctrl.c |   3 +-
>  include/linux/litex.h  | 139 -
>  3 files changed, 70 insertions(+), 84 deletions(-)
> 
> diff --git a/drivers/soc/litex/Kconfig b/drivers/soc/litex/Kconfig
> index 7c6b009b6f6c..973f8d2fe1a7 100644
> --- a/drivers/soc/litex/Kconfig
> +++ b/drivers/soc/litex/Kconfig
> @@ -16,4 +16,16 @@ config LITEX_SOC_CONTROLLER
> All drivers that use functions from litex.h must depend on
> LITEX.
>  
> +config LITEX_SUBREG_SIZE
> + int "Size of a LiteX CSR subregister, in bytes"
> + depends on LITEX
> + range 1 4
> + default 4
> + help
> + LiteX MMIO registers (referred to as Configuration and Status
> + registers, or CSRs) are spread across adjacent 8- or 32-bit
> + subregisters, located at 32-bit aligned MMIO addresses. Use
> + this to select the appropriate size (1 or 4 bytes) matching
> + your particular LiteX build.
> +
>  endmenu
> diff --git a/drivers/soc/litex/litex_soc_ctrl.c 
> b/drivers/soc/litex/litex_soc_ctrl.c
> index 65977526d68e..da17ba56b795 100644
> --- a/drivers/soc/litex/litex_soc_ctrl.c
> +++ b/drivers/soc/litex/litex_soc_ctrl.c
> @@ -58,7 +58,8 @@ static int litex_check_csr_access(void __iomem *reg_addr)
>   /* restore original value of the SCRATCH register */
>   litex_write32(reg_addr + SCRATCH_REG_OFF, SCRATCH_REG_VALUE);
>  
> - pr_info("LiteX SoC Controller driver initialized");
> + pr_info("LiteX SoC Controller driver initialized: subreg:%d, align:%d",
> + LITEX_SUBREG_SIZE, LITEX_SUBREG_ALIGN);
>  
>   return 0;
>  }
> diff --git a/include/linux/litex.h b/include/linux/litex.h
> index 918bab45243c..53fb03a2f257 100644
> --- a/include/linux/litex.h
> +++ b/include/linux/litex.h
> @@ -10,20 +10,19 @@
>  #define _LINUX_LITEX_H
>  
>  #include 
> -#include 
> -#include 
>  
> -/*
> - * The parameters below are true for LiteX SoCs configured for 8-bit CSR Bus,
> - * 32-bit aligned.
> - *
> - * Supporting other configurations will require extending the logic in this
> - * header and in the LiteX SoC controller driver.
> - */
> -#define LITEX_REG_SIZE 0x4
> -#define LITEX_SUBREG_SIZE0x1
> +/* LiteX SoCs support 8- or 32-bit CSR Bus data width (i.e., subreg. size) */
> +#if defined(CONFIG_LITEX_SUBREG_SIZE) && \
> + (CONFIG_LITEX_SUBREG_SIZE == 1 || CONFIG_LITEX_SUBREG_SIZE == 4)
> +#define LITEX_SUBREG_SIZE  CONFIG_LITEX_SUBREG_SIZE
> +#else
> +#error LiteX subregister size (LITEX_SUBREG_SIZE) must be 4 or 1!
> +#endif
>  #define LITEX_SUBREG_SIZE_BIT (LITEX_SUBREG_SIZE * 8)
>  
> +/* LiteX subregisters of any width are always aligned on a 4-byte boundary */
> +#define LITEX_SUBREG_ALIGN 0x4
> +
>  static inline void _write_litex_subregister(u32 val, void __iomem *addr)
>  {
>   writel((u32 __force)cpu_to_le32(val), addr);
> @@ -34,25 +33,31 @@ static inline u32 _read_litex_subregister(void __iomem 
> *addr)
>   return le32_to_cpu((__le32 __force)readl(addr));
>  }
>  
> -#define WRITE_LITEX_SUBREGISTER(val, base_offset, subreg_id) \
> - _write_litex_subregister(val, (base_offset) + \
> - LITEX_REG_SIZE * (subreg_id))
> -
> -#define READ_LITEX_SUBREGISTER(base_offset, subreg_id) \
> - _read_litex_subregister((base_offset) + \
> - LITEX_

Re: [PATCH] nvmet-fc: associations list replaced with hlist rcu,

2020-12-25 Thread kernel test robot

Hi,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on linus/master]
[also build test WARNING on v5.10 next-20201223]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:
https://github.com/0day-ci/linux/commits/leonid-ravich-dell-com/nvmet-fc-associations-list-replaced-with-hlist-rcu/20201224-191206
base:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 
58cf05f597b03a8212d9ecf2c79ee046d3ee8ad9
config: s390-randconfig-r031-20201224 (attached as .config)
compiler: clang version 12.0.0 (https://github.com/llvm/llvm-project 
cee1e7d14f4628d6174b33640d502bff3b54ae45)
reproduce (this is a W=1 build):
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# install s390 cross compiling tool for clang build
# apt-get install binutils-s390x-linux-gnu
# 
https://github.com/0day-ci/linux/commit/db9dbc6efce5ef6533984b7fbe395b365d71ce7f
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review 
leonid-ravich-dell-com/nvmet-fc-associations-list-replaced-with-hlist-rcu/20201224-191206
git checkout db9dbc6efce5ef6533984b7fbe395b365d71ce7f
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=s390 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All warnings (new ones prefixed by >>):

   include/uapi/linux/swab.h:19:12: note: expanded from macro 
'___constant_swab32'
   (((__u32)(x) & (__u32)0x00ffUL) << 24) |\
 ^
   In file included from drivers/nvme/target/fc.c:8:
   In file included from include/linux/blk-mq.h:5:
   In file included from include/linux/blkdev.h:26:
   In file included from include/linux/scatterlist.h:9:
   In file included from arch/s390/include/asm/io.h:80:
   include/asm-generic/io.h:490:61: warning: performing pointer arithmetic on a 
null pointer has undefined behavior [-Wnull-pointer-arithmetic]
   val = __le32_to_cpu((__le32 __force)__raw_readl(PCI_IOBASE + addr));
   ~~ ^
   include/uapi/linux/byteorder/big_endian.h:34:59: note: expanded from macro 
'__le32_to_cpu'
   #define __le32_to_cpu(x) __swab32((__force __u32)(__le32)(x))
 ^
   include/uapi/linux/swab.h:119:21: note: expanded from macro '__swab32'
   ___constant_swab32(x) : \
  ^
   include/uapi/linux/swab.h:20:12: note: expanded from macro 
'___constant_swab32'
   (((__u32)(x) & (__u32)0xff00UL) <<  8) |\
 ^
   In file included from drivers/nvme/target/fc.c:8:
   In file included from include/linux/blk-mq.h:5:
   In file included from include/linux/blkdev.h:26:
   In file included from include/linux/scatterlist.h:9:
   In file included from arch/s390/include/asm/io.h:80:
   include/asm-generic/io.h:490:61: warning: performing pointer arithmetic on a 
null pointer has undefined behavior [-Wnull-pointer-arithmetic]
   val = __le32_to_cpu((__le32 __force)__raw_readl(PCI_IOBASE + addr));
   ~~ ^
   include/uapi/linux/byteorder/big_endian.h:34:59: note: expanded from macro 
'__le32_to_cpu'
   #define __le32_to_cpu(x) __swab32((__force __u32)(__le32)(x))
 ^
   include/uapi/linux/swab.h:119:21: note: expanded from macro '__swab32'
   ___constant_swab32(x) : \
  ^
   include/uapi/linux/swab.h:21:12: note: expanded from macro 
'___constant_swab32'
   (((__u32)(x) & (__u32)0x00ffUL) >>  8) |\
 ^
   In file included from drivers/nvme/target/fc.c:8:
   In file included from include/linux/blk-mq.h:5:
   In file included from include/linux/blkdev.h:26:
   In file included from include/linux/scatterlist.h:9:
   In file included from arch/s390/include/asm/io.h:80:
   include/asm-generic/io.h:490:61: warning: performing pointer arithmetic on a 
null pointer has undefined behavior [-Wnull-pointer-arithmetic]
   val = __le32_to_cpu((__le32 __force)__raw_readl(PCI_IOBASE + addr));
   ~~ ^
   include/uapi/linux/byteorder/big_endian.h:34:59: note: expanded from macro 
'__le32_to_cpu'
   #define __le32_to_cpu(x) __swab32((__force __u32)(__le32)(x))
 ^
   include/uapi/linux/swab.h:119:21: note: expanded from macro '__swab32'
   ___constant_swab32(x) :

drivers/mmc/host/s3cmci.c:1684:21: sparse: sparse: incorrect type in argument 1 (different address spaces)

2020-12-25 Thread kernel test robot

Hi Krzysztof,

First bad commit (maybe != root cause):

tree:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 
master
head:   5814bc2d4cc241c1a603fac2b5bf1bd4daa108fc
commit: 1b0e4a2141c7bf6a122f1e04cbc1690b835707cf mmc: s3cmci: enable compile 
testing
date:   6 weeks ago
config: openrisc-randconfig-s031-20201223 (attached as .config)
compiler: or1k-linux-gcc (GCC) 9.3.0
reproduce:
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# apt-get install sparse
# sparse version: v0.6.3-184-g1b896707-dirty
# 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=1b0e4a2141c7bf6a122f1e04cbc1690b835707cf
git remote add linus 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
git fetch --no-tags linus master
git checkout 1b0e4a2141c7bf6a122f1e04cbc1690b835707cf
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross C=1 
CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' ARCH=openrisc 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 


"sparse warnings: (new ones prefixed by >>)"
>> drivers/mmc/host/s3cmci.c:1684:21: sparse: sparse: incorrect type in 
>> argument 1 (different address spaces) @@ expected void *addr @@ got 
>> void [noderef] __iomem *base @@
   drivers/mmc/host/s3cmci.c:1684:21: sparse: expected void *addr
   drivers/mmc/host/s3cmci.c:1684:21: sparse: got void [noderef] __iomem 
*base
   drivers/mmc/host/s3cmci.c:1726:21: sparse: sparse: incorrect type in 
argument 1 (different address spaces) @@ expected void *addr @@ got 
void [noderef] __iomem *base @@
   drivers/mmc/host/s3cmci.c:1726:21: sparse: expected void *addr
   drivers/mmc/host/s3cmci.c:1726:21: sparse: got void [noderef] __iomem 
*base

vim +1684 drivers/mmc/host/s3cmci.c

e584e07502131fb Sergio Prado   2017-03-31  1507  
e584e07502131fb Sergio Prado   2017-03-31  1508  static int 
s3cmci_probe(struct platform_device *pdev)
e584e07502131fb Sergio Prado   2017-03-31  1509  {
e584e07502131fb Sergio Prado   2017-03-31  1510 struct s3cmci_host 
*host;
e584e07502131fb Sergio Prado   2017-03-31  1511 struct mmc_host *mmc;
e584e07502131fb Sergio Prado   2017-03-31  1512 int ret;
e584e07502131fb Sergio Prado   2017-03-31  1513  
e584e07502131fb Sergio Prado   2017-03-31  1514 mmc = 
mmc_alloc_host(sizeof(struct s3cmci_host), &pdev->dev);
e584e07502131fb Sergio Prado   2017-03-31  1515 if (!mmc) {
e584e07502131fb Sergio Prado   2017-03-31  1516 ret = -ENOMEM;
e584e07502131fb Sergio Prado   2017-03-31  1517 goto probe_out;
916a30775fc843e Ben Dooks  2009-10-01  1518 }
916a30775fc843e Ben Dooks  2009-10-01  1519  
be518018c6b9224 Thomas Kleffel 2008-06-30  1520 host = mmc_priv(mmc);
be518018c6b9224 Thomas Kleffel 2008-06-30  1521 host->mmc   = mmc;
be518018c6b9224 Thomas Kleffel 2008-06-30  1522 host->pdev  = pdev;
e584e07502131fb Sergio Prado   2017-03-31  1523  
e584e07502131fb Sergio Prado   2017-03-31  1524 if (pdev->dev.of_node)
e584e07502131fb Sergio Prado   2017-03-31  1525 ret = 
s3cmci_probe_dt(host);
e584e07502131fb Sergio Prado   2017-03-31  1526 else
e584e07502131fb Sergio Prado   2017-03-31  1527 ret = 
s3cmci_probe_pdata(host);
e584e07502131fb Sergio Prado   2017-03-31  1528  
e584e07502131fb Sergio Prado   2017-03-31  1529 if (ret)
e584e07502131fb Sergio Prado   2017-03-31  1530 goto 
probe_free_host;
be518018c6b9224 Thomas Kleffel 2008-06-30  1531  
edb5a98e43682d6 Ben Dooks  2008-06-30  1532 host->pdata = 
pdev->dev.platform_data;
edb5a98e43682d6 Ben Dooks  2008-06-30  1533  
be518018c6b9224 Thomas Kleffel 2008-06-30  1534 
spin_lock_init(&host->complete_lock);
be518018c6b9224 Thomas Kleffel 2008-06-30  1535 
tasklet_init(&host->pio_tasklet, pio_tasklet, (unsigned long) host);
be518018c6b9224 Thomas Kleffel 2008-06-30  1536  
e584e07502131fb Sergio Prado   2017-03-31  1537 if (host->is2440) {
be518018c6b9224 Thomas Kleffel 2008-06-30  1538 host->sdiimsk   
= S3C2440_SDIIMSK;
be518018c6b9224 Thomas Kleffel 2008-06-30  1539 host->sdidata   
= S3C2440_SDIDATA;
be518018c6b9224 Thomas Kleffel 2008-06-30  1540 host->clk_div   
= 1;
be518018c6b9224 Thomas Kleffel 2008-06-30  1541 } else {
be518018c6b9224 Thomas Kleffel 2008-06-30  1542 host->sdiimsk   
= S3C2410_SDIIMSK;
be518018c6b9224 Thomas Kleffel 2008-06-30  1543 host->sdidata   
= S3C2410_SDIDATA;
be518018c6b9224 Thomas Kleffel 2008-06-30  1544 host->clk_div   
= 2;
be518018c6b

[RFC PATCH v2 01/19] dyndbg: fix use before null check

2020-12-25 Thread Jim Cromie

In commit a2d375eda771 ("dyndbg: refine export, rename to
dynamic_debug_exec_queries()"), a string is copied before checking it
isn't NULL.  Fix this, report a usage/interface error, and return the
proper error code.

Fixes: a2d375eda771 ("dyndbg: refine export, rename to 
dynamic_debug_exec_queries()")
Cc: sta...@vger.kernel.org
--
-v2 drop comment tweak, improve commit message

Signed-off-by: Jim Cromie 
---
 lib/dynamic_debug.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/lib/dynamic_debug.c b/lib/dynamic_debug.c
index bd7b3aaa93c3..c70d6347afa2 100644
--- a/lib/dynamic_debug.c
+++ b/lib/dynamic_debug.c
@@ -561,9 +561,14 @@ static int ddebug_exec_queries(char *query, const char 
*modname)
 int dynamic_debug_exec_queries(const char *query, const char *modname)
 {
int rc;
-   char *qry = kstrndup(query, PAGE_SIZE, GFP_KERNEL);
+   char *qry; /* writable copy of query */
 
-   if (!query)
+   if (!query) {
+   pr_err("non-null query/command string expected\n");
+   return -EINVAL;
+   }
+   qry = kstrndup(query, PAGE_SIZE, GFP_KERNEL);
+   if (!qry)
return -ENOMEM;
 
rc = ddebug_exec_queries(qry, modname);
-- 
2.29.2

[RFC PATCH v2 17/19] dyndbg: rearrange struct ddebug_callsites

2020-12-25 Thread Jim Cromie

move static-key field to top of struct.  It is the biggest field, and
most alignment constrained (I believe), so this improves ambient
pahole conditions.

It doesn't actually improve the packing, it only simplifies and
shrinks the pahole reporting.

probably drop during rebase, cleanup.

Signed-off-by: Jim Cromie 
---
 include/linux/dynamic_debug.h | 13 +++--
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/include/linux/dynamic_debug.h b/include/linux/dynamic_debug.h
index 0fcbe96736f3..e7b5e7664e51 100644
--- a/include/linux/dynamic_debug.h
+++ b/include/linux/dynamic_debug.h
@@ -25,7 +25,14 @@ struct _ddebug_callsite {
 };
 
 struct _ddebug {
+#ifdef CONFIG_JUMP_LABEL
+   union {
+   struct static_key_true dd_key_true;
+   struct static_key_false dd_key_false;
+   } key;
+#endif
struct _ddebug_callsite *site;
+
/* format is always needed, lineno shares word with flags */
const char *format;
const unsigned lineno:18;
@@ -56,12 +63,6 @@ struct _ddebug {
 #define _DPRINTK_FLAGS_DEFAULT 0
 #endif
unsigned int flags:8;
-#ifdef CONFIG_JUMP_LABEL
-   union {
-   struct static_key_true dd_key_true;
-   struct static_key_false dd_key_false;
-   } key;
-#endif
 } __aligned(8);
 
 
-- 
2.29.2

[RFC PATCH v2 18/19] dyndbg: add module_index to struct _ddebug

2020-12-25 Thread Jim Cromie

We seek to drop the pointer added when we split struct _ddebug.  This
would restore our footprint to parity worst case, with all the upsides
of callsite overhead management.

we will rely upon existing structure (and modify it):

__dyndbg[], __dyndbg_callsites[] are parallel arrays, packed in order
by the linker [*].  Together the pairs compose each prdebug's state.
Each _ddebug[].site points at the corresponding _ddebug_callsite[],
visually like a ladder.

At init we (already) iterate over __dyndbg[] & __dyndbg_callsites[],
and parse them into sub-vectors reffered/owned by ddebug_table's that
we create and collect in a list.  We also prove via BUG_ON(site_iter
!= iter->site) that the ladder is straight.

IFF we can recreate site_iter, we can drop the .site pointer.  But
site_iter is really just &__dyndbg_callsites[module_index], so we add
module_index here and now.

This doesn't finish the job.

With module_index, we can find the vector root.  Now we need a
backlink up to the sponsor/owner ddebug_table record.
This has a few steps:

1 claim space in the linkage, for a module header.
  DECLARE_DYNAMIC_DEBUG_TABLE
  a special variety of DECLARE_DYNAMIC_DEBUG_METADATA,
  sounds like it fits.
  ideally its transparent, registers with module somehow.
  linker needs to put this 1st in module's sub-vector

2 define that header - maybe something like:

  union ddebug_table_header {
struct ddebug_table *owner;
struct _ddebug item;
  }
  OR
  struct ddebug_table_vector {
 struct ddebug_table *owner;
 struct _ddebug vector[];
  }

So with module_index, we can go from our _ddebug element in our
sub-vector to the root, and container_of() to get the backlink up to
the ddebug_table, then down to the sites[N]

The backlink is sort-of a poor-mans list, so why not just use one ?
- we only have 2 vectors in the list
- they never change size
- we only go up one leg, and down other
- list rotation would just confuse.

Given the amount of space in a struct _ddebug (40 bytes), we might be
able to pack part or all of ddebug_table into it, share it in a union,
and avoid kmalloc'g them at all.

if DECLARE_DYNAMIC_DEBUG_TABLE can be forced into the linkage at
the front of each modules sub-section, we can probably initialize
module_index statically, and not have to do it during _init().

add DEFINE_DYNAMIC_DEBUG_TABLE(module) as a special case of:
DEFINE_DYNAMIC_DEBUG_METADATA(module, "00-1st-in-subsection")

It's just a "callsite" we expect not to use, except as a 0th element
we can compute an offset from, to initialize each ddebug.module_index.

The next step is to get one into a module by brute force, see if it
compiles, and places this header record where we need it.
If that doesnt work, we need to fix it.

Later, this macro will be adapted to initialize a pair of alternate
structures; analogous to structs _ddebug & _ddebug_callsite, equal or
smaller in size, and implemented as structs unionized with them.  This
pair of alt-structs is big enough to contain all of ddebug_table's
fields.  For reference, the respective sizes:

A: struct _ddebug_callsite {
/* size: 24, cachelines: 1, members: 3 */
/* last cacheline: 24 bytes */

B: struct _ddebug {
/* XXX 6 bits hole, try to pack */
/* size: 40, cachelines: 1, members: 6 */
/* sum members: 36 */
/* sum bitfield members: 26 bits, bit holes: 1, sum bit holes: 6 bits */
/* last cacheline: 40 bytes */

C: struct ddebug_table {
struct list_head   link; /* 016 */
const char  *  mod_name; /*16 8 */
unsigned int   num_ddebugs;  /*24 4 */

/* XXX 4 bytes hole, try to pack */

struct _ddebug *   ddebugs;  /*32 8 */
struct _ddebug_callsite *  sites;/*40 8 */

/* size: 48, cachelines: 1, members: 5 */
/* sum members: 44, holes: 1, sum holes: 4 */
/* last cacheline: 48 bytes */

Clearly theres enough room in A + B to hold the contents of C.  We
will keep the pointer in A" -> B", it will get us to all the contents
there, most importantly the sites vector.

It will be interesting to see just how much can be done by linker and
static initialization.  Id be tickled if the linked list init could be
done statically, but it hardly matters; __ro_after_init (however its
spelled) would probably suffice for a solid security guarantee.

I presume its ok to have a list which is partly of items in RO memory,
but is extended at runtime, in our case when modules are (un)loaded.
If not, we can keep 2 lists, for builtin and dynamic-loaded modules
respectively.

[1] DECLARE_DYNAMIC_DEBUG_METADATA may be why 2 linker sections are
in-order; it links head->body as it "allocates" them.  If we drop the
pointer, we lose the constraint on the relative ordering.  I hope not.

Signed-off-by: Jim Cromie 
---
 include/linu

[RFC PATCH v2 16/19] dyndbg: ddebug_site_get/put api commentary

2020-12-25 Thread Jim Cromie

Paths forward: (not mutually exclusive)

A: !site -> fill from backing store

1st try at this is/was using zram.  At init, it copied each callsite
into a zs-allocation, and all site-> refs afterward went thru
_get/_put to zs-map on demand, and zs-unmap the site info.  This
worked until I tried to keep callsites mapped while they're enabled.

Another approach is to compress the new linker section, using some
algo thats good at indexed decompression.  I tried to test this a bit,
using objcopy, unsuccessfully:

   objcopy --dump-section __dyndbg_callsites=dd_callsites vmlinux.o

>From vmlinux.o it was mostly empty, vmlinux didnt have the section.

B: callsite as a set of property vectors, indexed by 'N'

We know dp is in a vector, either in the builtin __dyndbg_callsite
linker section, or the same from a modprobed one.  The builtin section
has all builtin module sub-sections catenated dogether.

At init, we iterate over the section, and "parse it" by creating a
ddebug_table for each module with prdebugs.  ddebug_table.num_debugs
remembers the size of each modules' vector of prdebugs.

We need a few things:

- _ddebug.index field, which knows offset to start of this sub-vector.
  this new field will be "free" because the struct has padding.
  it can be initialized during init, then RO.

- a back-pointer at the beginning of the sub-vector, to the
  ddebug_table "owning" (but not containing) this sub-vector of
  prdebugs.

If we had both, we could get from the ddebug element to its vector
root, back up to the owning ddebug_table, then down to the _callsite
vector, and index to the right element.  While slower than a pointer
deref, this is a cold path, and it allows elimination of the
per-callsite pointer member, thus greater density of the sections, and
still can support sparse site info.

That back-pointer feels tricky.  It needs to be 1st in the sub-vector

is to reserve the 0th slot

ing its spot at the
front of

each module's ddebug sub-vector.

getting space in the
section for it.  Initializing it is easy.

prdebugs are added to section when DECLARE_DYAMIC_DEBUG_METADATA is
compiled; its unclear whether they are intrinsically sorted during
compile, or whether thats a linker thing.

Ideally, a MODULE-mumble declaration can be coaxed into declaring a
module singleton in the section(s), either naturally at the top (or
bottom) or sorted into place.

If that doesn't work, a "preload if module is different" strategy
could maybe work, but I dont know how to do that in macros.

Signed-off-by: Jim Cromie 
---
 lib/dynamic_debug.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/dynamic_debug.c b/lib/dynamic_debug.c
index 25f49515c235..ec28c113a361 100644
--- a/lib/dynamic_debug.c
+++ b/lib/dynamic_debug.c
@@ -146,7 +146,7 @@ static void vpr_info_dq(const struct ddebug_query *query, 
const char *msg)
 
 static struct _ddebug_callsite *ddebug_site_get(struct _ddebug *dp)
 {
-   return dp->site;
+   return dp->site; /* passthru abstraction */
 }
 static inline void ddebug_site_put(struct _ddebug *dp)
 {
-- 
2.29.2

[RFC PATCH v2 12/19] dyndbg: allow deleting site info via control interface

2020-12-25 Thread Jim Cromie

Allow users & subsystems to selectively delete callsite info for
individual pr-debug callsites, or groups of them.

Its purpose is for subsystems such as DRM which:

- use distinct categories for logging, and can map them over to a
  format prefix, like: "drm:core:", "drm:kms:", etc.

- are happy with group control of all the callsites in a class/cateory.
  individual control is still possible using queries including line numbers

- don't need dynamic "module:function:line:" prefixes in log messages

- don't care about loss of context in /proc/dynamic_debug/control

before:

init/initramfs.c:485 [initramfs]unpack_to_rootfs =_ "Detected %s compressed 
data\012"
init/main.c:1337 [main]run_init_process =pm "%s\012"
init/main.c:1335 [main]run_init_process =pm "  with environment:\012"
init/main.c:1334 [main]run_init_process =pm "%s\012"
init/main.c:1332 [main]run_init_process =pm "  with arguments:\012"
init/main.c:1121 [main]initcall_blacklisted =pm "initcall %s blacklisted\012"
init/main.c:1082 [main]initcall_blacklist =pm "blacklisting initcall %s\012"

then:
  bash-5.0# echo file init/main.c +D > /proc/dynamic_debug/control

after:

init/initramfs.c:485 [initramfs]unpack_to_rootfs =_ "Detected %s compressed 
data\012"
[main]:1337 =pmD "%s\012"
[main]:1335 =pmD "  with environment:\012"
[main]:1334 =pmD "%s\012"
[main]:1332 =pmD "  with arguments:\012"
[main]:1121 =pmD "initcall %s blacklisted\012"
[main]:1082 =pmD "blacklisting initcall %s\012"

Notes:

If Drm adopted dyndbg, i915 + drm* would add ~1600 prdebugs, amdgpu +
drm* would add ~3200 callsites, so the additional memory costs are
substantial.  In trade, drm and drivers would avoid lots of calls to
drm_debug_enabled().  This patch should reduce the costs.

Using this interface, drm could drop site info for all categories /
prefixes controlled by bits in drm.debug, while preserving site info
and individual selectivity for any uncategorized prdebugs.

Lastly, because lineno field was not moved into _ddebug_callsite, it
can be used to modify a single[*] callsite even if drm has dropped all
the callsite data:

  echo module $mod format ^$prefix line $line +p >control

Dropping a _callsite a one-way, information losing operation, so minor
misuse is possible.  Worst case is maybe (depending upon previous
settings) some loss of logging context/decorations.

  echo +D > /proc/dynamic_debug/control

[*] amdgpu has some macros invoking clusters of pr_debugs; each use of
them creates a cluster of pr-debugs with the same line number.

Signed-off-by: Jim Cromie 
---
 include/linux/dynamic_debug.h | 1 +
 lib/dynamic_debug.c   | 9 +
 2 files changed, 10 insertions(+)

diff --git a/include/linux/dynamic_debug.h b/include/linux/dynamic_debug.h
index ea07a91a43bc..49fa1390d1f8 100644
--- a/include/linux/dynamic_debug.h
+++ b/include/linux/dynamic_debug.h
@@ -40,6 +40,7 @@ struct _ddebug {
 #define _DPRINTK_FLAGS_INCL_FUNCNAME   (1<<2)
 #define _DPRINTK_FLAGS_INCL_LINENO (1<<3)
 #define _DPRINTK_FLAGS_INCL_TID(1<<4)
+#define _DPRINTK_FLAGS_DELETE_SITE (1<<7) /* drop site info to save ram */
 
 #define _DPRINTK_FLAGS_INCL_ANYSITE\
(_DPRINTK_FLAGS_INCL_MODNAME\
diff --git a/lib/dynamic_debug.c b/lib/dynamic_debug.c
index 6203a6ad1706..2d10fc1e16cd 100644
--- a/lib/dynamic_debug.c
+++ b/lib/dynamic_debug.c
@@ -90,6 +90,7 @@ static struct { unsigned flag:8; char opt_char; } opt_array[] 
= {
{ _DPRINTK_FLAGS_INCL_LINENO, 'l' },
{ _DPRINTK_FLAGS_INCL_TID, 't' },
{ _DPRINTK_FLAGS_NONE, '_' },
+   { _DPRINTK_FLAGS_DELETE_SITE, 'D' },
 };
 
 struct flagsbuf { char buf[ARRAY_SIZE(opt_array)+1]; };
@@ -198,6 +199,14 @@ static inline void ddebug_alter_site(struct _ddebug *dp,
} else if (modifiers->flags & _DPRINTK_FLAGS_PRINT)
static_branch_enable(&dp->key.dd_key_true);
 #endif
+   /* delete site info for this callsite */
+   if (modifiers->flags & _DPRINTK_FLAGS_DELETE_SITE) {
+   if (dp->site) {
+   vpr_info("dropping site info %s.%s.%d\n", 
dp->site->filename,
+   dp->site->function, dp->lineno);
+   dp->site = NULL;
+   }
+   }
 }
 
 /*
-- 
2.29.2

[RFC PATCH v2 15/19] dyndbg: add ddebug_site_get/put api with pass-thru impl

2020-12-25 Thread Jim Cromie

Now that site info is optional, abstract it so we can manage it more
flexibly later.  Change all site users to use ddebug_site_get(p)
instead, which just returns ->site.  ddebug_site_put is called to
balance gets, it currently does nothing.

no functional changes.

Signed-off-by: Jim Cromie 
---
 lib/dynamic_debug.c | 27 ++-
 1 file changed, 22 insertions(+), 5 deletions(-)

diff --git a/lib/dynamic_debug.c b/lib/dynamic_debug.c
index 8ad9be28f38e..25f49515c235 100644
--- a/lib/dynamic_debug.c
+++ b/lib/dynamic_debug.c
@@ -144,6 +144,14 @@ static void vpr_info_dq(const struct ddebug_query *query, 
const char *msg)
 query->first_lineno, query->last_lineno);
 }
 
+static struct _ddebug_callsite *ddebug_site_get(struct _ddebug *dp)
+{
+   return dp->site;
+}
+static inline void ddebug_site_put(struct _ddebug *dp)
+{
+}
+
 static int ddebug_match_site(const struct ddebug_query *query,
 const struct _ddebug *dp,
 const struct _ddebug_callsite *dc)
@@ -242,13 +250,13 @@ static int ddebug_change(const struct ddebug_query *query,
struct _ddebug_callsite *dc = dp->site;
 
if (!ddebug_match_site(query, dp, dc))
-   continue;
+   goto skipsite;
 
nfound++;
 
newflags = (dp->flags & modifiers->mask) | 
modifiers->flags;
if (newflags == dp->flags)
-   continue;
+   goto skipsite;
 
ddebug_alter_site(dp, modifiers);
 
@@ -264,6 +272,9 @@ static int ddebug_change(const struct ddebug_query *query,
  dt->mod_name, dp->lineno,
  ddebug_describe_flags(dp->flags, 
&fbuf),
  dp->format);
+
+   skipsite:
+   ddebug_site_put(dp);
}
}
mutex_unlock(&ddebug_lock);
@@ -633,11 +644,11 @@ static int remaining(int wrote)
return 0;
 }
 
-static char *__dynamic_emit_prefix(const struct _ddebug *dp, char *buf)
+static char *__dynamic_emit_prefix(struct _ddebug *dp, char *buf)
 {
int pos_after_tid;
int pos = 0;
-   const struct _ddebug_callsite *desc = dp->site;
+   const struct _ddebug_callsite *desc;
 
*buf = '\0';
 
@@ -653,6 +664,7 @@ static char *__dynamic_emit_prefix(const struct _ddebug 
*dp, char *buf)
if (!(dp->flags & _DPRINTK_FLAGS_INCL_ANYSITE))
return buf;
 
+   desc = ddebug_site_get(dp);
if (desc) {
if (dp->flags & _DPRINTK_FLAGS_INCL_MODNAME)
pos += snprintf(buf + pos, remaining(pos), "%s:",
@@ -670,6 +682,8 @@ static char *__dynamic_emit_prefix(const struct _ddebug 
*dp, char *buf)
if (pos >= PREFIX_SIZE)
buf[PREFIX_SIZE - 1] = '\0';
 
+   ddebug_site_put(dp);
+
return buf;
 }
 
@@ -952,7 +966,8 @@ static int ddebug_proc_show(struct seq_file *m, void *p)
return 0;
}
 
-   dc = dp->site;
+   dc = ddebug_site_get(dp);
+
if (dc) {
seq_printf(m, "%s:%u [%s]%s =%s \"",
   trim_prefix(dc->filename), dp->lineno,
@@ -968,6 +983,8 @@ static int ddebug_proc_show(struct seq_file *m, void *p)
seq_puts(m, "\"\n");
}
 
+   ddebug_site_put(dp);
+
return 0;
 }
 
-- 
2.29.2

[RFC PATCH v2 11/19] dyndbg: refactor ddebug_alter_site out of ddebug_change

2020-12-25 Thread Jim Cromie

Move the JUMP_LABEL/static-key code to a separate inline function.

no functional changes.

Signed-off-by: Jim Cromie 
---
 lib/dynamic_debug.c | 22 +++---
 1 file changed, 15 insertions(+), 7 deletions(-)

diff --git a/lib/dynamic_debug.c b/lib/dynamic_debug.c
index daded73c8575..6203a6ad1706 100644
--- a/lib/dynamic_debug.c
+++ b/lib/dynamic_debug.c
@@ -188,6 +188,18 @@ static int ddebug_match_site(const struct ddebug_query 
*query,
return true;
 }
 
+static inline void ddebug_alter_site(struct _ddebug *dp,
+struct flag_settings *modifiers)
+{
+#ifdef CONFIG_JUMP_LABEL
+   if (dp->flags & _DPRINTK_FLAGS_PRINT) {
+   if (!(modifiers->flags & _DPRINTK_FLAGS_PRINT))
+   static_branch_disable(&dp->key.dd_key_true);
+   } else if (modifiers->flags & _DPRINTK_FLAGS_PRINT)
+   static_branch_enable(&dp->key.dd_key_true);
+#endif
+}
+
 /*
  * Search the tables for _ddebug's which match the given `query' and
  * apply the `flags' and `mask' to them.  Returns number of matching
@@ -224,13 +236,9 @@ static int ddebug_change(const struct ddebug_query *query,
newflags = (dp->flags & modifiers->mask) | 
modifiers->flags;
if (newflags == dp->flags)
continue;
-#ifdef CONFIG_JUMP_LABEL
-   if (dp->flags & _DPRINTK_FLAGS_PRINT) {
-   if (!(modifiers->flags & _DPRINTK_FLAGS_PRINT))
-   
static_branch_disable(&dp->key.dd_key_true);
-   } else if (modifiers->flags & _DPRINTK_FLAGS_PRINT)
-   static_branch_enable(&dp->key.dd_key_true);
-#endif
+
+   ddebug_alter_site(dp, modifiers);
+
dp->flags = newflags;
 
if (dc)
-- 
2.29.2

[RFC PATCH v2 09/19] dyndbg: optimize ddebug_emit_prefix

2020-12-25 Thread Jim Cromie

Add early return if no callsite info is specified in site-flags.
This avoids fetching site info that isn't going to be printed.
RFC: is this a proper place to use likely()?

Signed-off-by: Jim Cromie 
---
 include/linux/dynamic_debug.h | 9 +
 lib/dynamic_debug.c   | 3 +++
 2 files changed, 12 insertions(+)

diff --git a/include/linux/dynamic_debug.h b/include/linux/dynamic_debug.h
index e155dfafc4d9..ea07a91a43bc 100644
--- a/include/linux/dynamic_debug.h
+++ b/include/linux/dynamic_debug.h
@@ -40,6 +40,15 @@ struct _ddebug {
 #define _DPRINTK_FLAGS_INCL_FUNCNAME   (1<<2)
 #define _DPRINTK_FLAGS_INCL_LINENO (1<<3)
 #define _DPRINTK_FLAGS_INCL_TID(1<<4)
+
+#define _DPRINTK_FLAGS_INCL_ANYSITE\
+   (_DPRINTK_FLAGS_INCL_MODNAME\
+| _DPRINTK_FLAGS_INCL_FUNCNAME \
+| _DPRINTK_FLAGS_INCL_LINENO)
+#define _DPRINTK_FLAGS_INCL_ANY\
+   (_DPRINTK_FLAGS_INCL_ANYSITE\
+| _DPRINTK_FLAGS_INCL_TID)
+
 #if defined DEBUG
 #define _DPRINTK_FLAGS_DEFAULT _DPRINTK_FLAGS_PRINT
 #else
diff --git a/lib/dynamic_debug.c b/lib/dynamic_debug.c
index f6d8137e4a46..8e81ce58c1bd 100644
--- a/lib/dynamic_debug.c
+++ b/lib/dynamic_debug.c
@@ -629,6 +629,9 @@ static char *dynamic_emit_prefix(const struct _ddebug *dp, 
char *buf)
}
pos_after_tid = pos;
 
+   if (!(dp->flags & _DPRINTK_FLAGS_INCL_ANYSITE))
+   return buf;
+
if (desc) {
if (dp->flags & _DPRINTK_FLAGS_INCL_MODNAME)
pos += snprintf(buf + pos, remaining(pos), "%s:",
-- 
2.29.2

[RFC PATCH v2 19/19] dyndbg: try DEFINE_DYNAMIC_DEBUG_TABLE

2020-12-25 Thread Jim Cromie

Test DEFINE_DYNAMIC_DEBUG_TABLE, in i915.ko, by adding an invocation
into i915_drv.c, the 1st object built in the Makefile.  This does
manually what can perhaps be done transparently in headers (hopefully).

DEFINE_DYNAMIC_DEBUG_TABLE is based on DEFINE_DYNAMIC_DEBUG_METADATA;
just like its model, it creates a pair of structs: _ddebug &
_ddebug_callsite, and inits them with format="", lineno=0.

This:
- identifies it clearly in control output
- appears to place it 1st in the section(s)
  it works here, at least for modprobed module
  1st-to-build might be the real reason for the sort.

drivers/gpu/drm/i915/i915_drv.c:0 [i915](null) =_ ""
drivers/gpu/drm/i915/gvt/gvt.c:437 [i915]intel_gvt_register_hypervisor =_ "gvt: 
core: Running with hypervisor %s
drivers/gpu/drm/i915/gvt/gvt.c:378 [i915]intel_gvt_init_device =_ "gvt: core: 
gvt device initialization is done\0

other big diff between DEFINE_DYNAMIC_DEBUG_TABLE/_METADATA:
- not static decls.
  removing static made the line appear in control output (below).
  (cuz it might be reffd elsewhere, so its linked).
  we want this.

  it also needs to be visible where DEFINE_DYNAMIC_DEBUG_METADATA is
  used, so it can be used to initialize .module_index, ie in many other
  objects where pr_debugs are coded.

A peek inside i915_drv.o looks about right; i915_site is expected at least.

Relocation section '.rela__dyndbg' at offset 0x6f48 contains 2 entries:
  Offset  Info   Type   Sym. ValueSym. Name + Addend
0010  014c0001 R_X86_64_64    i915_site + 0
0018  001a0001 R_X86_64_64    .rodata.str1.1 + 
548

Relocation section '.rela__dyndbg_callsites' at offset 0x6f90 contains 2 
entries:
  Offset  Info   Type   Sym. ValueSym. Name + Addend
  001a0001 R_X86_64_64    .rodata.str1.1 + 
d8
0008  000a0001 R_X86_64_64    .rodata.str1.8 + 
110

Signed-off-by: Jim Cromie 
---
 drivers/gpu/drm/i915/i915_drv.c |  3 +++
 include/linux/dynamic_debug.h   | 47 ++---
 2 files changed, 35 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index acc32066cec3..f2fae2a476db 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -88,6 +88,9 @@
 
 static struct drm_driver driver;
 
+DEFINE_DYNAMIC_DEBUG_TABLE(i915);
+// DEFINE_DYNAMIC_DEBUG_TABLE(THIS_MODULE);  // one alternative form
+
 static int i915_get_bridge_dev(struct drm_i915_private *dev_priv)
 {
int domain = pci_domain_nr(dev_priv->drm.pdev->bus);
diff --git a/include/linux/dynamic_debug.h b/include/linux/dynamic_debug.h
index 0f4e703c97ee..2f3c0f35cea0 100644
--- a/include/linux/dynamic_debug.h
+++ b/include/linux/dynamic_debug.h
@@ -102,6 +102,36 @@ void __dynamic_ibdev_dbg(struct _ddebug *descriptor,
 const struct ib_device *ibdev,
 const char *fmt, ...);
 
+/** DEFINE_DYNAMIC_DEBUG_TABLE(module)
+ *
+ * This is an analogue to DEFINE_DYNAMIC_DEBUG_METADATA()
+
+ * It's just a "callsite" whose primary purpose is to create or
+ * reserve the 0th element pair in our sub-vectors of __dyndbg[] &
+ * __dyndbg_callsites[].  The format & lineno are set to sort 1st,
+ * though I suspect our good order is more about linkage.  Either way,
+ * the TABLE is appearing 1st in control's i915 output.
+
+ * We want to use it to initialize .module_index, but I was unable to
+ * ref &module##_base by any construct I thought to try;
+ * KBUILD_MODNAME in particular didnt work.
+
+ */
+
+#define DEFINE_DYNAMIC_DEBUG_TABLE(module) \
+   struct _ddebug_callsite  __aligned(8)   \
+   __section("__dyndbg_callsites") module##_site = {   \
+   .modname = KBUILD_MODNAME,  \
+   .filename = __FILE__,   \
+   .function = NULL,   \
+   };  \
+   struct _ddebug  __aligned(8)\
+   __section("__dyndbg") module##_base = { \
+   .site = &module##_site, \
+   .format = "",   \
+   .lineno = 0, /* sort on mod, line */\
+   }
+
 #define DEFINE_DYNAMIC_DEBUG_METADATA(name, fmt)   \
static struct _ddebug_callsite  __aligned(8)\
__section("__dyndbg_callsites") name##_site = { \
@@ -115,23 +145,10 @@ void __dynamic_ibdev_dbg(struct _ddebug *descriptor,
.format = (fmt),\
.lineno = __LINE__, \
.flags = _DPRINTK_FLAGS_DEFAULT,\
-   _DPRINTK_KEY_INIT

[RFC PATCH v2 13/19] dyndbg: verify dyndbg & dyndbg_callsite invariant

2020-12-25 Thread Jim Cromie

Prove that linker + DECLARE_DYNAMIC_DEBUG_METADATA reliably place the
2 related struct _ddebug* initializations into parallel/ordered slots
in the __dyndbg_* sections.

This is a step towards dropping the pointer between the 2 structs;
maybe the 2 vectors stay ordered, and we can deduce and use N.  Of
course this test won't survive, since it needs the pointer we seek to
drop, but its a start.

0- iterate over __dyndbg_callsite in parallel with __dyndbg
   rename var: s/iter_start/iter_mod_start/ for clarity, consistency.

I disregarded a checkpatch warning about externs in c-files, staying
consistent with long-standing code seemed better.

1- prove that iter->site == site_iter.
   DECLARE_DYNAMIC_DEBUG_METADATA + linker insure this now
   Maybe we can drop pointer, still get order.

WRT the debug-printing, its noisy, but only with verbose=3.
It warrants trimming later.

The offset grows smoothly, because it is N * sizeof(structs), which
differ.  It looks reliable.  Amend later to do math, converge on
truth.  If numbers are stable after stripping pointer, we have N.

   recptr  mod-ptr N (void*)p
[1.929072] dyndbg: 2828: 82b32f28 82b32f10 1 24 40
[1.929326] dyndbg: 2829: 82b32f40 82b32f10 2 48 80
[1.930209] dyndbg:   2 debug prints in module i386

We have N (col 4), and N * structsize (col 5).  I feel like it still
needs more staring at.

Signed-off-by: Jim Cromie 
---
 lib/dynamic_debug.c | 28 +---
 1 file changed, 21 insertions(+), 7 deletions(-)

diff --git a/lib/dynamic_debug.c b/lib/dynamic_debug.c
index 2d10fc1e16cd..c1a113460637 100644
--- a/lib/dynamic_debug.c
+++ b/lib/dynamic_debug.c
@@ -41,6 +41,8 @@
 
 extern struct _ddebug __start___dyndbg[];
 extern struct _ddebug __stop___dyndbg[];
+extern struct _ddebug_callsite __start___dyndbg_callsites[];
+extern struct _ddebug_callsite __stop___dyndbg_callsites[];
 
 struct ddebug_table {
struct list_head link;
@@ -119,6 +121,7 @@ do {
\
 
 #define vpr_info(fmt, ...) vnpr_info(1, fmt, ##__VA_ARGS__)
 #define v2pr_info(fmt, ...)vnpr_info(2, fmt, ##__VA_ARGS__)
+#define v3pr_info(fmt, ...)vnpr_info(2, fmt, ##__VA_ARGS__)
 
 static void vpr_info_dq(const struct ddebug_query *query, const char *msg)
 {
@@ -1147,7 +1150,8 @@ static int __init dynamic_debug_init_control(void)
 
 static int __init dynamic_debug_init(void)
 {
-   struct _ddebug *iter, *iter_start;
+   struct _ddebug *iter, *iter_mod_start;
+   struct _ddebug_callsite *site, *site_mod_start;
const char *modname = NULL;
char *cmdline;
int ret = 0;
@@ -1162,23 +1166,33 @@ static int __init dynamic_debug_init(void)
ddebug_init_success = 1;
return 0;
}
-   iter = __start___dyndbg;
+
+   iter = iter_mod_start = __start___dyndbg;
+   site = site_mod_start = __start___dyndbg_callsites;
modname = iter->site->modname;
-   iter_start = iter;
-   for (; iter < __stop___dyndbg; iter++) {
+
+   for (; iter < __stop___dyndbg; iter++, site++) {
+
+   BUG_ON(site != iter->site);
+   v3pr_info("%u: %px %ld %ld %ld\n", entries, site,
+ site - site_mod_start,
+ ((void *)site - (void *)site_mod_start),
+ ((void *)iter - (void *)iter_mod_start));
entries++;
+
if (strcmp(modname, iter->site->modname)) {
modct++;
-   ret = ddebug_add_module(iter_start, n, modname);
+   ret = ddebug_add_module(iter_mod_start, n, modname);
if (ret)
goto out_err;
n = 0;
modname = iter->site->modname;
-   iter_start = iter;
+   iter_mod_start = iter;
+   site_mod_start = site;
}
n++;
}
-   ret = ddebug_add_module(iter_start, n, modname);
+   ret = ddebug_add_module(iter_mod_start, n, modname);
if (ret)
goto out_err;
 
-- 
2.29.2

[RFC PATCH v2 10/19] dyndbg: avoid calling dyndbg_emit_prefix when it has no work

2020-12-25 Thread Jim Cromie

Wrap function in a static-inline one, which checks flags to avoid
calling the function unnecessarily.

Signed-off-by: Jim Cromie 
---
 lib/dynamic_debug.c | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/lib/dynamic_debug.c b/lib/dynamic_debug.c
index 8e81ce58c1bd..daded73c8575 100644
--- a/lib/dynamic_debug.c
+++ b/lib/dynamic_debug.c
@@ -612,7 +612,7 @@ static int remaining(int wrote)
return 0;
 }
 
-static char *dynamic_emit_prefix(const struct _ddebug *dp, char *buf)
+static char *__dynamic_emit_prefix(const struct _ddebug *dp, char *buf)
 {
int pos_after_tid;
int pos = 0;
@@ -652,6 +652,13 @@ static char *dynamic_emit_prefix(const struct _ddebug *dp, 
char *buf)
return buf;
 }
 
+static inline char *dynamic_emit_prefix(struct _ddebug *dp, char *buf)
+{
+   if (unlikely(dp->flags & _DPRINTK_FLAGS_INCL_ANY))
+   return __dynamic_emit_prefix(dp, buf);
+   return buf;
+}
+
 void __dynamic_pr_debug(struct _ddebug *descriptor, const char *fmt, ...)
 {
va_list args;
-- 
2.29.2

[RFC PATCH v2 07/19] dyndbg: accept null site in dynamic_emit_prefix

2020-12-25 Thread Jim Cromie

2 prints use site->member, protect them with if site.

no functional changes.

Signed-off-by: Jim Cromie 
---
 lib/dynamic_debug.c | 16 ++--
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/lib/dynamic_debug.c b/lib/dynamic_debug.c
index 5422cef58130..190a796da03a 100644
--- a/lib/dynamic_debug.c
+++ b/lib/dynamic_debug.c
@@ -628,15 +628,19 @@ static char *dynamic_emit_prefix(const struct _ddebug 
*dp, char *buf)
task_pid_vnr(current));
}
pos_after_tid = pos;
-   if (dp->flags & _DPRINTK_FLAGS_INCL_MODNAME)
-   pos += snprintf(buf + pos, remaining(pos), "%s:",
-   desc->modname);
-   if (dp->flags & _DPRINTK_FLAGS_INCL_FUNCNAME)
-   pos += snprintf(buf + pos, remaining(pos), "%s:",
-   desc->function);
+
+   if (desc) {
+   if (dp->flags & _DPRINTK_FLAGS_INCL_MODNAME)
+   pos += snprintf(buf + pos, remaining(pos), "%s:",
+   desc->modname);
+   if (dp->flags & _DPRINTK_FLAGS_INCL_FUNCNAME)
+   pos += snprintf(buf + pos, remaining(pos), "%s:",
+   desc->function);
+   }
if (dp->flags & _DPRINTK_FLAGS_INCL_LINENO)
pos += snprintf(buf + pos, remaining(pos), "%d:",
dp->lineno);
+
if (pos - pos_after_tid)
pos += snprintf(buf + pos, remaining(pos), " ");
if (pos >= PREFIX_SIZE)
-- 
2.29.2

[RFC PATCH v2 14/19] dyndbg+module: expose ddebug_callsites to modules

2020-12-25 Thread Jim Cromie

In order to drop the pointer connecting _ddebug records to _callsites,
we need to elevate the latter; we need to track it in (internal)
ddebug_tables, and set it in ddebug_add_module.  That last part
exposes it by interface to module.c, so we add a field to load_info,
and adjust load_module to initialize it from the elf section.

Its possible that this closes a hole created when __dyndbg_callsites
section was added, and wasnt handled by module load-info.  I never saw
any misbehavior loading i915.ko into a vm, but still..

Signed-off-by: Jim Cromie 
---
 include/linux/dynamic_debug.h |  4 ++--
 kernel/module-internal.h  |  1 +
 kernel/module.c   |  9 ++---
 lib/dynamic_debug.c   | 12 
 4 files changed, 17 insertions(+), 9 deletions(-)

diff --git a/include/linux/dynamic_debug.h b/include/linux/dynamic_debug.h
index 49fa1390d1f8..0fcbe96736f3 100644
--- a/include/linux/dynamic_debug.h
+++ b/include/linux/dynamic_debug.h
@@ -70,8 +70,8 @@ struct _ddebug {
 /* exported for module authors to exercise >control */
 int dynamic_debug_exec_queries(const char *query, const char *modname);
 
-int ddebug_add_module(struct _ddebug *tab, unsigned int n,
-   const char *modname);
+int ddebug_add_module(struct _ddebug *tab, struct _ddebug_callsite *sites,
+ unsigned int n, const char *modname);
 extern int ddebug_remove_module(const char *mod_name);
 extern __printf(2, 3)
 void __dynamic_pr_debug(struct _ddebug *descriptor, const char *fmt, ...);
diff --git a/kernel/module-internal.h b/kernel/module-internal.h
index 33783abc377b..920b085d2a1b 100644
--- a/kernel/module-internal.h
+++ b/kernel/module-internal.h
@@ -18,6 +18,7 @@ struct load_info {
char *secstrings, *strtab;
unsigned long symoffs, stroffs, init_typeoffs, core_typeoffs;
struct _ddebug *debug;
+   struct _ddebug_callsite *sites;
unsigned int num_debug;
bool sig_ok;
 #ifdef CONFIG_KALLSYMS
diff --git a/kernel/module.c b/kernel/module.c
index a4fa44a652a7..876765bc666a 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -2820,11 +2820,12 @@ static void add_kallsyms(struct module *mod, const 
struct load_info *info)
 }
 #endif /* CONFIG_KALLSYMS */
 
-static void dynamic_debug_setup(struct module *mod, struct _ddebug *debug, 
unsigned int num)
+static void dynamic_debug_setup(struct module *mod, struct _ddebug *debug,
+   struct _ddebug_callsite *sites, unsigned int 
num)
 {
if (!debug)
return;
-   ddebug_add_module(debug, num, mod->name);
+   ddebug_add_module(debug, sites, num, mod->name);
 }
 
 static void dynamic_debug_remove(struct module *mod, struct _ddebug *debug)
@@ -3299,6 +3300,8 @@ static int find_module_sections(struct module *mod, 
struct load_info *info)
 
info->debug = section_objs(info, "__dyndbg",
   sizeof(*info->debug), &info->num_debug);
+   info->sites = section_objs(info, "__dyndbg_callsites",
+  sizeof(*info->sites), &info->num_debug);
 
return 0;
 }
@@ -3937,7 +3940,7 @@ static int load_module(struct load_info *info, const char 
__user *uargs,
goto free_arch_cleanup;
}
 
-   dynamic_debug_setup(mod, info->debug, info->num_debug);
+   dynamic_debug_setup(mod, info->debug, info->sites, info->num_debug);
 
/* Ftrace init must be called in the MODULE_STATE_UNFORMED state */
ftrace_module_init(mod);
diff --git a/lib/dynamic_debug.c b/lib/dynamic_debug.c
index c1a113460637..8ad9be28f38e 100644
--- a/lib/dynamic_debug.c
+++ b/lib/dynamic_debug.c
@@ -49,6 +49,7 @@ struct ddebug_table {
const char *mod_name;
unsigned int num_ddebugs;
struct _ddebug *ddebugs;
+   struct _ddebug_callsite *sites;
 };
 
 struct ddebug_query {
@@ -1014,8 +1015,8 @@ static const struct proc_ops proc_fops = {
  * Allocate a new ddebug_table for the given module
  * and add it to the global list.
  */
-int ddebug_add_module(struct _ddebug *tab, unsigned int n,
-const char *name)
+int ddebug_add_module(struct _ddebug *tab, struct _ddebug_callsite *sites,
+ unsigned int n, const char *name)
 {
struct ddebug_table *dt;
 
@@ -1033,6 +1034,7 @@ int ddebug_add_module(struct _ddebug *tab, unsigned int n,
dt->mod_name = name;
dt->num_ddebugs = n;
dt->ddebugs = tab;
+   dt->sites = sites;
 
mutex_lock(&ddebug_lock);
list_add(&dt->link, &ddebug_tables);
@@ -1182,7 +1184,9 @@ static int __init dynamic_debug_init(void)
 
if (strcmp(modname, iter->site->modname)) {
modct++;
-   ret = ddebug_add_module(iter_mod_start, n, modname);
+
+   ret = ddebug_add_module(iter_mod_start, site_mod_start,
+   n, modname);

[RFC PATCH v2 08/19] dyndbg: accept null site in ddebug_proc_show

2020-12-25 Thread Jim Cromie

Accept a ddebug record with a null site pointer, and write abbreviated
output for that record that doesn't include site info (but does
include line-number, since that can be used in >control queries).
Also add a 2nd header line with a template for the new output.

Signed-off-by: Jim Cromie 
---
 lib/dynamic_debug.c | 25 +
 1 file changed, 17 insertions(+), 8 deletions(-)

diff --git a/lib/dynamic_debug.c b/lib/dynamic_debug.c
index 190a796da03a..f6d8137e4a46 100644
--- a/lib/dynamic_debug.c
+++ b/lib/dynamic_debug.c
@@ -915,18 +915,27 @@ static int ddebug_proc_show(struct seq_file *m, void *p)
 
if (p == SEQ_START_TOKEN) {
seq_puts(m,
-"# filename:lineno [module]function flags format\n");
+"#: filename:lineno [module]function flags format\n"
+"#| [module]:lineno flags format\n"
+   );
return 0;
}
 
dc = dp->site;
-
-   seq_printf(m, "%s:%u [%s]%s =%s \"",
-  trim_prefix(dc->filename), dp->lineno,
-  iter->table->mod_name, dc->function,
-  ddebug_describe_flags(dp->flags, &flags));
-   seq_escape(m, dp->format, "\t\r\n\"");
-   seq_puts(m, "\"\n");
+   if (dc) {
+   seq_printf(m, "%s:%u [%s]%s =%s \"",
+  trim_prefix(dc->filename), dp->lineno,
+  iter->table->mod_name, dc->function,
+  ddebug_describe_flags(dp->flags, &flags));
+   seq_escape(m, dp->format, "\t\r\n\"");
+   seq_puts(m, "\"\n");
+   } else {
+   seq_printf(m, "[%s]:%u =%s \"",
+  iter->table->mod_name, dp->lineno,
+  ddebug_describe_flags(dp->flags, &flags));
+   seq_escape(m, dp->format, "\t\r\n\"");
+   seq_puts(m, "\"\n");
+   }
 
return 0;
 }
-- 
2.29.2

[RFC PATCH v2 04/19] dyndbg: accept null site in ddebug_match_site

2020-12-25 Thread Jim Cromie

make ddebug_match_site() tolerate null site.

1- move format and line-number check code to the top of the function,
   since they don't use/check site info.

2- test site pointer:
   If its null, we return early, skipping 3:
  If the query tests against missing site info, fail the match.
  otherwize site matches.

3- rest of function (checking site vs query) is unchanged.

ddebug_match_site is agnostic re' module, because it's tested already
by the caller, where it is known from debug_tables.  If a query
constrains module, forex: "module drm*", non-matching modules are
skipped entirely in caller, so we can ignore it here.

Signed-off-by: Jim Cromie 
---
 lib/dynamic_debug.c | 41 +
 1 file changed, 25 insertions(+), 16 deletions(-)

diff --git a/lib/dynamic_debug.c b/lib/dynamic_debug.c
index d9a0527ec842..bb9279c8cbfd 100644
--- a/lib/dynamic_debug.c
+++ b/lib/dynamic_debug.c
@@ -142,21 +142,7 @@ static void vpr_info_dq(const struct ddebug_query *query, 
const char *msg)
 static int ddebug_match_site(const struct ddebug_query *query,
 const struct _ddebug *dp)
 {
-   struct _ddebug_callsite *dc = dp->site;
-
-   /* match against the source filename */
-   if (query->filename &&
-   !match_wildcard(query->filename, dc->filename) &&
-   !match_wildcard(query->filename,
-   kbasename(dc->filename)) &&
-   !match_wildcard(query->filename,
-   trim_prefix(dc->filename)))
-   return false;
-
-   /* match against the function */
-   if (query->function &&
-   !match_wildcard(query->function, dc->function))
-   return false;
+   struct _ddebug_callsite *dc;
 
/* match against the format */
if (query->format) {
@@ -178,6 +164,29 @@ static int ddebug_match_site(const struct ddebug_query 
*query,
dp->lineno > query->last_lineno)
return false;
 
+   dc = dp->site;
+   if (!dc) {
+   /* site info has been dropped, so query cannot test these 
fields */
+   if (query->filename || query->function)
+   return false;
+   else
+   return true;
+   }
+
+   /* match against the source filename */
+   if (query->filename &&
+   !match_wildcard(query->filename, dc->filename) &&
+   !match_wildcard(query->filename,
+   kbasename(dc->filename)) &&
+   !match_wildcard(query->filename,
+   trim_prefix(dc->filename)))
+   return false;
+
+   /* match against the function */
+   if (query->function &&
+   !match_wildcard(query->function, dc->function))
+   return false;
+
return true;
 }
 
@@ -207,7 +216,7 @@ static int ddebug_change(const struct ddebug_query *query,
 
for (i = 0; i < dt->num_ddebugs; i++) {
struct _ddebug *dp = &dt->ddebugs[i];
-   struct _ddebug_callsite *dc = dp->site;
+   struct _ddebug_callsite *dc;
 
if (!ddebug_match_site(query, dp))
continue;
-- 
2.29.2

[RFC PATCH v2 06/19] dyndbg: accept null site in ddebug_change

2020-12-25 Thread Jim Cromie

fix a debug-print that includes site info, by adding an alternate
debug message that does not.

no functional changes.

Signed-off-by: Jim Cromie 
---
 lib/dynamic_debug.c | 15 +++
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/lib/dynamic_debug.c b/lib/dynamic_debug.c
index 050c65142d9b..5422cef58130 100644
--- a/lib/dynamic_debug.c
+++ b/lib/dynamic_debug.c
@@ -232,10 +232,17 @@ static int ddebug_change(const struct ddebug_query *query,
static_branch_enable(&dp->key.dd_key_true);
 #endif
dp->flags = newflags;
-   v2pr_info("changed %s:%d [%s]%s =%s\n",
-trim_prefix(dc->filename), dp->lineno,
-dt->mod_name, dc->function,
-ddebug_describe_flags(dp->flags, &fbuf));
+
+   if (dc)
+   v2pr_info("changed %s:%d [%s]%s =%s\n",
+ trim_prefix(dc->filename), dp->lineno,
+ dt->mod_name, dc->function,
+ ddebug_describe_flags(dp->flags, 
&fbuf));
+   else
+   v2pr_info("changed %s:%d =%s \"%s\"\n",
+ dt->mod_name, dp->lineno,
+ ddebug_describe_flags(dp->flags, 
&fbuf),
+ dp->format);
}
}
mutex_unlock(&ddebug_lock);
-- 
2.29.2

[RFC PATCH v2 05/19] dyndbg: hoist ->site out of ddebug_match_site

2020-12-25 Thread Jim Cromie

A coming change adds _get/_put abstraction on the site pointer, to
allow managing site info more flexibly.  The get/put pattern is best
done at a single lexical scope, where its more obviously correct, so
hoist the ->site ref out of ddebug_match_site, and pass it in instead.

Signed-off-by: Jim Cromie 
---
 lib/dynamic_debug.c | 10 --
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/lib/dynamic_debug.c b/lib/dynamic_debug.c
index bb9279c8cbfd..050c65142d9b 100644
--- a/lib/dynamic_debug.c
+++ b/lib/dynamic_debug.c
@@ -140,10 +140,9 @@ static void vpr_info_dq(const struct ddebug_query *query, 
const char *msg)
 }
 
 static int ddebug_match_site(const struct ddebug_query *query,
-const struct _ddebug *dp)
+const struct _ddebug *dp,
+const struct _ddebug_callsite *dc)
 {
-   struct _ddebug_callsite *dc;
-
/* match against the format */
if (query->format) {
if (*query->format == '^') {
@@ -164,7 +163,6 @@ static int ddebug_match_site(const struct ddebug_query 
*query,
dp->lineno > query->last_lineno)
return false;
 
-   dc = dp->site;
if (!dc) {
/* site info has been dropped, so query cannot test these 
fields */
if (query->filename || query->function)
@@ -216,9 +214,9 @@ static int ddebug_change(const struct ddebug_query *query,
 
for (i = 0; i < dt->num_ddebugs; i++) {
struct _ddebug *dp = &dt->ddebugs[i];
-   struct _ddebug_callsite *dc;
+   struct _ddebug_callsite *dc = dp->site;
 
-   if (!ddebug_match_site(query, dp))
+   if (!ddebug_match_site(query, dp, dc))
continue;
 
nfound++;
-- 
2.29.2

[RFC PATCH v2 02/19] dyndbg: split struct _ddebug, move display fields to new _ddebug_callsite

2020-12-25 Thread Jim Cromie

struct _ddebug has 5 fields used to select/display pr_debug callsites;
move 3 of them: module, function, file to new struct _ddebug_callsite,
and add pointer from 1st to 2nd (head to body).

The format field is excluded from the move because it is always needed
for an enabled site, the others are just optional decorations, at
least from the logging perspective.  While lineno is also optional, it
can share space with flags, so it stays for density reasons.

While this increases memory footprint by 1 ptr per pr_debug, the
indirection gives several advantages:

- we can drop callsites and their storage opportunistically.

  Subsystems may not want "module:func:line:" in their logs.
  If they use format-prefixes, they can select on them,
  and don't need the site info.
  forex:
  #> echo module drm format "^drm:kms: " +p >control
  ie: dynamic_debug_exec_queries("format '^drm:kms: '", "drm");

- the moved display fields are inherently hierarchical, and the linker
  section is ordered; so (module, file, function) have repeating
  values (90%, 85%, 45%).  This is readily compressible, even with a
  simple field-wise run length encoding.  Since I'm touching this so
  deeply, I reordered the fields to match the hierarchy.

- the separate linker section sets up nice for block compression.
  we could even provide as a kernel associated 'blob', like initrd, DT

- we can allocate uncompressed storage only for enabled callsites.
  could deallocate sites on memory pressure.

- if we can rely on the linker to fill the 2 __dyndbg* sections in the
  same order, we could treat them as parallel vectors, drop the
  pointer, and store each element's index into _ddebug's padding to
  get the _callsite[N].  Do same for flags.

Whats actually done here is rather mechanical.

dynamic_debug.h:

I cut struct _ddebug in half, renamed top-half (body), dropped
__align(8) on the body (its a no-op with 8 byte pointers), and kept
the __align(8) on the head (I suspect its there for the static_key
member).  I added a forward decl for a unified comment for both head &
body, and added head.site to point at body.

DECLARE_DYNAMIC_DEBUG_METADATA does the core of the work; it declares
and initializes both static struct vars together, and refs one to the
other.

dynamic_debug.c:

dynamic_debug_init() mem-usage now also counts callsites.

dynamic_emit_prefix() & ddebug_change() use those moved fields; they
get a new initialized auto-var, and the field refs get adjusted as
needed to follow the move from one struct to the other.

   struct _ddebug_callsite *dc = dp->site;

ddebug_proc_show() differs slightly; it assigns to (not initializes)
the autovar, to avoid a panic when p == SEQ_START_TOKEN.

vmlinux.lds.h:

add __ddebug_callsites section, with the same align(8) and KEEP as
used in the __ddebug section.  RFC this is slightly out of sync with
METADATA code, and dropping align(8) on the struct itself.

Signed-off-by: Jim Cromie 
---
 include/asm-generic/vmlinux.lds.h |  4 +++
 include/linux/dynamic_debug.h | 37 +-
 lib/dynamic_debug.c   | 44 ++-
 3 files changed, 53 insertions(+), 32 deletions(-)

diff --git a/include/asm-generic/vmlinux.lds.h 
b/include/asm-generic/vmlinux.lds.h
index b2b3d81b1535..1ef1efc73d20 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -340,6 +340,10 @@
*(__tracepoints)\
/* implement dynamic printk debug */\
. = ALIGN(8);   \
+   __start___dyndbg_callsites = .; \
+   KEEP(*(__dyndbg_callsites)) \
+   __stop___dyndbg_callsites = .;  \
+   . = ALIGN(8);   \
__start___dyndbg = .;   \
KEEP(*(__dyndbg))   \
__stop___dyndbg = .;\
diff --git a/include/linux/dynamic_debug.h b/include/linux/dynamic_debug.h
index a57ee75342cf..e155dfafc4d9 100644
--- a/include/linux/dynamic_debug.h
+++ b/include/linux/dynamic_debug.h
@@ -7,20 +7,28 @@
 #endif
 
 /*
- * An instance of this structure is created in a special
- * ELF section at every dynamic debug callsite.  At runtime,
- * the special section is treated as an array of these.
+ * A pair of these structs are created in 2 special ELF sections
+ * (__dyndbg, __dyndbg_callsites) for every dynamic debug callsite.
+ * At runtime, the sections are treated as arrays.
  */
-struct _ddebug {
+struct _ddebug;
+struct _ddebug_callsite {
/*
-* These fields are used to drive the user interface
-* for selecting and displaying debug callsites.
+* These fields (and lineno) are used to:
+* - decorate l

[RFC PATCH v2 00/19] dynamic debug diet plan

2020-12-25 Thread Jim Cromie

Well, we're mostly overeating, but we can all look forward to a diet
in January.  And more exersize.

dyndbg's compiled-in data-table currently uses 56 bytes per prdebug;
this includes 3 pointers to hierarchical "decorator" data, which is
primarily for adding "module:function:line:" prefixes to prdebug
messages, and for enabling and modifying those prdebugs selectively.

This patchset decouples "decorator" data, and makes it optional, and
disposable.  By separating that data, it opens up possiblities to
compress it, swap it out, map it selectively, etc.


In more detail, patchset does:

1- split struct _ddebug in 2, move "decorator" fields to _ddebug_callsites.

while this adds a pointer per site to memory footprint, it allows:

a- dropping decorations and storage, for some use cases.
   this could include DRM:
   - want to save calls to drm_debug_enabled()
   - use distinct categories, can map to format-prefixes, ex: "drm:kms:"
   - don't need "module:function:line" dynamic prefixes.
   - dont mind loss of info/context in /proc/dynamic_debug/control

b- ddebug_callsites[] contents are hierarchical, compressible.
c- ddebug_callsites[] in separate section is compressible as a block.
d- for just enabled prdebugs, could allocate callsites and fill from zblock.

2- make ddebug_callsites optional internally.
   This lets us drop them outright, for any reason, perhaps memory pressure.

3- allow dropping callsites by those users.
   echo module drm +D > /proc/dynamic_debug/control
   this doesnt currently recover __dyndbg_callsites storage

4- drop _ddebug.site, convert to _ddebug[N].property lookup.
   RFC is mostly here.

rev1: 
https://lore.kernel.org/lkml/20201125194855.2267337-1-jim.cro...@gmail.com/

rev2 differs by dropping zram attempt, making callsite data optional, etc.


Jim Cromie (19): against v5.10

  dyndbg: fix use before null check
1 dyndbg: split struct _ddebug, move display fields to new
_ddebug_callsite

2 dyndbg: refactor part of ddebug_change to ddebug_match_site
  dyndbg: accept null site in ddebug_match_site
  dyndbg: hoist ->site out of ddebug_match_site
  dyndbg: accept null site in ddebug_change
  dyndbg: accept null site in dynamic_emit_prefix
  dyndbg: accept null site in ddebug_proc_show
  
  dyndbg: optimize ddebug_emit_prefix
  dyndbg: avoid calling dyndbg_emit_prefix when it has no work
  
3 dyndbg: refactor ddebug_alter_site out of ddebug_change
  dyndbg: allow deleting site info via control interface
  
4 dyndbg: verify __dyndbg & __dyndbg_callsite invariant
  dyndbg+module: expose dyndbg_callsites to modules
  dyndbg: add ddebug_site_get/put api with pass-thru impl
  dyndbg: ddebug_site_get/put api commentary
  dyndbg: rearrange struct ddebug_callsites
  dyndbg: add module_index to struct _ddebug
  dyndbg: try DEFINE_DYNAMIC_DEBUG_TABLE

 drivers/gpu/drm/i915/i915_drv.c   |   3 +
 include/asm-generic/vmlinux.lds.h |   4 +
 include/linux/dynamic_debug.h |  97 ---
 kernel/module-internal.h  |   1 +
 kernel/module.c   |   9 +-
 lib/dynamic_debug.c   | 271 +-
 6 files changed, 283 insertions(+), 102 deletions(-)

-- 
2.29.2

[RFC PATCH v2 03/19] dyndbg: refactor part of ddebug_change to ddebug_match_site

2020-12-25 Thread Jim Cromie

Move all the site-match logic into a separate function, reindent the
code, and replace the continues with return falses.

No functional changes.

Signed-off-by: Jim Cromie 
---
 lib/dynamic_debug.c | 75 ++---
 1 file changed, 43 insertions(+), 32 deletions(-)

diff --git a/lib/dynamic_debug.c b/lib/dynamic_debug.c
index fb8e0b288f89..d9a0527ec842 100644
--- a/lib/dynamic_debug.c
+++ b/lib/dynamic_debug.c
@@ -139,6 +139,48 @@ static void vpr_info_dq(const struct ddebug_query *query, 
const char *msg)
 query->first_lineno, query->last_lineno);
 }
 
+static int ddebug_match_site(const struct ddebug_query *query,
+const struct _ddebug *dp)
+{
+   struct _ddebug_callsite *dc = dp->site;
+
+   /* match against the source filename */
+   if (query->filename &&
+   !match_wildcard(query->filename, dc->filename) &&
+   !match_wildcard(query->filename,
+   kbasename(dc->filename)) &&
+   !match_wildcard(query->filename,
+   trim_prefix(dc->filename)))
+   return false;
+
+   /* match against the function */
+   if (query->function &&
+   !match_wildcard(query->function, dc->function))
+   return false;
+
+   /* match against the format */
+   if (query->format) {
+   if (*query->format == '^') {
+   char *p;
+   /* anchored search. match must be at beginning */
+   p = strstr(dp->format, query->format+1);
+   if (p != dp->format)
+   return false;
+   } else if (!strstr(dp->format, query->format))
+   return false;
+   }
+
+   /* match against the line number range */
+   if (query->first_lineno &&
+   dp->lineno < query->first_lineno)
+   return false;
+   if (query->last_lineno &&
+   dp->lineno > query->last_lineno)
+   return false;
+
+   return true;
+}
+
 /*
  * Search the tables for _ddebug's which match the given `query' and
  * apply the `flags' and `mask' to them.  Returns number of matching
@@ -167,38 +209,7 @@ static int ddebug_change(const struct ddebug_query *query,
struct _ddebug *dp = &dt->ddebugs[i];
struct _ddebug_callsite *dc = dp->site;
 
-   /* match against the source filename */
-   if (query->filename &&
-   !match_wildcard(query->filename, dc->filename) &&
-   !match_wildcard(query->filename,
-  kbasename(dc->filename)) &&
-   !match_wildcard(query->filename,
-  trim_prefix(dc->filename)))
-   continue;
-
-   /* match against the function */
-   if (query->function &&
-   !match_wildcard(query->function, dc->function))
-   continue;
-
-   /* match against the format */
-   if (query->format) {
-   if (*query->format == '^') {
-   char *p;
-   /* anchored search. match must be at 
beginning */
-   p = strstr(dp->format, query->format+1);
-   if (p != dp->format)
-   continue;
-   } else if (!strstr(dp->format, query->format))
-   continue;
-   }
-
-   /* match against the line number range */
-   if (query->first_lineno &&
-   dp->lineno < query->first_lineno)
-   continue;
-   if (query->last_lineno &&
-   dp->lineno > query->last_lineno)
+   if (!ddebug_match_site(query, dp))
continue;
 
nfound++;
-- 
2.29.2

UBSAN: shift-out-of-bounds in dummy_hub_control

2020-12-25 Thread syzbot

Hello,

syzbot found the following issue on:

HEAD commit:e37b12e4 Merge tag 'for-linus-5.11-ofs1' of git://git.kern..
git tree:   upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=1742993750
kernel config:  https://syzkaller.appspot.com/x/.config?x=98408202fed1f636
dashboard link: https://syzkaller.appspot.com/bug?extid=5925509f78293baa7331
compiler:   gcc (GCC) 10.1.0-syz 20200507
syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=1781fc5b50
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=157cd12350

The issue was bisected to:

commit 8442b02bf3c6770e0d7e7ea17be36c30e95987b6
Author: Andrey Konovalov 
Date:   Mon Oct 21 14:20:58 2019 +

USB: dummy-hcd: increase max number of devices to 32

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=1631d0db50
final oops: https://syzkaller.appspot.com/x/report.txt?x=1531d0db50
console output: https://syzkaller.appspot.com/x/log.txt?x=1131d0db50

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+5925509f78293baa7...@syzkaller.appspotmail.com
Fixes: 8442b02bf3c6 ("USB: dummy-hcd: increase max number of devices to 32")


UBSAN: shift-out-of-bounds in drivers/usb/gadget/udc/dummy_hcd.c:2293:33
shift exponent 257 is too large for 32-bit type 'int'
CPU: 0 PID: 8526 Comm: syz-executor949 Not tainted 5.10.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 
01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:79 [inline]
 dump_stack+0x107/0x163 lib/dump_stack.c:120
 ubsan_epilogue+0xb/0x5a lib/ubsan.c:148
 __ubsan_handle_shift_out_of_bounds.cold+0xb1/0x181 lib/ubsan.c:395
 dummy_hub_control.cold+0x1a/0xbc drivers/usb/gadget/udc/dummy_hcd.c:2293
 rh_call_control drivers/usb/core/hcd.c:683 [inline]
 rh_urb_enqueue drivers/usb/core/hcd.c:841 [inline]
 usb_hcd_submit_urb+0xcaa/0x22d0 drivers/usb/core/hcd.c:1544
 usb_submit_urb+0x6e4/0x1560 drivers/usb/core/urb.c:585
 usb_start_wait_urb+0x101/0x4c0 drivers/usb/core/message.c:58
 usb_internal_control_msg drivers/usb/core/message.c:102 [inline]
 usb_control_msg+0x31c/0x4a0 drivers/usb/core/message.c:153
 do_proc_control+0x4cb/0x9c0 drivers/usb/core/devio.c:1165
 proc_control drivers/usb/core/devio.c:1191 [inline]
 usbdev_do_ioctl drivers/usb/core/devio.c:2535 [inline]
 usbdev_ioctl+0x12c1/0x3b20 drivers/usb/core/devio.c:2708
 vfs_ioctl fs/ioctl.c:48 [inline]
 __do_sys_ioctl fs/ioctl.c:753 [inline]
 __se_sys_ioctl fs/ioctl.c:739 [inline]
 __x64_sys_ioctl+0x193/0x200 fs/ioctl.c:739
 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x443f29
Code: 18 89 d0 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 89 f8 48 89 f7 48 
89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 
fb d7 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:7ffc10df4328 EFLAGS: 0246 ORIG_RAX: 0010
RAX: ffda RBX: 004002e0 RCX: 00443f29
RDX: 2000 RSI: c0185500 RDI: 0003
RBP: 006ce018 R08:  R09: 004002e0
R10: 000f R11: 0246 R12: 00401bb0
R13: 00401c40 R14:  R15: 



---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkal...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
For information about bisection process see: https://goo.gl/tpsmEJ#bisection
syzbot can test patches for this issue, for details see:
https://goo.gl/tpsmEJ#testing-patches

Re: [PATCH v3 3/3] drivers/soc/litex: support 32-bit subregisters, 64-bit CPUs

2020-12-25 Thread Gabriel L. Somlo

On Fri, Dec 25, 2020 at 09:21:20AM -0500, Gabriel Somlo wrote:
> Upstream LiteX now defaults to using 32-bit CSR subregisters
> (see https://github.com/enjoy-digital/litex/commit/a2b71fde).
> 
> This patch expands on commit 22447a99c97e ("drivers/soc/litex: add
> LiteX SoC Controller driver"), adding support for handling both 8-
> and 32-bit LiteX CSR (MMIO) subregisters, as determined by the
> LITEX_SUBREG_SIZE Kconfig option.
> 
> NOTE that while LITEX_SUBREG_SIZE could theoretically be a device
> tree property, defining it as a compile-time constant allows for
> much better optimization of the resulting code. This is further
> supported by the low expected usefulness of deploying the same
> kernel across LiteX SoCs built with different CSR-Bus data widths.
> 
> The constant LITEX_SUBREG_SIZE is renamed to the more descriptive
> LITEX_SUBREG_ALIGN (LiteX CSR subregisters are located at 32-bit
> aligned MMIO addresses).

In the next version (v4), the above will read:

The constant LITEX_REG_SIZE is renamed to the more descriptive
LITEX_SUBREG_ALIGN (LiteX CSR subregisters are located at 32-bit
aligned MMIO addresses).

(i.e., s/LITEX_SUBREG_SIZE/LITEX_REG_SIZE/, sorry for the typo).

> Finally, the litex_[read|write][8|16|32|64]() accessors are
> redefined in terms of litex_[get|set]_reg(), which, after compiler
> optimization, will result in code as efficient as hardcoded shifts,
> but with the added benefit of automatically matching the appropriate
> LITEX_SUBREG_SIZE.
> 
> NOTE that litex_[get|set]_reg() nominally operate 64-bit data, but
> that too will be optimized away by the compiler in situations where
> narrower data is used.

And, while we're at it, the above paragraph will also come off as
hopefully just a bit more articulate:

NOTE that litex_[get|set]_reg() nominally operate on 64-bit data,
but that will also be optimized by the compiler in situations where
narrower data is used from a call site.

Thanks again for the consideration,
--Gabriel

> 
> Signed-off-by: Gabriel Somlo 
> ---
>  drivers/soc/litex/Kconfig  |  12 +++
>  drivers/soc/litex/litex_soc_ctrl.c |   3 +-
>  include/linux/litex.h  | 139 -
>  3 files changed, 70 insertions(+), 84 deletions(-)
> 
> diff --git a/drivers/soc/litex/Kconfig b/drivers/soc/litex/Kconfig
> index 7c6b009b6f6c..973f8d2fe1a7 100644
> --- a/drivers/soc/litex/Kconfig
> +++ b/drivers/soc/litex/Kconfig
> @@ -16,4 +16,16 @@ config LITEX_SOC_CONTROLLER
> All drivers that use functions from litex.h must depend on
> LITEX.
>  
> +config LITEX_SUBREG_SIZE
> + int "Size of a LiteX CSR subregister, in bytes"
> + depends on LITEX
> + range 1 4
> + default 4
> + help
> + LiteX MMIO registers (referred to as Configuration and Status
> + registers, or CSRs) are spread across adjacent 8- or 32-bit
> + subregisters, located at 32-bit aligned MMIO addresses. Use
> + this to select the appropriate size (1 or 4 bytes) matching
> + your particular LiteX build.
> +
>  endmenu
> diff --git a/drivers/soc/litex/litex_soc_ctrl.c 
> b/drivers/soc/litex/litex_soc_ctrl.c
> index 65977526d68e..da17ba56b795 100644
> --- a/drivers/soc/litex/litex_soc_ctrl.c
> +++ b/drivers/soc/litex/litex_soc_ctrl.c
> @@ -58,7 +58,8 @@ static int litex_check_csr_access(void __iomem *reg_addr)
>   /* restore original value of the SCRATCH register */
>   litex_write32(reg_addr + SCRATCH_REG_OFF, SCRATCH_REG_VALUE);
>  
> - pr_info("LiteX SoC Controller driver initialized");
> + pr_info("LiteX SoC Controller driver initialized: subreg:%d, align:%d",
> + LITEX_SUBREG_SIZE, LITEX_SUBREG_ALIGN);
>  
>   return 0;
>  }
> diff --git a/include/linux/litex.h b/include/linux/litex.h
> index 918bab45243c..53fb03a2f257 100644
> --- a/include/linux/litex.h
> +++ b/include/linux/litex.h
> @@ -10,20 +10,19 @@
>  #define _LINUX_LITEX_H
>  
>  #include 
> -#include 
> -#include 
>  
> -/*
> - * The parameters below are true for LiteX SoCs configured for 8-bit CSR Bus,
> - * 32-bit aligned.
> - *
> - * Supporting other configurations will require extending the logic in this
> - * header and in the LiteX SoC controller driver.
> - */
> -#define LITEX_REG_SIZE 0x4
> -#define LITEX_SUBREG_SIZE0x1
> +/* LiteX SoCs support 8- or 32-bit CSR Bus data width (i.e., subreg. size) */
> +#if defined(CONFIG_LITEX_SUBREG_SIZE) && \
> + (CONFIG_LITEX_SUBREG_SIZE == 1 || CONFIG_LITEX_SUBREG_SIZE == 4)
> +#define LITEX_SUBREG_SIZE  CONFIG_LITEX_SUBREG_SIZE
> +#else
> +#error LiteX subregister size (LITEX_SUBREG_SIZE) must be 4 or 1!
> +#endif
>  #define LITEX_SUBREG_SIZE_BIT (LITEX_SUBREG_SIZE * 8)
>  
> +/* LiteX subregisters of any width are always aligned on a 4-byte boundary */
> +#define LITEX_SUBREG_ALIGN 0x4
> +
>  static inline void _write_litex_subregister(u32 val, void __iomem *addr)
>  {
>   writel((u32 __force)cpu_to_le32(val), addr);
>

Re: [v6,3/4] PCI: mediatek-gen3: Add MediaTek Gen3 driver for MT8192

2020-12-25 Thread Marc Zyngier

Hi Jianjun,

On Fri, 25 Dec 2020 10:03:07 +,
Jianjun Wang  wrote:
> 
> MediaTek's PCIe host controller has three generation HWs, the new
> generation HW is an individual bridge, it supports Gen3 speed and
> up to 256 MSI interrupt numbers for multi-function devices.
> 
> Add support for new Gen3 controller which can be found on MT8192.
> 
> Signed-off-by: Jianjun Wang 
> Acked-by: Ryder Lee 
> ---
>  drivers/pci/controller/Kconfig  |   13 +
>  drivers/pci/controller/Makefile |1 +
>  drivers/pci/controller/pcie-mediatek-gen3.c | 1084 +++
>  3 files changed, 1098 insertions(+)
>  create mode 100644 drivers/pci/controller/pcie-mediatek-gen3.c

This is a pretty large patch, and it'd be great if you would split it
into at least 4 parts (core PCIe, PM, MSI, INTx).

> 
> diff --git a/drivers/pci/controller/Kconfig b/drivers/pci/controller/Kconfig
> index 64e2f5e379aa..b242b17025b3 100644
> --- a/drivers/pci/controller/Kconfig
> +++ b/drivers/pci/controller/Kconfig
> @@ -242,6 +242,19 @@ config PCIE_MEDIATEK
> Say Y here if you want to enable PCIe controller support on
> MediaTek SoCs.
>  
> +config PCIE_MEDIATEK_GEN3
> + tristate "MediaTek Gen3 PCIe controller"
> + depends on ARCH_MEDIATEK || COMPILE_TEST
> + depends on PCI_MSI_IRQ_DOMAIN
> + help
> +   Adds support for PCIe Gen3 MAC controller for MediaTek SoCs.
> +   This PCIe controller is compatible with Gen3, Gen2 and Gen1 speed,
> +   and support up to 256 MSI interrupt numbers for
> +   multi-function devices.
> +
> +   Say Y here if you want to enable Gen3 PCIe controller support on
> +   MediaTek SoCs.
> +
>  config PCIE_TANGO_SMP8759
>   bool "Tango SMP8759 PCIe controller (DANGEROUS)"
>   depends on ARCH_TANGO && PCI_MSI && OF
> diff --git a/drivers/pci/controller/Makefile b/drivers/pci/controller/Makefile
> index 04c6edc285c5..df5d77d72a9d 100644
> --- a/drivers/pci/controller/Makefile
> +++ b/drivers/pci/controller/Makefile
> @@ -27,6 +27,7 @@ obj-$(CONFIG_PCIE_ROCKCHIP) += pcie-rockchip.o
>  obj-$(CONFIG_PCIE_ROCKCHIP_EP) += pcie-rockchip-ep.o
>  obj-$(CONFIG_PCIE_ROCKCHIP_HOST) += pcie-rockchip-host.o
>  obj-$(CONFIG_PCIE_MEDIATEK) += pcie-mediatek.o
> +obj-$(CONFIG_PCIE_MEDIATEK_GEN3) += pcie-mediatek-gen3.o
>  obj-$(CONFIG_PCIE_TANGO_SMP8759) += pcie-tango.o
>  obj-$(CONFIG_VMD) += vmd.o
>  obj-$(CONFIG_PCIE_BRCMSTB) += pcie-brcmstb.o
> diff --git a/drivers/pci/controller/pcie-mediatek-gen3.c 
> b/drivers/pci/controller/pcie-mediatek-gen3.c
> new file mode 100644
> index ..00cdb598a9f5
> --- /dev/null
> +++ b/drivers/pci/controller/pcie-mediatek-gen3.c
> @@ -0,0 +1,1084 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * MediaTek PCIe host controller driver.
> + *
> + * Copyright (c) 2020 MediaTek Inc.
> + * Author: Jianjun Wang 
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#include "../pci.h"
> +
> +#define PCIE_SETTING_REG 0x80
> +#define PCIE_PCI_IDS_1   0x9c
> +#define PCI_CLASS(class) (class << 8)
> +#define PCIE_RC_MODE BIT(0)
> +
> +#define PCIE_CFGNUM_REG  0x140
> +#define PCIE_CFG_DEVFN(devfn)((devfn) & GENMASK(7, 0))
> +#define PCIE_CFG_BUS(bus)(((bus) << 8) & GENMASK(15, 8))
> +#define PCIE_CFG_BYTE_EN(bytes)  (((bytes) << 16) & GENMASK(19, 
> 16))
> +#define PCIE_CFG_FORCE_BYTE_EN   BIT(20)
> +#define PCIE_CFG_OFFSET_ADDR 0x1000
> +#define PCIE_CFG_HEADER(bus, devfn) \
> + (PCIE_CFG_BUS(bus) | PCIE_CFG_DEVFN(devfn))
> +
> +#define PCIE_RST_CTRL_REG0x148
> +#define PCIE_MAC_RSTBBIT(0)
> +#define PCIE_PHY_RSTBBIT(1)
> +#define PCIE_BRG_RSTBBIT(2)
> +#define PCIE_PE_RSTB BIT(3)
> +
> +#define PCIE_LTSSM_STATUS_REG0x150
> +#define PCIE_LTSSM_STATE_MASKGENMASK(28, 24)
> +#define PCIE_LTSSM_STATE(val)((val & PCIE_LTSSM_STATE_MASK) 
> >> 24)
> +#define PCIE_LTSSM_STATE_L2_IDLE 0x14
> +
> +#define PCIE_LINK_STATUS_REG 0x154
> +#define PCIE_PORT_LINKUP BIT(8)
> +
> +#define PCIE_MSI_SET_NUM 8
> +#define PCIE_MSI_IRQS_PER_SET32
> +#define PCIE_MSI_IRQS_NUM \
> + (PCIE_MSI_IRQS_PER_SET * (PCIE_MSI_SET_NUM))
> +
> +#define PCIE_INT_ENABLE_REG  0x180
> +#define PCIE_MSI_MASKGENMASK(PCIE_MSI_SET_NUM + 8 - 
> 1, 8)
> +#define PCIE_MSI_SHIFT   8
> +#define PCIE_INTX_SHIFT  24
> +#define PCIE_INTX_MASK   GENMASK(27, 24)
> +
> +#define PCIE_INT_STATUS_REG  0x184
> +#define

Re: [PATCH 0/2] crypto: x86/aes-ni-xts - recover and improve performance

2020-12-25 Thread Eric Biggers

On Tue, Dec 22, 2020 at 05:06:27PM +0100, Ard Biesheuvel wrote:
> The AES-NI implementation of XTS was impacted significantly by the retpoline
> changes, which is due to the fact that both its asm helper and the chaining
> mode glue library use indirect calls for processing small quantitities of
> data
> 
> So let's fix this, by:
> - creating a minimal, backportable fix that recovers most of the performance,
>   by reducing the number of indirect calls substantially;
> - for future releases, rewrite the XTS implementation completely, and replace
>   the glue helper with a core asm routine that is more flexible, making the C
>   code wrapper much more straight-forward.
> 
> This results in a substantial performance improvement: around ~2x for 1k and
> 4k blocks, and more than 3x for ~1k blocks that require ciphertext stealing
> (benchmarked using tcrypt using 1420 byte blocks - full results below)
> 
> It also allows us to enable the same driver for i386.
> 
> Cc: Megha Dey 
> Cc: Eric Biggers 
> Cc: Herbert Xu 
> 
> Ard Biesheuvel (2):
>   crypto: x86/aes-ni-xts - use direct calls to and 4-way stride
>   crypto: x86/aes-ni-xts - rewrite and drop indirections via glue helper
> 
>  arch/x86/crypto/aesni-intel_asm.S  | 353 
>  arch/x86/crypto/aesni-intel_glue.c | 230 +++--
>  2 files changed, 412 insertions(+), 171 deletions(-)
> 
> -- 
> 2.17.1
> 
> Benchmarked using tcrypt on a Intel(R) Core(TM) i7-8650U CPU @ 1.90GHz.

Thanks for doing this!  I didn't realize that there was such a big performance
regression here.  Getting rid of these indirect calls looks like the right
approach; this all seems to have been written for a world where indirect calls
are much faster...

I did some quick benchmarks on Zen ("AMD Ryzen Threadripper 1950X 16-Core
Processor") with CONFIG_RETPOLINE=y and confirmed the speedup on 4096-byte
blocks is around 2x there too.  (It's over 2x for AES-128-XTS and AES-192-XTS,
and a bit under 2x for AES-256-XTS.  And most of the speedup comes from the
first patch.)  Also, the extra self-tests are passing.

So feel free to add:

Tested-by: Eric Biggers  # x86_64

Note that this patch series didn't apply cleanly, as it seems to depend on some
other patches you've sent out recently.  So I actually tested your
"for-kernelci" branch instead of applying these directly.

- Eric

Re: [GIT PULL] perf tools changes for v5.11, 2nd batch

2020-12-25 Thread pr-tracker-bot

The pull request you sent on Thu, 24 Dec 2020 15:56:33 -0300:

> git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git 
> tags/perf-tools-2020-12-24

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/5814bc2d4cc241c1a603fac2b5bf1bd4daa108fc

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html

Re: [git pull] vfs.git misc stuff

2020-12-25 Thread pr-tracker-bot

The pull request you sent on Thu, 24 Dec 2020 23:35:07 +:

> git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git work.misc

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/7bb5226c8a4bbf26a9ededc90532b0ad539d2017

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html

[PATCH -next] regulator: qcom-rpmh: fix build after QCOM_COMMAND_DB is tristate

2020-12-25 Thread Randy Dunlap

Restrict REGULATOR_QCOM_RPMH to QCOM_COMMAND_DB it the latter is enabled.

Fixes this build error:
  microblaze-linux-ld: drivers/regulator/qcom-rpmh-regulator.o: in function 
`rpmh_regulator_probe':
  (.text+0x354): undefined reference to `cmd_db_read_addr'

Fixes: 778279f4f5e4 ("soc: qcom: cmd-db: allow loading as a module")
Reported-by: kernel test robot 
Signed-off-by: Randy Dunlap 
Cc: Lina Iyer 
Cc: Liam Girdwood 
Cc: Mark Brown 
---
 drivers/regulator/Kconfig |1 +
 1 file changed, 1 insertion(+)

--- linux-next-20201223.orig/drivers/regulator/Kconfig
+++ linux-next-20201223/drivers/regulator/Kconfig
@@ -881,6 +881,7 @@ config REGULATOR_QCOM_RPM
 config REGULATOR_QCOM_RPMH
tristate "Qualcomm Technologies, Inc. RPMh regulator driver"
depends on QCOM_RPMH || (QCOM_RPMH=n && COMPILE_TEST)
+   depends on QCOM_COMMAND_DB || !QCOM_COMMAND_DB
help
  This driver supports control of PMIC regulators via the RPMh hardware
  block found on Qualcomm Technologies Inc. SoCs.  RPMh regulator

[ANNOUNCE] 4.14.212-rt102

2020-12-25 Thread Clark Williams

Hello RT-list!

I'm pleased to announce the 4.14.212-rt102 stable release.

You can get this release via the git tree at:

  git://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-stable-rt.git

  branch: v4.14-rt
  Head SHA1: cc763276d939207b3424090ab618c3e52f7d49de

Or to build 4.14.212-rt102 directly, the following patches should be applied:

  https://www.kernel.org/pub/linux/kernel/v4.x/linux-4.14.tar.xz

  https://www.kernel.org/pub/linux/kernel/v4.x/patch-4.14.212.xz

  
https://www.kernel.org/pub/linux/kernel/projects/rt/4.14/patch-4.14.212-rt102.patch.xz


You can also build from 4.14.209-rt101 by applying the incremental patch:

  
https://www.kernel.org/pub/linux/kernel/projects/rt/4.14/incr/patch-4.14.209-rt101-rt102.patch.xz

Enjoy!
Clark

arm-linux-gnueabi-ld: sil-sii8620.c:undefined reference to `extcon_register_notifier'

2020-12-25 Thread kernel test robot

Hi Masahiro,

FYI, the error/warning still remains.

tree:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 
master
head:   71c5f03154ac1cb27423b984743ccc2f5d11d14d
commit: def2fbffe62c00c330c7f41584a356001179c59c kconfig: allow symbols implied 
by y to become m
date:   10 months ago
config: arm-randconfig-r033-20201224 (attached as .config)
compiler: arm-linux-gnueabi-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=def2fbffe62c00c330c7f41584a356001179c59c
git remote add linus 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
git fetch --no-tags linus master
git checkout def2fbffe62c00c330c7f41584a356001179c59c
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=arm 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All errors (new ones prefixed by >>):

   arm-linux-gnueabi-ld: drivers/gpu/drm/bridge/sil-sii8620.o: in function 
`sii8620_remove':
   sil-sii8620.c:(.text+0x186): undefined reference to 
`extcon_unregister_notifier'
   arm-linux-gnueabi-ld: drivers/gpu/drm/bridge/sil-sii8620.o: in function 
`sii8620_probe':
   sil-sii8620.c:(.text+0x24fa): undefined reference to 
`extcon_find_edev_by_node'
>> arm-linux-gnueabi-ld: sil-sii8620.c:(.text+0x254e): undefined reference to 
>> `extcon_register_notifier'
   arm-linux-gnueabi-ld: drivers/gpu/drm/bridge/sil-sii8620.o: in function 
`sii8620_extcon_work':
   sil-sii8620.c:(.text+0x25da): undefined reference to `extcon_get_state'

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org


.config.gz
Description: application/gzip

undefined reference to `__ubsan_handle_alignment_assumption'

2020-12-25 Thread kernel test robot

Hi 周琰杰,

FYI, the error/warning still remains.

tree:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 
master
head:   71c5f03154ac1cb27423b984743ccc2f5d11d14d
commit: 31de313dfdcf6971b0a1c30f86eabaa1eede74b3 PHY: Ingenic: Add USB PHY 
driver using generic PHY framework.
date:   3 weeks ago
config: s390-randconfig-r031-20201224 (attached as .config)
compiler: clang version 12.0.0 (https://github.com/llvm/llvm-project 
cee1e7d14f4628d6174b33640d502bff3b54ae45)
reproduce (this is a W=1 build):
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# install s390 cross compiling tool for clang build
# apt-get install binutils-s390x-linux-gnu
# 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=31de313dfdcf6971b0a1c30f86eabaa1eede74b3
git remote add linus 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
git fetch --no-tags linus master
git checkout 31de313dfdcf6971b0a1c30f86eabaa1eede74b3
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=s390 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All errors (new ones prefixed by >>):

   s390x-linux-gnu-ld: init/main.o: in function `do_initcalls':
   main.c:(.init.text+0x1a34): undefined reference to 
`__ubsan_handle_alignment_assumption'
   s390x-linux-gnu-ld: init/initramfs.o: in function `unpack_to_rootfs':
   initramfs.c:(.init.text+0x3c4): undefined reference to 
`__ubsan_handle_alignment_assumption'
   s390x-linux-gnu-ld: initramfs.c:(.init.text+0x3e0): undefined reference to 
`__ubsan_handle_alignment_assumption'
   s390x-linux-gnu-ld: initramfs.c:(.init.text+0x3fc): undefined reference to 
`__ubsan_handle_alignment_assumption'
   s390x-linux-gnu-ld: init/initramfs.o: in function `dir_add':
   initramfs.c:(.init.text+0x1748): undefined reference to 
`__ubsan_handle_alignment_assumption'
   s390x-linux-gnu-ld: init/initramfs.o:initramfs.c:(.init.text+0x1b62): more 
undefined references to `__ubsan_handle_alignment_assumption' follow
   s390x-linux-gnu-ld: drivers/phy/ingenic/phy-ingenic-usb.o: in function 
`ingenic_usb_phy_probe':
   phy-ingenic-usb.c:(.text+0x88): undefined reference to 
`devm_platform_ioremap_resource'
   s390x-linux-gnu-ld: drivers/gpio/gpiolib.o: in function 
`gpiochip_add_data_with_key':
>> (.text+0x34f2): undefined reference to `__ubsan_handle_alignment_assumption'
   s390x-linux-gnu-ld: drivers/gpio/gpiolib.o: in function `kzalloc':
   gpiolib.c:(.text+0x5106): undefined reference to 
`__ubsan_handle_alignment_assumption'
   s390x-linux-gnu-ld: drivers/gpio/gpiolib.o: in function `kmalloc_array':
   gpiolib.c:(.text+0xc458): undefined reference to 
`__ubsan_handle_alignment_assumption'
   s390x-linux-gnu-ld: drivers/gpio/gpiolib-cdev.o: in function `gpio_ioctl':
   gpiolib-cdev.c:(.text+0x2a30): undefined reference to 
`__ubsan_handle_alignment_assumption'
   s390x-linux-gnu-ld: gpiolib-cdev.c:(.text+0x2b50): undefined reference to 
`__ubsan_handle_alignment_assumption'
   s390x-linux-gnu-ld: 
drivers/gpio/gpiolib-cdev.o:gpiolib-cdev.c:(.text+0x30da): more undefined 
references to `__ubsan_handle_alignment_assumption' follow

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org


.config.gz
Description: application/gzip

Re: [PATCH 2/2] checkpatch: kconfig: add missing types to regex

2020-12-25 Thread Joe Perches

On Fri, 2020-12-25 at 18:27 +0100, Nicolai Fischer wrote:
> On 12/21/20 6:17 PM, Joe Perches wrote:
[]
> > The message you used:
> > +   WARN("CONFIG_DESCRIPTION",
> > +   "help text is not indented 2 spaces 
> > more than the help keyword\n" . $herecurr);
> > 
> > is IMO a bit oddly phrased and could/should test only
> > the first line after the help keyword and show the help
> > line using $hereprev.
> > 
> > The problem with $herecurr is, that it always contains the first line of 
> > the Kconfig option.
> The loop which actually determines if the warning is to be displayed, leaves 
> $herecurr and likewise $hereprev unaffected.
> 
> So printing $hereprev would unfortunately not be any more helpful than 
> $herecurr.

> Because $herecurr and $hereprev also contain the line number, among other 
> things, I am not sure what would be the best way
> to address this.

There is a mechanism to create an output message block: get_stat_real
that could be used.

[PATCH] drm/amd/display: avoid null pointer dereference in dm_set_vblank

2020-12-25 Thread Defang Bo

[Why]
Similar to commit("drm/amd/display: Guard against null crtc in CRC 
IRQ"),
a null pointer deference can occur if crtc is null in
dm_set_vblank.

[How]

Check that CRTC is non-null before accessing its fields.

Signed-off-by: Defang Bo 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 13 ++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index e2b23486..df23d28 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -4875,10 +4875,17 @@ static inline int dm_set_vupdate_irq(struct drm_crtc 
*crtc, bool enable)
 static inline int dm_set_vblank(struct drm_crtc *crtc, bool enable)
 {
enum dc_irq_source irq_source;
-   struct amdgpu_crtc *acrtc = to_amdgpu_crtc(crtc);
-   struct amdgpu_device *adev = drm_to_adev(crtc->dev);
-   struct dm_crtc_state *acrtc_state = to_dm_crtc_state(crtc->state);
+   struct amdgpu_crtc *acrtc;
+   struct amdgpu_device *adev;
+   struct dm_crtc_state *acrtc_state;
int rc = 0;
+
+   if (crtc == NULL)
+   return rc;
+
+   acrtc = to_amdgpu_crtc(crtc);
+   adev = drm_to_adev(crtc->dev);
+   acrtc_state = to_dm_crtc_state(crtc->state);
 
if (enable) {
/* vblank irq on -> Only need vupdate irq in vrr mode */
-- 
2.7.4

Re: [PATCH 2/2] checkpatch: kconfig: add missing types to regex

2020-12-25 Thread Nicolai Fischer



On 12/21/20 6:17 PM, Joe Perches wrote:
> On Mon, 2020-12-21 at 16:08 +0100, Nicolai Fischer wrote:
>> On Sun, 2020-12-20 at 20:16 +0100, Joe Perches wrote:
>>> On Mon, 2020-12-14 at 11:24 +0100, Nicolai Fischer wrote:
 Kconfig parsing does not recognise all type attributes.
 This adds the missing 'int', 'sting' and 'hex' types.
>>> []
 diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
>>> []
 @@ -3321,7 +3321,7 @@ sub process {
    next if ($f =~ /^-/);
    last if (!$file && $f =~ /^\@\@/);
  

 -  if ($lines[$ln - 1] =~ 
 /^\+\s*(?:bool|tristate|prompt)\s*["']/) {
 +  if ($lines[$ln - 1] =~ 
 /^\+\s*(?:bool|tristate|int|hex|string|prompt)\s*["']/) {
    $is_start = 1;
    } elsif ($lines[$ln - 1] =~ /^\+\s*help$/) {
    $length = -1;
>>>
>>> Another thing that could be done is to enforce the "extra 2 spaces"
>>> indent by capturing the whitespace before the help keyword:
>>>
>>> } elsif ($lines[$ln - 1] =~ /^\+\s*help$/) {
>>>
>>> could be
>>>
>>> } elsif ($lines[$ln - 1] =~ /^\+(\s*)help\s*$/) 
>>> {
>>>
>>> with $1 used to validate the extra indent.
>>>
>>>
>>
>>
>> In case the indent does not match, should we display a new warning as in our 
>> previous patch?
> 
> Sure, but in a separate patch and ensure blank lines are ignored.
> 
> +   if ($l !~ /^\ {2}/) {
> +   $wrong_indent = 1;
> }
> 
> The message you used:
> +   WARN("CONFIG_DESCRIPTION",
> +   "help text is not indented 2 spaces 
> more than the help keyword\n" . $herecurr);
> 
> is IMO a bit oddly phrased and could/should test only
> the first line after the help keyword and show the help
> line using $hereprev.
> 
> The problem with $herecurr is, that it always contains the first line of the 
> Kconfig option.
The loop which actually determines if the warning is to be displayed, leaves 
$herecurr and likewise $hereprev unaffected.

So printing $hereprev would unfortunately not be any more helpful than 
$herecurr.

Because $herecurr and $hereprev also contain the line number, among other 
things, I am not sure what would be the best way
to address this.

Re: [PATCH v6 3/5] counter: Add character device interface

2020-12-25 Thread William Breathitt Gray

Hi David,

I agree with your suggested changes -- just a couple select comments
following below.

On Sun, Dec 13, 2020 at 05:58:26PM -0600, David Lechner wrote:
> > +static int counter_add_watch(struct counter_device *const counter,
> > +const unsigned long arg)
> > +{

[...]

> > +
> > +dummy_component:
> > +   comp_node.component = watch.component;
> 
> 
> In my experiments, I added a events_validate driver callback here to
> validate each event as it is added. This way the user can know exactly
> which event caused the problem rather than waiting for the event_config
> callback.

Yes, this is a good idea and I have use for this in the 104-quad-8
driver as well. I'm going to name this "watch_validate" however, because
I need to validate the requested channel as well as the requested event
here (both part of the struct counter_watch).

> > diff --git a/include/linux/counter.h b/include/linux/counter.h
> > index 3f3f8ba6c1b4..98cd7c035968 100644
> > --- a/include/linux/counter.h
> 
> 
> > 
> > +/**
> > + * struct counter_event_node - Counter Event node
> > + * @l: list of current watching Counter events
> > + * @event: event that triggers
> > + * @channel:   event channel
> > + * @comp_list: list of components to watch when event triggers
> > + */
> > +struct counter_event_node {
> > +   struct list_head l;
> > +   u8 event;
> > +   u8 channel;
> > +   struct list_head comp_list;
> > +};
> > +
> 
> 
> Unless this is needed outside of the drivers/counter/ directory, I
> would suggest putting it in drivers/counter/counter-chrdev.h instead
> of include/linux/counter.h.

The "events_list" member of the struct counter_device is a list of
struct counter_event_node. The events_configure() callback should parse
through this list to determine the current events configuration request.
As such, driver authors will need this structure available via
include/linux/counter.h so they can parse "events_list".

William Breathitt Gray

signature.asc
Description: PGP signature

Wrong __setup() callbacks return values and /init's environment pollution

2020-12-25 Thread Igor Zhbanov


It seems that nobody knows how to write __setup() callback functions for
parsing the command line parameters. And there are no documentation or
comments about best practices.

Despite being declared obsolete __setup() is used about 435 times in the
kernel and 51 of them (~11.2%) are erroneous in the sense of returning
incorrect value (0) resulting in the /init process environment pollution.

Initially it was mentioned that the callback function should return 1 when
the parameter value is parsed and consumed successfully, or return 0 to
keep the unparsed option as init's environment variable.

But there are no comments or documentation about it, so developers often
always returning 0 (as it is typically expected in other kernel functions)
or returning -EINVAL on parse error. All these cases resulting in init's
environment pollution (which is limited only to 32 variables in the config).

Even the original usage when 0 is returned on the parse error is questionable
now. There are no (or not so many) parameter names clashes between different
kernel subsystems, as well as there not so many parameters to be passed to
init/systemd that could be interpreted as the kernel parameters. So if a
kernel module sure that this is its own parameter, may be it would be better
to always return 1 and consume it, even it is malformed.

Also there are no recommendations on whether to print a warning when
incorrect parameter is passed. Some of the functions print a warning on
incorrect values; some are silently proceeding with the default values.
Some are even calling panic() on incorrect parameters (e.g.
setup_time_travel() -> time_travel_connect_external()). So there is no
consistency on what behavior for handling incorrect parameters values are
recommended.

I've tried to categorize all of the __setup() usages. And here is the zoo:

Functions always returning 1 (right):
  (About 324)

Functions returning 0 on error, 1 on success (mostly right):
  arch/arm/mach-pxa/viper.c:__setup("tpm=", viper_tpm_setup);
  arch/arm/mach-omap1/i2c.c:__setup("i2c_bus=", omap_i2c_bus_setup);
  arch/powerpc/platforms/pseries/lpar.c:__setup("cmo_free_hint=", 
cmo_free_hint);
  arch/powerpc/sysdev/fsl_85xx_l2ctlr.c:__setup("cache-sram-size=", 
get_size_from_cmdline);
  arch/powerpc/sysdev/fsl_85xx_l2ctlr.c:__setup("cache-sram-offset=", 
get_offset_from_cmdline);
  arch/powerpc/kernel/iommu.c:__setup("fail_iommu=", setup_fail_iommu);
  arch/powerpc/kernel/rtasd.c:__setup("surveillance=", surveillance_setup);
  arch/powerpc/kernel/rtasd.c:__setup("rtasmsgs=", rtasmsgs_setup);
  arch/x86/kernel/cpu/mce/core.c:__setup("mce", mcheck_enable);
  arch/x86/kernel/cpu/common.c:__setup("nofsgsbase", x86_nofsgsbase_setup);
  arch/microblaze/kernel/traps.c:__setup("kstack=", kstack_setup);
  arch/c6x/kernel/soc.c:__setup("emac_addr=", get_mac_addr_from_cmdline);
  arch/sparc/kernel/apc.c:__setup("apc=", apc_setup);
  arch/um/drivers/stdio_console.c:__setup("con", console_chan_setup);
  net/ipv4/tcp_metrics.c:__setup("tcpmhash_entries=", set_tcpmhash_entries);
  net/ipv4/tcp.c:__setup("thash_entries=", set_thash_entries);
  net/ipv4/udp.c:__setup("uhash_entries=", set_uhash_entries);
  net/ipv4/ipconfig.c:__setup("carrier_timeout=", set_carrier_timeout);
  net/ethernet/eth.c:__setup("ether=", netdev_boot_setup);
  net/core/dev.c:__setup("netdev=", netdev_boot_setup);
  drivers/acpi/osl.c:__setup("acpi_enforce_resources=", 
acpi_enforce_resources_setup);
  drivers/scsi/BusLogic.c:__setup("BusLogic=", blogic_setup);
  drivers/scsi/mac_scsi.c:__setup("mac5380=", mac_scsi_setup);
  drivers/scsi/atari_scsi.c:__setup("atascsi=", atari_scsi_setup);
  drivers/net/arcnet/com90io.c:__setup("com90io=", com90io_setup);
  drivers/net/appletalk/ltpc.c:__setup("ltpc=", ltpc_setup);
  drivers/net/hamradio/baycom_par.c:__setup("baycom_par=", baycom_par_setup);
  drivers/net/hamradio/baycom_ser_hdx.c:__setup("baycom_ser_hdx=", 
baycom_ser_hdx_setup);
  drivers/net/hamradio/baycom_epp.c:__setup("baycom_epp=", baycom_epp_setup);
  drivers/net/hamradio/baycom_ser_fdx.c:__setup("baycom_ser_fdx=", 
baycom_ser_fdx_setup);
  drivers/video/console/mdacon.c:__setup("mdacon=", mdacon_setup);
  drivers/md/md-autodetect.c:__setup("md=", md_setup);
  drivers/block/amiflop.c:__setup("floppy=", amiga_floppy_setup);
  drivers/block/floppy.c:__setup("floppy=", floppy_setup);
  drivers/block/ataflop.c:__setup("floppy=", atari_floppy_setup);
  drivers/char/lp.c:__setup("lp=", lp_setup);
  mm/failslab.c:__setup("failslab=", setup_failslab);
  mm/huge_memory.c:__setup("transparent_hugepage=", setup_transparent_hugepage);
  mm/hugetlb.c:__setup("hugepages=", hugepages_setup);
  mm/hugetlb.c:__setup("hugepagesz=", hugepagesz_setup);
  mm/hugetlb.c:__setup("default_hugepagesz=", default_hugepagesz_setup);
  mm/page_alloc.c:__setup("hashdist=", set_hashdist);
  mm/mempolicy.c:__setup("numa_balancing=", setup_numabalancing);
  fs/namespace.c:__setup("mhash_entries=", set_mhash_entries);
  fs/namespace.c

Re: riscv+KASAN does not boot

2020-12-25 Thread Dmitry Vyukov

On Fri, Dec 25, 2020 at 5:58 PM Andreas Schwab  wrote:
>
> On Dez 25 2020, Dmitry Vyukov wrote:
>
> > qemu-system-riscv64 \
> > -machine virt -bios default -smp 1 -m 2G \
> > -device virtio-blk-device,drive=hd0 \
> > -drive file=buildroot-riscv64.ext4,if=none,format=raw,id=hd0 \
> > -kernel arch/riscv/boot/Image \
> > -nographic \
> > -device virtio-rng-device,rng=rng0 -object
> > rng-random,filename=/dev/urandom,id=rng0 \
> > -netdev user,id=net0,host=10.0.2.10,hostfwd=tcp::10022-:22 -device
> > virtio-net-device,netdev=net0 \
> > -append "root=/dev/vda earlyprintk=serial console=ttyS0 oops=panic
> > panic_on_warn=1 panic=86400"
>
> Do you get more output with earlycon=sbi?

Hi Andreas,

For defconfig+kvm_guest.config+ scripts/config -e KASAN -e
KASAN_INLINE it actually gave me more output:


OpenSBI v0.7
   _  _
  / __ \  / |  _ \_   _|
 | |  | |_ __   ___ _ __ | (___ | |_) || |
 | |  | | '_ \ / _ \ '_ \ \___ \|  _ < | |
 | |__| | |_) |  __/ | | |) | |_) || |_
  \/| .__/ \___|_| |_|_/|/_|
| |
|_|

Platform Name  : QEMU Virt Machine
Platform HART Features : RV64ACDFIMSU
Current Hart   : 0
Firmware Base  : 0x8000
Firmware Size  : 132 KB
Runtime SBI Version: 0.2

MIDELEG : 0x0222
MEDELEG : 0xb109
PMP0: 0x8000-0x8003 (A)
PMP1: 0x-0x (A,R,W,X)
[0.00] Linux version 5.10.0-01370-g71c5f03154ac
(dvyu...@dvyukov-desk.muc.corp.google.com) (riscv64-linux-gnu-gcc
(Debian 10.2.0-9) 10.2.0, GNU ld (GNU Binutils for Debian) 2.35.1) #17
SMP Fri Dec 25 18:10:12 CET 2020
[0.00] OF: fdt: Ignoring memory range 0x8000 - 0x8020
[0.00] earlycon: sbi0 at I/O port 0x0 (options '')
[0.00] printk: bootconsole [sbi0] enabled
[0.00] efi: UEFI not found.
[0.00] Zone ranges:
[0.00]   DMA32[mem 0x8020-0x]
[0.00]   Normal   empty
[0.00] Movable zone start for each node
[0.00] Early memory node ranges
[0.00]   node   0: [mem 0x8020-0x]
[0.00] Initmem setup node 0 [mem 0x8020-0x]
[0.00] SBI specification v0.2 detected
[0.00] SBI implementation ID=0x1 Version=0x7
[0.00] SBI v0.2 TIME extension detected
[0.00] SBI v0.2 IPI extension detected
[0.00] SBI v0.2 RFENCE extension detected
[0.00] software IO TLB: mapped [mem
0xfa3f9000-0xfe3f9000] (64MB)
[0.00] Unable to handle kernel paging request at virtual
address dfc81004
[0.00] Oops [#1]
[0.00] Modules linked in:
[0.00] CPU: 0 PID: 0 Comm: swapper Not tainted
5.10.0-01370-g71c5f03154ac #17
[0.00] epc: ffe00042e3e4 ra : ffe000c0462c sp : ffe001603ea0
[0.00]  gp : ffe0016e3c60 tp : ffe00160cd40 t0 :
dfc81004
[0.00]  t1 : ffe000e0a838 t2 :  s0 :
ffe001603f50
[0.00]  s1 : ffe0016e50a8 a0 : dfc81004 a1 :

[0.00]  a2 : 0ffc a3 : dfc82000 a4 :

[0.00]  a5 : 3e8c6001 a6 : ffe000e0a820 a7 :
0900
[0.00]  s2 : dfc82000 s3 : dfc8 s4 :
0001
[0.00]  s5 : ffe0016e5108 s6 : f000 s7 :
dfc81004
[0.00]  s8 : 0080 s9 :  s10:
ffe07a119000
[0.00]  s11: ffc0 t3 : ffe0016eb908 t4 :
0001
[0.00]  t5 : ffc4001c150a t6 : ffe001603be8
[0.00] status: 0100 badaddr: dfc81004
cause: 000f
[0.00] random: get_random_bytes called from
oops_exit+0x30/0x58 with crng_init=0
[0.00] ---[ end trace  ]---
[0.00] Kernel panic - not syncing: Fatal exception
[0.00] ---[ end Kernel panic - not syncing: Fatal exception ]---


But I first tried with a the kernel image I had in the dir, I think it
was this config (no KASAN):
https://gist.githubusercontent.com/dvyukov/b2b62beccf80493781ab03b41430e616/raw/62e673cff08a8a41656d2871b8a37f74b00f509f/gistfile1.txt

and earlycon=sbi did not change anything (no output after OpenSBI).
So potentially there are 2 different problems.

undefined reference to `cmd_db_read_addr'

2020-12-25 Thread kernel test robot

Hi Lina,

FYI, the error/warning still remains.

tree:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 
master
head:   71c5f03154ac1cb27423b984743ccc2f5d11d14d
commit: 778279f4f5e4e89ff31803ba48135256563825c2 soc: qcom: cmd-db: allow 
loading as a module
date:   9 weeks ago
config: microblaze-randconfig-r026-20201224 (attached as .config)
compiler: microblaze-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=778279f4f5e4e89ff31803ba48135256563825c2
git remote add linus 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
git fetch --no-tags linus master
git checkout 778279f4f5e4e89ff31803ba48135256563825c2
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross 
ARCH=microblaze 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All errors (new ones prefixed by >>):

   microblaze-linux-ld: drivers/regulator/qcom-rpmh-regulator.o: in function 
`rpmh_regulator_probe':
>> (.text+0x354): undefined reference to `cmd_db_read_addr'

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org


.config.gz
Description: application/gzip

1 2 3 >

1 - 100 of 200 matches

Mail list logo