[PATCH] lib,kprobes: objpool_try_add_slot merged into objpool_push

2023-10-24 Thread wuqiang.matt
After several rounds of objpool improvements, the inline function
objpool_try_add_slot is no longer appropriate:
1) push can only happen on the local cpu node, so the cpu param of
   objpool_try_add_slot is misleading
2) objpool_push is only a wrapper of objpool_try_add_slot, now that
   the original loop over all cpu nodes has been removed

Signed-off-by: wuqiang.matt 
---
 lib/objpool.c | 26 +-
 1 file changed, 9 insertions(+), 17 deletions(-)

diff --git a/lib/objpool.c b/lib/objpool.c
index a032701beccb..7474a3a60cad 100644
--- a/lib/objpool.c
+++ b/lib/objpool.c
@@ -152,13 +152,17 @@ int objpool_init(struct objpool_head *pool, int nr_objs, int object_size,
 }
 EXPORT_SYMBOL_GPL(objpool_init);
 
-/* adding object to slot, abort if the slot was already full */
-static inline int
-objpool_try_add_slot(void *obj, struct objpool_head *pool, int cpu)
+/* reclaim an object to object pool */
+int objpool_push(void *obj, struct objpool_head *pool)
 {
-   struct objpool_slot *slot = pool->cpu_slots[cpu];
+   struct objpool_slot *slot;
uint32_t head, tail;
+   unsigned long flags;
+
+   /* disable local irq to avoid preemption & interruption */
+   raw_local_irq_save(flags);
 
+   slot = pool->cpu_slots[raw_smp_processor_id()];
/* loading tail and head as a local snapshot, tail first */
tail = READ_ONCE(slot->tail);
 
@@ -173,21 +177,9 @@ objpool_try_add_slot(void *obj, struct objpool_head *pool, int cpu)
/* update sequence to make this obj available for pop() */
smp_store_release(&slot->last, tail + 1);
 
-   return 0;
-}
-
-/* reclaim an object to object pool */
-int objpool_push(void *obj, struct objpool_head *pool)
-{
-   unsigned long flags;
-   int rc;
-
-   /* disable local irq to avoid preemption & interruption */
-   raw_local_irq_save(flags);
-   rc = objpool_try_add_slot(obj, pool, raw_smp_processor_id());
raw_local_irq_restore(flags);
 
-   return rc;
+   return 0;
 }
 EXPORT_SYMBOL_GPL(objpool_push);
 
-- 
2.40.1
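
The slot update that the merged objpool_push() performs can be sketched in
user space with C11 atomics. This is only an illustration under stated
assumptions, not the kernel code: struct slot_sketch, slot_push_sketch() and
SLOT_CAPACITY are hypothetical names, the per-CPU slot lookup and the
raw_local_irq_save()/raw_local_irq_restore() pair have no user-space
equivalent and are left out, and the sketch returns -1 on a full slot as its
own choice. It assumes a single pusher per slot, which is the premise of the
patch (push only happens on the local cpu node).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define SLOT_CAPACITY 8u /* hypothetical size; must be a power of two */

/* Hypothetical stand-in for one per-CPU ring of object pointers. */
struct slot_sketch {
	_Atomic uint32_t head;  /* next entry a consumer would take */
	_Atomic uint32_t tail;  /* next entry a producer reserves */
	_Atomic uint32_t last;  /* entries below this are published */
	void *entries[SLOT_CAPACITY];
};

/* Reserve the tail position, store the object, then publish it. */
static int slot_push_sketch(struct slot_sketch *slot, void *obj)
{
	uint32_t head, tail;

	assert(slot != NULL);

	/* load tail first, as a local snapshot */
	tail = atomic_load_explicit(&slot->tail, memory_order_relaxed);
	do {
		head = atomic_load_explicit(&slot->head, memory_order_relaxed);
		/* sketch-only policy: refuse the push when the ring is full */
		if (tail - head >= SLOT_CAPACITY)
			return -1;
	} while (!atomic_compare_exchange_weak_explicit(&slot->tail, &tail,
							tail + 1,
							memory_order_acquire,
							memory_order_relaxed));

	/* the tail position is now reserved for obj */
	slot->entries[tail & (SLOT_CAPACITY - 1)] = obj;

	/* release store: makes the entry visible before 'last' advances */
	atomic_store_explicit(&slot->last, tail + 1, memory_order_release);
	return 0;
}
```

With the wrapper gone, the caller-visible shape is one function that looks up
its slot and performs these steps directly.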




Re: [PATCH RFC 00/37] Add support for arm64 MTE dynamic tag storage reuse

2023-10-24 Thread Hyesoo Yu
On Wed, Sep 13, 2023 at 04:29:25PM +0100, Catalin Marinas wrote:
> On Mon, Sep 11, 2023 at 02:29:03PM +0200, David Hildenbrand wrote:
> > On 11.09.23 13:52, Catalin Marinas wrote:
> > > On Wed, Sep 06, 2023 at 12:23:21PM +0100, Alexandru Elisei wrote:
> > > > On Thu, Aug 24, 2023 at 04:24:30PM +0100, Catalin Marinas wrote:
> > > > > On Thu, Aug 24, 2023 at 01:25:41PM +0200, David Hildenbrand wrote:
> > > > > > On 24.08.23 13:06, David Hildenbrand wrote:
> > > > > > > Regarding one complication: "The kernel needs to know where to 
> > > > > > > allocate
> > > > > > > a PROT_MTE page from or migrate a current page if it becomes 
> > > > > > > PROT_MTE
> > > > > > > (mprotect()) and the range it is in does not support tagging.",
> > > > > > > simplified handling would be if it's in a MIGRATE_CMA pageblock, 
> > > > > > > it
> > > > > > > doesn't support tagging. You have to migrate to a !CMA page (for
> > > > > > > example, not specifying GFP_MOVABLE as a quick way to achieve 
> > > > > > > that).
> > > > > > 
> > > > > > Okay, I now realize that this patch set effectively duplicates some 
> > > > > > CMA
> > > > > > behavior using a new migrate-type.
> > > [...]
> > > > I considered mixing the tag storage memory with normal memory and
> > > > adding it to MIGRATE_CMA. But since tag storage memory cannot be tagged,
> > > > this means that it's not enough anymore to have a __GFP_MOVABLE 
> > > > allocation
> > > > request to use MIGRATE_CMA.
> > > > 
> > > > I considered two solutions to this problem:
> > > > 
> > > > 1. Only allocate from MIGRATE_CMA if the requested memory is not tagged 
> > > > =>
> > > > this effectively means transforming all memory from MIGRATE_CMA into the
> > > > MIGRATE_METADATA migratetype that the series introduces. Not very
> > > > appealing, because that means treating normal memory that is also on the
> > > > MIGRATE_CMA lists as tagged memory.
> > > 
> > > That's indeed not ideal. We could try this if it makes the patches
> > > significantly simpler, though I'm not so sure.
> > > 
> > > Allocating metadata is the easier part as we know the correspondence
> > > from the tagged pages (32 PROT_MTE page) to the metadata page (1 tag
> > > storage page), so alloc_contig_range() does this for us. Just adding it
> > > to the CMA range is sufficient.
> > > 
> > > However, making sure that we don't allocate PROT_MTE pages from the
> > > metadata range is what led us to another migrate type. I guess we could
> > > achieve something similar with a new zone or a CPU-less NUMA node,
> > 
> > Ideally, no significant core-mm changes to optimize for an architecture
> > oddity. That implies, no new zones and no new migratetypes -- unless it is
> > unavoidable and you are confident that you can convince core-MM people that
> > the use case (giving back 3% of system RAM at max in some setups) is worth
> > the trouble.
> 
> If I was an mm maintainer, I'd also question this ;). But vendors seem
> pretty picky about the amount of RAM reserved for MTE (e.g. 0.5G for a
> 16G platform does look somewhat big). As more and more apps adopt MTE,
> the wastage would be smaller but the first step is getting vendors to
> enable it.
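
The figures quoted in this thread (about 3% of RAM, 0.5G on a 16G platform)
follow from the 32:1 correspondence between PROT_MTE pages and tag storage
pages mentioned earlier. A small arithmetic sketch, with hypothetical helper
names that are not taken from the patch series:

```c
#include <assert.h>
#include <stdint.h>

/* Bytes of tag storage needed if every data page were tagged. */
static uint64_t tag_storage_bytes(uint64_t ram_bytes,
				  uint32_t data_pages_per_tag_page)
{
	assert(data_pages_per_tag_page != 0);
	return ram_bytes / data_pages_per_tag_page;
}

/* Same ratio expressed in hundredths of a percent (basis points). */
static uint32_t tag_storage_basis_points(uint32_t data_pages_per_tag_page)
{
	assert(data_pages_per_tag_page != 0);
	return 10000u / data_pages_per_tag_page;
}
```

For 16 GiB of RAM and a ratio of 32, this gives 512 MiB of tag storage, or
roughly 3.1% of memory.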
> 
> > I also had CPU-less NUMA nodes in mind when thinking about that, but not
> > sure how easy it would be to integrate it. If the tag memory has actually
> > different performance characteristics as well, a NUMA node would be the
> > right choice.
> 
> In general I'd expect the same characteristics. However, changing the
> memory designation from tag to data (and vice-versa) requires some cache
> maintenance. The allocation cost is slightly higher (not the runtime
> one), so it would help if the page allocator does not favour this range.
> Anyway, that's an optimisation to worry about later.
> 
> > If we could find some way to easily support this either via CMA or CPU-less
> > NUMA nodes, that would be much preferable; even if we cannot cover each and
> > every future use case right now. I expect some issues with CXL+MTE either
> > way, but am happy to be taught otherwise :)
> 
> I think CXL+MTE is rather theoretical at the moment. Given that PCIe
> doesn't have any notion of MTE, more likely there would be some piece of
> interconnect that generates two memory accesses: one for data and the
> other for tags at a configurable offset (which may or may not be in the
> same CXL range).
> 
> > Another thought I had was adding something like CMA memory characteristics.
> > Like, asking if a given CMA area/page supports tagging (i.e., flag for the
> > CMA area set?)?
> 
> I don't think adding CMA memory characteristics helps much. The metadata
> allocation wouldn't go through cma_alloc() but rather
> alloc_contig_range() directly for a specific pfn corresponding to the
> data pages with PROT_MTE. The core mm code doesn't need to know about
> the tag storage layout.
> 
> It's also unlikely for cma_alloc() memory to be mapped as PROT_MTE.
> That's typically coming from device 

Re: [PATCH] locking/atomic: sh: Use generic_cmpxchg_local for arch_cmpxchg_local()

2023-10-24 Thread wuqiang.matt

On 2023/10/25 07:42, Masami Hiramatsu (Google) wrote:
> On Tue, 24 Oct 2023 16:08:12 +0100
> Mark Rutland  wrote:
> 
> > On Tue, Oct 24, 2023 at 11:52:54PM +0900, Masami Hiramatsu (Google) wrote:
> > > From: Masami Hiramatsu (Google) 
> > > 
> > > Use generic_cmpxchg_local() for arch_cmpxchg_local() implementation
> > > in SH architecture because it does not implement arch_cmpxchg_local().
> > 
> > I do not think this is correct.
> > 
> > The implementation in  is UP-only (and it only
> > disables interrupts), whereas arch/sh can be built SMP. We should probably add
> > some guards into  for that as we have in
> > .
> 
> Isn't cmpxchg_local for the data which only needs to ensure to do cmpxchg
> on local CPU?

asm-generic/cmpxchg.h is only for UP, will throw an error for SMP building:

#ifdef CONFIG_SMP
#error "Cannot use generic cmpxchg on SMP"
#endif

SH arch seems it does have SMP systems. The arch/sh/include/asm/cmpxchg.h has
the following codes:

#if defined(CONFIG_GUSA_RB)
#include 
#elif defined(CONFIG_CPU_SH4A)
#include 
#elif defined(CONFIG_CPU_J2) && defined(CONFIG_SMP)
#include 
#else
#include 
#endif

> So I think it doesn't care about the other CPUs (IOW, it should not touched by
> other CPUs), so it only considers UP case. E.g. on x86, arch_cmpxchg_local() is
> defined as raw "cmpxchg" without lock prefix.
> 
> #define __cmpxchg_local(ptr, old, new, size)\
>  __raw_cmpxchg((ptr), (old), (new), (size), "")
> 
> 
> Thank you,
> 
> > 
> > I think the right thing to do here is to define arch_cmpxchg_local() in terms
> > of arch_cmpxchg(), i.e. at the bottom of arch/sh's  add:
> > 
> > #define arch_cmpxchg_local  arch_cmpxchg

I agree too. Might not be performance optimized but guarantees correctness.

> > 
> > Mark.
> > 
> > > 
> > > Reported-by: kernel test robot 
> > > Closes: 
> > > https://lore.kernel.org/oe-kbuild-all/202310241310.ir5uukog-...@intel.com/
> > > Signed-off-by: Masami Hiramatsu (Google) 
> > > ---
> > >  arch/sh/include/asm/cmpxchg.h |2 ++
> > >  1 file changed, 2 insertions(+)
> > > 
> > > diff --git a/arch/sh/include/asm/cmpxchg.h b/arch/sh/include/asm/cmpxchg.h
> > > index 288f6f38d98f..e920e61fb817 100644
> > > --- a/arch/sh/include/asm/cmpxchg.h
> > > +++ b/arch/sh/include/asm/cmpxchg.h
> > > @@ -71,4 +71,6 @@ static inline unsigned long __cmpxchg(volatile void * 
> > > ptr, unsigned long old,
> > > (unsigned long)_n_, sizeof(*(ptr))); \
> > >})
> > >  
> > > +#include 
> > > +
> > >  #endif /* __ASM_SH_CMPXCHG_H */
> > > 








Re: [PATCH] locking/atomic: sh: Use generic_cmpxchg_local for arch_cmpxchg_local()

2023-10-24 Thread Google
On Tue, 24 Oct 2023 16:08:12 +0100
Mark Rutland  wrote:

> On Tue, Oct 24, 2023 at 11:52:54PM +0900, Masami Hiramatsu (Google) wrote:
> > From: Masami Hiramatsu (Google) 
> > 
> > Use generic_cmpxchg_local() for arch_cmpxchg_local() implementation
> > in SH architecture because it does not implement arch_cmpxchg_local().
> 
> I do not think this is correct.
> 
> The implementation in  is UP-only (and it only
> disables interrupts), whereas arch/sh can be built SMP. We should probably add
> some guards into  for that as we have in
> .

Isn't cmpxchg_local for the data which only needs to ensure to do cmpxchg
on local CPU?

So I think it doesn't care about the other CPUs (IOW, it should not touched by
other CPUs), so it only considers UP case. E.g. on x86, arch_cmpxchg_local() is
defined as raw "cmpxchg" without lock prefix.

#define __cmpxchg_local(ptr, old, new, size)\
__raw_cmpxchg((ptr), (old), (new), (size), "")


Thank you,


> 
> I think the right thing to do here is to define arch_cmpxchg_local() in terms
> of arch_cmpxchg(), i.e. at the bottom of arch/sh's  add:
> 
> #define arch_cmpxchg_local  arch_cmpxchg
> 
> Mark.
> 
> > 
> > Reported-by: kernel test robot 
> > Closes: 
> > https://lore.kernel.org/oe-kbuild-all/202310241310.ir5uukog-...@intel.com/
> > Signed-off-by: Masami Hiramatsu (Google) 
> > ---
> >  arch/sh/include/asm/cmpxchg.h |2 ++
> >  1 file changed, 2 insertions(+)
> > 
> > diff --git a/arch/sh/include/asm/cmpxchg.h b/arch/sh/include/asm/cmpxchg.h
> > index 288f6f38d98f..e920e61fb817 100644
> > --- a/arch/sh/include/asm/cmpxchg.h
> > +++ b/arch/sh/include/asm/cmpxchg.h
> > @@ -71,4 +71,6 @@ static inline unsigned long __cmpxchg(volatile void * 
> > ptr, unsigned long old,
> > (unsigned long)_n_, sizeof(*(ptr))); \
> >})
> >  
> > +#include 
> > +
> >  #endif /* __ASM_SH_CMPXCHG_H */
> > 


-- 
Masami Hiramatsu (Google) 



[PATCH v2 2/4] dt-bindings: arm: qcom: Add Samsung Galaxy Tab 4 10.1 LTE

2023-10-24 Thread Stefan Hansson
This documents Samsung Galaxy Tab 4 10.1 LTE (samsung,matisselte)
which is a tablet by Samsung based on the MSM8926 SoC.

Signed-off-by: Stefan Hansson 
---
 Documentation/devicetree/bindings/arm/qcom.yaml | 1 +
 1 file changed, 1 insertion(+)

diff --git a/Documentation/devicetree/bindings/arm/qcom.yaml 
b/Documentation/devicetree/bindings/arm/qcom.yaml
index 88b84035e7b1..242ffe89c6c6 100644
--- a/Documentation/devicetree/bindings/arm/qcom.yaml
+++ b/Documentation/devicetree/bindings/arm/qcom.yaml
@@ -196,6 +196,7 @@ properties:
   - enum:
   - microsoft,superman-lte
   - microsoft,tesla
+  - samsung,matisselte
   - const: qcom,msm8926
   - const: qcom,msm8226
 
-- 
2.41.0



[PATCH v2 4/4] ARM: dts: qcom: samsung-matisse-common: Add UART

2023-10-24 Thread Stefan Hansson
This was not enabled in the matisse-wifi tree.

Signed-off-by: Stefan Hansson 
---
 .../boot/dts/qcom/qcom-msm8226-samsung-matisse-common.dtsi| 4 
 1 file changed, 4 insertions(+)

diff --git a/arch/arm/boot/dts/qcom/qcom-msm8226-samsung-matisse-common.dtsi 
b/arch/arm/boot/dts/qcom/qcom-msm8226-samsung-matisse-common.dtsi
index 11fec4e963b7..35290ce63b40 100644
--- a/arch/arm/boot/dts/qcom/qcom-msm8226-samsung-matisse-common.dtsi
+++ b/arch/arm/boot/dts/qcom/qcom-msm8226-samsung-matisse-common.dtsi
@@ -233,6 +233,10 @@ muic: usb-switch@25 {
};
 };
 
+_uart3 {
+   status = "okay";
+};
+
 _requests {
regulators {
compatible = "qcom,rpm-pm8226-regulators";
-- 
2.41.0



[PATCH v2 1/4] ARM: dts: qcom: samsung-matisse-common: Add initial common device tree

2023-10-24 Thread Stefan Hansson
According to the dts from the kernel source code released by Samsung,
matissewifi and matisselte only have minor differences in hardware, so
use a shared dtsi to reduce duplicated code. Additionally, this should
make adding support for matisse3g easier should someone want to do that
at a later point.

As such, add a common device tree for all matisse devices by Samsung
based on the matissewifi dts. Support for matisselte will be introduced
in a later patch in this series and will use the common dtsi as well.

Signed-off-by: Stefan Hansson 
---
 .../qcom-apq8026-samsung-matisse-wifi.dts | 467 +
 .../qcom-msm8226-samsung-matisse-common.dtsi  | 474 ++
 2 files changed, 483 insertions(+), 458 deletions(-)
 create mode 100644 
arch/arm/boot/dts/qcom/qcom-msm8226-samsung-matisse-common.dtsi

diff --git a/arch/arm/boot/dts/qcom/qcom-apq8026-samsung-matisse-wifi.dts 
b/arch/arm/boot/dts/qcom/qcom-apq8026-samsung-matisse-wifi.dts
index f516e0426bb9..98d4bb797617 100644
--- a/arch/arm/boot/dts/qcom/qcom-apq8026-samsung-matisse-wifi.dts
+++ b/arch/arm/boot/dts/qcom/qcom-apq8026-samsung-matisse-wifi.dts
@@ -5,222 +5,12 @@
 
 /dts-v1/;
 
-#include 
-#include "qcom-msm8226.dtsi"
-#include "qcom-pm8226.dtsi"
-
-/delete-node/ _region;
-/delete-node/ _region;
+#include "qcom-msm8226-samsung-matisse-common.dtsi"
 
 / {
model = "Samsung Galaxy Tab 4 10.1";
compatible = "samsung,matisse-wifi", "qcom,apq8026";
chassis-type = "tablet";
-
-   aliases {
-   mmc0 = _1; /* SDC1 eMMC slot */
-   mmc1 = _2; /* SDC2 SD card slot */
-   display0 = 
-   };
-
-   chosen {
-   #address-cells = <1>;
-   #size-cells = <1>;
-   ranges;
-
-   stdout-path = "display0";
-
-   framebuffer0: framebuffer@320 {
-   compatible = "simple-framebuffer";
-   reg = <0x0320 0x80>;
-   width = <1280>;
-   height = <800>;
-   stride = <(1280 * 3)>;
-   format = "r8g8b8";
-   };
-   };
-
-   gpio-hall-sensor {
-   compatible = "gpio-keys";
-
-   event-hall-sensor {
-   label = "Hall Effect Sensor";
-   gpios = < 110 GPIO_ACTIVE_LOW>;
-   linux,input-type = ;
-   linux,code = ;
-   debounce-interval = <15>;
-   linux,can-disable;
-   wakeup-source;
-   };
-   };
-
-   gpio-keys {
-   compatible = "gpio-keys";
-   autorepeat;
-
-   key-home {
-   label = "Home";
-   gpios = < 108 GPIO_ACTIVE_LOW>;
-   linux,code = ;
-   debounce-interval = <15>;
-   };
-
-   key-volume-down {
-   label = "Volume Down";
-   gpios = < 107 GPIO_ACTIVE_LOW>;
-   linux,code = ;
-   debounce-interval = <15>;
-   };
-
-   key-volume-up {
-   label = "Volume Up";
-   gpios = < 106 GPIO_ACTIVE_LOW>;
-   linux,code = ;
-   debounce-interval = <15>;
-   };
-   };
-
-   i2c-backlight {
-   compatible = "i2c-gpio";
-   sda-gpios = < 20 (GPIO_ACTIVE_HIGH|GPIO_OPEN_DRAIN)>;
-   scl-gpios = < 21 (GPIO_ACTIVE_HIGH|GPIO_OPEN_DRAIN)>;
-
-   pinctrl-0 = <_i2c_default_state>;
-   pinctrl-names = "default";
-
-   i2c-gpio,delay-us = <4>;
-
-   #address-cells = <1>;
-   #size-cells = <0>;
-
-   backlight@2c {
-   compatible = "ti,lp8556";
-   reg = <0x2c>;
-
-   dev-ctrl = /bits/ 8 <0x80>;
-   init-brt = /bits/ 8 <0x3f>;
-
-   pwms = <_pwm 0 10>;
-   pwm-names = "lp8556";
-
-   rom-a0h {
-   rom-addr = /bits/ 8 <0xa0>;
-   rom-val = /bits/ 8 <0x44>;
-   };
-
-   rom-a1h {
-   rom-addr = /bits/ 8 <0xa1>;
-   rom-val = /bits/ 8 <0x6c>;
-   };
-
-   rom-a5h {
-   rom-addr = /bits/ 8 <0xa5>;
-   rom-val = /bits/ 8 <0x24>;
-   };
-   };
-   };
-
-   backlight_pwm: pwm {
-   compatible = "clk-pwm";
-   #pwm-cells = <2>;
-   clocks = < CAMSS_GP0_CLK>;
-   pinctrl-0 = <_pwm_default_state>;
-

[PATCH v2 3/4] ARM: dts: qcom: Add support for Samsung Galaxy Tab 4 10.1 LTE (SM-T535)

2023-10-24 Thread Stefan Hansson
Add a device tree for the Samsung Galaxy Tab 4 10.1 (SM-T535) LTE tablet
based on the MSM8926 platform.

Signed-off-by: Stefan Hansson 
---
 arch/arm/boot/dts/qcom/Makefile   |  1 +
 .../qcom/qcom-msm8926-samsung-matisselte.dts  | 36 +++
 2 files changed, 37 insertions(+)
 create mode 100644 arch/arm/boot/dts/qcom/qcom-msm8926-samsung-matisselte.dts

diff --git a/arch/arm/boot/dts/qcom/Makefile b/arch/arm/boot/dts/qcom/Makefile
index a3d293e40820..cab35eeb30f6 100644
--- a/arch/arm/boot/dts/qcom/Makefile
+++ b/arch/arm/boot/dts/qcom/Makefile
@@ -34,6 +34,7 @@ dtb-$(CONFIG_ARCH_QCOM) += \
qcom-msm8916-samsung-serranove.dtb \
qcom-msm8926-microsoft-superman-lte.dtb \
qcom-msm8926-microsoft-tesla.dtb \
+   qcom-msm8926-samsung-matisselte.dtb \
qcom-msm8960-cdp.dtb \
qcom-msm8960-samsung-expressatt.dtb \
qcom-msm8974-lge-nexus5-hammerhead.dtb \
diff --git a/arch/arm/boot/dts/qcom/qcom-msm8926-samsung-matisselte.dts 
b/arch/arm/boot/dts/qcom/qcom-msm8926-samsung-matisselte.dts
new file mode 100644
index ..6e25b1a74ce5
--- /dev/null
+++ b/arch/arm/boot/dts/qcom/qcom-msm8926-samsung-matisselte.dts
@@ -0,0 +1,36 @@
+// SPDX-License-Identifier: BSD-3-Clause
+/*
+ * Copyright (c) 2022, Matti Lehtimäki 
+ * Copyright (c) 2023, Stefan Hansson 
+ */
+
+/dts-v1/;
+
+#include "qcom-msm8226-samsung-matisse-common.dtsi"
+
+/ {
+   model = "Samsung Galaxy Tab 4 10.1 LTE";
+   compatible = "samsung,matisselte", "qcom,msm8926", "qcom,msm8226";
+   chassis-type = "tablet";
+};
+
+_l3 {
+   regulator-max-microvolt = <135>;
+};
+
+_s4 {
+   regulator-max-microvolt = <220>;
+};
+
+_tsp_3p3v {
+   gpio = < 32 GPIO_ACTIVE_HIGH>;
+};
+
+_2 {
+   /* SD card fails to probe with error -110 */
+   status = "disabled";
+};
+
+_en1_default_state {
+   pins = "gpio32";
+};
-- 
2.41.0



[PATCH v2 0/4] Add samsung-matisselte and common matisse dtsi

2023-10-24 Thread Stefan Hansson
This series adds a common samsung-matisse dtsi and reworks
samsung-matisse-wifi to use it, and introduces samsung-matisselte. I
chose matisselte over matisse-lte as this is how most other devices
(klte, s3ve3g) do it and it is the codename that Samsung gave the
device. See individual commits for more information.

---
Changes since v1:

 - Rebased on latest linux-next
 - Added qcom,msm8226 compatible to matisselte, inspired by the recent Lumia
   830 patch. In v1, the patch was rejected because I included the msm8226
   dtsi without marking matisselte as compatible with msm8226, and I was not
   sure how to resolve that. As such, I'm copying what was done in the Lumia
   830 (microsoft-tesla) patch, given that it was accepted.

Stefan Hansson (4):
  ARM: dts: qcom: samsung-matisse-common: Add initial common device tree
  dt-bindings: arm: qcom: Add Samsung Galaxy Tab 4 10.1 LTE
  ARM: dts: qcom: Add support for Samsung Galaxy Tab 4 10.1 LTE
(SM-T535)
  ARM: dts: qcom: samsung-matisse-common: Add UART

 .../devicetree/bindings/arm/qcom.yaml |   1 +
 arch/arm/boot/dts/qcom/Makefile   |   1 +
 .../qcom-apq8026-samsung-matisse-wifi.dts | 467 +
 .../qcom-msm8226-samsung-matisse-common.dtsi  | 478 ++
 .../qcom/qcom-msm8926-samsung-matisselte.dts  |  36 ++
 5 files changed, 525 insertions(+), 458 deletions(-)
 create mode 100644 
arch/arm/boot/dts/qcom/qcom-msm8226-samsung-matisse-common.dtsi
 create mode 100644 arch/arm/boot/dts/qcom/qcom-msm8926-samsung-matisselte.dts

-- 
2.41.0



[PATCH v6] scripts/link-vmlinux.sh: Add alias to duplicate symbols for kallsyms

2023-10-24 Thread Alessandro Carminati (Red Hat)
In the kernel environment, scenarios often arise where identical names
are shared among symbols within core image or modules.
While this poses no complications for the kernel's binary itself, it
creates challenges when conducting trace or probe operations using tools
like kprobe.

A solution has been introduced, referred to as "kas_alias."
During the kernel's build process, an extensive scan of all objects is
performed, encompassing both core kernel components and modules, to
collect comprehensive symbol information.
Subsequently, for all duplicate symbols, the process enriches symbol names
by appending meaningful suffixes derived from source files and line
numbers.
These freshly generated aliases simplify interaction with symbols.

The procedure is executed as follows.
During the kernel's build phase, an exhaustive search is performed for
symbols that share the same name across the kernel image and all module
object files.
For the kernel core image, a new nm data file is created and an alias for
each duplicate symbol is added.
For modules and lib objects, the ELF symtable is modified by adding an
alias for each duplicate symbol.

For the symbol "device_show", you can expect output like the
following:

 ~ # cat /proc/kallsyms | grep " device_show"
 963cd2a0 t device_show
 963cd2a0 t device_show@drivers_pci_pci_sysfs_c_49
 96454b60 t device_show
 96454b60 t device_show@drivers_virtio_virtio_c_16
 966e1700 T device_show_ulong
 966e1740 T device_show_int
 966e1770 T device_show_bool
 c04e10a0 t device_show [mmc_core]
 c04e10a0 t device_show@drivers_mmc_core_sdio_bus_c_45  [mmc_core]
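
The alias names in the listing above appear to be the original symbol, an
'@', and the source path and line number with non-alphanumeric characters
replaced by '_'. A minimal sketch of that naming scheme follows. It is an
assumption inferred from the sample output, not the kas_alias implementation
(which is a Python script), and make_alias_sketch() is a hypothetical name.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/*
 * Build "<symbol>@<normalized path>_<line>" into out.
 * Returns 0 on success, -1 if out is too small.
 */
static int make_alias_sketch(char *out, size_t out_len, const char *symbol,
			     const char *path, unsigned int line)
{
	size_t sym_len, pos;
	int n;

	assert(out != NULL && symbol != NULL && path != NULL);

	sym_len = strlen(symbol);
	/* need room for the symbol, '@' and a terminating NUL */
	if (sym_len + 1 >= out_len)
		return -1;
	memcpy(out, symbol, sym_len);
	out[sym_len] = '@';
	pos = sym_len + 1;

	/* normalize the path: anything that is not alphanumeric becomes '_' */
	for (; *path != '\0'; path++) {
		if (pos + 1 >= out_len)
			return -1;
		out[pos++] = isalnum((unsigned char)*path) ? *path : '_';
	}

	/* append the line number; snprintf also writes the final NUL */
	n = snprintf(out + pos, out_len - pos, "_%u", line);
	if (n < 0 || (size_t)n >= out_len - pos)
		return -1;
	return 0;
}
```

Under these assumptions, the symbol "device_show" defined at line 49 of
drivers/pci/pci-sysfs.c maps to the first alias shown in the listing.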

Signed-off-by: Alessandro Carminati (Red Hat) 

NOTE1:
Regarding the symbol name duplication that results from compat_binfmt_elf.c
including binfmt_elf.c, this corner case is inherently challenging for the
addr2line approach.
Attempting to conceal this limitation would be counterproductive.

compat_binfmt_elf.c directly includes binfmt_elf.c, so addr2line can't help
but report all functions and data declared by that file as coming from
binfmt_elf.c.

My position is that, rather than producing a more complicated pipeline
to handle this corner case, it is better to fix the compat_binfmt_elf.c
anomaly.

This patch does not deal with the two potentially problematic symbols
defined by compat_binfmt_elf.c

NOTE2:
The current implementation does not offer a solution for out-of-tree modules.
My stance is that these modules fall outside the scope, but I welcome any
comments or feedback regarding this matter.

Changes from v1:
* Integrated changes requested by Masami to exclude symbols with prefixes
  "_cfi" and "_pfx".
* Introduced a small framework to handle patterns that need to be excluded
  from the alias production.
* Excluded other symbols using the framework.
* Introduced the ability to discriminate between text and data symbols.
* Added two new config symbols in this version:
  CONFIG_KALLSYMS_ALIAS_DATA, which allows aliases for data, and
  CONFIG_KALLSYMS_ALIAS_DATA_ALL, which excludes all filters and provides
  an alias for each duplicated symbol.

https://lore.kernel.org/all/20230711151925.1092080-1-alessandro.carmin...@gmail.com/

Changes from v2:
* Alias tags are created by querying DWARF information from the vmlinux.
* The filename + line number is normalized and appended to the original
  name.
* The tag begins with '@' to indicate the symbol source.
* Not a change, but worth mentioning, since the alias is added to the
  existing list, the old duplicated name is preserved, and the livepatch
  way of dealing with duplicates is maintained.
* Acknowledging the existence of scenarios where inlined functions
  declared in header files may result in multiple copies due to compiler
  behavior, though it is not actionable as it does not pose an operational
  issue.
* Highlighting a single exception where the same name refers to different
  functions: the case of "compat_binfmt_elf.c," which directly includes
  "binfmt_elf.c" producing identical function copies in two separate
  modules.

https://lore.kernel.org/all/20230714150326.1152359-1-alessandro.carmin...@gmail.com/

Changes from v3:
* kas_alias was rewritten in Python to create a more concise and
  maintainable codebase.
* The previous automation process used by kas_alias to locate the vmlinux
  and the addr2line has been replaced with an explicit command-line switch
  for specifying these requirements.
* addr2line has been added into the main Makefile.
* A new command-line switch has been introduced, enabling users to extend
  the alias to global data names.

https://lore.kernel.org/all/20230828080423.3539686-1-alessandro.carmin...@gmail.com/

Changes from v4:
* Fixed the O= build issue
* The tool halts execution upon encountering major issues, thereby ensuring
  the pipeline is interrupted.
* Added a cmdline option to specify the source directory.
* Minor code adjustments.
* 

Re: [PATCH v2 3/3] dt-bindings: usb: fsa4480: Add compatible for OCP96011

2023-10-24 Thread Rob Herring


On Fri, 20 Oct 2023 11:33:20 +0200, Luca Weiss wrote:
> The Orient-Chip OCP96011 is generally compatible with the FSA4480, add a
> compatible for it with the fallback on fsa4480.
> 
> However the AUX/SBU connections are expected to be swapped compared to
> FSA4480, so document this in the data-lanes description.
> 
> Signed-off-by: Luca Weiss 
> ---
>  Documentation/devicetree/bindings/usb/fcs,fsa4480.yaml | 18 
> ++
>  1 file changed, 14 insertions(+), 4 deletions(-)
> 

Reviewed-by: Rob Herring 



Re: [PATCH v2 1/3] dt-bindings: usb: fsa4480: Add data-lanes property to endpoint

2023-10-24 Thread Rob Herring


On Fri, 20 Oct 2023 11:33:18 +0200, Luca Weiss wrote:
> Allow specifying data-lanes to reverse the muxing orientation between
> AUX+/- and SBU1/2 where necessary by the hardware design.
> 
> In the mux there's a switch that needs to be controlled from the OS, and
> it either connects AUX+ -> SBU1 and AUX- -> SBU2, or the reverse: AUX+
> -> SBU2 and AUX- -> SBU1, depending on the orientation of how the USB-C
> connector is plugged in.
> 
> With this data-lanes property we can now specify that AUX+ and AUX-
> connections are swapped between the SoC and the mux, therefore the OS
> needs to consider this and reverse the direction of this switch in the
> mux.
> 
> ___  ___
>   |  | |
>   |-- HP   --| |
>   |-- MIC  --| |or
> SoC   |  | MUX |-- SBU1 --->  To the USB-C
> Codec |-- AUX+ --| |-- SBU2 --->  connected
>   |-- AUX- --| |
> __|  |_|
> 
> (thanks to Neil Armstrong for this ASCII art)
> 
> Signed-off-by: Luca Weiss 
> ---
>  .../devicetree/bindings/usb/fcs,fsa4480.yaml   | 29 
> +-
>  1 file changed, 28 insertions(+), 1 deletion(-)
> 

Reviewed-by: Rob Herring 



[PATCH] eventfs: Fix typo in eventfs_inode union comment

2023-10-24 Thread Steven Rostedt
From: "Steven Rostedt (Google)" 

It's eventfs_inode not eventfs_indoe. There's no deer involved!

Fixes: 5790b1fb3d672 ("eventfs: Remove eventfs_file and just use eventfs_inode")
Signed-off-by: Steven Rostedt (Google) 
---
 fs/tracefs/internal.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/tracefs/internal.h b/fs/tracefs/internal.h
index 298d3ecaf621..64fde9490f52 100644
--- a/fs/tracefs/internal.h
+++ b/fs/tracefs/internal.h
@@ -37,7 +37,7 @@ struct eventfs_inode {
/*
 * Union - used for deletion
 * @del_list:   list of eventfs_inode to delete
-* @rcu:eventfs_indoe to delete in RCU
+* @rcu:eventfs_inode to delete in RCU
 * @is_freed:   node is freed if one of the above is set
 */
union {
-- 
2.42.0




Re: [PATCH] locking/atomic: sh: Use generic_cmpxchg_local for arch_cmpxchg_local()

2023-10-24 Thread John Paul Adrian Glaubitz
On Tue, 2023-10-24 at 23:52 +0900, Masami Hiramatsu (Google) wrote:
> From: Masami Hiramatsu (Google) 
> 
> Use generic_cmpxchg_local() for arch_cmpxchg_local() implementation
> in SH architecture because it does not implement arch_cmpxchg_local().
> 
> Reported-by: kernel test robot 
> Closes: 
> https://lore.kernel.org/oe-kbuild-all/202310241310.ir5uukog-...@intel.com/
> Signed-off-by: Masami Hiramatsu (Google) 
> ---
>  arch/sh/include/asm/cmpxchg.h |2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/arch/sh/include/asm/cmpxchg.h b/arch/sh/include/asm/cmpxchg.h
> index 288f6f38d98f..e920e61fb817 100644
> --- a/arch/sh/include/asm/cmpxchg.h
> +++ b/arch/sh/include/asm/cmpxchg.h
> @@ -71,4 +71,6 @@ static inline unsigned long __cmpxchg(volatile void * ptr, 
> unsigned long old,
>   (unsigned long)_n_, sizeof(*(ptr))); \
>})
>  
> +#include 
> +
>  #endif /* __ASM_SH_CMPXCHG_H */

Reviewed-by: John Paul Adrian Glaubitz 

-- 
 .''`.  John Paul Adrian Glaubitz
: :' :  Debian Developer
`. `'   Physicist
  `-GPG: 62FF 8A75 84E0 2956 9546  0006 7426 3B37 F5B5 F913



Re: [PATCH] dt-bindings: Drop kernel copy of common reserved-memory bindings

2023-10-24 Thread Rob Herring


On Fri, 13 Oct 2023 15:08:49 -0500, Rob Herring wrote:
> The common reserved-memory bindings have recently been copied from the
> kernel tree into dtschema. The preference is to host common, stable
> bindings in dtschema. As reserved-memory is documented in the DT Spec,
> it meets the criteria.
> 
> The v2023.09 version of dtschema is what contains the reserved-memory
> schemas we depend on, so bump the minimum version to that. Otherwise,
> references to these schemas will generate errors.
> 
> Signed-off-by: Rob Herring 
> ---
>  Documentation/devicetree/bindings/Makefile|   2 +-
>  .../remoteproc/renesas,rcar-rproc.yaml|   2 +-
>  .../bindings/reserved-memory/framebuffer.yaml |  52 -
>  .../reserved-memory/memory-region.yaml|  40 
>  .../reserved-memory/reserved-memory.txt   |   2 +-
>  .../reserved-memory/reserved-memory.yaml  | 181 --
>  .../reserved-memory/shared-dma-pool.yaml  |  97 --
>  .../bindings/sound/mediatek,mt8188-afe.yaml   |   2 +-
>  8 files changed, 4 insertions(+), 374 deletions(-)
>  delete mode 100644 
> Documentation/devicetree/bindings/reserved-memory/framebuffer.yaml
>  delete mode 100644 
> Documentation/devicetree/bindings/reserved-memory/memory-region.yaml
>  delete mode 100644 
> Documentation/devicetree/bindings/reserved-memory/reserved-memory.yaml
>  delete mode 100644 
> Documentation/devicetree/bindings/reserved-memory/shared-dma-pool.yaml
> 

Applied, thanks!



[PATCH] eventfs: Fix WARN_ON() in create_file_dentry()

2023-10-24 Thread Steven Rostedt
From: "Steven Rostedt (Google)" 

As the comment right above a WARN_ON() in create_file_dentry() states:

  * Note, with the mutex held, the e_dentry cannot have content
  * and the ei->is_freed be true at the same time.

But the WARN_ON() only has:

  WARN_ON_ONCE(ei->is_freed);

What it should check, to match the comment, is:

  dentry = *e_dentry;
  WARN_ON_ONCE(dentry && ei->is_freed);

Also in that case, set dentry to NULL (although it should never happen).
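
The corrected check can be illustrated in user space. This is a sketch under
stated assumptions, not the tracefs code: struct ei_sketch, warn_on_sketch()
and pick_dentry_sketch() are hypothetical names, the dentry is modelled as a
plain pointer, and the "once" behaviour of WARN_ON_ONCE() is not reproduced.
What it keeps is the property the fix relies on: the warning helper returns
the condition, so the caller can both warn and recover.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the relevant part of struct eventfs_inode. */
struct ei_sketch {
	bool is_freed;
};

static int warn_count; /* how many times the sketch has warned */

/* Like WARN_ON(): report if cond is true and hand cond back to the caller. */
static bool warn_on_sketch(bool cond, const char *what)
{
	if (cond) {
		warn_count++;
		fprintf(stderr, "WARNING: %s\n", what);
	}
	return cond;
}

/* Only the combination "dentry present AND ei freed" is a bug. */
static void *pick_dentry_sketch(const struct ei_sketch *ei, void **e_dentry)
{
	void *dentry;

	assert(ei != NULL && e_dentry != NULL);

	dentry = *e_dentry;
	if (warn_on_sketch(dentry != NULL && ei->is_freed,
			   "dentry has content but ei is freed"))
		dentry = NULL;
	return dentry;
}
```

A freed ei with an empty dentry slot no longer warns; the old check would
have warned there even though the comment describes that state as valid.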

Fixes: 5790b1fb3d672 ("eventfs: Remove eventfs_file and just use eventfs_inode")
Signed-off-by: Steven Rostedt (Google) 
---
 fs/tracefs/event_inode.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/tracefs/event_inode.c b/fs/tracefs/event_inode.c
index 09ab93357957..4d2da7480e5f 100644
--- a/fs/tracefs/event_inode.c
+++ b/fs/tracefs/event_inode.c
@@ -264,8 +264,9 @@ create_file_dentry(struct eventfs_inode *ei, struct dentry **e_dentry,
 	 * Note, with the mutex held, the e_dentry cannot have content
 	 * and the ei->is_freed be true at the same time.
 	 */
-	WARN_ON_ONCE(ei->is_freed);
 	dentry = *e_dentry;
+	if (WARN_ON_ONCE(dentry && ei->is_freed))
+		dentry = NULL;
 	/* The lookup does not need to up the dentry refcount */
 	if (dentry && !lookup)
 		dget(dentry);
-- 
2.42.0
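
For readers unfamiliar with the idiom used in the fix: WARN_ON_ONCE()
evaluates to the truth value of its condition, so it can sit directly inside
an if (). The following is only a userspace sketch of that pattern. The macro
is a simplified stand-in for the kernel's, and struct model_inode and
pick_dentry() are hypothetical names that do not exist in
fs/tracefs/event_inode.c.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Simplified userspace stand-in for the kernel's WARN_ON_ONCE(): prints a
 * warning the first time the condition is true at this call site and always
 * evaluates to the truth value of the condition. Relies on the GNU
 * statement-expression extension (gcc/clang), as the kernel does.
 */
#define WARN_ON_ONCE(cond) ({						\
	static bool __warned;						\
	bool __ret = !!(cond);						\
	if (__ret && !__warned) {					\
		__warned = true;					\
		fprintf(stderr, "WARNING at %s:%d\n", __FILE__, __LINE__); \
	}								\
	__ret;								\
})

/* Hypothetical model of the one eventfs_inode field that matters here. */
struct model_inode {
	bool is_freed;
};

/*
 * Hypothetical helper mirroring the patched logic: a cached dentry on an
 * already-freed inode is the "impossible" state, so warn and fall back to
 * NULL instead of handing out the stale pointer.
 */
static void *pick_dentry(struct model_inode *ei, void **e_dentry)
{
	void *dentry = *e_dentry;

	if (WARN_ON_ONCE(dentry && ei->is_freed))
		dentry = NULL;
	return dentry;
}
```

A freed inode with no cached dentry does not warn in this model, which is the
behavioural difference from the old unconditional check on ei->is_freed.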




Re: [PATCH] locking/atomic: sh: Use generic_cmpxchg_local for arch_cmpxchg_local()

2023-10-24 Thread Mark Rutland
On Tue, Oct 24, 2023 at 11:52:54PM +0900, Masami Hiramatsu (Google) wrote:
> From: Masami Hiramatsu (Google) 
> 
> Use generic_cmpxchg_local() for arch_cmpxchg_local() implementation
> in SH architecture because it does not implement arch_cmpxchg_local().

I do not think this is correct.

The implementation in <asm-generic/cmpxchg-local.h> is UP-only (and it only
disables interrupts), whereas arch/sh can be built SMP. We should probably add
some guards into <asm-generic/cmpxchg-local.h> for that as we have in
<asm-generic/cmpxchg.h>.

I think the right thing to do here is to define arch_cmpxchg_local() in terms
of arch_cmpxchg(), i.e. at the bottom of arch/sh's <asm/cmpxchg.h> add:

#define arch_cmpxchg_local  arch_cmpxchg

Mark.

> 
> Reported-by: kernel test robot 
> Closes: https://lore.kernel.org/oe-kbuild-all/202310241310.ir5uukog-...@intel.com/
> Signed-off-by: Masami Hiramatsu (Google) 
> ---
>  arch/sh/include/asm/cmpxchg.h |2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/arch/sh/include/asm/cmpxchg.h b/arch/sh/include/asm/cmpxchg.h
> index 288f6f38d98f..e920e61fb817 100644
> --- a/arch/sh/include/asm/cmpxchg.h
> +++ b/arch/sh/include/asm/cmpxchg.h
> @@ -71,4 +71,6 @@ static inline unsigned long __cmpxchg(volatile void * ptr, unsigned long old,
>   (unsigned long)_n_, sizeof(*(ptr))); \
>})
>  
> +#include <asm-generic/cmpxchg-local.h>
> +
>  #endif /* __ASM_SH_CMPXCHG_H */
> 
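
The suggestion above works because a fully atomic cmpxchg is always a valid
implementation of the CPU-local variant, just possibly slower. As a rough
userspace illustration only, using C11 atomics instead of the kernel's
primitives (model_cmpxchg() and model_cmpxchg_local are hypothetical names,
not kernel APIs):

```c
#include <stdatomic.h>

/*
 * Hypothetical stand-in for arch_cmpxchg(): an SMP-safe compare-and-exchange
 * that returns the value found at *ptr. The store happened if and only if
 * the return value equals 'old'.
 */
static inline unsigned long model_cmpxchg(_Atomic unsigned long *ptr,
					  unsigned long old,
					  unsigned long newval)
{
	/* On failure, 'old' is overwritten with the value actually observed. */
	atomic_compare_exchange_strong(ptr, &old, newval);
	return old;
}

/*
 * The "local" variant only has to be atomic with respect to the current CPU,
 * so aliasing it to the stronger primitive is always correct.
 */
#define model_cmpxchg_local	model_cmpxchg
```

The opposite direction is what the review objects to: an implementation that
only disables interrupts protects against the local CPU, which is not enough
once another CPU can touch the same word.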



[PATCH] locking/atomic: sh: Use generic_cmpxchg_local for arch_cmpxchg_local()

2023-10-24 Thread Masami Hiramatsu (Google)
From: Masami Hiramatsu (Google) 

Use generic_cmpxchg_local() for arch_cmpxchg_local() implementation
in SH architecture because it does not implement arch_cmpxchg_local().

Reported-by: kernel test robot 
Closes: https://lore.kernel.org/oe-kbuild-all/202310241310.ir5uukog-...@intel.com/
Signed-off-by: Masami Hiramatsu (Google) 
---
 arch/sh/include/asm/cmpxchg.h |2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/sh/include/asm/cmpxchg.h b/arch/sh/include/asm/cmpxchg.h
index 288f6f38d98f..e920e61fb817 100644
--- a/arch/sh/include/asm/cmpxchg.h
+++ b/arch/sh/include/asm/cmpxchg.h
@@ -71,4 +71,6 @@ static inline unsigned long __cmpxchg(volatile void * ptr, unsigned long old,
(unsigned long)_n_, sizeof(*(ptr))); \
   })
 
+#include <asm-generic/cmpxchg-local.h>
+
 #endif /* __ASM_SH_CMPXCHG_H */




Re: [PATCH v1] lib,kprobes: using try_cmpxchg_local in objpool_push

2023-10-24 Thread Masami Hiramatsu (Google)
On Tue, 24 Oct 2023 09:57:17 +0800
"wuqiang.matt"  wrote:

> On 2023/10/24 09:01, Masami Hiramatsu (Google) wrote:
> > On Mon, 23 Oct 2023 11:43:04 -0400
> > Steven Rostedt  wrote:
> > 
> >> On Mon, 23 Oct 2023 19:24:52 +0800
> >> "wuqiang.matt"  wrote:
> >>
> >>> The objpool_push can only happen on local cpu node, so only the local
> >>> cpu can touch slot->tail and slot->last, which ensures the correctness
> >>> of using cmpxchg without lock prefix (using try_cmpxchg_local instead
> >>> of try_cmpxchg_acquire).
> >>>
> >>> Testing with IACA found the lock version of pop/push pair costs 16.46
> >>> cycles and local-push version costs 15.63 cycles. Kretprobe throughput
> >>> is improved to 1.019 times of the lock version for x86_64 systems.
> >>>
> >>> OS: Debian 10 X86_64, Linux 6.6rc6 with freelist
> >>> HW: XEON 8336C x 2, 64 cores/128 threads, DDR4 3200MT/s
> >>>
> >>>                  1T         2T         4T         8T        16T
> >>>    lock:   29909085   59865637  119692073  239750369  478005250
> >>>    local:  30297523   60532376  121147338  242598499  484620355
> >>>                 32T        48T        64T        96T       128T
> >>>    lock:  957553042 1435814086 1680872925 2043126796 2165424198
> >>>    local: 968526317 1454991286 1861053557 2059530343 2171732306
> >>>
> >>> Signed-off-by: wuqiang.matt 
> >>> ---
> >>>   lib/objpool.c | 2 +-
> >>>   1 file changed, 1 insertion(+), 1 deletion(-)
> >>>
> >>> diff --git a/lib/objpool.c b/lib/objpool.c
> >>> index ce0087f64400..a032701beccb 100644
> >>> --- a/lib/objpool.c
> >>> +++ b/lib/objpool.c
> >>> @@ -166,7 +166,7 @@ objpool_try_add_slot(void *obj, struct objpool_head *pool, int cpu)
> >>>   head = READ_ONCE(slot->head);
> >>>   /* fault caught: something must be wrong */
> >>>   WARN_ON_ONCE(tail - head > pool->nr_objs);
> >>> - } while (!try_cmpxchg_acquire(&slot->tail, &tail, tail + 1));
> >>> + } while (!try_cmpxchg_local(&slot->tail, &tail, tail + 1));
> >>>   
> >>>   /* now the tail position is reserved for the given obj */
> >>>   WRITE_ONCE(slot->entries[tail & slot->mask], obj);
> >>
> >> I'm good with the change, but I don't like how "cpu" is passed to this
> >> function. It currently is only used in one location, which does:
> >>
> >>rc = objpool_try_add_slot(obj, pool, raw_smp_processor_id());
> >>
> >> Which makes this change fine. But there's nothing here to prevent someone
> >> from passing another CPU to that function for some reason.
> >>
> >> If we are to make that change, I would be much more comfortable with
> >> removing "int cpu" as a parameter to objpool_try_add_slot() and adding:
> >>
> >>int cpu = raw_smp_processor_id();
> >>
> >> Which now shows that this function *only* deals with the current CPU.
> > 
> > Oh indeed. It used to search all CPUs to push the object, but
> > I asked him to stop that because there should be enough space to
> > push it in the local ring. This is a remnant of that time.
> 
> Yes, good catch. Thanks for the explanation.
> 
> > Wuqiang, can you make another patch to fix it?
> 
> I'm thinking of removing the inline function objpool_try_add_slot and merging
> its functionality into objpool_push, like the following:

Looks good.

> 
> 
> /* reclaim an object to object pool */
> int objpool_push(void *obj, struct objpool_head *pool)
> {
>   struct objpool_slot *slot;
>   uint32_t head, tail;
>   unsigned long flags;
> 
>   /* disable local irq to avoid preemption & interruption */
>   raw_local_irq_save(flags);
> 
>   slot = pool->cpu_slots[raw_smp_processor_id()];
> 
>   /* loading tail and head as a local snapshot, tail first */
>   tail = READ_ONCE(slot->tail);
> 
>   do {
>   head = READ_ONCE(slot->head);
>   /* fault caught: something must be wrong */
>   WARN_ON_ONCE(tail - head > pool->nr_objs);
>   } while (!try_cmpxchg_local(&slot->tail, &tail, tail + 1));
> 
>   /* now the tail position is reserved for the given obj */
>   WRITE_ONCE(slot->entries[tail & slot->mask], obj);
>   /* update sequence to make this obj available for pop() */
>   smp_store_release(&slot->last, tail + 1);
> 
>   raw_local_irq_restore(flags);
> 
>   return 0;
> }
> 
> I'll prepare a new patch for this improvement.

Thanks!

> 
> > Thank you,
> > 
> >>
> >> -- Steve
> > 
> 
> Thanks for your time,
> wuqiang


-- 
Masami Hiramatsu (Google)
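
The push sequence discussed in this thread (reserve a position by advancing
tail, write the entry, then publish it through last with release semantics)
can be modelled in userspace roughly as below. This is only a sketch under
several assumptions: struct model_slot, model_slot_init() and model_push()
are hypothetical names, the ring size is fixed at compile time, a relaxed C11
compare-and-exchange stands in for try_cmpxchg_local() (C11 has no CPU-local
variant), and nothing models raw_local_irq_save(). Unlike the kernel code,
which only warns, the model refuses to push into a full ring.

```c
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

#define MODEL_RING_SIZE 4	/* hypothetical capacity, must be a power of two */

/* Hypothetical model of one per-CPU slot. */
struct model_slot {
	_Atomic uint32_t head;	/* consumer position, advanced by pop */
	_Atomic uint32_t tail;	/* next position to be reserved by push */
	_Atomic uint32_t last;	/* entries below this index are visible to pop */
	uint32_t mask;		/* MODEL_RING_SIZE - 1 */
	void *entries[MODEL_RING_SIZE];
};

static void model_slot_init(struct model_slot *slot)
{
	memset(slot->entries, 0, sizeof(slot->entries));
	atomic_init(&slot->head, 0);
	atomic_init(&slot->tail, 0);
	atomic_init(&slot->last, 0);
	slot->mask = MODEL_RING_SIZE - 1;
}

/* Returns 0 on success, -1 if the ring is full (model-only behaviour). */
static int model_push(struct model_slot *slot, void *obj)
{
	/* Load tail first; the CAS below refreshes it on failure. */
	uint32_t tail = atomic_load_explicit(&slot->tail, memory_order_relaxed);
	uint32_t head;

	do {
		head = atomic_load_explicit(&slot->head, memory_order_relaxed);
		if (tail - head >= MODEL_RING_SIZE)
			return -1;
	} while (!atomic_compare_exchange_weak_explicit(&slot->tail, &tail,
							tail + 1,
							memory_order_relaxed,
							memory_order_relaxed));

	/* The tail position is now reserved for this object. */
	slot->entries[tail & slot->mask] = obj;
	/* Publish: a consumer that acquires 'last' also sees the entry. */
	atomic_store_explicit(&slot->last, tail + 1, memory_order_release);
	return 0;
}
```

Because only the owning CPU ever advances tail and last in the real code, the
compare-and-exchange there needs no cross-CPU atomicity, which is the point of
switching to try_cmpxchg_local().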