Re: [PATCH v1 1/2] powerpc/mmiotrace: Add MMIO Tracing tool for PowerPC

2024-09-04 Thread 虞陆铭
Hi,

the patch set and mmio trace capability helped us to track down to the root 
cause of  a mysterious EEH only on powernv platform
 triggered by a testpmd dpdk user mode driver (UIO) by comparing the mmio trace 
from UIO and native kernel driver from the same nic. 
And the problem is solved by switch to VFIO driver as backend.

the problem is the mmiotrace could not capture user mode mmio which I think it 
could be limitation by its current design.

so, I think we could tidy up the patch and do some feature enhancement then we 
can submit again for the debug value of the patch.
As it really is painful when EEH A/B record doesn't point to a clear root cause 
or suffient details that can help us to fix  driver code for the unaligned
MMIO for a 3rd party nic vendor on powernv platform.
 
Cheers,
Luming
-- Original --
From:  "Yang Jialong 杨佳龙";
Date:  Fri, Jun 28, 2024 04:21 PM
To:  "mpe"; "npiggin"; 
"christophe.leroy"; "Naveen N. 
Rao"; 
Cc:  "虞陆铭"; "shenghui.qu"; 
"linuxppc-dev"; 
"linux-kernel"; "Steven 
Rostedt"; "Masami Hiramatsu"; "Karol 
Herbst"; "Pekka Paalanen"; 
"nouveau"; 
Subject:  Re: [PATCH v1 1/2] powerpc/mmiotrace: Add MMIO Tracing tool for 
PowerPC

 


在 2024/6/28 15:02, Michael Ellerman 写道:
> Jialong Yang  writes:
>> mmiotrace is a useful tool to trace MMIO accesses. Nowadays, it only
>> supported on x86 and x86_64 platforms.
> I've never used mmiotrace, and don't know it well.
>
> I'm not necessarily opposed to merging it, but AFAIK it was mostly used
> for reverse engineering proprietary drivers, where the driver itself
> couldn't be easily instrumented. Is that what you're using it for?

Yes. Just like you think. We have used it for network stack debug in 
ppc64le.


>
> For drivers where we have the source wouldn't it be easier to just use
> tracepoints in the MMIO accessors?


Tracepoints need pre-defined. And in some big driver, it's not easy to 
overwrite

all points where access registers in io area. And tracepoint is C 
function level filter.

mmiotrace is similar to set tracepoints in writel/readl... But it can do 
deeperly.

mmiotrace is a asm level filter tool. It doesn't care what have done in 
C level. It will

only find what have done by asm, such as stw(store word)/lw(load word),  
just like standing

in the view of device.


>
> Is it still in-use/maintained on the x86 side?


Here is some core file patches number in x86:

|  | mmio_mod.c | kmmio.c | pf_in.* | testmmiotrace.c |
|--++-+-+-|
| 2022 |   1 |3 | |  |
| 2021 |   2 |1 |   | |
| 2020 |   4 |4 | |   1 |
| 2019 |   2 |1 |1 |   4 |
| 2018 |  |2 |   |  |
| 2017 |   2 |2 | |   1 |
| 2016 |   1 |2 |1 ||
| 2014 |  |1 |   |  |
| 2013 |   1 |   |   |  |
| 2012 |   1 |   | |  |
| 2011 |   3 |   |1 | |
| 2010 |   1 |3 |2 |   1 |
| 2009 |   4 |  19 | |   3 |
| 2008 | 13 |5 |2 |   3 |

>
>> Here is a support for powerpc.
>> The manual is located at Documentation/trace/mmiotrace.rst which means
>> I have not changed user API. People will be easy to use it.
>> Almost all files are copied from x86/mm, there are only some
>> differences from hardware and architectures software.
>>
>> LINK: 
>> https://lore.kernel.org/lkml/20080127195536.50809...@daedalus.pq.iki.fi/
>>
>> Signed-off-by: Jialong Yang 
>> ---
>>   arch/powerpc/Kconfig.debug   |   3 +
>>   arch/powerpc/mm/Makefile |   1 +
>>   arch/powerpc/mm/kmmio.c  | 649 +++
>>   arch/powerpc/mm/mmio-mod.c   | 414 
>>   arch/powerpc/mm/mmiotrace_arch.c | 149 +++
>>   arch/powerpc/mm/mmiotrace_arch.h |  25 ++
>>   arch/powerpc/mm/pf_in.c  | 185 +
>>   arch/powerpc/mm/pf_in.h  |  33 ++
>>   8 files changed, 1459 insertions(+)
>
> At a glance most of that code could be shared between arches. I don't
> think I can merge that as-is, without some attempt to split the generic
> parts out.


Right.

I just copy them from arch/x86/mm. There are many code not arch specific.


> cheers
>

Re: [PATCH v1 1/2] powerpc/mmiotrace: Add MMIO Tracing tool for PowerPC

2024-06-28 Thread Yang Jialong 杨佳龙



在 2024/6/28 15:02, Michael Ellerman 写道:

Jialong Yang  writes:

mmiotrace is a useful tool to trace MMIO accesses. Nowadays, it only
supported on x86 and x86_64 platforms.

I've never used mmiotrace, and don't know it well.

I'm not necessarily opposed to merging it, but AFAIK it was mostly used
for reverse engineering proprietary drivers, where the driver itself
couldn't be easily instrumented. Is that what you're using it for?


Yes. Just like you think. We have used it for network stack debug in 
ppc64le.





For drivers where we have the source wouldn't it be easier to just use
tracepoints in the MMIO accessors?



Tracepoints need pre-defined. And in some big driver, it's not easy to 
overwrite


all points where access registers in io area. And tracepoint is C 
function level filter.


mmiotrace is similar to set tracepoints in writel/readl... But it can do 
deeperly.


mmiotrace is a asm level filter tool. It doesn't care what have done in 
C level. It will


only find what have done by asm, such as stw(store word)/lw(load word),  
just like standing


in the view of device.




Is it still in-use/maintained on the x86 side?



Here is some core file patches number in x86:

|  | mmio_mod.c | kmmio.c | pf_in.* | testmmiotrace.c |
|--++-+-+-|
| 2022 |           1 |        3 | |      |
| 2021 |           2 |            1 |               |     |
| 2020 |           4 |            4 | |   1 |
| 2019 |           2 |            1 |            1 |   4 |
| 2018 |              |            2 |               |      |
| 2017 |           2 |            2 | |   1 |
| 2016 |           1 |            2 |            1 |    |
| 2014 |          |            1 |               |      |
| 2013 |           1 |               |           |      |
| 2012 |           1 |           | |      |
| 2011 |           3 |           |            1 |     |
| 2010 |           1 |            3 |            2 |   1 |
| 2009 |           4 |          19 | |   3 |
| 2008 |         13 |            5 |            2 |   3 |




Here is a support for powerpc.
The manual is located at Documentation/trace/mmiotrace.rst which means
I have not changed user API. People will be easy to use it.
Almost all files are copied from x86/mm, there are only some
differences from hardware and architectures software.

LINK: https://lore.kernel.org/lkml/20080127195536.50809...@daedalus.pq.iki.fi/

Signed-off-by: Jialong Yang 
---
  arch/powerpc/Kconfig.debug   |   3 +
  arch/powerpc/mm/Makefile |   1 +
  arch/powerpc/mm/kmmio.c  | 649 +++
  arch/powerpc/mm/mmio-mod.c   | 414 
  arch/powerpc/mm/mmiotrace_arch.c | 149 +++
  arch/powerpc/mm/mmiotrace_arch.h |  25 ++
  arch/powerpc/mm/pf_in.c  | 185 +
  arch/powerpc/mm/pf_in.h  |  33 ++
  8 files changed, 1459 insertions(+)
   
At a glance most of that code could be shared between arches. I don't

think I can merge that as-is, without some attempt to split the generic
parts out.



Right.

I just copy them from arch/x86/mm. There are many code not arch specific.



cheers





Re: [PATCH v1 1/2] powerpc/mmiotrace: Add MMIO Tracing tool for PowerPC

2024-06-28 Thread Michael Ellerman
Jialong Yang  writes:
> mmiotrace is a useful tool to trace MMIO accesses. Nowadays, it only
> supported on x86 and x86_64 platforms.

I've never used mmiotrace, and don't know it well.

I'm not necessarily opposed to merging it, but AFAIK it was mostly used
for reverse engineering proprietary drivers, where the driver itself
couldn't be easily instrumented. Is that what you're using it for?

For drivers where we have the source wouldn't it be easier to just use
tracepoints in the MMIO accessors?

Is it still in-use/maintained on the x86 side?

> Here is a support for powerpc.
> The manual is located at Documentation/trace/mmiotrace.rst which means
> I have not changed user API. People will be easy to use it.
> Almost all files are copied from x86/mm, there are only some
> differences from hardware and architectures software.
>
> LINK: https://lore.kernel.org/lkml/20080127195536.50809...@daedalus.pq.iki.fi/
>
> Signed-off-by: Jialong Yang 
> ---
>  arch/powerpc/Kconfig.debug   |   3 +
>  arch/powerpc/mm/Makefile |   1 +
>  arch/powerpc/mm/kmmio.c  | 649 +++
>  arch/powerpc/mm/mmio-mod.c   | 414 
>  arch/powerpc/mm/mmiotrace_arch.c | 149 +++
>  arch/powerpc/mm/mmiotrace_arch.h |  25 ++
>  arch/powerpc/mm/pf_in.c  | 185 +
>  arch/powerpc/mm/pf_in.h  |  33 ++
>  8 files changed, 1459 insertions(+)
  
At a glance most of that code could be shared between arches. I don't
think I can merge that as-is, without some attempt to split the generic
parts out.

cheers


Re: [PATCH v1 1/2] powerpc/mmiotrace: Add MMIO Tracing tool for PowerPC

2024-06-27 Thread kernel test robot
Hi Jialong,

kernel test robot noticed the following build errors:

[auto build test ERROR on powerpc/next]
[also build test ERROR on powerpc/fixes linus/master v6.10-rc5 next-20240626]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:
https://github.com/intel-lab-lkp/linux/commits/Jialong-Yang/powerpc-mmiotrace-bind-ioremap-and-page-fault-to-active-mmiotrace/20240624-163027
base:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
patch link:
https://lore.kernel.org/r/2bf90acf7d29641ba6643934ff8dbba897dbd2d9.1718873074.git.jialong.yang%40shingroup.cn
patch subject: [PATCH v1 1/2] powerpc/mmiotrace: Add MMIO Tracing tool for 
PowerPC
config: powerpc-randconfig-r113-20240627 
(https://download.01.org/0day-ci/archive/20240627/202406271946.a6jwffay-...@intel.com/config)
compiler: powerpc-linux-gcc (GCC) 13.2.0
reproduce: 
(https://download.01.org/0day-ci/archive/20240627/202406271946.a6jwffay-...@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot 
| Closes: 
https://lore.kernel.org/oe-kbuild-all/202406271946.a6jwffay-...@intel.com/

All error/warnings (new ones prefixed by >>):

   arch/powerpc/mm/kmmio.c: In function 'pmd_mkinvalid':
>> arch/powerpc/mm/kmmio.c:140:16: error: implicit declaration of function 
>> '__pmd_raw' [-Werror=implicit-function-declaration]
 140 | return __pmd_raw(pmd_raw(pmd) & ~cpu_to_be64(_PAGE_PRESENT | 
_PAGE_INVALID));
 |^
>> arch/powerpc/mm/kmmio.c:140:26: error: implicit declaration of function 
>> 'pmd_raw'; did you mean 'pmd_bad'? [-Werror=implicit-function-declaration]
 140 | return __pmd_raw(pmd_raw(pmd) & ~cpu_to_be64(_PAGE_PRESENT | 
_PAGE_INVALID));
 |  ^~~
 |  pmd_bad
   In file included from include/linux/byteorder/big_endian.h:5,
from arch/powerpc/include/uapi/asm/byteorder.h:14,
from include/asm-generic/bitops/le.h:6,
from arch/powerpc/include/asm/bitops.h:325,
from include/linux/bitops.h:63,
from include/linux/thread_info.h:27,
from arch/powerpc/include/asm/ptrace.h:342,
from arch/powerpc/include/asm/hw_irq.h:12,
from arch/powerpc/include/asm/irqflags.h:12,
from include/linux/irqflags.h:18,
from include/asm-generic/cmpxchg-local.h:6,
from arch/powerpc/include/asm/cmpxchg.h:755,
from arch/powerpc/include/asm/atomic.h:11,
from include/linux/atomic.h:7,
from include/linux/rcupdate.h:25,
from include/linux/rculist.h:11,
from arch/powerpc/mm/kmmio.c:10:
>> arch/powerpc/mm/kmmio.c:140:70: error: '_PAGE_INVALID' undeclared (first use 
>> in this function); did you mean 'RPM_INVALID'?
 140 | return __pmd_raw(pmd_raw(pmd) & ~cpu_to_be64(_PAGE_PRESENT | 
_PAGE_INVALID));
 |  
^
   include/uapi/linux/byteorder/big_endian.h:38:51: note: in definition of 
macro '__cpu_to_be64'
  38 | #define __cpu_to_be64(x) ((__force __be64)(__u64)(x))
 |   ^
   arch/powerpc/mm/kmmio.c:140:42: note: in expansion of macro 'cpu_to_be64'
 140 | return __pmd_raw(pmd_raw(pmd) & ~cpu_to_be64(_PAGE_PRESENT | 
_PAGE_INVALID));
 |  ^~~
   arch/powerpc/mm/kmmio.c:140:70: note: each undeclared identifier is reported 
only once for each function it appears in
 140 | return __pmd_raw(pmd_raw(pmd) & ~cpu_to_be64(_PAGE_PRESENT | 
_PAGE_INVALID));
 |  
^
   include/uapi/linux/byteorder/big_endian.h:38:51: note: in definition of 
macro '__cpu_to_be64'
  38 | #define __cpu_to_be64(x) ((__force __be64)(__u64)(x))
 |   ^
   arch/powerpc/mm/kmmio.c:140:42: note: in expansion of macro 'cpu_to_be64'
 140 | return __pmd_raw(pmd_raw(pmd) & ~cpu_to_be64(_PAGE_PRESENT | 
_PAGE_INVALID));
 |  ^~~
   arch/powerpc/mm/kmmio.c: In function 'kmmio_handler':
>> arch/powerpc/mm/kmmio.c:318:32: error: 'struct pt_regs' has 

[PATCH v1 1/2] powerpc/mmiotrace: Add MMIO Tracing tool for PowerPC

2024-06-20 Thread Jialong Yang
mmiotrace is a useful tool to trace MMIO accesses. Nowadays, it only
supported on x86 and x86_64 platforms. Here is a support for powerpc.
The manual is located at Documentation/trace/mmiotrace.rst which means
I have not changed user API. People will be easy to use it.
Almost all files are copied from x86/mm, there are only some
differences from hardware and architectures software.

LINK: https://lore.kernel.org/lkml/20080127195536.50809...@daedalus.pq.iki.fi/

Signed-off-by: Jialong Yang 
---
 arch/powerpc/Kconfig.debug   |   3 +
 arch/powerpc/mm/Makefile |   1 +
 arch/powerpc/mm/kmmio.c  | 649 +++
 arch/powerpc/mm/mmio-mod.c   | 414 
 arch/powerpc/mm/mmiotrace_arch.c | 149 +++
 arch/powerpc/mm/mmiotrace_arch.h |  25 ++
 arch/powerpc/mm/pf_in.c  | 185 +
 arch/powerpc/mm/pf_in.h  |  33 ++
 8 files changed, 1459 insertions(+)
 create mode 100644 arch/powerpc/mm/kmmio.c
 create mode 100644 arch/powerpc/mm/mmio-mod.c
 create mode 100644 arch/powerpc/mm/mmiotrace_arch.c
 create mode 100644 arch/powerpc/mm/mmiotrace_arch.h
 create mode 100644 arch/powerpc/mm/pf_in.c
 create mode 100644 arch/powerpc/mm/pf_in.h

diff --git a/arch/powerpc/Kconfig.debug b/arch/powerpc/Kconfig.debug
index 8c80b154e814..8a69188aa75a 100644
--- a/arch/powerpc/Kconfig.debug
+++ b/arch/powerpc/Kconfig.debug
@@ -1,5 +1,8 @@
 # SPDX-License-Identifier: GPL-2.0
 
+config HAVE_MMIOTRACE_SUPPORT
+   def_bool y
+
 config PPC_DISABLE_WERROR
bool "Don't build arch/powerpc code with -Werror"
help
diff --git a/arch/powerpc/mm/Makefile b/arch/powerpc/mm/Makefile
index 0fe2f085c05a..cb92049f1239 100644
--- a/arch/powerpc/mm/Makefile
+++ b/arch/powerpc/mm/Makefile
@@ -17,3 +17,4 @@ obj-$(CONFIG_NOT_COHERENT_CACHE) += dma-noncoherent.o
 obj-$(CONFIG_PPC_COPRO_BASE)   += copro_fault.o
 obj-$(CONFIG_PTDUMP_CORE)  += ptdump/
 obj-$(CONFIG_KASAN)+= kasan/
+obj-$(CONFIG_MMIOTRACE) += kmmio.o mmio-mod.o pf_in.o mmiotrace_arch.o
diff --git a/arch/powerpc/mm/kmmio.c b/arch/powerpc/mm/kmmio.c
new file mode 100644
index ..f4374e721b37
--- /dev/null
+++ b/arch/powerpc/mm/kmmio.c
@@ -0,0 +1,649 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Support for MMIO probes.
+ * Derived from arch/x86/mm/kmmio.c:
+ *   Copyright (C) 2024 Jialong Yang (jialong.y...@shingroup.cn)
+ */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "mmiotrace_arch.h"
+
+typedef unsigned long  pteval_t;
+typedef unsigned long  pmdval_t;
+
+#define KMMIO_PAGE_HASH_BITS 4
+#define KMMIO_PAGE_TABLE_SIZE (1 << KMMIO_PAGE_HASH_BITS)
+
+struct kmmio_fault_page {
+   struct list_head list;
+   struct kmmio_fault_page *release_next;
+   unsigned long addr; /* the requested address */
+   pteval_t old_presence; /* page presence prior to arming */
+   bool armed;
+
+   /*
+* Number of times this page has been registered as a part
+* of a probe. If zero, page is disarmed and this may be freed.
+* Used only by writers (RCU) and post_kmmio_handler().
+* Protected by kmmio_lock, when linked into kmmio_page_table.
+*/
+   int count;
+
+   bool scheduled_for_release;
+};
+
+struct kmmio_delayed_release {
+   struct rcu_head rcu;
+   struct kmmio_fault_page *release_list;
+};
+
+struct kmmio_context {
+   struct kmmio_fault_page *fpage;
+   struct kmmio_probe *probe;
+   unsigned long saved_flags;
+   unsigned long saved_softe;
+   unsigned long addr;
+   int active;
+};
+
+/*
+ * The kmmio_lock is taken in int3 context, which is treated as NMI context.
+ * This causes lockdep to complain about it bein in both NMI and normal
+ * context. Hide it from lockdep, as it should not have any other locks
+ * taken under it, and this is only enabled for debugging mmio anyway.
+ */
+static arch_spinlock_t kmmio_lock = __ARCH_SPIN_LOCK_UNLOCKED;
+
+/* Protected by kmmio_lock */
+unsigned int kmmio_count;
+
+/* Read-protected by RCU, write-protected by kmmio_lock. */
+static struct list_head kmmio_page_table[KMMIO_PAGE_TABLE_SIZE];
+static LIST_HEAD(kmmio_probes);
+
+static struct list_head *kmmio_page_list(unsigned long addr)
+{
+   unsigned int l;
+   pte_t *pte = lookup_address(addr, &l);
+
+   if (!pte)
+   return NULL;
+   addr &= page_level_mask(l);
+
+   return &kmmio_page_table[hash_long(addr, KMMIO_PAGE_HASH_BITS)];
+}
+
+/* Accessed per-cpu */
+static DEFINE_PER_CPU(struct kmmio_context, kmmio_ctx);
+
+/*
+ * this is basically a dynamic stabbing problem:
+ * Could use the existing prio tree code or
+ * Possible better implementations:
+ * The Interval Skip List: A Data Structure for Finding All Intervals That
+ * Over