Re: [PATCH 08/20] sh: boot: Add proper forward declarations

2024-03-04 Thread Yoshinori Sato
On Sat, 02 Mar 2024 06:02:22 +0900,
Geert Uytterhoeven wrote:
> 
> arch/sh/boot/compressed/cache.c:2:5: warning: no previous prototype for 
> ‘cache_control’ [-Wmissing-prototypes]
> arch/sh/boot/compressed/misc.c:115:6: warning: no previous prototype for 
> ‘ftrace_stub’ [-Wmissing-prototypes]
> arch/sh/boot/compressed/misc.c:118:6: warning: no previous prototype for 
> ‘arch_ftrace_ops_list_func’ [-Wmissing-prototypes]
> arch/sh/boot/compressed/misc.c:128:6: warning: no previous prototype for 
> ‘decompress_kernel’ [-Wmissing-prototypes]
> 
> Signed-off-by: Geert Uytterhoeven 
> ---
>  arch/sh/boot/compressed/cache.c |  3 +++
>  arch/sh/boot/compressed/cache.h | 10 ++
>  arch/sh/boot/compressed/misc.c  |  8 +++-
>  arch/sh/boot/compressed/misc.h  |  9 +
>  4 files changed, 25 insertions(+), 5 deletions(-)
>  create mode 100644 arch/sh/boot/compressed/cache.h
>  create mode 100644 arch/sh/boot/compressed/misc.h
> 
> diff --git a/arch/sh/boot/compressed/cache.c b/arch/sh/boot/compressed/cache.c
> index 31e04ff4841ed084..95c1e73ccbb7e011 100644
> --- a/arch/sh/boot/compressed/cache.c
> +++ b/arch/sh/boot/compressed/cache.c
> @@ -1,4 +1,7 @@
>  // SPDX-License-Identifier: GPL-2.0
> +
> +#include "cache.h"
> +
>  int cache_control(unsigned int command)
>  {
>   volatile unsigned int *p = (volatile unsigned int *) 0x8000;
> diff --git a/arch/sh/boot/compressed/cache.h b/arch/sh/boot/compressed/cache.h
> new file mode 100644
> index ..b622b68c87f59b97
> --- /dev/null
> +++ b/arch/sh/boot/compressed/cache.h
> @@ -0,0 +1,10 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef CACHE_H
> +#define CACHE_H
> +
> +#define CACHE_ENABLE 0
> +#define CACHE_DISABLE1
> +
> +int cache_control(unsigned int command);
> +
> +#endif /* CACHE_H */
> diff --git a/arch/sh/boot/compressed/misc.c b/arch/sh/boot/compressed/misc.c
> index ca05c99a3d5b488d..5178150ca6650dcf 100644
> --- a/arch/sh/boot/compressed/misc.c
> +++ b/arch/sh/boot/compressed/misc.c
> @@ -16,6 +16,9 @@
>  #include 
>  #include 
>  
> +#include "cache.h"
> +#include "misc.h"
> +
>  /*
>   * gzip declarations
>   */
> @@ -26,11 +29,6 @@
>  #undef memcpy
>  #define memzero(s, n) memset ((s), 0, (n))
>  
> -/* cache.c */
> -#define CACHE_ENABLE  0
> -#define CACHE_DISABLE 1
> -int cache_control(unsigned int command);
> -
>  extern char input_data[];
>  extern int input_len;
>  static unsigned char *output;
> diff --git a/arch/sh/boot/compressed/misc.h b/arch/sh/boot/compressed/misc.h
> new file mode 100644
> index ..2b4534faa3052857
> --- /dev/null
> +++ b/arch/sh/boot/compressed/misc.h
> @@ -0,0 +1,9 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef MISC_H
> +#define MISC_H
> +
> +void arch_ftrace_ops_list_func(void);
> +void decompress_kernel(void);
> +void ftrace_stub(void);
> +
> +#endif /* MISC_H */
> -- 
> 2.34.1
> 
> 

This cache control is from SH5, so it is no longer needed.
I think it's better to simply delete chace.c and cache_control.

-- 
Yosinori Sato



Re: [PATCH] tracing: Remove precision vsnprintf() check from print event

2024-03-04 Thread Sachin Sant



> On 05-Mar-2024, at 4:13 AM, Steven Rostedt  wrote:
> 
> From: "Steven Rostedt (Google)" 
> 
> This reverts 60be76eeabb3d ("tracing: Add size check when printing
> trace_marker output"). The only reason the precision check was added
> was because of a bug that miscalculated the write size of the string into
> the ring buffer and it truncated it removing the terminating nul byte. On
> reading the trace it crashed the kernel. But this was due to the bug in
> the code that happened during development and should never happen in
> practice. If anything, the precision can hide bugs where the string in the
> ring buffer isn't nul terminated and it will not be checked.
> 
> Link: 
> https://lore.kernel.org/all/c7e7af1a-d30f-4d18-b8e5-af1ef5800...@linux.ibm.com/
> Link: 
> https://lore.kernel.org/linux-trace-kernel/20240227125706.04279...@gandalf.local.home
> Link: https://lore.kernel.org/all/20240302111244.3a167...@gandalf.local.home/
> 
> Reported-by: Sachin Sant 
> Fixes: 60be76eeabb3d ("tracing: Add size check when printing trace_marker 
> output")
> Signed-off-by: Steven Rostedt (Google) 
> ---

This fixes the reported problem for me.

All the ftrace selftests complete without any fails.
# of passed:  121
# of failed:  0
# of unresolved:  6
# of untested:  0
# of unsupported:  7
# of xfailed:  1
# of undefined(test bug):  0

Tested-by: Sachin Sant 


— Sachin



[PATCH v2] tracing: Limit trace_marker writes to just 4K

2024-03-04 Thread Steven Rostedt
From: "Steven Rostedt (Google)" 

Limit the max print event of trace_marker to just 4K string size. This must
also be less than the amount that can be held by a trace_seq along with
the text that is before the output (like the task name, PID, CPU, state,
etc). As trace_seq is made to handle large events (some greater than 4K).
Make the max size of a trace_marker write event be 4K which is guaranteed
to fit in the trace_seq buffer.

Suggested-by: Mathieu Desnoyers 
Signed-off-by: Steven Rostedt (Google) 
---
Changes since v1: 
https://lore.kernel.org/linux-trace-kernel/20240304192710.4c996...@gandalf.local.home/

- Just make the max limit 4K and not half of the trace_seq size.
  The trace_seq is already made to handle events greater than 4k.

 kernel/trace/trace.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 8198bfc54b58..d16b95ca58a7 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -7293,6 +7293,8 @@ tracing_free_buffer_release(struct inode *inode, struct 
file *filp)
return 0;
 }
 
+#define TRACE_MARKER_MAX_SIZE  4096
+
 static ssize_t
 tracing_mark_write(struct file *filp, const char __user *ubuf,
size_t cnt, loff_t *fpos)
@@ -7320,6 +7322,9 @@ tracing_mark_write(struct file *filp, const char __user 
*ubuf,
if ((ssize_t)cnt < 0)
return -EINVAL;
 
+   if (cnt > TRACE_MARKER_MAX_SIZE)
+   cnt = TRACE_MARKER_MAX_SIZE;
+
meta_size = sizeof(*entry) + 2;  /* add '\0' and possible '\n' */
  again:
size = cnt + meta_size;
@@ -7328,11 +7333,6 @@ tracing_mark_write(struct file *filp, const char __user 
*ubuf,
if (cnt < FAULTED_SIZE)
size += FAULTED_SIZE - cnt;
 
-   if (size > TRACE_SEQ_BUFFER_SIZE) {
-   cnt -= size - TRACE_SEQ_BUFFER_SIZE;
-   goto again;
-   }
-
buffer = tr->array_buffer.buffer;
event = __trace_buffer_lock_reserve(buffer, TRACE_PRINT, size,
tracing_gen_ctx());
-- 
2.43.0




Re: [PATCH] tracing: Have trace_marker writes be just half of TRACE_SEQ_SIZE

2024-03-04 Thread Steven Rostedt
On Mon, 4 Mar 2024 21:48:44 -0500
Mathieu Desnoyers  wrote:

> On 2024-03-04 21:37, Steven Rostedt wrote:
> > On Mon, 4 Mar 2024 21:35:38 -0500
> > Steven Rostedt  wrote:
> >   
> >>> And it's not for debugging, it's for validation of assumptions
> >>> made about an upper bound limit defined for a compile-time
> >>> check, so as the code evolves issues are caught early.  
> >>
> >> validating is debugging.  
> > 
> > Did Linus put you up to this? To test me to see if I'm learning how to say 
> > "No" ;-)  
> 
> No, he did not. 

I was being facetious.

> I genuinely think that validating size limits like
> this either at compile time or, when they can vary at runtime like
> in this case, with a dynamic check, decreases the cognitive
> load on the reviewers. We can then assume that whatever limit
> was put in place is actually enforced and not just wishful
> thinking.

I'm for that too. But the purpose of trace_seq was to be able to safely
append strings on top of each other. It was written specifically for the
trace file output. It should be long enough to print the entire string. If
the trace_seq overflows, it will not add any more content. But this is
checked at the end to see if everything did fit.

I had the "half" size logic because it was still way more than enough to
hold the write and the header, where the header should never be that big.

It's not much different than vsnprintf() having a 32K precision field that
warns if you go over it. If a header is 128 bytes then it is really a
failure in output, as it will cause each line to overflow a normal 80
character terminal screen before it even gets to the content of the event.

Such a header is stupid and this is where I'm starting to understand Linus,
where we don't need to write a bunch of debug code to make sure we don't
have some huge header just because it may cause the content of the event
from being visible.

Oh! and yes, there are events that can be large (stack traces and such). If
a header is created to be that big, you will likely drop actual real
events that have nothing to do with trace_marker.

> 
> If the "header" size upper bound is not validated at runtime, there
> is not much point in adding the BUILD_BUG_ON() based on that value
> in the first place, and you should then just add a runtime check that
> you don't overflow the output buffer before writing the output to it.


OK, then I guess we don't need the checks. 4K for a trace_marker limit and
8K for the trace_seq that will eventually hold its content, is a pretty
simple concept. Do we need a bunch of logic to keep it from breaking?
Especially since trace_marker is used a lot in testing the tracing code
itself.

-- Steve




[PATCH net-next v3] tcp: Add skb addr and sock addr to arguments of tracepoint tcp_probe.

2024-03-04 Thread fuyuanli
It is useful to expose skb addr and sock addr to user in tracepoint
tcp_probe, so that we can get more information while monitoring
receiving of tcp data, by ebpf or other ways.

For example, we need to identify a packet by seq and end_seq when
calculate transmit latency between layer 2 and layer 4 by ebpf, but which is
not available in tcp_probe, so we can only use kprobe hooking
tcp_rcv_established to get them. But we can use tcp_probe directly if skb
addr and sock addr are available, which is more efficient.

Signed-off-by: fuyuanli 
Reviewed-by: Jason Xing 
---
v3
Link: 
https://lore.kernel.org/netdev/20240304034632.GA21357@didi-ThinkCentre-M920t-N000/
1) Fix some errors of spellchecking in commit message.

v2
Link: 
https://lore.kernel.org/netdev/20240229052813.GA23899@didi-ThinkCentre-M920t-N000/
1) Add printing about those two addresses.
2) Target "net-next" in the title of patch.
---
 include/trace/events/tcp.h | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/include/trace/events/tcp.h b/include/trace/events/tcp.h
index 7b1ddffa3dfc..efb651683781 100644
--- a/include/trace/events/tcp.h
+++ b/include/trace/events/tcp.h
@@ -258,6 +258,8 @@ TRACE_EVENT(tcp_probe,
__field(__u32, srtt)
__field(__u32, rcv_wnd)
__field(__u64, sock_cookie)
+   __field(const void *, skbaddr)
+   __field(const void *, skaddr)
),
 
TP_fast_assign(
@@ -285,14 +287,18 @@ TRACE_EVENT(tcp_probe,
__entry->ssthresh = tcp_current_ssthresh(sk);
__entry->srtt = tp->srtt_us >> 3;
__entry->sock_cookie = sock_gen_cookie(sk);
+
+   __entry->skbaddr = skb;
+   __entry->skaddr = sk;
),
 
-   TP_printk("family=%s src=%pISpc dest=%pISpc mark=%#x data_len=%d 
snd_nxt=%#x snd_una=%#x snd_cwnd=%u ssthresh=%u snd_wnd=%u srtt=%u rcv_wnd=%u 
sock_cookie=%llx",
+   TP_printk("family=%s src=%pISpc dest=%pISpc mark=%#x data_len=%d 
snd_nxt=%#x snd_una=%#x snd_cwnd=%u ssthresh=%u snd_wnd=%u srtt=%u rcv_wnd=%u 
sock_cookie=%llx skbaddr=%p skaddr=%p",
  show_family_name(__entry->family),
  __entry->saddr, __entry->daddr, __entry->mark,
  __entry->data_len, __entry->snd_nxt, __entry->snd_una,
  __entry->snd_cwnd, __entry->ssthresh, __entry->snd_wnd,
- __entry->srtt, __entry->rcv_wnd, __entry->sock_cookie)
+ __entry->srtt, __entry->rcv_wnd, __entry->sock_cookie,
+ __entry->skbaddr, __entry->skaddr)
 );
 
 #define TP_STORE_ADDR_PORTS_SKB_V4(__entry, skb)   \
-- 
2.17.1




Re: [PATCH v3 2/2] riscv: Fix text patching when IPI are used

2024-03-04 Thread Anup Patel
On Tue, Mar 5, 2024 at 1:54 AM Björn Töpel  wrote:
>
> Conor Dooley  writes:
>
> > On Thu, Feb 29, 2024 at 01:10:56PM +0100, Alexandre Ghiti wrote:
> >> For now, we use stop_machine() to patch the text and when we use IPIs for
> >> remote icache flushes (which is emitted in patch_text_nosync()), the system
> >> hangs.
> >>
> >> So instead, make sure every CPU executes the stop_machine() patching
> >> function and emit a local icache flush there.
> >>
> >> Co-developed-by: Björn Töpel 
> >> Signed-off-by: Björn Töpel 
> >> Signed-off-by: Alexandre Ghiti 
> >> Reviewed-by: Andrea Parri 
> >
> > What commit does this fix?
>
> Hmm. The bug is exposed when the AIA IPI are introduced, and used
> (instead of the firmware-based).
>
> I'm not sure this is something we'd like backported, but rather a
> prerequisite to AIA.
>
> @Anup @Alex WDYT?
>

The current text patching never considered IPIs being injected
directly in S-mode from hart to another so we are seeing this
issue now with AIA IPIs.

We certainly don't need to backport this fix since it's more
of a preparatory fix for AIA IPIs.

Regards,
Anup



Re: [PATCH] tracing: Have trace_marker writes be just half of TRACE_SEQ_SIZE

2024-03-04 Thread Mathieu Desnoyers

On 2024-03-04 21:37, Steven Rostedt wrote:

On Mon, 4 Mar 2024 21:35:38 -0500
Steven Rostedt  wrote:


And it's not for debugging, it's for validation of assumptions
made about an upper bound limit defined for a compile-time
check, so as the code evolves issues are caught early.


validating is debugging.


Did Linus put you up to this? To test me to see if I'm learning how to say "No" 
;-)


No, he did not. I genuinely think that validating size limits like
this either at compile time or, when they can vary at runtime like
in this case, with a dynamic check, decreases the cognitive
load on the reviewers. We can then assume that whatever limit
was put in place is actually enforced and not just wishful
thinking.

If the "header" size upper bound is not validated at runtime, there
is not much point in adding the BUILD_BUG_ON() based on that value
in the first place, and you should then just add a runtime check that
you don't overflow the output buffer before writing the output to it.

Thanks,

Mathieu

--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com




Re: [PATCH] tracing: Have trace_marker writes be just half of TRACE_SEQ_SIZE

2024-03-04 Thread Steven Rostedt
On Mon, 4 Mar 2024 21:35:38 -0500
Steven Rostedt  wrote:

> > And it's not for debugging, it's for validation of assumptions
> > made about an upper bound limit defined for a compile-time
> > check, so as the code evolves issues are caught early.  
> 
> validating is debugging.

Did Linus put you up to this? To test me to see if I'm learning how to say "No" 
;-)

-- Steve



Re: [PATCH] tracing: Have trace_marker writes be just half of TRACE_SEQ_SIZE

2024-03-04 Thread Steven Rostedt
On Mon, 4 Mar 2024 21:18:13 -0500
Mathieu Desnoyers  wrote:

> On 2024-03-04 20:59, Steven Rostedt wrote:
> > On Mon, 4 Mar 2024 20:42:39 -0500
> > Mathieu Desnoyers  wrote:
> >   
> >> #define TRACE_OUTPUT_META_DATA_MAX_LEN 80
> >>
> >> and a runtime check in the code generating this header.
> >>
> >> This would avoid adding an unchecked upper limit.  
> > 
> > That would be a lot of complex code that is for debugging something that
> > has never happened in the past 16 years and Linus has already reprimanded
> > me on doing such things.  
> 
> Is it more complex than remembering the iterator position in
> print_trace_fmt() right before:
> 
>  if (tr->trace_flags & TRACE_ITER_CONTEXT_INFO) {
>  if (iter->iter_flags & TRACE_FILE_LAT_FMT)
>  trace_print_lat_context(iter);
>  else
>  trace_print_context(iter);
>  }

You forgot all of theses:

if (iter->ent->type == TRACE_BPUTS &&
trace_flags & TRACE_ITER_PRINTK &&
trace_flags & TRACE_ITER_PRINTK_MSGONLY)
return trace_print_bputs_msg_only(iter);

if (iter->ent->type == TRACE_BPRINT &&
trace_flags & TRACE_ITER_PRINTK &&
trace_flags & TRACE_ITER_PRINTK_MSGONLY)
return trace_print_bprintk_msg_only(iter);

if (iter->ent->type == TRACE_PRINT &&
trace_flags & TRACE_ITER_PRINTK &&
trace_flags & TRACE_ITER_PRINTK_MSGONLY)
return trace_print_printk_msg_only(iter);

if (trace_flags & TRACE_ITER_BIN)
return print_bin_fmt(iter);

if (trace_flags & TRACE_ITER_HEX)
return print_hex_fmt(iter);

if (trace_flags & TRACE_ITER_RAW)
return print_raw_fmt(iter);

return print_trace_fmt(iter);

> 
> and then checking just after that the offset is not beyond 128
> bytes ? Perhaps there is even something internal to "iter"
> that already knows about this offset (?).
> 
> This would catch any context going beyond its planned upper
> limit early. Failing early and not just in rare corner cases
> is good.
> 
> And it's not for debugging, it's for validation of assumptions
> made about an upper bound limit defined for a compile-time
> check, so as the code evolves issues are caught early.

validating is debugging.

Seriously Mathieu. Why do we need that? The BUILD_BUG_ON() itself is
probably not even needed. Why do all this just to prevent an unlikely
situation of an event being dropped from printing?

It's not even lost (unless they are using trace_pipe, which very few people
use). If you see a "LINE TOO BIG" you can run:

  # trace-cmd extract
  # trace-cmd report

Which will pull out the raw data where the kernel trace_seq doesn't matter
and trace_cmd will handle it just fine (its trace_seq is dynamically
allocated and can increase in size when needed).

-- Steve



Re: [PATCH] tracing: Have trace_marker writes be just half of TRACE_SEQ_SIZE

2024-03-04 Thread Mathieu Desnoyers

On 2024-03-04 20:59, Steven Rostedt wrote:

On Mon, 4 Mar 2024 20:42:39 -0500
Mathieu Desnoyers  wrote:


#define TRACE_OUTPUT_META_DATA_MAX_LEN  80

and a runtime check in the code generating this header.

This would avoid adding an unchecked upper limit.


That would be a lot of complex code that is for debugging something that
has never happened in the past 16 years and Linus has already reprimanded
me on doing such things.


Is it more complex than remembering the iterator position in
print_trace_fmt() right before:

if (tr->trace_flags & TRACE_ITER_CONTEXT_INFO) {
if (iter->iter_flags & TRACE_FILE_LAT_FMT)
trace_print_lat_context(iter);
else
trace_print_context(iter);
}

and then checking just after that the offset is not beyond 128
bytes ? Perhaps there is even something internal to "iter"
that already knows about this offset (?).

This would catch any context going beyond its planned upper
limit early. Failing early and not just in rare corner cases
is good.

And it's not for debugging, it's for validation of assumptions
made about an upper bound limit defined for a compile-time
check, so as the code evolves issues are caught early.

Thanks,

Mathieu

--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com




Re: [PATCH] tracing: Have trace_marker writes be just half of TRACE_SEQ_SIZE

2024-03-04 Thread Steven Rostedt
On Mon, 4 Mar 2024 20:42:39 -0500
Mathieu Desnoyers  wrote:

> #define TRACE_OUTPUT_META_DATA_MAX_LEN80
> 
> and a runtime check in the code generating this header.
> 
> This would avoid adding an unchecked upper limit.

That would be a lot of complex code that is for debugging something that
has never happened in the past 16 years and Linus has already reprimanded
me on doing such things.

-- Steve



Re: [PATCH] tracing: Have trace_marker writes be just half of TRACE_SEQ_SIZE

2024-03-04 Thread Steven Rostedt
On Mon, 4 Mar 2024 20:36:28 -0500
Mathieu Desnoyers  wrote:

> > <...>-999 [001] .  2296.140373: tracing_mark_write: 
> > hello
> > ^^^
> > This is the meta data that is added to trace_seq  
> 
> If this header has a known well-defined upper-limit length, then use
> that in the BUILD_BUG_ON().

Unfortunately there's no set limit. It's built up by different callbacks
and such. The output can be changed by options set by the user and even by
tracers, like the function graph tracer:

# tracer: function_graph
#
# CPU  DURATION  FUNCTION CALLS
# | |   | |   |   |   |
 1)   |  /* hello */


But the worse that will happen if it overflows is that the event is
replaced with:

 <...>-999 [001] .  2296.140373: [LINE TOO BIG]

But this has never happened outside of development. I guess you could
trigger it if you add a trace_printk() that has a string bigger than
TRACE_SEQ_BUFFER_SIZE. But as Linus says, "Don't do stupid things" ;-)

But in reality, with all the options and everything, I've never seen the
appended text more than 80 bytes (and probably much less).

-- Steve



Re: [PATCH] tracing: Have trace_marker writes be just half of TRACE_SEQ_SIZE

2024-03-04 Thread Mathieu Desnoyers

On 2024-03-04 20:41, Steven Rostedt wrote:

On Mon, 4 Mar 2024 20:35:16 -0500
Steven Rostedt  wrote:


BUILD_BUG_ON(TRACING_MARK_MAX_SIZE + sizeof(meta data stuff...) > 
TRACE_SEQ_SIZE);


That's not the meta size I'm worried about. The sizeof(meta data) is the
raw event binary data, which is not related to the size of the event output.

  # cd /sys/kernel/tracing
  # echo hello > trace_marker
  # cat trace
[..]
<...>-999 [001] .  2296.140373: tracing_mark_write: hello
^^^
This is the meta data that is added to trace_seq


That said, the meta data is most likely not going to be more than 128 bytes
(it shouldn't be more than 80).

I could do as you suggest and create a separate TRACE_MARKER_SIZE and just
make sure that it's less than TRACE_SEQ_BUFFER_SIZE (as that's the size of
the content) by 128 bytes.

/* Added meta data should not be more than 128 bytes */
BUILD_BUG_ON((TRACE_MARKER_MAX_SIZE + 128) > TRACE_SEQ_BUFFER_SIZE);


Bonus points if you add

#define TRACE_OUTPUT_META_DATA_MAX_LEN  80

and a runtime check in the code generating this header.

This would avoid adding an unchecked upper limit.

Thanks,

Mathieu



-- Steve


--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com




Re: [PATCH] tracing: Have trace_marker writes be just half of TRACE_SEQ_SIZE

2024-03-04 Thread Steven Rostedt
On Mon, 4 Mar 2024 20:35:16 -0500
Steven Rostedt  wrote:

> > BUILD_BUG_ON(TRACING_MARK_MAX_SIZE + sizeof(meta data stuff...) > 
> > TRACE_SEQ_SIZE);  
> 
> That's not the meta size I'm worried about. The sizeof(meta data) is the
> raw event binary data, which is not related to the size of the event output.
> 
>  # cd /sys/kernel/tracing
>  # echo hello > trace_marker
>  # cat trace
> [..]
><...>-999 [001] .  2296.140373: tracing_mark_write: hello
> ^^^
>This is the meta data that is added to trace_seq

That said, the meta data is most likely not going to be more than 128 bytes
(it shouldn't be more than 80).

I could do as you suggest and create a separate TRACE_MARKER_SIZE and just
make sure that it's less than TRACE_SEQ_BUFFER_SIZE (as that's the size of
the content) by 128 bytes.

/* Added meta data should not be more than 128 bytes */
BUILD_BUG_ON((TRACE_MARKER_MAX_SIZE + 128) > TRACE_SEQ_BUFFER_SIZE);

-- Steve



Re: [PATCH] tracing: Have trace_marker writes be just half of TRACE_SEQ_SIZE

2024-03-04 Thread Mathieu Desnoyers

On 2024-03-04 20:35, Steven Rostedt wrote:

On Mon, 4 Mar 2024 20:15:57 -0500
Mathieu Desnoyers  wrote:


On 2024-03-04 19:27, Steven Rostedt wrote:

From: "Steven Rostedt (Google)" 

Since the size of trace_seq's buffer is the max an event can output, have
the trace_marker be half of the entire TRACE_SEQ_SIZE, which is 4K. That
will keep writes that has meta data written from being dropped (but
reported), because the total output of the print event is greater than
what the trace_seq can hold.


Defining the trace_mark limit in terms of "TRACE_SEQ_SIZE / 2"
seems backwards. It's basically using a define of the maximum
buffer size for the pretty-printing output and defining the maximum
input size of a system call to half of that.

I'd rather see, in a header file shared between tracing mark
write implementation and output implementation:

#define TRACING_MARK_MAX_SIZE   4096

and then a static validation that this input fits within your
pretty printing output in the output implementation file:

BUILD_BUG_ON(TRACING_MARK_MAX_SIZE + sizeof(meta data stuff...) > 
TRACE_SEQ_SIZE);


That's not the meta size I'm worried about. The sizeof(meta data) is the
raw event binary data, which is not related to the size of the event output.

  # cd /sys/kernel/tracing
  # echo hello > trace_marker
  # cat trace
[..]
<...>-999 [001] .  2296.140373: tracing_mark_write: hello
^^^
This is the meta data that is added to trace_seq


If this header has a known well-defined upper-limit length, then use
that in the BUILD_BUG_ON().

Thanks,

Mathieu



-- Steve
--

Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com




Re: [PATCH] tracing: Have trace_marker writes be just half of TRACE_SEQ_SIZE

2024-03-04 Thread Steven Rostedt
On Mon, 4 Mar 2024 20:15:57 -0500
Mathieu Desnoyers  wrote:

> On 2024-03-04 19:27, Steven Rostedt wrote:
> > From: "Steven Rostedt (Google)" 
> > 
> > Since the size of trace_seq's buffer is the max an event can output, have
> > the trace_marker be half of the entire TRACE_SEQ_SIZE, which is 4K. That
> > will keep writes that has meta data written from being dropped (but
> > reported), because the total output of the print event is greater than
> > what the trace_seq can hold.  
> 
> Defining the trace_mark limit in terms of "TRACE_SEQ_SIZE / 2"
> seems backwards. It's basically using a define of the maximum
> buffer size for the pretty-printing output and defining the maximum
> input size of a system call to half of that.
> 
> I'd rather see, in a header file shared between tracing mark
> write implementation and output implementation:
> 
> #define TRACING_MARK_MAX_SIZE 4096
> 
> and then a static validation that this input fits within your
> pretty printing output in the output implementation file:
> 
> BUILD_BUG_ON(TRACING_MARK_MAX_SIZE + sizeof(meta data stuff...) > 
> TRACE_SEQ_SIZE);

That's not the meta size I'm worried about. The sizeof(meta data) is the
raw event binary data, which is not related to the size of the event output.

 # cd /sys/kernel/tracing
 # echo hello > trace_marker
 # cat trace
[..]
   <...>-999 [001] .  2296.140373: tracing_mark_write: hello
^^^
   This is the meta data that is added to trace_seq

-- Steve



> 
> This way we clearly document that the tracing mark write
> input limit is 4kB, rather than something derived from
> the size of an output buffer.
> 
> Thanks,
> 
> Mathieu
> 
> > 
> > Signed-off-by: Steven Rostedt (Google) 
> > ---
> >   kernel/trace/trace.c | 16 +++-
> >   1 file changed, 11 insertions(+), 5 deletions(-)
> > 
> > diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
> > index 8198bfc54b58..d68544aef65f 100644
> > --- a/kernel/trace/trace.c
> > +++ b/kernel/trace/trace.c
> > @@ -7320,6 +7320,17 @@ tracing_mark_write(struct file *filp, const char 
> > __user *ubuf,
> > if ((ssize_t)cnt < 0)
> > return -EINVAL;
> >   
> > +   /*
> > +* TRACE_SEQ_SIZE is the total size of trace_seq buffer used
> > +* for output. As the print event outputs more than just
> > +* the string written, keep it smaller than the trace_seq
> > +* as it could drop the event if the extra data makes it bigger
> > +* than what the trace_seq can hold. Half he TRACE_SEQ_SIZE
> > +* is more than enough.
> > +*/
> > +   if (cnt > TRACE_SEQ_SIZE / 2)
> > +   cnt = TRACE_SEQ_SIZE / 2;
> > +
> > meta_size = sizeof(*entry) + 2;  /* add '\0' and possible '\n' */
> >again:
> > size = cnt + meta_size;
> > @@ -7328,11 +7339,6 @@ tracing_mark_write(struct file *filp, const char 
> > __user *ubuf,
> > if (cnt < FAULTED_SIZE)
> > size += FAULTED_SIZE - cnt;
> >   
> > -   if (size > TRACE_SEQ_BUFFER_SIZE) {
> > -   cnt -= size - TRACE_SEQ_BUFFER_SIZE;
> > -   goto again;
> > -   }
> > -
> > buffer = tr->array_buffer.buffer;
> > event = __trace_buffer_lock_reserve(buffer, TRACE_PRINT, size,
> > tracing_gen_ctx());  
> 




Re: [PATCH] tracing: Have trace_marker writes be just half of TRACE_SEQ_SIZE

2024-03-04 Thread Mathieu Desnoyers

On 2024-03-04 19:27, Steven Rostedt wrote:

From: "Steven Rostedt (Google)" 

Since the size of trace_seq's buffer is the max an event can output, have
the trace_marker be half of the entire TRACE_SEQ_SIZE, which is 4K. That
will keep writes that has meta data written from being dropped (but
reported), because the total output of the print event is greater than
what the trace_seq can hold.


Defining the trace_mark limit in terms of "TRACE_SEQ_SIZE / 2"
seems backwards. It's basically using a define of the maximum
buffer size for the pretty-printing output and defining the maximum
input size of a system call to half of that.

I'd rather see, in a header file shared between tracing mark
write implementation and output implementation:

#define TRACING_MARK_MAX_SIZE   4096

and then a static validation that this input fits within your
pretty printing output in the output implementation file:

BUILD_BUG_ON(TRACING_MARK_MAX_SIZE + sizeof(meta data stuff...) > 
TRACE_SEQ_SIZE);

This way we clearly document that the tracing mark write
input limit is 4kB, rather than something derived from
the size of an output buffer.

Thanks,

Mathieu



Signed-off-by: Steven Rostedt (Google) 
---
  kernel/trace/trace.c | 16 +++-
  1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 8198bfc54b58..d68544aef65f 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -7320,6 +7320,17 @@ tracing_mark_write(struct file *filp, const char __user 
*ubuf,
if ((ssize_t)cnt < 0)
return -EINVAL;
  
+	/*

+* TRACE_SEQ_SIZE is the total size of trace_seq buffer used
+* for output. As the print event outputs more than just
+* the string written, keep it smaller than the trace_seq
+* as it could drop the event if the extra data makes it bigger
+* than what the trace_seq can hold. Half he TRACE_SEQ_SIZE
+* is more than enough.
+*/
+   if (cnt > TRACE_SEQ_SIZE / 2)
+   cnt = TRACE_SEQ_SIZE / 2;
+
meta_size = sizeof(*entry) + 2;  /* add '\0' and possible '\n' */
   again:
size = cnt + meta_size;
@@ -7328,11 +7339,6 @@ tracing_mark_write(struct file *filp, const char __user 
*ubuf,
if (cnt < FAULTED_SIZE)
size += FAULTED_SIZE - cnt;
  
-	if (size > TRACE_SEQ_BUFFER_SIZE) {

-   cnt -= size - TRACE_SEQ_BUFFER_SIZE;
-   goto again;
-   }
-
buffer = tr->array_buffer.buffer;
event = __trace_buffer_lock_reserve(buffer, TRACE_PRINT, size,
tracing_gen_ctx());


--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com




Re: [RFC PATCH v3 4/5] input: add onkey driver for Marvell 88PM886 PMIC

2024-03-04 Thread Dmitry Torokhov
On Mon, Mar 04, 2024 at 09:28:45PM +0100, Karel Balej wrote:
> Dmitry,
> 
> Dmitry Torokhov, 2024-03-03T12:39:46-08:00:
> > On Sun, Mar 03, 2024 at 11:04:25AM +0100, Karel Balej wrote:
> > > From: Karel Balej 
> > > 
> > > Marvell 88PM886 PMIC provides onkey among other things. Add client
> > > driver to handle it. The driver currently only provides a basic support
> > > omitting additional functions found in the vendor version, such as long
> > > onkey and GPIO integration.
> > > 
> > > Signed-off-by: Karel Balej 
> > > ---
> > > 
> > > Notes:
> > > RFC v3:
> > > - Drop wakeup-source.
> > > RFC v2:
> > > - Address Dmitry's feedback:
> > >   - Sort includes alphabetically.
> > >   - Drop onkey->irq.
> > >   - ret -> err in irq_handler and no initialization.
> > >   - Break long lines and other formatting.
> > >   - Do not clobber platform_get_irq error.
> > >   - Do not set device parent manually.
> > >   - Use input_set_capability.
> > >   - Use the wakeup-source DT property.
> > >   - Drop of_match_table.
> >
> > I only said that you should not be using of_match_ptr(), but you still
> > need to have of_match_table set and have MODULE_DEVICE_TABLE() for the
> > proper module loading support.
> 
> I removed of_match_table because I no longer need compatible for this --
> there are no device tree properties and the driver is being instantiated
> by the MFD driver.
> 
> Is the MODULE_DEVICE_TABLE() entry needed for the driver to probe when
> compiled as module? If that is the case, given what I write above, am I
> correct that MODULE_DEVICE_TABLE(platform,...) would be the right thing
> to use here?

Yes, if uevent generated for the device is "platform:" then
MODULE_DEVICE_TABLE(platform,...) will suffice. I am not sure how MFD
sets it up (OF modalias or platform), but you should be able to check
the format looking at the "uevent" attribute for your device in sysfs
(/sys/devices/bus/platform/...). 

Thanks.

-- 
Dmitry



Re: [PATCH] tracing: Have trace_marker writes be just half of TRACE_SEQ_SIZE

2024-03-04 Thread Steven Rostedt
On Mon, 4 Mar 2024 16:43:46 -0800
Randy Dunlap  wrote:

> > diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
> > index 8198bfc54b58..d68544aef65f 100644
> > --- a/kernel/trace/trace.c
> > +++ b/kernel/trace/trace.c
> > @@ -7320,6 +7320,17 @@ tracing_mark_write(struct file *filp, const char 
> > __user *ubuf,
> > if ((ssize_t)cnt < 0)
> > return -EINVAL;
> >  
> > +   /*
> > +* TRACE_SEQ_SIZE is the total size of trace_seq buffer used
> > +* for output. As the print event outputs more than just
> > +* the string written, keep it smaller than the trace_seq
> > +* as it could drop the event if the extra data makes it bigger
> > +* than what the trace_seq can hold. Half he TRACE_SEQ_SIZE  
> 
> the

I honestly think my 't' key isn't triggering as much. At least when before
hitting 'h', as I noticed I've been writing "he", "hey" and "here" a lot,
and spell check isn't (obviously) catching it ;-)

-- Steve


> 
> > +* is more than enough.
> > +*/
> > +   if (cnt > TRACE_SEQ_SIZE / 2)
> > +   cnt = TRACE_SEQ_SIZE / 2;
> > +
> > meta_size = sizeof(*entry) + 2;  /* add '\0' and possible '\n' */
> >   again:
> > size = cnt + meta_size;  
> 
> 




Re: [PATCH] tracing: Have trace_marker writes be just half of TRACE_SEQ_SIZE

2024-03-04 Thread Randy Dunlap


> diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
> index 8198bfc54b58..d68544aef65f 100644
> --- a/kernel/trace/trace.c
> +++ b/kernel/trace/trace.c
> @@ -7320,6 +7320,17 @@ tracing_mark_write(struct file *filp, const char 
> __user *ubuf,
>   if ((ssize_t)cnt < 0)
>   return -EINVAL;
>  
> + /*
> +  * TRACE_SEQ_SIZE is the total size of trace_seq buffer used
> +  * for output. As the print event outputs more than just
> +  * the string written, keep it smaller than the trace_seq
> +  * as it could drop the event if the extra data makes it bigger
> +  * than what the trace_seq can hold. Half he TRACE_SEQ_SIZE

  the

> +  * is more than enough.
> +  */
> + if (cnt > TRACE_SEQ_SIZE / 2)
> + cnt = TRACE_SEQ_SIZE / 2;
> +
>   meta_size = sizeof(*entry) + 2;  /* add '\0' and possible '\n' */
>   again:
>   size = cnt + meta_size;


-- 
#Randy



[PATCH] tracing: Have trace_marker writes be just half of TRACE_SEQ_SIZE

2024-03-04 Thread Steven Rostedt
From: "Steven Rostedt (Google)" 

Since the size of trace_seq's buffer is the max an event can output, have
the trace_marker be half of the entire TRACE_SEQ_SIZE, which is 4K. That
will keep writes that has meta data written from being dropped (but
reported), because the total output of the print event is greater than
what the trace_seq can hold.

Signed-off-by: Steven Rostedt (Google) 
---
 kernel/trace/trace.c | 16 +++-
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 8198bfc54b58..d68544aef65f 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -7320,6 +7320,17 @@ tracing_mark_write(struct file *filp, const char __user 
*ubuf,
if ((ssize_t)cnt < 0)
return -EINVAL;
 
+   /*
+* TRACE_SEQ_SIZE is the total size of trace_seq buffer used
+* for output. As the print event outputs more than just
+* the string written, keep it smaller than the trace_seq
+* as it could drop the event if the extra data makes it bigger
+* than what the trace_seq can hold. Half he TRACE_SEQ_SIZE
+* is more than enough.
+*/
+   if (cnt > TRACE_SEQ_SIZE / 2)
+   cnt = TRACE_SEQ_SIZE / 2;
+
meta_size = sizeof(*entry) + 2;  /* add '\0' and possible '\n' */
  again:
size = cnt + meta_size;
@@ -7328,11 +7339,6 @@ tracing_mark_write(struct file *filp, const char __user 
*ubuf,
if (cnt < FAULTED_SIZE)
size += FAULTED_SIZE - cnt;
 
-   if (size > TRACE_SEQ_BUFFER_SIZE) {
-   cnt -= size - TRACE_SEQ_BUFFER_SIZE;
-   goto again;
-   }
-
buffer = tr->array_buffer.buffer;
event = __trace_buffer_lock_reserve(buffer, TRACE_PRINT, size,
tracing_gen_ctx());
-- 
2.43.0




Re: [PATCH] tracing: Remove precision vsnprintf() check from print event

2024-03-04 Thread Steven Rostedt
On Mon, 4 Mar 2024 18:55:00 -0500
Steven Rostedt  wrote:

> On Mon, 4 Mar 2024 18:23:41 -0500
> Mathieu Desnoyers  wrote:
> 
> > It appears to currently be limited by
> > 
> > #define TRACE_SEQ_BUFFER_SIZE   (PAGE_SIZE * 2 - \
> >  (sizeof(struct seq_buf) + sizeof(size_t) + sizeof(int)))
> > 
> > checked within tracing_mark_write().  
> 
> Yeah, I can hard code this to 8K as it handles output of complete events,
> that can dump a lot of data, and then limit the trace_marker writes to be 4K.

Actually, the trace_marker writes is already limited by
TRACE_SEQ_BUFFER_SIZE, and by making this hard coded to 8K, it limits the
size of the trace_marker writes.

I may make the writes even smaller.

-- Steve




[PATCH] tracing: Limit trace_seq size to just 8K and not depend on architecture PAGE_SIZE

2024-03-04 Thread Steven Rostedt
From: "Steven Rostedt (Google)" 

The trace_seq buffer is used to print out entire events. It's typically
set to PAGE_SIZE * 2 as there's some events that can be quite large.

As a side effect, writes to trace_marker is limited by both the size of the
trace_seq buffer as well as the ring buffer's sub-buffer size (which is a
power of PAGE_SIZE). By limiting the trace_seq size, it also limits the
size of the largest string written to trace_marker.

trace_seq does not need to be dependent on PAGE_SIZE like the ring buffer
sub-buffers need to be. Hard code it to 8K which is PAGE_SIZE * 2 on most
architectures. This will also limit the size of trace_marker on those
architectures with greater than 4K PAGE_SIZE.

Link: https://lore.kernel.org/all/20240302111244.3a167...@gandalf.local.home/

Signed-off-by: Steven Rostedt (Google) 
---
 include/linux/trace_seq.h | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/include/linux/trace_seq.h b/include/linux/trace_seq.h
index 9ec229dfddaa..1ef95c0287f0 100644
--- a/include/linux/trace_seq.h
+++ b/include/linux/trace_seq.h
@@ -9,9 +9,15 @@
 /*
  * Trace sequences are used to allow a function to call several other functions
  * to create a string of data to use.
+ *
+ * Have the trace seq to be 8K which is typically PAGE_SIZE * 2 on
+ * most architectures. The TRACE_SEQ_BUFFER_SIZE (which is
+ * TRACE_SEQ_SIZE minus the other fields of trace_seq), is the
+ * max size the output of a trace event may be.
  */
 
-#define TRACE_SEQ_BUFFER_SIZE  (PAGE_SIZE * 2 - \
+#define TRACE_SEQ_SIZE 8192
+#define TRACE_SEQ_BUFFER_SIZE  (TRACE_SEQ_SIZE - \
(sizeof(struct seq_buf) + sizeof(size_t) + sizeof(int)))
 
 struct trace_seq {
-- 
2.43.0




Re: [PATCH] tracing: Remove precision vsnprintf() check from print event

2024-03-04 Thread Steven Rostedt
On Mon, 4 Mar 2024 18:23:41 -0500
Mathieu Desnoyers  wrote:

> It appears to currently be limited by
> 
> #define TRACE_SEQ_BUFFER_SIZE   (PAGE_SIZE * 2 - \
>  (sizeof(struct seq_buf) + sizeof(size_t) + sizeof(int)))
> 
> checked within tracing_mark_write().

Yeah, I can hard code this to 8K as it handles output of complete events,
that can dump a lot of data, and then limit the trace_marker writes to be 4K.

-- Steve



Re: [PATCH] tracing: Remove precision vsnprintf() check from print event

2024-03-04 Thread Mathieu Desnoyers

On 2024-03-04 17:43, Steven Rostedt wrote:

From: "Steven Rostedt (Google)" 

This reverts 60be76eeabb3d ("tracing: Add size check when printing
trace_marker output"). The only reason the precision check was added
was because of a bug that miscalculated the write size of the string into
the ring buffer and it truncated it removing the terminating nul byte. On
reading the trace it crashed the kernel. But this was due to the bug in
the code that happened during development and should never happen in
practice. If anything, the precision can hide bugs where the string in the
ring buffer isn't nul terminated and it will not be checked.

Link: 
https://lore.kernel.org/all/c7e7af1a-d30f-4d18-b8e5-af1ef5800...@linux.ibm.com/
Link: 
https://lore.kernel.org/linux-trace-kernel/20240227125706.04279...@gandalf.local.home
Link: https://lore.kernel.org/all/20240302111244.3a167...@gandalf.local.home/

Reported-by: Sachin Sant 
Fixes: 60be76eeabb3d ("tracing: Add size check when printing trace_marker 
output")
Signed-off-by: Steven Rostedt (Google) 


This is a step in the right direction IMHO.

Reviewed-by: Mathieu Desnoyers 

Just out of curiosity, is there anything to prevent trace_marker from
writing a huge string into the ring buffer in the first place ? Is this
limit implicit and based on the page size of the architecture or is it
a known fixed limit ? (e.g. 4kB strings).

It appears to currently be limited by

#define TRACE_SEQ_BUFFER_SIZE   (PAGE_SIZE * 2 - \
(sizeof(struct seq_buf) + sizeof(size_t) + sizeof(int)))

checked within tracing_mark_write().

I would have hoped for a simpler limit (e.g. 4kB) consistent across
architectures. But that would belong to a separate change.

Thanks,

Mathieu

--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com




[PATCH] tracing: Remove precision vsnprintf() check from print event

2024-03-04 Thread Steven Rostedt
From: "Steven Rostedt (Google)" 

This reverts 60be76eeabb3d ("tracing: Add size check when printing
trace_marker output"). The only reason the precision check was added
was because of a bug that miscalculated the write size of the string into
the ring buffer and it truncated it removing the terminating nul byte. On
reading the trace it crashed the kernel. But this was due to the bug in
the code that happened during development and should never happen in
practice. If anything, the precision can hide bugs where the string in the
ring buffer isn't nul terminated and it will not be checked.

Link: 
https://lore.kernel.org/all/c7e7af1a-d30f-4d18-b8e5-af1ef5800...@linux.ibm.com/
Link: 
https://lore.kernel.org/linux-trace-kernel/20240227125706.04279...@gandalf.local.home
Link: https://lore.kernel.org/all/20240302111244.3a167...@gandalf.local.home/

Reported-by: Sachin Sant 
Fixes: 60be76eeabb3d ("tracing: Add size check when printing trace_marker 
output")
Signed-off-by: Steven Rostedt (Google) 
---
 kernel/trace/trace_output.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/kernel/trace/trace_output.c b/kernel/trace/trace_output.c
index 3e7fa44dc2b2..d8b302d01083 100644
--- a/kernel/trace/trace_output.c
+++ b/kernel/trace/trace_output.c
@@ -1587,12 +1587,11 @@ static enum print_line_t trace_print_print(struct 
trace_iterator *iter,
 {
struct print_entry *field;
struct trace_seq *s = >seq;
-   int max = iter->ent_size - offsetof(struct print_entry, buf);
 
trace_assign_type(field, iter->ent);
 
seq_print_ip_sym(s, field->ip, flags);
-   trace_seq_printf(s, ": %.*s", max, field->buf);
+   trace_seq_printf(s, ": %s", field->buf);
 
return trace_handle_return(s);
 }
@@ -1601,11 +1600,10 @@ static enum print_line_t trace_print_raw(struct 
trace_iterator *iter, int flags,
 struct trace_event *event)
 {
struct print_entry *field;
-   int max = iter->ent_size - offsetof(struct print_entry, buf);
 
trace_assign_type(field, iter->ent);
 
-   trace_seq_printf(>seq, "# %lx %.*s", field->ip, max, field->buf);
+   trace_seq_printf(>seq, "# %lx %s", field->ip, field->buf);
 
return trace_handle_return(>seq);
 }
-- 
2.43.0




Re: [RFC PATCH v3 4/5] input: add onkey driver for Marvell 88PM886 PMIC

2024-03-04 Thread Karel Balej
Dmitry,

Dmitry Torokhov, 2024-03-03T12:39:46-08:00:
> On Sun, Mar 03, 2024 at 11:04:25AM +0100, Karel Balej wrote:
> > From: Karel Balej 
> > 
> > Marvell 88PM886 PMIC provides onkey among other things. Add client
> > driver to handle it. The driver currently only provides a basic support
> > omitting additional functions found in the vendor version, such as long
> > onkey and GPIO integration.
> > 
> > Signed-off-by: Karel Balej 
> > ---
> > 
> > Notes:
> > RFC v3:
> > - Drop wakeup-source.
> > RFC v2:
> > - Address Dmitry's feedback:
> >   - Sort includes alphabetically.
> >   - Drop onkey->irq.
> >   - ret -> err in irq_handler and no initialization.
> >   - Break long lines and other formatting.
> >   - Do not clobber platform_get_irq error.
> >   - Do not set device parent manually.
> >   - Use input_set_capability.
> >   - Use the wakeup-source DT property.
> >   - Drop of_match_table.
>
> I only said that you should not be using of_match_ptr(), but you still
> need to have of_match_table set and have MODULE_DEVICE_TABLE() for the
> proper module loading support.

I removed of_match_table because I no longer need compatible for this --
there are no device tree properties and the driver is being instantiated
by the MFD driver.

Is the MODULE_DEVICE_TABLE() entry needed for the driver to probe when
compiled as module? If that is the case, given what I write above, am I
correct that MODULE_DEVICE_TABLE(platform,...) would be the right thing
to use here?

Thank you, kind regards,
K. B.



Re: [PATCH v3 2/2] riscv: Fix text patching when IPI are used

2024-03-04 Thread Björn Töpel
Conor Dooley  writes:

> On Thu, Feb 29, 2024 at 01:10:56PM +0100, Alexandre Ghiti wrote:
>> For now, we use stop_machine() to patch the text and when we use IPIs for
>> remote icache flushes (which is emitted in patch_text_nosync()), the system
>> hangs.
>> 
>> So instead, make sure every CPU executes the stop_machine() patching
>> function and emit a local icache flush there.
>> 
>> Co-developed-by: Björn Töpel 
>> Signed-off-by: Björn Töpel 
>> Signed-off-by: Alexandre Ghiti 
>> Reviewed-by: Andrea Parri 
>
> What commit does this fix?

Hmm. The bug is exposed when the AIA IPI are introduced, and used
(instead of the firmware-based).

I'm not sure this is something we'd like backported, but rather a
prerequisite to AIA.

@Anup @Alex WDYT?



Re: [PATCH v3 2/2] riscv: Fix text patching when IPI are used

2024-03-04 Thread Conor Dooley
On Thu, Feb 29, 2024 at 01:10:56PM +0100, Alexandre Ghiti wrote:
> For now, we use stop_machine() to patch the text and when we use IPIs for
> remote icache flushes (which is emitted in patch_text_nosync()), the system
> hangs.
> 
> So instead, make sure every CPU executes the stop_machine() patching
> function and emit a local icache flush there.
> 
> Co-developed-by: Björn Töpel 
> Signed-off-by: Björn Töpel 
> Signed-off-by: Alexandre Ghiti 
> Reviewed-by: Andrea Parri 

What commit does this fix?



signature.asc
Description: PGP signature


Re: [PATCH v3 1/2] riscv: Remove superfluous smp_mb()

2024-03-04 Thread Conor Dooley
On Thu, Feb 29, 2024 at 01:10:55PM +0100, Alexandre Ghiti wrote:
> This memory barrier is not needed and not documented so simply remove
> it.

This looks like it should be patch 2 in the series, not patch 1, as it
is cleanup rather than a fix that needs backporting.

> 
> Suggested-by: Andrea Parri 
> Signed-off-by: Alexandre Ghiti 
> Reviewed-by: Andrea Parri 
> ---
>  arch/riscv/kernel/patch.c | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/arch/riscv/kernel/patch.c b/arch/riscv/kernel/patch.c
> index 37e87fdcf6a0..0b5c16dfe3f4 100644
> --- a/arch/riscv/kernel/patch.c
> +++ b/arch/riscv/kernel/patch.c
> @@ -239,7 +239,6 @@ static int patch_text_cb(void *data)
>   } else {
>   while (atomic_read(>cpu_count) <= num_online_cpus())
>   cpu_relax();
> - smp_mb();
>   }
>  
>   return ret;
> -- 
> 2.39.2
> 
> 
> ___
> linux-riscv mailing list
> linux-ri...@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-riscv


signature.asc
Description: PGP signature


Re: [PATCH 1/3] remoteproc: Add Arm remoteproc driver

2024-03-04 Thread Mathieu Poirier
Good day Abdellatif,

On Fri, Mar 01, 2024 at 04:42:25PM +, abdellatif.elkhl...@arm.com wrote:
> From: Abdellatif El Khlifi 
> 
> introduce remoteproc support for Arm remote processors
> 
> The supported remote processors are those that come with a reset
> control register and a reset status register. The driver allows to
> switch on or off the remote processor.
> 
> The current use case is Corstone-1000 External System (Cortex-M3).
> 
> The driver can be extended to support other remote processors
> controlled with a reset control and a reset status registers.
> 
> The driver also supports control of multiple remote processors at the
> same time.
> 
> Signed-off-by: Abdellatif El Khlifi 
> ---
>  MAINTAINERS|   6 +
>  drivers/remoteproc/Kconfig |  18 ++
>  drivers/remoteproc/Makefile|   1 +
>  drivers/remoteproc/arm_rproc.c | 395 +
>  4 files changed, 420 insertions(+)
>  create mode 100644 drivers/remoteproc/arm_rproc.c
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 8d1052fa6a69..54d6a40feea5 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -1764,6 +1764,12 @@ S: Maintained
>  F:   Documentation/devicetree/bindings/interrupt-controller/arm,vic.yaml
>  F:   drivers/irqchip/irq-vic.c
>  
> +ARM REMOTEPROC DRIVER
> +M:   Abdellatif El Khlifi 
> +L:   linux-remotep...@vger.kernel.org
> +S:   Maintained
> +F:   drivers/remoteproc/arm_rproc.c
> +

Humm... I'm not sure this is needed for now.  You'll be CC'ed in future postings
anyway if someone changes this drivers.

>  ARM SMC WATCHDOG DRIVER
>  M:   Julius Werner 
>  R:   Evan Benn 
> diff --git a/drivers/remoteproc/Kconfig b/drivers/remoteproc/Kconfig
> index 48845dc8fa85..57fbac454a5d 100644
> --- a/drivers/remoteproc/Kconfig
> +++ b/drivers/remoteproc/Kconfig
> @@ -365,6 +365,24 @@ config XLNX_R5_REMOTEPROC
>  
> It's safe to say N if not interested in using RPU r5f cores.
>  
> +config ARM_REMOTEPROC
> + tristate "Arm remoteproc support"
> + depends on HAS_IOMEM && ARM64
> + default n
> + help
> +   Say y here to support Arm remote processors via the remote
> +   processor framework.
> +
> +   The supported processors are those that come with a reset control 
> register
> +   and a reset status register. The design can be extended to support 
> different
> +   processors meeting these requirements.
> +   The driver also supports control of multiple remote cores at the same 
> time.
> +
> +   Supported remote cores:
> +   Corstone-1000 External System (Cortex-M3)
> +

Please remove.  The descrition in the Kconfig file should be related to a family
of device and the specific model number found in the driver.  

> +   It's safe to say N here.
> +
>  endif # REMOTEPROC
>  
>  endmenu
> diff --git a/drivers/remoteproc/Makefile b/drivers/remoteproc/Makefile
> index 91314a9b43ce..73126310835b 100644
> --- a/drivers/remoteproc/Makefile
> +++ b/drivers/remoteproc/Makefile
> @@ -39,3 +39,4 @@ obj-$(CONFIG_STM32_RPROC)   += stm32_rproc.o
>  obj-$(CONFIG_TI_K3_DSP_REMOTEPROC)   += ti_k3_dsp_remoteproc.o
>  obj-$(CONFIG_TI_K3_R5_REMOTEPROC)+= ti_k3_r5_remoteproc.o
>  obj-$(CONFIG_XLNX_R5_REMOTEPROC) += xlnx_r5_remoteproc.o
> +obj-$(CONFIG_ARM_REMOTEPROC) += arm_rproc.o
> diff --git a/drivers/remoteproc/arm_rproc.c b/drivers/remoteproc/arm_rproc.c
> new file mode 100644
> index ..6afa78ae7ad3
> --- /dev/null
> +++ b/drivers/remoteproc/arm_rproc.c
> @@ -0,0 +1,395 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Copyright 2024 Arm Limited and/or its affiliates 
> 
> + *
> + * Authors:
> + *   Abdellatif El Khlifi 
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#include "remoteproc_internal.h"
> +
> +/**
> + * struct arm_rproc_reset_cfg - remote processor reset configuration
> + * @ctrl_reg: address of the control register
> + * @state_reg: address of the reset status register
> + */
> +struct arm_rproc_reset_cfg {
> + void __iomem *ctrl_reg;
> + void __iomem *state_reg;
> +};
> +
> +struct arm_rproc;
> +

This is useless, please remove.

> +/**
> + * struct arm_rproc_dcfg - Arm remote processor configuration

Configuration?  This looks more like operations to me.  Please consider
renaming.

> + * @stop: stop callback function
> + * @start: start callback function
> + */
> +struct arm_rproc_dcfg {
> + int (*stop)(struct rproc *rproc);
> + int (*start)(struct rproc *rproc);
> +};
> +
> +/**
> + * struct arm_rproc - Arm remote processor instance
> + * @rproc: rproc handler
> + * @core_dcfg: device configuration pointer
> + * @reset_cfg: reset configuration registers
> + */
> +struct arm_rproc {
> + struct rproc*rproc;
> + const struct arm_rproc_dcfg *core_dcfg;
> + struct arm_rproc_reset_cfg  reset_cfg;
> +};
> +
> +/* Definitions for 

Re: [PATCH 1/3] remoteproc: Add Arm remoteproc driver

2024-03-04 Thread Rob Herring
On Fri, Mar 01, 2024 at 04:42:25PM +, abdellatif.elkhl...@arm.com wrote:
> From: Abdellatif El Khlifi 
> 
> introduce remoteproc support for Arm remote processors
> 
> The supported remote processors are those that come with a reset
> control register and a reset status register. The driver allows to
> switch on or off the remote processor.
> 
> The current use case is Corstone-1000 External System (Cortex-M3).
> 
> The driver can be extended to support other remote processors
> controlled with a reset control and a reset status registers.
> 
> The driver also supports control of multiple remote processors at the
> same time.
> 
> Signed-off-by: Abdellatif El Khlifi 
> ---
>  MAINTAINERS|   6 +
>  drivers/remoteproc/Kconfig |  18 ++
>  drivers/remoteproc/Makefile|   1 +
>  drivers/remoteproc/arm_rproc.c | 395 +
>  4 files changed, 420 insertions(+)
>  create mode 100644 drivers/remoteproc/arm_rproc.c
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 8d1052fa6a69..54d6a40feea5 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -1764,6 +1764,12 @@ S: Maintained
>  F:   Documentation/devicetree/bindings/interrupt-controller/arm,vic.yaml
>  F:   drivers/irqchip/irq-vic.c
>  
> +ARM REMOTEPROC DRIVER
> +M:   Abdellatif El Khlifi 
> +L:   linux-remotep...@vger.kernel.org
> +S:   Maintained
> +F:   drivers/remoteproc/arm_rproc.c
> +
>  ARM SMC WATCHDOG DRIVER
>  M:   Julius Werner 
>  R:   Evan Benn 
> diff --git a/drivers/remoteproc/Kconfig b/drivers/remoteproc/Kconfig
> index 48845dc8fa85..57fbac454a5d 100644
> --- a/drivers/remoteproc/Kconfig
> +++ b/drivers/remoteproc/Kconfig
> @@ -365,6 +365,24 @@ config XLNX_R5_REMOTEPROC
>  
> It's safe to say N if not interested in using RPU r5f cores.
>  
> +config ARM_REMOTEPROC
> + tristate "Arm remoteproc support"

Too generic of a name. It should say Corstone or Corstone-1000. Here and 
everywhere you use just 'Arm'.

> + depends on HAS_IOMEM && ARM64

depends on ARM64 || (HAS_IOMEM && COMPILE_TEST)

That gets us wider build coverage. You should check at least x86 
allmodconfig passes.

> + default n

The default is already n, so drop.

> + help
> +   Say y here to support Arm remote processors via the remote
> +   processor framework.
> +
> +   The supported processors are those that come with a reset control 
> register
> +   and a reset status register. The design can be extended to support 
> different
> +   processors meeting these requirements.
> +   The driver also supports control of multiple remote cores at the same 
> time.
> +
> +   Supported remote cores:
> +   Corstone-1000 External System (Cortex-M3)
> +
> +   It's safe to say N here.
> +
>  endif # REMOTEPROC
>  
>  endmenu
> diff --git a/drivers/remoteproc/Makefile b/drivers/remoteproc/Makefile
> index 91314a9b43ce..73126310835b 100644
> --- a/drivers/remoteproc/Makefile
> +++ b/drivers/remoteproc/Makefile
> @@ -39,3 +39,4 @@ obj-$(CONFIG_STM32_RPROC)   += stm32_rproc.o
>  obj-$(CONFIG_TI_K3_DSP_REMOTEPROC)   += ti_k3_dsp_remoteproc.o
>  obj-$(CONFIG_TI_K3_R5_REMOTEPROC)+= ti_k3_r5_remoteproc.o
>  obj-$(CONFIG_XLNX_R5_REMOTEPROC) += xlnx_r5_remoteproc.o
> +obj-$(CONFIG_ARM_REMOTEPROC) += arm_rproc.o
> diff --git a/drivers/remoteproc/arm_rproc.c b/drivers/remoteproc/arm_rproc.c
> new file mode 100644
> index ..6afa78ae7ad3
> --- /dev/null
> +++ b/drivers/remoteproc/arm_rproc.c
> @@ -0,0 +1,395 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Copyright 2024 Arm Limited and/or its affiliates 
> 

We don't normally put OSO email in here.

> + *
> + * Authors:
> + *   Abdellatif El Khlifi 

That's recorded in the commit message and by git, so no need to put it 
in the file.

> + */




[PATCH net-next v4] net: dqs: add NIC stall detector based on BQL

2024-03-04 Thread Breno Leitao
From: Jakub Kicinski 

softnet_data->time_squeeze is sometimes used as a proxy for
host overload or indication of scheduling problems. In practice
this statistic is very noisy and has hard to grasp units -
e.g. is 10 squeezes a second to be expected, or high?

Delaying network (NAPI) processing leads to drops on NIC queues
but also RTT bloat, impacting pacing and CA decisions.
Stalls are a little hard to detect on the Rx side, because
there may simply have not been any packets received in given
period of time. Packet timestamps help a little bit, but
again we don't know if packets are stale because we're
not keeping up or because someone (*cough* cgroups)
disabled IRQs for a long time.

We can, however, use Tx as a proxy for Rx stalls. Most drivers
use combined Rx+Tx NAPIs so if Tx gets starved so will Rx.
On the Tx side we know exactly when packets get queued,
and completed, so there is no uncertainty.

This patch adds stall checks to BQL. Why BQL? Because
it's a convenient place to add such checks, already
called by most drivers, and it has copious free space
in its structures (this patch adds no extra cache
references or dirtying to the fast path).

The algorithm takes one parameter - max delay AKA stall
threshold and increments a counter whenever NAPI got delayed
for at least that amount of time. It also records the length
of the longest stall.

To be precise every time NAPI has not polled for at least
stall thrs we check if there were any Tx packets queued
between last NAPI run and now - stall_thrs/2.

Unlike the classic Tx watchdog this mechanism does not
ignore stalls caused by Tx being disabled, or loss of link.
I don't think the check is worth the complexity, and
stall is a stall, whether due to host overload, flow
control, link down... doesn't matter much to the application.

We have been running this detector in production at Meta
for 2 years, with the threshold of 8ms. It's the lowest
value where false positives become rare. There's still
a constant stream of reported stalls (especially without
the ksoftirqd deferral patches reverted), those who like
their stall metrics to be 0 may prefer higher value.

Signed-off-by: Jakub Kicinski 
Signed-off-by: Breno Leitao 
---
v1:
  * https://lore.kernel.org/netdev/202306172057.jx7yhliu-...@intel.com/T/
v2:
  * Fix the documentation file in patch 0001, since patch 0002 will
touch it later.
  * Fix the kernel test robot issues, marking functions as statics.
  * Use #include  instead of .
  * Added some new comments around, mainly around barriers.
  * Format struct `netdev_queue_attribute` assignments to make
checkpatch happy.
  * Updated and fixed the path in sysfs-class-net-queues
documentation.
v3:
  * Sent patch 0002 against net-next.
- The first patch was accepted against 'net'
v4:
 * Added code documentation to clarify the history usage
 * Added better documentation for TX completion stall in
   Documentation/ABI/testing/sysfs-class-net-queues.
 * Changed stall_thrs and stall_max from "unsigned char" to "unsigned
   short"

 .../ABI/testing/sysfs-class-net-queues| 23 ++
 include/linux/dynamic_queue_limits.h  | 45 +++
 include/trace/events/napi.h   | 33 +
 lib/dynamic_queue_limits.c| 74 +++
 net/core/net-sysfs.c  | 62 
 5 files changed, 237 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-class-net-queues 
b/Documentation/ABI/testing/sysfs-class-net-queues
index 5bff64d256c2..84aa25e0d14d 100644
--- a/Documentation/ABI/testing/sysfs-class-net-queues
+++ b/Documentation/ABI/testing/sysfs-class-net-queues
@@ -96,3 +96,26 @@ Description:
Indicates the absolute minimum limit of bytes allowed to be
queued on this network device transmit queue. Default value is
0.
+
+What:  
/sys/class/net//queues/tx-/byte_queue_limits/stall_thrs
+Date:  Jan 2024
+KernelVersion: 6.9
+Contact:   net...@vger.kernel.org
+Description:
+   Tx completion stall detection threshold in ms. Kernel will
+   guarantee to detect all stalls longer than this threshold but
+   may also detect stalls longer than half of the threshold.
+
+What:  
/sys/class/net//queues/tx-/byte_queue_limits/stall_cnt
+Date:  Jan 2024
+KernelVersion: 6.9
+Contact:   net...@vger.kernel.org
+Description:
+   Number of detected Tx completion stalls.
+
+What:  
/sys/class/net//queues/tx-/byte_queue_limits/stall_max
+Date:  Jan 2024
+KernelVersion: 6.9
+Contact:   net...@vger.kernel.org
+Description:
+   Longest detected Tx completion stall. Write 0 to clear.
diff --git a/include/linux/dynamic_queue_limits.h 
b/include/linux/dynamic_queue_limits.h
index 407c2f281b64..5693a4be0d9a 100644
--- a/include/linux/dynamic_queue_limits.h
+++ b/include/linux/dynamic_queue_limits.h
@@ -38,14 +38,22 @@
 
 

Re: [PATCH net-next v2 1/3] xsk: Remove non-zero 'dma_page' check in xp_assign_dev

2024-03-04 Thread Magnus Karlsson
On Thu, 29 Feb 2024 at 13:52, wangyunjian  wrote:
>
> > -Original Message-
> > From: Paolo Abeni [mailto:pab...@redhat.com]
> > Sent: Thursday, February 29, 2024 6:43 PM
> > To: wangyunjian ; m...@redhat.com;
> > willemdebruijn.ker...@gmail.com; jasow...@redhat.com; k...@kernel.org;
> > bj...@kernel.org; magnus.karls...@intel.com; maciej.fijalkow...@intel.com;
> > jonathan.le...@gmail.com; da...@davemloft.net
> > Cc: b...@vger.kernel.org; net...@vger.kernel.org;
> > linux-kernel@vger.kernel.org; k...@vger.kernel.org;
> > virtualizat...@lists.linux.dev; xudingke ; liwei (DT)
> > 
> > Subject: Re: [PATCH net-next v2 1/3] xsk: Remove non-zero 'dma_page' check 
> > in
> > xp_assign_dev
> >
> > On Wed, 2024-02-28 at 19:05 +0800, Yunjian Wang wrote:
> > > Now dma mappings are used by the physical NICs. However the vNIC maybe
> > > do not need them. So remove non-zero 'dma_page' check in
> > > xp_assign_dev.
> > >
> > > Signed-off-by: Yunjian Wang 
> > > ---
> > >  net/xdp/xsk_buff_pool.c | 7 ---
> > >  1 file changed, 7 deletions(-)
> > >
> > > diff --git a/net/xdp/xsk_buff_pool.c b/net/xdp/xsk_buff_pool.c index
> > > ce60ecd48a4d..a5af75b1f43c 100644
> > > --- a/net/xdp/xsk_buff_pool.c
> > > +++ b/net/xdp/xsk_buff_pool.c
> > > @@ -219,16 +219,9 @@ int xp_assign_dev(struct xsk_buff_pool *pool,
> > > if (err)
> > > goto err_unreg_pool;
> > >
> > > -   if (!pool->dma_pages) {
> > > -   WARN(1, "Driver did not DMA map zero-copy buffers");
> > > -   err = -EINVAL;
> > > -   goto err_unreg_xsk;
> > > -   }
> >
> > This would unconditionally remove an otherwise valid check for most NIC. 
> > What
> > about let the driver declare it wont need DMA map with a
> > (pool?) flag.
>
> This check is redundant. The NIC's driver determines whether a DMA map is 
> required.
> If the NIC'driver requires the DMA map, it uses the xsk_pool_dma_map 
> function, which
> initializes the DMA map and performs a check.

Just to provide some context: I put this check there many years ago to
guard against a zero-copy driver writer forgetting to call
xsk_pool_dma_map() during the implementation phase. A working driver
will always have pool->dma_pages != NULL. If you both think that this
check is too much of a precaution, then I have no problem getting rid
of it. Just thought that a text warning would be nicer than a crash
later.

Thanks: Magnus

> Thanks
>
> >
> > Cheers,
> >
> > Paolo
>



RE: [PATCH net-next v2 3/3] tun: AF_XDP Tx zero-copy support

2024-03-04 Thread wangyunjian



> -Original Message-
> From: Michael S. Tsirkin [mailto:m...@redhat.com]
> Sent: Friday, March 1, 2024 7:53 PM
> To: wangyunjian 
> Cc: Paolo Abeni ; willemdebruijn.ker...@gmail.com;
> jasow...@redhat.com; k...@kernel.org; bj...@kernel.org;
> magnus.karls...@intel.com; maciej.fijalkow...@intel.com;
> jonathan.le...@gmail.com; da...@davemloft.net; b...@vger.kernel.org;
> net...@vger.kernel.org; linux-kernel@vger.kernel.org; k...@vger.kernel.org;
> virtualizat...@lists.linux.dev; xudingke ; liwei (DT)
> 
> Subject: Re: [PATCH net-next v2 3/3] tun: AF_XDP Tx zero-copy support
> 
> On Fri, Mar 01, 2024 at 11:45:52AM +, wangyunjian wrote:
> > > -Original Message-
> > > From: Paolo Abeni [mailto:pab...@redhat.com]
> > > Sent: Thursday, February 29, 2024 7:13 PM
> > > To: wangyunjian ; m...@redhat.com;
> > > willemdebruijn.ker...@gmail.com; jasow...@redhat.com;
> > > k...@kernel.org; bj...@kernel.org; magnus.karls...@intel.com;
> > > maciej.fijalkow...@intel.com; jonathan.le...@gmail.com;
> > > da...@davemloft.net
> > > Cc: b...@vger.kernel.org; net...@vger.kernel.org;
> > > linux-kernel@vger.kernel.org; k...@vger.kernel.org;
> > > virtualizat...@lists.linux.dev; xudingke ;
> > > liwei (DT) 
> > > Subject: Re: [PATCH net-next v2 3/3] tun: AF_XDP Tx zero-copy
> > > support
> > >
> > > On Wed, 2024-02-28 at 19:05 +0800, Yunjian Wang wrote:
> > > > @@ -2661,6 +2776,54 @@ static int tun_ptr_peek_len(void *ptr)
> > > > }
> > > >  }
> > > >
> > > > +static void tun_peek_xsk(struct tun_file *tfile) {
> > > > +   struct xsk_buff_pool *pool;
> > > > +   u32 i, batch, budget;
> > > > +   void *frame;
> > > > +
> > > > +   if (!ptr_ring_empty(>tx_ring))
> > > > +   return;
> > > > +
> > > > +   spin_lock(>pool_lock);
> > > > +   pool = tfile->xsk_pool;
> > > > +   if (!pool) {
> > > > +   spin_unlock(>pool_lock);
> > > > +   return;
> > > > +   }
> > > > +
> > > > +   if (tfile->nb_descs) {
> > > > +   xsk_tx_completed(pool, tfile->nb_descs);
> > > > +   if (xsk_uses_need_wakeup(pool))
> > > > +   xsk_set_tx_need_wakeup(pool);
> > > > +   }
> > > > +
> > > > +   spin_lock(>tx_ring.producer_lock);
> > > > +   budget = min_t(u32, tfile->tx_ring.size, TUN_XDP_BATCH);
> > > > +
> > > > +   batch = xsk_tx_peek_release_desc_batch(pool, budget);
> > > > +   if (!batch) {
> > >
> > > This branch looks like an unneeded "optimization". The generic loop
> > > below should have the same effect with no measurable perf delta - and
> smaller code.
> > > Just remove this.
> > >
> > > > +   tfile->nb_descs = 0;
> > > > +   spin_unlock(>tx_ring.producer_lock);
> > > > +   spin_unlock(>pool_lock);
> > > > +   return;
> > > > +   }
> > > > +
> > > > +   tfile->nb_descs = batch;
> > > > +   for (i = 0; i < batch; i++) {
> > > > +   /* Encode the XDP DESC flag into lowest bit for 
> > > > consumer to
> differ
> > > > +* XDP desc from XDP buffer and sk_buff.
> > > > +*/
> > > > +   frame = tun_xdp_desc_to_ptr(>tx_descs[i]);
> > > > +   /* The budget must be less than or equal to 
> > > > tx_ring.size,
> > > > +* so enqueuing will not fail.
> > > > +*/
> > > > +   __ptr_ring_produce(>tx_ring, frame);
> > > > +   }
> > > > +   spin_unlock(>tx_ring.producer_lock);
> > > > +   spin_unlock(>pool_lock);
> > >
> > > More related to the general design: it looks wrong. What if
> > > get_rx_bufs() will fail (ENOBUF) after successful peeking? With no
> > > more incoming packets, later peek will return 0 and it looks like
> > > that the half-processed packets will stay in the ring forever???
> > >
> > > I think the 'ring produce' part should be moved into tun_do_read().
> >
> > Currently, the vhost-net obtains a batch descriptors/sk_buffs from the
> > ptr_ring and enqueue the batch descriptors/sk_buffs to the
> > virtqueue'queue, and then consumes the descriptors/sk_buffs from the
> > virtqueue'queue in sequence. As a result, TUN does not know whether
> > the batch descriptors have been used up, and thus does not know when to
> return the batch descriptors.
> >
> > So, I think it's reasonable that when vhost-net checks ptr_ring is
> > empty, it calls peek_len to get new xsk's descs and return the descriptors.
> >
> > Thanks
> 
> What you need to think about is that if you peek, another call in parallel 
> can get
> the same value at the same time.

Thank you. I have identified a problem. The tx_descs array was created within 
xsk's pool.
When xsk is freed, the pool and tx_descs are also freed. Howerver, some descs 
may
remain in the virtqueue'queue, which could lead to a use-after-free scenario. 
Currently,
I do not have an idea to solve this concurrency problem and believe this 
scenario may
not be appropriate 

RE: [PATCH net-next v2 3/3] tun: AF_XDP Tx zero-copy support

2024-03-04 Thread wangyunjian


> -Original Message-
> From: Jason Wang [mailto:jasow...@redhat.com]
> Sent: Monday, March 4, 2024 2:56 PM
> To: wangyunjian 
> Cc: m...@redhat.com; willemdebruijn.ker...@gmail.com; k...@kernel.org;
> bj...@kernel.org; magnus.karls...@intel.com; maciej.fijalkow...@intel.com;
> jonathan.le...@gmail.com; da...@davemloft.net; b...@vger.kernel.org;
> net...@vger.kernel.org; linux-kernel@vger.kernel.org; k...@vger.kernel.org;
> virtualizat...@lists.linux.dev; xudingke ; liwei (DT)
> 
> Subject: Re: [PATCH net-next v2 3/3] tun: AF_XDP Tx zero-copy support
> 
> On Wed, Feb 28, 2024 at 7:06 PM Yunjian Wang 
> wrote:
> >
> > This patch set allows TUN to support the AF_XDP Tx zero-copy feature,
> > which can significantly reduce CPU utilization for XDP programs.
> >
> > Since commit fc72d1d54dd9 ("tuntap: XDP transmission"), the pointer
> > ring has been utilized to queue different types of pointers by
> > encoding the type into the lower bits. Therefore, we introduce a new
> > flag, TUN_XDP_DESC_FLAG(0x2UL), which allows us to enqueue XDP
> > descriptors and differentiate them from XDP buffers and sk_buffs.
> > Additionally, a spin lock is added for enabling and disabling operations on 
> > the
> xsk pool.
> >
> > The performance testing was performed on a Intel E5-2620 2.40GHz
> machine.
> > Traffic were generated/send through TUN(testpmd txonly with AF_XDP) to
> > VM (testpmd rxonly in guest).
> >
> > +--+-+-+-+
> > |  |   copy  |zero-copy| speedup |
> > +--+-+-+-+
> > | UDP  |   Mpps  |   Mpps  |%|
> > | 64   |   2.5   |   4.0   |   60%   |
> > | 512  |   2.1   |   3.6   |   71%   |
> > | 1024 |   1.9   |   3.3   |   73%   |
> > +--+-+-+-+
> >
> > Signed-off-by: Yunjian Wang 
> > ---
> >  drivers/net/tun.c  | 177
> +++--
> >  drivers/vhost/net.c|   4 +
> >  include/linux/if_tun.h |  32 
> >  3 files changed, 208 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/net/tun.c b/drivers/net/tun.c index
> > bc80fc1d576e..7f4ff50b532c 100644
> > --- a/drivers/net/tun.c
> > +++ b/drivers/net/tun.c
> > @@ -63,6 +63,7 @@
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> >  #include 
> >  #include 
> >  #include 
> > @@ -86,6 +87,7 @@ static void tun_default_link_ksettings(struct
> net_device *dev,
> >struct
> ethtool_link_ksettings
> > *cmd);
> >
> >  #define TUN_RX_PAD (NET_IP_ALIGN + NET_SKB_PAD)
> > +#define TUN_XDP_BATCH 64
> >
> >  /* TUN device flags */
> >
> > @@ -146,6 +148,9 @@ struct tun_file {
> > struct tun_struct *detached;
> > struct ptr_ring tx_ring;
> > struct xdp_rxq_info xdp_rxq;
> > +   struct xsk_buff_pool *xsk_pool;
> > +   spinlock_t pool_lock;   /* Protects xsk pool enable/disable */
> > +   u32 nb_descs;
> >  };
> >
> >  struct tun_page {
> > @@ -614,6 +619,8 @@ void tun_ptr_free(void *ptr)
> > struct xdp_frame *xdpf = tun_ptr_to_xdp(ptr);
> >
> > xdp_return_frame(xdpf);
> > +   } else if (tun_is_xdp_desc_frame(ptr)) {
> > +   return;
> > } else {
> > __skb_array_destroy_skb(ptr);
> > }
> > @@ -631,6 +638,37 @@ static void tun_queue_purge(struct tun_file *tfile)
> > skb_queue_purge(>sk.sk_error_queue);
> >  }
> >
> > +static void tun_set_xsk_pool(struct tun_file *tfile, struct
> > +xsk_buff_pool *pool) {
> > +   if (!pool)
> > +   return;
> > +
> > +   spin_lock(>pool_lock);
> > +   xsk_pool_set_rxq_info(pool, >xdp_rxq);
> > +   tfile->xsk_pool = pool;
> > +   spin_unlock(>pool_lock); }
> > +
> > +static void tun_clean_xsk_pool(struct tun_file *tfile) {
> > +   spin_lock(>pool_lock);
> > +   if (tfile->xsk_pool) {
> > +   void *ptr;
> > +
> > +   while ((ptr = ptr_ring_consume(>tx_ring)) != NULL)
> > +   tun_ptr_free(ptr);
> > +
> > +   if (tfile->nb_descs) {
> > +   xsk_tx_completed(tfile->xsk_pool,
> tfile->nb_descs);
> > +   if (xsk_uses_need_wakeup(tfile->xsk_pool))
> > +
> xsk_set_tx_need_wakeup(tfile->xsk_pool);
> > +   tfile->nb_descs = 0;
> > +   }
> > +   tfile->xsk_pool = NULL;
> > +   }
> > +   spin_unlock(>pool_lock); }
> > +
> >  static void __tun_detach(struct tun_file *tfile, bool clean)  {
> > struct tun_file *ntfile;
> > @@ -648,6 +686,11 @@ static void __tun_detach(struct tun_file *tfile, bool
> clean)
> > u16 index = tfile->queue_index;
> > BUG_ON(index >= tun->numqueues);
> >
> > +   ntfile = rtnl_dereference(tun->tfiles[tun->numqueues -
> 1]);
> > +   /* Stop xsk zc xmit */
> > +   tun_clean_xsk_pool(tfile);
> > +   tun_clean_xsk_pool(ntfile);
> > +
> > 

[PATCH net-next v2 2/2] tcp: add tracing of skbaddr in tcp_event_skb class

2024-03-04 Thread Jason Xing
From: Jason Xing 

Use the existing parameter and print the address of skbaddr
as other trace functions do.

Signed-off-by: Jason Xing 
---
 include/trace/events/tcp.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/include/trace/events/tcp.h b/include/trace/events/tcp.h
index ac36067ae066..6ca3e0343666 100644
--- a/include/trace/events/tcp.h
+++ b/include/trace/events/tcp.h
@@ -362,7 +362,8 @@ DECLARE_EVENT_CLASS(tcp_event_skb,
TP_STORE_ADDR_PORTS_SKB(__entry, skb);
),
 
-   TP_printk("src=%pISpc dest=%pISpc", __entry->saddr, __entry->daddr)
+   TP_printk("skbaddr=%p src=%pISpc dest=%pISpc",
+ __entry->skbaddr, __entry->saddr, __entry->daddr)
 );
 
 DEFINE_EVENT(tcp_event_skb, tcp_bad_csum,
-- 
2.37.3




[PATCH net-next v2 1/2] tcp: add tracing of skb/skaddr in tcp_event_sk_skb class

2024-03-04 Thread Jason Xing
From: Jason Xing 

Printing the addresses can help us identify the exact skb/sk
for those system in which it's not that easy to run BPF program.
As we can see, it already fetches those, then use it directly
and it will print like below:

...tcp_retransmit_skb: skbaddr=XXX skaddr=XXX family=AF_INET...

Signed-off-by: Jason Xing 
--
v2
Link: 
https://lore.kernel.org/netdev/cann89ijcscrakauk1gzzfooo20rtc9ixpij4lsowt5ruac_...@mail.gmail.com/
1. correct the previous description
---
 include/trace/events/tcp.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/include/trace/events/tcp.h b/include/trace/events/tcp.h
index 7b1ddffa3dfc..ac36067ae066 100644
--- a/include/trace/events/tcp.h
+++ b/include/trace/events/tcp.h
@@ -88,7 +88,8 @@ DECLARE_EVENT_CLASS(tcp_event_sk_skb,
  sk->sk_v6_rcv_saddr, sk->sk_v6_daddr);
),
 
-   TP_printk("family=%s sport=%hu dport=%hu saddr=%pI4 daddr=%pI4 
saddrv6=%pI6c daddrv6=%pI6c state=%s",
+   TP_printk("skbaddr=%p skaddr=%p family=%s sport=%hu dport=%hu 
saddr=%pI4 daddr=%pI4 saddrv6=%pI6c daddrv6=%pI6c state=%s",
+ __entry->skbaddr, __entry->skaddr,
  show_family_name(__entry->family),
  __entry->sport, __entry->dport, __entry->saddr, 
__entry->daddr,
  __entry->saddr_v6, __entry->daddr_v6,
-- 
2.37.3




[PATCH net-next v2 0/2] tcp: add two missing addresses when using trace

2024-03-04 Thread Jason Xing
From: Jason Xing 

When I reviewed other people's patch [1], I noticed that similar things
also happen in tcp_event_skb class and tcp_event_sk_skb class. They
don't print those two addrs of skb/sk which already exist.

In this patch, I just do as other trace functions do, like
trace_net_dev_start_xmit(), to know the exact flow or skb we would like
to know in case some systems doesn't support BPF programs well or we
have to use /sys/kernel/debug/tracing only for some reasons.

[1]
Link: 
https://lore.kernel.org/netdev/CAL+tcoAhvFhXdr1WQU8mv_6ZX5nOoNpbOLAB6=c+db-qxq1...@mail.gmail.com/

v2
Link: 
https://lore.kernel.org/netdev/cann89ijcscrakauk1gzzfooo20rtc9ixpij4lsowt5ruac_...@mail.gmail.com/
1. change the description.

Jason Xing (2):
  tcp: add tracing of skb/skaddr in tcp_event_sk_skb class
  tcp: add tracing of skbaddr in tcp_event_skb class

 include/trace/events/tcp.h | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

-- 
2.37.3




Re: [PATCH v6 7/7] Documentation: KVM: Add hypercall for LoongArch

2024-03-04 Thread maobibo




On 2024/3/2 下午5:41, WANG Xuerui wrote:

On 3/2/24 16:47, Bibo Mao wrote:

Add documentation topic for using pv_virt when running as a guest
on KVM hypervisor.

Signed-off-by: Bibo Mao 
---
  Documentation/virt/kvm/index.rst  |  1 +
  .../virt/kvm/loongarch/hypercalls.rst | 79 +++
  Documentation/virt/kvm/loongarch/index.rst    | 10 +++
  3 files changed, 90 insertions(+)
  create mode 100644 Documentation/virt/kvm/loongarch/hypercalls.rst
  create mode 100644 Documentation/virt/kvm/loongarch/index.rst

diff --git a/Documentation/virt/kvm/index.rst 
b/Documentation/virt/kvm/index.rst

index ad13ec55ddfe..9ca5a45c2140 100644
--- a/Documentation/virt/kvm/index.rst
+++ b/Documentation/virt/kvm/index.rst
@@ -14,6 +14,7 @@ KVM
 s390/index
 ppc-pv
 x86/index
+   loongarch/index
 locking
 vcpu-requests
diff --git a/Documentation/virt/kvm/loongarch/hypercalls.rst 
b/Documentation/virt/kvm/loongarch/hypercalls.rst

new file mode 100644
index ..1679e48d67d2
--- /dev/null
+++ b/Documentation/virt/kvm/loongarch/hypercalls.rst
@@ -0,0 +1,79 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+===
+The LoongArch paravirtual interface
+===
+
+KVM hypercalls use the HVCL instruction with code 0x100, and the 
hypercall

+number is put in a0 and up to five arguments may be placed in a1-a5, the
+return value is placed in v0 (alias with a0).


Just say a0: the name v0 is long deprecated (has been the case ever 
since LoongArch got mainlined).



Sure, will modify since you are compiler export :)


+
+The code for that interface can be found in arch/loongarch/kvm/*
+
+Querying for existence
+==
+
+To find out if we're running on KVM or not, cpucfg can be used with 
index
+CPUCFG_KVM_BASE (0x4000), cpucfg range between 0x4000 - 
0x40FF
+is marked as a specially reserved range. All existing and future 
processors

+will not implement any features in this range.
+
+When Linux is running on KVM, cpucfg with index CPUCFG_KVM_BASE 
(0x4000)

+returns magic string "KVM\0"
+
+Once you determined you're running under a PV capable KVM, you can 
now use

+hypercalls as described below.


So this is still the approach similar to the x86 CPUID-based 
implementation. But here the non-privileged behavior isn't specified -- 
I see there is PLV checking in Patch 3 but it's safer to have the 
requirement spelled out here too.


But I still think this approach touches more places than strictly 
needed. As it is currently the case in 
arch/loongarch/kernel/cpu-probe.c, the FEATURES IOCSR is checked for a 
bit IOCSRF_VM that already signifies presence of a hypervisor; if this 
information can be interpreted as availability of the HVCL instruction 
(which I suppose is the case -- a hypervisor can always trap-and-emulate 
in case HVCL isn't provided by hardware), here we can already start 
making calls with HVCL.


We can and should define a uniform interface for probing the hypervisor 
kind, similar to the centrally-managed RISC-V SBI implementation ID 
registry [1]: otherwise future non-KVM hypervisors would have to


1. somehow pretend they are KVM and eventually fail to do so, leading to 
subtle incompatibilities,

2. invent another way of probing for their existence,
3. piggy-back on the current KVM definition, which is inelegant (reading 
the LoongArch-KVM-defined CPUCFG leaf only to find it's not KVM) and 
utterly makes the definition here *not* KVM-specific.


[1]: 
https://github.com/riscv-non-isa/riscv-sbi-doc/blob/v2.0/src/ext-base.adoc


Sorry, I know nothing about riscv. Can you describe how sbi_get_mimpid() 
is implemented in detailed? Is it a simple library or need trap into 
secure mode or need trap into hypervisor mode?



My take on this:

To check if we are running on Linux KVM or not, first check IOCSR 0x8 
(``LOONGARCH_IOCSR_FEATURES``) for bit 11 (``IOCSRF_VM``); we are 
running under a hypervisor if the bit is set. Then invoke ``HVCL 0`` to 
find out the hypervisor implementation ID; a return value in ``$a0`` of 
0x004d564b (``KVM\0``) means Linux KVM, in which case the rest of the 
convention applies.


I do not think so. `HVCL 0` requires that hypercall ABIs need be unified 
for all hypervisors. Instead it is not necessary, each hypervisor can 
has its own hypercall ABI.



+
+KVM hypercall ABI
+=
+
+Hypercall ABI on KVM is simple, only one scratch register a0 (v0) and 
at most
+five generic registers used as input parameter. FP register and 
vector register
+is not used for input register and should not be modified during 
hypercall.
+Hypercall function can be inlined since there is only one scratch 
register.


It should be pointed out explicitly that on hypercall return all 
Well, return value description will added. What do think about the 
meaning of return value for KVM_HCALL_FUNC_PV_IPI hypercall?  The number 
of CPUs with IPI delivered 

Re: [PATCH net-next 1/2] tcp: add tracing of skb/skaddr in tcp_event_sk_skb class

2024-03-04 Thread Jason Xing
On Mon, Mar 4, 2024 at 4:24 PM Eric Dumazet  wrote:
>
> On Thu, Feb 29, 2024 at 6:10 PM Jason Xing  wrote:
> >
> > From: Jason Xing 
> >
> > Prio to this patch, the trace function doesn't print addresses
> > which might be forgotten. As we can see, it already fetches
> > those, use it directly and it will print like below:
> >
> > ...tcp_retransmit_skb: skbaddr=XXX skaddr=XXX family=AF_INET...
>
> It was not forgotten, but a recommendation from Alexei
>
> https://yhbt.net/lore/all/20171010173821.6djxyvrggvaiv...@ast-mbp.dhcp.thefacebook.com/

Thanks, Eric.

Oh, good to see the old threads. I have to change the description. The
other reason why I chose to add address messages is we do have some
useful information when using trace ss another thread[1] does: it also
ships the trace with address printing.

[1]: 
https://lore.kernel.org/netdev/20240304034632.GA21357@didi-ThinkCentre-M920t-N000/

Thanks,
Jason



Re: [PATCH net-next v2] tcp: Add skb addr and sock addr to arguments of tracepoint tcp_probe.

2024-03-04 Thread yuanli fu
Eric Dumazet  于2024年3月4日周一 16:13写道:
>
> On Mon, Mar 4, 2024 at 4:46 AM fuyuanli  wrote:
> >
> > It is useful to expose skb addr and sock addr to user in tracepoint
> > tcp_probe, so that we can get more information while monitoring
> > receiving of tcp data, by ebpf or other ways.
> >
> > For example, we need to identify a packet by seq and end_seq when
> > calculate transmit latency between lay 2 and lay 4 by ebpf, but which is
>
> Please use "layer 2 and layer 4".
>
> > not available in tcp_probe, so we can only use kprobe hooking
> > tcp_rcv_esatblised to get them. But we can use tcp_probe directly if skb
> > addr and sock addr are available, which is more efficient.
>
> Okay, but please fix the typo. Correct function name is tcp_rcv_established
>
> >
> > Signed-off-by: fuyuanli 
> > Link: 
> > https://lore.kernel.org/netdev/20240229052813.GA23899@didi-ThinkCentre-M920t-N000/
> >
>
OK, I will submit a v3 patch, thanks for your review :)
fuyuanli



Re: [PATCH net-next 1/2] tcp: add tracing of skb/skaddr in tcp_event_sk_skb class

2024-03-04 Thread Eric Dumazet
On Thu, Feb 29, 2024 at 6:10 PM Jason Xing  wrote:
>
> From: Jason Xing 
>
> Prio to this patch, the trace function doesn't print addresses
> which might be forgotten. As we can see, it already fetches
> those, use it directly and it will print like below:
>
> ...tcp_retransmit_skb: skbaddr=XXX skaddr=XXX family=AF_INET...

It was not forgotten, but a recommendation from Alexei

https://yhbt.net/lore/all/20171010173821.6djxyvrggvaiv...@ast-mbp.dhcp.thefacebook.com/



Re: [PATCH net-next 0/2] add two missing addresses when using trace

2024-03-04 Thread Eric Dumazet
On Mon, Mar 4, 2024 at 8:01 AM Jason Xing  wrote:
>
> On Fri, Mar 1, 2024 at 1:10 AM Jason Xing  wrote:
> >
> > From: Jason Xing 
> >
> > When I reviewed other people's patch [1], I noticed that similar thing
> > also happens in tcp_event_skb class and tcp_event_sk_skb class. They
> > don't print those two addrs of skb/sk which already exist.
> >
> > They are probably forgotten by the original authors, so this time I
> > finish the work. Also, adding more trace about the socket/skb addr can
> > help us sometime even though the chance is minor.
> >
> > I don't consider it is a bug, thus I chose to target net-next tree.
>
> Gentle ping...No rush. Just in case this simple patchset was lost for
> some reason.

This was a conscious choice I think.

https://yhbt.net/lore/all/20171010173821.6djxyvrggvaiv...@ast-mbp.dhcp.thefacebook.com/



Re: [PATCH net-next v2] tcp: Add skb addr and sock addr to arguments of tracepoint tcp_probe.

2024-03-04 Thread Eric Dumazet
On Mon, Mar 4, 2024 at 4:46 AM fuyuanli  wrote:
>
> It is useful to expose skb addr and sock addr to user in tracepoint
> tcp_probe, so that we can get more information while monitoring
> receiving of tcp data, by ebpf or other ways.
>
> For example, we need to identify a packet by seq and end_seq when
> calculate transmit latency between lay 2 and lay 4 by ebpf, but which is

Please use "layer 2 and layer 4".

> not available in tcp_probe, so we can only use kprobe hooking
> tcp_rcv_esatblised to get them. But we can use tcp_probe directly if skb
> addr and sock addr are available, which is more efficient.

Okay, but please fix the typo. Correct function name is tcp_rcv_established

>
> Signed-off-by: fuyuanli 
> Link: 
> https://lore.kernel.org/netdev/20240229052813.GA23899@didi-ThinkCentre-M920t-N000/
>



Re: [RFC PATCH v3 1/5] dt-bindings: mfd: add entry for Marvell 88PM886 PMIC

2024-03-04 Thread Krzysztof Kozlowski
On 03/03/2024 11:04, Karel Balej wrote:
> From: Karel Balej 
> 


> +examples:
> +  - |
> +#include 
> +i2c {
> +  #address-cells = <1>;
> +  #size-cells = <0>;
> +  pmic@30 {
> +compatible = "marvell,88pm886-a1";
> +reg = <0x30>;
> +interrupts = <0 4 IRQ_TYPE_LEVEL_HIGH>;
> +interrupt-parent = <>;
> +wakeup-source;
> +
> +regulators {
> +  ldo2: ldo2 {
> +regulator-min-microvolt = <310>;
> +regulator-max-microvolt = <330>;
> +};

Messed indentation here and in following lines..

Reviewed-by: Krzysztof Kozlowski 

Best regards,
Krzysztof