subject:"Re\:"

Re: [PATCH] kallsyms, livepatch: Fix livepatch with CONFIG_LTO_CLANG

2024-06-07 Thread Miroslav Benes

Hi,

On Tue, 4 Jun 2024, Song Liu wrote:

> With CONFIG_LTO_CLANG, the compiler may postfix symbols with .llvm.
> to avoid symbol duplication. scripts/kallsyms.c sorted the symbols
> without these postfixes. The default symbol lookup also removes these
> postfixes before comparing symbols.
> 
> On the other hand, livepatch need to look up symbols with the full names.
> However, calling kallsyms_on_each_match_symbol with full name (with the
> postfix) cannot find the symbol(s). As a result, we cannot livepatch
> kernel functions with .llvm. postfix or kernel functions that use
> relocation information to symbols with .llvm. postfixes.
> 
> Fix this by calling kallsyms_on_each_match_symbol without the postfix;
> and then match the full name (with postfix) in klp_match_callback.
> 
> Signed-off-by: Song Liu 
> ---
>  include/linux/kallsyms.h | 13 +
>  kernel/kallsyms.c| 21 -
>  kernel/livepatch/core.c  | 32 +++-
>  3 files changed, 60 insertions(+), 6 deletions(-)

I do not like much that something which seems to be kallsyms-internal is 
leaked out. You need to export cleanup_symbol_name() and there is now a 
lot of code outside. I would feel much more comfortable if it is all 
hidden from kallsyms users and kept there. Would it be possible?

Moreover, isn't there a similar problem for ftrace, kprobes, ebpf,...?

Thank you,
Miroslav

Re: [PATCH v2] rpmsg: qcom_smd: Improve error handling for qcom_smd_parse_edge

2024-06-07 Thread Dmitry Baryshkov

On Thu, Jun 06, 2024 at 09:01:36PM +0200, Luca Weiss wrote:
> When the mailbox driver has not probed yet, the error message "failed to
> parse smd edge" is just going to confuse users, so improve the error
> prints a bit.
> 
> Cover the last remaining exits from qcom_smd_parse_edge with proper
> error prints, especially the one for the mbox_chan deserved
> dev_err_probe to handle EPROBE_DEFER nicely. And add one for ipc_regmap
> also to be complete.
> 
> With this done, we can remove the outer print completely.
> 
> Signed-off-by: Luca Weiss 
> ---
> Changes in v2:
> - Rebase on qcom for-next, drop dts patches which have been applied
> - Improve error printing situation (Bjorn)
> - Link to v1: 
> https://lore.kernel.org/r/20240424-apcs-mboxes-v1-0-6556c47cb...@z3ntu.xyz
> ---
>  drivers/rpmsg/qcom_smd.c | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 

Reviewed-by: Dmitry Baryshkov 


-- 
With best wishes
Dmitry

Re: [PATCH 1/3] rust: add static_call support

2024-06-07 Thread Miguel Ojeda

On Fri, Jun 7, 2024 at 12:52 PM Peter Zijlstra  wrote:
>
> I'm sorry, but 30+ years of reading ! as NOT (or factorial) isn't going
> to go away. So I'm reading your macros do NOT rule.

It makes it clear what is macro call or not. They could have gone for
UPPERCASE names (for instance), yes. On the other hand, they do not
work like C macros and are ~hygienic, so it also makes sense to avoid
confusion here.

I mean, I am not suggesting to do a CPP-pass to Rust files, but if
someone really, really wanted to mix them in a single file, it would
be nice to not confuse the two kinds. :)

Generally they feel "closer" to the language (given what they
do/support) compared to the CPP ones, so it also makes sense they
don't "shout" so much compared to UPPERCASE, if that makes sense.

> The above exaple fails, because the next token is :ident, whatever the
> heck that might be. Also, extra points for line-noise due to lack of
> whitespace.

$name:ident means "match what Rust would consider an identifier here
and call it $name for the purposes of this macro".

So, for instance, $x:ident matches:

a
a2
a_b

But it would not match:

2a
a-b
a _b

For the usual reasons why those are not identifiers.

https://godbolt.org/z/G7v4j67dc

fn f(x: i32) -> i32 {
x * 2
}

macro_rules! f {
($x:ident) => { $x * 2 }
}

fn main() {
let a = 42;

let b = f(a);   // Function.
let c = f!(a);  // Macro.

//let c = f!(a2);   // Works, but the variable does not exist.
//let c = f!(2a);   // Error: no rules expected the token `2a`.

//let c = f!(a_b);  // Works, but the variable does not exist.
//let c = f!(a-b);  // Error: no rules expected the token `-`.
//let c = f!(a_ b); // Error: no rules expected the token `b`.

println!("{a} {b} {c}");
}

I hope this makes it clearer.

> You just need to extend the rust thing to be able to consume C header
> files.

I agree, because in practice it is quite useful for a language like
Rust that consuming C header files is "natively" supported.

Though it also has downsides and is a big decision, which is why, like
Alice mentioned, some people agree, and some people don't.
Nevertheless, we have been doing our best for a long time to get the
best we can for the kernel -- just 2 days ago we told the Rust project
in one of our meetings that it would be nice to see that particular
"project goal" from that document realized (among others).

Cheers,
Miguel

Re: [PATCH 1/3] rust: add static_call support

2024-06-07 Thread Peter Zijlstra

On Fri, Jun 07, 2024 at 09:43:29AM +, Alice Ryhl wrote:
> Peter Zijlstra  wrote:
> > On Thu, Jun 06, 2024 at 09:09:00PM +0200, Miguel Ojeda wrote:
> > > On Thu, Jun 6, 2024 at 7:19 PM Peter Zijlstra  
> > > wrote:
> > > >
> > > > This is absolutely unreadable gibberish -- how am I supposed to keep
> > > > this in sync with the rest of the static_call infrastructure?
> > > 
> > > Yeah, they are macros, which look different from "normal" Rust code.
> > 
> > Macros like CPP ?
> 
> Yes, this patch series uses declarative macros, which are the closest
> that Rust has to the C preprocessor. They are powerful, but just like
> CPP, they can become pretty complicated and hard to read if you are
> doing something non-trivial.
> 
> The macro_rules! block is how you define a new declarative macro.

I'm sorry, but 30+ years of reading ! as NOT (or factorial) isn't going
to go away. So I'm reading your macros do NOT rule.

> The ($name:ident($($args:expr),* $(,)?)) part defines the arguments to
> the declarative macro. This syntax means:
> 
> 1. The input starts with any identifier, which we call $name.
> 2. Then there must be a ( token.

The above exaple fails, because the next token is :ident, whatever the
heck that might be. Also, extra points for line-noise due to lack of
whitespace.

> So for example, you might invoke the macro like this:
> 
> static_call!(tp_func_my_tracepoint(__data,  my_task_struct));

static_call NOT (blah dog blah);

> Inside the macro, you will see things such as:
> $crate::macros::paste! { $crate::bindings:: [<__SCK__ $name >]; }
> 
> The Rust syntax for invoking a macro has an exclamation mark after the

Like I said before, the creator of Rust must've been an esoteric
language freak and must've wanted to make this unreadable on purpose :/

Also, why the white space beteen the :: scope operator and the [< thing?
that's just weird. I would then expect the output to be:

  ...::bindings:: __SCK__my_static_key

> name, so you know that $crate::macros::paste is a macro. The `paste`
> macro just emits its input unchanged, except that any identifiers
> between [< and >] are concatenated into a single identifier. So if $name
> is my_static_key, then the above invocation of paste! emits:
> 
>   $crate::bindings::__SCK__my_static_key;

But it doesn't, so it isn't unmodified, it seems to strip whitespace.

> The $crate::bindings module is where the output of bindgen goes, so this
> should correspond to the C symbol called __SCK__my_static_key.
> 
> > > Is there something we could do to help here? I think Alice and others
> > > would be happy to explain how it works and/or help maintain it in the
> > > future if you prefer.
> > 
> > Write a new language that looks more like C -- pretty please ? :-)
> > 
> > Mostly I would just really like you to just use arm/jump_label.h,
> > they're inline functions and should be doable with IR, no weirdo CPP
> > involved in this case.
> 
> I assume that you're referring to static_key_false here? I don't think
> that function can be exposed using IR because it passes the function
> argument called key as an "i" argument to an inline assembly block. Any
> attempt to compile static_key_false without knowing the value of key at
> compile time will surely fail to compile with the
> 
>   invalid operand for inline asm constraint 'i'
> 
> error.

You can have clang read the header files and compile them into
Intermediate-Representation, and have it splice the lot into the Rust
crap's IR and voila, compile time.

You just need to extend the rust thing to be able to consume C header
files.

Re: [PATCH 1/3] rust: add static_call support

2024-06-07 Thread Alice Ryhl

Peter Zijlstra  wrote:
> On Thu, Jun 06, 2024 at 09:09:00PM +0200, Miguel Ojeda wrote:
> > On Thu, Jun 6, 2024 at 7:19 PM Peter Zijlstra  wrote:
> > >
> > > This is absolutely unreadable gibberish -- how am I supposed to keep
> > > this in sync with the rest of the static_call infrastructure?
> > 
> > Yeah, they are macros, which look different from "normal" Rust code.
> 
> Macros like CPP ?

Yes, this patch series uses declarative macros, which are the closest
that Rust has to the C preprocessor. They are powerful, but just like
CPP, they can become pretty complicated and hard to read if you are
doing something non-trivial.

The macro_rules! block is how you define a new declarative macro.

The ($name:ident($($args:expr),* $(,)?)) part defines the arguments to
the declarative macro. This syntax means:

1. The input starts with any identifier, which we call $name.
2. Then there must be a ( token.
3. Then the $(),* part defines a repetition group. The contents inside
   the parenthesises are what is being repeated. The comma means that
   repetitions are separated by commas. The asterisk means that the
   contents may be repeated any number of times, including zero times.
   (Alternatives are + which means repeated at least once, and ? which
   means that the contents may be repeated zero or one time.)
4. The contents of the repetition group will be an expression, which we
   call $args.
5. After the last expression, we have $(,)? which means that you can
   have zero or one commas after the last expression. Rust usually
   allows trailing commas in lists, which is why I included it.
6. And finally, you must close the input with a ) token.

So for example, you might invoke the macro like this:

static_call!(tp_func_my_tracepoint(__data,  my_task_struct));

Here, $name will be tp_func_my_tracepoint, and the repetition group is
repeated twice, with $args first corresponding to `__data` and then to
` my_task_struct` when the macro is expanded. The $(,)? group is
repeated zero times.

Inside the macro, you will see things such as:
$crate::macros::paste! { $crate::bindings:: [<__SCK__ $name >]; }

The Rust syntax for invoking a macro has an exclamation mark after the
name, so you know that $crate::macros::paste is a macro. The `paste`
macro just emits its input unchanged, except that any identifiers
between [< and >] are concatenated into a single identifier. So if $name
is my_static_key, then the above invocation of paste! emits:

$crate::bindings::__SCK__my_static_key;

The $crate::bindings module is where the output of bindgen goes, so this
should correspond to the C symbol called __SCK__my_static_key.

> > Is there something we could do to help here? I think Alice and others
> > would be happy to explain how it works and/or help maintain it in the
> > future if you prefer.
> 
> Write a new language that looks more like C -- pretty please ? :-)
> 
> Mostly I would just really like you to just use arm/jump_label.h,
> they're inline functions and should be doable with IR, no weirdo CPP
> involved in this case.

I assume that you're referring to static_key_false here? I don't think
that function can be exposed using IR because it passes the function
argument called key as an "i" argument to an inline assembly block. Any
attempt to compile static_key_false without knowing the value of key at
compile time will surely fail to compile with the

invalid operand for inline asm constraint 'i'

error.

Alice

Re: [PATCH] livepatch: introduce klp_func called interface

2024-06-07 Thread Miroslav Benes

Hi,

On Tue, 4 Jun 2024, Song Liu wrote:

> On Tue, May 21, 2024 at 1:04 AM Petr Mladek  wrote:
> [...]
> > >
> > > Yes, but the information you get is limited compared to what is available
> > > now. You would obtain the information that a patched function was called
> > > but ftrace could also give you the context and more.
> >
> > Another motivation to use ftrace for testing is that it does not
> > affect the performance in production.
> >
> > We should keep klp_ftrace_handler() as fast as possible so that we
> > could livepatch also performance sensitive functions.
> 
> At LPC last year, we discussed about adding a counter to each
> klp_func, like:
> 
> struct klp_func {
> ...
> u64 __percpu counter;
> ...
> };
> 
> With some static_key (+ sysctl), this should give us a way to estimate
> the overhead of livepatch. If we have the counter, this patch is not
> needed any more. Does this (adding the counter) sound like
> something we still want to pursue?

It would be better than this patch but given what was mentioned in the 
thread I wonder if it is possible to use ftrace even for this. See 
/sys/kernel/tracing/trace_stat/function*. It already gathers the number of 
hits.

Would it be sufficient for you? I guess it depends on what the intention 
is. If there is no time limit, klp_func.counter might be better to provide 
some kind of overall statistics (but I am not sure if it has any value) 
and to avoid having ftrace registered on a live patched function for 
infinite period of time. If the intention is to gather data for some 
limited period, trace_stat sounds like much better approach to me.

Regards
Miroslav

Re: [PATCH 2/2] ring-buffer: Fix a race between readers and resize checks

2024-06-07 Thread Petr Pavlu

On 5/28/24 01:43, Steven Rostedt wrote:
> On Mon, 27 May 2024 11:36:55 +0200
> Petr Pavlu  wrote:
> 
>>>>  static void rb_check_pages(struct ring_buffer_per_cpu *cpu_buffer)
>>>>  {
>>>> @@ -2200,8 +2205,13 @@ int ring_buffer_resize(struct trace_buffer *buffer, 
>>>> unsigned long size,
>>>> */
>>>>synchronize_rcu();
>>>>for_each_buffer_cpu(buffer, cpu) {
>>>> +  unsigned long flags;
>>>> +
>>>>cpu_buffer = buffer->buffers[cpu];
>>>> +  raw_spin_lock_irqsave(_buffer->reader_lock, flags);
>>>>rb_check_pages(cpu_buffer);
>>>> +  raw_spin_unlock_irqrestore(_buffer->reader_lock,
>>>> + flags);  
>>>
>>> Putting my RT hat on, I really don't like the above fix. The
>>> rb_check_pages() iterates all subbuffers which makes the time interrupts
>>> are disabled non-deterministic.  
>>
>> I see, this applies also to the same rb_check_pages() validation invoked
>> from ring_buffer_read_finish().
>>
>>>
>>> Instead, I would rather have something where we disable readers while we do
>>> the check, and re-enable them.
>>>
>>> raw_spin_lock_irqsave(_buffer->reader_lock, flags);
>>> cpu_buffer->read_disabled++;
>>> raw_spin_unlock_irqrestore(_buffer->reader_lock, 
>>> flags);
>>>
>>> // Also, don't put flags on a new line. We are allow to go 100 characters 
>>> now.  
>>
>> Noted.
>>
>>>
>>>
>>> rb_check_pages(cpu_buffer);
>>> raw_spin_lock_irqsave(_buffer->reader_lock, flags);
>>> cpu_buffer->read_disabled--;
>>> raw_spin_unlock_irqrestore(_buffer->reader_lock, 
>>> flags);
>>>
>>> Or something like that. Yes, that also requires creating a new
>>> "read_disabled" field in the ring_buffer_per_cpu code.  
>>
>> I think this would work but I'm personally not immediately sold on this
>> approach. If I understand the idea correctly, readers should then check
>> whether cpu_buffer->read_disabled is set and bail out with some error if
>> that is the case. The rb_check_pages() function is only a self-check
>> code and as such, I feel it doesn't justify disrupting readers with
>> a new error condition and adding more complex locking.
> 
> Honestly, this code was never made for more than one reader per
> cpu_buffer. I'm perfectly fine if all check_pages() causes other
> readers to the same per_cpu buffer to get -EBUSY.
> 
> Do you really see this being a problem? What use case is there for
> hitting the check_pages() and reading the same cpu buffer at the same
> time?

My main issue is with added complexity to check the new read_disabled
flag. The rb_check_pages() part is simple and you showed what to do. The
readers part is what I struggle with.

I think the read_disabled flag needs to be either checked once in
rb_get_reader_page() where the actual problem with making the list
temporarily inconsistent exists. Or alternatively, it can be checked by
direct or indirect users of rb_get_reader_page() just after they take
the reader_lock.

Looking at the final rb_get_reader_page() function, it currently always
returns a valid reader page unless the buffer doesn't contain any
additional entry or a serious problem is detected by RB_WARN_ON()
checks. This is simple to handle for callers, either they get a reader
page and can continue, or they stop.

Returning -EBUSY means that callers have a new case that they need to
decide what to do about. They need to propagate the error up the chain
or attempt to handle it. This involves ring-buffer functions
rb_advance_reader(), rb_buffer_peek(), ring_buffer_peek(),
ring_buffer_consume(), ring_buffer_read_page()
ring_buffer_map_get_reader() and their callers from other source files.

It is possible to handle this new case in these functions but I'm not
sure if adding this logic is justified. I feel I must have misunderstood
something how it should work?

I was further thinking about alternatives that would possibly provide
a less thorough check but have their complexity limited only to
rb_check_pages(). The already mentioned idea is to have the function to
look only at surrounding nodes where some change in the list occurred.

Another option could be to try traversing the whole list in smaller
parts and give up the reader_lock in between them. This would need some
care to make sure that the operation completes, e.g. the code would need
to bail out if it detects a change on cpu_buffer->pages_read.

Thanks,
Petr

Re: [PATCH 0/2] livepatch: Add compiler optimization disclaimer/docs

2024-06-07 Thread Miroslav Benes

Hi,

On Fri, 31 May 2024, Joe Lawrence wrote:

> On 5/31/24 07:23, Miroslav Benes wrote:
> > Hi,
> > 
> > On Tue, 21 Jul 2020, Joe Lawrence wrote:
> > 
> >> In light of [PATCH] Revert "kbuild: use -flive-patching when
> >> CONFIG_LIVEPATCH is enabled" [1], we should add some loud disclaimers
> >> and explanation of the impact compiler optimizations have on
> >> livepatching.
> >>
> >> The first commit provides detailed explanations and examples.  The list
> >> was taken mostly from Miroslav's LPC talk a few years back.  This is a
> >> bit rough, so corrections and additional suggestions welcome.  Expanding
> >> upon the source-based patching approach would be helpful, too.
> >>
> >> The second commit adds a small README.rst file in the livepatch samples
> >> directory pointing the reader to the doc introduced in the first commit.
> >>
> >> I didn't touch the livepatch kselftests yet as I'm still unsure about
> >> how to best account for IPA here.  We could add the same README.rst
> >> disclaimer here, too, but perhaps we have a chance to do something more.
> >> Possibilities range from checking for renamed functions as part of their
> >> build, or the selftest scripts, or even adding something to the kernel
> >> API.  I think we'll have a better idea after reviewing the compiler
> >> considerations doc.
> > 
> > thanks to Marcos for resurrecting this.
> > 
> > Joe, do you have an updated version by any chance? Some things have 
> > changed since July 2020 so it calls for a new review. If there was an 
> > improved version, it would be easier. If not, no problem at all.
> > 
> 
> Yea, it's been a little while :) I don't have any newer version than
> this one.  I can rebase,  apply all of the v1 suggestions, and see where
> it stands.  LMK if you can think of any specifics that could be added.

I will walk through the patches first to see if there is something which 
can/should be changed given the development since then.

> For example, CONFIG_KERNEL_IBT will be driving some changes soon,
> whether it be klp-convert for source-based patches or vmlinux.o binary
> comparison for kpatch-build.

True.

> I can push a v2 with a few changes, but IIRC, last time we reviewed
> this, it kinda begged the question of how someone is creating the
> livepatch in the first place.  As long as we're fine holding that
> thought for a while longer, this doc may still be useful by itself.

If I remember correctly, the conclusion was that this doc was beneficial 
on its own.

Miroslav

Re: [PATCH v2 1/1] dt-bindings: remoteproc: imx_rproc: add minItems for power-domain

2024-06-07 Thread Krzysztof Kozlowski

On 06/06/2024 17:00, Frank Li wrote:
> "fsl,imx8qxp-cm4" and "fsl,imx8qm-cm4" need minimum 2 power domains. Keep
> the same restriction for other compatible string.
> 
> Signed-off-by: Frank Li 
> ---
> 
> Notes:
> Change from v1 to v2
> - set minitem to 2 at top
> - Add imx8qm compatible string also
> - use not logic to handle difference compatible string restriction
> - update commit message.
> 
> pass dt_binding_check.
> 
> make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- -j8  dt_binding_check 
> DT_SCHEMA_FILES=fsl,imx-rproc.yaml
>   SCHEMA  Documentation/devicetree/bindings/processed-schema.json
>   CHKDT   Documentation/devicetree/bindings
>   LINTDocumentation/devicetree/bindings
>   DTEX
> Documentation/devicetree/bindings/remoteproc/fsl,imx-rproc.example.dts
>   DTC_CHK 
> Documentation/devicetree/bindings/remoteproc/fsl,imx-rproc.example.dtb
> 
>  .../bindings/remoteproc/fsl,imx-rproc.yaml | 14 ++
>  1 file changed, 14 insertions(+)
> 
> diff --git a/Documentation/devicetree/bindings/remoteproc/fsl,imx-rproc.yaml 
> b/Documentation/devicetree/bindings/remoteproc/fsl,imx-rproc.yaml
> index df36e29d974ca..da108a39df435 100644
> --- a/Documentation/devicetree/bindings/remoteproc/fsl,imx-rproc.yaml
> +++ b/Documentation/devicetree/bindings/remoteproc/fsl,imx-rproc.yaml
> @@ -59,6 +59,7 @@ properties:
>  maxItems: 32
>  
>power-domains:
> +minItems: 2
>  maxItems: 8
>  
>fsl,auto-boot:
> @@ -99,6 +100,19 @@ allOf:
>properties:
>  fsl,iomuxc-gpr: false
>  
> +  - if:
> +  properties:
> +compatible:
> +  not:
> +contains:
> +  enum:
> +- fsl,imx8qxp-cm4
> +- fsl,imx8qm-cm4
> +then:
> +  properties:
> +power-domains:
> +  minItems: 8

What happened with the "else:"? How many power domains is needed for
other devices?

Best regards,
Krzysztof

Re: [PATCH v2 1/2] dt-bindings: soc: qcom,smsm: Allow specifying mboxes instead of qcom,ipc

2024-06-07 Thread Krzysztof Kozlowski

On 06/06/2024 21:18, Luca Weiss wrote:
> The qcom,ipc-N properties are essentially providing a reference to a
> mailbox, so allow using the mboxes property to do the same in a more
> structured way.
> 
> Since multiple SMSM hosts are supported, we need to be able to provide
> the correct mailbox for each host. The old qcom,ipc-N properties map to
> the mboxes property by index, starting at 0 since that's a valid SMSM
> host also.
> 
> Mark the older qcom,ipc-N as deprecated and update the example with
> mboxes.
> 
> Signed-off-by: Luca Weiss 
> ---

Reviewed-by: Krzysztof Kozlowski 

Best regards,
Krzysztof

Re: [PATCH 1/5] dt-bindings: remoteproc: qcom,sm8550-pas: document the SDX75 PAS

2024-06-07 Thread Naina Mehta


On 6/6/2024 8:19 PM, Krzysztof Kozlowski wrote:

On 06/06/2024 16:38, Naina Mehta wrote:

Document the MPSS Peripheral Authentication Service on SDX75 platform.

Signed-off-by: Naina Mehta 
---
  .../devicetree/bindings/remoteproc/qcom,sm8550-pas.yaml  | 1 +
  1 file changed, 1 insertion(+)

diff --git a/Documentation/devicetree/bindings/remoteproc/qcom,sm8550-pas.yaml 
b/Documentation/devicetree/bindings/remoteproc/qcom,sm8550-pas.yaml
index 73fda7565cd1..02e15b1f78ab 100644
--- a/Documentation/devicetree/bindings/remoteproc/qcom,sm8550-pas.yaml
+++ b/Documentation/devicetree/bindings/remoteproc/qcom,sm8550-pas.yaml
@@ -16,6 +16,7 @@ description:
  properties:
compatible:
  enum:
+  - qcom,sdx75-mpss-pas
- qcom,sm8550-adsp-pas
- qcom,sm8550-cdsp-pas
- qcom,sm8550-mpss-pas


Missing updates to allOf constraints. Are you sure this is the binding
for SDX75?
SDX75 diverged from SDX55 due to introduction of separate DTB firmware 
binary and addition of DSM and Qlink logging memory regions. Considering 
these additions, SM8550 PAS bindings seem closest.
Thanks for catching updates required for allOf constraints. I will add 
sdx75 compatible in-line with qcom,sm8650-mpss-pas.

Please let me know if you recommend creating new bindings for SDX75.

Regards,
Naina



Best regards,
Krzysztof

Re: [PATCH] net: missing check

2024-06-07 Thread kernel test robot

Hi Denis,

kernel test robot noticed the following build warnings:

[auto build test WARNING on linus/master]
[also build test WARNING on v6.10-rc2 next-20240607]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:
https://github.com/intel-lab-lkp/linux/commits/Denis-Arefev/net-missing-check/20240606-230540
base:   linus/master
patch link:
https://lore.kernel.org/r/20240606141450.44709-1-arefev%40swemel.ru
patch subject: [PATCH] net: missing check
config: x86_64-randconfig-121-20240607 
(https://download.01.org/0day-ci/archive/20240607/202406071404.oilhfohm-...@intel.com/config)
compiler: gcc-13 (Ubuntu 13.2.0-4ubuntu3) 13.2.0
reproduce (this is a W=1 build): 
(https://download.01.org/0day-ci/archive/20240607/202406071404.oilhfohm-...@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot 
| Closes: 
https://lore.kernel.org/oe-kbuild-all/202406071404.oilhfohm-...@intel.com/

sparse warnings: (new ones prefixed by >>)
   drivers/net/virtio_net.c: note: in included file:
>> include/linux/virtio_net.h:103:58: sparse: sparse: incorrect type in 
>> argument 2 (different base types) @@ expected unsigned long long 
>> [usertype] divisor @@ got restricted __virtio16 const [usertype] 
>> gso_size @@
   include/linux/virtio_net.h:103:58: sparse: expected unsigned long long 
[usertype] divisor
   include/linux/virtio_net.h:103:58: sparse: got restricted __virtio16 
const [usertype] gso_size
>> include/linux/virtio_net.h:104:42: sparse: sparse: restricted __virtio16 
>> degrades to integer

vim +103 include/linux/virtio_net.h

49  
50  static inline int virtio_net_hdr_to_skb(struct sk_buff *skb,
51  const struct virtio_net_hdr 
*hdr,
52  bool little_endian)
53  {
54  unsigned int nh_min_len = sizeof(struct iphdr);
55  unsigned int gso_type = 0;
56  unsigned int thlen = 0;
57  unsigned int p_off = 0;
58  unsigned int ip_proto;
59  u64 ret, remainder;
60  
61  if (hdr->gso_type != VIRTIO_NET_HDR_GSO_NONE) {
62  switch (hdr->gso_type & ~VIRTIO_NET_HDR_GSO_ECN) {
63  case VIRTIO_NET_HDR_GSO_TCPV4:
64  gso_type = SKB_GSO_TCPV4;
65  ip_proto = IPPROTO_TCP;
66  thlen = sizeof(struct tcphdr);
67  break;
68  case VIRTIO_NET_HDR_GSO_TCPV6:
69  gso_type = SKB_GSO_TCPV6;
70  ip_proto = IPPROTO_TCP;
71  thlen = sizeof(struct tcphdr);
72  nh_min_len = sizeof(struct ipv6hdr);
73  break;
74  case VIRTIO_NET_HDR_GSO_UDP:
75  gso_type = SKB_GSO_UDP;
76  ip_proto = IPPROTO_UDP;
77  thlen = sizeof(struct udphdr);
78  break;
79  case VIRTIO_NET_HDR_GSO_UDP_L4:
80  gso_type = SKB_GSO_UDP_L4;
81  ip_proto = IPPROTO_UDP;
82  thlen = sizeof(struct udphdr);
83  break;
84  default:
85  return -EINVAL;
86  }
87  
88  if (hdr->gso_type & VIRTIO_NET_HDR_GSO_ECN)
89  gso_type |= SKB_GSO_TCP_ECN;
90  
91  if (hdr->gso_size == 0)
92  return -EINVAL;
93  }
94  
95  skb_reset_mac_header(skb);
96  
97  if (hdr->flags & VIRTIO_NET_HDR_F_NEEDS_CSUM) {
98  u32 start = __virtio16_to_cpu(little_endian, 
hdr->csum_start);
99  u32 off = __virtio16_to_cpu(little_endian, 
hdr->csum_offset);
   100  u32 needed = start + max_t(u32, thlen, off + 
sizeof(__sum16));
   101  
   102  if (hdr->gso_size) {
 > 103  ret = div64_u64_rem(skb->len, hdr->gso_size, 
 > );
 > 104  if (!(ret && (hdr->gso_size > needed) &&
   105  ((remainder > needed) 
|| (remainder == 0 {
   106  return -EINVAL;
   107  }
   108  skb_shinfo(skb)->tx_flags |= SKBFL_SHARED_FRAG;
   109  }
   110  
   111  if (!pskb_may_pull(skb, needed))
   112

Re: [PATCH 3/5] arm64: dts: qcom: sdx75: add missing qlink_logging reserved memory for mpss

2024-06-07 Thread Naina Mehta


On 6/6/2024 8:20 PM, Krzysztof Kozlowski wrote:

On 06/06/2024 16:38, Naina Mehta wrote:

The qlink_logging memory region is also used by the modem firmware,
add it to reserved memory regions.
Also split MPSS DSM region into 2 separate regions.

Signed-off-by: Naina Mehta 
---
  arch/arm64/boot/dts/qcom/sdx75.dtsi | 17 +
  1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/boot/dts/qcom/sdx75.dtsi 
b/arch/arm64/boot/dts/qcom/sdx75.dtsi
index 9b93f6501d55..9349b1c4e196 100644
--- a/arch/arm64/boot/dts/qcom/sdx75.dtsi
+++ b/arch/arm64/boot/dts/qcom/sdx75.dtsi
@@ -366,7 +366,12 @@
no-map;
};
  
-		qdss_mem: qdss@8880 {

+   qdss_mem: qdss@8850 {
+   reg = <0x0 0x8850 0x0 0x30>;
+   no-map;
+   };
+
+   qlink_logging_mem: qlink_logging@8880 {


Sorry, no downstream code.

Please follow DTS coding style - no underscores in node names. This
applies to all work sent upstream.



Thanks for pointing this out. I will update in next revision.

Regards,
Naina




Best regards,
Krzysztof

Re: [PATCH] nvdimm: Fix devs leaks in scan_labels()

2024-06-06 Thread Zhijian Li (Fujitsu)



On 07/06/2024 00:49, Ira Weiny wrote:
> Li Zhijian wrote:
>> Don't allocate devs again when it's valid pointer which has pionted to
>> the memory allocated above with size (count + 2 * sizeof(dev)).
>>
>> A kmemleak reports:
>> unreferenced object 0x88800dda1980 (size 16):
>>comm "kworker/u10:5", pid 69, jiffies 4294671781
>>hex dump (first 16 bytes):
>>  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
>>backtrace (crc 0):
>>  [] __kmalloc+0x32c/0x470
>>  [<9ed43c83>] nd_region_register_namespaces+0x6fb/0x1120 
>> [libnvdimm]
>>  [<0e07a65c>] nd_region_probe+0xfe/0x210 [libnvdimm]
>>  [<7b79ce5f>] nvdimm_bus_probe+0x7a/0x1e0 [libnvdimm]
>>  [] really_probe+0xc6/0x390
>>  [<129e2a69>] __driver_probe_device+0x78/0x150
>>  [<2dfed28b>] driver_probe_device+0x1e/0x90
>>  [] __device_attach_driver+0x85/0x110
>>  [<32dca295>] bus_for_each_drv+0x85/0xe0
>>  [<391c5a7d>] __device_attach+0xbe/0x1e0
>>  [<26dabec0>] bus_probe_device+0x94/0xb0
>>  [] device_add+0x656/0x870
>>  [<3d69bfaa>] nd_async_device_register+0xe/0x50 [libnvdimm]
>>  [<3f4c52a4>] async_run_entry_fn+0x2e/0x110
>>  [] process_one_work+0x1ee/0x600
>>  [<6d90d5a9>] worker_thread+0x183/0x350
>>
>> Fixes: 1b40e09a1232 ("libnvdimm: blk labels and namespace instantiation")
>> Signed-off-by: Li Zhijian 
>> ---
>>   drivers/nvdimm/namespace_devs.c | 4 +++-
>>   1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/nvdimm/namespace_devs.c 
>> b/drivers/nvdimm/namespace_devs.c
>> index d6d558f94d6b..56b016dbe307 100644
>> --- a/drivers/nvdimm/namespace_devs.c
>> +++ b/drivers/nvdimm/namespace_devs.c
>> @@ -1994,7 +1994,9 @@ static struct device **scan_labels(struct nd_region 
>> *nd_region)
>>  /* Publish a zero-sized namespace for userspace to configure. */
>>  nd_mapping_free_labels(nd_mapping);
>>   
>> -devs = kcalloc(2, sizeof(dev), GFP_KERNEL);
>> +/* devs probably has been allocated */
> 
> I don't think this is where the bug is.  The loop above is processing the
> known labels and should exit with a count > 0 if devs is not NULL.
> 
>  From what I can tell create_namespace_pmem() must be returning EAGAIN
> which leaves devs allocated but fails to increment count.  Thus there are
> no valid labels but devs was not free'ed.

Per the piece of the code, return EAGAIN and ENODEV could cause this issue in 
theory.

1980 dev = create_namespace_pmem(nd_region, nd_mapping, 
nd_label);
1981 if (IS_ERR(dev)) {
1982 switch (PTR_ERR(dev)) {
1983 case -EAGAIN:
1984 /* skip invalid labels */
1985 continue;
1986 case -ENODEV:
1987 /* fallthrough to seed creation */
1988 break;
1989 default:
1990 goto err;
1991 }
1992 } else
1993 devs[count++] = dev;


> 
> Can you trace the error you are seeing a bit more to see if this is the
> case?


I just tried, but I cannot reproduce this leaking again.
When it happened(100% reproduce at that time), I probably had a corrupted LSA(I 
see empty
output with command 'ndctl list'). It seemed the QEMU emulated Nvdimm device 
was broken
for some reasons.


Thanks
Zhijian


> 
> Thanks,
> Ira
> 
>> +if (!devs)
>> +devs = kcalloc(2, sizeof(dev), GFP_KERNEL);
>>  if (!devs)
>>  goto err;
>>   
>> -- 
>> 2.29.2
>>
> 
>

Re: [PATCH] livepatch: introduce klp_func called interface

2024-06-06 Thread zhang warden




> On Jun 6, 2024, at 23:01, Joe Lawrence  wrote:
> 
> Hi Wardenjohn,
> 
> To follow up, Red Hat kpatch QE pointed me toward this test:
> 
> https://gitlab.com/redhat/centos-stream/tests/kernel/kernel-tests/-/tree/main/general/kpatch/kpatch-trace?ref_type=heads
> 
> which reports a few interesting things via systemd service and ftrace:
> 
> - Patched functions
> - Traced functions
> - Code that was modified, but didn't run
> - Other code that ftrace saw, but is not listed in the sysfs directory
> 
> The code looks to be built on top of the same basic ftrace commands that
> I sent the other day.  You may be able to reuse/copy/etc bits from the
> scripts if they are handy.
> 
> --
> Joe
> 

Thank you so much, you really helped me, Joe!

I will try to learn the script and make it suitable for our system.

Again, thank you, Joe!

Regards,
Wardenjohn

Re: [PATCH] nvdimm: Fix devs leaks in scan_labels()

2024-06-06 Thread Zhijian Li (Fujitsu)



On 07/06/2024 00:27, Dave Jiang wrote:
> 
> 
> On 6/3/24 8:16 PM, Li Zhijian wrote:
>> Don't allocate devs again when it's valid pointer which has pionted to
>> the memory allocated above with size (count + 2 * sizeof(dev)).
>>
>> A kmemleak reports:
>> unreferenced object 0x88800dda1980 (size 16):
>>comm "kworker/u10:5", pid 69, jiffies 4294671781
>>hex dump (first 16 bytes):
>>  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
>>backtrace (crc 0):
>>  [] __kmalloc+0x32c/0x470
>>  [<9ed43c83>] nd_region_register_namespaces+0x6fb/0x1120 
>> [libnvdimm]
>>  [<0e07a65c>] nd_region_probe+0xfe/0x210 [libnvdimm]
>>  [<7b79ce5f>] nvdimm_bus_probe+0x7a/0x1e0 [libnvdimm]
>>  [] really_probe+0xc6/0x390
>>  [<129e2a69>] __driver_probe_device+0x78/0x150
>>  [<2dfed28b>] driver_probe_device+0x1e/0x90
>>  [] __device_attach_driver+0x85/0x110
>>  [<32dca295>] bus_for_each_drv+0x85/0xe0
>>  [<391c5a7d>] __device_attach+0xbe/0x1e0
>>  [<26dabec0>] bus_probe_device+0x94/0xb0
>>  [] device_add+0x656/0x870
>>  [<3d69bfaa>] nd_async_device_register+0xe/0x50 [libnvdimm]
>>  [<3f4c52a4>] async_run_entry_fn+0x2e/0x110
>>  [] process_one_work+0x1ee/0x600
>>  [<6d90d5a9>] worker_thread+0x183/0x350
>>
>> Fixes: 1b40e09a1232 ("libnvdimm: blk labels and namespace instantiation")
>> Signed-off-by: Li Zhijian 
>> ---
>>   drivers/nvdimm/namespace_devs.c | 4 +++-
>>   1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/nvdimm/namespace_devs.c 
>> b/drivers/nvdimm/namespace_devs.c
>> index d6d558f94d6b..56b016dbe307 100644
>> --- a/drivers/nvdimm/namespace_devs.c
>> +++ b/drivers/nvdimm/namespace_devs.c
>> @@ -1994,7 +1994,9 @@ static struct device **scan_labels(struct nd_region 
>> *nd_region)
>>  /* Publish a zero-sized namespace for userspace to configure. */
>>  nd_mapping_free_labels(nd_mapping);
>>   
>> -devs = kcalloc(2, sizeof(dev), GFP_KERNEL);
>> +/* devs probably has been allocated */
>> +if (!devs)
>> +devs = kcalloc(2, sizeof(dev), GFP_KERNEL);
> 
> This changes the behavior of this code and possibly corrupting the previously 
> allocated memory at times when 'devs' is valid.

If we reach here, that means count == 0 is true. so we can deduce that
the previously allocated memory was not touched at all in previous loop.
and its size was (2 * sizeof(dev)) too.

Was the 'devs' leaked out from the previous loop and should be freed instead?

this option is also fine to me. we can also fix this by free(devs) before 
allocate it again.:

+ kfree(devs); // kfree(NULL) is safe.
   devs = kcalloc(2, sizeof(dev), GFP_KERNEL);


Thanks
Zhijian

> 
>>  if (!devs)
>>  goto err;
>>   
>

Re: [PATCH v14 02/14] cgroup/misc: Add per resource callbacks for CSS events

2024-06-06 Thread chenridong

I think it is better when _misc_cg_res_alloc fails, it just calls 
_misc_cg_res_free(cg, index)(add index parameter, it means ending of 
iterator), so it can avoid calling ->free() that do not call ->alloc().


And in misc_cg_free, just call _misc_cg_res_free(cg, MISC_CG_RES_TYPES)  
to free all.



Thanks

Ridong


On 2024/6/6 22:51, Haitao Huang wrote:
On Thu, 06 Jun 2024 08:37:31 -0500, chenridong  
wrote:




  If _misc_cg_res_alloc fails, maybe some types do not call 
->alloc(), but all types ->free() callback >will be called, is that ok?


Not sure I understand. Are you suggesting we ignore failures from 
->alloc() callback in _misc_cg_res_alloc() as it is per-resource, and 
have ->free() callback and resource provider of the failing type to 
handle the failure internally?


IIUC, this failure only happens when a specific subcgroup is created 
(memory running out for allocation) so failing that subcgroup as a 
whole seems fine to me. Note the root node is static and no 
pre-resource callbacks invoked by misc. And resource provider handles 
its own allocations for root. In SGX case we too declare a static 
object for corresponding root sgx_cgroup struct.


Note also misc cgroup (except for setting capacity[res] = 0 at root) 
is all or nothing so no mechanism to tell user "this resource does not 
work but others are fine in this particular cgroup."


Thanks
Haitao

Re: [PATCH 3/3] rust: add tracepoint support

2024-06-06 Thread Boqun Feng

On Thu, Jun 06, 2024 at 09:29:51PM +0200, Peter Zijlstra wrote:
> On Thu, Jun 06, 2024 at 12:00:36PM -0700, Boqun Feng wrote:
> > On Thu, Jun 06, 2024 at 07:35:44PM +0200, Peter Zijlstra wrote:
> > > On Thu, Jun 06, 2024 at 08:49:06AM -0700, Boqun Feng wrote:
> > > 
> > > > Long-term plan is to 1) compile the C helpers in some IR and 2) inline
> > > > the helpers with Rust in IR-level, as what Gary has:
> > > > 
> > > > 
> > > > https://lore.kernel.org/rust-for-linux/20240529202817.3641974-1-g...@garyguo.net/
> > > 
> > > Urgh, that still needs us to maintain that silly list of helpers :-/
> > > 
> > 
> > But it's an improvement from the current stage, right? ;-)
> 
> Somewhat, but only marginal.
> 
> > > Can't we pretty please have clang parse the actual header files into IR
> > > and munge that into rust? So that we don't get to manually duplicate
> > > everything+dog.
> > 
> > That won't always work, because some of our kernel APIs are defined as
> > macros, and I don't think it's a trivial job to generate a macro
> > definition to a function definition so that it can be translated to
> > something in IR. We will have to do the macro -> function mapping
> > ourselves somewhere, if we want to inline the API across languages.
> 
> We can try and see how far we can get with moving a bunch of stuff into
> inlines. There's quite a bit of simple CPP that could be inlines or
> const objects I suppose.
> 

We can, but I'd first stick with what we have, improve it and make it
stable until we go to the next stage. Plus, there's benefit of keeping
an explicit helper list: it's clear what APIs are called by Rust, and
moreover, it's easier to modify the helpers if you were to change an
API, other than chasing where Rust code calls it. (Don't make me wrong,
I'm happy if you want to do that ;-))

Regards,
Boqun

> Things like the tracepoints are of course glorious CPP abuse and are
> never going to work.
> 
> But perhaps you can have an explicit 'eval-CPP on this here' construct
> or whatnot. If I squit I see this paste! thingy (WTF's up with that !
> operator?) to munge function names in the static_call thing. So
> something like apply CPP from over there on this here can also be done
> :-)

Re: [PATCH v3 08/13] tracing: Add option to use memmapped memory for trace boot instance

2024-06-06 Thread Steven Rostedt



Memory management folks. Please review this patch.

Specifically the "map_pages()" function below.

On Thu, 06 Jun 2024 17:17:43 -0400
Steven Rostedt  wrote:

> From: "Steven Rostedt (Google)" 
> 
> Add an option to the trace_instance kernel command line parameter that
> allows it to use the reserved memory from memmap boot parameter.
> 
>   memmap=12M$0x28450 trace_instance=boot_mapped@0x28450:12M
> 
> The above will reserves 12 megs at the physical address 0x28450.
> The second parameter will create a "boot_mapped" instance and use the
> memory reserved as the memory for the ring buffer.
> 
> That will create an instance called "boot_mapped":
> 
>   /sys/kernel/tracing/instances/boot_mapped
> 
> Note, because the ring buffer is using a defined memory ranged, it will
> act just like a memory mapped ring buffer. It will not have a snapshot
> buffer, as it can't swap out the buffer. The snapshot files as well as any
> tracers that uses a snapshot will not be present in the boot_mapped
> instance.
> 
> Cc: linux...@kvack.org
> Signed-off-by: Steven Rostedt (Google) 
> ---
>  .../admin-guide/kernel-parameters.txt |  9 +++
>  kernel/trace/trace.c  | 75 +--
>  2 files changed, 78 insertions(+), 6 deletions(-)
> 
> diff --git a/Documentation/admin-guide/kernel-parameters.txt 
> b/Documentation/admin-guide/kernel-parameters.txt
> index b600df82669d..ff26b6094e79 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -6754,6 +6754,15 @@
>   the same thing would happen if it was left off). The 
> irq_handler_entry
>   event, and all events under the "initcall" system.
>  
> + If memory has been reserved (see memmap for x86), the 
> instance
> + can use that memory:
> +
> + memmap=12M$0x28450 
> trace_instance=boot_map@0x28450:12M
> +
> + The above will create a "boot_map" instance that uses 
> the physical
> + memory at 0x28450 that is 12Megs. The per CPU 
> buffers of that
> + instance will be split up accordingly.
> +
>   trace_options=[option-list]
>   [FTRACE] Enable or disable tracer options at boot.
>   The option-list is a comma delimited list of options
> diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
> index 622fe670949d..13e89023f33b 100644
> --- a/kernel/trace/trace.c
> +++ b/kernel/trace/trace.c
> @@ -9504,6 +9504,31 @@ static int instance_mkdir(const char *name)
>   return ret;
>  }
>  
> +static u64 map_pages(u64 start, u64 size)
> +{
> + struct page **pages;
> + phys_addr_t page_start;
> + unsigned int page_count;
> + unsigned int i;
> + void *vaddr;
> +
> + page_count = DIV_ROUND_UP(size, PAGE_SIZE);
> +
> + page_start = start;
> + pages = kmalloc_array(page_count, sizeof(struct page *), GFP_KERNEL);
> + if (!pages)
> + return 0;
> +
> + for (i = 0; i < page_count; i++) {
> + phys_addr_t addr = page_start + i * PAGE_SIZE;
> + pages[i] = pfn_to_page(addr >> PAGE_SHIFT);
> + }
> + vaddr = vmap(pages, page_count, VM_MAP, PAGE_KERNEL);
> + kfree(pages);
> +
> + return (u64)(unsigned long)vaddr;
> +}

If for some reason the memmap=nn$ss fails, but this still gets called,
will the above just map over any memory. That is, is it possible that
the kernel could have used this memory?

Is there a way to detect this? That is, I don't want this to succeed if
the memory location it's about to map to is used by the kernel, or will
be used by user space.

-- Steve


> +
>  /**
>   * trace_array_get_by_name - Create/Lookup a trace array, given its name.
>   * @name: The name of the trace array to be looked up/created.
> @@ -10350,6 +10375,7 @@ __init static void enable_instances(void)
>  {
>   struct trace_array *tr;
>   char *curr_str;
> + char *name;
>   char *str;
>   char *tok;
>  
> @@ -10358,19 +10384,56 @@ __init static void enable_instances(void)
>   str = boot_instance_info;
>  
>   while ((curr_str = strsep(, "\t"))) {
> + unsigned long start = 0;
> + unsigned long size = 0;
> + unsigned long addr = 0;
>  
>   tok = strsep(_str, ",");
> + name = strsep(, "@");
> + if (tok) {
> + start = memparse(tok, );
> + if (!start) {
> + pr_warn("Tracing: Invalid boot instance address 
> for %s\n",
> + name);
> + continue;
> + }
> + }
>  
> - if (IS_ENABLED(CONFIG_TRACER_MAX_TRACE))
> - do_allocate_snapshot(tok);
> + if (start) {
> + if (*tok

Re: [PATCH v4] drivers: remoteproc: xlnx: add attach detach support

2024-06-06 Thread Mathieu Poirier

On Wed, Jun 05, 2024 at 12:45:17PM -0500, Tanmay Shah wrote:
> 
> 
> On 6/4/24 10:34 AM, Mathieu Poirier wrote:
> 
> Hi Mathieu,
> 
> Thanks for reviews.
> Please find my comments below.
> 
> > Hi Tanmay,
> > 
> > On Mon, Jun 03, 2024 at 01:34:38PM -0700, Tanmay Shah wrote:
> >> It is possible that remote processor is already running before
> >> linux boot or remoteproc platform driver probe. Implement required
> >> remoteproc framework ops to provide resource table address and
> >> connect or disconnect with remote processor in such case.
> >> 
> >> Signed-off-by: Tanmay Shah 
> >> 
> >> ---
> >> Changes in v4:
> >>   - Move change log out of commit text
> >> 
> >> Changes in v3:
> >> 
> >>   - Drop SRAM patch from the series
> >>   - Change type from "struct resource_table *" to void __iomem *
> >>   - Change comment format from /** to /*
> >>   - Remove unmap of resource table va address during detach, allowing
> >> attach-detach-reattach use case.
> >>   - Unmap rsc_data_va after retrieving resource table data structure.
> >>   - Unmap resource table va during driver remove op
> >> 
> >> Changes in v2:
> >> 
> >>   - Fix typecast warnings reported using sparse tool.
> >>   - Fix following sparse warnings:
> >> 
> >> drivers/remoteproc/xlnx_r5_remoteproc.c:827:21: sparse: warning: incorrect 
> >> type in assignment (different address spaces)
> >> drivers/remoteproc/xlnx_r5_remoteproc.c:844:18: sparse: warning: incorrect 
> >> type in assignment (different address spaces)
> >> drivers/remoteproc/xlnx_r5_remoteproc.c:898:24: sparse: warning: incorrect 
> >> type in argument 1 (different address spaces)
> >>  drivers/remoteproc/xlnx_r5_remoteproc.c | 173 +++-
> >>  1 file changed, 169 insertions(+), 4 deletions(-)
> >> 
> >> diff --git a/drivers/remoteproc/xlnx_r5_remoteproc.c 
> >> b/drivers/remoteproc/xlnx_r5_remoteproc.c
> >> index 84243d1dff9f..6898d4761566 100644
> >> --- a/drivers/remoteproc/xlnx_r5_remoteproc.c
> >> +++ b/drivers/remoteproc/xlnx_r5_remoteproc.c
> >> @@ -25,6 +25,10 @@
> >>  /* RX mailbox client buffer max length */
> >>  #define MBOX_CLIENT_BUF_MAX   (IPI_BUF_LEN_MAX + \
> >> sizeof(struct zynqmp_ipi_message))
> >> +
> >> +#define RSC_TBL_XLNX_MAGIC((uint32_t)'x' << 24 | (uint32_t)'a' << 
> >> 16 | \
> >> +   (uint32_t)'m' << 8 | (uint32_t)'p')
> >> +
> >>  /*
> >>   * settings for RPU cluster mode which
> >>   * reflects possible values of xlnx,cluster-mode dt-property
> >> @@ -73,6 +77,26 @@ struct mbox_info {
> >>struct mbox_chan *rx_chan;
> >>  };
> >>  
> >> +/**
> >> + * struct rsc_tbl_data
> >> + *
> >> + * Platform specific data structure used to sync resource table address.
> >> + * It's important to maintain order and size of each field on remote side.
> >> + *
> >> + * @version: version of data structure
> >> + * @magic_num: 32-bit magic number.
> >> + * @comp_magic_num: complement of above magic number
> >> + * @rsc_tbl_size: resource table size
> >> + * @rsc_tbl: resource table address
> >> + */
> >> +struct rsc_tbl_data {
> >> +  const int version;
> >> +  const u32 magic_num;
> >> +  const u32 comp_magic_num;
> > 
> > I thought we agreed on making the magic number a u64 - did I get this wrong?
> > 
> 
> Looks like I missed this comment from previous reviews, so didn't address it.
> Thanks for pointing this.
> 
> So I think having two 32-bit numbers with proper name, implies what is 
> expected and less chance of errors.
> With 64-bit number, it's easy to create errors when assigning magic number.
> 
> However, if 64-bit number is preferred, I will change it and test it.
> Please let me know.

I was under the impression we had agreed on a u64 but that may just be my own
interpretation.  Things can stay as they are now.

> 
> 
> >> +  const u32 rsc_tbl_size;
> >> +  const uintptr_t rsc_tbl;
> >> +} __packed;
> >> +
> >>  /*
> >>   * Hardcoded TCM bank values. This will stay in driver to maintain 
> >> backward
> >>   * compatibility with device-tree that does not have TCM information.
> >> @@ -95,20 +119,24 @@ static const struct mem_bank_data 
> >> zynqmp_tcm_banks_lockstep[] = {
> >>  /**
> >>   * struct zynqmp_r5_core
> >>   *
> >> + * @rsc_tbl_va: resource table virtual address
> >>   * @dev: device of RPU instance
> >>   * @np: device node of RPU instance
> >>   * @tcm_bank_count: number TCM banks accessible to this RPU
> >>   * @tcm_banks: array of each TCM bank data
> >>   * @rproc: rproc handle
> >> + * @rsc_tbl_size: resource table size retrieved from remote
> >>   * @pm_domain_id: RPU CPU power domain id
> >>   * @ipi: pointer to mailbox information
> >>   */
> >>  struct zynqmp_r5_core {
> >> +  void __iomem *rsc_tbl_va;
> >>struct device *dev;
> >>struct device_node *np;
> >>int tcm_bank_count;
> >>struct mem_bank_data **tcm_banks;
> >>struct rproc *rproc;
> >> +  u32 rsc_tbl_size;
> >>u32 pm_domain_id;
> >>struct mbox_info *ipi;
> >>  };

Re: [PATCH v4] drivers: remoteproc: xlnx: add attach detach support

2024-06-06 Thread Mathieu Poirier

On Wed, 5 Jun 2024 at 11:47, Tanmay Shah  wrote:
>
>
>
> On 6/4/24 3:23 PM, Bjorn Andersson wrote:
> > On Mon, Jun 03, 2024 at 01:34:38PM -0700, Tanmay Shah wrote:
> >> It is possible that remote processor is already running before
> >> linux boot or remoteproc platform driver probe. Implement required
> >> remoteproc framework ops to provide resource table address and
> >> connect or disconnect with remote processor in such case.
> >>
> >
> > I know if changes the current style for this driver, but could we drop
> > "drivers: " from the subject prefix, to align changes to this driver
> > with others?
> >
>
> Ack. Will be fixed. Is it convention not to have "drivers" ?

"drivers" is not needed.

> If so, from now on, I will format subject prefix: /:
>

That will be just fine.

> > Regards,
> > Bjorn
>

Re: [PATCH 1/3] rust: add static_call support

2024-06-06 Thread Peter Zijlstra

On Thu, Jun 06, 2024 at 09:09:00PM +0200, Miguel Ojeda wrote:
> On Thu, Jun 6, 2024 at 7:19 PM Peter Zijlstra  wrote:
> >
> > This is absolutely unreadable gibberish -- how am I supposed to keep
> > this in sync with the rest of the static_call infrastructure?
> 
> Yeah, they are macros, which look different from "normal" Rust code.

Macros like CPP ?

> Is there something we could do to help here? I think Alice and others
> would be happy to explain how it works and/or help maintain it in the
> future if you prefer.

Write a new language that looks more like C -- pretty please ? :-)

Mostly I would just really like you to just use arm/jump_label.h,
they're inline functions and should be doable with IR, no weirdo CPP
involved in this case.

Re: [PATCH 3/3] rust: add tracepoint support

2024-06-06 Thread Peter Zijlstra

On Thu, Jun 06, 2024 at 12:00:36PM -0700, Boqun Feng wrote:
> On Thu, Jun 06, 2024 at 07:35:44PM +0200, Peter Zijlstra wrote:
> > On Thu, Jun 06, 2024 at 08:49:06AM -0700, Boqun Feng wrote:
> > 
> > > Long-term plan is to 1) compile the C helpers in some IR and 2) inline
> > > the helpers with Rust in IR-level, as what Gary has:
> > > 
> > >   
> > > https://lore.kernel.org/rust-for-linux/20240529202817.3641974-1-g...@garyguo.net/
> > 
> > Urgh, that still needs us to maintain that silly list of helpers :-/
> > 
> 
> But it's an improvement from the current stage, right? ;-)

Somewhat, but only marginal.

> > Can't we pretty please have clang parse the actual header files into IR
> > and munge that into rust? So that we don't get to manually duplicate
> > everything+dog.
> 
> That won't always work, because some of our kernel APIs are defined as
> macros, and I don't think it's a trivial job to generate a macro
> definition to a function definition so that it can be translated to
> something in IR. We will have to do the macro -> function mapping
> ourselves somewhere, if we want to inline the API across languages.

We can try and see how far we can get with moving a bunch of stuff into
inlines. There's quite a bit of simple CPP that could be inlines or
const objects I suppose.

Things like the tracepoints are of course glorious CPP abuse and are
never going to work.

But perhaps you can have an explicit 'eval-CPP on this here' construct
or whatnot. If I squit I see this paste! thingy (WTF's up with that !
operator?) to munge function names in the static_call thing. So
something like apply CPP from over there on this here can also be done
:-)

Re: [PATCH] remoteproc: mediatek: Don't print error when optional scp handle is missing

2024-06-06 Thread Nícolas F . R . A . Prado

On Thu, Jun 06, 2024 at 12:50:56PM +0200, AngeloGioacchino Del Regno wrote:
> Il 05/06/24 21:35, Nícolas F. R. A. Prado ha scritto:
> > The scp_get() helper has two users: the mtk-vcodec and the mtk-mdp3
> > drivers. mdp3 considers the mediatek,scp phandle optional, and when it's
> > missing mdp3 will directly look for the scp node based on compatible.
> > 
> > For that reason printing an error in the helper when the handle is
> > missing is wrong, as it's not always an actual error. Besides, the other
> > user, vcodec, already prints an error message on its own when the helper
> > fails so this message doesn't add that much information.
> > 
> > Remove the error message from the helper. This gets rid of the deceptive
> > error on MT8195-Tomato:
> > 
> >mtk-mdp3 14001000.dma-controller: can't get SCP node
> 
> I'm way more for giving it a SCP handle instead of silencing this print... the
> SCP handle way *is* the correct one and I prefer it, as that'd also be the 
> only
> way to support Dual-Core SCP in MDP3.
> 
> I really want to see hardcoded stuff disappear and I really really *really* 
> want
> to see more consistency across DT nodes in MediaTek platforms.
> 
> So, still a one-line change, but do it on the devicetree instead of here 
> please.

Sure. So the missing scp phandle case is maintained exclusively for
backwards-compatibility. And we're ok that the old DTs will keep getting the
error.

I'll send a v2 adding the phandle in the DT instead.

Thanks,
Nícolas

Re: [PATCH 1/3] rust: add static_call support

2024-06-06 Thread Miguel Ojeda

On Thu, Jun 6, 2024 at 7:19 PM Peter Zijlstra  wrote:
>
> This is absolutely unreadable gibberish -- how am I supposed to keep
> this in sync with the rest of the static_call infrastructure?

Yeah, they are macros, which look different from "normal" Rust code.

Is there something we could do to help here? I think Alice and others
would be happy to explain how it works and/or help maintain it in the
future if you prefer.

Cheers,
Miguel

Re: [PATCH v3 4/6] module: script to generate offset ranges for builtin modules

2024-06-06 Thread Kris Van Hees

On Tue, May 21, 2024 at 01:53:27AM +0900, Masahiro Yamada wrote:
> On Fri, May 17, 2024 at 1:31???PM Kris Van Hees  
> wrote:
> >
> > The offset range data for builtin modules is generated using:
> >  - modules.builtin.modinfo: associates object files with module names
> >  - vmlinux.map: provides load order of sections and offset of first member
> > per section
> >  - vmlinux.o.map: provides offset of object file content per section
> >  - .*.cmd: build cmd file with KBUILD_MODFILE and KBUILD_MODNAME
> >
> > The generated data will look like:
> >
> > .text - = _text
> > .text baf0-cb10 amd_uncore
> > .text 0009bd10-0009c8e0 iosf_mbi
> > ...
> > .text 008e6660-008e9630 snd_soc_wcd_mbhc
> > .text 008e9630-008ea610 snd_soc_wcd9335 snd_soc_wcd934x snd_soc_wcd938x
> > .text 008ea610-008ea780 snd_soc_wcd9335
> > ...
> > .data - = _sdata
> > .data f020-f680 amd_uncore
> >
> > For each ELF section, it lists the offset of the first symbol.  This can
> > be used to determine the base address of the section at runtime.
> >
> > Next, it lists (in strict ascending order) offset ranges in that section
> > that cover the symbols of one or more builtin modules.  Multiple ranges
> > can apply to a single module, and ranges can be shared between modules.
> >
> > Signed-off-by: Kris Van Hees 
> > Reviewed-by: Nick Alcock 
> > ---
> > Changes since v2:
> >  - 1st arg to generate_builtin_ranges.awk is now modules.builtin.modinfo
> >  - Switched from using modules.builtin.objs to parsing .*.cmd files
> >  - Parse data from .*.cmd in generate_builtin_ranges.awk
> > ---
> >  scripts/generate_builtin_ranges.awk | 232 
> >  1 file changed, 232 insertions(+)
> >  create mode 100755 scripts/generate_builtin_ranges.awk
> >
> > diff --git a/scripts/generate_builtin_ranges.awk 
> > b/scripts/generate_builtin_ranges.awk
> > new file mode 100755
> > index 0..6975a9c7266d9
> > --- /dev/null
> > +++ b/scripts/generate_builtin_ranges.awk
> > @@ -0,0 +1,232 @@
> > +#!/usr/bin/gawk -f
> > +# SPDX-License-Identifier: GPL-2.0
> > +# generate_builtin_ranges.awk: Generate address range data for builtin 
> > modules
> > +# Written by Kris Van Hees 
> > +#
> > +# Usage: generate_builtin_ranges.awk modules.builtin.modinfo vmlinux.map \
> > +#  vmlinux.o.map > modules.builtin.ranges
> > +#
> > +
> > +BEGIN {
> > +   # modules.builtin.modinfo uses \0 as record separator
> > +   # All other files use \n.
> > +   RS = "[\n\0]";
> > +}
> 
> 
> Why do you want to parse modules.builtin.modinfo
> instead of modules.builtin?
> 
> modules.builtin uses \n separator.

Oh my, I completely overlooked modules.builtin.  Thank you!  That is indeed
much easier.

> > +
> > +# Return the module name(s) (if any) associated with the given object.
> > +#
> > +# If we have seen this object before, return information from the cache.
> > +# Otherwise, retrieve it from the corresponding .cmd file.
> > +#
> > +function get_module_info(fn, mod, obj, mfn, s) {
> 
> 
> There are 5 arguments, while the caller passes only 1 argument
> ( get_module_info($4) )

That is the way to be able to have local variables in an AWK function - every
variable mentioned in the function declaration is local to the function.  It
is an oddity in AWK.

> > +   if (fn in omod)
> > +   return omod[fn];
> > +
> > +   if (match(fn, /\/[^/]+$/) == 0)
> > +   return "";
> > +
> > +   obj = fn;
> > +   mod = "";
> > +   mfn = "";
> > +   fn = substr(fn, 1, RSTART) "." substr(fn, RSTART + 1) ".cmd";
> > +   if (getline s  > +   if (match(s, /DKBUILD_MODNAME=[^ ]+/) > 0) {
> > +   mod = substr(s, RSTART + 17, RLENGTH - 17);
> > +   gsub(/['"]/, "", mod);
> > +   gsub(/:/, " ", mod);
> > +   }
> > +
> > +   if (match(s, /DKBUILD_MODFILE=[^ ]+/) > 0) {
> > +   mfn = substr(s, RSTART + 17, RLENGTH - 17);
> > +   gsub(/['"]/, "", mfn);
> > +   gsub(/:/, " ", mfn);
> > +   }
> > +   }
> > +   close(fn);
> > +
> > +# tmp = $0;
> > +# $0 = s;
> > +# print mod " " mfn " " obj " " $NF;
> > +# $0 = tmp;
> > +
> > +   # A single module (common case) also reflects objects that are not 
> > part
> > +   # of a module.  Some of those objects have names that are also a 
> > module
> > +   # name (e.g. core).  We check the associated module file name, and 
> > if
> > +   # they do not match, the object is not part of a module.
> 
> 
> You do not need to use KBUILD_MODNAME.
> 
> Just use KBUILD_MODFILE only.
> If the same path is found in modules.builtin,
> it is a built-in module.
> 
> Its basename is modname.

Yes, that is true.  I can do all this based on KBUILD_MODFILE.  Thank you.
Adjusting the patch that way.

> One more question in a corner case.
> 
> How does your

Re: [PATCH 3/3] rust: add tracepoint support

2024-06-06 Thread Boqun Feng

On Thu, Jun 06, 2024 at 07:35:44PM +0200, Peter Zijlstra wrote:
> On Thu, Jun 06, 2024 at 08:49:06AM -0700, Boqun Feng wrote:
> 
> > Long-term plan is to 1) compile the C helpers in some IR and 2) inline
> > the helpers with Rust in IR-level, as what Gary has:
> > 
> > 
> > https://lore.kernel.org/rust-for-linux/20240529202817.3641974-1-g...@garyguo.net/
> 
> Urgh, that still needs us to maintain that silly list of helpers :-/
> 

But it's an improvement from the current stage, right? ;-)

> Can't we pretty please have clang parse the actual header files into IR
> and munge that into rust? So that we don't get to manually duplicate
> everything+dog.

That won't always work, because some of our kernel APIs are defined as
macros, and I don't think it's a trivial job to generate a macro
definition to a function definition so that it can be translated to
something in IR. We will have to do the macro -> function mapping
ourselves somewhere, if we want to inline the API across languages.

Regards,
Boqun

Re: [PATCH v2 2/5] ftrace: Remove "ftrace_hash" parameter from __ftrace_hash_rec_update()

2024-06-06 Thread Steven Rostedt

On Thu, 6 Jun 2024 18:53:12 +0100
Mark Rutland  wrote:

> On Wed, Jun 05, 2024 at 02:03:36PM -0400, Steven Rostedt wrote:
> > From: "Steven Rostedt (Google)" 
> > 
> > While adding comments to the function __ftrace_hash_rec_update() and
> > trying to describe in detail what the parameter for "ftrace_hash" does, I
> > realized that it basically does exactly the same thing (but differently)
> > if it is set or not!  
> 
> Typo: "ftrace_hash" should be "filter_hash", and likewise in the commit
> title.

Let me go fix up linux-next :-p

> 
> > If it is set, the idea was the ops->filter_hash was being updated, and the
> > code should focus on the functions that are in the ops->filter_hash and
> > add them. But it still had to pay attention to the functions in the
> > ops->notrace_hash, to ignore them.
> > 
> > If it was cleared, it focused on the ops->notrace_hash, and would add
> > functions that were not in the ops->notrace_hash but would still keep
> > functions in the "ops->filter_hash". Basically doing the same thing.
> > 
> > In reality, the __ftrace_hash_rec_update() only needs to either remove the
> > functions associated to the give ops (if "inc" is set) or remove them (if
> > "inc" is cleared). It has to pay attention to both the filter_hash and
> > notrace_hash regardless.  
> 
> AFAICT, we actually remove a latent bug here, but that bug wasn't
> reachable because we never call __ftrace_hash_rec_update() with
> "filter_hash" clear.
> 
> Before this patch, if we did call __ftrace_hash_rec_update() with
> "filter_hash" clear, we'd setup:
> 
>   all = false;
>   ...
>   if (filter_hash) {
>   ...
>   } else {
>   hash = ops->func_hash->notrace_hash;
>   other_hash = ops->func_hash->filter_hash;
>   }
> 
> ... and then at the tail of the loop where we do:
> 
>   count++;
> 
>   [...] 
> 
>   /* Shortcut, if we handled all records, we are done. */
>   if (!all && count == hash->count) {
>   pr_info("HARK: stopping after %d recs\n", count);
>   return update;
>   }
> 
> ... we'd be checking whether we've updated notrace_hash->count entries,
> when that could be smaller than the number of entries we actually need
> to update (e.g. if the notrace hash contains a single entry, but the
> filter_hash contained more).

I noticed this too as well as:

if (filter_hash) {
hash = ops->func_hash->filter_hash;
other_hash = ops->func_hash->notrace_hash;
if (ftrace_hash_empty(hash))
all = true;
} else {
inc = !inc;
hash = ops->func_hash->notrace_hash;
other_hash = ops->func_hash->filter_hash;
/*
 * If the notrace hash has no items,
 * then there's nothing to do.
 */
if (ftrace_hash_empty(hash))
return false;
}

That "return false" is actually a mistake as well. But I tried to hit
it, but found that I could not. I think I updated the code due to bugs
where I prevent that from happening, but the real fix would have been
this patch. :-p

> 
> As above, we never actually hit that because we never call with
> "filter_hash" clear, so it might be good to explicitly say that before
> this patch we never actually call __ftrace_hash_rec_update() with
> "filter_hash" clear, and this is removing dead (and potentially broken)
> code.

Right.

> 
> > Remove the "filter_hash" parameter from __filter_hash_rec_update() and
> > comment the function for what it really is doing.
> > 
> > Signed-off-by: Steven Rostedt (Google)   
> 
> FWIW, this looks good to me, and works in testing, so:
> 
> Reviewed-by: Mark Rutland 
> Tested-by: Mark Rutland 
> 
> I have one comment below, but the above stands regardless.
> 
> [...]
> 
> > +/*
> > + * This is the main engine to the ftrace updates to the dyn_ftrace records.
> > + *
> > + * It will iterate through all the available ftrace functions
> > + * (the ones that ftrace can have callbacks to) and set the flags
> > + * in the associated dyn_ftrace records.
> > + *
> > + * @inc: If true, the functions associated to @ops are added to
> > + *   the dyn_ftrace records, otherwise they are removed.
> > + */
> >  static bool __ftrace_hash_rec_update(struct ftrace_ops *ops,
> > -int filter_hash,
> >  bool inc)
> >  {
> > struct ftrace_hash *hash;
> > -   struct ftrace_hash *other_hash;
> > +   struct ftrace_hash *notrace_hash;
> > struct ftrace_page *pg;
> > struct dyn_ftrace *rec;
> > bool update = false;
> > @@ -1718,35 +1725,16 @@ static bool __ftrace_hash_rec_update(struct 
> > ftrace_ops *ops,
> > return false;
> >  
> > /*
> > -* In the filter_hash case:
> >  *   If the count is zero, we update all records.
> >  *   Otherwise we just update the

Re: [PATCH v2 5/5] ftrace: Add comments to ftrace_hash_move() and friends

2024-06-06 Thread Mark Rutland

On Wed, Jun 05, 2024 at 02:03:39PM -0400, Steven Rostedt wrote:
> From: "Steven Rostedt (Google)" 
> 
> Describe what ftrace_hash_move() does and add some more comments to some
> other functions to make it easier to understand.
> 
> Signed-off-by: Steven Rostedt (Google) 

Acked-by: Mark Rutland 

Mark.

> ---
>  kernel/trace/ftrace.c | 24 +++-
>  1 file changed, 23 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
> index 83f23f8fc26d..ae1603e771c5 100644
> --- a/kernel/trace/ftrace.c
> +++ b/kernel/trace/ftrace.c
> @@ -169,6 +169,7 @@ static inline void ftrace_ops_init(struct ftrace_ops *ops)
>  #endif
>  }
>  
> +/* Call this function for when a callback filters on set_ftrace_pid */
>  static void ftrace_pid_func(unsigned long ip, unsigned long parent_ip,
>   struct ftrace_ops *op, struct ftrace_regs *fregs)
>  {
> @@ -1317,7 +1318,7 @@ static struct ftrace_hash *alloc_ftrace_hash(int 
> size_bits)
>   return hash;
>  }
>  
> -
> +/* Used to save filters on functions for modules not loaded yet */
>  static int ftrace_add_mod(struct trace_array *tr,
> const char *func, const char *module,
> int enable)
> @@ -1429,6 +1430,7 @@ static struct ftrace_hash *__move_hash(struct 
> ftrace_hash *src, int size)
>   return new_hash;
>  }
>  
> +/* Move the @src entries to a newly allocated hash */
>  static struct ftrace_hash *
>  __ftrace_hash_move(struct ftrace_hash *src)
>  {
> @@ -1443,6 +1445,26 @@ __ftrace_hash_move(struct ftrace_hash *src)
>   return __move_hash(src, size);
>  }
>  
> +/**
> + * ftrace_hash_move - move a new hash to a filter and do updates
> + * @ops: The ops with the hash that @dst points to
> + * @enable: True if for the filter hash, false for the notrace hash
> + * @dst: Points to the @ops hash that should be updated
> + * @src: The hash to update @dst with
> + *
> + * This is called when an ftrace_ops hash is being updated and the
> + * the kernel needs to reflect this. Note, this only updates the kernel
> + * function callbacks if the @ops is enabled (not to be confused with
> + * @enable above). If the @ops is enabled, its hash determines what
> + * callbacks get called. This function gets called when the @ops hash
> + * is updated and it requires new callbacks.
> + *
> + * On success the elements of @src is moved to @dst, and @dst is updated
> + * properly, as well as the functions determined by the @ops hashes
> + * are now calling the @ops callback function.
> + *
> + * Regardless of return type, @src should be freed with free_ftrace_hash().
> + */
>  static int
>  ftrace_hash_move(struct ftrace_ops *ops, int enable,
>struct ftrace_hash **dst, struct ftrace_hash *src)
> -- 
> 2.43.0
> 
>

Re: [PATCH v2 4/5] ftrace: Convert "inc" parameter to bool in ftrace_hash_rec_update_modify()

2024-06-06 Thread Mark Rutland

On Wed, Jun 05, 2024 at 02:03:38PM -0400, Steven Rostedt wrote:
> From: "Steven Rostedt (Google)" 
> 
> The parameter "inc" in the function ftrace_hash_rec_update_modify() is
> boolean. Change it to be such.
> 
> Also add documentation to what the function does.
> 
> Signed-off-by: Steven Rostedt (Google) 

Acked-by: Mark Rutland 

Mark.

> ---
>  kernel/trace/ftrace.c | 23 ---
>  1 file changed, 20 insertions(+), 3 deletions(-)
> 
> diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
> index 1a2444199740..83f23f8fc26d 100644
> --- a/kernel/trace/ftrace.c
> +++ b/kernel/trace/ftrace.c
> @@ -1882,7 +1882,24 @@ static bool ftrace_hash_rec_enable(struct ftrace_ops 
> *ops)
>   return __ftrace_hash_rec_update(ops, true);
>  }
>  
> -static void ftrace_hash_rec_update_modify(struct ftrace_ops *ops, int inc)
> +/*
> + * This function will update what functions @ops traces when its filter
> + * changes.
> + *
> + * The @inc states if the @ops callbacks are going to be added or removed.
> + * When one of the @ops hashes are updated to a "new_hash" the dyn_ftrace
> + * records are update via:
> + *
> + * ftrace_hash_rec_disable_modify(ops);
> + * ops->hash = new_hash
> + * ftrace_hash_rec_enable_modify(ops);
> + *
> + * Where the @ops is removed from all the records it is tracing using
> + * its old hash. The @ops hash is updated to the new hash, and then
> + * the @ops is added back to the records so that it is tracing all
> + * the new functions.
> + */
> +static void ftrace_hash_rec_update_modify(struct ftrace_ops *ops, bool inc)
>  {
>   struct ftrace_ops *op;
>  
> @@ -1906,12 +1923,12 @@ static void ftrace_hash_rec_update_modify(struct 
> ftrace_ops *ops, int inc)
>  
>  static void ftrace_hash_rec_disable_modify(struct ftrace_ops *ops)
>  {
> - ftrace_hash_rec_update_modify(ops, 0);
> + ftrace_hash_rec_update_modify(ops, false);
>  }
>  
>  static void ftrace_hash_rec_enable_modify(struct ftrace_ops *ops)
>  {
> - ftrace_hash_rec_update_modify(ops, 1);
> + ftrace_hash_rec_update_modify(ops, true);
>  }
>  
>  /*
> -- 
> 2.43.0
> 
>

Re: [PATCH v2 2/5] ftrace: Remove "ftrace_hash" parameter from __ftrace_hash_rec_update()

2024-06-06 Thread Mark Rutland

On Wed, Jun 05, 2024 at 02:03:36PM -0400, Steven Rostedt wrote:
> From: "Steven Rostedt (Google)" 
> 
> While adding comments to the function __ftrace_hash_rec_update() and
> trying to describe in detail what the parameter for "ftrace_hash" does, I
> realized that it basically does exactly the same thing (but differently)
> if it is set or not!

Typo: "ftrace_hash" should be "filter_hash", and likewise in the commit
title.

> If it is set, the idea was the ops->filter_hash was being updated, and the
> code should focus on the functions that are in the ops->filter_hash and
> add them. But it still had to pay attention to the functions in the
> ops->notrace_hash, to ignore them.
> 
> If it was cleared, it focused on the ops->notrace_hash, and would add
> functions that were not in the ops->notrace_hash but would still keep
> functions in the "ops->filter_hash". Basically doing the same thing.
> 
> In reality, the __ftrace_hash_rec_update() only needs to either remove the
> functions associated to the give ops (if "inc" is set) or remove them (if
> "inc" is cleared). It has to pay attention to both the filter_hash and
> notrace_hash regardless.

AFAICT, we actually remove a latent bug here, but that bug wasn't
reachable because we never call __ftrace_hash_rec_update() with
"filter_hash" clear.

Before this patch, if we did call __ftrace_hash_rec_update() with
"filter_hash" clear, we'd setup:

all = false;
...
if (filter_hash) {
...
} else {
hash = ops->func_hash->notrace_hash;
other_hash = ops->func_hash->filter_hash;
}

... and then at the tail of the loop where we do:

count++;

[...] 

/* Shortcut, if we handled all records, we are done. */
if (!all && count == hash->count) {
pr_info("HARK: stopping after %d recs\n", count);
return update;
}

... we'd be checking whether we've updated notrace_hash->count entries,
when that could be smaller than the number of entries we actually need
to update (e.g. if the notrace hash contains a single entry, but the
filter_hash contained more).

As above, we never actually hit that because we never call with
"filter_hash" clear, so it might be good to explicitly say that before
this patch we never actually call __ftrace_hash_rec_update() with
"filter_hash" clear, and this is removing dead (and potentially broken)
code.

> Remove the "filter_hash" parameter from __filter_hash_rec_update() and
> comment the function for what it really is doing.
> 
> Signed-off-by: Steven Rostedt (Google) 

FWIW, this looks good to me, and works in testing, so:

Reviewed-by: Mark Rutland 
Tested-by: Mark Rutland 

I have one comment below, but the above stands regardless.

[...]

> +/*
> + * This is the main engine to the ftrace updates to the dyn_ftrace records.
> + *
> + * It will iterate through all the available ftrace functions
> + * (the ones that ftrace can have callbacks to) and set the flags
> + * in the associated dyn_ftrace records.
> + *
> + * @inc: If true, the functions associated to @ops are added to
> + *   the dyn_ftrace records, otherwise they are removed.
> + */
>  static bool __ftrace_hash_rec_update(struct ftrace_ops *ops,
> -  int filter_hash,
>bool inc)
>  {
>   struct ftrace_hash *hash;
> - struct ftrace_hash *other_hash;
> + struct ftrace_hash *notrace_hash;
>   struct ftrace_page *pg;
>   struct dyn_ftrace *rec;
>   bool update = false;
> @@ -1718,35 +1725,16 @@ static bool __ftrace_hash_rec_update(struct 
> ftrace_ops *ops,
>   return false;
>  
>   /*
> -  * In the filter_hash case:
>*   If the count is zero, we update all records.
>*   Otherwise we just update the items in the hash.
> -  *
> -  * In the notrace_hash case:
> -  *   We enable the update in the hash.
> -  *   As disabling notrace means enabling the tracing,
> -  *   and enabling notrace means disabling, the inc variable
> -  *   gets inversed.
>*/
> - if (filter_hash) {
> - hash = ops->func_hash->filter_hash;
> - other_hash = ops->func_hash->notrace_hash;
> - if (ftrace_hash_empty(hash))
> - all = true;
> - } else {
> - inc = !inc;
> - hash = ops->func_hash->notrace_hash;
> - other_hash = ops->func_hash->filter_hash;
> - /*
> -  * If the notrace hash has no items,
> -  * then there's nothing to do.
> -  */
> - if (ftrace_hash_empty(hash))
> - return false;
> - }
> + hash = ops->func_hash->filter_hash;
> + notrace_hash = ops->func_hash->notrace_hash;
> + if (ftrace_hash_empty(hash))
> + all = true;
>  
>   do_for_each_ftrace_rec(pg, rec) {
> - int in_other_hash

Re: [PATCH 3/3] rust: add tracepoint support

2024-06-06 Thread Peter Zijlstra

On Thu, Jun 06, 2024 at 08:49:06AM -0700, Boqun Feng wrote:

> Long-term plan is to 1) compile the C helpers in some IR and 2) inline
> the helpers with Rust in IR-level, as what Gary has:
> 
>   
> https://lore.kernel.org/rust-for-linux/20240529202817.3641974-1-g...@garyguo.net/

Urgh, that still needs us to maintain that silly list of helpers :-/

Can't we pretty please have clang parse the actual header files into IR
and munge that into rust? So that we don't get to manually duplicate
everything+dog.

Re: [PATCH 3/3] rust: add tracepoint support

2024-06-06 Thread Peter Zijlstra

On Thu, Jun 06, 2024 at 03:05:26PM +, Alice Ryhl wrote:
> Make it possible to have Rust code call into tracepoints defined by C
> code. It is still required that the tracepoint is declared in a C
> header, and that this header is included in the input to bindgen.
> 
> Signed-off-by: Alice Ryhl 
> ---
>  rust/bindings/bindings_helper.h |  1 +
>  rust/bindings/lib.rs| 15 +++
>  rust/helpers.c  | 24 +++
>  rust/kernel/lib.rs  |  1 +
>  rust/kernel/tracepoint.rs   | 92 
> +
>  5 files changed, 133 insertions(+)
> 
> diff --git a/rust/bindings/bindings_helper.h b/rust/bindings/bindings_helper.h
> index ddb5644d4fd9..d442f9ccfc2c 100644
> --- a/rust/bindings/bindings_helper.h
> +++ b/rust/bindings/bindings_helper.h
> @@ -15,6 +15,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  
> diff --git a/rust/bindings/lib.rs b/rust/bindings/lib.rs
> index 40ddaee50d8b..48856761d682 100644
> --- a/rust/bindings/lib.rs
> +++ b/rust/bindings/lib.rs
> @@ -48,3 +48,18 @@ mod bindings_helper {
>  }
>  
>  pub use bindings_raw::*;
> +
> +/// Rust version of the C macro `rcu_dereference_raw`.
> +///
> +/// The rust helper only works with void pointers, but this wrapper method 
> makes it work with any
> +/// pointer type using pointer casts.
> +///
> +/// # Safety
> +///
> +/// This method has the same safety requirements as the C macro of the same 
> name.
> +#[inline(always)]
> +pub unsafe fn rcu_dereference_raw(p: *const *mut T) -> *mut T {
> +// SAFETY: This helper calls into the C macro, so the caller promises to 
> uphold the safety
> +// requirements.
> +unsafe { __rcu_dereference_raw(p as *mut *mut _) as *mut T }
> +}
> diff --git a/rust/helpers.c b/rust/helpers.c
> index 2c37a0f5d7a8..0560cc2a512a 100644
> --- a/rust/helpers.c
> +++ b/rust/helpers.c
> @@ -165,6 +165,30 @@ rust_helper_krealloc(const void *objp, size_t new_size, 
> gfp_t flags)
>  }
>  EXPORT_SYMBOL_GPL(rust_helper_krealloc);
>  
> +void rust_helper_preempt_enable_notrace(void)
> +{
> + preempt_enable_notrace();
> +}
> +EXPORT_SYMBOL_GPL(rust_helper_preempt_enable_notrace);
> +
> +void rust_helper_preempt_disable_notrace(void)
> +{
> + preempt_disable_notrace();
> +}
> +EXPORT_SYMBOL_GPL(rust_helper_preempt_disable_notrace);

A notrace wrapper that is tracable, lol.

> +bool rust_helper_current_cpu_online(void)
> +{
> + return cpu_online(raw_smp_processor_id());
> +}
> +EXPORT_SYMBOL_GPL(rust_helper_current_cpu_online);
> +
> +void *rust_helper___rcu_dereference_raw(void **p)
> +{
> + return rcu_dereference_raw(p);
> +}
> +EXPORT_SYMBOL_GPL(rust_helper___rcu_dereference_raw);

I'm going to keep yelling and objecting to these wrappers.

Fix bindgen already. Or whatever is needed to get it to interoperate
with C inline / asm.

NAK NAK NAK

Re: [PATCH 2/3] rust: add static_key_false

2024-06-06 Thread Peter Zijlstra

On Thu, Jun 06, 2024 at 03:05:25PM +, Alice Ryhl wrote:
> Add just enough support for static key so that we can use it from
> tracepoints. Tracepoints rely on `static_key_false` even though it is
> deprecated, so we add the same functionality to Rust.

Urgh, more unreadable gibberish :-(

Can we please get bindgen to translate asm/jump_table.h instead of
randomly duplicating random bits.

I really think that whoever created rust was an esoteric language freak.
Hideous crap :-(.

Re: [PATCH 1/3] rust: add static_call support

2024-06-06 Thread Peter Zijlstra

On Thu, Jun 06, 2024 at 03:05:24PM +, Alice Ryhl wrote:
> Add static_call support by mirroring how C does. When the platform does
> not support static calls (right now, that means that it is not x86),
> then the function pointer is loaded from a global and called. Otherwise,
> we generate a call to a trampoline function, and objtool is used to make
> these calls patchable at runtime.

This is absolutely unreadable gibberish -- how am I supposed to keep
this in sync with the rest of the static_call infrastructure?

> Signed-off-by: Alice Ryhl 
> ---
>  rust/kernel/lib.rs |  1 +
>  rust/kernel/static_call.rs | 92 
> ++
>  2 files changed, 93 insertions(+)
> 
> diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs
> index fbd91a48ff8b..d534b1178955 100644
> --- a/rust/kernel/lib.rs
> +++ b/rust/kernel/lib.rs
> @@ -38,6 +38,7 @@
>  pub mod prelude;
>  pub mod print;
>  mod static_assert;
> +pub mod static_call;
>  #[doc(hidden)]
>  pub mod std_vendor;
>  pub mod str;
> diff --git a/rust/kernel/static_call.rs b/rust/kernel/static_call.rs
> new file mode 100644
> index ..f7b8ba7bf1fb
> --- /dev/null
> +++ b/rust/kernel/static_call.rs
> @@ -0,0 +1,92 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +// Copyright (C) 2024 Google LLC.
> +
> +//! Logic for static calls.
> +
> +#[macro_export]
> +#[doc(hidden)]
> +macro_rules! ty_underscore_for {
> +($arg:expr) => {
> +_
> +};
> +}
> +
> +#[doc(hidden)]
> +#[repr(transparent)]
> +pub struct AddressableStaticCallKey {
> +_ptr: *const bindings::static_call_key,
> +}
> +unsafe impl Sync for AddressableStaticCallKey {}
> +impl AddressableStaticCallKey {
> +pub const fn new(ptr: *const bindings::static_call_key) -> Self {
> +Self { _ptr: ptr }
> +}
> +}
> +
> +#[cfg(CONFIG_HAVE_STATIC_CALL)]
> +#[doc(hidden)]
> +#[macro_export]
> +macro_rules! _static_call {
> +($name:ident($($args:expr),* $(,)?)) => {{
> +// Symbol mangling will give this symbol a unique name.
> +#[cfg(CONFIG_HAVE_STATIC_CALL_INLINE)]
> +#[link_section = ".discard.addressable"]
> +#[used]
> +static __ADDRESSABLE: $crate::static_call::AddressableStaticCallKey 
> = unsafe {
> +
> $crate::static_call::AddressableStaticCallKey::new(::core::ptr::addr_of!(
> +$crate::macros::paste! { $crate::bindings:: [<__SCK__ $name 
> >]; }
> +))
> +};
> +
> +let fn_ptr: unsafe extern "C" 
> fn($($crate::static_call::ty_underscore_for!($args)),*) -> _ =
> +$crate::macros::paste! { $crate::bindings:: [<__SCT__ $name >]; 
> };
> +(fn_ptr)($($args),*)
> +}};
> +}
> +
> +#[cfg(not(CONFIG_HAVE_STATIC_CALL))]
> +#[doc(hidden)]
> +#[macro_export]
> +macro_rules! _static_call {
> +($name:ident($($args:expr),* $(,)?)) => {{
> +let void_ptr_fn: *mut ::core::ffi::c_void =
> +$crate::macros::paste! { $crate::bindings:: [<__SCK__ $name >]; 
> }.func;
> +
> +let fn_ptr: unsafe extern "C" 
> fn($($crate::static_call::ty_underscore_for!($args)),*) -> _ =
> +if true {
> +::core::mem::transmute(void_ptr_fn)
> +} else {
> +// This is dead code, but it influences type inference on 
> `fn_ptr` so that we
> +// transmute the function pointer to the right type.
> +$crate::macros::paste! { $crate::bindings:: [<__SCT__ $name 
> >]; }
> +};
> +
> +(fn_ptr)($($args),*)
> +}};
> +}
> +
> +/// Statically call a global function.
> +///
> +/// # Safety
> +///
> +/// This macro will call the provided function. It is up to the caller to 
> uphold the safety
> +/// guarantees of the function.
> +///
> +/// # Examples
> +///
> +/// ```ignore
> +/// fn call_static() {
> +/// unsafe {
> +/// static_call! { your_static_call() };
> +/// }
> +/// }
> +/// ```
> +#[macro_export]
> +macro_rules! static_call {
> +// Forward to the real implementation. Separated like this so that we 
> don't have to duplicate
> +// the documentation.
> +($($args:tt)*) => { $crate::static_call::_static_call! { $($args)* } };
> +}
> +
> +pub use {_static_call, static_call, ty_underscore_for};
> 
> -- 
> 2.45.2.505.gda0bf45e8d-goog
>

Re: [PATCH] nvdimm: Fix devs leaks in scan_labels()

2024-06-06 Thread Ira Weiny

Li Zhijian wrote:
> Don't allocate devs again when it's valid pointer which has pionted to
> the memory allocated above with size (count + 2 * sizeof(dev)).
> 
> A kmemleak reports:
> unreferenced object 0x88800dda1980 (size 16):
>   comm "kworker/u10:5", pid 69, jiffies 4294671781
>   hex dump (first 16 bytes):
> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
>   backtrace (crc 0):
> [] __kmalloc+0x32c/0x470
> [<9ed43c83>] nd_region_register_namespaces+0x6fb/0x1120 
> [libnvdimm]
> [<0e07a65c>] nd_region_probe+0xfe/0x210 [libnvdimm]
> [<7b79ce5f>] nvdimm_bus_probe+0x7a/0x1e0 [libnvdimm]
> [] really_probe+0xc6/0x390
> [<129e2a69>] __driver_probe_device+0x78/0x150
> [<2dfed28b>] driver_probe_device+0x1e/0x90
> [] __device_attach_driver+0x85/0x110
> [<32dca295>] bus_for_each_drv+0x85/0xe0
> [<391c5a7d>] __device_attach+0xbe/0x1e0
> [<26dabec0>] bus_probe_device+0x94/0xb0
> [] device_add+0x656/0x870
> [<3d69bfaa>] nd_async_device_register+0xe/0x50 [libnvdimm]
> [<3f4c52a4>] async_run_entry_fn+0x2e/0x110
> [] process_one_work+0x1ee/0x600
> [<6d90d5a9>] worker_thread+0x183/0x350
> 
> Fixes: 1b40e09a1232 ("libnvdimm: blk labels and namespace instantiation")
> Signed-off-by: Li Zhijian 
> ---
>  drivers/nvdimm/namespace_devs.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/nvdimm/namespace_devs.c b/drivers/nvdimm/namespace_devs.c
> index d6d558f94d6b..56b016dbe307 100644
> --- a/drivers/nvdimm/namespace_devs.c
> +++ b/drivers/nvdimm/namespace_devs.c
> @@ -1994,7 +1994,9 @@ static struct device **scan_labels(struct nd_region 
> *nd_region)
>   /* Publish a zero-sized namespace for userspace to configure. */
>   nd_mapping_free_labels(nd_mapping);
>  
> - devs = kcalloc(2, sizeof(dev), GFP_KERNEL);
> + /* devs probably has been allocated */

I don't think this is where the bug is.  The loop above is processing the
known labels and should exit with a count > 0 if devs is not NULL.

>From what I can tell create_namespace_pmem() must be returning EAGAIN
which leaves devs allocated but fails to increment count.  Thus there are
no valid labels but devs was not free'ed.

Can you trace the error you are seeing a bit more to see if this is the
case?  

Thanks,
Ira

> + if (!devs)
> + devs = kcalloc(2, sizeof(dev), GFP_KERNEL);
>   if (!devs)
>   goto err;
>  
> -- 
> 2.29.2
>

Re: [RFC bpf-next 01/10] uprobe: Add session callbacks to uprobe_consumer

2024-06-06 Thread Andrii Nakryiko

On Thu, Jun 6, 2024 at 9:46 AM Jiri Olsa  wrote:
>
> On Wed, Jun 05, 2024 at 10:50:11PM +0200, Jiri Olsa wrote:
> > On Wed, Jun 05, 2024 at 07:56:19PM +0200, Oleg Nesterov wrote:
> > > On 06/05, Andrii Nakryiko wrote:
> > > >
> > > > so any such
> > > > limitations will cause problems, issue reports, investigation, etc.
> > >
> > > Agreed...
> > >
> > > > As one possible solution, what if we do
> > > >
> > > > struct return_instance {
> > > > ...
> > > > u64 session_cookies[];
> > > > };
> > > >
> > > > and allocate sizeof(struct return_instance) + 8 *
> > > >  and then at runtime pass
> > > > _cookies[i] as data pointer to session-aware callbacks?
> > >
> > > I too thought about this, but I guess it is not that simple.
> > >
> > > Just for example. Suppose we have 2 session-consumers C1 and C2.
> > > What if uprobe_unregister(C1) comes before the probed function
> > > returns?
> > >
> > > We need something like map_cookie_to_consumer().
> >
> > I guess we could have hash table in return_instance that gets 'consumer -> 
> > cookie' ?
>
> ok, hash table is probably too big for this.. I guess some solution that
> would iterate consumers and cookies made sure it matches would be fine
>

Yes, I was hoping to avoid hash tables for this, and in the common
case have no added overhead.

> jirka
>
> >
> > return instance is freed after the consumers' return handlers are executed,
> > so there's no leak if some consumer gets unregistered before that
> >
> > >
> > > > > +   /* The handler_session callback return value controls 
> > > > > execution of
> > > > > +* the return uprobe and ret_handler_session callback.
> > > > > +*  0 on success
> > > > > +*  1 on failure, DO NOT install/execute the return uprobe
> > > > > +*console warning for anything else
> > > > > +*/
> > > > > +   int (*handler_session)(struct uprobe_consumer *self, struct 
> > > > > pt_regs *regs,
> > > > > +  unsigned long *data);
> > > > > +   int (*ret_handler_session)(struct uprobe_consumer *self, 
> > > > > unsigned long func,
> > > > > +  struct pt_regs *regs, unsigned 
> > > > > long *data);
> > > > > +
> > > >
> > > > We should try to avoid an alternative set of callbacks, IMO. Let's
> > > > extend existing ones with `unsigned long *data`,
> > >
> > > Oh yes, agreed.
> > >
> > > And the comment about the return value looks confusing too. I mean, the
> > > logic doesn't differ from the ret-code from ->handler().
> > >
> > > "DO NOT install/execute the return uprobe" is not true if another
> > > non-session-consumer returns 0.
> >
> > well they are meant to be exclusive, so there'd be no other 
> > non-session-consumer
> >
> > jirka

Re: [RFC bpf-next 01/10] uprobe: Add session callbacks to uprobe_consumer

2024-06-06 Thread Jiri Olsa

On Wed, Jun 05, 2024 at 10:50:11PM +0200, Jiri Olsa wrote:
> On Wed, Jun 05, 2024 at 07:56:19PM +0200, Oleg Nesterov wrote:
> > On 06/05, Andrii Nakryiko wrote:
> > >
> > > so any such
> > > limitations will cause problems, issue reports, investigation, etc.
> > 
> > Agreed...
> > 
> > > As one possible solution, what if we do
> > >
> > > struct return_instance {
> > > ...
> > > u64 session_cookies[];
> > > };
> > >
> > > and allocate sizeof(struct return_instance) + 8 *
> > >  and then at runtime pass
> > > _cookies[i] as data pointer to session-aware callbacks?
> > 
> > I too thought about this, but I guess it is not that simple.
> > 
> > Just for example. Suppose we have 2 session-consumers C1 and C2.
> > What if uprobe_unregister(C1) comes before the probed function
> > returns?
> > 
> > We need something like map_cookie_to_consumer().
> 
> I guess we could have hash table in return_instance that gets 'consumer -> 
> cookie' ?

ok, hash table is probably too big for this.. I guess some solution that
would iterate consumers and cookies made sure it matches would be fine

jirka

> 
> return instance is freed after the consumers' return handlers are executed,
> so there's no leak if some consumer gets unregistered before that
> 
> > 
> > > > +   /* The handler_session callback return value controls execution 
> > > > of
> > > > +* the return uprobe and ret_handler_session callback.
> > > > +*  0 on success
> > > > +*  1 on failure, DO NOT install/execute the return uprobe
> > > > +*console warning for anything else
> > > > +*/
> > > > +   int (*handler_session)(struct uprobe_consumer *self, struct 
> > > > pt_regs *regs,
> > > > +  unsigned long *data);
> > > > +   int (*ret_handler_session)(struct uprobe_consumer *self, 
> > > > unsigned long func,
> > > > +  struct pt_regs *regs, unsigned long 
> > > > *data);
> > > > +
> > >
> > > We should try to avoid an alternative set of callbacks, IMO. Let's
> > > extend existing ones with `unsigned long *data`,
> > 
> > Oh yes, agreed.
> > 
> > And the comment about the return value looks confusing too. I mean, the
> > logic doesn't differ from the ret-code from ->handler().
> > 
> > "DO NOT install/execute the return uprobe" is not true if another
> > non-session-consumer returns 0.
> 
> well they are meant to be exclusive, so there'd be no other 
> non-session-consumer
> 
> jirka

Re: [PATCH 0/3] Tracepoints and static branch/call in Rust

2024-06-06 Thread Mathieu Desnoyers


On 2024-06-06 11:46, Alice Ryhl wrote:

On Thu, Jun 6, 2024 at 5:25 PM Mathieu Desnoyers
 wrote:


On 2024-06-06 11:05, Alice Ryhl wrote:

This implementation implements support for static keys in Rust so that
the actual static branch will end up in the Rust object file. However,
it would also be possible to just wrap the trace_##name generated by
__DECLARE_TRACE in an extern C function and then call that from Rust.
This will simplify the Rust code by removing the need for static
branches and calls, but it places the static branch behind an external
call, which has performance implications.


The tracepoints try very hard to minimize overhead of dormant tracepoints
so it is not frowned-upon to have them built into production binaries.
This is needed to make sure distribution vendors keep those tracepoints
in the kernel binaries that reach end-users.

Adding a function call before evaluation of the static branch goes against
this major goal.



A possible middle ground would be to place just the __DO_TRACE body in
an extern C function and to implement the Rust wrapper by doing the
static branch in Rust, and then calling into C the code that contains
__DO_TRACE when the tracepoint is active. However, this would need some
changes to include/linux/tracepoint.h to generate and export a function
containing the body of __DO_TRACE when the tracepoint should be callable
from Rust.


This tradeoff is more acceptable than having a function call before
evaluation of the static branch, but I wonder what is the upside of
this tradeoff compared to inlining the whole __DO_TRACE in Rust ?


So in general, there is a tradeoff between placing parts of the
tracepoint (which is perf sensitive) behind an external call, and having
code duplicated in both C and Rust (which must be kept in sync when
changes are made). This is an important point that I would like feedback
on from the C maintainers.


I don't see how the duplication happens there: __DO_TRACE is meant to be
inlined into each C tracepoint caller site, so the code is already meant
to be duplicated. Having an explicit function wrapping the tracepoint
for Rust would just create an extra instance of __DO_TRACE if it happens
to be also inlined into C code.

Or do you meant you would like to prevent having to duplicate the
implementation of __DO_TRACE in both C and Rust ?

I'm not sure if you mean to prevent source code duplication between
C and Rust or duplication of binary code (instructions).


It's a question of maintenance burden. If you change how __DO_TRACE is
implemented, then those changes must also be reflected in the Rust
version. There's no issue in the binary code.


As long as it is only __DO_TRACE that is duplicated between C and Rust,
I don't see it as a large burden.

Thanks,

Mathieu

--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com

Re: [PATCH] nvdimm: Fix devs leaks in scan_labels()

2024-06-06 Thread Dave Jiang




On 6/3/24 8:16 PM, Li Zhijian wrote:
> Don't allocate devs again when it's valid pointer which has pionted to
> the memory allocated above with size (count + 2 * sizeof(dev)).
> 
> A kmemleak reports:
> unreferenced object 0x88800dda1980 (size 16):
>   comm "kworker/u10:5", pid 69, jiffies 4294671781
>   hex dump (first 16 bytes):
> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
>   backtrace (crc 0):
> [] __kmalloc+0x32c/0x470
> [<9ed43c83>] nd_region_register_namespaces+0x6fb/0x1120 
> [libnvdimm]
> [<0e07a65c>] nd_region_probe+0xfe/0x210 [libnvdimm]
> [<7b79ce5f>] nvdimm_bus_probe+0x7a/0x1e0 [libnvdimm]
> [] really_probe+0xc6/0x390
> [<129e2a69>] __driver_probe_device+0x78/0x150
> [<2dfed28b>] driver_probe_device+0x1e/0x90
> [] __device_attach_driver+0x85/0x110
> [<32dca295>] bus_for_each_drv+0x85/0xe0
> [<391c5a7d>] __device_attach+0xbe/0x1e0
> [<26dabec0>] bus_probe_device+0x94/0xb0
> [] device_add+0x656/0x870
> [<3d69bfaa>] nd_async_device_register+0xe/0x50 [libnvdimm]
> [<3f4c52a4>] async_run_entry_fn+0x2e/0x110
> [] process_one_work+0x1ee/0x600
> [<6d90d5a9>] worker_thread+0x183/0x350
> 
> Fixes: 1b40e09a1232 ("libnvdimm: blk labels and namespace instantiation")
> Signed-off-by: Li Zhijian 
> ---
>  drivers/nvdimm/namespace_devs.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/nvdimm/namespace_devs.c b/drivers/nvdimm/namespace_devs.c
> index d6d558f94d6b..56b016dbe307 100644
> --- a/drivers/nvdimm/namespace_devs.c
> +++ b/drivers/nvdimm/namespace_devs.c
> @@ -1994,7 +1994,9 @@ static struct device **scan_labels(struct nd_region 
> *nd_region)
>   /* Publish a zero-sized namespace for userspace to configure. */
>   nd_mapping_free_labels(nd_mapping);
>  
> - devs = kcalloc(2, sizeof(dev), GFP_KERNEL);
> + /* devs probably has been allocated */
> + if (!devs)
> + devs = kcalloc(2, sizeof(dev), GFP_KERNEL);

This changes the behavior of this code and possibly corrupting the previously 
allocated memory at times when 'devs' is valid. Was the 'devs' leaked out from 
the previous loop and should be freed instead?

>   if (!devs)
>   goto err;
>

Re: [PATCH 2/3] rust: add static_key_false

2024-06-06 Thread Alice Ryhl

On Thu, Jun 6, 2024 at 5:37 PM Mathieu Desnoyers
 wrote:
>
> On 2024-06-06 11:05, Alice Ryhl wrote:
> > Add just enough support for static key so that we can use it from
> > tracepoints. Tracepoints rely on `static_key_false` even though it is
> > deprecated, so we add the same functionality to Rust.
> >
> > Signed-off-by: Alice Ryhl 
> > ---
> >   rust/kernel/lib.rs|  1 +
> >   rust/kernel/static_key.rs | 87 
> > +++
> >   scripts/Makefile.build|  2 +-
> >   3 files changed, 89 insertions(+), 1 deletion(-)
> >
> > diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs
> > index d534b1178955..22e1fedd0774 100644
> > --- a/rust/kernel/lib.rs
> > +++ b/rust/kernel/lib.rs
> > @@ -39,6 +39,7 @@
> >   pub mod print;
> >   mod static_assert;
> >   pub mod static_call;
> > +pub mod static_key;
> >   #[doc(hidden)]
> >   pub mod std_vendor;
> >   pub mod str;
> > diff --git a/rust/kernel/static_key.rs b/rust/kernel/static_key.rs
> > new file mode 100644
> > index ..6c3dbe14c98a
> > --- /dev/null
> > +++ b/rust/kernel/static_key.rs
> > @@ -0,0 +1,87 @@
> > +// SPDX-License-Identifier: GPL-2.0
>
> This static key code is something that can be generally useful
> both in kernel and user-space. Is there anything that would prevent
> licensing this under MIT right away instead so it could eventually be
> re-used more broadly in userspace as well ?

I would not mind.

Alice

Re: [PATCH 3/3] rust: add tracepoint support

2024-06-06 Thread Mathieu Desnoyers


On 2024-06-06 12:16, Alice Ryhl wrote:

On Thu, Jun 6, 2024 at 5:29 PM Mathieu Desnoyers
 wrote:


On 2024-06-06 11:05, Alice Ryhl wrote:

Make it possible to have Rust code call into tracepoints defined by C
code. It is still required that the tracepoint is declared in a C
header, and that this header is included in the input to bindgen.

Signed-off-by: Alice Ryhl 


[...]


diff --git a/rust/helpers.c b/rust/helpers.c
index 2c37a0f5d7a8..0560cc2a512a 100644
--- a/rust/helpers.c
+++ b/rust/helpers.c
@@ -165,6 +165,30 @@ rust_helper_krealloc(const void *objp, size_t new_size, 
gfp_t flags)
   }
   EXPORT_SYMBOL_GPL(rust_helper_krealloc);

+void rust_helper_preempt_enable_notrace(void)
+{
+ preempt_enable_notrace();
+}
+EXPORT_SYMBOL_GPL(rust_helper_preempt_enable_notrace);
+
+void rust_helper_preempt_disable_notrace(void)
+{
+ preempt_disable_notrace();
+}
+EXPORT_SYMBOL_GPL(rust_helper_preempt_disable_notrace);
+
+bool rust_helper_current_cpu_online(void)
+{
+ return cpu_online(raw_smp_processor_id());
+}
+EXPORT_SYMBOL_GPL(rust_helper_current_cpu_online);
+
+void *rust_helper___rcu_dereference_raw(void **p)
+{
+ return rcu_dereference_raw(p);
+}
+EXPORT_SYMBOL_GPL(rust_helper___rcu_dereference_raw);


Ouch. Doing a function call for each of those small operations will
have a rather large performance impact when tracing is active. If it is
not possible to inline those in Rust, then implementing __DO_TRACE in
a C function would at least allow Rust to only do a single call to C
rather than go back and forth between Rust and C.

What prevents inlining those helpers in Rust ?


There's nothing fundamental that prevents it. But they contain a large
amount of architecture specific code, which makes them a significant
amount of work to reimplement in Rust.

For example, rcu_dereference_raw calls into READ_ONCE. READ_ONCE is
usually a volatile load, but under arm64+LTO, you get a bunch of
inline assembly that relies on ALTERNATIVE for feature detection:
https://elixir.bootlin.com/linux/v6.9/source/arch/arm64/include/asm/rwonce.h#L36

And preempt_enable_notrace has a similar story.

The solution that Boqun mentions is nice, but it relies on rustc and
clang using the same version of LLVM. You are unlikely to have
compilers with matching LLVMs unless you intentionally take steps to
make that happen.

But yes, perhaps these helpers are an argument to have a single call
for __DO_TRACE instead.


If those helpers end up being inlined into Rust with the solution
pointed to by Boqun, then it makes sense to implement __DO_TRACE
in Rust. Otherwise doing a single call to C would be more efficient
than calling each of the helpers individually.

Thanks,

Mathieu



Alice


--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com

Re: [PATCH 3/3] rust: add tracepoint support

2024-06-06 Thread Mathieu Desnoyers


On 2024-06-06 11:49, Boqun Feng wrote:

On Thu, Jun 06, 2024 at 11:30:03AM -0400, Mathieu Desnoyers wrote:

On 2024-06-06 11:05, Alice Ryhl wrote:

Make it possible to have Rust code call into tracepoints defined by C
code. It is still required that the tracepoint is declared in a C
header, and that this header is included in the input to bindgen.

Signed-off-by: Alice Ryhl 


[...]


diff --git a/rust/helpers.c b/rust/helpers.c
index 2c37a0f5d7a8..0560cc2a512a 100644
--- a/rust/helpers.c
+++ b/rust/helpers.c
@@ -165,6 +165,30 @@ rust_helper_krealloc(const void *objp, size_t new_size, 
gfp_t flags)
   }
   EXPORT_SYMBOL_GPL(rust_helper_krealloc);
+void rust_helper_preempt_enable_notrace(void)
+{
+   preempt_enable_notrace();
+}
+EXPORT_SYMBOL_GPL(rust_helper_preempt_enable_notrace);
+
+void rust_helper_preempt_disable_notrace(void)
+{
+   preempt_disable_notrace();
+}
+EXPORT_SYMBOL_GPL(rust_helper_preempt_disable_notrace);
+
+bool rust_helper_current_cpu_online(void)
+{
+   return cpu_online(raw_smp_processor_id());
+}
+EXPORT_SYMBOL_GPL(rust_helper_current_cpu_online);
+
+void *rust_helper___rcu_dereference_raw(void **p)
+{
+   return rcu_dereference_raw(p);
+}
+EXPORT_SYMBOL_GPL(rust_helper___rcu_dereference_raw);


Ouch. Doing a function call for each of those small operations will
have a rather large performance impact when tracing is active. If it is


Long-term plan is to 1) compile the C helpers in some IR and 2) inline
the helpers with Rust in IR-level, as what Gary has:


https://lore.kernel.org/rust-for-linux/20240529202817.3641974-1-g...@garyguo.net/

and I use it for the upcoming atomic API support:

https://github.com/fbq/linux/tree/dev/rust/atomic-rfc

and it works very well.


Thanks for the pointers, it makes sense.

Thanks,

Mathieu



Regards,
Boqun


not possible to inline those in Rust, then implementing __DO_TRACE in
a C function would at least allow Rust to only do a single call to C
rather than go back and forth between Rust and C.

What prevents inlining those helpers in Rust ?

Thanks,

Mathieu

--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com



--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com

Re: [PATCH 3/3] rust: add tracepoint support

2024-06-06 Thread Alice Ryhl

On Thu, Jun 6, 2024 at 5:29 PM Mathieu Desnoyers
 wrote:
>
> On 2024-06-06 11:05, Alice Ryhl wrote:
> > Make it possible to have Rust code call into tracepoints defined by C
> > code. It is still required that the tracepoint is declared in a C
> > header, and that this header is included in the input to bindgen.
> >
> > Signed-off-by: Alice Ryhl 
>
> [...]
>
> > diff --git a/rust/helpers.c b/rust/helpers.c
> > index 2c37a0f5d7a8..0560cc2a512a 100644
> > --- a/rust/helpers.c
> > +++ b/rust/helpers.c
> > @@ -165,6 +165,30 @@ rust_helper_krealloc(const void *objp, size_t 
> > new_size, gfp_t flags)
> >   }
> >   EXPORT_SYMBOL_GPL(rust_helper_krealloc);
> >
> > +void rust_helper_preempt_enable_notrace(void)
> > +{
> > + preempt_enable_notrace();
> > +}
> > +EXPORT_SYMBOL_GPL(rust_helper_preempt_enable_notrace);
> > +
> > +void rust_helper_preempt_disable_notrace(void)
> > +{
> > + preempt_disable_notrace();
> > +}
> > +EXPORT_SYMBOL_GPL(rust_helper_preempt_disable_notrace);
> > +
> > +bool rust_helper_current_cpu_online(void)
> > +{
> > + return cpu_online(raw_smp_processor_id());
> > +}
> > +EXPORT_SYMBOL_GPL(rust_helper_current_cpu_online);
> > +
> > +void *rust_helper___rcu_dereference_raw(void **p)
> > +{
> > + return rcu_dereference_raw(p);
> > +}
> > +EXPORT_SYMBOL_GPL(rust_helper___rcu_dereference_raw);
>
> Ouch. Doing a function call for each of those small operations will
> have a rather large performance impact when tracing is active. If it is
> not possible to inline those in Rust, then implementing __DO_TRACE in
> a C function would at least allow Rust to only do a single call to C
> rather than go back and forth between Rust and C.
>
> What prevents inlining those helpers in Rust ?

There's nothing fundamental that prevents it. But they contain a large
amount of architecture specific code, which makes them a significant
amount of work to reimplement in Rust.

For example, rcu_dereference_raw calls into READ_ONCE. READ_ONCE is
usually a volatile load, but under arm64+LTO, you get a bunch of
inline assembly that relies on ALTERNATIVE for feature detection:
https://elixir.bootlin.com/linux/v6.9/source/arch/arm64/include/asm/rwonce.h#L36

And preempt_enable_notrace has a similar story.

The solution that Boqun mentions is nice, but it relies on rustc and
clang using the same version of LLVM. You are unlikely to have
compilers with matching LLVMs unless you intentionally take steps to
make that happen.

But yes, perhaps these helpers are an argument to have a single call
for __DO_TRACE instead.

Alice

Re: [PATCH RESEND] nvdimm: add missing MODULE_DESCRIPTION() macros

2024-06-06 Thread Dave Jiang




On 5/26/24 10:07 AM, Jeff Johnson wrote:
> Fix the 'make W=1' warnings:
> WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/nvdimm/libnvdimm.o
> WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/nvdimm/nd_pmem.o
> WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/nvdimm/nd_btt.o
> WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/nvdimm/nd_e820.o
> WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/nvdimm/of_pmem.o
> WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/nvdimm/nd_virtio.o
> 
> Signed-off-by: Jeff Johnson 
Reviewed-by: Dave Jiang 
> ---
>  drivers/nvdimm/btt.c   | 1 +
>  drivers/nvdimm/core.c  | 1 +
>  drivers/nvdimm/e820.c  | 1 +
>  drivers/nvdimm/nd_virtio.c | 1 +
>  drivers/nvdimm/of_pmem.c   | 1 +
>  drivers/nvdimm/pmem.c  | 1 +
>  6 files changed, 6 insertions(+)
> 
> diff --git a/drivers/nvdimm/btt.c b/drivers/nvdimm/btt.c
> index 1e5aedaf8c7b..a47acc5d05df 100644
> --- a/drivers/nvdimm/btt.c
> +++ b/drivers/nvdimm/btt.c
> @@ -1721,6 +1721,7 @@ static void __exit nd_btt_exit(void)
>  
>  MODULE_ALIAS_ND_DEVICE(ND_DEVICE_BTT);
>  MODULE_AUTHOR("Vishal Verma ");
> +MODULE_DESCRIPTION("NVDIMM Block Translation Table");
>  MODULE_LICENSE("GPL v2");
>  module_init(nd_btt_init);
>  module_exit(nd_btt_exit);
> diff --git a/drivers/nvdimm/core.c b/drivers/nvdimm/core.c
> index 2023a661bbb0..f4b6fb4b9828 100644
> --- a/drivers/nvdimm/core.c
> +++ b/drivers/nvdimm/core.c
> @@ -540,6 +540,7 @@ static __exit void libnvdimm_exit(void)
>   nvdimm_devs_exit();
>  }
>  
> +MODULE_DESCRIPTION("NVDIMM (Non-Volatile Memory Device) core module");
>  MODULE_LICENSE("GPL v2");
>  MODULE_AUTHOR("Intel Corporation");
>  subsys_initcall(libnvdimm_init);
> diff --git a/drivers/nvdimm/e820.c b/drivers/nvdimm/e820.c
> index 4cd18be9d0e9..008b9aae74ff 100644
> --- a/drivers/nvdimm/e820.c
> +++ b/drivers/nvdimm/e820.c
> @@ -69,5 +69,6 @@ static struct platform_driver e820_pmem_driver = {
>  module_platform_driver(e820_pmem_driver);
>  
>  MODULE_ALIAS("platform:e820_pmem*");
> +MODULE_DESCRIPTION("NVDIMM support for e820 type-12 memory");
>  MODULE_LICENSE("GPL v2");
>  MODULE_AUTHOR("Intel Corporation");
> diff --git a/drivers/nvdimm/nd_virtio.c b/drivers/nvdimm/nd_virtio.c
> index 1f8c667c6f1e..35c8fbbba10e 100644
> --- a/drivers/nvdimm/nd_virtio.c
> +++ b/drivers/nvdimm/nd_virtio.c
> @@ -123,4 +123,5 @@ int async_pmem_flush(struct nd_region *nd_region, struct 
> bio *bio)
>   return 0;
>  };
>  EXPORT_SYMBOL_GPL(async_pmem_flush);
> +MODULE_DESCRIPTION("Virtio Persistent Memory Driver");
>  MODULE_LICENSE("GPL");
> diff --git a/drivers/nvdimm/of_pmem.c b/drivers/nvdimm/of_pmem.c
> index d3fca0ab6290..5134a8d08bf9 100644
> --- a/drivers/nvdimm/of_pmem.c
> +++ b/drivers/nvdimm/of_pmem.c
> @@ -111,5 +111,6 @@ static struct platform_driver of_pmem_region_driver = {
>  
>  module_platform_driver(of_pmem_region_driver);
>  MODULE_DEVICE_TABLE(of, of_pmem_region_match);
> +MODULE_DESCRIPTION("NVDIMM Device Tree support");
>  MODULE_LICENSE("GPL");
>  MODULE_AUTHOR("IBM Corporation");
> diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c
> index 598fe2e89bda..57cb30f8a3b8 100644
> --- a/drivers/nvdimm/pmem.c
> +++ b/drivers/nvdimm/pmem.c
> @@ -768,4 +768,5 @@ static struct nd_device_driver nd_pmem_driver = {
>  module_nd_driver(nd_pmem_driver);
>  
>  MODULE_AUTHOR("Ross Zwisler ");
> +MODULE_DESCRIPTION("NVDIMM Persistent Memory Driver");
>  MODULE_LICENSE("GPL v2");
> 
> ---
> base-commit: 416ff45264d50a983c3c0b99f0da6ee59f9acd68
> change-id: 20240526-md-drivers-nvdimm-121215a4b93f

Re: [PATCH] ACPI: NFIT: add missing MODULE_DESCRIPTION() macro

2024-06-06 Thread Dave Jiang




On 6/3/24 6:30 AM, Jeff Johnson wrote:
> make allmodconfig && make W=1 C=1 reports:
> WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/acpi/nfit/nfit.o
> 
> Add the missing invocation of the MODULE_DESCRIPTION() macro.
> 
> Signed-off-by: Jeff Johnson 

Reviewed-by: Dave Jiang 
> ---
>  drivers/acpi/nfit/core.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/acpi/nfit/core.c b/drivers/acpi/nfit/core.c
> index d4595d1985b1..e8520fb8af4f 100644
> --- a/drivers/acpi/nfit/core.c
> +++ b/drivers/acpi/nfit/core.c
> @@ -3531,5 +3531,6 @@ static __exit void nfit_exit(void)
>  
>  module_init(nfit_init);
>  module_exit(nfit_exit);
> +MODULE_DESCRIPTION("ACPI NVDIMM Firmware Interface Table (NFIT) module");
>  MODULE_LICENSE("GPL v2");
>  MODULE_AUTHOR("Intel Corporation");
> 
> ---
> base-commit: a693b9c95abd4947c2d06e05733de5d470ab6586
> change-id: 20240603-md-drivers-acpi-nfit-e032bad0b189
>

Re: [PATCH 3/3] rust: add tracepoint support

2024-06-06 Thread Boqun Feng

On Thu, Jun 06, 2024 at 11:30:03AM -0400, Mathieu Desnoyers wrote:
> On 2024-06-06 11:05, Alice Ryhl wrote:
> > Make it possible to have Rust code call into tracepoints defined by C
> > code. It is still required that the tracepoint is declared in a C
> > header, and that this header is included in the input to bindgen.
> > 
> > Signed-off-by: Alice Ryhl 
> 
> [...]
> 
> > diff --git a/rust/helpers.c b/rust/helpers.c
> > index 2c37a0f5d7a8..0560cc2a512a 100644
> > --- a/rust/helpers.c
> > +++ b/rust/helpers.c
> > @@ -165,6 +165,30 @@ rust_helper_krealloc(const void *objp, size_t 
> > new_size, gfp_t flags)
> >   }
> >   EXPORT_SYMBOL_GPL(rust_helper_krealloc);
> > +void rust_helper_preempt_enable_notrace(void)
> > +{
> > +   preempt_enable_notrace();
> > +}
> > +EXPORT_SYMBOL_GPL(rust_helper_preempt_enable_notrace);
> > +
> > +void rust_helper_preempt_disable_notrace(void)
> > +{
> > +   preempt_disable_notrace();
> > +}
> > +EXPORT_SYMBOL_GPL(rust_helper_preempt_disable_notrace);
> > +
> > +bool rust_helper_current_cpu_online(void)
> > +{
> > +   return cpu_online(raw_smp_processor_id());
> > +}
> > +EXPORT_SYMBOL_GPL(rust_helper_current_cpu_online);
> > +
> > +void *rust_helper___rcu_dereference_raw(void **p)
> > +{
> > +   return rcu_dereference_raw(p);
> > +}
> > +EXPORT_SYMBOL_GPL(rust_helper___rcu_dereference_raw);
> 
> Ouch. Doing a function call for each of those small operations will
> have a rather large performance impact when tracing is active. If it is

Long-term plan is to 1) compile the C helpers in some IR and 2) inline
the helpers with Rust in IR-level, as what Gary has:


https://lore.kernel.org/rust-for-linux/20240529202817.3641974-1-g...@garyguo.net/

and I use it for the upcoming atomic API support:

https://github.com/fbq/linux/tree/dev/rust/atomic-rfc

and it works very well.

Regards,
Boqun

> not possible to inline those in Rust, then implementing __DO_TRACE in
> a C function would at least allow Rust to only do a single call to C
> rather than go back and forth between Rust and C.
> 
> What prevents inlining those helpers in Rust ?
> 
> Thanks,
> 
> Mathieu
> 
> -- 
> Mathieu Desnoyers
> EfficiOS Inc.
> https://www.efficios.com
>

Re: [PATCH 0/3] Tracepoints and static branch/call in Rust

2024-06-06 Thread Alice Ryhl

On Thu, Jun 6, 2024 at 5:25 PM Mathieu Desnoyers
 wrote:
>
> On 2024-06-06 11:05, Alice Ryhl wrote:
> > This implementation implements support for static keys in Rust so that
> > the actual static branch will end up in the Rust object file. However,
> > it would also be possible to just wrap the trace_##name generated by
> > __DECLARE_TRACE in an extern C function and then call that from Rust.
> > This will simplify the Rust code by removing the need for static
> > branches and calls, but it places the static branch behind an external
> > call, which has performance implications.
>
> The tracepoints try very hard to minimize overhead of dormant tracepoints
> so it is not frowned-upon to have them built into production binaries.
> This is needed to make sure distribution vendors keep those tracepoints
> in the kernel binaries that reach end-users.
>
> Adding a function call before evaluation of the static branch goes against
> this major goal.
>
> >
> > A possible middle ground would be to place just the __DO_TRACE body in
> > an extern C function and to implement the Rust wrapper by doing the
> > static branch in Rust, and then calling into C the code that contains
> > __DO_TRACE when the tracepoint is active. However, this would need some
> > changes to include/linux/tracepoint.h to generate and export a function
> > containing the body of __DO_TRACE when the tracepoint should be callable
> > from Rust.
>
> This tradeoff is more acceptable than having a function call before
> evaluation of the static branch, but I wonder what is the upside of
> this tradeoff compared to inlining the whole __DO_TRACE in Rust ?
>
> > So in general, there is a tradeoff between placing parts of the
> > tracepoint (which is perf sensitive) behind an external call, and having
> > code duplicated in both C and Rust (which must be kept in sync when
> > changes are made). This is an important point that I would like feedback
> > on from the C maintainers.
>
> I don't see how the duplication happens there: __DO_TRACE is meant to be
> inlined into each C tracepoint caller site, so the code is already meant
> to be duplicated. Having an explicit function wrapping the tracepoint
> for Rust would just create an extra instance of __DO_TRACE if it happens
> to be also inlined into C code.
>
> Or do you meant you would like to prevent having to duplicate the
> implementation of __DO_TRACE in both C and Rust ?
>
> I'm not sure if you mean to prevent source code duplication between
> C and Rust or duplication of binary code (instructions).

It's a question of maintenance burden. If you change how __DO_TRACE is
implemented, then those changes must also be reflected in the Rust
version. There's no issue in the binary code.

Alice

Re: [PATCH 0/3] Tracepoints and static branch/call in Rust

2024-06-06 Thread Mathieu Desnoyers


On 2024-06-06 11:05, Alice Ryhl wrote:

An important part of a production ready Linux kernel driver is
tracepoints. So to write production ready Linux kernel drivers in Rust,
we must be able to call tracepoints from Rust code. This patch series
adds support for calling tracepoints declared in C from Rust.


I'm glad to see progress on this front ! Please see feedback below.



To use the tracepoint support, you must:

1. Declare the tracepoint in a C header file as usual.
2. Make sure that the header file is visible to bindgen so that Rust
bindings are generated for the symbols that the tracepoint macro
emits.
3. Use the declare_trace! macro in your Rust code to generate Rust
functions that call into the tracepoint.

For example, the kernel has a tracepoint called `sched_kthread_stop`. It
is declared like this:

TRACE_EVENT(sched_kthread_stop,
TP_PROTO(struct task_struct *t),
TP_ARGS(t),
TP_STRUCT__entry(
__array(char,   comm,   TASK_COMM_LEN   )
__field(pid_t,  pid )
),
TP_fast_assign(
memcpy(__entry->comm, t->comm, TASK_COMM_LEN);
__entry->pid = t->pid;
),
TP_printk("comm=%s pid=%d", __entry->comm, __entry->pid)
);

To call the above tracepoint from Rust code, you would add the relevant
header file to rust/bindings/bindings_helper.h and add the following
invocation somewhere in your Rust code:

declare_trace! {
fn sched_kthread_stop(task: *mut task_struct);
}

This will define a Rust function of the given name that you can call
like any other Rust function. Since these tracepoints often take raw
pointers as arguments, it may be convenient to wrap it in a safe
wrapper:

mod raw {
declare_trace! {
fn sched_kthread_stop(task: *mut task_struct);
}
}

#[inline]
pub fn trace_sched_kthread_stop(task: ) {
// SAFETY: The pointer to `task` is valid.
unsafe { raw::sched_kthread_stop(task.as_raw()) }
}

A future expansion of the tracepoint support could generate these safe
versions automatically, but that is left as future work for now.

This is intended for use in the Rust Binder driver, which was originally
sent as an RFC [1]. The RFC did not include tracepoint support, but you
can see how it will be used in Rust Binder at [2]. The author has
verified that the tracepoint support works on Android devices.

This implementation implements support for static keys in Rust so that
the actual static branch will end up in the Rust object file. However,
it would also be possible to just wrap the trace_##name generated by
__DECLARE_TRACE in an extern C function and then call that from Rust.
This will simplify the Rust code by removing the need for static
branches and calls, but it places the static branch behind an external
call, which has performance implications.


The tracepoints try very hard to minimize overhead of dormant tracepoints
so it is not frowned-upon to have them built into production binaries.
This is needed to make sure distribution vendors keep those tracepoints
in the kernel binaries that reach end-users.

Adding a function call before evaluation of the static branch goes against
this major goal.



A possible middle ground would be to place just the __DO_TRACE body in
an extern C function and to implement the Rust wrapper by doing the
static branch in Rust, and then calling into C the code that contains
__DO_TRACE when the tracepoint is active. However, this would need some
changes to include/linux/tracepoint.h to generate and export a function
containing the body of __DO_TRACE when the tracepoint should be callable
from Rust.


This tradeoff is more acceptable than having a function call before
evaluation of the static branch, but I wonder what is the upside of
this tradeoff compared to inlining the whole __DO_TRACE in Rust ?


So in general, there is a tradeoff between placing parts of the
tracepoint (which is perf sensitive) behind an external call, and having
code duplicated in both C and Rust (which must be kept in sync when
changes are made). This is an important point that I would like feedback
on from the C maintainers.


I don't see how the duplication happens there: __DO_TRACE is meant to be
inlined into each C tracepoint caller site, so the code is already meant
to be duplicated. Having an explicit function wrapping the tracepoint
for Rust would just create an extra instance of __DO_TRACE if it happens
to be also inlined into C code.

Or do you meant you would like to prevent having to duplicate the
implementation of __DO_TRACE in both C and Rust ?

I'm not sure if you mean to prevent source code duplication between
C and Rust or duplication of binary code (instructions).

Thanks,

Re: [RFC v3 net-next 1/7] net: add rx_sk to trace_kfree_skb

2024-06-06 Thread Yan Zhai

On Wed, Jun 5, 2024 at 6:57 PM Steven Rostedt  wrote:
>
> On Tue, 4 Jun 2024 14:47:38 -0700
> Yan Zhai  wrote:
>
> > skb does not include enough information to find out receiving
> > sockets/services and netns/containers on packet drops. In theory
> > skb->dev tells about netns, but it can get cleared/reused, e.g. by TCP
> > stack for OOO packet lookup. Similarly, skb->sk often identifies a local
> > sender, and tells nothing about a receiver.
> >
> > Allow passing an extra receiving socket to the tracepoint to improve
> > the visibility on receiving drops.
> >
> > Signed-off-by: Yan Zhai 
> > ---
> > v2->v3: fixed drop_monitor function prototype
> > ---
> >  include/trace/events/skb.h | 11 +++
> >  net/core/dev.c |  2 +-
> >  net/core/drop_monitor.c|  9 ++---
> >  net/core/skbuff.c  |  2 +-
> >  4 files changed, 15 insertions(+), 9 deletions(-)
> >
> > diff --git a/include/trace/events/skb.h b/include/trace/events/skb.h
> > index 07e0715628ec..aa6b46b6172c 100644
> > --- a/include/trace/events/skb.h
> > +++ b/include/trace/events/skb.h
> > @@ -24,15 +24,16 @@ DEFINE_DROP_REASON(FN, FN)
> >  TRACE_EVENT(kfree_skb,
> >
> >   TP_PROTO(struct sk_buff *skb, void *location,
> > -  enum skb_drop_reason reason),
> > +  enum skb_drop_reason reason, struct sock *rx_sk),
> >
> > - TP_ARGS(skb, location, reason),
> > + TP_ARGS(skb, location, reason, rx_sk),
> >
> >   TP_STRUCT__entry(
> >   __field(void *, skbaddr)
> >   __field(void *, location)
> >   __field(unsigned short, protocol)
> >   __field(enum skb_drop_reason,   reason)
> > + __field(void *, rx_skaddr)
>
> Please add the pointer after the other pointers:
>
> __field(void *, skbaddr)
> __field(void *, location)
> +   __field(void *, rx_skaddr)
> __field(unsigned short, protocol)
> __field(enum skb_drop_reason,   reason)
>
> otherwise you are adding holes in the ring buffer event.
>
> The TP_STRUCT__entry() is a structure that is saved in the ring buffer. We
> want to avoid alignment holes. I also question having a short before the
> enum, if the emum is 4 bytes. The short should be at the end.
>
> In fact, looking at the format file, there is a 2 byte hole:
>
>  # cat /sys/kernel/tracing/events/skb/kfree_skb/format
>
> name: kfree_skb
> ID: 1799
> format:
> field:unsigned short common_type;   offset:0;   size:2; 
> signed:0;
> field:unsigned char common_flags;   offset:2;   size:1; 
> signed:0;
> field:unsigned char common_preempt_count;   offset:3;   
> size:1; signed:0;
> field:int common_pid;   offset:4;   size:4; signed:1;
>
> field:void * skbaddr;   offset:8;   size:8; signed:0;
> field:void * location;  offset:16;  size:8; signed:0;
> field:unsigned short protocol;  offset:24;  size:2; signed:0;
> field:enum skb_drop_reason reason;  offset:28;  size:4; 
> signed:0;
>
> Notice that "protocol" is 2 bytes in size at offset 24, but "reason" starts
> at offset 28. This means at offset 26, there's a 2 byte hole.
>
The reason I added the pointer as the last argument is trying to
minimize the surprise to existing TP users, because for common ABIs
it's fine to omit later arguments when defining a function, but it
needs change and recompilation if the order of arguments changed.

Looking at the actual format after the change, it does not add a new
hole since protocol and reason are already packed into the same 8-byte
block, so rx_skaddr starts at 8-byte aligned offset:

# cat /sys/kernel/debug/tracing/events/skb/kfree_skb/format
name: kfree_skb
ID: 2260
format:
field:unsigned short common_type;   offset:0;
size:2; signed:0;
field:unsigned char common_flags;   offset:2;
size:1; signed:0;
field:unsigned char common_preempt_count;   offset:3;
 size:1; signed:0;
field:int common_pid;   offset:4;   size:4; signed:1;

field:void * skbaddr;   offset:8;   size:8; signed:0;
field:void * location;  offset:16;  size:8; signed:0;
field:unsigned short protocol;  offset:24;  size:2; signed:0;
field:enum skb_drop_reason reason;  offset:28;
size:4; signed:0;
field:void * rx_skaddr; offset:32;  size:8; signed:0;

Do you think we still need to change the order?

Yan


> -- Steve
>
>
>
> >   ),
> >
> >   TP_fast_assign(
> > @@ -40,12 +41,14 @@ TRACE_EVENT(kfree_skb,
> >   __entry->location = location;
> >   __entry->protocol = ntohs(skb->protocol);
> >   __entry->reason = reason;
> > + __entry->rx_skaddr = rx_sk;
> >   ),
> >
>

Re: [PATCH 2/3] rust: add static_key_false

2024-06-06 Thread Mathieu Desnoyers


On 2024-06-06 11:05, Alice Ryhl wrote:

Add just enough support for static key so that we can use it from
tracepoints. Tracepoints rely on `static_key_false` even though it is
deprecated, so we add the same functionality to Rust.

Signed-off-by: Alice Ryhl 
---
  rust/kernel/lib.rs|  1 +
  rust/kernel/static_key.rs | 87 +++
  scripts/Makefile.build|  2 +-
  3 files changed, 89 insertions(+), 1 deletion(-)

diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs
index d534b1178955..22e1fedd0774 100644
--- a/rust/kernel/lib.rs
+++ b/rust/kernel/lib.rs
@@ -39,6 +39,7 @@
  pub mod print;
  mod static_assert;
  pub mod static_call;
+pub mod static_key;
  #[doc(hidden)]
  pub mod std_vendor;
  pub mod str;
diff --git a/rust/kernel/static_key.rs b/rust/kernel/static_key.rs
new file mode 100644
index ..6c3dbe14c98a
--- /dev/null
+++ b/rust/kernel/static_key.rs
@@ -0,0 +1,87 @@
+// SPDX-License-Identifier: GPL-2.0


This static key code is something that can be generally useful
both in kernel and user-space. Is there anything that would prevent
licensing this under MIT right away instead so it could eventually be
re-used more broadly in userspace as well ?

Thanks,

Mathieu

--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com

Re: [PATCH 3/3] rust: add tracepoint support

2024-06-06 Thread Mathieu Desnoyers


On 2024-06-06 11:05, Alice Ryhl wrote:

Make it possible to have Rust code call into tracepoints defined by C
code. It is still required that the tracepoint is declared in a C
header, and that this header is included in the input to bindgen.

Signed-off-by: Alice Ryhl 


[...]


diff --git a/rust/helpers.c b/rust/helpers.c
index 2c37a0f5d7a8..0560cc2a512a 100644
--- a/rust/helpers.c
+++ b/rust/helpers.c
@@ -165,6 +165,30 @@ rust_helper_krealloc(const void *objp, size_t new_size, 
gfp_t flags)
  }
  EXPORT_SYMBOL_GPL(rust_helper_krealloc);
  
+void rust_helper_preempt_enable_notrace(void)

+{
+   preempt_enable_notrace();
+}
+EXPORT_SYMBOL_GPL(rust_helper_preempt_enable_notrace);
+
+void rust_helper_preempt_disable_notrace(void)
+{
+   preempt_disable_notrace();
+}
+EXPORT_SYMBOL_GPL(rust_helper_preempt_disable_notrace);
+
+bool rust_helper_current_cpu_online(void)
+{
+   return cpu_online(raw_smp_processor_id());
+}
+EXPORT_SYMBOL_GPL(rust_helper_current_cpu_online);
+
+void *rust_helper___rcu_dereference_raw(void **p)
+{
+   return rcu_dereference_raw(p);
+}
+EXPORT_SYMBOL_GPL(rust_helper___rcu_dereference_raw);


Ouch. Doing a function call for each of those small operations will
have a rather large performance impact when tracing is active. If it is
not possible to inline those in Rust, then implementing __DO_TRACE in
a C function would at least allow Rust to only do a single call to C
rather than go back and forth between Rust and C.

What prevents inlining those helpers in Rust ?

Thanks,

Mathieu

--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com

Re: [PATCH] livepatch: introduce klp_func called interface

2024-06-06 Thread Joe Lawrence

Hi Wardenjohn,

To follow up, Red Hat kpatch QE pointed me toward this test:

https://gitlab.com/redhat/centos-stream/tests/kernel/kernel-tests/-/tree/main/general/kpatch/kpatch-trace?ref_type=heads

which reports a few interesting things via systemd service and ftrace:

- Patched functions
- Traced functions
- Code that was modified, but didn't run
- Other code that ftrace saw, but is not listed in the sysfs directory

The code looks to be built on top of the same basic ftrace commands that
I sent the other day.  You may be able to reuse/copy/etc bits from the
scripts if they are handy.

--
Joe

Re: [PATCH 3/5] arm64: dts: qcom: sdx75: add missing qlink_logging reserved memory for mpss

2024-06-06 Thread Krzysztof Kozlowski

On 06/06/2024 16:38, Naina Mehta wrote:
> The qlink_logging memory region is also used by the modem firmware,
> add it to reserved memory regions.
> Also split MPSS DSM region into 2 separate regions.
> 
> Signed-off-by: Naina Mehta 
> ---
>  arch/arm64/boot/dts/qcom/sdx75.dtsi | 17 +
>  1 file changed, 13 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/arm64/boot/dts/qcom/sdx75.dtsi 
> b/arch/arm64/boot/dts/qcom/sdx75.dtsi
> index 9b93f6501d55..9349b1c4e196 100644
> --- a/arch/arm64/boot/dts/qcom/sdx75.dtsi
> +++ b/arch/arm64/boot/dts/qcom/sdx75.dtsi
> @@ -366,7 +366,12 @@
>   no-map;
>   };
>  
> - qdss_mem: qdss@8880 {
> + qdss_mem: qdss@8850 {
> + reg = <0x0 0x8850 0x0 0x30>;
> + no-map;
> + };
> +
> + qlink_logging_mem: qlink_logging@8880 {

Sorry, no downstream code.

Please follow DTS coding style - no underscores in node names. This
applies to all work sent upstream.



Best regards,
Krzysztof

Re: [PATCH v14 02/14] cgroup/misc: Add per resource callbacks for CSS events

2024-06-06 Thread Haitao Huang

On Thu, 06 Jun 2024 08:37:31 -0500, chenridong   
wrote:




   
If _misc_cg_res_alloc fails, maybe some types do not call ->alloc(), but  
all types ->free() callback >will be called, is that ok?


Not sure I understand. Are you suggesting we ignore failures from  
->alloc() callback in _misc_cg_res_alloc() as it is per-resource, and have  
->free() callback and resource provider of the failing type to handle the  
failure internally?


IIUC, this failure only happens when a specific subcgroup is created  
(memory running out for allocation) so failing that subcgroup as a whole  
seems fine to me. Note the root node is static and no pre-resource  
callbacks invoked by misc. And resource provider handles its own  
allocations for root. In SGX case we too declare a static object for  
corresponding root sgx_cgroup struct.


Note also misc cgroup (except for setting capacity[res] = 0 at root) is  
all or nothing so no mechanism to tell user "this resource does not work  
but others are fine in this particular cgroup."


Thanks
Haitao

Re: [PATCH 1/5] dt-bindings: remoteproc: qcom,sm8550-pas: document the SDX75 PAS

2024-06-06 Thread Krzysztof Kozlowski

On 06/06/2024 16:38, Naina Mehta wrote:
> Document the MPSS Peripheral Authentication Service on SDX75 platform.
> 
> Signed-off-by: Naina Mehta 
> ---
>  .../devicetree/bindings/remoteproc/qcom,sm8550-pas.yaml  | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git 
> a/Documentation/devicetree/bindings/remoteproc/qcom,sm8550-pas.yaml 
> b/Documentation/devicetree/bindings/remoteproc/qcom,sm8550-pas.yaml
> index 73fda7565cd1..02e15b1f78ab 100644
> --- a/Documentation/devicetree/bindings/remoteproc/qcom,sm8550-pas.yaml
> +++ b/Documentation/devicetree/bindings/remoteproc/qcom,sm8550-pas.yaml
> @@ -16,6 +16,7 @@ description:
>  properties:
>compatible:
>  enum:
> +  - qcom,sdx75-mpss-pas
>- qcom,sm8550-adsp-pas
>- qcom,sm8550-cdsp-pas
>- qcom,sm8550-mpss-pas

Missing updates to allOf constraints. Are you sure this is the binding
for SDX75?

Best regards,
Krzysztof

Re: [PATCH] net: missing check

2024-06-06 Thread Eric Dumazet

On Thu, Jun 6, 2024 at 4:14 PM Denis Arefev  wrote:
>
> Two missing check in virtio_net_hdr_to_skb() allowed syzbot
> to crash kernels again
>
> 1. After the skb_segment function the buffer may become non-linear
> (nr_frags != 0), but since the SKBTX_SHARED_FRAG flag is not set anywhere
> the __skb_linearize function will not be executed, then the buffer will
> remain non-linear. Then the condition (offset >= skb_headlen(skb))
> becomes true, which causes WARN_ON_ONCE in skb_checksum_help.
>
> 2. The struct sk_buff and struct virtio_net_hdr members must be
> mathematically related.
> (gso_size) must be greater than (needed) otherwise WARN_ON_ONCE.
> (remainder) must be greater than (needed) otherwise WARN_ON_ONCE.
> (remainder) may be 0 if division is without remainder.
>
> offset+2 (4191) > skb_headlen() (1116)
> WARNING: CPU: 1 PID: 5084 at net/core/dev.c:3303 
> skb_checksum_help+0x5e2/0x740 net/core/dev.c:3303
> Modules linked in:
> CPU: 1 PID: 5084 Comm: syz-executor336 Not tainted 
> 6.7.0-rc3-syzkaller-00014-gdf60cee26a2e #0
> Hardware name: Google Compute Engine/Google Compute Engine, BIOS Google 
> 11/10/2023
> RIP: 0010:skb_checksum_help+0x5e2/0x740 net/core/dev.c:3303
> Code: 89 e8 83 e0 07 83 c0 03 38 d0 7c 08 84 d2 0f 85 52 01 00 00 44 89 e2 2b 
> 53 74 4c 89 ee 48 c7 c7 40 57 e9 8b e8 af 8f dd f8 90 <0f> 0b 90 90 e9 87 fe 
> ff ff e8 40 0f 6e f9 e9 4b fa ff ff 48 89 ef
> RSP: 0018:c90003a9f338 EFLAGS: 00010286
> RAX:  RBX: 888025125780 RCX: 814db209
> RDX: 888015393b80 RSI: 814db216 RDI: 0001
> RBP: 8880251257f4 R08: 0001 R09: 
> R10:  R11: 0001 R12: 045c
> R13: 105f R14: 8880251257f0 R15: 105d
> FS:  55c24380() GS:8880b990() knlGS:
> CS:  0010 DS:  ES:  CR0: 80050033
> CR2: 2000f000 CR3: 23151000 CR4: 003506f0
> DR0:  DR1:  DR2: 
> DR3:  DR6: fffe0ff0 DR7: 0400
> Call Trace:
>  
>  ip_do_fragment+0xa1b/0x18b0 net/ipv4/ip_output.c:777
>  ip_fragment.constprop.0+0x161/0x230 net/ipv4/ip_output.c:584
>  ip_finish_output_gso net/ipv4/ip_output.c:286 [inline]
>  __ip_finish_output net/ipv4/ip_output.c:308 [inline]
>  __ip_finish_output+0x49c/0x650 net/ipv4/ip_output.c:295
>  ip_finish_output+0x31/0x310 net/ipv4/ip_output.c:323
>  NF_HOOK_COND include/linux/netfilter.h:303 [inline]
>  ip_output+0x13b/0x2a0 net/ipv4/ip_output.c:433
>  dst_output include/net/dst.h:451 [inline]
>  ip_local_out+0xaf/0x1a0 net/ipv4/ip_output.c:129
>  iptunnel_xmit+0x5b4/0x9b0 net/ipv4/ip_tunnel_core.c:82
>  ipip6_tunnel_xmit net/ipv6/sit.c:1034 [inline]
>  sit_tunnel_xmit+0xed2/0x28f0 net/ipv6/sit.c:1076
>  __netdev_start_xmit include/linux/netdevice.h:4940 [inline]
>  netdev_start_xmit include/linux/netdevice.h:4954 [inline]
>  xmit_one net/core/dev.c:3545 [inline]
>  dev_hard_start_xmit+0x13d/0x6d0 net/core/dev.c:3561
>  __dev_queue_xmit+0x7c1/0x3d60 net/core/dev.c:4346
>  dev_queue_xmit include/linux/netdevice.h:3134 [inline]
>  packet_xmit+0x257/0x380 net/packet/af_packet.c:276
>  packet_snd net/packet/af_packet.c:3087 [inline]
>  packet_sendmsg+0x24ca/0x5240 net/packet/af_packet.c:3119
>  sock_sendmsg_nosec net/socket.c:730 [inline]
>  __sock_sendmsg+0xd5/0x180 net/socket.c:745
>  __sys_sendto+0x255/0x340 net/socket.c:2190
>  __do_sys_sendto net/socket.c:2202 [inline]
>  __se_sys_sendto net/socket.c:2198 [inline]
>  __x64_sys_sendto+0xe0/0x1b0 net/socket.c:2198
>  do_syscall_x64 arch/x86/entry/common.c:51 [inline]
>  do_syscall_64+0x40/0x110 arch/x86/entry/common.c:82
>  entry_SYSCALL_64_after_hwframe+0x63/0x6b
>
> Signed-off-by: Denis Arefev 
> ---
>  include/linux/virtio_net.h | 10 ++
>  1 file changed, 10 insertions(+)
>
> diff --git a/include/linux/virtio_net.h b/include/linux/virtio_net.h
> index 4dfa9b69ca8d..77ebe908d746 100644
> --- a/include/linux/virtio_net.h
> +++ b/include/linux/virtio_net.h
> @@ -56,6 +56,7 @@ static inline int virtio_net_hdr_to_skb(struct sk_buff *skb,
> unsigned int thlen = 0;
> unsigned int p_off = 0;
> unsigned int ip_proto;
> +   u64 ret, remainder;
>
> if (hdr->gso_type != VIRTIO_NET_HDR_GSO_NONE) {
> switch (hdr->gso_type & ~VIRTIO_NET_HDR_GSO_ECN) {
> @@ -98,6 +99,15 @@ static inline int virtio_net_hdr_to_skb(struct sk_buff 
> *skb,
> u32 off = __virtio16_to_cpu(little_endian, hdr->csum_offset);
> u32 needed = start + max_t(u32, thlen, off + sizeof(__sum16));
>
> +   if (hdr->gso_size) {
> +   ret = div64_u64_rem(skb->len, hdr->gso_size, 
> );
> +   if (!(ret && (hdr->gso_size > needed) &&
> +   ((remainder > needed) || 
> (remainder == 0 {
> +

Re: [PATCH v14 14/14] selftests/sgx: Add scripts for EPC cgroup testing

2024-06-06 Thread Haitao Huang

On Thu, 06 Jun 2024 00:32:55 -0500, Jarkko Sakkinen   
wrote:



On Wed Jun 5, 2024 at 6:33 PM EEST, Haitao Huang wrote:


sgx_cgroup_try_charge() expects sgx_cg_wq, so it would break unless we
check and return 0 which was the initially implemented in v12. But then
Kai had some concern on that we expose all the interface files to allow
user to set limits but we don't enforce. To keep it simple we settled  
down

  ~~

Sure:

"Keep it simple and corpse"


back to BUG_ON(). This would only happen rarely and user can add
command-line to disable SGX if s/he really wants to start kernel in this
case, just can't do SGX.


Even disabling all of SGX would be a less catastrophical measure.


Yes I had a comment but Kai thought it was too obvious and I can't think
of a better one that's not obvious so I removed:


Not great advice given. Please just document it. In patch, which
BUG_ON() I don't want to see my R-by in it, until I've reviewed an
updated version.

BR, Jarkko



Sure. that makes sens.  So far I have following changes. I did not do the  
renaming for the functions as I am not sure it is still needed. Let me  
know otherwise and if I missed any more.


--- a/arch/x86/kernel/cpu/sgx/epc_cgroup.c
+++ b/arch/x86/kernel/cpu/sgx/epc_cgroup.c
@@ -329,7 +329,7 @@ const struct misc_res_ops sgx_cgroup_ops = {
 .free = sgx_cgroup_free,
 };

-void sgx_cgroup_init(void)
+int __init sgx_cgroup_init(void)
 {
 /*
  * misc root always exists even if misc is disabled from command line.
@@ -345,7 +345,8 @@ void sgx_cgroup_init(void)
 if (cgroup_subsys_enabled(misc_cgrp_subsys)) {
 sgx_cg_wq = alloc_workqueue("sgx_cg_wq", WQ_UNBOUND |  
WQ_FREEZABLE,

 WQ_UNBOUND_MAX_ACTIVE);
-BUG_ON(!sgx_cg_wq);
+return -ENOMEM;
 }

+return 0;
 }

--- a/arch/x86/kernel/cpu/sgx/epc_cgroup.h
+++ b/arch/x86/kernel/cpu/sgx/epc_cgroup.h
@@ -99,7 +99,7 @@ bool sgx_cgroup_should_reclaim(struct sgx_cgroup  
*sgx_cg);

 struct misc_cg *sgx_cgroup_reclaim_pages(struct misc_cg *root,
  struct misc_cg *next_cg,
  struct mm_struct *charge_mm);
-void sgx_cgroup_init(void);
+int __init sgx_cgroup_init(void);

 #endif /* CONFIG_CGROUP_MISC */


--- a/arch/x86/kernel/cpu/sgx/main.c
+++ b/arch/x86/kernel/cpu/sgx/main.c
@@ -1045,7 +1045,7 @@ static int __init sgx_init(void)
 if (!sgx_page_cache_init())
 return -ENOMEM;

-if (!sgx_page_reclaimer_init()) {
+if (!sgx_page_reclaimer_init() || !sgx_cgroup_init()) {
 ret = -ENOMEM;
 goto err_page_cache;
 }
@@ -1067,9 +1067,6 @@ static int __init sgx_init(void)
 if (sgx_vepc_init() && ret)
 goto err_provision;

-/* Setup cgroup if either the native or vepc driver is active */
-sgx_cgroup_init();
-
 return 0;

 err_provision:

--- a/arch/x86/kernel/cpu/sgx/sgx.h
+++ b/arch/x86/kernel/cpu/sgx/sgx.h
@@ -119,6 +119,7 @@ static inline void *sgx_get_epc_virt_addr(struct  
sgx_epc_page *page)

  * cgroup.
  */
 struct sgx_epc_lru_list {
+/* Use this lock to protect access from multiple reclamation worker  
threads */

 spinlock_t lock;
 struct list_head reclaimable;
 };

--- a/kernel/cgroup/misc.c
+++ b/kernel/cgroup/misc.c
@@ -446,7 +446,8 @@ misc_cg_alloc(struct cgroup_subsys_state *parent_css)
 int ret;

 if (!parent_css) {
-parent_cg = cg = _cg;
+parent_cg = _cg;
+cg = _cg;
 } else {
 cg = kzalloc(sizeof(*cg), GFP_KERNEL);
 if (!cg)

Thanks
Haitao

Re: [PATCH v2 1/5] ftrace: Rename dup_hash() and comment it

2024-06-06 Thread Mark Rutland

On Wed, Jun 05, 2024 at 02:03:35PM -0400, Steven Rostedt wrote:
> From: "Steven Rostedt (Google)" 
> 
> The name "dup_hash()" is a misnomer as it does not duplicate the hash that
> is passed in, but instead moves its entities from that hash to a newly
> allocated one. Rename it to "__move_hash()" (using starting underscores as
> it is an internal function), and add some comments about what it does.
> 
> Signed-off-by: Steven Rostedt (Google) 

Acked-by: Mark Rutland 

Mark.

> ---
>  kernel/trace/ftrace.c | 8 ++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
> index da7e6abf48b4..9dcdefe9d1aa 100644
> --- a/kernel/trace/ftrace.c
> +++ b/kernel/trace/ftrace.c
> @@ -1391,7 +1391,11 @@ ftrace_hash_rec_enable_modify(struct ftrace_ops *ops, 
> int filter_hash);
>  static int ftrace_hash_ipmodify_update(struct ftrace_ops *ops,
>  struct ftrace_hash *new_hash);
>  
> -static struct ftrace_hash *dup_hash(struct ftrace_hash *src, int size)
> +/*
> + * Allocate a new hash and remove entries from @src and move them to the new 
> hash.
> + * On success, the @src hash will be empty and should be freed.
> + */
> +static struct ftrace_hash *__move_hash(struct ftrace_hash *src, int size)
>  {
>   struct ftrace_func_entry *entry;
>   struct ftrace_hash *new_hash;
> @@ -1438,7 +1442,7 @@ __ftrace_hash_move(struct ftrace_hash *src)
>   if (ftrace_hash_empty(src))
>   return EMPTY_HASH;
>  
> - return dup_hash(src, size);
> + return __move_hash(src, size);
>  }
>  
>  static int
> -- 
> 2.43.0
> 
>

Re: [PATCH] remoteproc: mediatek: Increase MT8188 SCP core0 DRAM size

2024-06-06 Thread AngeloGioacchino Del Regno


Il 06/06/24 11:06, jason-ch chen ha scritto:

From: Jason Chen 

Increase MT8188 SCP core0 DRAM size for HEVC driver.



so the second core only gets a maximum of 0x20 of SRAM?
I'm not sure how useful is the secondary SCP core, at this point, with so little
available SRAM but... okay, as you wish.

Reviewed-by: AngeloGioacchino Del Regno 




Signed-off-by: Jason Chen 
---
  drivers/remoteproc/mtk_scp.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/remoteproc/mtk_scp.c b/drivers/remoteproc/mtk_scp.c
index b885a9a041e4..2119fc62c3f2 100644
--- a/drivers/remoteproc/mtk_scp.c
+++ b/drivers/remoteproc/mtk_scp.c
@@ -1390,7 +1390,7 @@ static const struct mtk_scp_sizes_data default_scp_sizes 
= {
  };
  
  static const struct mtk_scp_sizes_data mt8188_scp_sizes = {

-   .max_dram_size = 0x50,
+   .max_dram_size = 0x80,
.ipi_share_buffer_size = 600,
  };

Re: [PATCH] remoteproc: mediatek: Don't print error when optional scp handle is missing

2024-06-06 Thread AngeloGioacchino Del Regno


Il 05/06/24 21:35, Nícolas F. R. A. Prado ha scritto:

The scp_get() helper has two users: the mtk-vcodec and the mtk-mdp3
drivers. mdp3 considers the mediatek,scp phandle optional, and when it's
missing mdp3 will directly look for the scp node based on compatible.

For that reason printing an error in the helper when the handle is
missing is wrong, as it's not always an actual error. Besides, the other
user, vcodec, already prints an error message on its own when the helper
fails so this message doesn't add that much information.

Remove the error message from the helper. This gets rid of the deceptive
error on MT8195-Tomato:

   mtk-mdp3 14001000.dma-controller: can't get SCP node


I'm way more for giving it a SCP handle instead of silencing this print... the
SCP handle way *is* the correct one and I prefer it, as that'd also be the only
way to support Dual-Core SCP in MDP3.

I really want to see hardcoded stuff disappear and I really really *really* want
to see more consistency across DT nodes in MediaTek platforms.

So, still a one-line change, but do it on the devicetree instead of here please.

Cheers,
Angelo



Signed-off-by: Nícolas F. R. A. Prado 
---
  drivers/remoteproc/mtk_scp.c | 4 +---
  1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/remoteproc/mtk_scp.c b/drivers/remoteproc/mtk_scp.c
index b885a9a041e4..f813117b6312 100644
--- a/drivers/remoteproc/mtk_scp.c
+++ b/drivers/remoteproc/mtk_scp.c
@@ -37,10 +37,8 @@ struct mtk_scp *scp_get(struct platform_device *pdev)
struct platform_device *scp_pdev;
  
  	scp_node = of_parse_phandle(dev->of_node, "mediatek,scp", 0);

-   if (!scp_node) {
-   dev_err(dev, "can't get SCP node\n");
+   if (!scp_node)
return NULL;
-   }
  
  	scp_pdev = of_find_device_by_node(scp_node);

of_node_put(scp_node);

---
base-commit: d97496ca23a2d4ee80b7302849404859d9058bcd
change-id: 20240605-mt8195-dma-scp-node-err-6a8dea26d4f5

Best regards,

Re: [PATCH v2 2/6] livepatch: Add klp-convert tool

2024-06-06 Thread Lukas Hruska

Hello Joe,
> > +#define KLP_RELOC_SYMBOL_POS(LP_OBJ_NAME, SYM_OBJ_NAME, SYM_NAME, SYM_POS) 
> > \
> > +   asm("\".klp.sym.rela." #LP_OBJ_NAME "." #SYM_OBJ_NAME "." #SYM_NAME "," 
> > #SYM_POS "\"")
> 
> ^^^
> I think I found a potential bug, or at least compatiblity problem with
> including a comma "," character in this symbol format and older versions
> of the GNU assembler.  The good news is that other delimiter characters
> like "." or "#" seem to work out fine.

Thank you for spotting this. I was using binutils 2.38, so I did not
even notice this problem. Unfortunately, I was not able to make it work
with "#" as a delimiter; only "." worked. Additionally, any type of
parenthesis apparently has some special purpose even in labels, so they
are also not an option.

> If you want to reproduce, you'll need a version of `as` like binutils
> 2.36.1 and try building the samples/livepatch/livepatch-extern-symbol.ko
> and you should get an error like:
> 
>   Assembler messages:
>   Warning: missing closing '"'
>   Warning: missing closing '"'
>   Error: too many memory references for `movq'
> 
> 
> If you want to retrace my adventure, here are my steps:
> 
>   1) Clone klp-convert-tree repo branch containing this patchset +
>   Petr's review comments + a few helpful things for klp-convert
>   development:
>   
> $ git clone \
> --single-branch --branch=klp-convert-minimal-v1-review --depth=9 \
> https://github.com/joe-lawrence/klp-convert-tree.git
> [ ... snip ... ]
> $ cd klp-convert-tree
>   
>   2) Override .cross-dev defaults:
>   
> $ export BUILD_ARCHES=x86_64
> $ export COMPILER=gcc-11.1.0
> $ export URL=https://cdn.kernel.org/pub/tools/crosstool/files/bin/x86_64/
> $ export OUTDIR_PREFIX=$(pwd)/build
> $ export COMPILER_INSTALL_PATH=$(pwd)/cross-compiler
>   
>   3) Setup x86_64 default .config (this will download and install the
>   gcc-11.1.0 compiler from cdn.kernel.org):
>   
> $ ./cross-dev make defconfig
> 
> x86_64 : make defconfig ...
> Compiler will be installed in /root/klp-convert-tree/cross-compiler
> [ ... snip ... ]
>   
>   4) Add kernel livepatching configuration options:
>   
> $ ./cross-dev klp-config
> 
> Configuring x86_64 ...
> [ ... snip ... ]
> 
> $ grep LIVEPATCH "$OUTDIR_PREFIX"-x86_64/.config
> CONFIG_HAVE_LIVEPATCH=y
> CONFIG_LIVEPATCH=y
> CONFIG_SAMPLE_LIVEPATCH=m
>   
>   5) Run the cross-compiler build until it hits a build error on
>   livepatch-extern-symbol.ko:
>   
> $ ./cross-dev make -j$(nproc)
> [ ... snip ... ]
> make: Target '__all' not remade because of errors.
> [ x86_64 : make -j48 = FAIL ]
>   
>   6) With pre-requisites already built, retry the external symbol sample
>   and add -save-temps to the KCFLAGS to keep the generated assembly file:
>   
> $ KCFLAGS="-save-temps=obj" ./cross-dev make 
> samples/livepatch/livepatch-extern-symbol.ko
> [ ... snip ... ]
> samples/livepatch/livepatch-extern-symbol.s: Assembler messages:
> samples/livepatch/livepatch-extern-symbol.s:103: Warning: missing closing 
> '"'
> samples/livepatch/livepatch-extern-symbol.s:103: Warning: missing closing 
> '"'
> samples/livepatch/livepatch-extern-symbol.s:103: Error: too many memory 
> references for `movq'
> [ ... snip ... ]
>   
>   7) Which line is that?
>   
> $ awk 'NR==103' 
> "$OUTDIR_PREFIX"-x86_64/samples/livepatch/livepatch-extern-symbol.s
> movq
> ".klp.sym.rela.vmlinux.vmlinux.saved_command_line,0"(%rip), %rdx
> 
> 
> You could alternatively poke at it through the compiler explorer service
> and toggle the source and binutils versions:
> 
>   (error)   binutils 2.36.1 : https://godbolt.org/z/cGGs6rfWe
>   (success) binutils 2.38   : https://godbolt.org/z/ffzza3vYd

Thank you for those detailed step-by-step instruction to reproduce it!
It helped me a lot to understand the problem.

Lukas

Re: [PATCH] arm64: dts: qcom: qcm6490-fairphone-fp5: Use .mbn firmware for IPA

2024-06-06 Thread Konrad Dybcio

On 6.06.2024 11:09 AM, Luca Weiss wrote:
> Specify the file name for the squashed/non-split firmware with the .mbn
> extension instead of the split .mdt. The kernel can load both but the
> squashed version is preferred in dts nowadays.
> 
> Signed-off-by: Luca Weiss 
> ---

Reviewed-by: Konrad Dybcio 

Konrad

Re: [PATCH v8 0/5] soc: qcom: add in-kernel pd-mapper implementation

2024-06-06 Thread Neil Armstrong


On 11/05/2024 23:56, Dmitry Baryshkov wrote:

Protection domain mapper is a QMI service providing mapping between
'protection domains' and services supported / allowed in these domains.
For example such mapping is required for loading of the WiFi firmware or
for properly starting up the UCSI / altmode / battery manager support.

The existing userspace implementation has several issue. It doesn't play
well with CONFIG_EXTRA_FIRMWARE, it doesn't reread the JSON files if the
firmware location is changed (or if the firmware was not available at
the time pd-mapper was started but the corresponding directory is
mounted later), etc.

However this configuration is largely static and common between
different platforms. Provide in-kernel service implementing static
per-platform data.

To: Bjorn Andersson 
To: Konrad Dybcio 
To: Sibi Sankar 
To: Mathieu Poirier 
Cc: linux-arm-...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-remotep...@vger.kernel.org
Cc: Johan Hovold 
Cc: Xilin Wu 
Cc: "Bryan O'Donoghue" 
Cc: Steev Klimaszewski 
Cc: Alexey Minnekhanov 

--

Changes in v8:
- Reworked pd-mapper to register as an rproc_subdev / auxdev
- Dropped Tested-by from Steev and Alexey from the last patch since the
   implementation was changed significantly.
- Add sensors, cdsp and mpss_root domains to 660 config (Alexey
   Minnekhanov)
- Added platform entry for sm4250 (used for qrb4210 / RB2)
- Added locking to the pdr_get_domain_list() (Chris Lew)
- Remove the call to qmi_del_server() and corresponding API (Chris Lew)
- In qmi_handle_init() changed 1024 to a defined constant (Chris Lew)
- Link to v7: 
https://lore.kernel.org/r/20240424-qcom-pd-mapper-v7-0-05f7fc646...@linaro.org

Changes in v7:
- Fixed modular build (Steev)
- Link to v6: 
https://lore.kernel.org/r/20240422-qcom-pd-mapper-v6-0-f96957d01...@linaro.org

Changes in v6:
- Reworked mutex to fix lockdep issue on deregistration
- Fixed dependencies between PD-mapper and remoteproc to fix modular
   builds (Krzysztof)
- Added EXPORT_SYMBOL_GPL to fix modular builds (Krzysztof)
- Fixed kerneldocs (Krzysztof)
- Removed extra pr_debug messages (Krzysztof)
- Fixed wcss build (Krzysztof)
- Added platforms which do not require protection domain mapping to
   silence the notice on those platforms
- Link to v5: 
https://lore.kernel.org/r/20240419-qcom-pd-mapper-v5-0-e35b6f847...@linaro.org

Changes in v5:
- pdr: drop lock in pdr_register_listener, list_lock is already held (Chris Lew)
- pd_mapper: reworked to provide static configuration per platform
   (Bjorn)
- Link to v4: 
https://lore.kernel.org/r/20240311-qcom-pd-mapper-v4-0-24679cca5...@linaro.org

Changes in v4:
- Fixed missing chunk, reenabled kfree in qmi_del_server (Konrad)
- Added configuration for sm6350 (Thanks to Luca)
- Removed RFC tag (Konrad)
- Link to v3: 
https://lore.kernel.org/r/20240304-qcom-pd-mapper-v3-0-6858fa1ac...@linaro.org

Changes in RFC v3:
- Send start / stop notifications when PD-mapper domain list is changed
- Reworked the way PD-mapper treats protection domains, register all of
   them in a single batch
- Added SC7180 domains configuration based on TCL Book 14 GO
- Link to v2: 
https://lore.kernel.org/r/20240301-qcom-pd-mapper-v2-0-5d12a081d...@linaro.org

Changes in RFC v2:
- Swapped num_domains / domains (Konrad)
- Fixed an issue with battery not working on sc8280xp
- Added missing configuration for QCS404

---
Dmitry Baryshkov (5):
   soc: qcom: pdr: protect locator_addr with the main mutex
   soc: qcom: pdr: fix parsing of domains lists
   soc: qcom: pdr: extract PDR message marshalling data
   soc: qcom: add pd-mapper implementation
   remoteproc: qcom: enable in-kernel PD mapper

  drivers/remoteproc/qcom_common.c|  87 +
  drivers/remoteproc/qcom_common.h|  10 +
  drivers/remoteproc/qcom_q6v5_adsp.c |   3 +
  drivers/remoteproc/qcom_q6v5_mss.c  |   3 +
  drivers/remoteproc/qcom_q6v5_pas.c  |   3 +
  drivers/remoteproc/qcom_q6v5_wcss.c |   3 +
  drivers/soc/qcom/Kconfig|  15 +
  drivers/soc/qcom/Makefile   |   2 +
  drivers/soc/qcom/pdr_interface.c|  17 +-
  drivers/soc/qcom/pdr_internal.h | 318 ++---
  drivers/soc/qcom/qcom_pd_mapper.c   | 676 
  drivers/soc/qcom/qcom_pdr_msg.c | 353 +++
  12 files changed, 1190 insertions(+), 300 deletions(-)
---
base-commit: e5119bbdaca76cd3c15c3c975d51d840bbfb2488
change-id: 20240301-qcom-pd-mapper-e12d622d4ad0

Best regards,


Tested-by: Neil Armstrong  # on SM8550-QRD
Tested-by: Neil Armstrong  # on SM8550-HDK
Tested-by: Neil Armstrong  # on SM8650-QRD

Thanks,
Neil

Re: [PATCH v8 1/5] soc: qcom: pdr: protect locator_addr with the main mutex

2024-06-06 Thread Dmitry Baryshkov

On Thu, 6 Jun 2024 at 01:48, Chris Lew  wrote:
>
> Hi Dmitry,
>
> On 5/11/2024 2:56 PM, Dmitry Baryshkov wrote:
> ...
> > @@ -76,12 +76,12 @@ static int pdr_locator_new_server(struct qmi_handle 
> > *qmi,
> > locator_hdl);
> >   struct pdr_service *pds;
> >
> > + mutex_lock(>lock);
> >   /* Create a local client port for QMI communication */
> >   pdr->locator_addr.sq_family = AF_QIPCRTR;
> >   pdr->locator_addr.sq_node = svc->node;
> >   pdr->locator_addr.sq_port = svc->port;
> >
> > - mutex_lock(>lock);
> >   pdr->locator_init_complete = true;
> >   mutex_unlock(>lock);
> >
> > @@ -104,10 +104,10 @@ static void pdr_locator_del_server(struct qmi_handle 
> > *qmi,
> >
> >   mutex_lock(>lock);
> >   pdr->locator_init_complete = false;
> > - mutex_unlock(>lock);
> >
> >   pdr->locator_addr.sq_node = 0;
> >   pdr->locator_addr.sq_port = 0;
> > + mutex_unlock(>lock);
> >   }
> >
> >   static const struct qmi_ops pdr_locator_ops = {
> > @@ -365,6 +365,7 @@ static int pdr_get_domain_list(struct 
> > servreg_get_domain_list_req *req,
> >   if (ret < 0)
> >   return ret;
> >
> > + mutex_lock(>lock);
> >   ret = qmi_send_request(>locator_hdl,
> >  >locator_addr,
> >  , SERVREG_GET_DOMAIN_LIST_REQ,
> > @@ -373,15 +374,16 @@ static int pdr_get_domain_list(struct 
> > servreg_get_domain_list_req *req,
> >  req);
> >   if (ret < 0) {
> >   qmi_txn_cancel();
> > - return ret;
> > + goto err_unlock;
> >   }
> >
> >   ret = qmi_txn_wait(, 5 * HZ);
> >   if (ret < 0) {
> >   pr_err("PDR: %s get domain list txn wait failed: %d\n",
> >  req->service_name, ret);
> > - return ret;
> > + goto err_unlock;
> >   }
> > + mutex_unlock(>lock);
>
> I'm not sure it is necessary to hold the the mutex during the
> qmi_txn_wait() since the only variable we are trying to protect is
> locator_addr.
>
> Wouldn't this delay other work like new/del server notifications if this
> qmi service is delayed or non-responsive?
>

I've verified, the addr is stored inside the message data by the
enqueueing functions, so the locator_addr isn't referenced after the
function returns. I'll reduce the locking scope.


-- 
With best wishes
Dmitry

RE: [PATCH] remoteproc: imx_rproc: Adjust phandle parsing issue while remapping optional addresses in imx_rproc_addr_init()

2024-06-06 Thread Peng Fan

Hi Aleksandr,

> Subject: [PATCH] remoteproc: imx_rproc: Adjust phandle parsing issue while
> remapping optional addresses in imx_rproc_addr_init()
> 
> In imx_rproc_addr_init() "nph = of_count_phandle_with_args()" just counts
> number of phandles. But phandles may be empty. So of_parse_phandle() in
> the parsing loop (0 < a < nph) may return NULL which is later dereferenced.
> Adjust this issue by adding NULL-return check.

It is good to add a check here. But I am not sure whether this will really
happen.

> 
> Found by Linux Verification Center (linuxtesting.org) with SVACE.
> 
> Fixes: a0ff4aa6f010 ("remoteproc: imx_rproc: add a NXP/Freescale imx_rproc
> driver")
> Signed-off-by: Aleksandr Mishin 

Anyway LGTM:
Reviewed-by: Peng Fan 

Thanks,
Peng.

Re: [PATCH v3 1/5] dt-bindings: remoteproc: qcom,sa8775p-pas: Document the SA8775p ADSP, CDSP and GPDSP

2024-06-06 Thread Krzysztof Kozlowski

On 05/06/2024 18:56, Bartosz Golaszewski wrote:
> From: Bartosz Golaszewski 
> 
> Document the components used to boot the ADSP, CDSP0, CDSP1, GPDSP0 and
> GPDSP1 on the SA8775p SoC.
> 
> Signed-off-by: Bartosz Golaszewski 
> ---
>  .../bindings/remoteproc/qcom,sa8775p-pas.yaml  | 160 
> +
>  1 file changed, 160 insertions(+)

Reviewed-by: Krzysztof Kozlowski 

Best regards,
Krzysztof

Re: [PATCH] tools/virtio: Use the __GFP_ZERO flag of kmalloc to complete the memory initialization.

2024-06-06 Thread cuitao


Sorry, maybe I'm wrong.

I wonder if the gfp parameter in static inline void *kmalloc(size_t s, 
gfp_t gfp) can be deleted if it is not used.


Or would be better to move memset to kmalloc.

Like this:
#define __GFP_ZERO 0x1
static inline void *kmalloc(size_t s, gfp_t gfp)
{
    void *p;
    if (__kmalloc_fake)
    return __kmalloc_fake;

    p = malloc(s);
    if (!p)
    return p;

    if (gfp & __GFP_ZERO)
    memset(p, 0, s);
    return p;
}

在 2024/6/6 15:18, Michael S. Tsirkin 写道:

On Wed, Jun 05, 2024 at 09:52:45PM +0800, cuitao wrote:

Use the __GFP_ZERO flag of kmalloc to initialize memory while allocating it,
without the need for an additional memset call.

Signed-off-by: cuitao 
---
  tools/virtio/linux/kernel.h | 5 +
  1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/tools/virtio/linux/kernel.h b/tools/virtio/linux/kernel.h
index 6702008f7f5c..9e401fb7c215 100644
--- a/tools/virtio/linux/kernel.h
+++ b/tools/virtio/linux/kernel.h
@@ -66,10 +66,7 @@ static inline void *kmalloc_array(unsigned n, size_t s, 
gfp_t gfp)
  
  static inline void *kzalloc(size_t s, gfp_t gfp)

  {
-   void *p = kmalloc(s, gfp);
-
-   memset(p, 0, s);
-   return p;
+   return kmalloc(s, gfp | __GFP_ZERO);
  }


Why do we care? It's just here to make things compile. The simpler the
better.


  static inline void *alloc_pages_exact(size_t s, gfp_t gfp)
--
2.25.1

Re: [PATCH] tools/virtio: Use the __GFP_ZERO flag of kmalloc to complete the memory initialization.

2024-06-06 Thread Michael S. Tsirkin

On Wed, Jun 05, 2024 at 09:52:45PM +0800, cuitao wrote:
> Use the __GFP_ZERO flag of kmalloc to initialize memory while allocating it,
> without the need for an additional memset call.
> 
> Signed-off-by: cuitao 
> ---
>  tools/virtio/linux/kernel.h | 5 +
>  1 file changed, 1 insertion(+), 4 deletions(-)
> 
> diff --git a/tools/virtio/linux/kernel.h b/tools/virtio/linux/kernel.h
> index 6702008f7f5c..9e401fb7c215 100644
> --- a/tools/virtio/linux/kernel.h
> +++ b/tools/virtio/linux/kernel.h
> @@ -66,10 +66,7 @@ static inline void *kmalloc_array(unsigned n, size_t s, 
> gfp_t gfp)
>  
>  static inline void *kzalloc(size_t s, gfp_t gfp)
>  {
> - void *p = kmalloc(s, gfp);
> -
> - memset(p, 0, s);
> - return p;
> + return kmalloc(s, gfp | __GFP_ZERO);
>  }


Why do we care? It's just here to make things compile. The simpler the
better.

>  static inline void *alloc_pages_exact(size_t s, gfp_t gfp)
> -- 
> 2.25.1

Re: [PATCH 1/1] dt-bindings: remoteproc: imx_rproc: add minItems for power-domain

2024-06-06 Thread Krzysztof Kozlowski

On 05/06/2024 21:34, Frank Li wrote:
> "fsl,imx8qxp-cm4" just need 2 power domains. Keep the same restriction for
> other compatible string.
> 
> Signed-off-by: Frank Li 
> ---
> 
> Notes:
> pass dt_binding_check.
> 
> make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- -j8  dt_binding_check 
> DT_SCHEMA_FILES=fsl,imx-rproc.yaml
>   SCHEMA  Documentation/devicetree/bindings/processed-schema.json
>   CHKDT   Documentation/devicetree/bindings
>   LINTDocumentation/devicetree/bindings
>   DTEX
> Documentation/devicetree/bindings/remoteproc/fsl,imx-rproc.example.dts
>   DTC_CHK 
> Documentation/devicetree/bindings/remoteproc/fsl,imx-rproc.example.dtb
> 
>  .../bindings/remoteproc/fsl,imx-rproc.yaml   | 16 
>  1 file changed, 16 insertions(+)
> 
> diff --git a/Documentation/devicetree/bindings/remoteproc/fsl,imx-rproc.yaml 
> b/Documentation/devicetree/bindings/remoteproc/fsl,imx-rproc.yaml
> index df36e29d974ca..60267c1ba94e0 100644
> --- a/Documentation/devicetree/bindings/remoteproc/fsl,imx-rproc.yaml
> +++ b/Documentation/devicetree/bindings/remoteproc/fsl,imx-rproc.yaml
> @@ -59,6 +59,7 @@ properties:
>  maxItems: 32
>  
>power-domains:
> +minItems: 1

Then here should be 2.

>  maxItems: 8
>  
>fsl,auto-boot:
> @@ -99,6 +100,21 @@ allOf:
>properties:
>  fsl,iomuxc-gpr: false
>  
> +  - if:
> +  properties:
> +compatible:
> +  contains:
> +enum:
> +  - fsl,imx8qxp-cm4
> +then:
> +  properties:
> +power-domains:
> +  minItems: 2



Best regards,
Krzysztof

Re: [PATCH v14 14/14] selftests/sgx: Add scripts for EPC cgroup testing

2024-06-06 Thread Jarkko Sakkinen

On Thu Jun 6, 2024 at 9:20 AM EEST, Jarkko Sakkinen wrote:
> > There are existing code where BUG_ON() is used during the kernel early 
> > boot code when memory allocation fails (e.g., see cgroup_init_subsys()), 
> > so it might be acceptable to use BUG_ON() here, but it's up to 
> > maintainers to decide whether it is OK.
>
> When it is not possible continue to run the system at all, and only
> then.
>
> Here it is possible. Without SGX.

With this logic sgx_init() should call BUG() in the error paths.

BR, Jarkko

Re: [PATCH v14 14/14] selftests/sgx: Add scripts for EPC cgroup testing

2024-06-06 Thread Jarkko Sakkinen

On Thu Jun 6, 2024 at 1:30 AM EEST, Huang, Kai wrote:
>
> >> Reorg:
> >>
> >> void sgx_cgroup_init(void)
> >> {
> >> struct workqueue_struct *wq;
> >>
> >> /* eagerly allocate the workqueue: */
> >> wq = alloc_workqueue("sgx_cg_wq", wq_unbound | wq_freezable, 
> >> wq_unbound_max_active);
> >> if (!wq) {
> >>     pr_warn("sgx_cg_wq creation failed\n");
> >>     return;
> > 
> > sgx_cgroup_try_charge() expects sgx_cg_wq, so it would break unless we 
> > check and return 0 which was the initially implemented in v12. But then 
> > Kai had some concern on that we expose all the interface files to allow 
> > user to set limits but we don't enforce. To keep it simple we settled 
> > down back to BUG_ON(). 
>
> [...]
>
> > This would only happen rarely and user can add 
> > command-line to disable SGX if s/he really wants to start kernel in this 
> > case, just can't do SGX.
>
> Just to be clear that I don't like BUG_ON() either.  It's inevitable you 
> will get attention because of using it.

Just then plain disable SGX if it fails to start.

Do not take down the whole system.

> This is a compromise that you don't want to reset "capacity" to 0 when 
> alloc_workqueue() fails.

BUG_ON() is not a "compromise".

> There are existing code where BUG_ON() is used during the kernel early 
> boot code when memory allocation fails (e.g., see cgroup_init_subsys()), 
> so it might be acceptable to use BUG_ON() here, but it's up to 
> maintainers to decide whether it is OK.

When it is not possible continue to run the system at all, and only
then.

Here it is possible. Without SGX.

BR, Jarkko

Re: [PATCH v14 14/14] selftests/sgx: Add scripts for EPC cgroup testing

2024-06-05 Thread Jarkko Sakkinen

On Wed Jun 5, 2024 at 6:33 PM EEST, Haitao Huang wrote:

> sgx_cgroup_try_charge() expects sgx_cg_wq, so it would break unless we  
> check and return 0 which was the initially implemented in v12. But then  
> Kai had some concern on that we expose all the interface files to allow  
> user to set limits but we don't enforce. To keep it simple we settled down  
  ~~

Sure:

"Keep it simple and corpse"

> back to BUG_ON(). This would only happen rarely and user can add  
> command-line to disable SGX if s/he really wants to start kernel in this  
> case, just can't do SGX.

Even disabling all of SGX would be a less catastrophical measure.

> Yes I had a comment but Kai thought it was too obvious and I can't think  
> of a better one that's not obvious so I removed:

Not great advice given. Please just document it. In patch, which
BUG_ON() I don't want to see my R-by in it, until I've reviewed an
updated version.

BR, Jarkko

Re: (subset) [PATCH 0/2] Add HTC One (M8) support

2024-06-05 Thread Bjorn Andersson



On Mon, 03 Jun 2024 02:28:55 -0400, Alexandre Messier wrote:
> Add an initial device tree to support the HTC One (M8) smartphone,
> aka "htc,m8".
> 
> 

Applied, thanks!

[2/2] ARM: dts: qcom: Add initial support for HTC One (M8)
  commit: 0e8a41e511c98f5f5796c0dca8ff983d1c967b93

Best regards,
-- 
Bjorn Andersson

回复: Re: [PATCH] tools/virtio: Use the __GFP_ZERO flag of kmalloc to complete the memory initialization.

2024-06-05 Thread 崔涛

Sorry, this is a stupid mistake.
 
I wonder if the gfp parameter in static inline void *kmalloc(size_t s, gfp_t gfp) can be deleted if it is not used.
 
Or would be better to move memset to kmalloc.
Like this:
#define __GFP_ZERO 0x1
static inline void *kmalloc(size_t s, gfp_t gfp)
{
 void *p;
 if (__kmalloc_fake)
 return __kmalloc_fake;
 
 p = malloc(s);
 if (!p)
         return p;
 
 if (gfp & __GFP_ZERO)
 memset(p, 0, s);
 return p;
}


 

主　题：Re: [PATCH] tools/virtio: Use the __GFP_ZERO flag of kmalloc to complete the memory initialization. 日　期：2024-06-06 08:29 发件人：jasowang 收件人：崔涛;



On Wed, Jun 5, 2024 at 9:56 PM cuitao wrote:>> Use the __GFP_ZERO flag of kmalloc to initialize memory while allocating it,> without the need for an additional memset call.>> Signed-off-by: cuitao > ---> tools/virtio/linux/kernel.h | 5 +> 1 file changed, 1 insertion(+), 4 deletions(-)>> diff --git a/tools/virtio/linux/kernel.h b/tools/virtio/linux/kernel.h> index 6702008f7f5c..9e401fb7c215 100644> --- a/tools/virtio/linux/kernel.h> +++ b/tools/virtio/linux/kernel.h> @@ -66,10 +66,7 @@ static inline void *kmalloc_array(unsigned n, size_t s, gfp_t gfp)>> static inline void *kzalloc(size_t s, gfp_t gfp)> {> - void *p = kmalloc(s, gfp);> -> - memset(p, 0, s);> - return p;> + return kmalloc(s, gfp | __GFP_ZERO);> }>> static inline void *alloc_pages_exact(size_t s, gfp_t gfp)> --> 2.25.1>Does this really work?extern void *__kmalloc_fake, *__kfree_ignore_start, *__kfree_ignore_end;static inline void *kmalloc(size_t s, gfp_t gfp){if (__kmalloc_fake)return __kmalloc_fake;return malloc(s);}Thanks

Re: [PATCH] fgraph: Remove some unused functions

2024-06-05 Thread Steven Rostedt

On Thu,  6 Jun 2024 10:10:53 +0800
Jiapeng Chong  wrote:

> These functions are defined in the fgraph.c file, but not
> called elsewhere, so delete these unused functions.
> 
> kernel/trace/fgraph.c:273:1: warning: unused function 'set_bitmap_bits'.
> kernel/trace/fgraph.c:259:19: warning: unused function 'get_fgraph_type'.
> 

Thanks, these are leftovers from the rewrite.

-- Steve

Re: [PATCH] ftrace: adding the missing parameter descriptions of unregister_ftrace_direct

2024-06-05 Thread Steven Rostedt

On Mon, 27 May 2024 21:50:46 -0300
MarileneGarcia  wrote:

The subject for the tracing subsystem should start with a capital letter,
but it is a bit confusing anyway. Should be:

   ftrace: Add missing kerneldoc parameters to unregister_ftrace_direct()


> Adding the description of the parameters addr and free_filters
> of the function unregister_ftrace_direct.

s/Adding/Add/  s/of the/to the/

> 
> Signed-off-by: MarileneGarcia 
> ---
> Hello, 
> These changes fix the following compiler warnings of the function
> unregister_ftrace_direct.
> 
> The warnings happen using GCC compiler, enabling the ftrace related 
> configs and using the command 'make W=1'.
> 
> kernel/trace/ftrace.c:5489: warning: Function parameter or struct member
> 'addr' not described in 'unregister_ftrace_direct'
> 
> kernel/trace/ftrace.c:5489: warning: Function parameter or struct member 
> 'free_filters' not described in 'unregister_ftrace_direct'
> 
>  kernel/trace/ftrace.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
> index 65208d3b5ed9..6062e4ce1957 100644
> --- a/kernel/trace/ftrace.c
> +++ b/kernel/trace/ftrace.c
> @@ -5475,6 +5475,8 @@ EXPORT_SYMBOL_GPL(register_ftrace_direct);
>   * unregister_ftrace_direct - Remove calls to custom trampoline
>   * previously registered by register_ftrace_direct for @ops object.
>   * @ops: The address of the struct ftrace_ops object
> + * @addr: The address of the trampoline to call at @ops functions

This is the unregister function. The above sounds like it will be called
instead of no longer being called.

 @addr: The address of the direct function that are called by the @ops functions

> + * @free_filters: non zero to remove all filters for the ftrace_ops

It's a boolean value, there is no zero.

>   *
>   * This is used to remove a direct calls to @addr from the nop locations
>   * of the functions registered in @ops (with by ftrace_set_filter_ip


-- Steve

Re: [PATCH v8 2/5] soc: qcom: pdr: fix parsing of domains lists

2024-06-05 Thread Chris Lew





On 5/11/2024 2:56 PM, Dmitry Baryshkov wrote:

While parsing the domains list, start offsets from 0 rather than from
domains_read. The domains_read is equal to the total count of the
domains we have seen, while the domains list in the message starts from
offset 0.

Fixes: fbe639b44a82 ("soc: qcom: Introduce Protection Domain Restart helpers")
Tested-by: Steev Klimaszewski 
Tested-by: Alexey Minnekhanov 
Signed-off-by: Dmitry Baryshkov 
---
  drivers/soc/qcom/pdr_interface.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/soc/qcom/pdr_interface.c b/drivers/soc/qcom/pdr_interface.c
index e014dd2d8ab3..d495ee736519 100644
--- a/drivers/soc/qcom/pdr_interface.c
+++ b/drivers/soc/qcom/pdr_interface.c
@@ -422,7 +422,7 @@ static int pdr_locate_service(struct pdr_handle *pdr, 
struct pdr_service *pds)
if (ret < 0)
goto out;
  
-		for (i = domains_read; i < resp->domain_list_len; i++) {

+   for (i = 0; i < resp->domain_list_len; i++) {
entry = >domain_list[i];
  
  			if (strnlen(entry->name, sizeof(entry->name)) == sizeof(entry->name))




Reviewed-by: Chris Lew

Re: [PATCH] tools/virtio: Use the __GFP_ZERO flag of kmalloc to complete the memory initialization.

2024-06-05 Thread Jason Wang

On Wed, Jun 5, 2024 at 9:56 PM cuitao  wrote:
>
> Use the __GFP_ZERO flag of kmalloc to initialize memory while allocating it,
> without the need for an additional memset call.
>
> Signed-off-by: cuitao 
> ---
>  tools/virtio/linux/kernel.h | 5 +
>  1 file changed, 1 insertion(+), 4 deletions(-)
>
> diff --git a/tools/virtio/linux/kernel.h b/tools/virtio/linux/kernel.h
> index 6702008f7f5c..9e401fb7c215 100644
> --- a/tools/virtio/linux/kernel.h
> +++ b/tools/virtio/linux/kernel.h
> @@ -66,10 +66,7 @@ static inline void *kmalloc_array(unsigned n, size_t s, 
> gfp_t gfp)
>
>  static inline void *kzalloc(size_t s, gfp_t gfp)
>  {
> -   void *p = kmalloc(s, gfp);
> -
> -   memset(p, 0, s);
> -   return p;
> +   return kmalloc(s, gfp | __GFP_ZERO);
>  }
>
>  static inline void *alloc_pages_exact(size_t s, gfp_t gfp)
> --
> 2.25.1
>

Does this really work?

extern void *__kmalloc_fake, *__kfree_ignore_start, *__kfree_ignore_end;
static inline void *kmalloc(size_t s, gfp_t gfp)
{
  if (__kmalloc_fake)
return __kmalloc_fake;
return malloc(s);
}

Thanks

Re: [PATCH net-next V2] virtio-net: synchronize operstate with admin state on up/down

2024-06-05 Thread Jason Wang

On Fri, May 31, 2024 at 8:18 AM Jason Wang  wrote:
>
> On Thu, May 30, 2024 at 9:09 PM Michael S. Tsirkin  wrote:
> >
> > On Thu, May 30, 2024 at 06:29:51PM +0800, Jason Wang wrote:
> > > On Thu, May 30, 2024 at 2:10 PM Michael S. Tsirkin  
> > > wrote:
> > > >
> > > > On Thu, May 30, 2024 at 11:20:55AM +0800, Jason Wang wrote:
> > > > > This patch synchronize operstate with admin state per RFC2863.
> > > > >
> > > > > This is done by trying to toggle the carrier upon open/close and
> > > > > synchronize with the config change work. This allows propagate status
> > > > > correctly to stacked devices like:
> > > > >
> > > > > ip link add link enp0s3 macvlan0 type macvlan
> > > > > ip link set link enp0s3 down
> > > > > ip link show
> > > > >
> > > > > Before this patch:
> > > > >
> > > > > 3: enp0s3:  mtu 1500 qdisc pfifo_fast state DOWN 
> > > > > mode DEFAULT group default qlen 1000
> > > > > link/ether 00:00:05:00:00:09 brd ff:ff:ff:ff:ff:ff
> > > > > ..
> > > > > 5: macvlan0@enp0s3:  mtu 1500 
> > > > > qdisc noqueue state UP mode DEFAULT group default qlen 1000
> > > > > link/ether b2:a9:c5:04:da:53 brd ff:ff:ff:ff:ff:ff
> > > > >
> > > > > After this patch:
> > > > >
> > > > > 3: enp0s3:  mtu 1500 qdisc pfifo_fast state DOWN 
> > > > > mode DEFAULT group default qlen 1000
> > > > > link/ether 00:00:05:00:00:09 brd ff:ff:ff:ff:ff:ff
> > > > > ...
> > > > > 5: macvlan0@enp0s3:  mtu 
> > > > > 1500 qdisc noqueue state LOWERLAYERDOWN mode DEFAULT group default 
> > > > > qlen 1000
> > > > > link/ether b2:a9:c5:04:da:53 brd ff:ff:ff:ff:ff:ff
> > > > >
> > > > > Cc: Venkat Venkatsubra 
> > > > > Cc: Gia-Khanh Nguyen 
> > > > > Reviewed-by: Xuan Zhuo 
> > > > > Acked-by: Michael S. Tsirkin 
> > > > > Signed-off-by: Jason Wang 
> > > > > ---
> > > > > Changes since V1:
> > > > > - rebase
> > > > > - add ack/review tags
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > > ---
> > > > >  drivers/net/virtio_net.c | 94 
> > > > > +++-
> > > > >  1 file changed, 63 insertions(+), 31 deletions(-)
> > > > >
> > > > > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> > > > > index 4a802c0ea2cb..69e4ae353c51 100644
> > > > > --- a/drivers/net/virtio_net.c
> > > > > +++ b/drivers/net/virtio_net.c
> > > > > @@ -433,6 +433,12 @@ struct virtnet_info {
> > > > >   /* The lock to synchronize the access to refill_enabled */
> > > > >   spinlock_t refill_lock;
> > > > >
> > > > > + /* Is config change enabled? */
> > > > > + bool config_change_enabled;
> > > > > +
> > > > > + /* The lock to synchronize the access to config_change_enabled 
> > > > > */
> > > > > + spinlock_t config_change_lock;
> > > > > +
> > > > >   /* Work struct for config space updates */
> > > > >   struct work_struct config_work;
> > > > >
> > > >
> > > >
> > > > But we already have dev->config_lock and dev->config_enabled.
> > > >
> > > > And it actually works better - instead of discarding config
> > > > change events it defers them until enabled.
> > > >
> > >
> > > Yes but then both virtio-net driver and virtio core can ask to enable
> > > and disable and then we need some kind of synchronization which is
> > > non-trivial.
> >
> > Well for core it happens on bring up path before driver works
> > and later on tear down after it is gone.
> > So I do not think they ever do it at the same time.
>
> For example, there could be a suspend/resume when the admin state is down.
>
> >
> >
> > > And device enabling on the core is different from bringing the device
> > > up in the networking subsystem. Here we just delay to deal with the
> > > config change interrupt on ndo_open(). (E.g try to ack announce is
> > > meaningless when the device is down).
> > >
> > > Thanks
> >
> > another thing is that it is better not to re-read all config
> > on link up if there was no config interrupt - less vm exits.
>
> Yes, but it should not matter much as it's done in the ndo_open().

Michael, any more comments on this?

Please confirm if this patch is ok or not. If you prefer to reuse the
config_disable() I can change it from a boolean to a counter that
allows to be nested.

Thanks

>
> Thanks
>
> >
> > --
> > MST
> >

Re: [PATCH v3 13/27] function_graph: Add pid tracing back to function graph tracer

2024-06-05 Thread Steven Rostedt

On Mon, 03 Jun 2024 15:07:17 -0400
Steven Rostedt  wrote:

> +++ b/kernel/trace/ftrace.c
> @@ -100,7 +100,7 @@ struct ftrace_ops *function_trace_op __read_mostly = 
> _list_end;
>  /* What to set function_trace_op to */
>  static struct ftrace_ops *set_function_trace_op;
>  
> -static bool ftrace_pids_enabled(struct ftrace_ops *ops)
> +bool ftrace_pids_enabled(struct ftrace_ops *ops)
>  {
>   struct trace_array *tr;
>  
> @@ -402,10 +402,11 @@ static void ftrace_update_pid_func(void)
>   if (op->flags & FTRACE_OPS_FL_PID) {
>   op->func = ftrace_pids_enabled(op) ?
>   ftrace_pid_func : op->saved_func;
> - ftrace_update_trampoline(op);

Bah, this patch accidentally removed the above function and broke pid
tracing. Hmm, not sure why this still passed the tests. Will investigate.

-- Steve

>   }
>   } while_for_each_ftrace_op(op);
>  
> + fgraph_update_pid_func();
> +
>   update_ftrace_function();
>  }
>

Re: [RFC v3 net-next 1/7] net: add rx_sk to trace_kfree_skb

2024-06-05 Thread Steven Rostedt

On Tue, 4 Jun 2024 14:47:38 -0700
Yan Zhai  wrote:

> skb does not include enough information to find out receiving
> sockets/services and netns/containers on packet drops. In theory
> skb->dev tells about netns, but it can get cleared/reused, e.g. by TCP
> stack for OOO packet lookup. Similarly, skb->sk often identifies a local
> sender, and tells nothing about a receiver.
> 
> Allow passing an extra receiving socket to the tracepoint to improve
> the visibility on receiving drops.
> 
> Signed-off-by: Yan Zhai 
> ---
> v2->v3: fixed drop_monitor function prototype
> ---
>  include/trace/events/skb.h | 11 +++
>  net/core/dev.c |  2 +-
>  net/core/drop_monitor.c|  9 ++---
>  net/core/skbuff.c  |  2 +-
>  4 files changed, 15 insertions(+), 9 deletions(-)
> 
> diff --git a/include/trace/events/skb.h b/include/trace/events/skb.h
> index 07e0715628ec..aa6b46b6172c 100644
> --- a/include/trace/events/skb.h
> +++ b/include/trace/events/skb.h
> @@ -24,15 +24,16 @@ DEFINE_DROP_REASON(FN, FN)
>  TRACE_EVENT(kfree_skb,
>  
>   TP_PROTO(struct sk_buff *skb, void *location,
> -  enum skb_drop_reason reason),
> +  enum skb_drop_reason reason, struct sock *rx_sk),
>  
> - TP_ARGS(skb, location, reason),
> + TP_ARGS(skb, location, reason, rx_sk),
>  
>   TP_STRUCT__entry(
>   __field(void *, skbaddr)
>   __field(void *, location)
>   __field(unsigned short, protocol)
>   __field(enum skb_drop_reason,   reason)
> + __field(void *, rx_skaddr)

Please add the pointer after the other pointers:

__field(void *, skbaddr)
__field(void *, location)
+   __field(void *, rx_skaddr)
__field(unsigned short, protocol)
__field(enum skb_drop_reason,   reason)

otherwise you are adding holes in the ring buffer event.

The TP_STRUCT__entry() is a structure that is saved in the ring buffer. We
want to avoid alignment holes. I also question having a short before the
enum, if the emum is 4 bytes. The short should be at the end.

In fact, looking at the format file, there is a 2 byte hole:

 # cat /sys/kernel/tracing/events/skb/kfree_skb/format

name: kfree_skb
ID: 1799
format:
field:unsigned short common_type;   offset:0;   size:2; 
signed:0;
field:unsigned char common_flags;   offset:2;   size:1; 
signed:0;
field:unsigned char common_preempt_count;   offset:3;   size:1; 
signed:0;
field:int common_pid;   offset:4;   size:4; signed:1;

field:void * skbaddr;   offset:8;   size:8; signed:0;
field:void * location;  offset:16;  size:8; signed:0;
field:unsigned short protocol;  offset:24;  size:2; signed:0;
field:enum skb_drop_reason reason;  offset:28;  size:4; 
signed:0;

Notice that "protocol" is 2 bytes in size at offset 24, but "reason" starts
at offset 28. This means at offset 26, there's a 2 byte hole.

-- Steve



>   ),
>  
>   TP_fast_assign(
> @@ -40,12 +41,14 @@ TRACE_EVENT(kfree_skb,
>   __entry->location = location;
>   __entry->protocol = ntohs(skb->protocol);
>   __entry->reason = reason;
> + __entry->rx_skaddr = rx_sk;
>   ),
>

Re: [PATCH v2 0/5] ftrace: Clean up and comment code

2024-06-05 Thread Google

On Wed, 05 Jun 2024 14:03:34 -0400
Steven Rostedt  wrote:

> While working on the function_graph multiple users code, I realized
> that I was struggling with how the ftrace code worked. Being the
> author of such code meant that it wasn't very intuitive. Namely, the
> function names were not descriptive enough, or at least, they needed
> comments.
> 
> This series moves to solve some of that via changing a couple function
> names and parameters and adding comments to many of them.
> 

This series looks good to me.

Acked-by: Masami Hiramatsu (Google) 

for this series.

Thanks!

> There's more to do, but this at least moves it in the right direction.
> 
> Changes since v1: 
> https://lore.kernel.org/all/20240604212817.384103...@goodmis.org/
> 
> - While working on v1 and responding to a comment from Mark Rutland about
>   the usage of "ftrace_hash" in the __ftrace_hash_rec_update() code,
>   I realized that the function does pretty much the same thing if
>   it is set or not set (but slightly differently). It turns out that
>   it isn't needed and that parameter can be removed, making the code
>   simpler.
> 
> - Fixed some wording and typos suggested by Mark Rutland.
> 
> Steven Rostedt (Google) (5):
>   ftrace: Rename dup_hash() and comment it
>   ftrace: Remove "ftrace_hash" parameter from __ftrace_hash_rec_update()
>   ftrace: Add comments to ftrace_hash_rec_disable/enable()
>   ftrace: Convert "inc" parameter to bool in 
> ftrace_hash_rec_update_modify()
>   ftrace: Add comments to ftrace_hash_move() and friends
> 
> 
>  kernel/trace/ftrace.c | 161 
> +-
>  1 file changed, 94 insertions(+), 67 deletions(-)


-- 
Masami Hiramatsu (Google)

Re: [PATCH v2 1/5] ftrace: Rename dup_hash() and comment it

2024-06-05 Thread Google

On Wed, 05 Jun 2024 14:03:35 -0400
Steven Rostedt  wrote:

> From: "Steven Rostedt (Google)" 
> 
> The name "dup_hash()" is a misnomer as it does not duplicate the hash that
> is passed in, but instead moves its entities from that hash to a newly
> allocated one. Rename it to "__move_hash()" (using starting underscores as
> it is an internal function), and add some comments about what it does.
> 

Looks good to me.

Acked-by: Masami Hiramatsu (Google) 

> Signed-off-by: Steven Rostedt (Google) 
> ---
>  kernel/trace/ftrace.c | 8 ++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
> index da7e6abf48b4..9dcdefe9d1aa 100644
> --- a/kernel/trace/ftrace.c
> +++ b/kernel/trace/ftrace.c
> @@ -1391,7 +1391,11 @@ ftrace_hash_rec_enable_modify(struct ftrace_ops *ops, 
> int filter_hash);
>  static int ftrace_hash_ipmodify_update(struct ftrace_ops *ops,
>  struct ftrace_hash *new_hash);
>  
> -static struct ftrace_hash *dup_hash(struct ftrace_hash *src, int size)
> +/*
> + * Allocate a new hash and remove entries from @src and move them to the new 
> hash.
> + * On success, the @src hash will be empty and should be freed.
> + */
> +static struct ftrace_hash *__move_hash(struct ftrace_hash *src, int size)
>  {
>   struct ftrace_func_entry *entry;
>   struct ftrace_hash *new_hash;
> @@ -1438,7 +1442,7 @@ __ftrace_hash_move(struct ftrace_hash *src)
>   if (ftrace_hash_empty(src))
>   return EMPTY_HASH;
>  
> - return dup_hash(src, size);
> + return __move_hash(src, size);
>  }
>  
>  static int
> -- 
> 2.43.0
> 
> 


-- 
Masami Hiramatsu (Google)

Re: [PATCH v8 1/5] soc: qcom: pdr: protect locator_addr with the main mutex

2024-06-05 Thread Chris Lew


Hi Dmitry,

On 5/11/2024 2:56 PM, Dmitry Baryshkov wrote:
...

@@ -76,12 +76,12 @@ static int pdr_locator_new_server(struct qmi_handle *qmi,
  locator_hdl);
struct pdr_service *pds;
  
+	mutex_lock(>lock);

/* Create a local client port for QMI communication */
pdr->locator_addr.sq_family = AF_QIPCRTR;
pdr->locator_addr.sq_node = svc->node;
pdr->locator_addr.sq_port = svc->port;
  
-	mutex_lock(>lock);

pdr->locator_init_complete = true;
mutex_unlock(>lock);
  
@@ -104,10 +104,10 @@ static void pdr_locator_del_server(struct qmi_handle *qmi,
  
  	mutex_lock(>lock);

pdr->locator_init_complete = false;
-   mutex_unlock(>lock);
  
  	pdr->locator_addr.sq_node = 0;

pdr->locator_addr.sq_port = 0;
+   mutex_unlock(>lock);
  }
  
  static const struct qmi_ops pdr_locator_ops = {

@@ -365,6 +365,7 @@ static int pdr_get_domain_list(struct 
servreg_get_domain_list_req *req,
if (ret < 0)
return ret;
  
+	mutex_lock(>lock);

ret = qmi_send_request(>locator_hdl,
   >locator_addr,
   , SERVREG_GET_DOMAIN_LIST_REQ,
@@ -373,15 +374,16 @@ static int pdr_get_domain_list(struct 
servreg_get_domain_list_req *req,
   req);
if (ret < 0) {
qmi_txn_cancel();
-   return ret;
+   goto err_unlock;
}
  
  	ret = qmi_txn_wait(, 5 * HZ);

if (ret < 0) {
pr_err("PDR: %s get domain list txn wait failed: %d\n",
   req->service_name, ret);
-   return ret;
+   goto err_unlock;
}
+   mutex_unlock(>lock);


I'm not sure it is necessary to hold the the mutex during the 
qmi_txn_wait() since the only variable we are trying to protect is 
locator_addr.


Wouldn't this delay other work like new/del server notifications if this 
qmi service is delayed or non-responsive?


Thanks,
Chris

  
  	if (resp->resp.result != QMI_RESULT_SUCCESS_V01) {

pr_err("PDR: %s get domain list failed: 0x%x\n",
@@ -390,6 +392,11 @@ static int pdr_get_domain_list(struct 
servreg_get_domain_list_req *req,
}
  
  	return 0;

+
+err_unlock:
+   mutex_unlock(>lock);
+
+   return ret;
  }

Re: [PATCH v14 14/14] selftests/sgx: Add scripts for EPC cgroup testing

2024-06-05 Thread Huang, Kai





Reorg:

void sgx_cgroup_init(void)
{
struct workqueue_struct *wq;

/* eagerly allocate the workqueue: */
wq = alloc_workqueue("sgx_cg_wq", wq_unbound | wq_freezable, 
wq_unbound_max_active);

if (!wq) {
    pr_warn("sgx_cg_wq creation failed\n");
    return;


sgx_cgroup_try_charge() expects sgx_cg_wq, so it would break unless we 
check and return 0 which was the initially implemented in v12. But then 
Kai had some concern on that we expose all the interface files to allow 
user to set limits but we don't enforce. To keep it simple we settled 
down back to BUG_ON(). 


[...]

This would only happen rarely and user can add 
command-line to disable SGX if s/he really wants to start kernel in this 
case, just can't do SGX.


Just to be clear that I don't like BUG_ON() either.  It's inevitable you 
will get attention because of using it.


This is a compromise that you don't want to reset "capacity" to 0 when 
alloc_workqueue() fails.


There are existing code where BUG_ON() is used during the kernel early 
boot code when memory allocation fails (e.g., see cgroup_init_subsys()), 
so it might be acceptable to use BUG_ON() here, but it's up to 
maintainers to decide whether it is OK.


[...]


With "--strict" flag I also catched these:

CHECK: spinlock_t definition without comment
#1308: FILE: arch/x86/kernel/cpu/sgx/sgx.h:122:
+    spinlock_t lock;

Yes I had a comment but Kai thought it was too obvious and I can't think 
of a better one that's not obvious so I removed:


https://lore.kernel.org/linux-sgx/9ffb02a3344807f2c173fe8c7cb000cd6c7843b6.ca...@intel.com/

To be clear, my reply was really about the comment itself isn't useful, 
but didn't say we shouldn't use a comment here.





CHECK: multiple assignments should be avoided
#444: FILE: kernel/cgroup/misc.c:450:
+    parent_cg = cg = _cg;



This was also suggested by Kai a few versions back:
https://lore.kernel.org/linux-sgx/8f08f0b0f2b04b90d7cdb7b628f16f9080687c43.ca...@intel.com/



I didn't know checkpatch complains this.  Feel free to revert back as it 
is trivial to me.

Re: [PATCH 3/5] ftrace: Remove "filter_hash" parameter from ftrace_hash_rec_disable/enable()

2024-06-05 Thread Steven Rostedt

On Thu, 6 Jun 2024 06:50:18 +0900
Masami Hiramatsu (Google)  wrote:

> On Tue, 04 Jun 2024 17:28:20 -0400
> Steven Rostedt  wrote:
> 
> > From: "Steven Rostedt (Google)" 
> > 
> > The functions ftrace_hash_rec_disable() and ftrace_hash_rec_enable()
> > always has 1 passed to its "ftrace_hash" parameter. Remove the parameter
> > and pass in true to __ftrace_hash_rec_update().
> > 
> > Also add some comments to both those functions explaining what they do.
> >   
> 
> Looks good to me.
> 
> Reviewed-by: Masami Hiramatsu (Google) 

Thanks Masami, but I sent out a v2. Could you review those patches
instead?

https://lore.kernel.org/all/20240605180334.090848...@goodmis.org/

-- Steve

Re: [PATCH 0/6] ftrace: Minor fixes for sparse and kernel test robot

2024-06-05 Thread Google

On Wed, 05 Jun 2024 16:26:44 -0400
Steven Rostedt  wrote:

> 
> Recieved some minor bug reports from the kernel test robot. First I started
> cleaning up some of the sparse warnings. There's many more, but most changes
> are not really helping anything, but just quieting the warnings.
> 
> But the reports from kernel test robot need to be fixed.

All looks good to me.

Acked-by: Masami Hiramatsu (Google) 

Thank you!

> 
> Steven Rostedt (Google) (6):
>   ftrace: Declare function_trace_op in header to quiet sparse warning
>   ftrace: Assign ftrace_list_end to ftrace_ops_list type cast to RCU
>   ftrace: Assign RCU list variable with rcu_assign_ptr()
>   ftrace: Fix prototypes for ftrace_startup/shutdown_subops()
>   function_graph: Make fgraph_do_direct static key static
>   function_graph: Do not update pid func if CONFIG_DYNAMIC_FTRACE not 
> enabled
> 
> 
>  include/linux/ftrace.h | 3 +++
>  kernel/trace/fgraph.c  | 4 +++-
>  kernel/trace/ftrace.c  | 4 ++--
>  kernel/trace/ftrace_internal.h | 9 +
>  kernel/trace/trace.h   | 1 -
>  5 files changed, 17 insertions(+), 4 deletions(-)


-- 
Masami Hiramatsu (Google)

Re: [PATCH 3/5] ftrace: Remove "filter_hash" parameter from ftrace_hash_rec_disable/enable()

2024-06-05 Thread Google

On Tue, 04 Jun 2024 17:28:20 -0400
Steven Rostedt  wrote:

> From: "Steven Rostedt (Google)" 
> 
> The functions ftrace_hash_rec_disable() and ftrace_hash_rec_enable()
> always has 1 passed to its "ftrace_hash" parameter. Remove the parameter
> and pass in true to __ftrace_hash_rec_update().
> 
> Also add some comments to both those functions explaining what they do.
> 

Looks good to me.

Reviewed-by: Masami Hiramatsu (Google) 

Thank you,

> Signed-off-by: Steven Rostedt (Google) 
> ---
>  kernel/trace/ftrace.c | 24 
>  1 file changed, 16 insertions(+), 8 deletions(-)
> 
> diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
> index 93c7c5fd4249..de652201c86c 100644
> --- a/kernel/trace/ftrace.c
> +++ b/kernel/trace/ftrace.c
> @@ -1895,16 +1895,24 @@ static bool __ftrace_hash_rec_update(struct 
> ftrace_ops *ops,
>   return update;
>  }
>  
> -static bool ftrace_hash_rec_disable(struct ftrace_ops *ops,
> - int filter_hash)
> +/*
> + * This is called when an ops is removed from tracing. It will decrement
> + * the counters of the dyn_ftrace records for all the functions that
> + * the @ops attached to.
> + */
> +static bool ftrace_hash_rec_disable(struct ftrace_ops *ops)
>  {
> - return __ftrace_hash_rec_update(ops, filter_hash, 0);
> + return __ftrace_hash_rec_update(ops, true, 0);
>  }
>  
> -static bool ftrace_hash_rec_enable(struct ftrace_ops *ops,
> -int filter_hash)
> +/*
> + * This is called when an ops is added to tracing. It will increment
> + * the counters of the dyn_ftrace records for all the functions that
> + * the @ops attached to.
> + */
> +static bool ftrace_hash_rec_enable(struct ftrace_ops *ops)
>  {
> - return __ftrace_hash_rec_update(ops, filter_hash, 1);
> + return __ftrace_hash_rec_update(ops, true, 1);
>  }
>  
>  static void ftrace_hash_rec_update_modify(struct ftrace_ops *ops,
> @@ -3062,7 +3070,7 @@ int ftrace_startup(struct ftrace_ops *ops, int command)
>   return ret;
>   }
>  
> - if (ftrace_hash_rec_enable(ops, 1))
> + if (ftrace_hash_rec_enable(ops))
>   command |= FTRACE_UPDATE_CALLS;
>  
>   ftrace_startup_enable(command);
> @@ -3104,7 +3112,7 @@ int ftrace_shutdown(struct ftrace_ops *ops, int command)
>   /* Disabling ipmodify never fails */
>   ftrace_hash_ipmodify_disable(ops);
>  
> - if (ftrace_hash_rec_disable(ops, 1))
> + if (ftrace_hash_rec_disable(ops))
>   command |= FTRACE_UPDATE_CALLS;
>  
>   ops->flags &= ~FTRACE_OPS_FL_ENABLED;
> -- 
> 2.43.0
> 
> 


-- 
Masami Hiramatsu (Google)

Re: [PATCH 1/5] ftrace: Rename dup_hash() and comment it

2024-06-05 Thread Google

On Tue, 04 Jun 2024 17:28:18 -0400
Steven Rostedt  wrote:

> From: "Steven Rostedt (Google)" 
> 
> The name "dup_hash()" is a misnomer as it does not duplicate the hash that
> is passed in, but instead moves its entities from that hash to a newly
> allocated one. Rename it to "__move_hash()" (using starting underscores as
> it is an internal function), and add some comments about what it does.

Good change.

Reviewed-by: Masami Hiramatsu (Google) 

Thank you,

> 
> Signed-off-by: Steven Rostedt (Google) 
> ---
>  kernel/trace/ftrace.c | 8 ++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
> index da7e6abf48b4..9dcdefe9d1aa 100644
> --- a/kernel/trace/ftrace.c
> +++ b/kernel/trace/ftrace.c
> @@ -1391,7 +1391,11 @@ ftrace_hash_rec_enable_modify(struct ftrace_ops *ops, 
> int filter_hash);
>  static int ftrace_hash_ipmodify_update(struct ftrace_ops *ops,
>  struct ftrace_hash *new_hash);
>  
> -static struct ftrace_hash *dup_hash(struct ftrace_hash *src, int size)
> +/*
> + * Allocate a new hash and remove entries from @src and move them to the new 
> hash.
> + * On success, the @src hash will be empty and should be freed.
> + */
> +static struct ftrace_hash *__move_hash(struct ftrace_hash *src, int size)
>  {
>   struct ftrace_func_entry *entry;
>   struct ftrace_hash *new_hash;
> @@ -1438,7 +1442,7 @@ __ftrace_hash_move(struct ftrace_hash *src)
>   if (ftrace_hash_empty(src))
>   return EMPTY_HASH;
>  
> - return dup_hash(src, size);
> + return __move_hash(src, size);
>  }
>  
>  static int
> -- 
> 2.43.0
> 
> 


-- 
Masami Hiramatsu (Google)

Re: [RFC bpf-next 01/10] uprobe: Add session callbacks to uprobe_consumer

2024-06-05 Thread Oleg Nesterov

On 06/05, Andrii Nakryiko wrote:
>
> WDYT? It's still fast, and it's simpler than the shadow stack idea, IMO.

Andrii. I am alredy sleeping, I'll try to read your email tomorrow.
Right now I can only say that everything is simpler than the shadow stack ;)

> P.S. Regardless, maybe we should change the order in which we insert
> consumers to uprobe? Right now uprobe consumer added later will be
> executed first, which, while not wrong, is counter-intuitive.

Agreed...

Even if currently this doesn't really matter, I guess it is supposed
that uc->handler() is "non-intrusive".

Oleg.

Re: [RFC bpf-next 01/10] uprobe: Add session callbacks to uprobe_consumer

2024-06-05 Thread Jiri Olsa

On Wed, Jun 05, 2024 at 01:47:00PM -0700, Andrii Nakryiko wrote:
> On Wed, Jun 5, 2024 at 10:57 AM Oleg Nesterov  wrote:
> >
> > On 06/05, Andrii Nakryiko wrote:
> > >
> > > so any such
> > > limitations will cause problems, issue reports, investigation, etc.
> >
> > Agreed...
> >
> > > As one possible solution, what if we do
> > >
> > > struct return_instance {
> > > ...
> > > u64 session_cookies[];
> > > };
> > >
> > > and allocate sizeof(struct return_instance) + 8 *
> > >  and then at runtime pass
> > > _cookies[i] as data pointer to session-aware callbacks?
> >
> > I too thought about this, but I guess it is not that simple.
> >
> > Just for example. Suppose we have 2 session-consumers C1 and C2.
> > What if uprobe_unregister(C1) comes before the probed function
> > returns?
> >
> > We need something like map_cookie_to_consumer().
> 
> Fair enough. The easy way to solve this is to have
> 
> 
> struct uprobe_session_cookie {
> int consumer_id;
> u64 cookie;
> };
> 
> And add id to each new consumer when it is added to struct uprobe.
> Unfortunately, it's impossible to tell when a new consumer was added
> to the list (as a front item, but maybe we just change it to be
> appended instead of prepending) vs when the old consumer was removed,
> so in some cases we'd need to do a linear search.

also we probably need to add the flag if we want to execute the return
handler..  we can have multiple session handlers and if just one of them
returns 0 we need to install the return probe

and then when return probe hits, we need to execute only that consumer's
return handler

jirka

> 
> But the good news is that in the common case we wouldn't need to
> search and the next item in session_cookies[] array would be the one
> we need.
> 
> WDYT? It's still fast, and it's simpler than the shadow stack idea, IMO.
> 
> P.S. Regardless, maybe we should change the order in which we insert
> consumers to uprobe? Right now uprobe consumer added later will be
> executed first, which, while not wrong, is counter-intuitive. And also
> it breaks a nice natural order when we need to match it up with stuff
> like session_cookies[] as described above.
> 
> >
> > > > +   /* The handler_session callback return value controls execution 
> > > > of
> > > > +* the return uprobe and ret_handler_session callback.
> > > > +*  0 on success
> > > > +*  1 on failure, DO NOT install/execute the return uprobe
> > > > +*console warning for anything else
> > > > +*/
> > > > +   int (*handler_session)(struct uprobe_consumer *self, struct 
> > > > pt_regs *regs,
> > > > +  unsigned long *data);
> > > > +   int (*ret_handler_session)(struct uprobe_consumer *self, 
> > > > unsigned long func,
> > > > +  struct pt_regs *regs, unsigned long 
> > > > *data);
> > > > +
> > >
> > > We should try to avoid an alternative set of callbacks, IMO. Let's
> > > extend existing ones with `unsigned long *data`,
> >
> > Oh yes, agreed.
> >
> > And the comment about the return value looks confusing too. I mean, the
> > logic doesn't differ from the ret-code from ->handler().
> >
> > "DO NOT install/execute the return uprobe" is not true if another
> > non-session-consumer returns 0.
> >
> > Oleg.
> >

Re: [RFC bpf-next 01/10] uprobe: Add session callbacks to uprobe_consumer

2024-06-05 Thread Oleg Nesterov

On 06/05, Jiri Olsa wrote:
>
> > And the comment about the return value looks confusing too. I mean, the
> > logic doesn't differ from the ret-code from ->handler().
> >
> > "DO NOT install/execute the return uprobe" is not true if another
> > non-session-consumer returns 0.
>
> well they are meant to be exclusive, so there'd be no other 
> non-session-consumer

OK. (but may be the changelog can explain more clearly why they can't
co-exist with the non-session-consumers).

But again, this doesn't differ from the the ret-code from the
non-session-consumer->handler().

If it returns 1 == UPROBE_HANDLER_REMOVE, then without other consumers
prepare_uretprobe() won't be called.

Oleg.

Re: [RFC bpf-next 01/10] uprobe: Add session callbacks to uprobe_consumer

2024-06-05 Thread Jiri Olsa

On Wed, Jun 05, 2024 at 10:25:56AM -0700, Andrii Nakryiko wrote:

SNIP

> > ---
> >  include/linux/uprobes.h | 18 +++
> >  kernel/events/uprobes.c | 69 +++--
> >  2 files changed, 78 insertions(+), 9 deletions(-)
> >
> > diff --git a/include/linux/uprobes.h b/include/linux/uprobes.h
> > index f46e0ca0169c..a2f2d5ac3cee 100644
> > --- a/include/linux/uprobes.h
> > +++ b/include/linux/uprobes.h
> > @@ -34,6 +34,12 @@ enum uprobe_filter_ctx {
> >  };
> >
> >  struct uprobe_consumer {
> > +   /*
> > +* The handler callback return value controls removal of the uprobe.
> > +*  0 on success, uprobe stays
> > +*  1 on failure, remove the uprobe
> > +*console warning for anything else
> > +*/
> > int (*handler)(struct uprobe_consumer *self, struct pt_regs *regs);
> > int (*ret_handler)(struct uprobe_consumer *self,
> > unsigned long func,
> > @@ -42,6 +48,17 @@ struct uprobe_consumer {
> > enum uprobe_filter_ctx ctx,
> > struct mm_struct *mm);
> >
> > +   /* The handler_session callback return value controls execution of
> > +* the return uprobe and ret_handler_session callback.
> > +*  0 on success
> > +*  1 on failure, DO NOT install/execute the return uprobe
> > +*console warning for anything else
> > +*/
> > +   int (*handler_session)(struct uprobe_consumer *self, struct pt_regs 
> > *regs,
> > +  unsigned long *data);
> > +   int (*ret_handler_session)(struct uprobe_consumer *self, unsigned 
> > long func,
> > +  struct pt_regs *regs, unsigned long 
> > *data);
> > +
> 
> We should try to avoid an alternative set of callbacks, IMO. Let's
> extend existing ones with `unsigned long *data`, but specify that
> unless consumer sets some flag on registration that it needs a session
> cookie, we'll pass NULL here? Or just allocate cookie data for each
> registered consumer for simplicity, don't know; given we don't expect
> many consumers on exactly the same uprobe, it might be ok to keep it
> simple.
>

ah, I did not want to break existing users.. but it's not uapi,
so we're good, ok makes sense

jirka
 
> 
> > struct uprobe_consumer *next;
> >  };
> >
> > @@ -85,6 +102,7 @@ struct return_instance {
> > unsigned long   func;
> > unsigned long   stack;  /* stack pointer */
> > unsigned long   orig_ret_vaddr; /* original return address 
> > */
> > +   unsigned long   data;
> > boolchained;/* true, if instance is 
> > nested */
> >
> > struct return_instance  *next;  /* keep as stack */

SNIP

Re: [RFC bpf-next 01/10] uprobe: Add session callbacks to uprobe_consumer

2024-06-05 Thread Jiri Olsa

On Wed, Jun 05, 2024 at 07:56:19PM +0200, Oleg Nesterov wrote:
> On 06/05, Andrii Nakryiko wrote:
> >
> > so any such
> > limitations will cause problems, issue reports, investigation, etc.
> 
> Agreed...
> 
> > As one possible solution, what if we do
> >
> > struct return_instance {
> > ...
> > u64 session_cookies[];
> > };
> >
> > and allocate sizeof(struct return_instance) + 8 *
> >  and then at runtime pass
> > _cookies[i] as data pointer to session-aware callbacks?
> 
> I too thought about this, but I guess it is not that simple.
> 
> Just for example. Suppose we have 2 session-consumers C1 and C2.
> What if uprobe_unregister(C1) comes before the probed function
> returns?
> 
> We need something like map_cookie_to_consumer().

I guess we could have hash table in return_instance that gets 'consumer -> 
cookie' ?

return instance is freed after the consumers' return handlers are executed,
so there's no leak if some consumer gets unregistered before that

> 
> > > +   /* The handler_session callback return value controls execution of
> > > +* the return uprobe and ret_handler_session callback.
> > > +*  0 on success
> > > +*  1 on failure, DO NOT install/execute the return uprobe
> > > +*console warning for anything else
> > > +*/
> > > +   int (*handler_session)(struct uprobe_consumer *self, struct 
> > > pt_regs *regs,
> > > +  unsigned long *data);
> > > +   int (*ret_handler_session)(struct uprobe_consumer *self, unsigned 
> > > long func,
> > > +  struct pt_regs *regs, unsigned long 
> > > *data);
> > > +
> >
> > We should try to avoid an alternative set of callbacks, IMO. Let's
> > extend existing ones with `unsigned long *data`,
> 
> Oh yes, agreed.
> 
> And the comment about the return value looks confusing too. I mean, the
> logic doesn't differ from the ret-code from ->handler().
> 
> "DO NOT install/execute the return uprobe" is not true if another
> non-session-consumer returns 0.

well they are meant to be exclusive, so there'd be no other non-session-consumer

jirka

Re: [RFC bpf-next 01/10] uprobe: Add session callbacks to uprobe_consumer

2024-06-05 Thread Andrii Nakryiko

On Wed, Jun 5, 2024 at 10:57 AM Oleg Nesterov  wrote:
>
> On 06/05, Andrii Nakryiko wrote:
> >
> > so any such
> > limitations will cause problems, issue reports, investigation, etc.
>
> Agreed...
>
> > As one possible solution, what if we do
> >
> > struct return_instance {
> > ...
> > u64 session_cookies[];
> > };
> >
> > and allocate sizeof(struct return_instance) + 8 *
> >  and then at runtime pass
> > _cookies[i] as data pointer to session-aware callbacks?
>
> I too thought about this, but I guess it is not that simple.
>
> Just for example. Suppose we have 2 session-consumers C1 and C2.
> What if uprobe_unregister(C1) comes before the probed function
> returns?
>
> We need something like map_cookie_to_consumer().

Fair enough. The easy way to solve this is to have

struct uprobe_session_cookie {
int consumer_id;
u64 cookie;
};

And add id to each new consumer when it is added to struct uprobe.
Unfortunately, it's impossible to tell when a new consumer was added
to the list (as a front item, but maybe we just change it to be
appended instead of prepending) vs when the old consumer was removed,
so in some cases we'd need to do a linear search.

But the good news is that in the common case we wouldn't need to
search and the next item in session_cookies[] array would be the one
we need.

WDYT? It's still fast, and it's simpler than the shadow stack idea, IMO.

P.S. Regardless, maybe we should change the order in which we insert
consumers to uprobe? Right now uprobe consumer added later will be
executed first, which, while not wrong, is counter-intuitive. And also
it breaks a nice natural order when we need to match it up with stuff
like session_cookies[] as described above.

>
> > > +   /* The handler_session callback return value controls execution of
> > > +* the return uprobe and ret_handler_session callback.
> > > +*  0 on success
> > > +*  1 on failure, DO NOT install/execute the return uprobe
> > > +*console warning for anything else
> > > +*/
> > > +   int (*handler_session)(struct uprobe_consumer *self, struct 
> > > pt_regs *regs,
> > > +  unsigned long *data);
> > > +   int (*ret_handler_session)(struct uprobe_consumer *self, unsigned 
> > > long func,
> > > +  struct pt_regs *regs, unsigned long 
> > > *data);
> > > +
> >
> > We should try to avoid an alternative set of callbacks, IMO. Let's
> > extend existing ones with `unsigned long *data`,
>
> Oh yes, agreed.
>
> And the comment about the return value looks confusing too. I mean, the
> logic doesn't differ from the ret-code from ->handler().
>
> "DO NOT install/execute the return uprobe" is not true if another
> non-session-consumer returns 0.
>
> Oleg.
>

Re: [RFC bpf-next 01/10] uprobe: Add session callbacks to uprobe_consumer

2024-06-05 Thread Jiri Olsa

On Wed, Jun 05, 2024 at 06:36:25PM +0200, Oleg Nesterov wrote:
> On 06/05, Oleg Nesterov wrote:
> >
> > On 06/05, Oleg Nesterov wrote:
> > >
> > > > +/*
> > > > + * Make sure all the uprobe consumers have only one type of entry
> > > > + * callback registered (either handler or handler_session) due to
> > > > + * different return value actions.
> > > > + */
> > > > +static int consumer_check(struct uprobe_consumer *curr, struct 
> > > > uprobe_consumer *uc)
> > > > +{
> > > > +   if (!curr)
> > > > +   return 0;
> > > > +   if (curr->handler_session || uc->handler_session)
> > > > +   return -EBUSY;
> > > > +   return 0;
> > > > +}
> > >
> > > Hmm, I don't understand this code, it doesn't match the comment...
> > >
> > > The comment says "all the uprobe consumers have only one type" but
> > > consumer_check() will always fail if the the 1st or 2nd consumer has
> > > ->handler_session != NULL ?
> > >
> > > Perhaps you meant
> > >
> > >   if (!!curr->handler != !!uc->handler)
> > >   return -EBUSY;
> > >
> > > ?
> >
> > OK, the changelog says
> >
> > Which means that there can be only single user of a uprobe (inode +
> > offset) when session consumer is registered to it.
> >
> > so the code is correct. But I still think the comment is misleading.
> 
> Cough... perhaps it is correct but I am still confused even we forget about
> the comment ;)
> 
> OK, uprobe can have a single consumer with ->handler_session != NULL. I guess
> this is because return_instance->data is "global".
> 
> So uprobe can have multiple handler_session == NULL consumers before
> handler_session != NULL, but not after ?

ah yea it should have done what's in the comment, so it's missing
the check for handler.. session handlers are meant to be exclusive

thanks,
jirka

Re: [PATCH 3/5] ftrace: Remove "filter_hash" parameter from ftrace_hash_rec_disable/enable()

2024-06-05 Thread Steven Rostedt

On Wed, 5 Jun 2024 11:17:31 +0100
Mark Rutland  wrote:

> On Tue, Jun 04, 2024 at 05:28:20PM -0400, Steven Rostedt wrote:
> > From: "Steven Rostedt (Google)" 
> > 
> > The functions ftrace_hash_rec_disable() and ftrace_hash_rec_enable()
> > always has 1 passed to its "ftrace_hash" parameter. Remove the parameter
> > and pass in true to __ftrace_hash_rec_update().
> > 
> > Also add some comments to both those functions explaining what they do.
> > 
> > Signed-off-by: Steven Rostedt (Google)   
> 
> Looks good to me.
> 
> Acked-by: Mark Rutland 

I removed your Ack from v2 as it changed enough that I believe it
requires a new Ack.

-- Steve

Re: [RFC bpf-next 01/10] uprobe: Add session callbacks to uprobe_consumer

2024-06-05 Thread Oleg Nesterov

On 06/05, Andrii Nakryiko wrote:
>
> so any such
> limitations will cause problems, issue reports, investigation, etc.

Agreed...

> As one possible solution, what if we do
>
> struct return_instance {
> ...
> u64 session_cookies[];
> };
>
> and allocate sizeof(struct return_instance) + 8 *
>  and then at runtime pass
> _cookies[i] as data pointer to session-aware callbacks?

I too thought about this, but I guess it is not that simple.

Just for example. Suppose we have 2 session-consumers C1 and C2.
What if uprobe_unregister(C1) comes before the probed function
returns?

We need something like map_cookie_to_consumer().

> > +   /* The handler_session callback return value controls execution of
> > +* the return uprobe and ret_handler_session callback.
> > +*  0 on success
> > +*  1 on failure, DO NOT install/execute the return uprobe
> > +*console warning for anything else
> > +*/
> > +   int (*handler_session)(struct uprobe_consumer *self, struct pt_regs 
> > *regs,
> > +  unsigned long *data);
> > +   int (*ret_handler_session)(struct uprobe_consumer *self, unsigned 
> > long func,
> > +  struct pt_regs *regs, unsigned long 
> > *data);
> > +
>
> We should try to avoid an alternative set of callbacks, IMO. Let's
> extend existing ones with `unsigned long *data`,

Oh yes, agreed.

And the comment about the return value looks confusing too. I mean, the
logic doesn't differ from the ret-code from ->handler().

"DO NOT install/execute the return uprobe" is not true if another
non-session-consumer returns 0.

Oleg.

Re: [PATCH v4] drivers: remoteproc: xlnx: add attach detach support

2024-06-05 Thread Tanmay Shah




On 6/4/24 3:23 PM, Bjorn Andersson wrote:
> On Mon, Jun 03, 2024 at 01:34:38PM -0700, Tanmay Shah wrote:
>> It is possible that remote processor is already running before
>> linux boot or remoteproc platform driver probe. Implement required
>> remoteproc framework ops to provide resource table address and
>> connect or disconnect with remote processor in such case.
>> 
> 
> I know if changes the current style for this driver, but could we drop
> "drivers: " from the subject prefix, to align changes to this driver
> with others?
> 

Ack. Will be fixed. Is it convention not to have "drivers" ?
If so, from now on, I will format subject prefix: /:

> Regards,
> Bjorn

1 2 3 4 5 6 7 8 9 10 >

1 - 100 of 2410614 matches

Mail list logo