Re: [PATCH] kallsyms, livepatch: Fix livepatch with CONFIG_LTO_CLANG
Hi, On Tue, 4 Jun 2024, Song Liu wrote: > With CONFIG_LTO_CLANG, the compiler may postfix symbols with .llvm. > to avoid symbol duplication. scripts/kallsyms.c sorted the symbols > without these postfixes. The default symbol lookup also removes these > postfixes before comparing symbols. > > On the other hand, livepatch need to look up symbols with the full names. > However, calling kallsyms_on_each_match_symbol with full name (with the > postfix) cannot find the symbol(s). As a result, we cannot livepatch > kernel functions with .llvm. postfix or kernel functions that use > relocation information to symbols with .llvm. postfixes. > > Fix this by calling kallsyms_on_each_match_symbol without the postfix; > and then match the full name (with postfix) in klp_match_callback. > > Signed-off-by: Song Liu > --- > include/linux/kallsyms.h | 13 + > kernel/kallsyms.c| 21 - > kernel/livepatch/core.c | 32 +++- > 3 files changed, 60 insertions(+), 6 deletions(-) I do not like much that something which seems to be kallsyms-internal is leaked out. You need to export cleanup_symbol_name() and there is now a lot of code outside. I would feel much more comfortable if it is all hidden from kallsyms users and kept there. Would it be possible? Moreover, isn't there a similar problem for ftrace, kprobes, ebpf,...? Thank you, Miroslav
Re: [PATCH v2] rpmsg: qcom_smd: Improve error handling for qcom_smd_parse_edge
On Thu, Jun 06, 2024 at 09:01:36PM +0200, Luca Weiss wrote: > When the mailbox driver has not probed yet, the error message "failed to > parse smd edge" is just going to confuse users, so improve the error > prints a bit. > > Cover the last remaining exits from qcom_smd_parse_edge with proper > error prints, especially the one for the mbox_chan deserved > dev_err_probe to handle EPROBE_DEFER nicely. And add one for ipc_regmap > also to be complete. > > With this done, we can remove the outer print completely. > > Signed-off-by: Luca Weiss > --- > Changes in v2: > - Rebase on qcom for-next, drop dts patches which have been applied > - Improve error printing situation (Bjorn) > - Link to v1: > https://lore.kernel.org/r/20240424-apcs-mboxes-v1-0-6556c47cb...@z3ntu.xyz > --- > drivers/rpmsg/qcom_smd.c | 8 > 1 file changed, 4 insertions(+), 4 deletions(-) > Reviewed-by: Dmitry Baryshkov -- With best wishes Dmitry
Re: [PATCH 1/3] rust: add static_call support
On Fri, Jun 7, 2024 at 12:52 PM Peter Zijlstra wrote: > > I'm sorry, but 30+ years of reading ! as NOT (or factorial) isn't going > to go away. So I'm reading your macros do NOT rule. It makes it clear what is macro call or not. They could have gone for UPPERCASE names (for instance), yes. On the other hand, they do not work like C macros and are ~hygienic, so it also makes sense to avoid confusion here. I mean, I am not suggesting to do a CPP-pass to Rust files, but if someone really, really wanted to mix them in a single file, it would be nice to not confuse the two kinds. :) Generally they feel "closer" to the language (given what they do/support) compared to the CPP ones, so it also makes sense they don't "shout" so much compared to UPPERCASE, if that makes sense. > The above exaple fails, because the next token is :ident, whatever the > heck that might be. Also, extra points for line-noise due to lack of > whitespace. $name:ident means "match what Rust would consider an identifier here and call it $name for the purposes of this macro". So, for instance, $x:ident matches: a a2 a_b But it would not match: 2a a-b a _b For the usual reasons why those are not identifiers. https://godbolt.org/z/G7v4j67dc fn f(x: i32) -> i32 { x * 2 } macro_rules! f { ($x:ident) => { $x * 2 } } fn main() { let a = 42; let b = f(a); // Function. let c = f!(a); // Macro. //let c = f!(a2); // Works, but the variable does not exist. //let c = f!(2a); // Error: no rules expected the token `2a`. //let c = f!(a_b); // Works, but the variable does not exist. //let c = f!(a-b); // Error: no rules expected the token `-`. //let c = f!(a_ b); // Error: no rules expected the token `b`. println!("{a} {b} {c}"); } I hope this makes it clearer. > You just need to extend the rust thing to be able to consume C header > files. I agree, because in practice it is quite useful for a language like Rust that consuming C header files is "natively" supported. Though it also has downsides and is a big decision, which is why, like Alice mentioned, some people agree, and some people don't. Nevertheless, we have been doing our best for a long time to get the best we can for the kernel -- just 2 days ago we told the Rust project in one of our meetings that it would be nice to see that particular "project goal" from that document realized (among others). Cheers, Miguel
Re: [PATCH 1/3] rust: add static_call support
On Fri, Jun 07, 2024 at 09:43:29AM +, Alice Ryhl wrote: > Peter Zijlstra wrote: > > On Thu, Jun 06, 2024 at 09:09:00PM +0200, Miguel Ojeda wrote: > > > On Thu, Jun 6, 2024 at 7:19 PM Peter Zijlstra > > > wrote: > > > > > > > > This is absolutely unreadable gibberish -- how am I supposed to keep > > > > this in sync with the rest of the static_call infrastructure? > > > > > > Yeah, they are macros, which look different from "normal" Rust code. > > > > Macros like CPP ? > > Yes, this patch series uses declarative macros, which are the closest > that Rust has to the C preprocessor. They are powerful, but just like > CPP, they can become pretty complicated and hard to read if you are > doing something non-trivial. > > The macro_rules! block is how you define a new declarative macro. I'm sorry, but 30+ years of reading ! as NOT (or factorial) isn't going to go away. So I'm reading your macros do NOT rule. > The ($name:ident($($args:expr),* $(,)?)) part defines the arguments to > the declarative macro. This syntax means: > > 1. The input starts with any identifier, which we call $name. > 2. Then there must be a ( token. The above exaple fails, because the next token is :ident, whatever the heck that might be. Also, extra points for line-noise due to lack of whitespace. > So for example, you might invoke the macro like this: > > static_call!(tp_func_my_tracepoint(__data, my_task_struct)); static_call NOT (blah dog blah); > Inside the macro, you will see things such as: > $crate::macros::paste! { $crate::bindings:: [<__SCK__ $name >]; } > > The Rust syntax for invoking a macro has an exclamation mark after the Like I said before, the creator of Rust must've been an esoteric language freak and must've wanted to make this unreadable on purpose :/ Also, why the white space beteen the :: scope operator and the [< thing? that's just weird. I would then expect the output to be: ...::bindings:: __SCK__my_static_key > name, so you know that $crate::macros::paste is a macro. The `paste` > macro just emits its input unchanged, except that any identifiers > between [< and >] are concatenated into a single identifier. So if $name > is my_static_key, then the above invocation of paste! emits: > > $crate::bindings::__SCK__my_static_key; But it doesn't, so it isn't unmodified, it seems to strip whitespace. > The $crate::bindings module is where the output of bindgen goes, so this > should correspond to the C symbol called __SCK__my_static_key. > > > > Is there something we could do to help here? I think Alice and others > > > would be happy to explain how it works and/or help maintain it in the > > > future if you prefer. > > > > Write a new language that looks more like C -- pretty please ? :-) > > > > Mostly I would just really like you to just use arm/jump_label.h, > > they're inline functions and should be doable with IR, no weirdo CPP > > involved in this case. > > I assume that you're referring to static_key_false here? I don't think > that function can be exposed using IR because it passes the function > argument called key as an "i" argument to an inline assembly block. Any > attempt to compile static_key_false without knowing the value of key at > compile time will surely fail to compile with the > > invalid operand for inline asm constraint 'i' > > error. You can have clang read the header files and compile them into Intermediate-Representation, and have it splice the lot into the Rust crap's IR and voila, compile time. You just need to extend the rust thing to be able to consume C header files.
Re: [PATCH 1/3] rust: add static_call support
Peter Zijlstra wrote: > On Thu, Jun 06, 2024 at 09:09:00PM +0200, Miguel Ojeda wrote: > > On Thu, Jun 6, 2024 at 7:19 PM Peter Zijlstra wrote: > > > > > > This is absolutely unreadable gibberish -- how am I supposed to keep > > > this in sync with the rest of the static_call infrastructure? > > > > Yeah, they are macros, which look different from "normal" Rust code. > > Macros like CPP ? Yes, this patch series uses declarative macros, which are the closest that Rust has to the C preprocessor. They are powerful, but just like CPP, they can become pretty complicated and hard to read if you are doing something non-trivial. The macro_rules! block is how you define a new declarative macro. The ($name:ident($($args:expr),* $(,)?)) part defines the arguments to the declarative macro. This syntax means: 1. The input starts with any identifier, which we call $name. 2. Then there must be a ( token. 3. Then the $(),* part defines a repetition group. The contents inside the parenthesises are what is being repeated. The comma means that repetitions are separated by commas. The asterisk means that the contents may be repeated any number of times, including zero times. (Alternatives are + which means repeated at least once, and ? which means that the contents may be repeated zero or one time.) 4. The contents of the repetition group will be an expression, which we call $args. 5. After the last expression, we have $(,)? which means that you can have zero or one commas after the last expression. Rust usually allows trailing commas in lists, which is why I included it. 6. And finally, you must close the input with a ) token. So for example, you might invoke the macro like this: static_call!(tp_func_my_tracepoint(__data, my_task_struct)); Here, $name will be tp_func_my_tracepoint, and the repetition group is repeated twice, with $args first corresponding to `__data` and then to ` my_task_struct` when the macro is expanded. The $(,)? group is repeated zero times. Inside the macro, you will see things such as: $crate::macros::paste! { $crate::bindings:: [<__SCK__ $name >]; } The Rust syntax for invoking a macro has an exclamation mark after the name, so you know that $crate::macros::paste is a macro. The `paste` macro just emits its input unchanged, except that any identifiers between [< and >] are concatenated into a single identifier. So if $name is my_static_key, then the above invocation of paste! emits: $crate::bindings::__SCK__my_static_key; The $crate::bindings module is where the output of bindgen goes, so this should correspond to the C symbol called __SCK__my_static_key. > > Is there something we could do to help here? I think Alice and others > > would be happy to explain how it works and/or help maintain it in the > > future if you prefer. > > Write a new language that looks more like C -- pretty please ? :-) > > Mostly I would just really like you to just use arm/jump_label.h, > they're inline functions and should be doable with IR, no weirdo CPP > involved in this case. I assume that you're referring to static_key_false here? I don't think that function can be exposed using IR because it passes the function argument called key as an "i" argument to an inline assembly block. Any attempt to compile static_key_false without knowing the value of key at compile time will surely fail to compile with the invalid operand for inline asm constraint 'i' error. Alice
Re: [PATCH] livepatch: introduce klp_func called interface
Hi, On Tue, 4 Jun 2024, Song Liu wrote: > On Tue, May 21, 2024 at 1:04 AM Petr Mladek wrote: > [...] > > > > > > Yes, but the information you get is limited compared to what is available > > > now. You would obtain the information that a patched function was called > > > but ftrace could also give you the context and more. > > > > Another motivation to use ftrace for testing is that it does not > > affect the performance in production. > > > > We should keep klp_ftrace_handler() as fast as possible so that we > > could livepatch also performance sensitive functions. > > At LPC last year, we discussed about adding a counter to each > klp_func, like: > > struct klp_func { > ... > u64 __percpu counter; > ... > }; > > With some static_key (+ sysctl), this should give us a way to estimate > the overhead of livepatch. If we have the counter, this patch is not > needed any more. Does this (adding the counter) sound like > something we still want to pursue? It would be better than this patch but given what was mentioned in the thread I wonder if it is possible to use ftrace even for this. See /sys/kernel/tracing/trace_stat/function*. It already gathers the number of hits. Would it be sufficient for you? I guess it depends on what the intention is. If there is no time limit, klp_func.counter might be better to provide some kind of overall statistics (but I am not sure if it has any value) and to avoid having ftrace registered on a live patched function for infinite period of time. If the intention is to gather data for some limited period, trace_stat sounds like much better approach to me. Regards Miroslav
Re: [PATCH 2/2] ring-buffer: Fix a race between readers and resize checks
On 5/28/24 01:43, Steven Rostedt wrote: > On Mon, 27 May 2024 11:36:55 +0200 > Petr Pavlu wrote: > >>>> static void rb_check_pages(struct ring_buffer_per_cpu *cpu_buffer) >>>> { >>>> @@ -2200,8 +2205,13 @@ int ring_buffer_resize(struct trace_buffer *buffer, >>>> unsigned long size, >>>> */ >>>>synchronize_rcu(); >>>>for_each_buffer_cpu(buffer, cpu) { >>>> + unsigned long flags; >>>> + >>>>cpu_buffer = buffer->buffers[cpu]; >>>> + raw_spin_lock_irqsave(_buffer->reader_lock, flags); >>>>rb_check_pages(cpu_buffer); >>>> + raw_spin_unlock_irqrestore(_buffer->reader_lock, >>>> + flags); >>> >>> Putting my RT hat on, I really don't like the above fix. The >>> rb_check_pages() iterates all subbuffers which makes the time interrupts >>> are disabled non-deterministic. >> >> I see, this applies also to the same rb_check_pages() validation invoked >> from ring_buffer_read_finish(). >> >>> >>> Instead, I would rather have something where we disable readers while we do >>> the check, and re-enable them. >>> >>> raw_spin_lock_irqsave(_buffer->reader_lock, flags); >>> cpu_buffer->read_disabled++; >>> raw_spin_unlock_irqrestore(_buffer->reader_lock, >>> flags); >>> >>> // Also, don't put flags on a new line. We are allow to go 100 characters >>> now. >> >> Noted. >> >>> >>> >>> rb_check_pages(cpu_buffer); >>> raw_spin_lock_irqsave(_buffer->reader_lock, flags); >>> cpu_buffer->read_disabled--; >>> raw_spin_unlock_irqrestore(_buffer->reader_lock, >>> flags); >>> >>> Or something like that. Yes, that also requires creating a new >>> "read_disabled" field in the ring_buffer_per_cpu code. >> >> I think this would work but I'm personally not immediately sold on this >> approach. If I understand the idea correctly, readers should then check >> whether cpu_buffer->read_disabled is set and bail out with some error if >> that is the case. The rb_check_pages() function is only a self-check >> code and as such, I feel it doesn't justify disrupting readers with >> a new error condition and adding more complex locking. > > Honestly, this code was never made for more than one reader per > cpu_buffer. I'm perfectly fine if all check_pages() causes other > readers to the same per_cpu buffer to get -EBUSY. > > Do you really see this being a problem? What use case is there for > hitting the check_pages() and reading the same cpu buffer at the same > time? My main issue is with added complexity to check the new read_disabled flag. The rb_check_pages() part is simple and you showed what to do. The readers part is what I struggle with. I think the read_disabled flag needs to be either checked once in rb_get_reader_page() where the actual problem with making the list temporarily inconsistent exists. Or alternatively, it can be checked by direct or indirect users of rb_get_reader_page() just after they take the reader_lock. Looking at the final rb_get_reader_page() function, it currently always returns a valid reader page unless the buffer doesn't contain any additional entry or a serious problem is detected by RB_WARN_ON() checks. This is simple to handle for callers, either they get a reader page and can continue, or they stop. Returning -EBUSY means that callers have a new case that they need to decide what to do about. They need to propagate the error up the chain or attempt to handle it. This involves ring-buffer functions rb_advance_reader(), rb_buffer_peek(), ring_buffer_peek(), ring_buffer_consume(), ring_buffer_read_page() ring_buffer_map_get_reader() and their callers from other source files. It is possible to handle this new case in these functions but I'm not sure if adding this logic is justified. I feel I must have misunderstood something how it should work? I was further thinking about alternatives that would possibly provide a less thorough check but have their complexity limited only to rb_check_pages(). The already mentioned idea is to have the function to look only at surrounding nodes where some change in the list occurred. Another option could be to try traversing the whole list in smaller parts and give up the reader_lock in between them. This would need some care to make sure that the operation completes, e.g. the code would need to bail out if it detects a change on cpu_buffer->pages_read. Thanks, Petr
Re: [PATCH 0/2] livepatch: Add compiler optimization disclaimer/docs
Hi, On Fri, 31 May 2024, Joe Lawrence wrote: > On 5/31/24 07:23, Miroslav Benes wrote: > > Hi, > > > > On Tue, 21 Jul 2020, Joe Lawrence wrote: > > > >> In light of [PATCH] Revert "kbuild: use -flive-patching when > >> CONFIG_LIVEPATCH is enabled" [1], we should add some loud disclaimers > >> and explanation of the impact compiler optimizations have on > >> livepatching. > >> > >> The first commit provides detailed explanations and examples. The list > >> was taken mostly from Miroslav's LPC talk a few years back. This is a > >> bit rough, so corrections and additional suggestions welcome. Expanding > >> upon the source-based patching approach would be helpful, too. > >> > >> The second commit adds a small README.rst file in the livepatch samples > >> directory pointing the reader to the doc introduced in the first commit. > >> > >> I didn't touch the livepatch kselftests yet as I'm still unsure about > >> how to best account for IPA here. We could add the same README.rst > >> disclaimer here, too, but perhaps we have a chance to do something more. > >> Possibilities range from checking for renamed functions as part of their > >> build, or the selftest scripts, or even adding something to the kernel > >> API. I think we'll have a better idea after reviewing the compiler > >> considerations doc. > > > > thanks to Marcos for resurrecting this. > > > > Joe, do you have an updated version by any chance? Some things have > > changed since July 2020 so it calls for a new review. If there was an > > improved version, it would be easier. If not, no problem at all. > > > > Yea, it's been a little while :) I don't have any newer version than > this one. I can rebase, apply all of the v1 suggestions, and see where > it stands. LMK if you can think of any specifics that could be added. I will walk through the patches first to see if there is something which can/should be changed given the development since then. > For example, CONFIG_KERNEL_IBT will be driving some changes soon, > whether it be klp-convert for source-based patches or vmlinux.o binary > comparison for kpatch-build. True. > I can push a v2 with a few changes, but IIRC, last time we reviewed > this, it kinda begged the question of how someone is creating the > livepatch in the first place. As long as we're fine holding that > thought for a while longer, this doc may still be useful by itself. If I remember correctly, the conclusion was that this doc was beneficial on its own. Miroslav
Re: [PATCH v2 1/1] dt-bindings: remoteproc: imx_rproc: add minItems for power-domain
On 06/06/2024 17:00, Frank Li wrote: > "fsl,imx8qxp-cm4" and "fsl,imx8qm-cm4" need minimum 2 power domains. Keep > the same restriction for other compatible string. > > Signed-off-by: Frank Li > --- > > Notes: > Change from v1 to v2 > - set minitem to 2 at top > - Add imx8qm compatible string also > - use not logic to handle difference compatible string restriction > - update commit message. > > pass dt_binding_check. > > make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- -j8 dt_binding_check > DT_SCHEMA_FILES=fsl,imx-rproc.yaml > SCHEMA Documentation/devicetree/bindings/processed-schema.json > CHKDT Documentation/devicetree/bindings > LINTDocumentation/devicetree/bindings > DTEX > Documentation/devicetree/bindings/remoteproc/fsl,imx-rproc.example.dts > DTC_CHK > Documentation/devicetree/bindings/remoteproc/fsl,imx-rproc.example.dtb > > .../bindings/remoteproc/fsl,imx-rproc.yaml | 14 ++ > 1 file changed, 14 insertions(+) > > diff --git a/Documentation/devicetree/bindings/remoteproc/fsl,imx-rproc.yaml > b/Documentation/devicetree/bindings/remoteproc/fsl,imx-rproc.yaml > index df36e29d974ca..da108a39df435 100644 > --- a/Documentation/devicetree/bindings/remoteproc/fsl,imx-rproc.yaml > +++ b/Documentation/devicetree/bindings/remoteproc/fsl,imx-rproc.yaml > @@ -59,6 +59,7 @@ properties: > maxItems: 32 > >power-domains: > +minItems: 2 > maxItems: 8 > >fsl,auto-boot: > @@ -99,6 +100,19 @@ allOf: >properties: > fsl,iomuxc-gpr: false > > + - if: > + properties: > +compatible: > + not: > +contains: > + enum: > +- fsl,imx8qxp-cm4 > +- fsl,imx8qm-cm4 > +then: > + properties: > +power-domains: > + minItems: 8 What happened with the "else:"? How many power domains is needed for other devices? Best regards, Krzysztof
Re: [PATCH v2 1/2] dt-bindings: soc: qcom,smsm: Allow specifying mboxes instead of qcom,ipc
On 06/06/2024 21:18, Luca Weiss wrote: > The qcom,ipc-N properties are essentially providing a reference to a > mailbox, so allow using the mboxes property to do the same in a more > structured way. > > Since multiple SMSM hosts are supported, we need to be able to provide > the correct mailbox for each host. The old qcom,ipc-N properties map to > the mboxes property by index, starting at 0 since that's a valid SMSM > host also. > > Mark the older qcom,ipc-N as deprecated and update the example with > mboxes. > > Signed-off-by: Luca Weiss > --- Reviewed-by: Krzysztof Kozlowski Best regards, Krzysztof
Re: [PATCH 1/5] dt-bindings: remoteproc: qcom,sm8550-pas: document the SDX75 PAS
On 6/6/2024 8:19 PM, Krzysztof Kozlowski wrote: On 06/06/2024 16:38, Naina Mehta wrote: Document the MPSS Peripheral Authentication Service on SDX75 platform. Signed-off-by: Naina Mehta --- .../devicetree/bindings/remoteproc/qcom,sm8550-pas.yaml | 1 + 1 file changed, 1 insertion(+) diff --git a/Documentation/devicetree/bindings/remoteproc/qcom,sm8550-pas.yaml b/Documentation/devicetree/bindings/remoteproc/qcom,sm8550-pas.yaml index 73fda7565cd1..02e15b1f78ab 100644 --- a/Documentation/devicetree/bindings/remoteproc/qcom,sm8550-pas.yaml +++ b/Documentation/devicetree/bindings/remoteproc/qcom,sm8550-pas.yaml @@ -16,6 +16,7 @@ description: properties: compatible: enum: + - qcom,sdx75-mpss-pas - qcom,sm8550-adsp-pas - qcom,sm8550-cdsp-pas - qcom,sm8550-mpss-pas Missing updates to allOf constraints. Are you sure this is the binding for SDX75? SDX75 diverged from SDX55 due to introduction of separate DTB firmware binary and addition of DSM and Qlink logging memory regions. Considering these additions, SM8550 PAS bindings seem closest. Thanks for catching updates required for allOf constraints. I will add sdx75 compatible in-line with qcom,sm8650-mpss-pas. Please let me know if you recommend creating new bindings for SDX75. Regards, Naina Best regards, Krzysztof
Re: [PATCH] net: missing check
Hi Denis, kernel test robot noticed the following build warnings: [auto build test WARNING on linus/master] [also build test WARNING on v6.10-rc2 next-20240607] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information] url: https://github.com/intel-lab-lkp/linux/commits/Denis-Arefev/net-missing-check/20240606-230540 base: linus/master patch link: https://lore.kernel.org/r/20240606141450.44709-1-arefev%40swemel.ru patch subject: [PATCH] net: missing check config: x86_64-randconfig-121-20240607 (https://download.01.org/0day-ci/archive/20240607/202406071404.oilhfohm-...@intel.com/config) compiler: gcc-13 (Ubuntu 13.2.0-4ubuntu3) 13.2.0 reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240607/202406071404.oilhfohm-...@intel.com/reproduce) If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot | Closes: https://lore.kernel.org/oe-kbuild-all/202406071404.oilhfohm-...@intel.com/ sparse warnings: (new ones prefixed by >>) drivers/net/virtio_net.c: note: in included file: >> include/linux/virtio_net.h:103:58: sparse: sparse: incorrect type in >> argument 2 (different base types) @@ expected unsigned long long >> [usertype] divisor @@ got restricted __virtio16 const [usertype] >> gso_size @@ include/linux/virtio_net.h:103:58: sparse: expected unsigned long long [usertype] divisor include/linux/virtio_net.h:103:58: sparse: got restricted __virtio16 const [usertype] gso_size >> include/linux/virtio_net.h:104:42: sparse: sparse: restricted __virtio16 >> degrades to integer vim +103 include/linux/virtio_net.h 49 50 static inline int virtio_net_hdr_to_skb(struct sk_buff *skb, 51 const struct virtio_net_hdr *hdr, 52 bool little_endian) 53 { 54 unsigned int nh_min_len = sizeof(struct iphdr); 55 unsigned int gso_type = 0; 56 unsigned int thlen = 0; 57 unsigned int p_off = 0; 58 unsigned int ip_proto; 59 u64 ret, remainder; 60 61 if (hdr->gso_type != VIRTIO_NET_HDR_GSO_NONE) { 62 switch (hdr->gso_type & ~VIRTIO_NET_HDR_GSO_ECN) { 63 case VIRTIO_NET_HDR_GSO_TCPV4: 64 gso_type = SKB_GSO_TCPV4; 65 ip_proto = IPPROTO_TCP; 66 thlen = sizeof(struct tcphdr); 67 break; 68 case VIRTIO_NET_HDR_GSO_TCPV6: 69 gso_type = SKB_GSO_TCPV6; 70 ip_proto = IPPROTO_TCP; 71 thlen = sizeof(struct tcphdr); 72 nh_min_len = sizeof(struct ipv6hdr); 73 break; 74 case VIRTIO_NET_HDR_GSO_UDP: 75 gso_type = SKB_GSO_UDP; 76 ip_proto = IPPROTO_UDP; 77 thlen = sizeof(struct udphdr); 78 break; 79 case VIRTIO_NET_HDR_GSO_UDP_L4: 80 gso_type = SKB_GSO_UDP_L4; 81 ip_proto = IPPROTO_UDP; 82 thlen = sizeof(struct udphdr); 83 break; 84 default: 85 return -EINVAL; 86 } 87 88 if (hdr->gso_type & VIRTIO_NET_HDR_GSO_ECN) 89 gso_type |= SKB_GSO_TCP_ECN; 90 91 if (hdr->gso_size == 0) 92 return -EINVAL; 93 } 94 95 skb_reset_mac_header(skb); 96 97 if (hdr->flags & VIRTIO_NET_HDR_F_NEEDS_CSUM) { 98 u32 start = __virtio16_to_cpu(little_endian, hdr->csum_start); 99 u32 off = __virtio16_to_cpu(little_endian, hdr->csum_offset); 100 u32 needed = start + max_t(u32, thlen, off + sizeof(__sum16)); 101 102 if (hdr->gso_size) { > 103 ret = div64_u64_rem(skb->len, hdr->gso_size, > ); > 104 if (!(ret && (hdr->gso_size > needed) && 105 ((remainder > needed) || (remainder == 0 { 106 return -EINVAL; 107 } 108 skb_shinfo(skb)->tx_flags |= SKBFL_SHARED_FRAG; 109 } 110 111 if (!pskb_may_pull(skb, needed)) 112
Re: [PATCH 3/5] arm64: dts: qcom: sdx75: add missing qlink_logging reserved memory for mpss
On 6/6/2024 8:20 PM, Krzysztof Kozlowski wrote: On 06/06/2024 16:38, Naina Mehta wrote: The qlink_logging memory region is also used by the modem firmware, add it to reserved memory regions. Also split MPSS DSM region into 2 separate regions. Signed-off-by: Naina Mehta --- arch/arm64/boot/dts/qcom/sdx75.dtsi | 17 + 1 file changed, 13 insertions(+), 4 deletions(-) diff --git a/arch/arm64/boot/dts/qcom/sdx75.dtsi b/arch/arm64/boot/dts/qcom/sdx75.dtsi index 9b93f6501d55..9349b1c4e196 100644 --- a/arch/arm64/boot/dts/qcom/sdx75.dtsi +++ b/arch/arm64/boot/dts/qcom/sdx75.dtsi @@ -366,7 +366,12 @@ no-map; }; - qdss_mem: qdss@8880 { + qdss_mem: qdss@8850 { + reg = <0x0 0x8850 0x0 0x30>; + no-map; + }; + + qlink_logging_mem: qlink_logging@8880 { Sorry, no downstream code. Please follow DTS coding style - no underscores in node names. This applies to all work sent upstream. Thanks for pointing this out. I will update in next revision. Regards, Naina Best regards, Krzysztof
Re: [PATCH] nvdimm: Fix devs leaks in scan_labels()
On 07/06/2024 00:49, Ira Weiny wrote: > Li Zhijian wrote: >> Don't allocate devs again when it's valid pointer which has pionted to >> the memory allocated above with size (count + 2 * sizeof(dev)). >> >> A kmemleak reports: >> unreferenced object 0x88800dda1980 (size 16): >>comm "kworker/u10:5", pid 69, jiffies 4294671781 >>hex dump (first 16 bytes): >> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >>backtrace (crc 0): >> [] __kmalloc+0x32c/0x470 >> [<9ed43c83>] nd_region_register_namespaces+0x6fb/0x1120 >> [libnvdimm] >> [<0e07a65c>] nd_region_probe+0xfe/0x210 [libnvdimm] >> [<7b79ce5f>] nvdimm_bus_probe+0x7a/0x1e0 [libnvdimm] >> [ ] really_probe+0xc6/0x390 >> [<129e2a69>] __driver_probe_device+0x78/0x150 >> [<2dfed28b>] driver_probe_device+0x1e/0x90 >> [ ] __device_attach_driver+0x85/0x110 >> [<32dca295>] bus_for_each_drv+0x85/0xe0 >> [<391c5a7d>] __device_attach+0xbe/0x1e0 >> [<26dabec0>] bus_probe_device+0x94/0xb0 >> [ ] device_add+0x656/0x870 >> [<3d69bfaa>] nd_async_device_register+0xe/0x50 [libnvdimm] >> [<3f4c52a4>] async_run_entry_fn+0x2e/0x110 >> [ ] process_one_work+0x1ee/0x600 >> [<6d90d5a9>] worker_thread+0x183/0x350 >> >> Fixes: 1b40e09a1232 ("libnvdimm: blk labels and namespace instantiation") >> Signed-off-by: Li Zhijian >> --- >> drivers/nvdimm/namespace_devs.c | 4 +++- >> 1 file changed, 3 insertions(+), 1 deletion(-) >> >> diff --git a/drivers/nvdimm/namespace_devs.c >> b/drivers/nvdimm/namespace_devs.c >> index d6d558f94d6b..56b016dbe307 100644 >> --- a/drivers/nvdimm/namespace_devs.c >> +++ b/drivers/nvdimm/namespace_devs.c >> @@ -1994,7 +1994,9 @@ static struct device **scan_labels(struct nd_region >> *nd_region) >> /* Publish a zero-sized namespace for userspace to configure. */ >> nd_mapping_free_labels(nd_mapping); >> >> -devs = kcalloc(2, sizeof(dev), GFP_KERNEL); >> +/* devs probably has been allocated */ > > I don't think this is where the bug is. The loop above is processing the > known labels and should exit with a count > 0 if devs is not NULL. > > From what I can tell create_namespace_pmem() must be returning EAGAIN > which leaves devs allocated but fails to increment count. Thus there are > no valid labels but devs was not free'ed. Per the piece of the code, return EAGAIN and ENODEV could cause this issue in theory. 1980 dev = create_namespace_pmem(nd_region, nd_mapping, nd_label); 1981 if (IS_ERR(dev)) { 1982 switch (PTR_ERR(dev)) { 1983 case -EAGAIN: 1984 /* skip invalid labels */ 1985 continue; 1986 case -ENODEV: 1987 /* fallthrough to seed creation */ 1988 break; 1989 default: 1990 goto err; 1991 } 1992 } else 1993 devs[count++] = dev; > > Can you trace the error you are seeing a bit more to see if this is the > case? I just tried, but I cannot reproduce this leaking again. When it happened(100% reproduce at that time), I probably had a corrupted LSA(I see empty output with command 'ndctl list'). It seemed the QEMU emulated Nvdimm device was broken for some reasons. Thanks Zhijian > > Thanks, > Ira > >> +if (!devs) >> +devs = kcalloc(2, sizeof(dev), GFP_KERNEL); >> if (!devs) >> goto err; >> >> -- >> 2.29.2 >> > >
Re: [PATCH] livepatch: introduce klp_func called interface
> On Jun 6, 2024, at 23:01, Joe Lawrence wrote: > > Hi Wardenjohn, > > To follow up, Red Hat kpatch QE pointed me toward this test: > > https://gitlab.com/redhat/centos-stream/tests/kernel/kernel-tests/-/tree/main/general/kpatch/kpatch-trace?ref_type=heads > > which reports a few interesting things via systemd service and ftrace: > > - Patched functions > - Traced functions > - Code that was modified, but didn't run > - Other code that ftrace saw, but is not listed in the sysfs directory > > The code looks to be built on top of the same basic ftrace commands that > I sent the other day. You may be able to reuse/copy/etc bits from the > scripts if they are handy. > > -- > Joe > Thank you so much, you really helped me, Joe! I will try to learn the script and make it suitable for our system. Again, thank you, Joe! Regards, Wardenjohn
Re: [PATCH] nvdimm: Fix devs leaks in scan_labels()
On 07/06/2024 00:27, Dave Jiang wrote: > > > On 6/3/24 8:16 PM, Li Zhijian wrote: >> Don't allocate devs again when it's valid pointer which has pionted to >> the memory allocated above with size (count + 2 * sizeof(dev)). >> >> A kmemleak reports: >> unreferenced object 0x88800dda1980 (size 16): >>comm "kworker/u10:5", pid 69, jiffies 4294671781 >>hex dump (first 16 bytes): >> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >>backtrace (crc 0): >> [] __kmalloc+0x32c/0x470 >> [<9ed43c83>] nd_region_register_namespaces+0x6fb/0x1120 >> [libnvdimm] >> [<0e07a65c>] nd_region_probe+0xfe/0x210 [libnvdimm] >> [<7b79ce5f>] nvdimm_bus_probe+0x7a/0x1e0 [libnvdimm] >> [ ] really_probe+0xc6/0x390 >> [<129e2a69>] __driver_probe_device+0x78/0x150 >> [<2dfed28b>] driver_probe_device+0x1e/0x90 >> [ ] __device_attach_driver+0x85/0x110 >> [<32dca295>] bus_for_each_drv+0x85/0xe0 >> [<391c5a7d>] __device_attach+0xbe/0x1e0 >> [<26dabec0>] bus_probe_device+0x94/0xb0 >> [ ] device_add+0x656/0x870 >> [<3d69bfaa>] nd_async_device_register+0xe/0x50 [libnvdimm] >> [<3f4c52a4>] async_run_entry_fn+0x2e/0x110 >> [ ] process_one_work+0x1ee/0x600 >> [<6d90d5a9>] worker_thread+0x183/0x350 >> >> Fixes: 1b40e09a1232 ("libnvdimm: blk labels and namespace instantiation") >> Signed-off-by: Li Zhijian >> --- >> drivers/nvdimm/namespace_devs.c | 4 +++- >> 1 file changed, 3 insertions(+), 1 deletion(-) >> >> diff --git a/drivers/nvdimm/namespace_devs.c >> b/drivers/nvdimm/namespace_devs.c >> index d6d558f94d6b..56b016dbe307 100644 >> --- a/drivers/nvdimm/namespace_devs.c >> +++ b/drivers/nvdimm/namespace_devs.c >> @@ -1994,7 +1994,9 @@ static struct device **scan_labels(struct nd_region >> *nd_region) >> /* Publish a zero-sized namespace for userspace to configure. */ >> nd_mapping_free_labels(nd_mapping); >> >> -devs = kcalloc(2, sizeof(dev), GFP_KERNEL); >> +/* devs probably has been allocated */ >> +if (!devs) >> +devs = kcalloc(2, sizeof(dev), GFP_KERNEL); > > This changes the behavior of this code and possibly corrupting the previously > allocated memory at times when 'devs' is valid. If we reach here, that means count == 0 is true. so we can deduce that the previously allocated memory was not touched at all in previous loop. and its size was (2 * sizeof(dev)) too. Was the 'devs' leaked out from the previous loop and should be freed instead? this option is also fine to me. we can also fix this by free(devs) before allocate it again.: + kfree(devs); // kfree(NULL) is safe. devs = kcalloc(2, sizeof(dev), GFP_KERNEL); Thanks Zhijian > >> if (!devs) >> goto err; >> >
Re: [PATCH v14 02/14] cgroup/misc: Add per resource callbacks for CSS events
I think it is better when _misc_cg_res_alloc fails, it just calls _misc_cg_res_free(cg, index)(add index parameter, it means ending of iterator), so it can avoid calling ->free() that do not call ->alloc(). And in misc_cg_free, just call _misc_cg_res_free(cg, MISC_CG_RES_TYPES) to free all. Thanks Ridong On 2024/6/6 22:51, Haitao Huang wrote: On Thu, 06 Jun 2024 08:37:31 -0500, chenridong wrote: If _misc_cg_res_alloc fails, maybe some types do not call ->alloc(), but all types ->free() callback >will be called, is that ok? Not sure I understand. Are you suggesting we ignore failures from ->alloc() callback in _misc_cg_res_alloc() as it is per-resource, and have ->free() callback and resource provider of the failing type to handle the failure internally? IIUC, this failure only happens when a specific subcgroup is created (memory running out for allocation) so failing that subcgroup as a whole seems fine to me. Note the root node is static and no pre-resource callbacks invoked by misc. And resource provider handles its own allocations for root. In SGX case we too declare a static object for corresponding root sgx_cgroup struct. Note also misc cgroup (except for setting capacity[res] = 0 at root) is all or nothing so no mechanism to tell user "this resource does not work but others are fine in this particular cgroup." Thanks Haitao
Re: [PATCH 3/3] rust: add tracepoint support
On Thu, Jun 06, 2024 at 09:29:51PM +0200, Peter Zijlstra wrote: > On Thu, Jun 06, 2024 at 12:00:36PM -0700, Boqun Feng wrote: > > On Thu, Jun 06, 2024 at 07:35:44PM +0200, Peter Zijlstra wrote: > > > On Thu, Jun 06, 2024 at 08:49:06AM -0700, Boqun Feng wrote: > > > > > > > Long-term plan is to 1) compile the C helpers in some IR and 2) inline > > > > the helpers with Rust in IR-level, as what Gary has: > > > > > > > > > > > > https://lore.kernel.org/rust-for-linux/20240529202817.3641974-1-g...@garyguo.net/ > > > > > > Urgh, that still needs us to maintain that silly list of helpers :-/ > > > > > > > But it's an improvement from the current stage, right? ;-) > > Somewhat, but only marginal. > > > > Can't we pretty please have clang parse the actual header files into IR > > > and munge that into rust? So that we don't get to manually duplicate > > > everything+dog. > > > > That won't always work, because some of our kernel APIs are defined as > > macros, and I don't think it's a trivial job to generate a macro > > definition to a function definition so that it can be translated to > > something in IR. We will have to do the macro -> function mapping > > ourselves somewhere, if we want to inline the API across languages. > > We can try and see how far we can get with moving a bunch of stuff into > inlines. There's quite a bit of simple CPP that could be inlines or > const objects I suppose. > We can, but I'd first stick with what we have, improve it and make it stable until we go to the next stage. Plus, there's benefit of keeping an explicit helper list: it's clear what APIs are called by Rust, and moreover, it's easier to modify the helpers if you were to change an API, other than chasing where Rust code calls it. (Don't make me wrong, I'm happy if you want to do that ;-)) Regards, Boqun > Things like the tracepoints are of course glorious CPP abuse and are > never going to work. > > But perhaps you can have an explicit 'eval-CPP on this here' construct > or whatnot. If I squit I see this paste! thingy (WTF's up with that ! > operator?) to munge function names in the static_call thing. So > something like apply CPP from over there on this here can also be done > :-)
Re: [PATCH v3 08/13] tracing: Add option to use memmapped memory for trace boot instance
Memory management folks. Please review this patch. Specifically the "map_pages()" function below. On Thu, 06 Jun 2024 17:17:43 -0400 Steven Rostedt wrote: > From: "Steven Rostedt (Google)" > > Add an option to the trace_instance kernel command line parameter that > allows it to use the reserved memory from memmap boot parameter. > > memmap=12M$0x28450 trace_instance=boot_mapped@0x28450:12M > > The above will reserves 12 megs at the physical address 0x28450. > The second parameter will create a "boot_mapped" instance and use the > memory reserved as the memory for the ring buffer. > > That will create an instance called "boot_mapped": > > /sys/kernel/tracing/instances/boot_mapped > > Note, because the ring buffer is using a defined memory ranged, it will > act just like a memory mapped ring buffer. It will not have a snapshot > buffer, as it can't swap out the buffer. The snapshot files as well as any > tracers that uses a snapshot will not be present in the boot_mapped > instance. > > Cc: linux...@kvack.org > Signed-off-by: Steven Rostedt (Google) > --- > .../admin-guide/kernel-parameters.txt | 9 +++ > kernel/trace/trace.c | 75 +-- > 2 files changed, 78 insertions(+), 6 deletions(-) > > diff --git a/Documentation/admin-guide/kernel-parameters.txt > b/Documentation/admin-guide/kernel-parameters.txt > index b600df82669d..ff26b6094e79 100644 > --- a/Documentation/admin-guide/kernel-parameters.txt > +++ b/Documentation/admin-guide/kernel-parameters.txt > @@ -6754,6 +6754,15 @@ > the same thing would happen if it was left off). The > irq_handler_entry > event, and all events under the "initcall" system. > > + If memory has been reserved (see memmap for x86), the > instance > + can use that memory: > + > + memmap=12M$0x28450 > trace_instance=boot_map@0x28450:12M > + > + The above will create a "boot_map" instance that uses > the physical > + memory at 0x28450 that is 12Megs. The per CPU > buffers of that > + instance will be split up accordingly. > + > trace_options=[option-list] > [FTRACE] Enable or disable tracer options at boot. > The option-list is a comma delimited list of options > diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c > index 622fe670949d..13e89023f33b 100644 > --- a/kernel/trace/trace.c > +++ b/kernel/trace/trace.c > @@ -9504,6 +9504,31 @@ static int instance_mkdir(const char *name) > return ret; > } > > +static u64 map_pages(u64 start, u64 size) > +{ > + struct page **pages; > + phys_addr_t page_start; > + unsigned int page_count; > + unsigned int i; > + void *vaddr; > + > + page_count = DIV_ROUND_UP(size, PAGE_SIZE); > + > + page_start = start; > + pages = kmalloc_array(page_count, sizeof(struct page *), GFP_KERNEL); > + if (!pages) > + return 0; > + > + for (i = 0; i < page_count; i++) { > + phys_addr_t addr = page_start + i * PAGE_SIZE; > + pages[i] = pfn_to_page(addr >> PAGE_SHIFT); > + } > + vaddr = vmap(pages, page_count, VM_MAP, PAGE_KERNEL); > + kfree(pages); > + > + return (u64)(unsigned long)vaddr; > +} If for some reason the memmap=nn$ss fails, but this still gets called, will the above just map over any memory. That is, is it possible that the kernel could have used this memory? Is there a way to detect this? That is, I don't want this to succeed if the memory location it's about to map to is used by the kernel, or will be used by user space. -- Steve > + > /** > * trace_array_get_by_name - Create/Lookup a trace array, given its name. > * @name: The name of the trace array to be looked up/created. > @@ -10350,6 +10375,7 @@ __init static void enable_instances(void) > { > struct trace_array *tr; > char *curr_str; > + char *name; > char *str; > char *tok; > > @@ -10358,19 +10384,56 @@ __init static void enable_instances(void) > str = boot_instance_info; > > while ((curr_str = strsep(, "\t"))) { > + unsigned long start = 0; > + unsigned long size = 0; > + unsigned long addr = 0; > > tok = strsep(_str, ","); > + name = strsep(, "@"); > + if (tok) { > + start = memparse(tok, ); > + if (!start) { > + pr_warn("Tracing: Invalid boot instance address > for %s\n", > + name); > + continue; > + } > + } > > - if (IS_ENABLED(CONFIG_TRACER_MAX_TRACE)) > - do_allocate_snapshot(tok); > + if (start) { > + if (*tok
Re: [PATCH v4] drivers: remoteproc: xlnx: add attach detach support
On Wed, Jun 05, 2024 at 12:45:17PM -0500, Tanmay Shah wrote: > > > On 6/4/24 10:34 AM, Mathieu Poirier wrote: > > Hi Mathieu, > > Thanks for reviews. > Please find my comments below. > > > Hi Tanmay, > > > > On Mon, Jun 03, 2024 at 01:34:38PM -0700, Tanmay Shah wrote: > >> It is possible that remote processor is already running before > >> linux boot or remoteproc platform driver probe. Implement required > >> remoteproc framework ops to provide resource table address and > >> connect or disconnect with remote processor in such case. > >> > >> Signed-off-by: Tanmay Shah > >> > >> --- > >> Changes in v4: > >> - Move change log out of commit text > >> > >> Changes in v3: > >> > >> - Drop SRAM patch from the series > >> - Change type from "struct resource_table *" to void __iomem * > >> - Change comment format from /** to /* > >> - Remove unmap of resource table va address during detach, allowing > >> attach-detach-reattach use case. > >> - Unmap rsc_data_va after retrieving resource table data structure. > >> - Unmap resource table va during driver remove op > >> > >> Changes in v2: > >> > >> - Fix typecast warnings reported using sparse tool. > >> - Fix following sparse warnings: > >> > >> drivers/remoteproc/xlnx_r5_remoteproc.c:827:21: sparse: warning: incorrect > >> type in assignment (different address spaces) > >> drivers/remoteproc/xlnx_r5_remoteproc.c:844:18: sparse: warning: incorrect > >> type in assignment (different address spaces) > >> drivers/remoteproc/xlnx_r5_remoteproc.c:898:24: sparse: warning: incorrect > >> type in argument 1 (different address spaces) > >> drivers/remoteproc/xlnx_r5_remoteproc.c | 173 +++- > >> 1 file changed, 169 insertions(+), 4 deletions(-) > >> > >> diff --git a/drivers/remoteproc/xlnx_r5_remoteproc.c > >> b/drivers/remoteproc/xlnx_r5_remoteproc.c > >> index 84243d1dff9f..6898d4761566 100644 > >> --- a/drivers/remoteproc/xlnx_r5_remoteproc.c > >> +++ b/drivers/remoteproc/xlnx_r5_remoteproc.c > >> @@ -25,6 +25,10 @@ > >> /* RX mailbox client buffer max length */ > >> #define MBOX_CLIENT_BUF_MAX (IPI_BUF_LEN_MAX + \ > >> sizeof(struct zynqmp_ipi_message)) > >> + > >> +#define RSC_TBL_XLNX_MAGIC((uint32_t)'x' << 24 | (uint32_t)'a' << > >> 16 | \ > >> + (uint32_t)'m' << 8 | (uint32_t)'p') > >> + > >> /* > >> * settings for RPU cluster mode which > >> * reflects possible values of xlnx,cluster-mode dt-property > >> @@ -73,6 +77,26 @@ struct mbox_info { > >>struct mbox_chan *rx_chan; > >> }; > >> > >> +/** > >> + * struct rsc_tbl_data > >> + * > >> + * Platform specific data structure used to sync resource table address. > >> + * It's important to maintain order and size of each field on remote side. > >> + * > >> + * @version: version of data structure > >> + * @magic_num: 32-bit magic number. > >> + * @comp_magic_num: complement of above magic number > >> + * @rsc_tbl_size: resource table size > >> + * @rsc_tbl: resource table address > >> + */ > >> +struct rsc_tbl_data { > >> + const int version; > >> + const u32 magic_num; > >> + const u32 comp_magic_num; > > > > I thought we agreed on making the magic number a u64 - did I get this wrong? > > > > Looks like I missed this comment from previous reviews, so didn't address it. > Thanks for pointing this. > > So I think having two 32-bit numbers with proper name, implies what is > expected and less chance of errors. > With 64-bit number, it's easy to create errors when assigning magic number. > > However, if 64-bit number is preferred, I will change it and test it. > Please let me know. I was under the impression we had agreed on a u64 but that may just be my own interpretation. Things can stay as they are now. > > > >> + const u32 rsc_tbl_size; > >> + const uintptr_t rsc_tbl; > >> +} __packed; > >> + > >> /* > >> * Hardcoded TCM bank values. This will stay in driver to maintain > >> backward > >> * compatibility with device-tree that does not have TCM information. > >> @@ -95,20 +119,24 @@ static const struct mem_bank_data > >> zynqmp_tcm_banks_lockstep[] = { > >> /** > >> * struct zynqmp_r5_core > >> * > >> + * @rsc_tbl_va: resource table virtual address > >> * @dev: device of RPU instance > >> * @np: device node of RPU instance > >> * @tcm_bank_count: number TCM banks accessible to this RPU > >> * @tcm_banks: array of each TCM bank data > >> * @rproc: rproc handle > >> + * @rsc_tbl_size: resource table size retrieved from remote > >> * @pm_domain_id: RPU CPU power domain id > >> * @ipi: pointer to mailbox information > >> */ > >> struct zynqmp_r5_core { > >> + void __iomem *rsc_tbl_va; > >>struct device *dev; > >>struct device_node *np; > >>int tcm_bank_count; > >>struct mem_bank_data **tcm_banks; > >>struct rproc *rproc; > >> + u32 rsc_tbl_size; > >>u32 pm_domain_id; > >>struct mbox_info *ipi; > >> };
Re: [PATCH v4] drivers: remoteproc: xlnx: add attach detach support
On Wed, 5 Jun 2024 at 11:47, Tanmay Shah wrote: > > > > On 6/4/24 3:23 PM, Bjorn Andersson wrote: > > On Mon, Jun 03, 2024 at 01:34:38PM -0700, Tanmay Shah wrote: > >> It is possible that remote processor is already running before > >> linux boot or remoteproc platform driver probe. Implement required > >> remoteproc framework ops to provide resource table address and > >> connect or disconnect with remote processor in such case. > >> > > > > I know if changes the current style for this driver, but could we drop > > "drivers: " from the subject prefix, to align changes to this driver > > with others? > > > > Ack. Will be fixed. Is it convention not to have "drivers" ? "drivers" is not needed. > If so, from now on, I will format subject prefix: /: > That will be just fine. > > Regards, > > Bjorn >
Re: [PATCH 1/3] rust: add static_call support
On Thu, Jun 06, 2024 at 09:09:00PM +0200, Miguel Ojeda wrote: > On Thu, Jun 6, 2024 at 7:19 PM Peter Zijlstra wrote: > > > > This is absolutely unreadable gibberish -- how am I supposed to keep > > this in sync with the rest of the static_call infrastructure? > > Yeah, they are macros, which look different from "normal" Rust code. Macros like CPP ? > Is there something we could do to help here? I think Alice and others > would be happy to explain how it works and/or help maintain it in the > future if you prefer. Write a new language that looks more like C -- pretty please ? :-) Mostly I would just really like you to just use arm/jump_label.h, they're inline functions and should be doable with IR, no weirdo CPP involved in this case.
Re: [PATCH 3/3] rust: add tracepoint support
On Thu, Jun 06, 2024 at 12:00:36PM -0700, Boqun Feng wrote: > On Thu, Jun 06, 2024 at 07:35:44PM +0200, Peter Zijlstra wrote: > > On Thu, Jun 06, 2024 at 08:49:06AM -0700, Boqun Feng wrote: > > > > > Long-term plan is to 1) compile the C helpers in some IR and 2) inline > > > the helpers with Rust in IR-level, as what Gary has: > > > > > > > > > https://lore.kernel.org/rust-for-linux/20240529202817.3641974-1-g...@garyguo.net/ > > > > Urgh, that still needs us to maintain that silly list of helpers :-/ > > > > But it's an improvement from the current stage, right? ;-) Somewhat, but only marginal. > > Can't we pretty please have clang parse the actual header files into IR > > and munge that into rust? So that we don't get to manually duplicate > > everything+dog. > > That won't always work, because some of our kernel APIs are defined as > macros, and I don't think it's a trivial job to generate a macro > definition to a function definition so that it can be translated to > something in IR. We will have to do the macro -> function mapping > ourselves somewhere, if we want to inline the API across languages. We can try and see how far we can get with moving a bunch of stuff into inlines. There's quite a bit of simple CPP that could be inlines or const objects I suppose. Things like the tracepoints are of course glorious CPP abuse and are never going to work. But perhaps you can have an explicit 'eval-CPP on this here' construct or whatnot. If I squit I see this paste! thingy (WTF's up with that ! operator?) to munge function names in the static_call thing. So something like apply CPP from over there on this here can also be done :-)
Re: [PATCH] remoteproc: mediatek: Don't print error when optional scp handle is missing
On Thu, Jun 06, 2024 at 12:50:56PM +0200, AngeloGioacchino Del Regno wrote: > Il 05/06/24 21:35, Nícolas F. R. A. Prado ha scritto: > > The scp_get() helper has two users: the mtk-vcodec and the mtk-mdp3 > > drivers. mdp3 considers the mediatek,scp phandle optional, and when it's > > missing mdp3 will directly look for the scp node based on compatible. > > > > For that reason printing an error in the helper when the handle is > > missing is wrong, as it's not always an actual error. Besides, the other > > user, vcodec, already prints an error message on its own when the helper > > fails so this message doesn't add that much information. > > > > Remove the error message from the helper. This gets rid of the deceptive > > error on MT8195-Tomato: > > > >mtk-mdp3 14001000.dma-controller: can't get SCP node > > I'm way more for giving it a SCP handle instead of silencing this print... the > SCP handle way *is* the correct one and I prefer it, as that'd also be the > only > way to support Dual-Core SCP in MDP3. > > I really want to see hardcoded stuff disappear and I really really *really* > want > to see more consistency across DT nodes in MediaTek platforms. > > So, still a one-line change, but do it on the devicetree instead of here > please. Sure. So the missing scp phandle case is maintained exclusively for backwards-compatibility. And we're ok that the old DTs will keep getting the error. I'll send a v2 adding the phandle in the DT instead. Thanks, Nícolas
Re: [PATCH 1/3] rust: add static_call support
On Thu, Jun 6, 2024 at 7:19 PM Peter Zijlstra wrote: > > This is absolutely unreadable gibberish -- how am I supposed to keep > this in sync with the rest of the static_call infrastructure? Yeah, they are macros, which look different from "normal" Rust code. Is there something we could do to help here? I think Alice and others would be happy to explain how it works and/or help maintain it in the future if you prefer. Cheers, Miguel
Re: [PATCH v3 4/6] module: script to generate offset ranges for builtin modules
On Tue, May 21, 2024 at 01:53:27AM +0900, Masahiro Yamada wrote: > On Fri, May 17, 2024 at 1:31???PM Kris Van Hees > wrote: > > > > The offset range data for builtin modules is generated using: > > - modules.builtin.modinfo: associates object files with module names > > - vmlinux.map: provides load order of sections and offset of first member > > per section > > - vmlinux.o.map: provides offset of object file content per section > > - .*.cmd: build cmd file with KBUILD_MODFILE and KBUILD_MODNAME > > > > The generated data will look like: > > > > .text - = _text > > .text baf0-cb10 amd_uncore > > .text 0009bd10-0009c8e0 iosf_mbi > > ... > > .text 008e6660-008e9630 snd_soc_wcd_mbhc > > .text 008e9630-008ea610 snd_soc_wcd9335 snd_soc_wcd934x snd_soc_wcd938x > > .text 008ea610-008ea780 snd_soc_wcd9335 > > ... > > .data - = _sdata > > .data f020-f680 amd_uncore > > > > For each ELF section, it lists the offset of the first symbol. This can > > be used to determine the base address of the section at runtime. > > > > Next, it lists (in strict ascending order) offset ranges in that section > > that cover the symbols of one or more builtin modules. Multiple ranges > > can apply to a single module, and ranges can be shared between modules. > > > > Signed-off-by: Kris Van Hees > > Reviewed-by: Nick Alcock > > --- > > Changes since v2: > > - 1st arg to generate_builtin_ranges.awk is now modules.builtin.modinfo > > - Switched from using modules.builtin.objs to parsing .*.cmd files > > - Parse data from .*.cmd in generate_builtin_ranges.awk > > --- > > scripts/generate_builtin_ranges.awk | 232 > > 1 file changed, 232 insertions(+) > > create mode 100755 scripts/generate_builtin_ranges.awk > > > > diff --git a/scripts/generate_builtin_ranges.awk > > b/scripts/generate_builtin_ranges.awk > > new file mode 100755 > > index 0..6975a9c7266d9 > > --- /dev/null > > +++ b/scripts/generate_builtin_ranges.awk > > @@ -0,0 +1,232 @@ > > +#!/usr/bin/gawk -f > > +# SPDX-License-Identifier: GPL-2.0 > > +# generate_builtin_ranges.awk: Generate address range data for builtin > > modules > > +# Written by Kris Van Hees > > +# > > +# Usage: generate_builtin_ranges.awk modules.builtin.modinfo vmlinux.map \ > > +# vmlinux.o.map > modules.builtin.ranges > > +# > > + > > +BEGIN { > > + # modules.builtin.modinfo uses \0 as record separator > > + # All other files use \n. > > + RS = "[\n\0]"; > > +} > > > Why do you want to parse modules.builtin.modinfo > instead of modules.builtin? > > modules.builtin uses \n separator. Oh my, I completely overlooked modules.builtin. Thank you! That is indeed much easier. > > + > > +# Return the module name(s) (if any) associated with the given object. > > +# > > +# If we have seen this object before, return information from the cache. > > +# Otherwise, retrieve it from the corresponding .cmd file. > > +# > > +function get_module_info(fn, mod, obj, mfn, s) { > > > There are 5 arguments, while the caller passes only 1 argument > ( get_module_info($4) ) That is the way to be able to have local variables in an AWK function - every variable mentioned in the function declaration is local to the function. It is an oddity in AWK. > > + if (fn in omod) > > + return omod[fn]; > > + > > + if (match(fn, /\/[^/]+$/) == 0) > > + return ""; > > + > > + obj = fn; > > + mod = ""; > > + mfn = ""; > > + fn = substr(fn, 1, RSTART) "." substr(fn, RSTART + 1) ".cmd"; > > + if (getline s > + if (match(s, /DKBUILD_MODNAME=[^ ]+/) > 0) { > > + mod = substr(s, RSTART + 17, RLENGTH - 17); > > + gsub(/['"]/, "", mod); > > + gsub(/:/, " ", mod); > > + } > > + > > + if (match(s, /DKBUILD_MODFILE=[^ ]+/) > 0) { > > + mfn = substr(s, RSTART + 17, RLENGTH - 17); > > + gsub(/['"]/, "", mfn); > > + gsub(/:/, " ", mfn); > > + } > > + } > > + close(fn); > > + > > +# tmp = $0; > > +# $0 = s; > > +# print mod " " mfn " " obj " " $NF; > > +# $0 = tmp; > > + > > + # A single module (common case) also reflects objects that are not > > part > > + # of a module. Some of those objects have names that are also a > > module > > + # name (e.g. core). We check the associated module file name, and > > if > > + # they do not match, the object is not part of a module. > > > You do not need to use KBUILD_MODNAME. > > Just use KBUILD_MODFILE only. > If the same path is found in modules.builtin, > it is a built-in module. > > Its basename is modname. Yes, that is true. I can do all this based on KBUILD_MODFILE. Thank you. Adjusting the patch that way. > One more question in a corner case. > > How does your
Re: [PATCH 3/3] rust: add tracepoint support
On Thu, Jun 06, 2024 at 07:35:44PM +0200, Peter Zijlstra wrote: > On Thu, Jun 06, 2024 at 08:49:06AM -0700, Boqun Feng wrote: > > > Long-term plan is to 1) compile the C helpers in some IR and 2) inline > > the helpers with Rust in IR-level, as what Gary has: > > > > > > https://lore.kernel.org/rust-for-linux/20240529202817.3641974-1-g...@garyguo.net/ > > Urgh, that still needs us to maintain that silly list of helpers :-/ > But it's an improvement from the current stage, right? ;-) > Can't we pretty please have clang parse the actual header files into IR > and munge that into rust? So that we don't get to manually duplicate > everything+dog. That won't always work, because some of our kernel APIs are defined as macros, and I don't think it's a trivial job to generate a macro definition to a function definition so that it can be translated to something in IR. We will have to do the macro -> function mapping ourselves somewhere, if we want to inline the API across languages. Regards, Boqun
Re: [PATCH v2 2/5] ftrace: Remove "ftrace_hash" parameter from __ftrace_hash_rec_update()
On Thu, 6 Jun 2024 18:53:12 +0100 Mark Rutland wrote: > On Wed, Jun 05, 2024 at 02:03:36PM -0400, Steven Rostedt wrote: > > From: "Steven Rostedt (Google)" > > > > While adding comments to the function __ftrace_hash_rec_update() and > > trying to describe in detail what the parameter for "ftrace_hash" does, I > > realized that it basically does exactly the same thing (but differently) > > if it is set or not! > > Typo: "ftrace_hash" should be "filter_hash", and likewise in the commit > title. Let me go fix up linux-next :-p > > > If it is set, the idea was the ops->filter_hash was being updated, and the > > code should focus on the functions that are in the ops->filter_hash and > > add them. But it still had to pay attention to the functions in the > > ops->notrace_hash, to ignore them. > > > > If it was cleared, it focused on the ops->notrace_hash, and would add > > functions that were not in the ops->notrace_hash but would still keep > > functions in the "ops->filter_hash". Basically doing the same thing. > > > > In reality, the __ftrace_hash_rec_update() only needs to either remove the > > functions associated to the give ops (if "inc" is set) or remove them (if > > "inc" is cleared). It has to pay attention to both the filter_hash and > > notrace_hash regardless. > > AFAICT, we actually remove a latent bug here, but that bug wasn't > reachable because we never call __ftrace_hash_rec_update() with > "filter_hash" clear. > > Before this patch, if we did call __ftrace_hash_rec_update() with > "filter_hash" clear, we'd setup: > > all = false; > ... > if (filter_hash) { > ... > } else { > hash = ops->func_hash->notrace_hash; > other_hash = ops->func_hash->filter_hash; > } > > ... and then at the tail of the loop where we do: > > count++; > > [...] > > /* Shortcut, if we handled all records, we are done. */ > if (!all && count == hash->count) { > pr_info("HARK: stopping after %d recs\n", count); > return update; > } > > ... we'd be checking whether we've updated notrace_hash->count entries, > when that could be smaller than the number of entries we actually need > to update (e.g. if the notrace hash contains a single entry, but the > filter_hash contained more). I noticed this too as well as: if (filter_hash) { hash = ops->func_hash->filter_hash; other_hash = ops->func_hash->notrace_hash; if (ftrace_hash_empty(hash)) all = true; } else { inc = !inc; hash = ops->func_hash->notrace_hash; other_hash = ops->func_hash->filter_hash; /* * If the notrace hash has no items, * then there's nothing to do. */ if (ftrace_hash_empty(hash)) return false; } That "return false" is actually a mistake as well. But I tried to hit it, but found that I could not. I think I updated the code due to bugs where I prevent that from happening, but the real fix would have been this patch. :-p > > As above, we never actually hit that because we never call with > "filter_hash" clear, so it might be good to explicitly say that before > this patch we never actually call __ftrace_hash_rec_update() with > "filter_hash" clear, and this is removing dead (and potentially broken) > code. Right. > > > Remove the "filter_hash" parameter from __filter_hash_rec_update() and > > comment the function for what it really is doing. > > > > Signed-off-by: Steven Rostedt (Google) > > FWIW, this looks good to me, and works in testing, so: > > Reviewed-by: Mark Rutland > Tested-by: Mark Rutland > > I have one comment below, but the above stands regardless. > > [...] > > > +/* > > + * This is the main engine to the ftrace updates to the dyn_ftrace records. > > + * > > + * It will iterate through all the available ftrace functions > > + * (the ones that ftrace can have callbacks to) and set the flags > > + * in the associated dyn_ftrace records. > > + * > > + * @inc: If true, the functions associated to @ops are added to > > + * the dyn_ftrace records, otherwise they are removed. > > + */ > > static bool __ftrace_hash_rec_update(struct ftrace_ops *ops, > > -int filter_hash, > > bool inc) > > { > > struct ftrace_hash *hash; > > - struct ftrace_hash *other_hash; > > + struct ftrace_hash *notrace_hash; > > struct ftrace_page *pg; > > struct dyn_ftrace *rec; > > bool update = false; > > @@ -1718,35 +1725,16 @@ static bool __ftrace_hash_rec_update(struct > > ftrace_ops *ops, > > return false; > > > > /* > > -* In the filter_hash case: > > * If the count is zero, we update all records. > > * Otherwise we just update the
Re: [PATCH v2 5/5] ftrace: Add comments to ftrace_hash_move() and friends
On Wed, Jun 05, 2024 at 02:03:39PM -0400, Steven Rostedt wrote: > From: "Steven Rostedt (Google)" > > Describe what ftrace_hash_move() does and add some more comments to some > other functions to make it easier to understand. > > Signed-off-by: Steven Rostedt (Google) Acked-by: Mark Rutland Mark. > --- > kernel/trace/ftrace.c | 24 +++- > 1 file changed, 23 insertions(+), 1 deletion(-) > > diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c > index 83f23f8fc26d..ae1603e771c5 100644 > --- a/kernel/trace/ftrace.c > +++ b/kernel/trace/ftrace.c > @@ -169,6 +169,7 @@ static inline void ftrace_ops_init(struct ftrace_ops *ops) > #endif > } > > +/* Call this function for when a callback filters on set_ftrace_pid */ > static void ftrace_pid_func(unsigned long ip, unsigned long parent_ip, > struct ftrace_ops *op, struct ftrace_regs *fregs) > { > @@ -1317,7 +1318,7 @@ static struct ftrace_hash *alloc_ftrace_hash(int > size_bits) > return hash; > } > > - > +/* Used to save filters on functions for modules not loaded yet */ > static int ftrace_add_mod(struct trace_array *tr, > const char *func, const char *module, > int enable) > @@ -1429,6 +1430,7 @@ static struct ftrace_hash *__move_hash(struct > ftrace_hash *src, int size) > return new_hash; > } > > +/* Move the @src entries to a newly allocated hash */ > static struct ftrace_hash * > __ftrace_hash_move(struct ftrace_hash *src) > { > @@ -1443,6 +1445,26 @@ __ftrace_hash_move(struct ftrace_hash *src) > return __move_hash(src, size); > } > > +/** > + * ftrace_hash_move - move a new hash to a filter and do updates > + * @ops: The ops with the hash that @dst points to > + * @enable: True if for the filter hash, false for the notrace hash > + * @dst: Points to the @ops hash that should be updated > + * @src: The hash to update @dst with > + * > + * This is called when an ftrace_ops hash is being updated and the > + * the kernel needs to reflect this. Note, this only updates the kernel > + * function callbacks if the @ops is enabled (not to be confused with > + * @enable above). If the @ops is enabled, its hash determines what > + * callbacks get called. This function gets called when the @ops hash > + * is updated and it requires new callbacks. > + * > + * On success the elements of @src is moved to @dst, and @dst is updated > + * properly, as well as the functions determined by the @ops hashes > + * are now calling the @ops callback function. > + * > + * Regardless of return type, @src should be freed with free_ftrace_hash(). > + */ > static int > ftrace_hash_move(struct ftrace_ops *ops, int enable, >struct ftrace_hash **dst, struct ftrace_hash *src) > -- > 2.43.0 > >
Re: [PATCH v2 4/5] ftrace: Convert "inc" parameter to bool in ftrace_hash_rec_update_modify()
On Wed, Jun 05, 2024 at 02:03:38PM -0400, Steven Rostedt wrote: > From: "Steven Rostedt (Google)" > > The parameter "inc" in the function ftrace_hash_rec_update_modify() is > boolean. Change it to be such. > > Also add documentation to what the function does. > > Signed-off-by: Steven Rostedt (Google) Acked-by: Mark Rutland Mark. > --- > kernel/trace/ftrace.c | 23 --- > 1 file changed, 20 insertions(+), 3 deletions(-) > > diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c > index 1a2444199740..83f23f8fc26d 100644 > --- a/kernel/trace/ftrace.c > +++ b/kernel/trace/ftrace.c > @@ -1882,7 +1882,24 @@ static bool ftrace_hash_rec_enable(struct ftrace_ops > *ops) > return __ftrace_hash_rec_update(ops, true); > } > > -static void ftrace_hash_rec_update_modify(struct ftrace_ops *ops, int inc) > +/* > + * This function will update what functions @ops traces when its filter > + * changes. > + * > + * The @inc states if the @ops callbacks are going to be added or removed. > + * When one of the @ops hashes are updated to a "new_hash" the dyn_ftrace > + * records are update via: > + * > + * ftrace_hash_rec_disable_modify(ops); > + * ops->hash = new_hash > + * ftrace_hash_rec_enable_modify(ops); > + * > + * Where the @ops is removed from all the records it is tracing using > + * its old hash. The @ops hash is updated to the new hash, and then > + * the @ops is added back to the records so that it is tracing all > + * the new functions. > + */ > +static void ftrace_hash_rec_update_modify(struct ftrace_ops *ops, bool inc) > { > struct ftrace_ops *op; > > @@ -1906,12 +1923,12 @@ static void ftrace_hash_rec_update_modify(struct > ftrace_ops *ops, int inc) > > static void ftrace_hash_rec_disable_modify(struct ftrace_ops *ops) > { > - ftrace_hash_rec_update_modify(ops, 0); > + ftrace_hash_rec_update_modify(ops, false); > } > > static void ftrace_hash_rec_enable_modify(struct ftrace_ops *ops) > { > - ftrace_hash_rec_update_modify(ops, 1); > + ftrace_hash_rec_update_modify(ops, true); > } > > /* > -- > 2.43.0 > >
Re: [PATCH v2 2/5] ftrace: Remove "ftrace_hash" parameter from __ftrace_hash_rec_update()
On Wed, Jun 05, 2024 at 02:03:36PM -0400, Steven Rostedt wrote: > From: "Steven Rostedt (Google)" > > While adding comments to the function __ftrace_hash_rec_update() and > trying to describe in detail what the parameter for "ftrace_hash" does, I > realized that it basically does exactly the same thing (but differently) > if it is set or not! Typo: "ftrace_hash" should be "filter_hash", and likewise in the commit title. > If it is set, the idea was the ops->filter_hash was being updated, and the > code should focus on the functions that are in the ops->filter_hash and > add them. But it still had to pay attention to the functions in the > ops->notrace_hash, to ignore them. > > If it was cleared, it focused on the ops->notrace_hash, and would add > functions that were not in the ops->notrace_hash but would still keep > functions in the "ops->filter_hash". Basically doing the same thing. > > In reality, the __ftrace_hash_rec_update() only needs to either remove the > functions associated to the give ops (if "inc" is set) or remove them (if > "inc" is cleared). It has to pay attention to both the filter_hash and > notrace_hash regardless. AFAICT, we actually remove a latent bug here, but that bug wasn't reachable because we never call __ftrace_hash_rec_update() with "filter_hash" clear. Before this patch, if we did call __ftrace_hash_rec_update() with "filter_hash" clear, we'd setup: all = false; ... if (filter_hash) { ... } else { hash = ops->func_hash->notrace_hash; other_hash = ops->func_hash->filter_hash; } ... and then at the tail of the loop where we do: count++; [...] /* Shortcut, if we handled all records, we are done. */ if (!all && count == hash->count) { pr_info("HARK: stopping after %d recs\n", count); return update; } ... we'd be checking whether we've updated notrace_hash->count entries, when that could be smaller than the number of entries we actually need to update (e.g. if the notrace hash contains a single entry, but the filter_hash contained more). As above, we never actually hit that because we never call with "filter_hash" clear, so it might be good to explicitly say that before this patch we never actually call __ftrace_hash_rec_update() with "filter_hash" clear, and this is removing dead (and potentially broken) code. > Remove the "filter_hash" parameter from __filter_hash_rec_update() and > comment the function for what it really is doing. > > Signed-off-by: Steven Rostedt (Google) FWIW, this looks good to me, and works in testing, so: Reviewed-by: Mark Rutland Tested-by: Mark Rutland I have one comment below, but the above stands regardless. [...] > +/* > + * This is the main engine to the ftrace updates to the dyn_ftrace records. > + * > + * It will iterate through all the available ftrace functions > + * (the ones that ftrace can have callbacks to) and set the flags > + * in the associated dyn_ftrace records. > + * > + * @inc: If true, the functions associated to @ops are added to > + * the dyn_ftrace records, otherwise they are removed. > + */ > static bool __ftrace_hash_rec_update(struct ftrace_ops *ops, > - int filter_hash, >bool inc) > { > struct ftrace_hash *hash; > - struct ftrace_hash *other_hash; > + struct ftrace_hash *notrace_hash; > struct ftrace_page *pg; > struct dyn_ftrace *rec; > bool update = false; > @@ -1718,35 +1725,16 @@ static bool __ftrace_hash_rec_update(struct > ftrace_ops *ops, > return false; > > /* > - * In the filter_hash case: >* If the count is zero, we update all records. >* Otherwise we just update the items in the hash. > - * > - * In the notrace_hash case: > - * We enable the update in the hash. > - * As disabling notrace means enabling the tracing, > - * and enabling notrace means disabling, the inc variable > - * gets inversed. >*/ > - if (filter_hash) { > - hash = ops->func_hash->filter_hash; > - other_hash = ops->func_hash->notrace_hash; > - if (ftrace_hash_empty(hash)) > - all = true; > - } else { > - inc = !inc; > - hash = ops->func_hash->notrace_hash; > - other_hash = ops->func_hash->filter_hash; > - /* > - * If the notrace hash has no items, > - * then there's nothing to do. > - */ > - if (ftrace_hash_empty(hash)) > - return false; > - } > + hash = ops->func_hash->filter_hash; > + notrace_hash = ops->func_hash->notrace_hash; > + if (ftrace_hash_empty(hash)) > + all = true; > > do_for_each_ftrace_rec(pg, rec) { > - int in_other_hash
Re: [PATCH 3/3] rust: add tracepoint support
On Thu, Jun 06, 2024 at 08:49:06AM -0700, Boqun Feng wrote: > Long-term plan is to 1) compile the C helpers in some IR and 2) inline > the helpers with Rust in IR-level, as what Gary has: > > > https://lore.kernel.org/rust-for-linux/20240529202817.3641974-1-g...@garyguo.net/ Urgh, that still needs us to maintain that silly list of helpers :-/ Can't we pretty please have clang parse the actual header files into IR and munge that into rust? So that we don't get to manually duplicate everything+dog.
Re: [PATCH 3/3] rust: add tracepoint support
On Thu, Jun 06, 2024 at 03:05:26PM +, Alice Ryhl wrote: > Make it possible to have Rust code call into tracepoints defined by C > code. It is still required that the tracepoint is declared in a C > header, and that this header is included in the input to bindgen. > > Signed-off-by: Alice Ryhl > --- > rust/bindings/bindings_helper.h | 1 + > rust/bindings/lib.rs| 15 +++ > rust/helpers.c | 24 +++ > rust/kernel/lib.rs | 1 + > rust/kernel/tracepoint.rs | 92 > + > 5 files changed, 133 insertions(+) > > diff --git a/rust/bindings/bindings_helper.h b/rust/bindings/bindings_helper.h > index ddb5644d4fd9..d442f9ccfc2c 100644 > --- a/rust/bindings/bindings_helper.h > +++ b/rust/bindings/bindings_helper.h > @@ -15,6 +15,7 @@ > #include > #include > #include > +#include > #include > #include > > diff --git a/rust/bindings/lib.rs b/rust/bindings/lib.rs > index 40ddaee50d8b..48856761d682 100644 > --- a/rust/bindings/lib.rs > +++ b/rust/bindings/lib.rs > @@ -48,3 +48,18 @@ mod bindings_helper { > } > > pub use bindings_raw::*; > + > +/// Rust version of the C macro `rcu_dereference_raw`. > +/// > +/// The rust helper only works with void pointers, but this wrapper method > makes it work with any > +/// pointer type using pointer casts. > +/// > +/// # Safety > +/// > +/// This method has the same safety requirements as the C macro of the same > name. > +#[inline(always)] > +pub unsafe fn rcu_dereference_raw(p: *const *mut T) -> *mut T { > +// SAFETY: This helper calls into the C macro, so the caller promises to > uphold the safety > +// requirements. > +unsafe { __rcu_dereference_raw(p as *mut *mut _) as *mut T } > +} > diff --git a/rust/helpers.c b/rust/helpers.c > index 2c37a0f5d7a8..0560cc2a512a 100644 > --- a/rust/helpers.c > +++ b/rust/helpers.c > @@ -165,6 +165,30 @@ rust_helper_krealloc(const void *objp, size_t new_size, > gfp_t flags) > } > EXPORT_SYMBOL_GPL(rust_helper_krealloc); > > +void rust_helper_preempt_enable_notrace(void) > +{ > + preempt_enable_notrace(); > +} > +EXPORT_SYMBOL_GPL(rust_helper_preempt_enable_notrace); > + > +void rust_helper_preempt_disable_notrace(void) > +{ > + preempt_disable_notrace(); > +} > +EXPORT_SYMBOL_GPL(rust_helper_preempt_disable_notrace); A notrace wrapper that is tracable, lol. > +bool rust_helper_current_cpu_online(void) > +{ > + return cpu_online(raw_smp_processor_id()); > +} > +EXPORT_SYMBOL_GPL(rust_helper_current_cpu_online); > + > +void *rust_helper___rcu_dereference_raw(void **p) > +{ > + return rcu_dereference_raw(p); > +} > +EXPORT_SYMBOL_GPL(rust_helper___rcu_dereference_raw); I'm going to keep yelling and objecting to these wrappers. Fix bindgen already. Or whatever is needed to get it to interoperate with C inline / asm. NAK NAK NAK
Re: [PATCH 2/3] rust: add static_key_false
On Thu, Jun 06, 2024 at 03:05:25PM +, Alice Ryhl wrote: > Add just enough support for static key so that we can use it from > tracepoints. Tracepoints rely on `static_key_false` even though it is > deprecated, so we add the same functionality to Rust. Urgh, more unreadable gibberish :-( Can we please get bindgen to translate asm/jump_table.h instead of randomly duplicating random bits. I really think that whoever created rust was an esoteric language freak. Hideous crap :-(.
Re: [PATCH 1/3] rust: add static_call support
On Thu, Jun 06, 2024 at 03:05:24PM +, Alice Ryhl wrote: > Add static_call support by mirroring how C does. When the platform does > not support static calls (right now, that means that it is not x86), > then the function pointer is loaded from a global and called. Otherwise, > we generate a call to a trampoline function, and objtool is used to make > these calls patchable at runtime. This is absolutely unreadable gibberish -- how am I supposed to keep this in sync with the rest of the static_call infrastructure? > Signed-off-by: Alice Ryhl > --- > rust/kernel/lib.rs | 1 + > rust/kernel/static_call.rs | 92 > ++ > 2 files changed, 93 insertions(+) > > diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs > index fbd91a48ff8b..d534b1178955 100644 > --- a/rust/kernel/lib.rs > +++ b/rust/kernel/lib.rs > @@ -38,6 +38,7 @@ > pub mod prelude; > pub mod print; > mod static_assert; > +pub mod static_call; > #[doc(hidden)] > pub mod std_vendor; > pub mod str; > diff --git a/rust/kernel/static_call.rs b/rust/kernel/static_call.rs > new file mode 100644 > index ..f7b8ba7bf1fb > --- /dev/null > +++ b/rust/kernel/static_call.rs > @@ -0,0 +1,92 @@ > +// SPDX-License-Identifier: GPL-2.0 > + > +// Copyright (C) 2024 Google LLC. > + > +//! Logic for static calls. > + > +#[macro_export] > +#[doc(hidden)] > +macro_rules! ty_underscore_for { > +($arg:expr) => { > +_ > +}; > +} > + > +#[doc(hidden)] > +#[repr(transparent)] > +pub struct AddressableStaticCallKey { > +_ptr: *const bindings::static_call_key, > +} > +unsafe impl Sync for AddressableStaticCallKey {} > +impl AddressableStaticCallKey { > +pub const fn new(ptr: *const bindings::static_call_key) -> Self { > +Self { _ptr: ptr } > +} > +} > + > +#[cfg(CONFIG_HAVE_STATIC_CALL)] > +#[doc(hidden)] > +#[macro_export] > +macro_rules! _static_call { > +($name:ident($($args:expr),* $(,)?)) => {{ > +// Symbol mangling will give this symbol a unique name. > +#[cfg(CONFIG_HAVE_STATIC_CALL_INLINE)] > +#[link_section = ".discard.addressable"] > +#[used] > +static __ADDRESSABLE: $crate::static_call::AddressableStaticCallKey > = unsafe { > + > $crate::static_call::AddressableStaticCallKey::new(::core::ptr::addr_of!( > +$crate::macros::paste! { $crate::bindings:: [<__SCK__ $name > >]; } > +)) > +}; > + > +let fn_ptr: unsafe extern "C" > fn($($crate::static_call::ty_underscore_for!($args)),*) -> _ = > +$crate::macros::paste! { $crate::bindings:: [<__SCT__ $name >]; > }; > +(fn_ptr)($($args),*) > +}}; > +} > + > +#[cfg(not(CONFIG_HAVE_STATIC_CALL))] > +#[doc(hidden)] > +#[macro_export] > +macro_rules! _static_call { > +($name:ident($($args:expr),* $(,)?)) => {{ > +let void_ptr_fn: *mut ::core::ffi::c_void = > +$crate::macros::paste! { $crate::bindings:: [<__SCK__ $name >]; > }.func; > + > +let fn_ptr: unsafe extern "C" > fn($($crate::static_call::ty_underscore_for!($args)),*) -> _ = > +if true { > +::core::mem::transmute(void_ptr_fn) > +} else { > +// This is dead code, but it influences type inference on > `fn_ptr` so that we > +// transmute the function pointer to the right type. > +$crate::macros::paste! { $crate::bindings:: [<__SCT__ $name > >]; } > +}; > + > +(fn_ptr)($($args),*) > +}}; > +} > + > +/// Statically call a global function. > +/// > +/// # Safety > +/// > +/// This macro will call the provided function. It is up to the caller to > uphold the safety > +/// guarantees of the function. > +/// > +/// # Examples > +/// > +/// ```ignore > +/// fn call_static() { > +/// unsafe { > +/// static_call! { your_static_call() }; > +/// } > +/// } > +/// ``` > +#[macro_export] > +macro_rules! static_call { > +// Forward to the real implementation. Separated like this so that we > don't have to duplicate > +// the documentation. > +($($args:tt)*) => { $crate::static_call::_static_call! { $($args)* } }; > +} > + > +pub use {_static_call, static_call, ty_underscore_for}; > > -- > 2.45.2.505.gda0bf45e8d-goog >
Re: [PATCH] nvdimm: Fix devs leaks in scan_labels()
Li Zhijian wrote: > Don't allocate devs again when it's valid pointer which has pionted to > the memory allocated above with size (count + 2 * sizeof(dev)). > > A kmemleak reports: > unreferenced object 0x88800dda1980 (size 16): > comm "kworker/u10:5", pid 69, jiffies 4294671781 > hex dump (first 16 bytes): > 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > backtrace (crc 0): > [] __kmalloc+0x32c/0x470 > [<9ed43c83>] nd_region_register_namespaces+0x6fb/0x1120 > [libnvdimm] > [<0e07a65c>] nd_region_probe+0xfe/0x210 [libnvdimm] > [<7b79ce5f>] nvdimm_bus_probe+0x7a/0x1e0 [libnvdimm] > [ ] really_probe+0xc6/0x390 > [<129e2a69>] __driver_probe_device+0x78/0x150 > [<2dfed28b>] driver_probe_device+0x1e/0x90 > [ ] __device_attach_driver+0x85/0x110 > [<32dca295>] bus_for_each_drv+0x85/0xe0 > [<391c5a7d>] __device_attach+0xbe/0x1e0 > [<26dabec0>] bus_probe_device+0x94/0xb0 > [ ] device_add+0x656/0x870 > [<3d69bfaa>] nd_async_device_register+0xe/0x50 [libnvdimm] > [<3f4c52a4>] async_run_entry_fn+0x2e/0x110 > [ ] process_one_work+0x1ee/0x600 > [<6d90d5a9>] worker_thread+0x183/0x350 > > Fixes: 1b40e09a1232 ("libnvdimm: blk labels and namespace instantiation") > Signed-off-by: Li Zhijian > --- > drivers/nvdimm/namespace_devs.c | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > diff --git a/drivers/nvdimm/namespace_devs.c b/drivers/nvdimm/namespace_devs.c > index d6d558f94d6b..56b016dbe307 100644 > --- a/drivers/nvdimm/namespace_devs.c > +++ b/drivers/nvdimm/namespace_devs.c > @@ -1994,7 +1994,9 @@ static struct device **scan_labels(struct nd_region > *nd_region) > /* Publish a zero-sized namespace for userspace to configure. */ > nd_mapping_free_labels(nd_mapping); > > - devs = kcalloc(2, sizeof(dev), GFP_KERNEL); > + /* devs probably has been allocated */ I don't think this is where the bug is. The loop above is processing the known labels and should exit with a count > 0 if devs is not NULL. >From what I can tell create_namespace_pmem() must be returning EAGAIN which leaves devs allocated but fails to increment count. Thus there are no valid labels but devs was not free'ed. Can you trace the error you are seeing a bit more to see if this is the case? Thanks, Ira > + if (!devs) > + devs = kcalloc(2, sizeof(dev), GFP_KERNEL); > if (!devs) > goto err; > > -- > 2.29.2 >
Re: [RFC bpf-next 01/10] uprobe: Add session callbacks to uprobe_consumer
On Thu, Jun 6, 2024 at 9:46 AM Jiri Olsa wrote: > > On Wed, Jun 05, 2024 at 10:50:11PM +0200, Jiri Olsa wrote: > > On Wed, Jun 05, 2024 at 07:56:19PM +0200, Oleg Nesterov wrote: > > > On 06/05, Andrii Nakryiko wrote: > > > > > > > > so any such > > > > limitations will cause problems, issue reports, investigation, etc. > > > > > > Agreed... > > > > > > > As one possible solution, what if we do > > > > > > > > struct return_instance { > > > > ... > > > > u64 session_cookies[]; > > > > }; > > > > > > > > and allocate sizeof(struct return_instance) + 8 * > > > > and then at runtime pass > > > > _cookies[i] as data pointer to session-aware callbacks? > > > > > > I too thought about this, but I guess it is not that simple. > > > > > > Just for example. Suppose we have 2 session-consumers C1 and C2. > > > What if uprobe_unregister(C1) comes before the probed function > > > returns? > > > > > > We need something like map_cookie_to_consumer(). > > > > I guess we could have hash table in return_instance that gets 'consumer -> > > cookie' ? > > ok, hash table is probably too big for this.. I guess some solution that > would iterate consumers and cookies made sure it matches would be fine > Yes, I was hoping to avoid hash tables for this, and in the common case have no added overhead. > jirka > > > > > return instance is freed after the consumers' return handlers are executed, > > so there's no leak if some consumer gets unregistered before that > > > > > > > > > > + /* The handler_session callback return value controls > > > > > execution of > > > > > +* the return uprobe and ret_handler_session callback. > > > > > +* 0 on success > > > > > +* 1 on failure, DO NOT install/execute the return uprobe > > > > > +*console warning for anything else > > > > > +*/ > > > > > + int (*handler_session)(struct uprobe_consumer *self, struct > > > > > pt_regs *regs, > > > > > + unsigned long *data); > > > > > + int (*ret_handler_session)(struct uprobe_consumer *self, > > > > > unsigned long func, > > > > > + struct pt_regs *regs, unsigned > > > > > long *data); > > > > > + > > > > > > > > We should try to avoid an alternative set of callbacks, IMO. Let's > > > > extend existing ones with `unsigned long *data`, > > > > > > Oh yes, agreed. > > > > > > And the comment about the return value looks confusing too. I mean, the > > > logic doesn't differ from the ret-code from ->handler(). > > > > > > "DO NOT install/execute the return uprobe" is not true if another > > > non-session-consumer returns 0. > > > > well they are meant to be exclusive, so there'd be no other > > non-session-consumer > > > > jirka
Re: [RFC bpf-next 01/10] uprobe: Add session callbacks to uprobe_consumer
On Wed, Jun 05, 2024 at 10:50:11PM +0200, Jiri Olsa wrote: > On Wed, Jun 05, 2024 at 07:56:19PM +0200, Oleg Nesterov wrote: > > On 06/05, Andrii Nakryiko wrote: > > > > > > so any such > > > limitations will cause problems, issue reports, investigation, etc. > > > > Agreed... > > > > > As one possible solution, what if we do > > > > > > struct return_instance { > > > ... > > > u64 session_cookies[]; > > > }; > > > > > > and allocate sizeof(struct return_instance) + 8 * > > > and then at runtime pass > > > _cookies[i] as data pointer to session-aware callbacks? > > > > I too thought about this, but I guess it is not that simple. > > > > Just for example. Suppose we have 2 session-consumers C1 and C2. > > What if uprobe_unregister(C1) comes before the probed function > > returns? > > > > We need something like map_cookie_to_consumer(). > > I guess we could have hash table in return_instance that gets 'consumer -> > cookie' ? ok, hash table is probably too big for this.. I guess some solution that would iterate consumers and cookies made sure it matches would be fine jirka > > return instance is freed after the consumers' return handlers are executed, > so there's no leak if some consumer gets unregistered before that > > > > > > > + /* The handler_session callback return value controls execution > > > > of > > > > +* the return uprobe and ret_handler_session callback. > > > > +* 0 on success > > > > +* 1 on failure, DO NOT install/execute the return uprobe > > > > +*console warning for anything else > > > > +*/ > > > > + int (*handler_session)(struct uprobe_consumer *self, struct > > > > pt_regs *regs, > > > > + unsigned long *data); > > > > + int (*ret_handler_session)(struct uprobe_consumer *self, > > > > unsigned long func, > > > > + struct pt_regs *regs, unsigned long > > > > *data); > > > > + > > > > > > We should try to avoid an alternative set of callbacks, IMO. Let's > > > extend existing ones with `unsigned long *data`, > > > > Oh yes, agreed. > > > > And the comment about the return value looks confusing too. I mean, the > > logic doesn't differ from the ret-code from ->handler(). > > > > "DO NOT install/execute the return uprobe" is not true if another > > non-session-consumer returns 0. > > well they are meant to be exclusive, so there'd be no other > non-session-consumer > > jirka
Re: [PATCH 0/3] Tracepoints and static branch/call in Rust
On 2024-06-06 11:46, Alice Ryhl wrote: On Thu, Jun 6, 2024 at 5:25 PM Mathieu Desnoyers wrote: On 2024-06-06 11:05, Alice Ryhl wrote: This implementation implements support for static keys in Rust so that the actual static branch will end up in the Rust object file. However, it would also be possible to just wrap the trace_##name generated by __DECLARE_TRACE in an extern C function and then call that from Rust. This will simplify the Rust code by removing the need for static branches and calls, but it places the static branch behind an external call, which has performance implications. The tracepoints try very hard to minimize overhead of dormant tracepoints so it is not frowned-upon to have them built into production binaries. This is needed to make sure distribution vendors keep those tracepoints in the kernel binaries that reach end-users. Adding a function call before evaluation of the static branch goes against this major goal. A possible middle ground would be to place just the __DO_TRACE body in an extern C function and to implement the Rust wrapper by doing the static branch in Rust, and then calling into C the code that contains __DO_TRACE when the tracepoint is active. However, this would need some changes to include/linux/tracepoint.h to generate and export a function containing the body of __DO_TRACE when the tracepoint should be callable from Rust. This tradeoff is more acceptable than having a function call before evaluation of the static branch, but I wonder what is the upside of this tradeoff compared to inlining the whole __DO_TRACE in Rust ? So in general, there is a tradeoff between placing parts of the tracepoint (which is perf sensitive) behind an external call, and having code duplicated in both C and Rust (which must be kept in sync when changes are made). This is an important point that I would like feedback on from the C maintainers. I don't see how the duplication happens there: __DO_TRACE is meant to be inlined into each C tracepoint caller site, so the code is already meant to be duplicated. Having an explicit function wrapping the tracepoint for Rust would just create an extra instance of __DO_TRACE if it happens to be also inlined into C code. Or do you meant you would like to prevent having to duplicate the implementation of __DO_TRACE in both C and Rust ? I'm not sure if you mean to prevent source code duplication between C and Rust or duplication of binary code (instructions). It's a question of maintenance burden. If you change how __DO_TRACE is implemented, then those changes must also be reflected in the Rust version. There's no issue in the binary code. As long as it is only __DO_TRACE that is duplicated between C and Rust, I don't see it as a large burden. Thanks, Mathieu -- Mathieu Desnoyers EfficiOS Inc. https://www.efficios.com
Re: [PATCH] nvdimm: Fix devs leaks in scan_labels()
On 6/3/24 8:16 PM, Li Zhijian wrote: > Don't allocate devs again when it's valid pointer which has pionted to > the memory allocated above with size (count + 2 * sizeof(dev)). > > A kmemleak reports: > unreferenced object 0x88800dda1980 (size 16): > comm "kworker/u10:5", pid 69, jiffies 4294671781 > hex dump (first 16 bytes): > 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > backtrace (crc 0): > [] __kmalloc+0x32c/0x470 > [<9ed43c83>] nd_region_register_namespaces+0x6fb/0x1120 > [libnvdimm] > [<0e07a65c>] nd_region_probe+0xfe/0x210 [libnvdimm] > [<7b79ce5f>] nvdimm_bus_probe+0x7a/0x1e0 [libnvdimm] > [ ] really_probe+0xc6/0x390 > [<129e2a69>] __driver_probe_device+0x78/0x150 > [<2dfed28b>] driver_probe_device+0x1e/0x90 > [ ] __device_attach_driver+0x85/0x110 > [<32dca295>] bus_for_each_drv+0x85/0xe0 > [<391c5a7d>] __device_attach+0xbe/0x1e0 > [<26dabec0>] bus_probe_device+0x94/0xb0 > [ ] device_add+0x656/0x870 > [<3d69bfaa>] nd_async_device_register+0xe/0x50 [libnvdimm] > [<3f4c52a4>] async_run_entry_fn+0x2e/0x110 > [ ] process_one_work+0x1ee/0x600 > [<6d90d5a9>] worker_thread+0x183/0x350 > > Fixes: 1b40e09a1232 ("libnvdimm: blk labels and namespace instantiation") > Signed-off-by: Li Zhijian > --- > drivers/nvdimm/namespace_devs.c | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > diff --git a/drivers/nvdimm/namespace_devs.c b/drivers/nvdimm/namespace_devs.c > index d6d558f94d6b..56b016dbe307 100644 > --- a/drivers/nvdimm/namespace_devs.c > +++ b/drivers/nvdimm/namespace_devs.c > @@ -1994,7 +1994,9 @@ static struct device **scan_labels(struct nd_region > *nd_region) > /* Publish a zero-sized namespace for userspace to configure. */ > nd_mapping_free_labels(nd_mapping); > > - devs = kcalloc(2, sizeof(dev), GFP_KERNEL); > + /* devs probably has been allocated */ > + if (!devs) > + devs = kcalloc(2, sizeof(dev), GFP_KERNEL); This changes the behavior of this code and possibly corrupting the previously allocated memory at times when 'devs' is valid. Was the 'devs' leaked out from the previous loop and should be freed instead? > if (!devs) > goto err; >
Re: [PATCH 2/3] rust: add static_key_false
On Thu, Jun 6, 2024 at 5:37 PM Mathieu Desnoyers wrote: > > On 2024-06-06 11:05, Alice Ryhl wrote: > > Add just enough support for static key so that we can use it from > > tracepoints. Tracepoints rely on `static_key_false` even though it is > > deprecated, so we add the same functionality to Rust. > > > > Signed-off-by: Alice Ryhl > > --- > > rust/kernel/lib.rs| 1 + > > rust/kernel/static_key.rs | 87 > > +++ > > scripts/Makefile.build| 2 +- > > 3 files changed, 89 insertions(+), 1 deletion(-) > > > > diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs > > index d534b1178955..22e1fedd0774 100644 > > --- a/rust/kernel/lib.rs > > +++ b/rust/kernel/lib.rs > > @@ -39,6 +39,7 @@ > > pub mod print; > > mod static_assert; > > pub mod static_call; > > +pub mod static_key; > > #[doc(hidden)] > > pub mod std_vendor; > > pub mod str; > > diff --git a/rust/kernel/static_key.rs b/rust/kernel/static_key.rs > > new file mode 100644 > > index ..6c3dbe14c98a > > --- /dev/null > > +++ b/rust/kernel/static_key.rs > > @@ -0,0 +1,87 @@ > > +// SPDX-License-Identifier: GPL-2.0 > > This static key code is something that can be generally useful > both in kernel and user-space. Is there anything that would prevent > licensing this under MIT right away instead so it could eventually be > re-used more broadly in userspace as well ? I would not mind. Alice
Re: [PATCH 3/3] rust: add tracepoint support
On 2024-06-06 12:16, Alice Ryhl wrote: On Thu, Jun 6, 2024 at 5:29 PM Mathieu Desnoyers wrote: On 2024-06-06 11:05, Alice Ryhl wrote: Make it possible to have Rust code call into tracepoints defined by C code. It is still required that the tracepoint is declared in a C header, and that this header is included in the input to bindgen. Signed-off-by: Alice Ryhl [...] diff --git a/rust/helpers.c b/rust/helpers.c index 2c37a0f5d7a8..0560cc2a512a 100644 --- a/rust/helpers.c +++ b/rust/helpers.c @@ -165,6 +165,30 @@ rust_helper_krealloc(const void *objp, size_t new_size, gfp_t flags) } EXPORT_SYMBOL_GPL(rust_helper_krealloc); +void rust_helper_preempt_enable_notrace(void) +{ + preempt_enable_notrace(); +} +EXPORT_SYMBOL_GPL(rust_helper_preempt_enable_notrace); + +void rust_helper_preempt_disable_notrace(void) +{ + preempt_disable_notrace(); +} +EXPORT_SYMBOL_GPL(rust_helper_preempt_disable_notrace); + +bool rust_helper_current_cpu_online(void) +{ + return cpu_online(raw_smp_processor_id()); +} +EXPORT_SYMBOL_GPL(rust_helper_current_cpu_online); + +void *rust_helper___rcu_dereference_raw(void **p) +{ + return rcu_dereference_raw(p); +} +EXPORT_SYMBOL_GPL(rust_helper___rcu_dereference_raw); Ouch. Doing a function call for each of those small operations will have a rather large performance impact when tracing is active. If it is not possible to inline those in Rust, then implementing __DO_TRACE in a C function would at least allow Rust to only do a single call to C rather than go back and forth between Rust and C. What prevents inlining those helpers in Rust ? There's nothing fundamental that prevents it. But they contain a large amount of architecture specific code, which makes them a significant amount of work to reimplement in Rust. For example, rcu_dereference_raw calls into READ_ONCE. READ_ONCE is usually a volatile load, but under arm64+LTO, you get a bunch of inline assembly that relies on ALTERNATIVE for feature detection: https://elixir.bootlin.com/linux/v6.9/source/arch/arm64/include/asm/rwonce.h#L36 And preempt_enable_notrace has a similar story. The solution that Boqun mentions is nice, but it relies on rustc and clang using the same version of LLVM. You are unlikely to have compilers with matching LLVMs unless you intentionally take steps to make that happen. But yes, perhaps these helpers are an argument to have a single call for __DO_TRACE instead. If those helpers end up being inlined into Rust with the solution pointed to by Boqun, then it makes sense to implement __DO_TRACE in Rust. Otherwise doing a single call to C would be more efficient than calling each of the helpers individually. Thanks, Mathieu Alice -- Mathieu Desnoyers EfficiOS Inc. https://www.efficios.com
Re: [PATCH 3/3] rust: add tracepoint support
On 2024-06-06 11:49, Boqun Feng wrote: On Thu, Jun 06, 2024 at 11:30:03AM -0400, Mathieu Desnoyers wrote: On 2024-06-06 11:05, Alice Ryhl wrote: Make it possible to have Rust code call into tracepoints defined by C code. It is still required that the tracepoint is declared in a C header, and that this header is included in the input to bindgen. Signed-off-by: Alice Ryhl [...] diff --git a/rust/helpers.c b/rust/helpers.c index 2c37a0f5d7a8..0560cc2a512a 100644 --- a/rust/helpers.c +++ b/rust/helpers.c @@ -165,6 +165,30 @@ rust_helper_krealloc(const void *objp, size_t new_size, gfp_t flags) } EXPORT_SYMBOL_GPL(rust_helper_krealloc); +void rust_helper_preempt_enable_notrace(void) +{ + preempt_enable_notrace(); +} +EXPORT_SYMBOL_GPL(rust_helper_preempt_enable_notrace); + +void rust_helper_preempt_disable_notrace(void) +{ + preempt_disable_notrace(); +} +EXPORT_SYMBOL_GPL(rust_helper_preempt_disable_notrace); + +bool rust_helper_current_cpu_online(void) +{ + return cpu_online(raw_smp_processor_id()); +} +EXPORT_SYMBOL_GPL(rust_helper_current_cpu_online); + +void *rust_helper___rcu_dereference_raw(void **p) +{ + return rcu_dereference_raw(p); +} +EXPORT_SYMBOL_GPL(rust_helper___rcu_dereference_raw); Ouch. Doing a function call for each of those small operations will have a rather large performance impact when tracing is active. If it is Long-term plan is to 1) compile the C helpers in some IR and 2) inline the helpers with Rust in IR-level, as what Gary has: https://lore.kernel.org/rust-for-linux/20240529202817.3641974-1-g...@garyguo.net/ and I use it for the upcoming atomic API support: https://github.com/fbq/linux/tree/dev/rust/atomic-rfc and it works very well. Thanks for the pointers, it makes sense. Thanks, Mathieu Regards, Boqun not possible to inline those in Rust, then implementing __DO_TRACE in a C function would at least allow Rust to only do a single call to C rather than go back and forth between Rust and C. What prevents inlining those helpers in Rust ? Thanks, Mathieu -- Mathieu Desnoyers EfficiOS Inc. https://www.efficios.com -- Mathieu Desnoyers EfficiOS Inc. https://www.efficios.com
Re: [PATCH 3/3] rust: add tracepoint support
On Thu, Jun 6, 2024 at 5:29 PM Mathieu Desnoyers wrote: > > On 2024-06-06 11:05, Alice Ryhl wrote: > > Make it possible to have Rust code call into tracepoints defined by C > > code. It is still required that the tracepoint is declared in a C > > header, and that this header is included in the input to bindgen. > > > > Signed-off-by: Alice Ryhl > > [...] > > > diff --git a/rust/helpers.c b/rust/helpers.c > > index 2c37a0f5d7a8..0560cc2a512a 100644 > > --- a/rust/helpers.c > > +++ b/rust/helpers.c > > @@ -165,6 +165,30 @@ rust_helper_krealloc(const void *objp, size_t > > new_size, gfp_t flags) > > } > > EXPORT_SYMBOL_GPL(rust_helper_krealloc); > > > > +void rust_helper_preempt_enable_notrace(void) > > +{ > > + preempt_enable_notrace(); > > +} > > +EXPORT_SYMBOL_GPL(rust_helper_preempt_enable_notrace); > > + > > +void rust_helper_preempt_disable_notrace(void) > > +{ > > + preempt_disable_notrace(); > > +} > > +EXPORT_SYMBOL_GPL(rust_helper_preempt_disable_notrace); > > + > > +bool rust_helper_current_cpu_online(void) > > +{ > > + return cpu_online(raw_smp_processor_id()); > > +} > > +EXPORT_SYMBOL_GPL(rust_helper_current_cpu_online); > > + > > +void *rust_helper___rcu_dereference_raw(void **p) > > +{ > > + return rcu_dereference_raw(p); > > +} > > +EXPORT_SYMBOL_GPL(rust_helper___rcu_dereference_raw); > > Ouch. Doing a function call for each of those small operations will > have a rather large performance impact when tracing is active. If it is > not possible to inline those in Rust, then implementing __DO_TRACE in > a C function would at least allow Rust to only do a single call to C > rather than go back and forth between Rust and C. > > What prevents inlining those helpers in Rust ? There's nothing fundamental that prevents it. But they contain a large amount of architecture specific code, which makes them a significant amount of work to reimplement in Rust. For example, rcu_dereference_raw calls into READ_ONCE. READ_ONCE is usually a volatile load, but under arm64+LTO, you get a bunch of inline assembly that relies on ALTERNATIVE for feature detection: https://elixir.bootlin.com/linux/v6.9/source/arch/arm64/include/asm/rwonce.h#L36 And preempt_enable_notrace has a similar story. The solution that Boqun mentions is nice, but it relies on rustc and clang using the same version of LLVM. You are unlikely to have compilers with matching LLVMs unless you intentionally take steps to make that happen. But yes, perhaps these helpers are an argument to have a single call for __DO_TRACE instead. Alice
Re: [PATCH RESEND] nvdimm: add missing MODULE_DESCRIPTION() macros
On 5/26/24 10:07 AM, Jeff Johnson wrote: > Fix the 'make W=1' warnings: > WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/nvdimm/libnvdimm.o > WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/nvdimm/nd_pmem.o > WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/nvdimm/nd_btt.o > WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/nvdimm/nd_e820.o > WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/nvdimm/of_pmem.o > WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/nvdimm/nd_virtio.o > > Signed-off-by: Jeff Johnson Reviewed-by: Dave Jiang > --- > drivers/nvdimm/btt.c | 1 + > drivers/nvdimm/core.c | 1 + > drivers/nvdimm/e820.c | 1 + > drivers/nvdimm/nd_virtio.c | 1 + > drivers/nvdimm/of_pmem.c | 1 + > drivers/nvdimm/pmem.c | 1 + > 6 files changed, 6 insertions(+) > > diff --git a/drivers/nvdimm/btt.c b/drivers/nvdimm/btt.c > index 1e5aedaf8c7b..a47acc5d05df 100644 > --- a/drivers/nvdimm/btt.c > +++ b/drivers/nvdimm/btt.c > @@ -1721,6 +1721,7 @@ static void __exit nd_btt_exit(void) > > MODULE_ALIAS_ND_DEVICE(ND_DEVICE_BTT); > MODULE_AUTHOR("Vishal Verma "); > +MODULE_DESCRIPTION("NVDIMM Block Translation Table"); > MODULE_LICENSE("GPL v2"); > module_init(nd_btt_init); > module_exit(nd_btt_exit); > diff --git a/drivers/nvdimm/core.c b/drivers/nvdimm/core.c > index 2023a661bbb0..f4b6fb4b9828 100644 > --- a/drivers/nvdimm/core.c > +++ b/drivers/nvdimm/core.c > @@ -540,6 +540,7 @@ static __exit void libnvdimm_exit(void) > nvdimm_devs_exit(); > } > > +MODULE_DESCRIPTION("NVDIMM (Non-Volatile Memory Device) core module"); > MODULE_LICENSE("GPL v2"); > MODULE_AUTHOR("Intel Corporation"); > subsys_initcall(libnvdimm_init); > diff --git a/drivers/nvdimm/e820.c b/drivers/nvdimm/e820.c > index 4cd18be9d0e9..008b9aae74ff 100644 > --- a/drivers/nvdimm/e820.c > +++ b/drivers/nvdimm/e820.c > @@ -69,5 +69,6 @@ static struct platform_driver e820_pmem_driver = { > module_platform_driver(e820_pmem_driver); > > MODULE_ALIAS("platform:e820_pmem*"); > +MODULE_DESCRIPTION("NVDIMM support for e820 type-12 memory"); > MODULE_LICENSE("GPL v2"); > MODULE_AUTHOR("Intel Corporation"); > diff --git a/drivers/nvdimm/nd_virtio.c b/drivers/nvdimm/nd_virtio.c > index 1f8c667c6f1e..35c8fbbba10e 100644 > --- a/drivers/nvdimm/nd_virtio.c > +++ b/drivers/nvdimm/nd_virtio.c > @@ -123,4 +123,5 @@ int async_pmem_flush(struct nd_region *nd_region, struct > bio *bio) > return 0; > }; > EXPORT_SYMBOL_GPL(async_pmem_flush); > +MODULE_DESCRIPTION("Virtio Persistent Memory Driver"); > MODULE_LICENSE("GPL"); > diff --git a/drivers/nvdimm/of_pmem.c b/drivers/nvdimm/of_pmem.c > index d3fca0ab6290..5134a8d08bf9 100644 > --- a/drivers/nvdimm/of_pmem.c > +++ b/drivers/nvdimm/of_pmem.c > @@ -111,5 +111,6 @@ static struct platform_driver of_pmem_region_driver = { > > module_platform_driver(of_pmem_region_driver); > MODULE_DEVICE_TABLE(of, of_pmem_region_match); > +MODULE_DESCRIPTION("NVDIMM Device Tree support"); > MODULE_LICENSE("GPL"); > MODULE_AUTHOR("IBM Corporation"); > diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c > index 598fe2e89bda..57cb30f8a3b8 100644 > --- a/drivers/nvdimm/pmem.c > +++ b/drivers/nvdimm/pmem.c > @@ -768,4 +768,5 @@ static struct nd_device_driver nd_pmem_driver = { > module_nd_driver(nd_pmem_driver); > > MODULE_AUTHOR("Ross Zwisler "); > +MODULE_DESCRIPTION("NVDIMM Persistent Memory Driver"); > MODULE_LICENSE("GPL v2"); > > --- > base-commit: 416ff45264d50a983c3c0b99f0da6ee59f9acd68 > change-id: 20240526-md-drivers-nvdimm-121215a4b93f
Re: [PATCH] ACPI: NFIT: add missing MODULE_DESCRIPTION() macro
On 6/3/24 6:30 AM, Jeff Johnson wrote: > make allmodconfig && make W=1 C=1 reports: > WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/acpi/nfit/nfit.o > > Add the missing invocation of the MODULE_DESCRIPTION() macro. > > Signed-off-by: Jeff Johnson Reviewed-by: Dave Jiang > --- > drivers/acpi/nfit/core.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/drivers/acpi/nfit/core.c b/drivers/acpi/nfit/core.c > index d4595d1985b1..e8520fb8af4f 100644 > --- a/drivers/acpi/nfit/core.c > +++ b/drivers/acpi/nfit/core.c > @@ -3531,5 +3531,6 @@ static __exit void nfit_exit(void) > > module_init(nfit_init); > module_exit(nfit_exit); > +MODULE_DESCRIPTION("ACPI NVDIMM Firmware Interface Table (NFIT) module"); > MODULE_LICENSE("GPL v2"); > MODULE_AUTHOR("Intel Corporation"); > > --- > base-commit: a693b9c95abd4947c2d06e05733de5d470ab6586 > change-id: 20240603-md-drivers-acpi-nfit-e032bad0b189 >
Re: [PATCH 3/3] rust: add tracepoint support
On Thu, Jun 06, 2024 at 11:30:03AM -0400, Mathieu Desnoyers wrote: > On 2024-06-06 11:05, Alice Ryhl wrote: > > Make it possible to have Rust code call into tracepoints defined by C > > code. It is still required that the tracepoint is declared in a C > > header, and that this header is included in the input to bindgen. > > > > Signed-off-by: Alice Ryhl > > [...] > > > diff --git a/rust/helpers.c b/rust/helpers.c > > index 2c37a0f5d7a8..0560cc2a512a 100644 > > --- a/rust/helpers.c > > +++ b/rust/helpers.c > > @@ -165,6 +165,30 @@ rust_helper_krealloc(const void *objp, size_t > > new_size, gfp_t flags) > > } > > EXPORT_SYMBOL_GPL(rust_helper_krealloc); > > +void rust_helper_preempt_enable_notrace(void) > > +{ > > + preempt_enable_notrace(); > > +} > > +EXPORT_SYMBOL_GPL(rust_helper_preempt_enable_notrace); > > + > > +void rust_helper_preempt_disable_notrace(void) > > +{ > > + preempt_disable_notrace(); > > +} > > +EXPORT_SYMBOL_GPL(rust_helper_preempt_disable_notrace); > > + > > +bool rust_helper_current_cpu_online(void) > > +{ > > + return cpu_online(raw_smp_processor_id()); > > +} > > +EXPORT_SYMBOL_GPL(rust_helper_current_cpu_online); > > + > > +void *rust_helper___rcu_dereference_raw(void **p) > > +{ > > + return rcu_dereference_raw(p); > > +} > > +EXPORT_SYMBOL_GPL(rust_helper___rcu_dereference_raw); > > Ouch. Doing a function call for each of those small operations will > have a rather large performance impact when tracing is active. If it is Long-term plan is to 1) compile the C helpers in some IR and 2) inline the helpers with Rust in IR-level, as what Gary has: https://lore.kernel.org/rust-for-linux/20240529202817.3641974-1-g...@garyguo.net/ and I use it for the upcoming atomic API support: https://github.com/fbq/linux/tree/dev/rust/atomic-rfc and it works very well. Regards, Boqun > not possible to inline those in Rust, then implementing __DO_TRACE in > a C function would at least allow Rust to only do a single call to C > rather than go back and forth between Rust and C. > > What prevents inlining those helpers in Rust ? > > Thanks, > > Mathieu > > -- > Mathieu Desnoyers > EfficiOS Inc. > https://www.efficios.com >
Re: [PATCH 0/3] Tracepoints and static branch/call in Rust
On Thu, Jun 6, 2024 at 5:25 PM Mathieu Desnoyers wrote: > > On 2024-06-06 11:05, Alice Ryhl wrote: > > This implementation implements support for static keys in Rust so that > > the actual static branch will end up in the Rust object file. However, > > it would also be possible to just wrap the trace_##name generated by > > __DECLARE_TRACE in an extern C function and then call that from Rust. > > This will simplify the Rust code by removing the need for static > > branches and calls, but it places the static branch behind an external > > call, which has performance implications. > > The tracepoints try very hard to minimize overhead of dormant tracepoints > so it is not frowned-upon to have them built into production binaries. > This is needed to make sure distribution vendors keep those tracepoints > in the kernel binaries that reach end-users. > > Adding a function call before evaluation of the static branch goes against > this major goal. > > > > > A possible middle ground would be to place just the __DO_TRACE body in > > an extern C function and to implement the Rust wrapper by doing the > > static branch in Rust, and then calling into C the code that contains > > __DO_TRACE when the tracepoint is active. However, this would need some > > changes to include/linux/tracepoint.h to generate and export a function > > containing the body of __DO_TRACE when the tracepoint should be callable > > from Rust. > > This tradeoff is more acceptable than having a function call before > evaluation of the static branch, but I wonder what is the upside of > this tradeoff compared to inlining the whole __DO_TRACE in Rust ? > > > So in general, there is a tradeoff between placing parts of the > > tracepoint (which is perf sensitive) behind an external call, and having > > code duplicated in both C and Rust (which must be kept in sync when > > changes are made). This is an important point that I would like feedback > > on from the C maintainers. > > I don't see how the duplication happens there: __DO_TRACE is meant to be > inlined into each C tracepoint caller site, so the code is already meant > to be duplicated. Having an explicit function wrapping the tracepoint > for Rust would just create an extra instance of __DO_TRACE if it happens > to be also inlined into C code. > > Or do you meant you would like to prevent having to duplicate the > implementation of __DO_TRACE in both C and Rust ? > > I'm not sure if you mean to prevent source code duplication between > C and Rust or duplication of binary code (instructions). It's a question of maintenance burden. If you change how __DO_TRACE is implemented, then those changes must also be reflected in the Rust version. There's no issue in the binary code. Alice
Re: [PATCH 0/3] Tracepoints and static branch/call in Rust
On 2024-06-06 11:05, Alice Ryhl wrote: An important part of a production ready Linux kernel driver is tracepoints. So to write production ready Linux kernel drivers in Rust, we must be able to call tracepoints from Rust code. This patch series adds support for calling tracepoints declared in C from Rust. I'm glad to see progress on this front ! Please see feedback below. To use the tracepoint support, you must: 1. Declare the tracepoint in a C header file as usual. 2. Make sure that the header file is visible to bindgen so that Rust bindings are generated for the symbols that the tracepoint macro emits. 3. Use the declare_trace! macro in your Rust code to generate Rust functions that call into the tracepoint. For example, the kernel has a tracepoint called `sched_kthread_stop`. It is declared like this: TRACE_EVENT(sched_kthread_stop, TP_PROTO(struct task_struct *t), TP_ARGS(t), TP_STRUCT__entry( __array(char, comm, TASK_COMM_LEN ) __field(pid_t, pid ) ), TP_fast_assign( memcpy(__entry->comm, t->comm, TASK_COMM_LEN); __entry->pid = t->pid; ), TP_printk("comm=%s pid=%d", __entry->comm, __entry->pid) ); To call the above tracepoint from Rust code, you would add the relevant header file to rust/bindings/bindings_helper.h and add the following invocation somewhere in your Rust code: declare_trace! { fn sched_kthread_stop(task: *mut task_struct); } This will define a Rust function of the given name that you can call like any other Rust function. Since these tracepoints often take raw pointers as arguments, it may be convenient to wrap it in a safe wrapper: mod raw { declare_trace! { fn sched_kthread_stop(task: *mut task_struct); } } #[inline] pub fn trace_sched_kthread_stop(task: ) { // SAFETY: The pointer to `task` is valid. unsafe { raw::sched_kthread_stop(task.as_raw()) } } A future expansion of the tracepoint support could generate these safe versions automatically, but that is left as future work for now. This is intended for use in the Rust Binder driver, which was originally sent as an RFC [1]. The RFC did not include tracepoint support, but you can see how it will be used in Rust Binder at [2]. The author has verified that the tracepoint support works on Android devices. This implementation implements support for static keys in Rust so that the actual static branch will end up in the Rust object file. However, it would also be possible to just wrap the trace_##name generated by __DECLARE_TRACE in an extern C function and then call that from Rust. This will simplify the Rust code by removing the need for static branches and calls, but it places the static branch behind an external call, which has performance implications. The tracepoints try very hard to minimize overhead of dormant tracepoints so it is not frowned-upon to have them built into production binaries. This is needed to make sure distribution vendors keep those tracepoints in the kernel binaries that reach end-users. Adding a function call before evaluation of the static branch goes against this major goal. A possible middle ground would be to place just the __DO_TRACE body in an extern C function and to implement the Rust wrapper by doing the static branch in Rust, and then calling into C the code that contains __DO_TRACE when the tracepoint is active. However, this would need some changes to include/linux/tracepoint.h to generate and export a function containing the body of __DO_TRACE when the tracepoint should be callable from Rust. This tradeoff is more acceptable than having a function call before evaluation of the static branch, but I wonder what is the upside of this tradeoff compared to inlining the whole __DO_TRACE in Rust ? So in general, there is a tradeoff between placing parts of the tracepoint (which is perf sensitive) behind an external call, and having code duplicated in both C and Rust (which must be kept in sync when changes are made). This is an important point that I would like feedback on from the C maintainers. I don't see how the duplication happens there: __DO_TRACE is meant to be inlined into each C tracepoint caller site, so the code is already meant to be duplicated. Having an explicit function wrapping the tracepoint for Rust would just create an extra instance of __DO_TRACE if it happens to be also inlined into C code. Or do you meant you would like to prevent having to duplicate the implementation of __DO_TRACE in both C and Rust ? I'm not sure if you mean to prevent source code duplication between C and Rust or duplication of binary code (instructions). Thanks,
Re: [RFC v3 net-next 1/7] net: add rx_sk to trace_kfree_skb
On Wed, Jun 5, 2024 at 6:57 PM Steven Rostedt wrote: > > On Tue, 4 Jun 2024 14:47:38 -0700 > Yan Zhai wrote: > > > skb does not include enough information to find out receiving > > sockets/services and netns/containers on packet drops. In theory > > skb->dev tells about netns, but it can get cleared/reused, e.g. by TCP > > stack for OOO packet lookup. Similarly, skb->sk often identifies a local > > sender, and tells nothing about a receiver. > > > > Allow passing an extra receiving socket to the tracepoint to improve > > the visibility on receiving drops. > > > > Signed-off-by: Yan Zhai > > --- > > v2->v3: fixed drop_monitor function prototype > > --- > > include/trace/events/skb.h | 11 +++ > > net/core/dev.c | 2 +- > > net/core/drop_monitor.c| 9 ++--- > > net/core/skbuff.c | 2 +- > > 4 files changed, 15 insertions(+), 9 deletions(-) > > > > diff --git a/include/trace/events/skb.h b/include/trace/events/skb.h > > index 07e0715628ec..aa6b46b6172c 100644 > > --- a/include/trace/events/skb.h > > +++ b/include/trace/events/skb.h > > @@ -24,15 +24,16 @@ DEFINE_DROP_REASON(FN, FN) > > TRACE_EVENT(kfree_skb, > > > > TP_PROTO(struct sk_buff *skb, void *location, > > - enum skb_drop_reason reason), > > + enum skb_drop_reason reason, struct sock *rx_sk), > > > > - TP_ARGS(skb, location, reason), > > + TP_ARGS(skb, location, reason, rx_sk), > > > > TP_STRUCT__entry( > > __field(void *, skbaddr) > > __field(void *, location) > > __field(unsigned short, protocol) > > __field(enum skb_drop_reason, reason) > > + __field(void *, rx_skaddr) > > Please add the pointer after the other pointers: > > __field(void *, skbaddr) > __field(void *, location) > + __field(void *, rx_skaddr) > __field(unsigned short, protocol) > __field(enum skb_drop_reason, reason) > > otherwise you are adding holes in the ring buffer event. > > The TP_STRUCT__entry() is a structure that is saved in the ring buffer. We > want to avoid alignment holes. I also question having a short before the > enum, if the emum is 4 bytes. The short should be at the end. > > In fact, looking at the format file, there is a 2 byte hole: > > # cat /sys/kernel/tracing/events/skb/kfree_skb/format > > name: kfree_skb > ID: 1799 > format: > field:unsigned short common_type; offset:0; size:2; > signed:0; > field:unsigned char common_flags; offset:2; size:1; > signed:0; > field:unsigned char common_preempt_count; offset:3; > size:1; signed:0; > field:int common_pid; offset:4; size:4; signed:1; > > field:void * skbaddr; offset:8; size:8; signed:0; > field:void * location; offset:16; size:8; signed:0; > field:unsigned short protocol; offset:24; size:2; signed:0; > field:enum skb_drop_reason reason; offset:28; size:4; > signed:0; > > Notice that "protocol" is 2 bytes in size at offset 24, but "reason" starts > at offset 28. This means at offset 26, there's a 2 byte hole. > The reason I added the pointer as the last argument is trying to minimize the surprise to existing TP users, because for common ABIs it's fine to omit later arguments when defining a function, but it needs change and recompilation if the order of arguments changed. Looking at the actual format after the change, it does not add a new hole since protocol and reason are already packed into the same 8-byte block, so rx_skaddr starts at 8-byte aligned offset: # cat /sys/kernel/debug/tracing/events/skb/kfree_skb/format name: kfree_skb ID: 2260 format: field:unsigned short common_type; offset:0; size:2; signed:0; field:unsigned char common_flags; offset:2; size:1; signed:0; field:unsigned char common_preempt_count; offset:3; size:1; signed:0; field:int common_pid; offset:4; size:4; signed:1; field:void * skbaddr; offset:8; size:8; signed:0; field:void * location; offset:16; size:8; signed:0; field:unsigned short protocol; offset:24; size:2; signed:0; field:enum skb_drop_reason reason; offset:28; size:4; signed:0; field:void * rx_skaddr; offset:32; size:8; signed:0; Do you think we still need to change the order? Yan > -- Steve > > > > > ), > > > > TP_fast_assign( > > @@ -40,12 +41,14 @@ TRACE_EVENT(kfree_skb, > > __entry->location = location; > > __entry->protocol = ntohs(skb->protocol); > > __entry->reason = reason; > > + __entry->rx_skaddr = rx_sk; > > ), > > >
Re: [PATCH 2/3] rust: add static_key_false
On 2024-06-06 11:05, Alice Ryhl wrote: Add just enough support for static key so that we can use it from tracepoints. Tracepoints rely on `static_key_false` even though it is deprecated, so we add the same functionality to Rust. Signed-off-by: Alice Ryhl --- rust/kernel/lib.rs| 1 + rust/kernel/static_key.rs | 87 +++ scripts/Makefile.build| 2 +- 3 files changed, 89 insertions(+), 1 deletion(-) diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs index d534b1178955..22e1fedd0774 100644 --- a/rust/kernel/lib.rs +++ b/rust/kernel/lib.rs @@ -39,6 +39,7 @@ pub mod print; mod static_assert; pub mod static_call; +pub mod static_key; #[doc(hidden)] pub mod std_vendor; pub mod str; diff --git a/rust/kernel/static_key.rs b/rust/kernel/static_key.rs new file mode 100644 index ..6c3dbe14c98a --- /dev/null +++ b/rust/kernel/static_key.rs @@ -0,0 +1,87 @@ +// SPDX-License-Identifier: GPL-2.0 This static key code is something that can be generally useful both in kernel and user-space. Is there anything that would prevent licensing this under MIT right away instead so it could eventually be re-used more broadly in userspace as well ? Thanks, Mathieu -- Mathieu Desnoyers EfficiOS Inc. https://www.efficios.com
Re: [PATCH 3/3] rust: add tracepoint support
On 2024-06-06 11:05, Alice Ryhl wrote: Make it possible to have Rust code call into tracepoints defined by C code. It is still required that the tracepoint is declared in a C header, and that this header is included in the input to bindgen. Signed-off-by: Alice Ryhl [...] diff --git a/rust/helpers.c b/rust/helpers.c index 2c37a0f5d7a8..0560cc2a512a 100644 --- a/rust/helpers.c +++ b/rust/helpers.c @@ -165,6 +165,30 @@ rust_helper_krealloc(const void *objp, size_t new_size, gfp_t flags) } EXPORT_SYMBOL_GPL(rust_helper_krealloc); +void rust_helper_preempt_enable_notrace(void) +{ + preempt_enable_notrace(); +} +EXPORT_SYMBOL_GPL(rust_helper_preempt_enable_notrace); + +void rust_helper_preempt_disable_notrace(void) +{ + preempt_disable_notrace(); +} +EXPORT_SYMBOL_GPL(rust_helper_preempt_disable_notrace); + +bool rust_helper_current_cpu_online(void) +{ + return cpu_online(raw_smp_processor_id()); +} +EXPORT_SYMBOL_GPL(rust_helper_current_cpu_online); + +void *rust_helper___rcu_dereference_raw(void **p) +{ + return rcu_dereference_raw(p); +} +EXPORT_SYMBOL_GPL(rust_helper___rcu_dereference_raw); Ouch. Doing a function call for each of those small operations will have a rather large performance impact when tracing is active. If it is not possible to inline those in Rust, then implementing __DO_TRACE in a C function would at least allow Rust to only do a single call to C rather than go back and forth between Rust and C. What prevents inlining those helpers in Rust ? Thanks, Mathieu -- Mathieu Desnoyers EfficiOS Inc. https://www.efficios.com
Re: [PATCH] livepatch: introduce klp_func called interface
Hi Wardenjohn, To follow up, Red Hat kpatch QE pointed me toward this test: https://gitlab.com/redhat/centos-stream/tests/kernel/kernel-tests/-/tree/main/general/kpatch/kpatch-trace?ref_type=heads which reports a few interesting things via systemd service and ftrace: - Patched functions - Traced functions - Code that was modified, but didn't run - Other code that ftrace saw, but is not listed in the sysfs directory The code looks to be built on top of the same basic ftrace commands that I sent the other day. You may be able to reuse/copy/etc bits from the scripts if they are handy. -- Joe
Re: [PATCH 3/5] arm64: dts: qcom: sdx75: add missing qlink_logging reserved memory for mpss
On 06/06/2024 16:38, Naina Mehta wrote: > The qlink_logging memory region is also used by the modem firmware, > add it to reserved memory regions. > Also split MPSS DSM region into 2 separate regions. > > Signed-off-by: Naina Mehta > --- > arch/arm64/boot/dts/qcom/sdx75.dtsi | 17 + > 1 file changed, 13 insertions(+), 4 deletions(-) > > diff --git a/arch/arm64/boot/dts/qcom/sdx75.dtsi > b/arch/arm64/boot/dts/qcom/sdx75.dtsi > index 9b93f6501d55..9349b1c4e196 100644 > --- a/arch/arm64/boot/dts/qcom/sdx75.dtsi > +++ b/arch/arm64/boot/dts/qcom/sdx75.dtsi > @@ -366,7 +366,12 @@ > no-map; > }; > > - qdss_mem: qdss@8880 { > + qdss_mem: qdss@8850 { > + reg = <0x0 0x8850 0x0 0x30>; > + no-map; > + }; > + > + qlink_logging_mem: qlink_logging@8880 { Sorry, no downstream code. Please follow DTS coding style - no underscores in node names. This applies to all work sent upstream. Best regards, Krzysztof
Re: [PATCH v14 02/14] cgroup/misc: Add per resource callbacks for CSS events
On Thu, 06 Jun 2024 08:37:31 -0500, chenridong wrote: If _misc_cg_res_alloc fails, maybe some types do not call ->alloc(), but all types ->free() callback >will be called, is that ok? Not sure I understand. Are you suggesting we ignore failures from ->alloc() callback in _misc_cg_res_alloc() as it is per-resource, and have ->free() callback and resource provider of the failing type to handle the failure internally? IIUC, this failure only happens when a specific subcgroup is created (memory running out for allocation) so failing that subcgroup as a whole seems fine to me. Note the root node is static and no pre-resource callbacks invoked by misc. And resource provider handles its own allocations for root. In SGX case we too declare a static object for corresponding root sgx_cgroup struct. Note also misc cgroup (except for setting capacity[res] = 0 at root) is all or nothing so no mechanism to tell user "this resource does not work but others are fine in this particular cgroup." Thanks Haitao
Re: [PATCH 1/5] dt-bindings: remoteproc: qcom,sm8550-pas: document the SDX75 PAS
On 06/06/2024 16:38, Naina Mehta wrote: > Document the MPSS Peripheral Authentication Service on SDX75 platform. > > Signed-off-by: Naina Mehta > --- > .../devicetree/bindings/remoteproc/qcom,sm8550-pas.yaml | 1 + > 1 file changed, 1 insertion(+) > > diff --git > a/Documentation/devicetree/bindings/remoteproc/qcom,sm8550-pas.yaml > b/Documentation/devicetree/bindings/remoteproc/qcom,sm8550-pas.yaml > index 73fda7565cd1..02e15b1f78ab 100644 > --- a/Documentation/devicetree/bindings/remoteproc/qcom,sm8550-pas.yaml > +++ b/Documentation/devicetree/bindings/remoteproc/qcom,sm8550-pas.yaml > @@ -16,6 +16,7 @@ description: > properties: >compatible: > enum: > + - qcom,sdx75-mpss-pas >- qcom,sm8550-adsp-pas >- qcom,sm8550-cdsp-pas >- qcom,sm8550-mpss-pas Missing updates to allOf constraints. Are you sure this is the binding for SDX75? Best regards, Krzysztof
Re: [PATCH] net: missing check
On Thu, Jun 6, 2024 at 4:14 PM Denis Arefev wrote: > > Two missing check in virtio_net_hdr_to_skb() allowed syzbot > to crash kernels again > > 1. After the skb_segment function the buffer may become non-linear > (nr_frags != 0), but since the SKBTX_SHARED_FRAG flag is not set anywhere > the __skb_linearize function will not be executed, then the buffer will > remain non-linear. Then the condition (offset >= skb_headlen(skb)) > becomes true, which causes WARN_ON_ONCE in skb_checksum_help. > > 2. The struct sk_buff and struct virtio_net_hdr members must be > mathematically related. > (gso_size) must be greater than (needed) otherwise WARN_ON_ONCE. > (remainder) must be greater than (needed) otherwise WARN_ON_ONCE. > (remainder) may be 0 if division is without remainder. > > offset+2 (4191) > skb_headlen() (1116) > WARNING: CPU: 1 PID: 5084 at net/core/dev.c:3303 > skb_checksum_help+0x5e2/0x740 net/core/dev.c:3303 > Modules linked in: > CPU: 1 PID: 5084 Comm: syz-executor336 Not tainted > 6.7.0-rc3-syzkaller-00014-gdf60cee26a2e #0 > Hardware name: Google Compute Engine/Google Compute Engine, BIOS Google > 11/10/2023 > RIP: 0010:skb_checksum_help+0x5e2/0x740 net/core/dev.c:3303 > Code: 89 e8 83 e0 07 83 c0 03 38 d0 7c 08 84 d2 0f 85 52 01 00 00 44 89 e2 2b > 53 74 4c 89 ee 48 c7 c7 40 57 e9 8b e8 af 8f dd f8 90 <0f> 0b 90 90 e9 87 fe > ff ff e8 40 0f 6e f9 e9 4b fa ff ff 48 89 ef > RSP: 0018:c90003a9f338 EFLAGS: 00010286 > RAX: RBX: 888025125780 RCX: 814db209 > RDX: 888015393b80 RSI: 814db216 RDI: 0001 > RBP: 8880251257f4 R08: 0001 R09: > R10: R11: 0001 R12: 045c > R13: 105f R14: 8880251257f0 R15: 105d > FS: 55c24380() GS:8880b990() knlGS: > CS: 0010 DS: ES: CR0: 80050033 > CR2: 2000f000 CR3: 23151000 CR4: 003506f0 > DR0: DR1: DR2: > DR3: DR6: fffe0ff0 DR7: 0400 > Call Trace: > > ip_do_fragment+0xa1b/0x18b0 net/ipv4/ip_output.c:777 > ip_fragment.constprop.0+0x161/0x230 net/ipv4/ip_output.c:584 > ip_finish_output_gso net/ipv4/ip_output.c:286 [inline] > __ip_finish_output net/ipv4/ip_output.c:308 [inline] > __ip_finish_output+0x49c/0x650 net/ipv4/ip_output.c:295 > ip_finish_output+0x31/0x310 net/ipv4/ip_output.c:323 > NF_HOOK_COND include/linux/netfilter.h:303 [inline] > ip_output+0x13b/0x2a0 net/ipv4/ip_output.c:433 > dst_output include/net/dst.h:451 [inline] > ip_local_out+0xaf/0x1a0 net/ipv4/ip_output.c:129 > iptunnel_xmit+0x5b4/0x9b0 net/ipv4/ip_tunnel_core.c:82 > ipip6_tunnel_xmit net/ipv6/sit.c:1034 [inline] > sit_tunnel_xmit+0xed2/0x28f0 net/ipv6/sit.c:1076 > __netdev_start_xmit include/linux/netdevice.h:4940 [inline] > netdev_start_xmit include/linux/netdevice.h:4954 [inline] > xmit_one net/core/dev.c:3545 [inline] > dev_hard_start_xmit+0x13d/0x6d0 net/core/dev.c:3561 > __dev_queue_xmit+0x7c1/0x3d60 net/core/dev.c:4346 > dev_queue_xmit include/linux/netdevice.h:3134 [inline] > packet_xmit+0x257/0x380 net/packet/af_packet.c:276 > packet_snd net/packet/af_packet.c:3087 [inline] > packet_sendmsg+0x24ca/0x5240 net/packet/af_packet.c:3119 > sock_sendmsg_nosec net/socket.c:730 [inline] > __sock_sendmsg+0xd5/0x180 net/socket.c:745 > __sys_sendto+0x255/0x340 net/socket.c:2190 > __do_sys_sendto net/socket.c:2202 [inline] > __se_sys_sendto net/socket.c:2198 [inline] > __x64_sys_sendto+0xe0/0x1b0 net/socket.c:2198 > do_syscall_x64 arch/x86/entry/common.c:51 [inline] > do_syscall_64+0x40/0x110 arch/x86/entry/common.c:82 > entry_SYSCALL_64_after_hwframe+0x63/0x6b > > Signed-off-by: Denis Arefev > --- > include/linux/virtio_net.h | 10 ++ > 1 file changed, 10 insertions(+) > > diff --git a/include/linux/virtio_net.h b/include/linux/virtio_net.h > index 4dfa9b69ca8d..77ebe908d746 100644 > --- a/include/linux/virtio_net.h > +++ b/include/linux/virtio_net.h > @@ -56,6 +56,7 @@ static inline int virtio_net_hdr_to_skb(struct sk_buff *skb, > unsigned int thlen = 0; > unsigned int p_off = 0; > unsigned int ip_proto; > + u64 ret, remainder; > > if (hdr->gso_type != VIRTIO_NET_HDR_GSO_NONE) { > switch (hdr->gso_type & ~VIRTIO_NET_HDR_GSO_ECN) { > @@ -98,6 +99,15 @@ static inline int virtio_net_hdr_to_skb(struct sk_buff > *skb, > u32 off = __virtio16_to_cpu(little_endian, hdr->csum_offset); > u32 needed = start + max_t(u32, thlen, off + sizeof(__sum16)); > > + if (hdr->gso_size) { > + ret = div64_u64_rem(skb->len, hdr->gso_size, > ); > + if (!(ret && (hdr->gso_size > needed) && > + ((remainder > needed) || > (remainder == 0 { > +
Re: [PATCH v14 14/14] selftests/sgx: Add scripts for EPC cgroup testing
On Thu, 06 Jun 2024 00:32:55 -0500, Jarkko Sakkinen wrote: On Wed Jun 5, 2024 at 6:33 PM EEST, Haitao Huang wrote: sgx_cgroup_try_charge() expects sgx_cg_wq, so it would break unless we check and return 0 which was the initially implemented in v12. But then Kai had some concern on that we expose all the interface files to allow user to set limits but we don't enforce. To keep it simple we settled down ~~ Sure: "Keep it simple and corpse" back to BUG_ON(). This would only happen rarely and user can add command-line to disable SGX if s/he really wants to start kernel in this case, just can't do SGX. Even disabling all of SGX would be a less catastrophical measure. Yes I had a comment but Kai thought it was too obvious and I can't think of a better one that's not obvious so I removed: Not great advice given. Please just document it. In patch, which BUG_ON() I don't want to see my R-by in it, until I've reviewed an updated version. BR, Jarkko Sure. that makes sens. So far I have following changes. I did not do the renaming for the functions as I am not sure it is still needed. Let me know otherwise and if I missed any more. --- a/arch/x86/kernel/cpu/sgx/epc_cgroup.c +++ b/arch/x86/kernel/cpu/sgx/epc_cgroup.c @@ -329,7 +329,7 @@ const struct misc_res_ops sgx_cgroup_ops = { .free = sgx_cgroup_free, }; -void sgx_cgroup_init(void) +int __init sgx_cgroup_init(void) { /* * misc root always exists even if misc is disabled from command line. @@ -345,7 +345,8 @@ void sgx_cgroup_init(void) if (cgroup_subsys_enabled(misc_cgrp_subsys)) { sgx_cg_wq = alloc_workqueue("sgx_cg_wq", WQ_UNBOUND | WQ_FREEZABLE, WQ_UNBOUND_MAX_ACTIVE); -BUG_ON(!sgx_cg_wq); +return -ENOMEM; } +return 0; } --- a/arch/x86/kernel/cpu/sgx/epc_cgroup.h +++ b/arch/x86/kernel/cpu/sgx/epc_cgroup.h @@ -99,7 +99,7 @@ bool sgx_cgroup_should_reclaim(struct sgx_cgroup *sgx_cg); struct misc_cg *sgx_cgroup_reclaim_pages(struct misc_cg *root, struct misc_cg *next_cg, struct mm_struct *charge_mm); -void sgx_cgroup_init(void); +int __init sgx_cgroup_init(void); #endif /* CONFIG_CGROUP_MISC */ --- a/arch/x86/kernel/cpu/sgx/main.c +++ b/arch/x86/kernel/cpu/sgx/main.c @@ -1045,7 +1045,7 @@ static int __init sgx_init(void) if (!sgx_page_cache_init()) return -ENOMEM; -if (!sgx_page_reclaimer_init()) { +if (!sgx_page_reclaimer_init() || !sgx_cgroup_init()) { ret = -ENOMEM; goto err_page_cache; } @@ -1067,9 +1067,6 @@ static int __init sgx_init(void) if (sgx_vepc_init() && ret) goto err_provision; -/* Setup cgroup if either the native or vepc driver is active */ -sgx_cgroup_init(); - return 0; err_provision: --- a/arch/x86/kernel/cpu/sgx/sgx.h +++ b/arch/x86/kernel/cpu/sgx/sgx.h @@ -119,6 +119,7 @@ static inline void *sgx_get_epc_virt_addr(struct sgx_epc_page *page) * cgroup. */ struct sgx_epc_lru_list { +/* Use this lock to protect access from multiple reclamation worker threads */ spinlock_t lock; struct list_head reclaimable; }; --- a/kernel/cgroup/misc.c +++ b/kernel/cgroup/misc.c @@ -446,7 +446,8 @@ misc_cg_alloc(struct cgroup_subsys_state *parent_css) int ret; if (!parent_css) { -parent_cg = cg = _cg; +parent_cg = _cg; +cg = _cg; } else { cg = kzalloc(sizeof(*cg), GFP_KERNEL); if (!cg) Thanks Haitao
Re: [PATCH v2 1/5] ftrace: Rename dup_hash() and comment it
On Wed, Jun 05, 2024 at 02:03:35PM -0400, Steven Rostedt wrote: > From: "Steven Rostedt (Google)" > > The name "dup_hash()" is a misnomer as it does not duplicate the hash that > is passed in, but instead moves its entities from that hash to a newly > allocated one. Rename it to "__move_hash()" (using starting underscores as > it is an internal function), and add some comments about what it does. > > Signed-off-by: Steven Rostedt (Google) Acked-by: Mark Rutland Mark. > --- > kernel/trace/ftrace.c | 8 ++-- > 1 file changed, 6 insertions(+), 2 deletions(-) > > diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c > index da7e6abf48b4..9dcdefe9d1aa 100644 > --- a/kernel/trace/ftrace.c > +++ b/kernel/trace/ftrace.c > @@ -1391,7 +1391,11 @@ ftrace_hash_rec_enable_modify(struct ftrace_ops *ops, > int filter_hash); > static int ftrace_hash_ipmodify_update(struct ftrace_ops *ops, > struct ftrace_hash *new_hash); > > -static struct ftrace_hash *dup_hash(struct ftrace_hash *src, int size) > +/* > + * Allocate a new hash and remove entries from @src and move them to the new > hash. > + * On success, the @src hash will be empty and should be freed. > + */ > +static struct ftrace_hash *__move_hash(struct ftrace_hash *src, int size) > { > struct ftrace_func_entry *entry; > struct ftrace_hash *new_hash; > @@ -1438,7 +1442,7 @@ __ftrace_hash_move(struct ftrace_hash *src) > if (ftrace_hash_empty(src)) > return EMPTY_HASH; > > - return dup_hash(src, size); > + return __move_hash(src, size); > } > > static int > -- > 2.43.0 > >
Re: [PATCH] remoteproc: mediatek: Increase MT8188 SCP core0 DRAM size
Il 06/06/24 11:06, jason-ch chen ha scritto: From: Jason Chen Increase MT8188 SCP core0 DRAM size for HEVC driver. so the second core only gets a maximum of 0x20 of SRAM? I'm not sure how useful is the secondary SCP core, at this point, with so little available SRAM but... okay, as you wish. Reviewed-by: AngeloGioacchino Del Regno Signed-off-by: Jason Chen --- drivers/remoteproc/mtk_scp.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/remoteproc/mtk_scp.c b/drivers/remoteproc/mtk_scp.c index b885a9a041e4..2119fc62c3f2 100644 --- a/drivers/remoteproc/mtk_scp.c +++ b/drivers/remoteproc/mtk_scp.c @@ -1390,7 +1390,7 @@ static const struct mtk_scp_sizes_data default_scp_sizes = { }; static const struct mtk_scp_sizes_data mt8188_scp_sizes = { - .max_dram_size = 0x50, + .max_dram_size = 0x80, .ipi_share_buffer_size = 600, };
Re: [PATCH] remoteproc: mediatek: Don't print error when optional scp handle is missing
Il 05/06/24 21:35, Nícolas F. R. A. Prado ha scritto: The scp_get() helper has two users: the mtk-vcodec and the mtk-mdp3 drivers. mdp3 considers the mediatek,scp phandle optional, and when it's missing mdp3 will directly look for the scp node based on compatible. For that reason printing an error in the helper when the handle is missing is wrong, as it's not always an actual error. Besides, the other user, vcodec, already prints an error message on its own when the helper fails so this message doesn't add that much information. Remove the error message from the helper. This gets rid of the deceptive error on MT8195-Tomato: mtk-mdp3 14001000.dma-controller: can't get SCP node I'm way more for giving it a SCP handle instead of silencing this print... the SCP handle way *is* the correct one and I prefer it, as that'd also be the only way to support Dual-Core SCP in MDP3. I really want to see hardcoded stuff disappear and I really really *really* want to see more consistency across DT nodes in MediaTek platforms. So, still a one-line change, but do it on the devicetree instead of here please. Cheers, Angelo Signed-off-by: Nícolas F. R. A. Prado --- drivers/remoteproc/mtk_scp.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/drivers/remoteproc/mtk_scp.c b/drivers/remoteproc/mtk_scp.c index b885a9a041e4..f813117b6312 100644 --- a/drivers/remoteproc/mtk_scp.c +++ b/drivers/remoteproc/mtk_scp.c @@ -37,10 +37,8 @@ struct mtk_scp *scp_get(struct platform_device *pdev) struct platform_device *scp_pdev; scp_node = of_parse_phandle(dev->of_node, "mediatek,scp", 0); - if (!scp_node) { - dev_err(dev, "can't get SCP node\n"); + if (!scp_node) return NULL; - } scp_pdev = of_find_device_by_node(scp_node); of_node_put(scp_node); --- base-commit: d97496ca23a2d4ee80b7302849404859d9058bcd change-id: 20240605-mt8195-dma-scp-node-err-6a8dea26d4f5 Best regards,
Re: [PATCH v2 2/6] livepatch: Add klp-convert tool
Hello Joe, > > +#define KLP_RELOC_SYMBOL_POS(LP_OBJ_NAME, SYM_OBJ_NAME, SYM_NAME, SYM_POS) > > \ > > + asm("\".klp.sym.rela." #LP_OBJ_NAME "." #SYM_OBJ_NAME "." #SYM_NAME "," > > #SYM_POS "\"") > > ^^^ > I think I found a potential bug, or at least compatiblity problem with > including a comma "," character in this symbol format and older versions > of the GNU assembler. The good news is that other delimiter characters > like "." or "#" seem to work out fine. Thank you for spotting this. I was using binutils 2.38, so I did not even notice this problem. Unfortunately, I was not able to make it work with "#" as a delimiter; only "." worked. Additionally, any type of parenthesis apparently has some special purpose even in labels, so they are also not an option. > If you want to reproduce, you'll need a version of `as` like binutils > 2.36.1 and try building the samples/livepatch/livepatch-extern-symbol.ko > and you should get an error like: > > Assembler messages: > Warning: missing closing '"' > Warning: missing closing '"' > Error: too many memory references for `movq' > > > If you want to retrace my adventure, here are my steps: > > 1) Clone klp-convert-tree repo branch containing this patchset + > Petr's review comments + a few helpful things for klp-convert > development: > > $ git clone \ > --single-branch --branch=klp-convert-minimal-v1-review --depth=9 \ > https://github.com/joe-lawrence/klp-convert-tree.git > [ ... snip ... ] > $ cd klp-convert-tree > > 2) Override .cross-dev defaults: > > $ export BUILD_ARCHES=x86_64 > $ export COMPILER=gcc-11.1.0 > $ export URL=https://cdn.kernel.org/pub/tools/crosstool/files/bin/x86_64/ > $ export OUTDIR_PREFIX=$(pwd)/build > $ export COMPILER_INSTALL_PATH=$(pwd)/cross-compiler > > 3) Setup x86_64 default .config (this will download and install the > gcc-11.1.0 compiler from cdn.kernel.org): > > $ ./cross-dev make defconfig > > x86_64 : make defconfig ... > Compiler will be installed in /root/klp-convert-tree/cross-compiler > [ ... snip ... ] > > 4) Add kernel livepatching configuration options: > > $ ./cross-dev klp-config > > Configuring x86_64 ... > [ ... snip ... ] > > $ grep LIVEPATCH "$OUTDIR_PREFIX"-x86_64/.config > CONFIG_HAVE_LIVEPATCH=y > CONFIG_LIVEPATCH=y > CONFIG_SAMPLE_LIVEPATCH=m > > 5) Run the cross-compiler build until it hits a build error on > livepatch-extern-symbol.ko: > > $ ./cross-dev make -j$(nproc) > [ ... snip ... ] > make: Target '__all' not remade because of errors. > [ x86_64 : make -j48 = FAIL ] > > 6) With pre-requisites already built, retry the external symbol sample > and add -save-temps to the KCFLAGS to keep the generated assembly file: > > $ KCFLAGS="-save-temps=obj" ./cross-dev make > samples/livepatch/livepatch-extern-symbol.ko > [ ... snip ... ] > samples/livepatch/livepatch-extern-symbol.s: Assembler messages: > samples/livepatch/livepatch-extern-symbol.s:103: Warning: missing closing > '"' > samples/livepatch/livepatch-extern-symbol.s:103: Warning: missing closing > '"' > samples/livepatch/livepatch-extern-symbol.s:103: Error: too many memory > references for `movq' > [ ... snip ... ] > > 7) Which line is that? > > $ awk 'NR==103' > "$OUTDIR_PREFIX"-x86_64/samples/livepatch/livepatch-extern-symbol.s > movq > ".klp.sym.rela.vmlinux.vmlinux.saved_command_line,0"(%rip), %rdx > > > You could alternatively poke at it through the compiler explorer service > and toggle the source and binutils versions: > > (error) binutils 2.36.1 : https://godbolt.org/z/cGGs6rfWe > (success) binutils 2.38 : https://godbolt.org/z/ffzza3vYd Thank you for those detailed step-by-step instruction to reproduce it! It helped me a lot to understand the problem. Lukas
Re: [PATCH] arm64: dts: qcom: qcm6490-fairphone-fp5: Use .mbn firmware for IPA
On 6.06.2024 11:09 AM, Luca Weiss wrote: > Specify the file name for the squashed/non-split firmware with the .mbn > extension instead of the split .mdt. The kernel can load both but the > squashed version is preferred in dts nowadays. > > Signed-off-by: Luca Weiss > --- Reviewed-by: Konrad Dybcio Konrad
Re: [PATCH v8 0/5] soc: qcom: add in-kernel pd-mapper implementation
On 11/05/2024 23:56, Dmitry Baryshkov wrote: Protection domain mapper is a QMI service providing mapping between 'protection domains' and services supported / allowed in these domains. For example such mapping is required for loading of the WiFi firmware or for properly starting up the UCSI / altmode / battery manager support. The existing userspace implementation has several issue. It doesn't play well with CONFIG_EXTRA_FIRMWARE, it doesn't reread the JSON files if the firmware location is changed (or if the firmware was not available at the time pd-mapper was started but the corresponding directory is mounted later), etc. However this configuration is largely static and common between different platforms. Provide in-kernel service implementing static per-platform data. To: Bjorn Andersson To: Konrad Dybcio To: Sibi Sankar To: Mathieu Poirier Cc: linux-arm-...@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: linux-remotep...@vger.kernel.org Cc: Johan Hovold Cc: Xilin Wu Cc: "Bryan O'Donoghue" Cc: Steev Klimaszewski Cc: Alexey Minnekhanov -- Changes in v8: - Reworked pd-mapper to register as an rproc_subdev / auxdev - Dropped Tested-by from Steev and Alexey from the last patch since the implementation was changed significantly. - Add sensors, cdsp and mpss_root domains to 660 config (Alexey Minnekhanov) - Added platform entry for sm4250 (used for qrb4210 / RB2) - Added locking to the pdr_get_domain_list() (Chris Lew) - Remove the call to qmi_del_server() and corresponding API (Chris Lew) - In qmi_handle_init() changed 1024 to a defined constant (Chris Lew) - Link to v7: https://lore.kernel.org/r/20240424-qcom-pd-mapper-v7-0-05f7fc646...@linaro.org Changes in v7: - Fixed modular build (Steev) - Link to v6: https://lore.kernel.org/r/20240422-qcom-pd-mapper-v6-0-f96957d01...@linaro.org Changes in v6: - Reworked mutex to fix lockdep issue on deregistration - Fixed dependencies between PD-mapper and remoteproc to fix modular builds (Krzysztof) - Added EXPORT_SYMBOL_GPL to fix modular builds (Krzysztof) - Fixed kerneldocs (Krzysztof) - Removed extra pr_debug messages (Krzysztof) - Fixed wcss build (Krzysztof) - Added platforms which do not require protection domain mapping to silence the notice on those platforms - Link to v5: https://lore.kernel.org/r/20240419-qcom-pd-mapper-v5-0-e35b6f847...@linaro.org Changes in v5: - pdr: drop lock in pdr_register_listener, list_lock is already held (Chris Lew) - pd_mapper: reworked to provide static configuration per platform (Bjorn) - Link to v4: https://lore.kernel.org/r/20240311-qcom-pd-mapper-v4-0-24679cca5...@linaro.org Changes in v4: - Fixed missing chunk, reenabled kfree in qmi_del_server (Konrad) - Added configuration for sm6350 (Thanks to Luca) - Removed RFC tag (Konrad) - Link to v3: https://lore.kernel.org/r/20240304-qcom-pd-mapper-v3-0-6858fa1ac...@linaro.org Changes in RFC v3: - Send start / stop notifications when PD-mapper domain list is changed - Reworked the way PD-mapper treats protection domains, register all of them in a single batch - Added SC7180 domains configuration based on TCL Book 14 GO - Link to v2: https://lore.kernel.org/r/20240301-qcom-pd-mapper-v2-0-5d12a081d...@linaro.org Changes in RFC v2: - Swapped num_domains / domains (Konrad) - Fixed an issue with battery not working on sc8280xp - Added missing configuration for QCS404 --- Dmitry Baryshkov (5): soc: qcom: pdr: protect locator_addr with the main mutex soc: qcom: pdr: fix parsing of domains lists soc: qcom: pdr: extract PDR message marshalling data soc: qcom: add pd-mapper implementation remoteproc: qcom: enable in-kernel PD mapper drivers/remoteproc/qcom_common.c| 87 + drivers/remoteproc/qcom_common.h| 10 + drivers/remoteproc/qcom_q6v5_adsp.c | 3 + drivers/remoteproc/qcom_q6v5_mss.c | 3 + drivers/remoteproc/qcom_q6v5_pas.c | 3 + drivers/remoteproc/qcom_q6v5_wcss.c | 3 + drivers/soc/qcom/Kconfig| 15 + drivers/soc/qcom/Makefile | 2 + drivers/soc/qcom/pdr_interface.c| 17 +- drivers/soc/qcom/pdr_internal.h | 318 ++--- drivers/soc/qcom/qcom_pd_mapper.c | 676 drivers/soc/qcom/qcom_pdr_msg.c | 353 +++ 12 files changed, 1190 insertions(+), 300 deletions(-) --- base-commit: e5119bbdaca76cd3c15c3c975d51d840bbfb2488 change-id: 20240301-qcom-pd-mapper-e12d622d4ad0 Best regards, Tested-by: Neil Armstrong # on SM8550-QRD Tested-by: Neil Armstrong # on SM8550-HDK Tested-by: Neil Armstrong # on SM8650-QRD Thanks, Neil
Re: [PATCH v8 1/5] soc: qcom: pdr: protect locator_addr with the main mutex
On Thu, 6 Jun 2024 at 01:48, Chris Lew wrote: > > Hi Dmitry, > > On 5/11/2024 2:56 PM, Dmitry Baryshkov wrote: > ... > > @@ -76,12 +76,12 @@ static int pdr_locator_new_server(struct qmi_handle > > *qmi, > > locator_hdl); > > struct pdr_service *pds; > > > > + mutex_lock(>lock); > > /* Create a local client port for QMI communication */ > > pdr->locator_addr.sq_family = AF_QIPCRTR; > > pdr->locator_addr.sq_node = svc->node; > > pdr->locator_addr.sq_port = svc->port; > > > > - mutex_lock(>lock); > > pdr->locator_init_complete = true; > > mutex_unlock(>lock); > > > > @@ -104,10 +104,10 @@ static void pdr_locator_del_server(struct qmi_handle > > *qmi, > > > > mutex_lock(>lock); > > pdr->locator_init_complete = false; > > - mutex_unlock(>lock); > > > > pdr->locator_addr.sq_node = 0; > > pdr->locator_addr.sq_port = 0; > > + mutex_unlock(>lock); > > } > > > > static const struct qmi_ops pdr_locator_ops = { > > @@ -365,6 +365,7 @@ static int pdr_get_domain_list(struct > > servreg_get_domain_list_req *req, > > if (ret < 0) > > return ret; > > > > + mutex_lock(>lock); > > ret = qmi_send_request(>locator_hdl, > > >locator_addr, > > , SERVREG_GET_DOMAIN_LIST_REQ, > > @@ -373,15 +374,16 @@ static int pdr_get_domain_list(struct > > servreg_get_domain_list_req *req, > > req); > > if (ret < 0) { > > qmi_txn_cancel(); > > - return ret; > > + goto err_unlock; > > } > > > > ret = qmi_txn_wait(, 5 * HZ); > > if (ret < 0) { > > pr_err("PDR: %s get domain list txn wait failed: %d\n", > > req->service_name, ret); > > - return ret; > > + goto err_unlock; > > } > > + mutex_unlock(>lock); > > I'm not sure it is necessary to hold the the mutex during the > qmi_txn_wait() since the only variable we are trying to protect is > locator_addr. > > Wouldn't this delay other work like new/del server notifications if this > qmi service is delayed or non-responsive? > I've verified, the addr is stored inside the message data by the enqueueing functions, so the locator_addr isn't referenced after the function returns. I'll reduce the locking scope. -- With best wishes Dmitry
RE: [PATCH] remoteproc: imx_rproc: Adjust phandle parsing issue while remapping optional addresses in imx_rproc_addr_init()
Hi Aleksandr, > Subject: [PATCH] remoteproc: imx_rproc: Adjust phandle parsing issue while > remapping optional addresses in imx_rproc_addr_init() > > In imx_rproc_addr_init() "nph = of_count_phandle_with_args()" just counts > number of phandles. But phandles may be empty. So of_parse_phandle() in > the parsing loop (0 < a < nph) may return NULL which is later dereferenced. > Adjust this issue by adding NULL-return check. It is good to add a check here. But I am not sure whether this will really happen. > > Found by Linux Verification Center (linuxtesting.org) with SVACE. > > Fixes: a0ff4aa6f010 ("remoteproc: imx_rproc: add a NXP/Freescale imx_rproc > driver") > Signed-off-by: Aleksandr Mishin Anyway LGTM: Reviewed-by: Peng Fan Thanks, Peng.
Re: [PATCH v3 1/5] dt-bindings: remoteproc: qcom,sa8775p-pas: Document the SA8775p ADSP, CDSP and GPDSP
On 05/06/2024 18:56, Bartosz Golaszewski wrote: > From: Bartosz Golaszewski > > Document the components used to boot the ADSP, CDSP0, CDSP1, GPDSP0 and > GPDSP1 on the SA8775p SoC. > > Signed-off-by: Bartosz Golaszewski > --- > .../bindings/remoteproc/qcom,sa8775p-pas.yaml | 160 > + > 1 file changed, 160 insertions(+) Reviewed-by: Krzysztof Kozlowski Best regards, Krzysztof
Re: [PATCH] tools/virtio: Use the __GFP_ZERO flag of kmalloc to complete the memory initialization.
Sorry, maybe I'm wrong. I wonder if the gfp parameter in static inline void *kmalloc(size_t s, gfp_t gfp) can be deleted if it is not used. Or would be better to move memset to kmalloc. Like this: #define __GFP_ZERO 0x1 static inline void *kmalloc(size_t s, gfp_t gfp) { void *p; if (__kmalloc_fake) return __kmalloc_fake; p = malloc(s); if (!p) return p; if (gfp & __GFP_ZERO) memset(p, 0, s); return p; } 在 2024/6/6 15:18, Michael S. Tsirkin 写道: On Wed, Jun 05, 2024 at 09:52:45PM +0800, cuitao wrote: Use the __GFP_ZERO flag of kmalloc to initialize memory while allocating it, without the need for an additional memset call. Signed-off-by: cuitao --- tools/virtio/linux/kernel.h | 5 + 1 file changed, 1 insertion(+), 4 deletions(-) diff --git a/tools/virtio/linux/kernel.h b/tools/virtio/linux/kernel.h index 6702008f7f5c..9e401fb7c215 100644 --- a/tools/virtio/linux/kernel.h +++ b/tools/virtio/linux/kernel.h @@ -66,10 +66,7 @@ static inline void *kmalloc_array(unsigned n, size_t s, gfp_t gfp) static inline void *kzalloc(size_t s, gfp_t gfp) { - void *p = kmalloc(s, gfp); - - memset(p, 0, s); - return p; + return kmalloc(s, gfp | __GFP_ZERO); } Why do we care? It's just here to make things compile. The simpler the better. static inline void *alloc_pages_exact(size_t s, gfp_t gfp) -- 2.25.1
Re: [PATCH] tools/virtio: Use the __GFP_ZERO flag of kmalloc to complete the memory initialization.
On Wed, Jun 05, 2024 at 09:52:45PM +0800, cuitao wrote: > Use the __GFP_ZERO flag of kmalloc to initialize memory while allocating it, > without the need for an additional memset call. > > Signed-off-by: cuitao > --- > tools/virtio/linux/kernel.h | 5 + > 1 file changed, 1 insertion(+), 4 deletions(-) > > diff --git a/tools/virtio/linux/kernel.h b/tools/virtio/linux/kernel.h > index 6702008f7f5c..9e401fb7c215 100644 > --- a/tools/virtio/linux/kernel.h > +++ b/tools/virtio/linux/kernel.h > @@ -66,10 +66,7 @@ static inline void *kmalloc_array(unsigned n, size_t s, > gfp_t gfp) > > static inline void *kzalloc(size_t s, gfp_t gfp) > { > - void *p = kmalloc(s, gfp); > - > - memset(p, 0, s); > - return p; > + return kmalloc(s, gfp | __GFP_ZERO); > } Why do we care? It's just here to make things compile. The simpler the better. > static inline void *alloc_pages_exact(size_t s, gfp_t gfp) > -- > 2.25.1
Re: [PATCH 1/1] dt-bindings: remoteproc: imx_rproc: add minItems for power-domain
On 05/06/2024 21:34, Frank Li wrote: > "fsl,imx8qxp-cm4" just need 2 power domains. Keep the same restriction for > other compatible string. > > Signed-off-by: Frank Li > --- > > Notes: > pass dt_binding_check. > > make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- -j8 dt_binding_check > DT_SCHEMA_FILES=fsl,imx-rproc.yaml > SCHEMA Documentation/devicetree/bindings/processed-schema.json > CHKDT Documentation/devicetree/bindings > LINTDocumentation/devicetree/bindings > DTEX > Documentation/devicetree/bindings/remoteproc/fsl,imx-rproc.example.dts > DTC_CHK > Documentation/devicetree/bindings/remoteproc/fsl,imx-rproc.example.dtb > > .../bindings/remoteproc/fsl,imx-rproc.yaml | 16 > 1 file changed, 16 insertions(+) > > diff --git a/Documentation/devicetree/bindings/remoteproc/fsl,imx-rproc.yaml > b/Documentation/devicetree/bindings/remoteproc/fsl,imx-rproc.yaml > index df36e29d974ca..60267c1ba94e0 100644 > --- a/Documentation/devicetree/bindings/remoteproc/fsl,imx-rproc.yaml > +++ b/Documentation/devicetree/bindings/remoteproc/fsl,imx-rproc.yaml > @@ -59,6 +59,7 @@ properties: > maxItems: 32 > >power-domains: > +minItems: 1 Then here should be 2. > maxItems: 8 > >fsl,auto-boot: > @@ -99,6 +100,21 @@ allOf: >properties: > fsl,iomuxc-gpr: false > > + - if: > + properties: > +compatible: > + contains: > +enum: > + - fsl,imx8qxp-cm4 > +then: > + properties: > +power-domains: > + minItems: 2 Best regards, Krzysztof
Re: [PATCH v14 14/14] selftests/sgx: Add scripts for EPC cgroup testing
On Thu Jun 6, 2024 at 9:20 AM EEST, Jarkko Sakkinen wrote: > > There are existing code where BUG_ON() is used during the kernel early > > boot code when memory allocation fails (e.g., see cgroup_init_subsys()), > > so it might be acceptable to use BUG_ON() here, but it's up to > > maintainers to decide whether it is OK. > > When it is not possible continue to run the system at all, and only > then. > > Here it is possible. Without SGX. With this logic sgx_init() should call BUG() in the error paths. BR, Jarkko
Re: [PATCH v14 14/14] selftests/sgx: Add scripts for EPC cgroup testing
On Thu Jun 6, 2024 at 1:30 AM EEST, Huang, Kai wrote: > > >> Reorg: > >> > >> void sgx_cgroup_init(void) > >> { > >> struct workqueue_struct *wq; > >> > >> /* eagerly allocate the workqueue: */ > >> wq = alloc_workqueue("sgx_cg_wq", wq_unbound | wq_freezable, > >> wq_unbound_max_active); > >> if (!wq) { > >> pr_warn("sgx_cg_wq creation failed\n"); > >> return; > > > > sgx_cgroup_try_charge() expects sgx_cg_wq, so it would break unless we > > check and return 0 which was the initially implemented in v12. But then > > Kai had some concern on that we expose all the interface files to allow > > user to set limits but we don't enforce. To keep it simple we settled > > down back to BUG_ON(). > > [...] > > > This would only happen rarely and user can add > > command-line to disable SGX if s/he really wants to start kernel in this > > case, just can't do SGX. > > Just to be clear that I don't like BUG_ON() either. It's inevitable you > will get attention because of using it. Just then plain disable SGX if it fails to start. Do not take down the whole system. > This is a compromise that you don't want to reset "capacity" to 0 when > alloc_workqueue() fails. BUG_ON() is not a "compromise". > There are existing code where BUG_ON() is used during the kernel early > boot code when memory allocation fails (e.g., see cgroup_init_subsys()), > so it might be acceptable to use BUG_ON() here, but it's up to > maintainers to decide whether it is OK. When it is not possible continue to run the system at all, and only then. Here it is possible. Without SGX. BR, Jarkko
Re: [PATCH v14 14/14] selftests/sgx: Add scripts for EPC cgroup testing
On Wed Jun 5, 2024 at 6:33 PM EEST, Haitao Huang wrote: > sgx_cgroup_try_charge() expects sgx_cg_wq, so it would break unless we > check and return 0 which was the initially implemented in v12. But then > Kai had some concern on that we expose all the interface files to allow > user to set limits but we don't enforce. To keep it simple we settled down ~~ Sure: "Keep it simple and corpse" > back to BUG_ON(). This would only happen rarely and user can add > command-line to disable SGX if s/he really wants to start kernel in this > case, just can't do SGX. Even disabling all of SGX would be a less catastrophical measure. > Yes I had a comment but Kai thought it was too obvious and I can't think > of a better one that's not obvious so I removed: Not great advice given. Please just document it. In patch, which BUG_ON() I don't want to see my R-by in it, until I've reviewed an updated version. BR, Jarkko
Re: (subset) [PATCH 0/2] Add HTC One (M8) support
On Mon, 03 Jun 2024 02:28:55 -0400, Alexandre Messier wrote: > Add an initial device tree to support the HTC One (M8) smartphone, > aka "htc,m8". > > Applied, thanks! [2/2] ARM: dts: qcom: Add initial support for HTC One (M8) commit: 0e8a41e511c98f5f5796c0dca8ff983d1c967b93 Best regards, -- Bjorn Andersson
回复: Re: [PATCH] tools/virtio: Use the __GFP_ZERO flag of kmalloc to complete the memory initialization.
Sorry, this is a stupid mistake. I wonder if the gfp parameter in static inline void *kmalloc(size_t s, gfp_t gfp) can be deleted if it is not used. Or would be better to move memset to kmalloc. Like this: #define __GFP_ZERO 0x1 static inline void *kmalloc(size_t s, gfp_t gfp) { void *p; if (__kmalloc_fake) return __kmalloc_fake; p = malloc(s); if (!p) return p; if (gfp & __GFP_ZERO) memset(p, 0, s); return p; } 主 题:Re: [PATCH] tools/virtio: Use the __GFP_ZERO flag of kmalloc to complete the memory initialization. 日 期:2024-06-06 08:29 发件人:jasowang 收件人:崔涛; On Wed, Jun 5, 2024 at 9:56 PM cuitao wrote:>> Use the __GFP_ZERO flag of kmalloc to initialize memory while allocating it,> without the need for an additional memset call.>> Signed-off-by: cuitao > ---> tools/virtio/linux/kernel.h | 5 +> 1 file changed, 1 insertion(+), 4 deletions(-)>> diff --git a/tools/virtio/linux/kernel.h b/tools/virtio/linux/kernel.h> index 6702008f7f5c..9e401fb7c215 100644> --- a/tools/virtio/linux/kernel.h> +++ b/tools/virtio/linux/kernel.h> @@ -66,10 +66,7 @@ static inline void *kmalloc_array(unsigned n, size_t s, gfp_t gfp)>> static inline void *kzalloc(size_t s, gfp_t gfp)> {> - void *p = kmalloc(s, gfp);> -> - memset(p, 0, s);> - return p;> + return kmalloc(s, gfp | __GFP_ZERO);> }>> static inline void *alloc_pages_exact(size_t s, gfp_t gfp)> --> 2.25.1>Does this really work?extern void *__kmalloc_fake, *__kfree_ignore_start, *__kfree_ignore_end;static inline void *kmalloc(size_t s, gfp_t gfp){if (__kmalloc_fake)return __kmalloc_fake;return malloc(s);}Thanks
Re: [PATCH] fgraph: Remove some unused functions
On Thu, 6 Jun 2024 10:10:53 +0800 Jiapeng Chong wrote: > These functions are defined in the fgraph.c file, but not > called elsewhere, so delete these unused functions. > > kernel/trace/fgraph.c:273:1: warning: unused function 'set_bitmap_bits'. > kernel/trace/fgraph.c:259:19: warning: unused function 'get_fgraph_type'. > Thanks, these are leftovers from the rewrite. -- Steve
Re: [PATCH] ftrace: adding the missing parameter descriptions of unregister_ftrace_direct
On Mon, 27 May 2024 21:50:46 -0300 MarileneGarcia wrote: The subject for the tracing subsystem should start with a capital letter, but it is a bit confusing anyway. Should be: ftrace: Add missing kerneldoc parameters to unregister_ftrace_direct() > Adding the description of the parameters addr and free_filters > of the function unregister_ftrace_direct. s/Adding/Add/ s/of the/to the/ > > Signed-off-by: MarileneGarcia > --- > Hello, > These changes fix the following compiler warnings of the function > unregister_ftrace_direct. > > The warnings happen using GCC compiler, enabling the ftrace related > configs and using the command 'make W=1'. > > kernel/trace/ftrace.c:5489: warning: Function parameter or struct member > 'addr' not described in 'unregister_ftrace_direct' > > kernel/trace/ftrace.c:5489: warning: Function parameter or struct member > 'free_filters' not described in 'unregister_ftrace_direct' > > kernel/trace/ftrace.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c > index 65208d3b5ed9..6062e4ce1957 100644 > --- a/kernel/trace/ftrace.c > +++ b/kernel/trace/ftrace.c > @@ -5475,6 +5475,8 @@ EXPORT_SYMBOL_GPL(register_ftrace_direct); > * unregister_ftrace_direct - Remove calls to custom trampoline > * previously registered by register_ftrace_direct for @ops object. > * @ops: The address of the struct ftrace_ops object > + * @addr: The address of the trampoline to call at @ops functions This is the unregister function. The above sounds like it will be called instead of no longer being called. @addr: The address of the direct function that are called by the @ops functions > + * @free_filters: non zero to remove all filters for the ftrace_ops It's a boolean value, there is no zero. > * > * This is used to remove a direct calls to @addr from the nop locations > * of the functions registered in @ops (with by ftrace_set_filter_ip -- Steve
Re: [PATCH v8 2/5] soc: qcom: pdr: fix parsing of domains lists
On 5/11/2024 2:56 PM, Dmitry Baryshkov wrote: While parsing the domains list, start offsets from 0 rather than from domains_read. The domains_read is equal to the total count of the domains we have seen, while the domains list in the message starts from offset 0. Fixes: fbe639b44a82 ("soc: qcom: Introduce Protection Domain Restart helpers") Tested-by: Steev Klimaszewski Tested-by: Alexey Minnekhanov Signed-off-by: Dmitry Baryshkov --- drivers/soc/qcom/pdr_interface.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/soc/qcom/pdr_interface.c b/drivers/soc/qcom/pdr_interface.c index e014dd2d8ab3..d495ee736519 100644 --- a/drivers/soc/qcom/pdr_interface.c +++ b/drivers/soc/qcom/pdr_interface.c @@ -422,7 +422,7 @@ static int pdr_locate_service(struct pdr_handle *pdr, struct pdr_service *pds) if (ret < 0) goto out; - for (i = domains_read; i < resp->domain_list_len; i++) { + for (i = 0; i < resp->domain_list_len; i++) { entry = >domain_list[i]; if (strnlen(entry->name, sizeof(entry->name)) == sizeof(entry->name)) Reviewed-by: Chris Lew
Re: [PATCH] tools/virtio: Use the __GFP_ZERO flag of kmalloc to complete the memory initialization.
On Wed, Jun 5, 2024 at 9:56 PM cuitao wrote: > > Use the __GFP_ZERO flag of kmalloc to initialize memory while allocating it, > without the need for an additional memset call. > > Signed-off-by: cuitao > --- > tools/virtio/linux/kernel.h | 5 + > 1 file changed, 1 insertion(+), 4 deletions(-) > > diff --git a/tools/virtio/linux/kernel.h b/tools/virtio/linux/kernel.h > index 6702008f7f5c..9e401fb7c215 100644 > --- a/tools/virtio/linux/kernel.h > +++ b/tools/virtio/linux/kernel.h > @@ -66,10 +66,7 @@ static inline void *kmalloc_array(unsigned n, size_t s, > gfp_t gfp) > > static inline void *kzalloc(size_t s, gfp_t gfp) > { > - void *p = kmalloc(s, gfp); > - > - memset(p, 0, s); > - return p; > + return kmalloc(s, gfp | __GFP_ZERO); > } > > static inline void *alloc_pages_exact(size_t s, gfp_t gfp) > -- > 2.25.1 > Does this really work? extern void *__kmalloc_fake, *__kfree_ignore_start, *__kfree_ignore_end; static inline void *kmalloc(size_t s, gfp_t gfp) { if (__kmalloc_fake) return __kmalloc_fake; return malloc(s); } Thanks
Re: [PATCH net-next V2] virtio-net: synchronize operstate with admin state on up/down
On Fri, May 31, 2024 at 8:18 AM Jason Wang wrote: > > On Thu, May 30, 2024 at 9:09 PM Michael S. Tsirkin wrote: > > > > On Thu, May 30, 2024 at 06:29:51PM +0800, Jason Wang wrote: > > > On Thu, May 30, 2024 at 2:10 PM Michael S. Tsirkin > > > wrote: > > > > > > > > On Thu, May 30, 2024 at 11:20:55AM +0800, Jason Wang wrote: > > > > > This patch synchronize operstate with admin state per RFC2863. > > > > > > > > > > This is done by trying to toggle the carrier upon open/close and > > > > > synchronize with the config change work. This allows propagate status > > > > > correctly to stacked devices like: > > > > > > > > > > ip link add link enp0s3 macvlan0 type macvlan > > > > > ip link set link enp0s3 down > > > > > ip link show > > > > > > > > > > Before this patch: > > > > > > > > > > 3: enp0s3: mtu 1500 qdisc pfifo_fast state DOWN > > > > > mode DEFAULT group default qlen 1000 > > > > > link/ether 00:00:05:00:00:09 brd ff:ff:ff:ff:ff:ff > > > > > .. > > > > > 5: macvlan0@enp0s3: mtu 1500 > > > > > qdisc noqueue state UP mode DEFAULT group default qlen 1000 > > > > > link/ether b2:a9:c5:04:da:53 brd ff:ff:ff:ff:ff:ff > > > > > > > > > > After this patch: > > > > > > > > > > 3: enp0s3: mtu 1500 qdisc pfifo_fast state DOWN > > > > > mode DEFAULT group default qlen 1000 > > > > > link/ether 00:00:05:00:00:09 brd ff:ff:ff:ff:ff:ff > > > > > ... > > > > > 5: macvlan0@enp0s3: mtu > > > > > 1500 qdisc noqueue state LOWERLAYERDOWN mode DEFAULT group default > > > > > qlen 1000 > > > > > link/ether b2:a9:c5:04:da:53 brd ff:ff:ff:ff:ff:ff > > > > > > > > > > Cc: Venkat Venkatsubra > > > > > Cc: Gia-Khanh Nguyen > > > > > Reviewed-by: Xuan Zhuo > > > > > Acked-by: Michael S. Tsirkin > > > > > Signed-off-by: Jason Wang > > > > > --- > > > > > Changes since V1: > > > > > - rebase > > > > > - add ack/review tags > > > > > > > > > > > > > > > > > > > > > > > > > --- > > > > > drivers/net/virtio_net.c | 94 > > > > > +++- > > > > > 1 file changed, 63 insertions(+), 31 deletions(-) > > > > > > > > > > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c > > > > > index 4a802c0ea2cb..69e4ae353c51 100644 > > > > > --- a/drivers/net/virtio_net.c > > > > > +++ b/drivers/net/virtio_net.c > > > > > @@ -433,6 +433,12 @@ struct virtnet_info { > > > > > /* The lock to synchronize the access to refill_enabled */ > > > > > spinlock_t refill_lock; > > > > > > > > > > + /* Is config change enabled? */ > > > > > + bool config_change_enabled; > > > > > + > > > > > + /* The lock to synchronize the access to config_change_enabled > > > > > */ > > > > > + spinlock_t config_change_lock; > > > > > + > > > > > /* Work struct for config space updates */ > > > > > struct work_struct config_work; > > > > > > > > > > > > > > > > > But we already have dev->config_lock and dev->config_enabled. > > > > > > > > And it actually works better - instead of discarding config > > > > change events it defers them until enabled. > > > > > > > > > > Yes but then both virtio-net driver and virtio core can ask to enable > > > and disable and then we need some kind of synchronization which is > > > non-trivial. > > > > Well for core it happens on bring up path before driver works > > and later on tear down after it is gone. > > So I do not think they ever do it at the same time. > > For example, there could be a suspend/resume when the admin state is down. > > > > > > > > And device enabling on the core is different from bringing the device > > > up in the networking subsystem. Here we just delay to deal with the > > > config change interrupt on ndo_open(). (E.g try to ack announce is > > > meaningless when the device is down). > > > > > > Thanks > > > > another thing is that it is better not to re-read all config > > on link up if there was no config interrupt - less vm exits. > > Yes, but it should not matter much as it's done in the ndo_open(). Michael, any more comments on this? Please confirm if this patch is ok or not. If you prefer to reuse the config_disable() I can change it from a boolean to a counter that allows to be nested. Thanks > > Thanks > > > > > -- > > MST > >
Re: [PATCH v3 13/27] function_graph: Add pid tracing back to function graph tracer
On Mon, 03 Jun 2024 15:07:17 -0400 Steven Rostedt wrote: > +++ b/kernel/trace/ftrace.c > @@ -100,7 +100,7 @@ struct ftrace_ops *function_trace_op __read_mostly = > _list_end; > /* What to set function_trace_op to */ > static struct ftrace_ops *set_function_trace_op; > > -static bool ftrace_pids_enabled(struct ftrace_ops *ops) > +bool ftrace_pids_enabled(struct ftrace_ops *ops) > { > struct trace_array *tr; > > @@ -402,10 +402,11 @@ static void ftrace_update_pid_func(void) > if (op->flags & FTRACE_OPS_FL_PID) { > op->func = ftrace_pids_enabled(op) ? > ftrace_pid_func : op->saved_func; > - ftrace_update_trampoline(op); Bah, this patch accidentally removed the above function and broke pid tracing. Hmm, not sure why this still passed the tests. Will investigate. -- Steve > } > } while_for_each_ftrace_op(op); > > + fgraph_update_pid_func(); > + > update_ftrace_function(); > } >
Re: [RFC v3 net-next 1/7] net: add rx_sk to trace_kfree_skb
On Tue, 4 Jun 2024 14:47:38 -0700 Yan Zhai wrote: > skb does not include enough information to find out receiving > sockets/services and netns/containers on packet drops. In theory > skb->dev tells about netns, but it can get cleared/reused, e.g. by TCP > stack for OOO packet lookup. Similarly, skb->sk often identifies a local > sender, and tells nothing about a receiver. > > Allow passing an extra receiving socket to the tracepoint to improve > the visibility on receiving drops. > > Signed-off-by: Yan Zhai > --- > v2->v3: fixed drop_monitor function prototype > --- > include/trace/events/skb.h | 11 +++ > net/core/dev.c | 2 +- > net/core/drop_monitor.c| 9 ++--- > net/core/skbuff.c | 2 +- > 4 files changed, 15 insertions(+), 9 deletions(-) > > diff --git a/include/trace/events/skb.h b/include/trace/events/skb.h > index 07e0715628ec..aa6b46b6172c 100644 > --- a/include/trace/events/skb.h > +++ b/include/trace/events/skb.h > @@ -24,15 +24,16 @@ DEFINE_DROP_REASON(FN, FN) > TRACE_EVENT(kfree_skb, > > TP_PROTO(struct sk_buff *skb, void *location, > - enum skb_drop_reason reason), > + enum skb_drop_reason reason, struct sock *rx_sk), > > - TP_ARGS(skb, location, reason), > + TP_ARGS(skb, location, reason, rx_sk), > > TP_STRUCT__entry( > __field(void *, skbaddr) > __field(void *, location) > __field(unsigned short, protocol) > __field(enum skb_drop_reason, reason) > + __field(void *, rx_skaddr) Please add the pointer after the other pointers: __field(void *, skbaddr) __field(void *, location) + __field(void *, rx_skaddr) __field(unsigned short, protocol) __field(enum skb_drop_reason, reason) otherwise you are adding holes in the ring buffer event. The TP_STRUCT__entry() is a structure that is saved in the ring buffer. We want to avoid alignment holes. I also question having a short before the enum, if the emum is 4 bytes. The short should be at the end. In fact, looking at the format file, there is a 2 byte hole: # cat /sys/kernel/tracing/events/skb/kfree_skb/format name: kfree_skb ID: 1799 format: field:unsigned short common_type; offset:0; size:2; signed:0; field:unsigned char common_flags; offset:2; size:1; signed:0; field:unsigned char common_preempt_count; offset:3; size:1; signed:0; field:int common_pid; offset:4; size:4; signed:1; field:void * skbaddr; offset:8; size:8; signed:0; field:void * location; offset:16; size:8; signed:0; field:unsigned short protocol; offset:24; size:2; signed:0; field:enum skb_drop_reason reason; offset:28; size:4; signed:0; Notice that "protocol" is 2 bytes in size at offset 24, but "reason" starts at offset 28. This means at offset 26, there's a 2 byte hole. -- Steve > ), > > TP_fast_assign( > @@ -40,12 +41,14 @@ TRACE_EVENT(kfree_skb, > __entry->location = location; > __entry->protocol = ntohs(skb->protocol); > __entry->reason = reason; > + __entry->rx_skaddr = rx_sk; > ), >
Re: [PATCH v2 0/5] ftrace: Clean up and comment code
On Wed, 05 Jun 2024 14:03:34 -0400 Steven Rostedt wrote: > While working on the function_graph multiple users code, I realized > that I was struggling with how the ftrace code worked. Being the > author of such code meant that it wasn't very intuitive. Namely, the > function names were not descriptive enough, or at least, they needed > comments. > > This series moves to solve some of that via changing a couple function > names and parameters and adding comments to many of them. > This series looks good to me. Acked-by: Masami Hiramatsu (Google) for this series. Thanks! > There's more to do, but this at least moves it in the right direction. > > Changes since v1: > https://lore.kernel.org/all/20240604212817.384103...@goodmis.org/ > > - While working on v1 and responding to a comment from Mark Rutland about > the usage of "ftrace_hash" in the __ftrace_hash_rec_update() code, > I realized that the function does pretty much the same thing if > it is set or not set (but slightly differently). It turns out that > it isn't needed and that parameter can be removed, making the code > simpler. > > - Fixed some wording and typos suggested by Mark Rutland. > > Steven Rostedt (Google) (5): > ftrace: Rename dup_hash() and comment it > ftrace: Remove "ftrace_hash" parameter from __ftrace_hash_rec_update() > ftrace: Add comments to ftrace_hash_rec_disable/enable() > ftrace: Convert "inc" parameter to bool in > ftrace_hash_rec_update_modify() > ftrace: Add comments to ftrace_hash_move() and friends > > > kernel/trace/ftrace.c | 161 > +- > 1 file changed, 94 insertions(+), 67 deletions(-) -- Masami Hiramatsu (Google)
Re: [PATCH v2 1/5] ftrace: Rename dup_hash() and comment it
On Wed, 05 Jun 2024 14:03:35 -0400 Steven Rostedt wrote: > From: "Steven Rostedt (Google)" > > The name "dup_hash()" is a misnomer as it does not duplicate the hash that > is passed in, but instead moves its entities from that hash to a newly > allocated one. Rename it to "__move_hash()" (using starting underscores as > it is an internal function), and add some comments about what it does. > Looks good to me. Acked-by: Masami Hiramatsu (Google) > Signed-off-by: Steven Rostedt (Google) > --- > kernel/trace/ftrace.c | 8 ++-- > 1 file changed, 6 insertions(+), 2 deletions(-) > > diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c > index da7e6abf48b4..9dcdefe9d1aa 100644 > --- a/kernel/trace/ftrace.c > +++ b/kernel/trace/ftrace.c > @@ -1391,7 +1391,11 @@ ftrace_hash_rec_enable_modify(struct ftrace_ops *ops, > int filter_hash); > static int ftrace_hash_ipmodify_update(struct ftrace_ops *ops, > struct ftrace_hash *new_hash); > > -static struct ftrace_hash *dup_hash(struct ftrace_hash *src, int size) > +/* > + * Allocate a new hash and remove entries from @src and move them to the new > hash. > + * On success, the @src hash will be empty and should be freed. > + */ > +static struct ftrace_hash *__move_hash(struct ftrace_hash *src, int size) > { > struct ftrace_func_entry *entry; > struct ftrace_hash *new_hash; > @@ -1438,7 +1442,7 @@ __ftrace_hash_move(struct ftrace_hash *src) > if (ftrace_hash_empty(src)) > return EMPTY_HASH; > > - return dup_hash(src, size); > + return __move_hash(src, size); > } > > static int > -- > 2.43.0 > > -- Masami Hiramatsu (Google)
Re: [PATCH v8 1/5] soc: qcom: pdr: protect locator_addr with the main mutex
Hi Dmitry, On 5/11/2024 2:56 PM, Dmitry Baryshkov wrote: ... @@ -76,12 +76,12 @@ static int pdr_locator_new_server(struct qmi_handle *qmi, locator_hdl); struct pdr_service *pds; + mutex_lock(>lock); /* Create a local client port for QMI communication */ pdr->locator_addr.sq_family = AF_QIPCRTR; pdr->locator_addr.sq_node = svc->node; pdr->locator_addr.sq_port = svc->port; - mutex_lock(>lock); pdr->locator_init_complete = true; mutex_unlock(>lock); @@ -104,10 +104,10 @@ static void pdr_locator_del_server(struct qmi_handle *qmi, mutex_lock(>lock); pdr->locator_init_complete = false; - mutex_unlock(>lock); pdr->locator_addr.sq_node = 0; pdr->locator_addr.sq_port = 0; + mutex_unlock(>lock); } static const struct qmi_ops pdr_locator_ops = { @@ -365,6 +365,7 @@ static int pdr_get_domain_list(struct servreg_get_domain_list_req *req, if (ret < 0) return ret; + mutex_lock(>lock); ret = qmi_send_request(>locator_hdl, >locator_addr, , SERVREG_GET_DOMAIN_LIST_REQ, @@ -373,15 +374,16 @@ static int pdr_get_domain_list(struct servreg_get_domain_list_req *req, req); if (ret < 0) { qmi_txn_cancel(); - return ret; + goto err_unlock; } ret = qmi_txn_wait(, 5 * HZ); if (ret < 0) { pr_err("PDR: %s get domain list txn wait failed: %d\n", req->service_name, ret); - return ret; + goto err_unlock; } + mutex_unlock(>lock); I'm not sure it is necessary to hold the the mutex during the qmi_txn_wait() since the only variable we are trying to protect is locator_addr. Wouldn't this delay other work like new/del server notifications if this qmi service is delayed or non-responsive? Thanks, Chris if (resp->resp.result != QMI_RESULT_SUCCESS_V01) { pr_err("PDR: %s get domain list failed: 0x%x\n", @@ -390,6 +392,11 @@ static int pdr_get_domain_list(struct servreg_get_domain_list_req *req, } return 0; + +err_unlock: + mutex_unlock(>lock); + + return ret; }
Re: [PATCH v14 14/14] selftests/sgx: Add scripts for EPC cgroup testing
Reorg: void sgx_cgroup_init(void) { struct workqueue_struct *wq; /* eagerly allocate the workqueue: */ wq = alloc_workqueue("sgx_cg_wq", wq_unbound | wq_freezable, wq_unbound_max_active); if (!wq) { pr_warn("sgx_cg_wq creation failed\n"); return; sgx_cgroup_try_charge() expects sgx_cg_wq, so it would break unless we check and return 0 which was the initially implemented in v12. But then Kai had some concern on that we expose all the interface files to allow user to set limits but we don't enforce. To keep it simple we settled down back to BUG_ON(). [...] This would only happen rarely and user can add command-line to disable SGX if s/he really wants to start kernel in this case, just can't do SGX. Just to be clear that I don't like BUG_ON() either. It's inevitable you will get attention because of using it. This is a compromise that you don't want to reset "capacity" to 0 when alloc_workqueue() fails. There are existing code where BUG_ON() is used during the kernel early boot code when memory allocation fails (e.g., see cgroup_init_subsys()), so it might be acceptable to use BUG_ON() here, but it's up to maintainers to decide whether it is OK. [...] With "--strict" flag I also catched these: CHECK: spinlock_t definition without comment #1308: FILE: arch/x86/kernel/cpu/sgx/sgx.h:122: + spinlock_t lock; Yes I had a comment but Kai thought it was too obvious and I can't think of a better one that's not obvious so I removed: https://lore.kernel.org/linux-sgx/9ffb02a3344807f2c173fe8c7cb000cd6c7843b6.ca...@intel.com/ To be clear, my reply was really about the comment itself isn't useful, but didn't say we shouldn't use a comment here. CHECK: multiple assignments should be avoided #444: FILE: kernel/cgroup/misc.c:450: + parent_cg = cg = _cg; This was also suggested by Kai a few versions back: https://lore.kernel.org/linux-sgx/8f08f0b0f2b04b90d7cdb7b628f16f9080687c43.ca...@intel.com/ I didn't know checkpatch complains this. Feel free to revert back as it is trivial to me.
Re: [PATCH 3/5] ftrace: Remove "filter_hash" parameter from ftrace_hash_rec_disable/enable()
On Thu, 6 Jun 2024 06:50:18 +0900 Masami Hiramatsu (Google) wrote: > On Tue, 04 Jun 2024 17:28:20 -0400 > Steven Rostedt wrote: > > > From: "Steven Rostedt (Google)" > > > > The functions ftrace_hash_rec_disable() and ftrace_hash_rec_enable() > > always has 1 passed to its "ftrace_hash" parameter. Remove the parameter > > and pass in true to __ftrace_hash_rec_update(). > > > > Also add some comments to both those functions explaining what they do. > > > > Looks good to me. > > Reviewed-by: Masami Hiramatsu (Google) Thanks Masami, but I sent out a v2. Could you review those patches instead? https://lore.kernel.org/all/20240605180334.090848...@goodmis.org/ -- Steve
Re: [PATCH 0/6] ftrace: Minor fixes for sparse and kernel test robot
On Wed, 05 Jun 2024 16:26:44 -0400 Steven Rostedt wrote: > > Recieved some minor bug reports from the kernel test robot. First I started > cleaning up some of the sparse warnings. There's many more, but most changes > are not really helping anything, but just quieting the warnings. > > But the reports from kernel test robot need to be fixed. All looks good to me. Acked-by: Masami Hiramatsu (Google) Thank you! > > Steven Rostedt (Google) (6): > ftrace: Declare function_trace_op in header to quiet sparse warning > ftrace: Assign ftrace_list_end to ftrace_ops_list type cast to RCU > ftrace: Assign RCU list variable with rcu_assign_ptr() > ftrace: Fix prototypes for ftrace_startup/shutdown_subops() > function_graph: Make fgraph_do_direct static key static > function_graph: Do not update pid func if CONFIG_DYNAMIC_FTRACE not > enabled > > > include/linux/ftrace.h | 3 +++ > kernel/trace/fgraph.c | 4 +++- > kernel/trace/ftrace.c | 4 ++-- > kernel/trace/ftrace_internal.h | 9 + > kernel/trace/trace.h | 1 - > 5 files changed, 17 insertions(+), 4 deletions(-) -- Masami Hiramatsu (Google)
Re: [PATCH 3/5] ftrace: Remove "filter_hash" parameter from ftrace_hash_rec_disable/enable()
On Tue, 04 Jun 2024 17:28:20 -0400 Steven Rostedt wrote: > From: "Steven Rostedt (Google)" > > The functions ftrace_hash_rec_disable() and ftrace_hash_rec_enable() > always has 1 passed to its "ftrace_hash" parameter. Remove the parameter > and pass in true to __ftrace_hash_rec_update(). > > Also add some comments to both those functions explaining what they do. > Looks good to me. Reviewed-by: Masami Hiramatsu (Google) Thank you, > Signed-off-by: Steven Rostedt (Google) > --- > kernel/trace/ftrace.c | 24 > 1 file changed, 16 insertions(+), 8 deletions(-) > > diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c > index 93c7c5fd4249..de652201c86c 100644 > --- a/kernel/trace/ftrace.c > +++ b/kernel/trace/ftrace.c > @@ -1895,16 +1895,24 @@ static bool __ftrace_hash_rec_update(struct > ftrace_ops *ops, > return update; > } > > -static bool ftrace_hash_rec_disable(struct ftrace_ops *ops, > - int filter_hash) > +/* > + * This is called when an ops is removed from tracing. It will decrement > + * the counters of the dyn_ftrace records for all the functions that > + * the @ops attached to. > + */ > +static bool ftrace_hash_rec_disable(struct ftrace_ops *ops) > { > - return __ftrace_hash_rec_update(ops, filter_hash, 0); > + return __ftrace_hash_rec_update(ops, true, 0); > } > > -static bool ftrace_hash_rec_enable(struct ftrace_ops *ops, > -int filter_hash) > +/* > + * This is called when an ops is added to tracing. It will increment > + * the counters of the dyn_ftrace records for all the functions that > + * the @ops attached to. > + */ > +static bool ftrace_hash_rec_enable(struct ftrace_ops *ops) > { > - return __ftrace_hash_rec_update(ops, filter_hash, 1); > + return __ftrace_hash_rec_update(ops, true, 1); > } > > static void ftrace_hash_rec_update_modify(struct ftrace_ops *ops, > @@ -3062,7 +3070,7 @@ int ftrace_startup(struct ftrace_ops *ops, int command) > return ret; > } > > - if (ftrace_hash_rec_enable(ops, 1)) > + if (ftrace_hash_rec_enable(ops)) > command |= FTRACE_UPDATE_CALLS; > > ftrace_startup_enable(command); > @@ -3104,7 +3112,7 @@ int ftrace_shutdown(struct ftrace_ops *ops, int command) > /* Disabling ipmodify never fails */ > ftrace_hash_ipmodify_disable(ops); > > - if (ftrace_hash_rec_disable(ops, 1)) > + if (ftrace_hash_rec_disable(ops)) > command |= FTRACE_UPDATE_CALLS; > > ops->flags &= ~FTRACE_OPS_FL_ENABLED; > -- > 2.43.0 > > -- Masami Hiramatsu (Google)
Re: [PATCH 1/5] ftrace: Rename dup_hash() and comment it
On Tue, 04 Jun 2024 17:28:18 -0400 Steven Rostedt wrote: > From: "Steven Rostedt (Google)" > > The name "dup_hash()" is a misnomer as it does not duplicate the hash that > is passed in, but instead moves its entities from that hash to a newly > allocated one. Rename it to "__move_hash()" (using starting underscores as > it is an internal function), and add some comments about what it does. Good change. Reviewed-by: Masami Hiramatsu (Google) Thank you, > > Signed-off-by: Steven Rostedt (Google) > --- > kernel/trace/ftrace.c | 8 ++-- > 1 file changed, 6 insertions(+), 2 deletions(-) > > diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c > index da7e6abf48b4..9dcdefe9d1aa 100644 > --- a/kernel/trace/ftrace.c > +++ b/kernel/trace/ftrace.c > @@ -1391,7 +1391,11 @@ ftrace_hash_rec_enable_modify(struct ftrace_ops *ops, > int filter_hash); > static int ftrace_hash_ipmodify_update(struct ftrace_ops *ops, > struct ftrace_hash *new_hash); > > -static struct ftrace_hash *dup_hash(struct ftrace_hash *src, int size) > +/* > + * Allocate a new hash and remove entries from @src and move them to the new > hash. > + * On success, the @src hash will be empty and should be freed. > + */ > +static struct ftrace_hash *__move_hash(struct ftrace_hash *src, int size) > { > struct ftrace_func_entry *entry; > struct ftrace_hash *new_hash; > @@ -1438,7 +1442,7 @@ __ftrace_hash_move(struct ftrace_hash *src) > if (ftrace_hash_empty(src)) > return EMPTY_HASH; > > - return dup_hash(src, size); > + return __move_hash(src, size); > } > > static int > -- > 2.43.0 > > -- Masami Hiramatsu (Google)
Re: [RFC bpf-next 01/10] uprobe: Add session callbacks to uprobe_consumer
On 06/05, Andrii Nakryiko wrote: > > WDYT? It's still fast, and it's simpler than the shadow stack idea, IMO. Andrii. I am alredy sleeping, I'll try to read your email tomorrow. Right now I can only say that everything is simpler than the shadow stack ;) > P.S. Regardless, maybe we should change the order in which we insert > consumers to uprobe? Right now uprobe consumer added later will be > executed first, which, while not wrong, is counter-intuitive. Agreed... Even if currently this doesn't really matter, I guess it is supposed that uc->handler() is "non-intrusive". Oleg.
Re: [RFC bpf-next 01/10] uprobe: Add session callbacks to uprobe_consumer
On Wed, Jun 05, 2024 at 01:47:00PM -0700, Andrii Nakryiko wrote: > On Wed, Jun 5, 2024 at 10:57 AM Oleg Nesterov wrote: > > > > On 06/05, Andrii Nakryiko wrote: > > > > > > so any such > > > limitations will cause problems, issue reports, investigation, etc. > > > > Agreed... > > > > > As one possible solution, what if we do > > > > > > struct return_instance { > > > ... > > > u64 session_cookies[]; > > > }; > > > > > > and allocate sizeof(struct return_instance) + 8 * > > > and then at runtime pass > > > _cookies[i] as data pointer to session-aware callbacks? > > > > I too thought about this, but I guess it is not that simple. > > > > Just for example. Suppose we have 2 session-consumers C1 and C2. > > What if uprobe_unregister(C1) comes before the probed function > > returns? > > > > We need something like map_cookie_to_consumer(). > > Fair enough. The easy way to solve this is to have > > > struct uprobe_session_cookie { > int consumer_id; > u64 cookie; > }; > > And add id to each new consumer when it is added to struct uprobe. > Unfortunately, it's impossible to tell when a new consumer was added > to the list (as a front item, but maybe we just change it to be > appended instead of prepending) vs when the old consumer was removed, > so in some cases we'd need to do a linear search. also we probably need to add the flag if we want to execute the return handler.. we can have multiple session handlers and if just one of them returns 0 we need to install the return probe and then when return probe hits, we need to execute only that consumer's return handler jirka > > But the good news is that in the common case we wouldn't need to > search and the next item in session_cookies[] array would be the one > we need. > > WDYT? It's still fast, and it's simpler than the shadow stack idea, IMO. > > P.S. Regardless, maybe we should change the order in which we insert > consumers to uprobe? Right now uprobe consumer added later will be > executed first, which, while not wrong, is counter-intuitive. And also > it breaks a nice natural order when we need to match it up with stuff > like session_cookies[] as described above. > > > > > > > + /* The handler_session callback return value controls execution > > > > of > > > > +* the return uprobe and ret_handler_session callback. > > > > +* 0 on success > > > > +* 1 on failure, DO NOT install/execute the return uprobe > > > > +*console warning for anything else > > > > +*/ > > > > + int (*handler_session)(struct uprobe_consumer *self, struct > > > > pt_regs *regs, > > > > + unsigned long *data); > > > > + int (*ret_handler_session)(struct uprobe_consumer *self, > > > > unsigned long func, > > > > + struct pt_regs *regs, unsigned long > > > > *data); > > > > + > > > > > > We should try to avoid an alternative set of callbacks, IMO. Let's > > > extend existing ones with `unsigned long *data`, > > > > Oh yes, agreed. > > > > And the comment about the return value looks confusing too. I mean, the > > logic doesn't differ from the ret-code from ->handler(). > > > > "DO NOT install/execute the return uprobe" is not true if another > > non-session-consumer returns 0. > > > > Oleg. > >
Re: [RFC bpf-next 01/10] uprobe: Add session callbacks to uprobe_consumer
On 06/05, Jiri Olsa wrote: > > > And the comment about the return value looks confusing too. I mean, the > > logic doesn't differ from the ret-code from ->handler(). > > > > "DO NOT install/execute the return uprobe" is not true if another > > non-session-consumer returns 0. > > well they are meant to be exclusive, so there'd be no other > non-session-consumer OK. (but may be the changelog can explain more clearly why they can't co-exist with the non-session-consumers). But again, this doesn't differ from the the ret-code from the non-session-consumer->handler(). If it returns 1 == UPROBE_HANDLER_REMOVE, then without other consumers prepare_uretprobe() won't be called. Oleg.
Re: [RFC bpf-next 01/10] uprobe: Add session callbacks to uprobe_consumer
On Wed, Jun 05, 2024 at 10:25:56AM -0700, Andrii Nakryiko wrote: SNIP > > --- > > include/linux/uprobes.h | 18 +++ > > kernel/events/uprobes.c | 69 +++-- > > 2 files changed, 78 insertions(+), 9 deletions(-) > > > > diff --git a/include/linux/uprobes.h b/include/linux/uprobes.h > > index f46e0ca0169c..a2f2d5ac3cee 100644 > > --- a/include/linux/uprobes.h > > +++ b/include/linux/uprobes.h > > @@ -34,6 +34,12 @@ enum uprobe_filter_ctx { > > }; > > > > struct uprobe_consumer { > > + /* > > +* The handler callback return value controls removal of the uprobe. > > +* 0 on success, uprobe stays > > +* 1 on failure, remove the uprobe > > +*console warning for anything else > > +*/ > > int (*handler)(struct uprobe_consumer *self, struct pt_regs *regs); > > int (*ret_handler)(struct uprobe_consumer *self, > > unsigned long func, > > @@ -42,6 +48,17 @@ struct uprobe_consumer { > > enum uprobe_filter_ctx ctx, > > struct mm_struct *mm); > > > > + /* The handler_session callback return value controls execution of > > +* the return uprobe and ret_handler_session callback. > > +* 0 on success > > +* 1 on failure, DO NOT install/execute the return uprobe > > +*console warning for anything else > > +*/ > > + int (*handler_session)(struct uprobe_consumer *self, struct pt_regs > > *regs, > > + unsigned long *data); > > + int (*ret_handler_session)(struct uprobe_consumer *self, unsigned > > long func, > > + struct pt_regs *regs, unsigned long > > *data); > > + > > We should try to avoid an alternative set of callbacks, IMO. Let's > extend existing ones with `unsigned long *data`, but specify that > unless consumer sets some flag on registration that it needs a session > cookie, we'll pass NULL here? Or just allocate cookie data for each > registered consumer for simplicity, don't know; given we don't expect > many consumers on exactly the same uprobe, it might be ok to keep it > simple. > ah, I did not want to break existing users.. but it's not uapi, so we're good, ok makes sense jirka > > > struct uprobe_consumer *next; > > }; > > > > @@ -85,6 +102,7 @@ struct return_instance { > > unsigned long func; > > unsigned long stack; /* stack pointer */ > > unsigned long orig_ret_vaddr; /* original return address > > */ > > + unsigned long data; > > boolchained;/* true, if instance is > > nested */ > > > > struct return_instance *next; /* keep as stack */ SNIP
Re: [RFC bpf-next 01/10] uprobe: Add session callbacks to uprobe_consumer
On Wed, Jun 05, 2024 at 07:56:19PM +0200, Oleg Nesterov wrote: > On 06/05, Andrii Nakryiko wrote: > > > > so any such > > limitations will cause problems, issue reports, investigation, etc. > > Agreed... > > > As one possible solution, what if we do > > > > struct return_instance { > > ... > > u64 session_cookies[]; > > }; > > > > and allocate sizeof(struct return_instance) + 8 * > > and then at runtime pass > > _cookies[i] as data pointer to session-aware callbacks? > > I too thought about this, but I guess it is not that simple. > > Just for example. Suppose we have 2 session-consumers C1 and C2. > What if uprobe_unregister(C1) comes before the probed function > returns? > > We need something like map_cookie_to_consumer(). I guess we could have hash table in return_instance that gets 'consumer -> cookie' ? return instance is freed after the consumers' return handlers are executed, so there's no leak if some consumer gets unregistered before that > > > > + /* The handler_session callback return value controls execution of > > > +* the return uprobe and ret_handler_session callback. > > > +* 0 on success > > > +* 1 on failure, DO NOT install/execute the return uprobe > > > +*console warning for anything else > > > +*/ > > > + int (*handler_session)(struct uprobe_consumer *self, struct > > > pt_regs *regs, > > > + unsigned long *data); > > > + int (*ret_handler_session)(struct uprobe_consumer *self, unsigned > > > long func, > > > + struct pt_regs *regs, unsigned long > > > *data); > > > + > > > > We should try to avoid an alternative set of callbacks, IMO. Let's > > extend existing ones with `unsigned long *data`, > > Oh yes, agreed. > > And the comment about the return value looks confusing too. I mean, the > logic doesn't differ from the ret-code from ->handler(). > > "DO NOT install/execute the return uprobe" is not true if another > non-session-consumer returns 0. well they are meant to be exclusive, so there'd be no other non-session-consumer jirka
Re: [RFC bpf-next 01/10] uprobe: Add session callbacks to uprobe_consumer
On Wed, Jun 5, 2024 at 10:57 AM Oleg Nesterov wrote: > > On 06/05, Andrii Nakryiko wrote: > > > > so any such > > limitations will cause problems, issue reports, investigation, etc. > > Agreed... > > > As one possible solution, what if we do > > > > struct return_instance { > > ... > > u64 session_cookies[]; > > }; > > > > and allocate sizeof(struct return_instance) + 8 * > > and then at runtime pass > > _cookies[i] as data pointer to session-aware callbacks? > > I too thought about this, but I guess it is not that simple. > > Just for example. Suppose we have 2 session-consumers C1 and C2. > What if uprobe_unregister(C1) comes before the probed function > returns? > > We need something like map_cookie_to_consumer(). Fair enough. The easy way to solve this is to have struct uprobe_session_cookie { int consumer_id; u64 cookie; }; And add id to each new consumer when it is added to struct uprobe. Unfortunately, it's impossible to tell when a new consumer was added to the list (as a front item, but maybe we just change it to be appended instead of prepending) vs when the old consumer was removed, so in some cases we'd need to do a linear search. But the good news is that in the common case we wouldn't need to search and the next item in session_cookies[] array would be the one we need. WDYT? It's still fast, and it's simpler than the shadow stack idea, IMO. P.S. Regardless, maybe we should change the order in which we insert consumers to uprobe? Right now uprobe consumer added later will be executed first, which, while not wrong, is counter-intuitive. And also it breaks a nice natural order when we need to match it up with stuff like session_cookies[] as described above. > > > > + /* The handler_session callback return value controls execution of > > > +* the return uprobe and ret_handler_session callback. > > > +* 0 on success > > > +* 1 on failure, DO NOT install/execute the return uprobe > > > +*console warning for anything else > > > +*/ > > > + int (*handler_session)(struct uprobe_consumer *self, struct > > > pt_regs *regs, > > > + unsigned long *data); > > > + int (*ret_handler_session)(struct uprobe_consumer *self, unsigned > > > long func, > > > + struct pt_regs *regs, unsigned long > > > *data); > > > + > > > > We should try to avoid an alternative set of callbacks, IMO. Let's > > extend existing ones with `unsigned long *data`, > > Oh yes, agreed. > > And the comment about the return value looks confusing too. I mean, the > logic doesn't differ from the ret-code from ->handler(). > > "DO NOT install/execute the return uprobe" is not true if another > non-session-consumer returns 0. > > Oleg. >
Re: [RFC bpf-next 01/10] uprobe: Add session callbacks to uprobe_consumer
On Wed, Jun 05, 2024 at 06:36:25PM +0200, Oleg Nesterov wrote: > On 06/05, Oleg Nesterov wrote: > > > > On 06/05, Oleg Nesterov wrote: > > > > > > > +/* > > > > + * Make sure all the uprobe consumers have only one type of entry > > > > + * callback registered (either handler or handler_session) due to > > > > + * different return value actions. > > > > + */ > > > > +static int consumer_check(struct uprobe_consumer *curr, struct > > > > uprobe_consumer *uc) > > > > +{ > > > > + if (!curr) > > > > + return 0; > > > > + if (curr->handler_session || uc->handler_session) > > > > + return -EBUSY; > > > > + return 0; > > > > +} > > > > > > Hmm, I don't understand this code, it doesn't match the comment... > > > > > > The comment says "all the uprobe consumers have only one type" but > > > consumer_check() will always fail if the the 1st or 2nd consumer has > > > ->handler_session != NULL ? > > > > > > Perhaps you meant > > > > > > if (!!curr->handler != !!uc->handler) > > > return -EBUSY; > > > > > > ? > > > > OK, the changelog says > > > > Which means that there can be only single user of a uprobe (inode + > > offset) when session consumer is registered to it. > > > > so the code is correct. But I still think the comment is misleading. > > Cough... perhaps it is correct but I am still confused even we forget about > the comment ;) > > OK, uprobe can have a single consumer with ->handler_session != NULL. I guess > this is because return_instance->data is "global". > > So uprobe can have multiple handler_session == NULL consumers before > handler_session != NULL, but not after ? ah yea it should have done what's in the comment, so it's missing the check for handler.. session handlers are meant to be exclusive thanks, jirka
Re: [PATCH 3/5] ftrace: Remove "filter_hash" parameter from ftrace_hash_rec_disable/enable()
On Wed, 5 Jun 2024 11:17:31 +0100 Mark Rutland wrote: > On Tue, Jun 04, 2024 at 05:28:20PM -0400, Steven Rostedt wrote: > > From: "Steven Rostedt (Google)" > > > > The functions ftrace_hash_rec_disable() and ftrace_hash_rec_enable() > > always has 1 passed to its "ftrace_hash" parameter. Remove the parameter > > and pass in true to __ftrace_hash_rec_update(). > > > > Also add some comments to both those functions explaining what they do. > > > > Signed-off-by: Steven Rostedt (Google) > > Looks good to me. > > Acked-by: Mark Rutland I removed your Ack from v2 as it changed enough that I believe it requires a new Ack. -- Steve
Re: [RFC bpf-next 01/10] uprobe: Add session callbacks to uprobe_consumer
On 06/05, Andrii Nakryiko wrote: > > so any such > limitations will cause problems, issue reports, investigation, etc. Agreed... > As one possible solution, what if we do > > struct return_instance { > ... > u64 session_cookies[]; > }; > > and allocate sizeof(struct return_instance) + 8 * > and then at runtime pass > _cookies[i] as data pointer to session-aware callbacks? I too thought about this, but I guess it is not that simple. Just for example. Suppose we have 2 session-consumers C1 and C2. What if uprobe_unregister(C1) comes before the probed function returns? We need something like map_cookie_to_consumer(). > > + /* The handler_session callback return value controls execution of > > +* the return uprobe and ret_handler_session callback. > > +* 0 on success > > +* 1 on failure, DO NOT install/execute the return uprobe > > +*console warning for anything else > > +*/ > > + int (*handler_session)(struct uprobe_consumer *self, struct pt_regs > > *regs, > > + unsigned long *data); > > + int (*ret_handler_session)(struct uprobe_consumer *self, unsigned > > long func, > > + struct pt_regs *regs, unsigned long > > *data); > > + > > We should try to avoid an alternative set of callbacks, IMO. Let's > extend existing ones with `unsigned long *data`, Oh yes, agreed. And the comment about the return value looks confusing too. I mean, the logic doesn't differ from the ret-code from ->handler(). "DO NOT install/execute the return uprobe" is not true if another non-session-consumer returns 0. Oleg.
Re: [PATCH v4] drivers: remoteproc: xlnx: add attach detach support
On 6/4/24 3:23 PM, Bjorn Andersson wrote: > On Mon, Jun 03, 2024 at 01:34:38PM -0700, Tanmay Shah wrote: >> It is possible that remote processor is already running before >> linux boot or remoteproc platform driver probe. Implement required >> remoteproc framework ops to provide resource table address and >> connect or disconnect with remote processor in such case. >> > > I know if changes the current style for this driver, but could we drop > "drivers: " from the subject prefix, to align changes to this driver > with others? > Ack. Will be fixed. Is it convention not to have "drivers" ? If so, from now on, I will format subject prefix: /: > Regards, > Bjorn