[ANNOUNCE] 5.10.219-rt111
Hello RT-list! I'm pleased to announce the 5.10.219-rt111 stable release. This release is just an update to the new stable 5.10.219 version and no RT changes have been made. You can get this release via the git tree at: git://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-stable-rt.git branch: v5.10-rt Head SHA1: 4a4ea2ea1cc624964d53cf22fa5f92a9f43708bb Or to build 5.10.219-rt111 directly, the following patches should be applied: https://www.kernel.org/pub/linux/kernel/v5.x/linux-5.10.tar.xz https://www.kernel.org/pub/linux/kernel/v5.x/patch-5.10.219.xz https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patch-5.10.219-rt111.patch.xz Signing key fingerprint: 9354 0649 9972 8D31 D464 D140 F394 A423 F8E6 7C26 All keys used for the above files and repositories can be found on the following git repository: git://git.kernel.org/pub/scm/docs/kernel/pgpkeys.git Enjoy! Luis
Re: [PATCH 1/4] remoteproc: k3-r5: Fix IPC-only mode detection
Nishanth, Vignesh, Hari and Andrew - please have a look at this patch. Thanks, Mathieu On Fri, 28 Jun 2024 at 13:53, Mathieu Poirier wrote: > > Good day, > > On Fri, Jun 21, 2024 at 05:00:55PM +0200, Richard Genoud wrote: > > ret variable was used to test reset status, get from > > reset_control_status() call. But this variable was overwritten by > > ti_sci_proc_get_status() a few lines bellow. > > And as ti_sci_proc_get_status() returns 0 or a negative value (in this > > latter case, followed by a return), the expression !ret was always true, > > > > Clearly, this was not what was intended: > > In the comment above it's said that "requires both local and module > > resets to be deasserted"; if reset_control_status() returns 0 it means > > that the reset line is deasserted. > > So, it's pretty clear that the return value of reset_control_status() > > was intended to be used instead of ti_sci_proc_get_status() return > > value. > > > > This could lead in an incorrect IPC-only mode detection if reset line is > > asserted (so reset_control_status() return > 0) and c_state != 0 and > > halted == 0. > > In this case, the old code would have detected an IPC-only mode instead > > of a mismatched mode. > > > > Your assessment seems to be correct. That said I'd like to have an RB or a TB > from someone in the TI delegation - guys please have a look. > > Thanks, > Mathieu > > > Fixes: 1168af40b1ad ("remoteproc: k3-r5: Add support for IPC-only mode for > > all R5Fs") > > Signed-off-by: Richard Genoud > > --- > > drivers/remoteproc/ti_k3_r5_remoteproc.c | 13 +++-- > > 1 file changed, 7 insertions(+), 6 deletions(-) > > > > diff --git a/drivers/remoteproc/ti_k3_r5_remoteproc.c > > b/drivers/remoteproc/ti_k3_r5_remoteproc.c > > index 50e486bcfa10..39a47540c590 100644 > > --- a/drivers/remoteproc/ti_k3_r5_remoteproc.c > > +++ b/drivers/remoteproc/ti_k3_r5_remoteproc.c > > @@ -1144,6 +1144,7 @@ static int k3_r5_rproc_configure_mode(struct > > k3_r5_rproc *kproc) > > u32 atcm_enable, btcm_enable, loczrama; > > struct k3_r5_core *core0; > > enum cluster_mode mode = cluster->mode; > > + int reset_ctrl_status; > > int ret; > > > > core0 = list_first_entry(>cores, struct k3_r5_core, elem); > > @@ -1160,11 +1161,11 @@ static int k3_r5_rproc_configure_mode(struct > > k3_r5_rproc *kproc) > >r_state, c_state); > > } > > > > - ret = reset_control_status(core->reset); > > - if (ret < 0) { > > + reset_ctrl_status = reset_control_status(core->reset); > > + if (reset_ctrl_status < 0) { > > dev_err(cdev, "failed to get initial local reset status, ret > > = %d\n", > > - ret); > > - return ret; > > + reset_ctrl_status); > > + return reset_ctrl_status; > > } > > > > /* > > @@ -1199,7 +1200,7 @@ static int k3_r5_rproc_configure_mode(struct > > k3_r5_rproc *kproc) > >* irrelevant if module reset is asserted (POR value has local reset > >* deasserted), and is deemed as remoteproc mode > >*/ > > - if (c_state && !ret && !halted) { > > + if (c_state && !reset_ctrl_status && !halted) { > > dev_info(cdev, "configured R5F for IPC-only mode\n"); > > kproc->rproc->state = RPROC_DETACHED; > > ret = 1; > > @@ -1217,7 +1218,7 @@ static int k3_r5_rproc_configure_mode(struct > > k3_r5_rproc *kproc) > > ret = 0; > > } else { > > dev_err(cdev, "mismatched mode: local_reset = %s, > > module_reset = %s, core_state = %s\n", > > - !ret ? "deasserted" : "asserted", > > + !reset_ctrl_status ? "deasserted" : "asserted", > > c_state ? "deasserted" : "asserted", > > halted ? "halted" : "unhalted"); > > ret = -EINVAL;
Re: [PATCH 1/4] remoteproc: k3-r5: Fix IPC-only mode detection
Good day, On Fri, Jun 21, 2024 at 05:00:55PM +0200, Richard Genoud wrote: > ret variable was used to test reset status, get from > reset_control_status() call. But this variable was overwritten by > ti_sci_proc_get_status() a few lines bellow. > And as ti_sci_proc_get_status() returns 0 or a negative value (in this > latter case, followed by a return), the expression !ret was always true, > > Clearly, this was not what was intended: > In the comment above it's said that "requires both local and module > resets to be deasserted"; if reset_control_status() returns 0 it means > that the reset line is deasserted. > So, it's pretty clear that the return value of reset_control_status() > was intended to be used instead of ti_sci_proc_get_status() return > value. > > This could lead in an incorrect IPC-only mode detection if reset line is > asserted (so reset_control_status() return > 0) and c_state != 0 and > halted == 0. > In this case, the old code would have detected an IPC-only mode instead > of a mismatched mode. > Your assessment seems to be correct. That said I'd like to have an RB or a TB from someone in the TI delegation - guys please have a look. Thanks, Mathieu > Fixes: 1168af40b1ad ("remoteproc: k3-r5: Add support for IPC-only mode for > all R5Fs") > Signed-off-by: Richard Genoud > --- > drivers/remoteproc/ti_k3_r5_remoteproc.c | 13 +++-- > 1 file changed, 7 insertions(+), 6 deletions(-) > > diff --git a/drivers/remoteproc/ti_k3_r5_remoteproc.c > b/drivers/remoteproc/ti_k3_r5_remoteproc.c > index 50e486bcfa10..39a47540c590 100644 > --- a/drivers/remoteproc/ti_k3_r5_remoteproc.c > +++ b/drivers/remoteproc/ti_k3_r5_remoteproc.c > @@ -1144,6 +1144,7 @@ static int k3_r5_rproc_configure_mode(struct > k3_r5_rproc *kproc) > u32 atcm_enable, btcm_enable, loczrama; > struct k3_r5_core *core0; > enum cluster_mode mode = cluster->mode; > + int reset_ctrl_status; > int ret; > > core0 = list_first_entry(>cores, struct k3_r5_core, elem); > @@ -1160,11 +1161,11 @@ static int k3_r5_rproc_configure_mode(struct > k3_r5_rproc *kproc) >r_state, c_state); > } > > - ret = reset_control_status(core->reset); > - if (ret < 0) { > + reset_ctrl_status = reset_control_status(core->reset); > + if (reset_ctrl_status < 0) { > dev_err(cdev, "failed to get initial local reset status, ret = > %d\n", > - ret); > - return ret; > + reset_ctrl_status); > + return reset_ctrl_status; > } > > /* > @@ -1199,7 +1200,7 @@ static int k3_r5_rproc_configure_mode(struct > k3_r5_rproc *kproc) >* irrelevant if module reset is asserted (POR value has local reset >* deasserted), and is deemed as remoteproc mode >*/ > - if (c_state && !ret && !halted) { > + if (c_state && !reset_ctrl_status && !halted) { > dev_info(cdev, "configured R5F for IPC-only mode\n"); > kproc->rproc->state = RPROC_DETACHED; > ret = 1; > @@ -1217,7 +1218,7 @@ static int k3_r5_rproc_configure_mode(struct > k3_r5_rproc *kproc) > ret = 0; > } else { > dev_err(cdev, "mismatched mode: local_reset = %s, module_reset > = %s, core_state = %s\n", > - !ret ? "deasserted" : "asserted", > + !reset_ctrl_status ? "deasserted" : "asserted", > c_state ? "deasserted" : "asserted", > halted ? "halted" : "unhalted"); > ret = -EINVAL;
Re: [PATCH] kallsyms, livepatch: Fix livepatch with CONFIG_LTO_CLANG
Hi Luis, On Fri, Jun 28, 2024 at 10:36 AM Luis Chamberlain wrote: > > On Fri, Jun 28, 2024 at 02:23:49PM +0200, Miroslav Benes wrote: > > On Fri, 7 Jun 2024, Song Liu wrote: > > > > > Hi Miroslav, > > > > > > Thanks for reviewing the patch! > > > > > > I think it is possible. Currently, kallsyms_on_each_match_symbol matches > > > symbols without the postfix. We can add a variation or a parameter, so > > > that it matches the full name with post fix. > > > > I think it might be better. > > > > Luis, what is your take on this? > > > > If I am not mistaken, there was a patch set to address this. Luis might > > remember more. > > Yeah this is a real issue outside of CONFIG_LTO_CLANG, Rust modules is > another example where instead of symbol names they want to use full > hashes. So, as I hinted to you Sami, can we knock two birds with one stone > here and move CONFIG_LTO_CLANG to use the same strategy as Rust so we > have two users instead of just one? I'm all for finding generic solutions, but perhaps I've missed the patch set Miroslav mentioned because I'm not quite sure how these problems are related. LTO makes duplicate symbol names globally unique by appending a postfix to them, which complicates looking up symbols by name. Rust, on the other hand, has a problem with CONFIG_MODVERSIONS because the long symbol names it generates cannot fit in the small buffer in struct modversion_info. The only reason we proposed storing a cryptographic hash in modversion_info was to avoid breaking userspace tools that parse this data structure, but AFAIK nobody wants to use hashed symbol names anywhere else. In fact, if there's a better solution for addressing modversion_info limitations, I would be happy not to hash anything. Sami
[syzbot] [virt?] [net?] upstream test error: KMSAN: uninit-value in virtnet_poll
Hello, syzbot found the following issue on: HEAD commit:626737a5791b Merge tag 'pinctrl-v6.10-2' of git://git.kern.. git tree: upstream console output: https://syzkaller.appspot.com/x/log.txt?x=1373f72e98 kernel config: https://syzkaller.appspot.com/x/.config?x=12ff58d525e7b8f9 dashboard link: https://syzkaller.appspot.com/bug?extid=35b9a14142dd62084eb9 compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40 userspace arch: i386 Downloadable assets: disk image: https://storage.googleapis.com/syzbot-assets/b5c2e4152e89/disk-626737a5.raw.xz vmlinux: https://storage.googleapis.com/syzbot-assets/4847a4cfa180/vmlinux-626737a5.xz kernel image: https://storage.googleapis.com/syzbot-assets/18f05d5ddcb1/bzImage-626737a5.xz IMPORTANT: if you fix the issue, please add the following tag to the commit: Reported-by: syzbot+35b9a14142dd62084...@syzkaller.appspotmail.com = BUG: KMSAN: uninit-value in receive_mergeable drivers/net/virtio_net.c:1847 [inline] BUG: KMSAN: uninit-value in receive_buf+0x2620/0x6070 drivers/net/virtio_net.c:1973 virtnet_receive drivers/net/virtio_net.c:2277 [inline] virtnet_poll+0xd1c/0x23c0 drivers/net/virtio_net.c:2380 __napi_poll+0xe7/0x980 net/core/dev.c:6722 handle_softirqs+0x1ce/0x800 kernel/softirq.c:554 common_interrupt+0x94/0xa0 arch/x86/kernel/irq.c:278 asm_common_interrupt+0x2b/0x40 arch/x86/include/asm/idtentry.h:693 kmsan_get_metadata+0x189/0x1d0 kmsan_get_shadow_origin_ptr+0x4d/0xb0 mm/kmsan/shadow.c:102 get_shadow_origin_ptr mm/kmsan/instrumentation.c:36 [inline] __msan_metadata_ptr_for_load_8+0x24/0x40 mm/kmsan/instrumentation.c:92 unwind_get_return_address_ptr+0x6a/0x100 arch/x86/kernel/unwind_frame.c:28 update_stack_state+0x206/0x270 arch/x86/kernel/unwind_frame.c:251 unwind_next_frame+0x19a/0x470 arch/x86/kernel/unwind_frame.c:315 arch_stack_walk+0x1ec/0x2d0 arch/x86/kernel/stacktrace.c:25 stack_trace_save+0xaa/0xe0 kernel/stacktrace.c:122 kmsan_save_stack_with_flags mm/kmsan/core.c:74 [inline] kmsan_internal_poison_memory+0x49/0x90 mm/kmsan/core.c:58 kmsan_slab_alloc+0xdf/0x160 mm/kmsan/hooks.c:68 slab_post_alloc_hook mm/slub.c:3947 [inline] slab_alloc_node mm/slub.c:4001 [inline] __do_kmalloc_node mm/slub.c:4121 [inline] __kmalloc_noprof+0x660/0xf30 mm/slub.c:4135 kmalloc_noprof include/linux/slab.h:664 [inline] tomoyo_realpath_from_path+0x104/0xaa0 security/tomoyo/realpath.c:251 tomoyo_get_realpath security/tomoyo/file.c:151 [inline] tomoyo_check_open_permission+0x1ef/0xc50 security/tomoyo/file.c:771 tomoyo_file_open+0x271/0x360 security/tomoyo/tomoyo.c:334 security_file_open+0x9a/0xc60 security/security.c:2962 do_dentry_open+0x5b1/0x22b0 fs/open.c:942 vfs_open+0x49/0x60 fs/open.c:1089 do_open fs/namei.c:3650 [inline] path_openat+0x4ab0/0x5b70 fs/namei.c:3807 do_filp_open+0x20e/0x590 fs/namei.c:3834 do_sys_openat2+0x1bf/0x2f0 fs/open.c:1405 do_sys_open fs/open.c:1420 [inline] __do_sys_openat fs/open.c:1436 [inline] __se_sys_openat fs/open.c:1431 [inline] __x64_sys_openat+0x2a1/0x310 fs/open.c:1431 x64_sys_call+0x128b/0x3b90 arch/x86/include/generated/asm/syscalls_64.h:258 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xcd/0x1e0 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f Uninit was created at: __alloc_pages_noprof+0x9d6/0xe70 mm/page_alloc.c:4701 alloc_pages_mpol_noprof+0x299/0x990 mm/mempolicy.c:2265 alloc_pages_noprof+0x1bf/0x1e0 mm/mempolicy.c:2336 skb_page_frag_refill+0x2bf/0x7c0 net/core/sock.c:2920 virtnet_rq_alloc+0x43/0xbb0 drivers/net/virtio_net.c:882 add_recvbuf_mergeable drivers/net/virtio_net.c:2128 [inline] try_fill_recv+0x3f0/0x2f50 drivers/net/virtio_net.c:2173 virtnet_open+0x1cc/0xb00 drivers/net/virtio_net.c:2452 __dev_open+0x546/0x6f0 net/core/dev.c:1472 __dev_change_flags+0x309/0x9a0 net/core/dev.c:8781 dev_change_flags+0x8e/0x1d0 net/core/dev.c:8853 devinet_ioctl+0x13ec/0x22c0 net/ipv4/devinet.c:1177 inet_ioctl+0x4bd/0x6d0 net/ipv4/af_inet.c:1003 sock_do_ioctl+0xb7/0x540 net/socket.c:1222 sock_ioctl+0x727/0xd70 net/socket.c:1341 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:907 [inline] __se_sys_ioctl+0x261/0x450 fs/ioctl.c:893 __x64_sys_ioctl+0x96/0xe0 fs/ioctl.c:893 x64_sys_call+0x18c0/0x3b90 arch/x86/include/generated/asm/syscalls_64.h:17 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xcd/0x1e0 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f CPU: 0 PID: 4794 Comm: rm Not tainted 6.10.0-rc5-syzkaller-00012-g626737a5791b #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 06/07/2024 = --- This report is generated by a bot. It may contain errors. See https://goo.gl/tpsmEJ for more information about syzbot. syzbot engineers can be reached at syzkal...@googlegroups.com. syzbot will keep track of this issue.
Re: [PATCH] remoteproc: mediatek: Don't attempt to remap l1tcm memory if missing
On Thu, Jun 27, 2024 at 05:20:55PM -0400, Nícolas F. R. A. Prado wrote: > The current code doesn't check whether platform_get_resource_byname() > succeeded to get the l1tcm memory, which is optional, before attempting > to map it. This results in the following error message when it is > missing: > > mtk-scp 1050.scp: error -EINVAL: invalid resource (null) > > Add a check so that the remapping is only attempted if the memory region > exists. This also allows to simplify the logic handling failure to > remap, since a failure then is always a failure. > > Fixes: ca23ecfdbd44 ("remoteproc/mediatek: support L1TCM") > Signed-off-by: Nícolas F. R. A. Prado > --- > drivers/remoteproc/mtk_scp.c | 12 +--- > 1 file changed, 5 insertions(+), 7 deletions(-) > > diff --git a/drivers/remoteproc/mtk_scp.c b/drivers/remoteproc/mtk_scp.c > index b885a9a041e4..b17757900cd7 100644 > --- a/drivers/remoteproc/mtk_scp.c > +++ b/drivers/remoteproc/mtk_scp.c > @@ -1344,14 +1344,12 @@ static int scp_probe(struct platform_device *pdev) > > /* l1tcm is an optional memory region */ > res = platform_get_resource_byname(pdev, IORESOURCE_MEM, "l1tcm"); > - scp_cluster->l1tcm_base = devm_ioremap_resource(dev, res); > - if (IS_ERR(scp_cluster->l1tcm_base)) { > - ret = PTR_ERR(scp_cluster->l1tcm_base); > - if (ret != -EINVAL) > - return dev_err_probe(dev, ret, "Failed to map l1tcm > memory\n"); > + if (res) { > + scp_cluster->l1tcm_base = devm_ioremap_resource(dev, res); > + if (IS_ERR(scp_cluster->l1tcm_base)) > + return dev_err_probe(dev, > PTR_ERR(scp_cluster->l1tcm_base), > + "Failed to map l1tcm memory\n"); > > - scp_cluster->l1tcm_base = NULL; > - } else { Much better - I have applied this patch. Regards, Mathieu > scp_cluster->l1tcm_size = resource_size(res); > scp_cluster->l1tcm_phys = res->start; > } > > --- > base-commit: 0fc4bfab2cd45f9acb86c4f04b5191e114e901ed > change-id: 20240627-scp-invalid-resource-l1tcm-9f7cf45c17e6 > > Best regards, > -- > Nícolas F. R. A. Prado >
Re: [PATCH v7 0/5] initial support for Marvell 88PM886 PMIC
Lee Jones, 2024-06-28T15:41:39+01:00: > On Fri, 31 May 2024 19:34:55 +0200, Karel Balej wrote: > > the following implements basic support for Marvell's 88PM886 PMIC which > > is found for instance as a component of the samsung,coreprimevelte > > smartphone which inspired this and also serves as a testing platform. > > > > The code for the MFD is based primarily on this old series [1] with the > > addition of poweroff based on the smartphone's downstream kernel tree > > [2]. The onkey and regulators drivers are based on the latter. I am not > > in possesion of the datasheet. > > > > [...] > > Applied, thanks! Thank you and thank you and everybody else for all the feedback and reviews, I appreciate it. K. B.
[PATCH] mailmap: Update Luca Weiss's email address
I'm slowly migrating my mail to a new domain, add an entry to map the mail address. Just for clarity, my work-related @fairphone.com email stays unchanged. Signed-off-by: Luca Weiss --- Since my email address also appears in a bunch of drivers and arm(64) files, and two devicetree binding files, how are those normally handled? Just ignore them and let mailmap handle everything relevant? --- .mailmap | 1 + 1 file changed, 1 insertion(+) diff --git a/.mailmap b/.mailmap index a6c619e22efc..e169a99ce7c7 100644 --- a/.mailmap +++ b/.mailmap @@ -385,6 +385,7 @@ Li Yang Lior David Lorenzo Pieralisi Luca Ceresoli +Luca Weiss Lukasz Luba Luo Jie Maciej W. Rozycki --- base-commit: 642a16ca7994a50d7de85715996a8ce171a5bdfb change-id: 20240628-mailmap-3528f7365abb Best regards, -- Luca Weiss
[PATCH] soc: qcom: smsm: Add missing mailbox dependency to Kconfig
Since the smsm driver got the ability to interact with the mailbox using the mailbox subsystem and not just syscon, we need to add the dependency to kconfig as well to avoid compile errors. Fixes: 75287992f58a ("soc: qcom: smsm: Support using mailbox interface") Reported-by: kernel test robot Closes: https://lore.kernel.org/oe-kbuild-all/202406180006.z397c67h-...@intel.com/ Signed-off-by: Luca Weiss --- drivers/soc/qcom/Kconfig | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/soc/qcom/Kconfig b/drivers/soc/qcom/Kconfig index 5af33b0e3470..60efecd16380 100644 --- a/drivers/soc/qcom/Kconfig +++ b/drivers/soc/qcom/Kconfig @@ -194,6 +194,7 @@ config QCOM_SMP2P config QCOM_SMSM tristate "Qualcomm Shared Memory State Machine" + depends on MAILBOX depends on QCOM_SMEM select QCOM_SMEM_STATE select IRQ_DOMAIN --- base-commit: 642a16ca7994a50d7de85715996a8ce171a5bdfb change-id: 20240628-smsm-kconfig-6a01783472f0 Best regards, -- Luca Weiss
Re: [PATCH V3 2/2] soc: qcom: smp2p: Introduce tracepoint support
Hi Sudeepgoud, kernel test robot noticed the following build errors: [auto build test ERROR on linus/master] [also build test ERROR on v6.10-rc5 next-20240627] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information] url: https://github.com/intel-lab-lkp/linux/commits/Sudeepgoud-Patil/soc-qcom-smp2p-Use-devname-for-interrupt-descriptions/20240628-061654 base: linus/master patch link: https://lore.kernel.org/r/20240627104831.4176799-3-quic_sudeepgo%40quicinc.com patch subject: [PATCH V3 2/2] soc: qcom: smp2p: Introduce tracepoint support config: arc-allmodconfig (https://download.01.org/0day-ci/archive/20240629/202406290037.kajgvuwb-...@intel.com/config) compiler: arceb-elf-gcc (GCC) 13.2.0 reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240629/202406290037.kajgvuwb-...@intel.com/reproduce) If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot | Closes: https://lore.kernel.org/oe-kbuild-all/202406290037.kajgvuwb-...@intel.com/ All errors (new ones prefixed by >>): In file included from include/trace/trace_events.h:419, from include/trace/define_trace.h:102, from drivers/soc/qcom/trace-smp2p.h:98, from drivers/soc/qcom/smp2p.c:165: >> drivers/soc/qcom/./trace-smp2p.h:25:1: error: macro "__assign_str" passed 2 >> arguments, but takes just 1 25 | ); | ^~ In file included from include/trace/trace_events.h:375: include/trace/stages/stage6_event_callback.h:34: note: macro "__assign_str" defined here 34 | #define __assign_str(dst) \ | drivers/soc/qcom/./trace-smp2p.h: In function 'trace_event_raw_event_smp2p_ssr_ack': >> drivers/soc/qcom/./trace-smp2p.h:22:17: error: '__assign_str' undeclared >> (first use in this function) 22 | __assign_str(dev_name, dev_name(dev)); | ^~~~ include/trace/trace_events.h:402:11: note: in definition of macro 'DECLARE_EVENT_CLASS' 402 | { assign; } \ | ^~ include/trace/trace_events.h:44:30: note: in expansion of macro 'PARAMS' 44 | PARAMS(assign), \ | ^~ drivers/soc/qcom/./trace-smp2p.h:15:1: note: in expansion of macro 'TRACE_EVENT' 15 | TRACE_EVENT(smp2p_ssr_ack, | ^~~ drivers/soc/qcom/./trace-smp2p.h:21:9: note: in expansion of macro 'TP_fast_assign' 21 | TP_fast_assign( | ^~ drivers/soc/qcom/./trace-smp2p.h:22:17: note: each undeclared identifier is reported only once for each function it appears in 22 | __assign_str(dev_name, dev_name(dev)); | ^~~~ include/trace/trace_events.h:402:11: note: in definition of macro 'DECLARE_EVENT_CLASS' 402 | { assign; } \ | ^~ include/trace/trace_events.h:44:30: note: in expansion of macro 'PARAMS' 44 | PARAMS(assign), \ | ^~ drivers/soc/qcom/./trace-smp2p.h:15:1: note: in expansion of macro 'TRACE_EVENT' 15 | TRACE_EVENT(smp2p_ssr_ack, | ^~~ drivers/soc/qcom/./trace-smp2p.h:21:9: note: in expansion of macro 'TP_fast_assign' 21 | TP_fast_assign( | ^~ drivers/soc/qcom/./trace-smp2p.h: At top level: drivers/soc/qcom/./trace-smp2p.h:42:1: error: macro "__assign_str" passed 2 arguments, but takes just 1 42 | ); | ^~ include/trace/stages/stage6_event_callback.h:34: note: macro "__assign_str" defined here 34 | #define __assign_str(dst) \ | drivers/soc/qcom/./trace-smp2p.h: In function 'trace_event_raw_event_smp2p_negotiate': drivers/soc/qcom/./trace-smp2p.h:35:17: error: '__assign_str' undeclared (first use in this function) 35 | __assign_str(dev_name, dev_name(dev)); | ^~~~ include/trace/trace_events.h:402:11: note: in definition of macro 'DECLARE_EVENT_CLASS' 402 | { assign; } \ | ^~ include/trace/trace_events.h:44:30: note: in expansion of macro 'PARAMS' 44 | PARAMS(assign), \ | ^~
Re: [PATCH] kallsyms, livepatch: Fix livepatch with CONFIG_LTO_CLANG
On Fri, Jun 28, 2024 at 02:23:49PM +0200, Miroslav Benes wrote: > On Fri, 7 Jun 2024, Song Liu wrote: > > > Hi Miroslav, > > > > Thanks for reviewing the patch! > > > > On Fri, Jun 7, 2024 at 6:06 AM Miroslav Benes wrote: > > > > > > Hi, > > > > > > On Tue, 4 Jun 2024, Song Liu wrote: > > > > > > > With CONFIG_LTO_CLANG, the compiler may postfix symbols with > > > > .llvm. > > > > to avoid symbol duplication. scripts/kallsyms.c sorted the symbols > > > > without these postfixes. The default symbol lookup also removes these > > > > postfixes before comparing symbols. > > > > > > > > On the other hand, livepatch need to look up symbols with the full > > > > names. > > > > However, calling kallsyms_on_each_match_symbol with full name (with the > > > > postfix) cannot find the symbol(s). As a result, we cannot livepatch > > > > kernel functions with .llvm. postfix or kernel functions that use > > > > relocation information to symbols with .llvm. postfixes. > > > > > > > > Fix this by calling kallsyms_on_each_match_symbol without the postfix; > > > > and then match the full name (with postfix) in klp_match_callback. > > > > > > > > Signed-off-by: Song Liu > > > > --- > > > > include/linux/kallsyms.h | 13 + > > > > kernel/kallsyms.c| 21 - > > > > kernel/livepatch/core.c | 32 +++- > > > > 3 files changed, 60 insertions(+), 6 deletions(-) > > > > > > I do not like much that something which seems to be kallsyms-internal is > > > leaked out. You need to export cleanup_symbol_name() and there is now a > > > lot of code outside. I would feel much more comfortable if it is all > > > hidden from kallsyms users and kept there. Would it be possible? > > > > I think it is possible. Currently, kallsyms_on_each_match_symbol matches > > symbols without the postfix. We can add a variation or a parameter, so > > that it matches the full name with post fix. > > I think it might be better. > > Luis, what is your take on this? > > If I am not mistaken, there was a patch set to address this. Luis might > remember more. Yeah this is a real issue outside of CONFIG_LTO_CLANG, Rust modules is another example where instead of symbol names they want to use full hashes. So, as I hinted to you Sami, can we knock two birds with one stone here and move CONFIG_LTO_CLANG to use the same strategy as Rust so we have two users instead of just one? Then we resolve this. In fact what I suggested was even to allow even non-Rust, and in this case even with gcc to enable this world. This gives much more wider scope of testing / review / impact of these sorts of changes and world view and it would resolve the Rust case, the live patch CONFIG_LTO_CLANG world too. Thoughts? Luis
Re: [PATCH 6.10.0-rc2] kernel/module: avoid panic on loading broken module
On Fri, Jun 21, 2024 at 04:05:27PM +0200, Daniel von Kirschten wrote: > Am 18.06.2024 um 21:58 schrieb Luis Chamberlain: > > On Thu, Jun 06, 2024 at 03:31:49PM +0200, Daniel v. Kirschten wrote: > > > If a module is being loaded, and the .gnu.linkonce.this_module section > > > in the module's ELF file does not have the WRITE flag, the kernel will > > > map the finished module struct of that module as read-only. > > > This causes a kernel panic when the struct is written to the first time > > > after it has been marked read-only. Currently this happens in > > > complete_formation in kernel/module/main.c:2765 when the module's state is > > > set to MODULE_STATE_COMING, just after setting up the memory protections. > > > > How did you find this issue? > > In a university course I got the assignment to manually craft a loadable .ko > file, given only a regular object file, without using Kbuild. During testing > my module files, most of them were simply (correctly) rejected by the kernel > with an appropriate error message, but at some point I ran into this exact > kernel panic, and investigated it to understand why my module file was > invalid. OK, then the commit log should describe that this doesn't fix any known real world issue, but rather a custom crafted module without the regular module build system. > > > Down the line, this seems to lead to unpredictable freezes when trying to > > > load other modules - I guess this is due to some structures not being > > > cleaned up properly, but I didn't investigate this further. > > > > > > A check already exists which verifies that .gnu.linkonce.this_module > > > is ALLOC. This patch simply adds an analogous check for WRITE. > > > > Can you check to ensure our modules generated have a respective check to > > ensure this check exists at build time? That would proactively inform > > userspace when a built module is not built correctly, and the tool > > responsible can be identified. > > See above - I don't think it's possible to create such a broken module file > with any of "official" tools. That should be clearly stated on the commit log. > I haven't looked too deeply into how Kbuild > actually builds modules, but as far as I know, the user doesn't even come > into contact with this_module w Consider that a next level university assignment and is more useful to the world than this debug message. Because above you suggest "I don't think", go out and now be sure. > hen using the regular toolchain, because > Kbuild is responsible for creating the .this_module section. And Kbuild of > course creates it with the correct flags. So if I understand correctly, ... > this > problem can only occur when the module was built by some external tooling > (or manually, in my case). Who would create custom modules without the Linux kernel module build system, and what uses does that provide? It seems you are proving why this would be terribly silly thing to do. Now, the *value* your change has is it can prevent a crash in case of a corrupted module, which *can* occur, consider an odd filesystem live corruption, at least this would be caught at module load attempt and not crash. That's worth committing for this reason but your commit log really needs much more clarity. Why? Because stupid bots want to assign stupid CVEs for anything that seems like a security issue and this could escalate to such type of things. Providing clarity helps system integrators decide if they want to backport this sort of patch. Providing clarify on the chances of this happening and how we think it can happen helps a lot. If you want to be more proactive, try to enhance userspace kmod modprobe so that this is also verified. Luis
Re: [PATCH v3] module: Add log info for verifying module signature
On Fri, Jun 28, 2024 at 10:39:23AM +, Yusong Gao wrote: > Add log information in kernel-space when loading module failures. > Try to load the unsigned module and the module with bad signature > when set 1 to /sys/module/module/parameters/sig_enforce. > > Unsigned module case: > (linux) insmod unsigned.ko > [ 18.714661] Loading of unsigned module is rejected > insmod: can't insert 'unsigned.ko': Key was rejected by service > (linux) > > Bad signature module case: > (linux) insmod bad_signature.ko > insmod: can't insert 'bad_signature.ko': Key was rejected by service > (linux) > > There have different logging behavior the bad signature case only log > in user-space, add log info for fatal errors in module_sig_check(). > > Signed-off-by: Yusong Gao > --- > V3: Clarify the message type and the error code meaning. > V2: Change print level from notice to debug. > --- > kernel/module/signing.c | 27 +++ > 1 file changed, 27 insertions(+) > > diff --git a/kernel/module/signing.c b/kernel/module/signing.c > index a2ff4242e623..826cdab8e3e4 100644 > --- a/kernel/module/signing.c > +++ b/kernel/module/signing.c > @@ -67,6 +67,31 @@ int mod_verify_sig(const void *mod, struct load_info *info) > NULL, NULL); > } > > +static const char *mod_decode_error(int errno) > +{ > + char *errstr = "Unrecognized error"; This is not safe. You can just extend the existing debug switch for strict module loading and re-use the variable there and use that, for example diff --git a/kernel/module/signing.c b/kernel/module/signing.c index a2ff4242e623..9111822116e6 100644 --- a/kernel/module/signing.c +++ b/kernel/module/signing.c @@ -106,6 +106,9 @@ int module_sig_check(struct load_info *info, int flags) case -ENOKEY: reason = "module with unavailable key"; break; + case -EKEYREJECTED: + reason = "Key was rejected by service"; + break; default: /* @@ -113,6 +116,7 @@ int module_sig_check(struct load_info *info, int flags) * unparseable signatures, and signature check failures -- * even if signatures aren't required. */ + pr_debug("Verifying module signature failed: %s\n", reason); return err; } Also certs/system_keyring.c already has a lot of pr_devel stuff too, so do we really need this? Luis
Re: [RFC PATCH v2] ptp: Add vDSO-style vmclock support
On 28.06.24 14:15, David Woodhouse wrote: > On Fri, 2024-06-28 at 13:33 +0200, Peter Hilber wrote: >> On 27.06.24 16:52, David Woodhouse wrote: >>> I already added a flags field, so this might look something like: >>> >>> /* >>> * Smearing flags. The UTC clock exposed through this structure >>> * is only ever true UTC, but a guest operating system may >>> * choose to offer a monotonic smeared clock to its users. This >>> * merely offers a hint about what kind of smearing to perform, >>> * for consistency with systems in the nearby environment. >>> */ >>> #define VMCLOCK_FLAGS_SMEAR_UTC_SLS (1<<5) /* draft-kuhn-leapsecond-00.txt >>> */ >>> >>> (UTC-SLS is probably a bad example but are there formal definitions for >>> anything else?) >> >> I think it could also be more generic, like flags for linear smearing, >> cosine smearing(?), and smear_start_sec and smear_end_sec fields (relative >> to the leap second start). That could also represent UTC-SLS, and >> noon-to-noon, and it would be well-defined. >> >> This should reduce the likelihood that the guest doesn't know the smearing >> variant. > > I'm wary of making it too generic. That would seem to encourage a > *proliferation* of false "UTC-like" clocks. > > It's bad enough that we do smearing at all, let alone that we don't > have a single definition of how to do it. > > I made the smearing hint a full uint8_t instead of using bits in flags, > in the end. That gives us a full 255 ways of lying to users about what > the time is, so we're unlikely to run out. And it's easy enough to add > a new VMCLOCK_SMEARING_XXX type to the 'registry' for any new methods > that get invented. > > My concern is that the registry update may come after a driver has already been implemented, so that it may be hard to ensure that the smearing which has been chosen is actually implemented. > + /* > + * This field changes to another non-repeating value when the CPU > + * counter is disrupted, for example on live migration. > + */ > + uint64_t disruption_marker; The field could also change when the clock is stepped (leap seconds excepted), or when the clock frequency is slewed. >>> >>> I'm not sure. The concept of the disruption marker is that it tells the >>> guest to throw away any calibration of the counter that the guest has >>> done for *itself* (with NTP, other PTP devices, etc.). >>> >>> One mode for this device would be not to populate the clock fields at >>> all, but *only* to signal disruption when it occurs. So the guest can >>> abort transactions until it's resynced its clocks (to avoid incurring >>> fines if breaking databases, etc.). >>> >>> Exposing the host timekeeping through the structure means that the >>> migrated guest can keep working because it can trust the timekeeping >>> performed by the (new) host and exposed to it. >>> >>> If the counter is actually varying in frequency over time, and the host >>> is slewing the clock frequency that it reports, that *isn't* a step >>> change and doesn't mean that the guest should throw away any >>> calibration that it's been doing for itself. One hopes that the guest >>> would have detected the *same* frequency change, and be adapting for >>> itself. So I don't think that should indicate a disruption. >>> >>> I think the same is even true if the clock is stepped by the host. The >>> actual *counter* hasn't changed, so the guest is better off ignoring >>> the vacillating host and continuing to derive its idea of time from the >>> hardware counter itself, as calibrated against some external NTP/PTP >>> sources. Surely we actively *don't* to tell the guest to throw its own >>> calibrations away, in this case? >> >> In case the guest is also considering other time sources, it might indeed >> not be a good idea to mix host clock changes into the hardware counter >> disruption marker. >> >> But if the vmclock is the authoritative source of time, it can still be >> helpful to know about such changes, maybe through another marker. > > Could that be the existing seq_count field? > > Skewing the counter_period_frac_sec as the underlying oscillator speeds > up and slows down is perfectly normal and expected, and we already > expect the seq_count to change when that happens. > > Maybe step changes are different, but arguably if the time advertised > by the host steps *outside* the error bounds previously advertised, > that's just broken? But the error bounds could be large or missing. I am trying to address use cases where the host steps or slews the clock as well. > > Depending on how the clock information is fed, a change in seq_count > may even result in non-monotonicity. If the underlying oscillator has > sped up and the structure is updated accordingly, the time calculated > the moment *before* that update may appear later than the time > calculated immediately after it. > > It's up
Re: [PATCH 06/12] uprobes: add batch uprobe register/unregister APIs
On Thu, Jun 27, 2024 at 11:28 PM Masami Hiramatsu wrote: > > On Thu, 27 Jun 2024 09:47:10 -0700 > Andrii Nakryiko wrote: > > > On Thu, Jun 27, 2024 at 6:04 AM Masami Hiramatsu > > wrote: > > > > > > On Mon, 24 Jun 2024 17:21:38 -0700 > > > Andrii Nakryiko wrote: > > > > > > > -static int __uprobe_register(struct inode *inode, loff_t offset, > > > > - loff_t ref_ctr_offset, struct > > > > uprobe_consumer *uc) > > > > +int uprobe_register_batch(struct inode *inode, int cnt, > > > > + uprobe_consumer_fn get_uprobe_consumer, void > > > > *ctx) > > > > > > Is this interface just for avoiding memory allocation? Can't we just > > > allocate a temporary array of *uprobe_consumer instead? > > > > Yes, exactly, to avoid the need for allocating another array that > > would just contain pointers to uprobe_consumer. Consumers would never > > just have an array of `struct uprobe_consumer *`, because > > uprobe_consumer struct is embedded in some other struct, so the array > > interface isn't the most convenient. > > OK, I understand it. > > > > > If you feel strongly, I can do an array, but this necessitates > > allocating an extra array *and keeping it* for the entire duration of > > BPF multi-uprobe link (attachment) existence, so it feels like a > > waste. This is because we don't want to do anything that can fail in > > the detachment logic (so no temporary array allocation there). > > No need to change it, that sounds reasonable. > Great, thanks. > > > > Anyways, let me know how you feel about keeping this callback. > > IMHO, maybe the interface function is better to change to > `uprobe_consumer *next_uprobe_consumer(void **data)`. If caller > side uses a linked list of structure, index access will need to > follow the list every time. This would be problematic. Note how we call get_uprobe_consumer(i, ctx) with i going from 0 to N in multiple independent loops. So if we are only allowed to ask for the next consumer, then uprobe_register_batch and uprobe_unregister_batch would need to build its own internal index and remember ith instance. Which again means more allocations and possibly failing uprobe_unregister_batch(), which isn't great. For now this API works well, I propose to keep it as is. For linked list case consumers would need to allocate one extra array or pay the price of O(N) search (which might be ok, depending on how many uprobes are being attached). But we don't have such consumers right now, thankfully. > > Thank you, > > > > > > > > > > Thank you, > > > > > > -- > > > Masami Hiramatsu (Google) > > > -- > Masami Hiramatsu (Google)
Re: [PATCH v2 0/2] ARM: dts: qcom-msm8226-samsung-ms013g: Add initial device tree
On Thu, 27 Jun 2024 19:30:30 +, Raymond Hackley wrote: > Samsung Galaxy Grand 2 is a phone based on MSM8226. It's similar to the > other Samsung devices based on MSM8226 with only a few minor differences. > > The device trees contain initial support with: > - GPIO keys > - Regulator haptic > - SDHCI (internal and external storage) > - UART (on USB connector via the TI TSU6721 MUIC) > - Regulators > - Touchscreen > - Accelerometer > > --- > v2: Adjust l3, l15, l22 and l27 regulator voltages. Sort nodes. > Set regulator-allow-set-load for vqmmc supplies. > > > > My bot found new DTB warnings on the .dts files added or changed in this series. Some warnings may be from an existing SoC .dtsi. Or perhaps the warnings are fixed by another series. Ultimately, it is up to the platform maintainer whether these warnings are acceptable or not. No need to reply unless the platform maintainer has comments. If you already ran DT checks and didn't see these error(s), then make sure dt-schema is up to date: pip3 install dtschema --upgrade New warnings running 'make CHECK_DTBS=y qcom/qcom-msm8226-samsung-ms013g.dtb' for 20240627193013.1800-1-raymondhack...@protonmail.com: arch/arm/boot/dts/qcom/qcom-msm8226-samsung-ms013g.dtb: syscon@f9011000: compatible: 'anyOf' conditional failed, one must be fixed: ['syscon'] is too short 'syscon' is not one of ['al,alpine-sysfabric-service', 'allwinner,sun8i-a83t-system-controller', 'allwinner,sun8i-h3-system-controller', 'allwinner,sun8i-v3s-system-controller', 'allwinner,sun50i-a64-system-controller', 'altr,l3regs', 'altr,sdr-ctl', 'amd,pensando-elba-syscon', 'amlogic,meson-mx-assist', 'amlogic,meson-mx-bootrom', 'amlogic,meson8-analog-top', 'amlogic,meson8b-analog-top', 'amlogic,meson8-pmu', 'amlogic,meson8b-pmu', 'apm,xgene-csw', 'apm,xgene-efuse', 'apm,xgene-mcb', 'apm,xgene-rb', 'apm,xgene-scu', 'atmel,sama5d2-sfrbu', 'atmel,sama5d3-nfc-io', 'atmel,sama5d3-sfrbu', 'atmel,sama5d4-sfrbu', 'axis,artpec6-syscon', 'brcm,cru-clkset', 'brcm,sr-cdru', 'brcm,sr-mhb', 'cirrus,ep7209-syscon1', 'cirrus,ep7209-syscon2', 'cirrus,ep7209-syscon3', 'cnxt,cx92755-uc', 'freecom,fsg-cs2-system-controller', 'fsl,imx93-aonmix-ns-syscfg', 'fsl,imx93-wakeupmix-syscfg', 'fsl,ls1088a-reset', 'fsl,vf610-anatop', 'fsl,vf610-mscm-cpucfg', 'hisilicon,dsa-subctrl', 'hisilicon,hi6220-sramctr l', 'hisilicon,hip04-ppe', 'hisilicon,pcie-sas-subctrl', 'hisilicon,peri-subctrl', 'hpe,gxp-sysreg', 'intel,lgm-syscon', 'loongson,ls1b-syscon', 'loongson,ls1c-syscon', 'lsi,axxia-syscon', 'marvell,armada-3700-cpu-misc', 'marvell,armada-3700-nb-pm', 'marvell,armada-3700-avs', 'marvell,armada-3700-usb2-host-misc', 'marvell,dove-global-config', 'mediatek,mt2701-pctl-a-syscfg', 'mediatek,mt2712-pctl-a-syscfg', 'mediatek,mt6397-pctl-pmic-syscfg', 'mediatek,mt8135-pctl-a-syscfg', 'mediatek,mt8135-pctl-b-syscfg', 'mediatek,mt8173-pctl-a-syscfg', 'mediatek,mt8365-syscfg', 'microchip,lan966x-cpu-syscon', 'microchip,sam9x60-sfr', 'microchip,sama7g5-ddr3phy', 'microchip,sparx5-cpu-syscon', 'mscc,ocelot-cpu-syscon', 'mstar,msc313-pmsleep', 'nuvoton,ma35d1-sys', 'nuvoton,wpcm450-shm', 'rockchip,px30-qos', 'rockchip,rk3036-qos', 'rockchip,rk3066-qos', 'rockchip,rk3128-qos', 'rockchip,rk3228-qos', 'rockchip,rk3288-qos', 'rockchip,rk3368-qos', 'rockchip,rk3399-qos', 'rockchip,rk3568-qos', 'rockchi p,rk3588-qos', 'rockchip,rv1126-qos', 'st,spear1340-misc', 'stericsson,nomadik-pmu', 'starfive,jh7100-sysmain', 'ti,am62-opp-efuse-table', 'ti,am62-usb-phy-ctrl', 'ti,am625-dss-oldi-io-ctrl', 'ti,am62p-cpsw-mac-efuse', 'ti,am654-dss-oldi-io-ctrl', 'ti,am654-serdes-ctrl', 'ti,j784s4-pcie-ctrl', 'ti,keystone-pllctrl'] from schema $id: http://devicetree.org/schemas/mfd/syscon.yaml#
Re: [PATCH v3 2/2] rust: add tracepoint support
On Wed, Jun 26, 2024 at 8:43 PM Steven Rostedt wrote: > > On Wed, 26 Jun 2024 10:48:23 +0200 > Alice Ryhl wrote: > > > > > > > Because your hooks/rust_binder.h and events/rust_binder.h use the same > > > TRACE_SYSTEM name? Could you try something like: > > > > > > #define TRACE_SYSTEM rust_binder_hook > > > > > > in your hooks/rust_binder.h? > > > > I was able to get it to work by moving the includes into two different > > .c files. I don't think changing TRACE_SYSTEM works because it must > > match the filename. > > Try to use: > > #define TRACE_SYSTEM_VAR rust_binder_hook_other_name > > in one. Then that is used as the variable for that file. Thanks. I also made a change to restore the value of DEFINE_RUST_DO_TRACE after define_trace.h Alice
Re: [PATCH v7 0/5] initial support for Marvell 88PM886 PMIC
On Fri, 28 Jun 2024, Lee Jones wrote: > On Fri, 31 May 2024 19:34:55 +0200, Karel Balej wrote: > > the following implements basic support for Marvell's 88PM886 PMIC which > > is found for instance as a component of the samsung,coreprimevelte > > smartphone which inspired this and also serves as a testing platform. > > > > The code for the MFD is based primarily on this old series [1] with the > > addition of poweroff based on the smartphone's downstream kernel tree > > [2]. The onkey and regulators drivers are based on the latter. I am not > > in possesion of the datasheet. > > > > [...] > > Applied, thanks! > > [1/5] dt-bindings: mfd: add entry for Marvell 88PM886 PMIC > commit: c4725350a9f76fbec45cbbfffb20be2e574eb6ef > [2/5] mfd: add driver for Marvell 88PM886 PMIC > commit: 860f8e3beac0b800bbe20f23c5f3440b1c470b8f > [3/5] regulator: add regulators driver for Marvell 88PM886 PMIC > commit: 5d1a5144396e9570efea02d467df0a68fd28db6f > [4/5] input: add onkey driver for Marvell 88PM886 PMIC > commit: 914089db309ccc590314b6c21df5a1f812e9ab0b > [5/5] MAINTAINERS: add myself for Marvell 88PM886 PMIC > commit: f53d3efa366b1754f0389944401bb53397d22468 Submitted for build testing. If all is good, I'll send out a PR for the other maintainers soon. Note to self: ib-mfd-input-regulator-6.11 -- Lee Jones [李琼斯]
Re: [PATCH v7 0/5] initial support for Marvell 88PM886 PMIC
On Fri, 31 May 2024 19:34:55 +0200, Karel Balej wrote: > the following implements basic support for Marvell's 88PM886 PMIC which > is found for instance as a component of the samsung,coreprimevelte > smartphone which inspired this and also serves as a testing platform. > > The code for the MFD is based primarily on this old series [1] with the > addition of poweroff based on the smartphone's downstream kernel tree > [2]. The onkey and regulators drivers are based on the latter. I am not > in possesion of the datasheet. > > [...] Applied, thanks! [1/5] dt-bindings: mfd: add entry for Marvell 88PM886 PMIC commit: c4725350a9f76fbec45cbbfffb20be2e574eb6ef [2/5] mfd: add driver for Marvell 88PM886 PMIC commit: 860f8e3beac0b800bbe20f23c5f3440b1c470b8f [3/5] regulator: add regulators driver for Marvell 88PM886 PMIC commit: 5d1a5144396e9570efea02d467df0a68fd28db6f [4/5] input: add onkey driver for Marvell 88PM886 PMIC commit: 914089db309ccc590314b6c21df5a1f812e9ab0b [5/5] MAINTAINERS: add myself for Marvell 88PM886 PMIC commit: f53d3efa366b1754f0389944401bb53397d22468 -- Lee Jones [李琼斯]
Re: [PATCH net-next v3 1/3] vsock: add support for SIOCOUTQ ioctl for all vsock socket types.
Hi Luigi, kernel test robot noticed the following build warnings: [auto build test WARNING on 50b70845fc5c22cf7e7d25b57d57b3dca1725aa5] url: https://github.com/intel-lab-lkp/linux/commits/Luigi-Leonardi-via-B4-Relay/vsock-add-support-for-SIOCOUTQ-ioctl-for-all-vsock-socket-types/20240627-023902 base: 50b70845fc5c22cf7e7d25b57d57b3dca1725aa5 patch link: https://lore.kernel.org/r/20240626-ioctl_next-v3-1-63be5bf19a40%40outlook.com patch subject: [PATCH net-next v3 1/3] vsock: add support for SIOCOUTQ ioctl for all vsock socket types. config: i386-randconfig-141-20240628 (https://download.01.org/0day-ci/archive/20240628/202406282144.dxr5kwiu-...@intel.com/config) compiler: gcc-8 (Ubuntu 8.4.0-3ubuntu2) 8.4.0 If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot | Closes: https://lore.kernel.org/oe-kbuild-all/202406282144.dxr5kwiu-...@intel.com/ smatch warnings: net/vmw_vsock/af_vsock.c:1321 vsock_do_ioctl() warn: unsigned 'n_bytes' is never less than zero. vim +/n_bytes +1321 net/vmw_vsock/af_vsock.c 1295 1296 static int vsock_do_ioctl(struct socket *sock, unsigned int cmd, 1297int __user *arg) 1298 { 1299 struct sock *sk = sock->sk; 1300 struct vsock_sock *vsk; 1301 int retval; 1302 1303 vsk = vsock_sk(sk); 1304 1305 switch (cmd) { 1306 case SIOCOUTQ: { 1307 size_t n_bytes; 1308 1309 if (!vsk->transport || !vsk->transport->unsent_bytes) { 1310 retval = -EOPNOTSUPP; 1311 break; 1312 } 1313 1314 if (vsk->transport->unsent_bytes) { 1315 if (sock_type_connectible(sk->sk_type) && sk->sk_state == TCP_LISTEN) { 1316 retval = -EINVAL; 1317 break; 1318 } 1319 1320 n_bytes = vsk->transport->unsent_bytes(vsk); > 1321 if (n_bytes < 0) { 1322 retval = n_bytes; 1323 break; 1324 } 1325 1326 retval = put_user(n_bytes, arg); 1327 } 1328 break; 1329 } 1330 default: 1331 retval = -ENOIOCTLCMD; 1332 } 1333 1334 return retval; 1335 } 1336 -- 0-DAY CI Kernel Test Service https://github.com/intel/lkp-tests/wiki
[PATCH v4 2/2] rust: add tracepoint support
Make it possible to have Rust code call into tracepoints defined by C code. It is still required that the tracepoint is declared in a C header, and that this header is included in the input to bindgen. Signed-off-by: Alice Ryhl --- include/linux/tracepoint.h | 18 +++- include/trace/define_trace.h| 12 +++ rust/bindings/bindings_helper.h | 1 + rust/kernel/lib.rs | 1 + rust/kernel/tracepoint.rs | 47 + 5 files changed, 78 insertions(+), 1 deletion(-) diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h index 689b6d71590e..d82af4d77c9f 100644 --- a/include/linux/tracepoint.h +++ b/include/linux/tracepoint.h @@ -238,6 +238,20 @@ static inline struct tracepoint *tracepoint_ptr_deref(tracepoint_ptr_t *p) #define __DECLARE_TRACE_RCU(name, proto, args, cond) #endif +/* + * Declare an exported function that Rust code can call to trigger this + * tracepoint. This function does not include the static branch; that is done + * in Rust to avoid a function call when the tracepoint is disabled. + */ +#define DEFINE_RUST_DO_TRACE(name, proto, args) +#define DEFINE_RUST_DO_TRACE_REAL(name, proto, args) \ + notrace void rust_do_trace_##name(proto)\ + { \ + __DO_TRACE(name,\ + TP_ARGS(args), \ + cpu_online(raw_smp_processor_id()), 0); \ + } + /* * Make sure the alignment of the structure in the __tracepoints section will * not add unwanted padding between the beginning of the section and the @@ -253,6 +267,7 @@ static inline struct tracepoint *tracepoint_ptr_deref(tracepoint_ptr_t *p) extern int __traceiter_##name(data_proto); \ DECLARE_STATIC_CALL(tp_func_##name, __traceiter_##name);\ extern struct tracepoint __tracepoint_##name; \ + extern void rust_do_trace_##name(proto);\ static inline void trace_##name(proto) \ { \ if (static_key_false(&__tracepoint_##name.key)) \ @@ -337,7 +352,8 @@ static inline struct tracepoint *tracepoint_ptr_deref(tracepoint_ptr_t *p) void __probestub_##_name(void *__data, proto) \ { \ } \ - DEFINE_STATIC_CALL(tp_func_##_name, __traceiter_##_name); + DEFINE_STATIC_CALL(tp_func_##_name, __traceiter_##_name); \ + DEFINE_RUST_DO_TRACE(_name, TP_PROTO(proto), TP_ARGS(args)) #define DEFINE_TRACE(name, proto, args)\ DEFINE_TRACE_FN(name, NULL, NULL, PARAMS(proto), PARAMS(args)); diff --git a/include/trace/define_trace.h b/include/trace/define_trace.h index 00723935dcc7..08ed5ce63a96 100644 --- a/include/trace/define_trace.h +++ b/include/trace/define_trace.h @@ -72,6 +72,13 @@ #define DECLARE_TRACE(name, proto, args) \ DEFINE_TRACE(name, PARAMS(proto), PARAMS(args)) +/* If requested, create helpers for calling these tracepoints from Rust. */ +#ifdef CREATE_RUST_TRACE_POINTS +#undef DEFINE_RUST_DO_TRACE +#define DEFINE_RUST_DO_TRACE(name, proto, args)\ + DEFINE_RUST_DO_TRACE_REAL(name, PARAMS(proto), PARAMS(args)) +#endif + #undef TRACE_INCLUDE #undef __TRACE_INCLUDE @@ -129,6 +136,11 @@ # undef UNDEF_TRACE_INCLUDE_PATH #endif +#ifdef CREATE_RUST_TRACE_POINTS +# undef DEFINE_RUST_DO_TRACE +# define DEFINE_RUST_DO_TRACE(name, proto, args) +#endif + /* We may be processing more files */ #define CREATE_TRACE_POINTS diff --git a/rust/bindings/bindings_helper.h b/rust/bindings/bindings_helper.h index ddb5644d4fd9..d442f9ccfc2c 100644 --- a/rust/bindings/bindings_helper.h +++ b/rust/bindings/bindings_helper.h @@ -15,6 +15,7 @@ #include #include #include +#include #include #include diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs index fffd4e1dd1c1..9ae90eb69020 100644 --- a/rust/kernel/lib.rs +++ b/rust/kernel/lib.rs @@ -46,6 +46,7 @@ pub mod sync; pub mod task; pub mod time; +pub mod tracepoint; pub mod types; pub mod workqueue; diff --git a/rust/kernel/tracepoint.rs b/rust/kernel/tracepoint.rs new file mode 100644 index ..1005f09e0330 --- /dev/null +++ b/rust/kernel/tracepoint.rs @@ -0,0 +1,47 @@ +// SPDX-License-Identifier: GPL-2.0 + +// Copyright (C) 2024 Google LLC. + +//! Logic for tracepoints. + +/// Declare the Rust entry point for a tracepoint. +#[macro_export] +macro_rules! declare_trace { +($($(#[$attr:meta])* $pub:vis fn $name:ident($($argname:ident : $argtyp:ty),* $(,)?);)*) => {$( +$(
[PATCH v4 1/2] rust: add static_key_false
Add just enough support for static key so that we can use it from tracepoints. Tracepoints rely on `static_key_false` even though it is deprecated, so we add the same functionality to Rust. It is not possible to use the existing C implementation of arch_static_branch because it passes the argument `key` to inline assembly as an 'i' parameter, so any attempt to add a C helper for this function will fail to compile because the value of `key` must be known at compile-time. Signed-off-by: Alice Ryhl --- rust/kernel/arch/arm64/jump_label.rs | 34 rust/kernel/arch/loongarch/jump_label.rs | 35 + rust/kernel/arch/mod.rs | 24 rust/kernel/arch/riscv/jump_label.rs | 38 rust/kernel/arch/x86/jump_label.rs | 35 + rust/kernel/lib.rs | 2 ++ rust/kernel/static_key.rs| 32 +++ scripts/Makefile.build | 2 +- 8 files changed, 201 insertions(+), 1 deletion(-) diff --git a/rust/kernel/arch/arm64/jump_label.rs b/rust/kernel/arch/arm64/jump_label.rs new file mode 100644 index ..5eede2245718 --- /dev/null +++ b/rust/kernel/arch/arm64/jump_label.rs @@ -0,0 +1,34 @@ +// SPDX-License-Identifier: GPL-2.0 + +// Copyright (C) 2024 Google LLC. + +//! Arm64 Rust implementation of jump_label.h + +/// arm64 implementation of arch_static_branch +#[macro_export] +#[cfg(target_arch = "aarch64")] +macro_rules! arch_static_branch { +($key:path, $keytyp:ty, $field:ident, $branch:expr) => {'my_label: { +core::arch::asm!( +r#" +1: nop + +.pushsection __jump_table, "aw" +.align 3 +.long 1b - ., {0} - . +.quad {1} + {2} + {3} - . +.popsection +"#, +label { +break 'my_label true; +}, +sym $key, +const ::core::mem::offset_of!($keytyp, $field), +const $crate::arch::bool_to_int($branch), +); + +break 'my_label false; +}}; +} + +pub use arch_static_branch; diff --git a/rust/kernel/arch/loongarch/jump_label.rs b/rust/kernel/arch/loongarch/jump_label.rs new file mode 100644 index ..8d31318aeb11 --- /dev/null +++ b/rust/kernel/arch/loongarch/jump_label.rs @@ -0,0 +1,35 @@ +// SPDX-License-Identifier: GPL-2.0 + +// Copyright (C) 2024 Google LLC. + +//! Loongarch Rust implementation of jump_label.h + +/// loongarch implementation of arch_static_branch +#[doc(hidden)] +#[macro_export] +#[cfg(target_arch = "loongarch64")] +macro_rules! arch_static_branch { +($key:path, $keytyp:ty, $field:ident, $branch:expr) => {'my_label: { +core::arch::asm!( +r#" +1: nop + +.pushsection __jump_table, "aw" +.align 3 +.long 1b - ., {0} - . +.quad {1} + {2} + {3} - . +.popsection +"#, +label { +break 'my_label true; +}, +sym $key, +const ::core::mem::offset_of!($keytyp, $field), +const $crate::arch::bool_to_int($branch), +); + +break 'my_label false; +}}; +} + +pub use arch_static_branch; diff --git a/rust/kernel/arch/mod.rs b/rust/kernel/arch/mod.rs new file mode 100644 index ..14271d2530e9 --- /dev/null +++ b/rust/kernel/arch/mod.rs @@ -0,0 +1,24 @@ +// SPDX-License-Identifier: GPL-2.0 + +// Copyright (C) 2024 Google LLC. + +//! Architecture specific code. + +#[cfg_attr(target_arch = "aarch64", path = "arm64")] +#[cfg_attr(target_arch = "x86_64", path = "x86")] +#[cfg_attr(target_arch = "loongarch64", path = "loongarch")] +#[cfg_attr(target_arch = "riscv64", path = "riscv")] +mod inner { +pub mod jump_label; +} + +pub use self::inner::*; + +/// A helper used by inline assembly to pass a boolean to as a `const` parameter. +/// +/// Using this function instead of a cast lets you assert that the input is a boolean, rather than +/// some other type that can be cast to an integer. +#[doc(hidden)] +pub const fn bool_to_int(b: bool) -> i32 { +b as i32 +} diff --git a/rust/kernel/arch/riscv/jump_label.rs b/rust/kernel/arch/riscv/jump_label.rs new file mode 100644 index ..2672e0c6f033 --- /dev/null +++ b/rust/kernel/arch/riscv/jump_label.rs @@ -0,0 +1,38 @@ +// SPDX-License-Identifier: GPL-2.0 + +// Copyright (C) 2024 Google LLC. + +//! RiscV Rust implementation of jump_label.h + +/// riscv implementation of arch_static_branch +#[macro_export] +#[cfg(target_arch = "riscv64")] +macro_rules! arch_static_branch { +($key:path, $keytyp:ty, $field:ident, $branch:expr) => {'my_label: { +core::arch::asm!( +r#" +.align 2 +.option push +.option norelax +.option norvc +1: nop +
[PATCH v4 0/2] Tracepoints and static branch in Rust
An important part of a production ready Linux kernel driver is tracepoints. So to write production ready Linux kernel drivers in Rust, we must be able to call tracepoints from Rust code. This patch series adds support for calling tracepoints declared in C from Rust. To use the tracepoint support, you must: 1. Declare the tracepoint in a C header file as usual. 2. Add #define CREATE_RUST_TRACE_POINTS next to your #define CREATE_TRACE_POINTS. 2. Make sure that the header file is visible to bindgen. 3. Use the declare_trace! macro in your Rust code to generate Rust functions that call into the tracepoint. For example, the kernel has a tracepoint called `sched_kthread_stop`. It is declared like this: TRACE_EVENT(sched_kthread_stop, TP_PROTO(struct task_struct *t), TP_ARGS(t), TP_STRUCT__entry( __array(char, comm, TASK_COMM_LEN ) __field(pid_t, pid ) ), TP_fast_assign( memcpy(__entry->comm, t->comm, TASK_COMM_LEN); __entry->pid= t->pid; ), TP_printk("comm=%s pid=%d", __entry->comm, __entry->pid) ); To call the above tracepoint from Rust code, you must first ensure that the Rust helper for the tracepoint is generated. To do this, you would modify kernel/sched/core.c by adding #define CREATE_RUST_TRACE_POINTS. Next, you would include include/trace/events/sched.h in rust/bindings/bindings_helper.h so that the exported C functions are visible to Rust, and then you would declare the tracepoint in Rust: declare_trace! { fn sched_kthread_stop(task: *mut task_struct); } This will define an inline Rust function that checks the static key, calling into rust_do_trace_##name if the tracepoint is active. Since these tracepoints often take raw pointers as arguments, it may be convenient to wrap it in a safe wrapper: mod raw { declare_trace! { fn sched_kthread_stop(task: *mut task_struct); } } #[inline] pub fn trace_sched_kthread_stop(task: ) { // SAFETY: The pointer to `task` is valid. unsafe { raw::sched_kthread_stop(task.as_raw()) } } A future expansion of the tracepoint support could generate these safe versions automatically, but that is left as future work for now. This is intended for use in the Rust Binder driver, which was originally sent as an RFC [1]. The RFC did not include tracepoint support, but you can see how it will be used in Rust Binder at [2]. The author has verified that the tracepoint support works on Android devices. This implementation implements support for static keys in Rust so that the actual static branch happens in the Rust object file. However, the __DO_TRACE body remains in C code. See v1 for an implementation where __DO_TRACE is also implemented in Rust. Link: https://lore.kernel.org/rust-for-linux/20231101-rust-binder-v1-0-08ba9197f...@google.com/ [1] Link: https://r.android.com/3119993 [2] Signed-off-by: Alice Ryhl --- Changes in v4: - Move arch-specific code into rust/kernel/arch. - Restore DEFINE_RUST_DO_TRACE at end of define_trace.h - Link to v3: https://lore.kernel.org/r/20240621-tracepoint-v3-0-9e44eeea2...@google.com Changes in v3: - Support for Rust static_key on loongarch64 and riscv64. - Avoid failing compilation on architectures that are missing Rust static_key support when the archtectures does not actually use it. - Link to v2: https://lore.kernel.org/r/20240610-tracepoint-v2-0-faebad81b...@google.com Changes in v2: - Call into C code for __DO_TRACE. - Drop static_call patch, as it is no longer needed. - Link to v1: https://lore.kernel.org/r/20240606-tracepoint-v1-0-6551627bf...@google.com --- Alice Ryhl (2): rust: add static_key_false rust: add tracepoint support include/linux/tracepoint.h | 18 +++- include/trace/define_trace.h | 12 rust/bindings/bindings_helper.h | 1 + rust/kernel/arch/arm64/jump_label.rs | 34 +++ rust/kernel/arch/loongarch/jump_label.rs | 35 rust/kernel/arch/mod.rs | 24 rust/kernel/arch/riscv/jump_label.rs | 38 ++ rust/kernel/arch/x86/jump_label.rs | 35 rust/kernel/lib.rs | 3 ++ rust/kernel/static_key.rs| 32 ++ rust/kernel/tracepoint.rs| 47 scripts/Makefile.build | 2 +- 12 files changed, 279 insertions(+), 2 deletions(-) --- base-commit: 1613e604df0cd359cf2a7fbd9be7a0bcfacfabd0 change-id: 20240606-tracepoint-31e15b90e471 Best regards, -- Alice Ryhl
Re: [PATCH 13/14] tracefs: Convert to new uid/gid option parsing helpers
On Thu, 27 Jun 2024 19:40:44 -0500 Eric Sandeen wrote: > Convert to new uid/gid option parsing helpers > > Signed-off-by: Eric Sandeen Acked-by: Steven Rostedt (Google) -- Steve
Re: [PATCH] kallsyms, livepatch: Fix livepatch with CONFIG_LTO_CLANG
On Fri, 7 Jun 2024, Song Liu wrote: > Hi Miroslav, > > Thanks for reviewing the patch! > > On Fri, Jun 7, 2024 at 6:06 AM Miroslav Benes wrote: > > > > Hi, > > > > On Tue, 4 Jun 2024, Song Liu wrote: > > > > > With CONFIG_LTO_CLANG, the compiler may postfix symbols with .llvm. > > > to avoid symbol duplication. scripts/kallsyms.c sorted the symbols > > > without these postfixes. The default symbol lookup also removes these > > > postfixes before comparing symbols. > > > > > > On the other hand, livepatch need to look up symbols with the full names. > > > However, calling kallsyms_on_each_match_symbol with full name (with the > > > postfix) cannot find the symbol(s). As a result, we cannot livepatch > > > kernel functions with .llvm. postfix or kernel functions that use > > > relocation information to symbols with .llvm. postfixes. > > > > > > Fix this by calling kallsyms_on_each_match_symbol without the postfix; > > > and then match the full name (with postfix) in klp_match_callback. > > > > > > Signed-off-by: Song Liu > > > --- > > > include/linux/kallsyms.h | 13 + > > > kernel/kallsyms.c| 21 - > > > kernel/livepatch/core.c | 32 +++- > > > 3 files changed, 60 insertions(+), 6 deletions(-) > > > > I do not like much that something which seems to be kallsyms-internal is > > leaked out. You need to export cleanup_symbol_name() and there is now a > > lot of code outside. I would feel much more comfortable if it is all > > hidden from kallsyms users and kept there. Would it be possible? > > I think it is possible. Currently, kallsyms_on_each_match_symbol matches > symbols without the postfix. We can add a variation or a parameter, so > that it matches the full name with post fix. I think it might be better. Luis, what is your take on this? > > Moreover, isn't there a similar problem for ftrace, kprobes, ebpf,...? > > Yes, there is a similar problem with tracing use cases. But the requirements > are not the same: > > For livepatch, we have to point to the exact symbol we want to patch or > relocation to. We have sympos API defined to differentiate different symbols > with the same name. Yes. In fact, sympos may be used to solve even this problem. The user would disregard .llvm. suffix and they are suddenly in the same situation which sympos aims to solve. I will not argue with you if say it is cumbersome. > For tracing, some discrepancy is acceptable. AFAICT, there isn't an API > similar to sympos yet. Also, we can play some tricks with tracing. For > example, we can use "uniq symbol + offset" to point a kprobe to one of > the duplicated symbols. If I am not mistaken, there was a patch set to address this. Luis might remember more. Regards, Miroslav
Re: [RFC PATCH v2] ptp: Add vDSO-style vmclock support
On Fri, 2024-06-28 at 13:33 +0200, Peter Hilber wrote: > On 27.06.24 16:52, David Woodhouse wrote: > > I already added a flags field, so this might look something like: > > > > /* > > * Smearing flags. The UTC clock exposed through this structure > > * is only ever true UTC, but a guest operating system may > > * choose to offer a monotonic smeared clock to its users. This > > * merely offers a hint about what kind of smearing to perform, > > * for consistency with systems in the nearby environment. > > */ > > #define VMCLOCK_FLAGS_SMEAR_UTC_SLS (1<<5) /* draft-kuhn-leapsecond-00.txt > > */ > > > > (UTC-SLS is probably a bad example but are there formal definitions for > > anything else?) > > I think it could also be more generic, like flags for linear smearing, > cosine smearing(?), and smear_start_sec and smear_end_sec fields (relative > to the leap second start). That could also represent UTC-SLS, and > noon-to-noon, and it would be well-defined. > > This should reduce the likelihood that the guest doesn't know the smearing > variant. I'm wary of making it too generic. That would seem to encourage a *proliferation* of false "UTC-like" clocks. It's bad enough that we do smearing at all, let alone that we don't have a single definition of how to do it. I made the smearing hint a full uint8_t instead of using bits in flags, in the end. That gives us a full 255 ways of lying to users about what the time is, so we're unlikely to run out. And it's easy enough to add a new VMCLOCK_SMEARING_XXX type to the 'registry' for any new methods that get invented. > > > > + /* > > > > + * This field changes to another non-repeating value when the > > > > CPU > > > > + * counter is disrupted, for example on live migration. > > > > + */ > > > > + uint64_t disruption_marker; > > > > > > The field could also change when the clock is stepped (leap seconds > > > excepted), or when the clock frequency is slewed. > > > > I'm not sure. The concept of the disruption marker is that it tells the > > guest to throw away any calibration of the counter that the guest has > > done for *itself* (with NTP, other PTP devices, etc.). > > > > One mode for this device would be not to populate the clock fields at > > all, but *only* to signal disruption when it occurs. So the guest can > > abort transactions until it's resynced its clocks (to avoid incurring > > fines if breaking databases, etc.). > > > > Exposing the host timekeeping through the structure means that the > > migrated guest can keep working because it can trust the timekeeping > > performed by the (new) host and exposed to it. > > > > If the counter is actually varying in frequency over time, and the host > > is slewing the clock frequency that it reports, that *isn't* a step > > change and doesn't mean that the guest should throw away any > > calibration that it's been doing for itself. One hopes that the guest > > would have detected the *same* frequency change, and be adapting for > > itself. So I don't think that should indicate a disruption. > > > > I think the same is even true if the clock is stepped by the host. The > > actual *counter* hasn't changed, so the guest is better off ignoring > > the vacillating host and continuing to derive its idea of time from the > > hardware counter itself, as calibrated against some external NTP/PTP > > sources. Surely we actively *don't* to tell the guest to throw its own > > calibrations away, in this case? > > In case the guest is also considering other time sources, it might indeed > not be a good idea to mix host clock changes into the hardware counter > disruption marker. > > But if the vmclock is the authoritative source of time, it can still be > helpful to know about such changes, maybe through another marker. Could that be the existing seq_count field? Skewing the counter_period_frac_sec as the underlying oscillator speeds up and slows down is perfectly normal and expected, and we already expect the seq_count to change when that happens. Maybe step changes are different, but arguably if the time advertised by the host steps *outside* the error bounds previously advertised, that's just broken? Depending on how the clock information is fed, a change in seq_count may even result in non-monotonicity. If the underlying oscillator has sped up and the structure is updated accordingly, the time calculated the moment *before* that update may appear later than the time calculated immediately after it. It's up to the guest operating system to feed that information into its own timekeeping system and skew towards correctness instead of stepping the time it reports to its users. smime.p7s Description: S/MIME cryptographic signature
[PATCH v2 6/6] riscv: ftrace: support PREEMPT
Now, we can safely enable dynamic ftrace with kernel preemption. Signed-off-by: Andy Chiu --- arch/riscv/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig index 55c70efbad0a..881ea466ff52 100644 --- a/arch/riscv/Kconfig +++ b/arch/riscv/Kconfig @@ -139,7 +139,7 @@ config RISCV select HAVE_FTRACE_MCOUNT_RECORD if !XIP_KERNEL select HAVE_FUNCTION_GRAPH_TRACER select HAVE_FUNCTION_GRAPH_RETVAL if HAVE_FUNCTION_GRAPH_TRACER - select HAVE_FUNCTION_TRACER if !XIP_KERNEL && !PREEMPTION + select HAVE_FUNCTION_TRACER if !XIP_KERNEL select HAVE_EBPF_JIT if MMU select HAVE_GUP_FAST if MMU select HAVE_FUNCTION_ARG_ACCESS_API -- 2.43.0
[PATCH v2 5/6] riscv: vector: Support calling schedule() for preemptible Vector
Each function entry implies a call to ftrace infrastructure. And it may call into schedule in some cases. So, it is possible for preemptible kernel-mode Vector to implicitly call into schedule. Since all V-regs are caller-saved, it is possible to drop all V context when a thread voluntarily call schedule(). Besides, we currently don't pass argument through vector register, so we don't have to save/restore V-regs in ftrace trampoline. Signed-off-by: Andy Chiu --- arch/riscv/include/asm/processor.h | 5 + arch/riscv/include/asm/vector.h| 22 +++--- 2 files changed, 24 insertions(+), 3 deletions(-) diff --git a/arch/riscv/include/asm/processor.h b/arch/riscv/include/asm/processor.h index 68c3432dc6ea..02598e168659 100644 --- a/arch/riscv/include/asm/processor.h +++ b/arch/riscv/include/asm/processor.h @@ -95,6 +95,10 @@ struct pt_regs; * Thus, the task does not own preempt_v. Any use of Vector will have to * save preempt_v, if dirty, and fallback to non-preemptible kernel-mode * Vector. + * - bit 29: The thread voluntarily calls schedule() while holding an active + *preempt_v. All preempt_v context should be dropped in such case because + *V-regs are caller-saved. Only sstatus.VS=ON is persisted across a + *schedule() call. * - bit 30: The in-kernel preempt_v context is saved, and requries to be *restored when returning to the context that owns the preempt_v. * - bit 31: The in-kernel preempt_v context is dirty, as signaled by the @@ -109,6 +113,7 @@ struct pt_regs; #define RISCV_PREEMPT_V0x0100 #define RISCV_PREEMPT_V_DIRTY 0x8000 #define RISCV_PREEMPT_V_NEED_RESTORE 0x4000 +#define RISCV_PREEMPT_V_IN_SCHEDULE0x2000 /* CPU-specific state of a task */ struct thread_struct { diff --git a/arch/riscv/include/asm/vector.h b/arch/riscv/include/asm/vector.h index be7d309cca8a..fbf17aba92c1 100644 --- a/arch/riscv/include/asm/vector.h +++ b/arch/riscv/include/asm/vector.h @@ -75,6 +75,11 @@ static __always_inline void riscv_v_disable(void) csr_clear(CSR_SSTATUS, SR_VS); } +static __always_inline bool riscv_v_is_on(void) +{ + return !!(csr_read(CSR_SSTATUS) & SR_VS); +} + static __always_inline void __vstate_csr_save(struct __riscv_v_ext_state *dest) { asm volatile ( @@ -243,6 +248,11 @@ static inline void __switch_to_vector(struct task_struct *prev, struct pt_regs *regs; if (riscv_preempt_v_started(prev)) { + if (riscv_v_is_on()) { + WARN_ON(prev->thread.riscv_v_flags & RISCV_V_CTX_DEPTH_MASK); + riscv_v_disable(); + prev->thread.riscv_v_flags |= RISCV_PREEMPT_V_IN_SCHEDULE; + } if (riscv_preempt_v_dirty(prev)) { __riscv_v_vstate_save(>thread.kernel_vstate, prev->thread.kernel_vstate.datap); @@ -253,10 +263,16 @@ static inline void __switch_to_vector(struct task_struct *prev, riscv_v_vstate_save(>thread.vstate, regs); } - if (riscv_preempt_v_started(next)) - riscv_preempt_v_set_restore(next); - else + if (riscv_preempt_v_started(next)) { + if (next->thread.riscv_v_flags & RISCV_PREEMPT_V_IN_SCHEDULE) { + next->thread.riscv_v_flags &= ~RISCV_PREEMPT_V_IN_SCHEDULE; + riscv_v_enable(); + } else { + riscv_preempt_v_set_restore(next); + } + } else { riscv_v_vstate_set_restore(next, task_pt_regs(next)); + } } void riscv_v_vstate_ctrl_init(struct task_struct *tsk); -- 2.43.0
[PATCH v2 4/6] riscv: ftrace: do not use stop_machine to update code
Now it is safe to remove dependency from stop_machine() for us to patch code in ftrace. Signed-off-by: Andy Chiu --- arch/riscv/kernel/ftrace.c | 53 -- 1 file changed, 4 insertions(+), 49 deletions(-) diff --git a/arch/riscv/kernel/ftrace.c b/arch/riscv/kernel/ftrace.c index 5ebe412280ef..57a6558e212e 100644 --- a/arch/riscv/kernel/ftrace.c +++ b/arch/riscv/kernel/ftrace.c @@ -13,23 +13,13 @@ #include #ifdef CONFIG_DYNAMIC_FTRACE -void ftrace_arch_code_modify_prepare(void) __acquires(_mutex) +void arch_ftrace_update_code(int command) { mutex_lock(_mutex); - - /* -* The code sequences we use for ftrace can't be patched while the -* kernel is running, so we need to use stop_machine() to modify them -* for now. This doesn't play nice with text_mutex, we use this flag -* to elide the check. -*/ - riscv_patch_in_stop_machine = true; -} - -void ftrace_arch_code_modify_post_process(void) __releases(_mutex) -{ - riscv_patch_in_stop_machine = false; + command |= FTRACE_MAY_SLEEP; + ftrace_modify_all_code(command); mutex_unlock(_mutex); + flush_icache_all(); } static int ftrace_check_current_call(unsigned long hook_pos, @@ -155,41 +145,6 @@ int ftrace_update_ftrace_func(ftrace_func_t func) return __ftrace_modify_call_site(_call_dest, func, true); } -struct ftrace_modify_param { - int command; - atomic_t cpu_count; -}; - -static int __ftrace_modify_code(void *data) -{ - struct ftrace_modify_param *param = data; - - if (atomic_inc_return(>cpu_count) == num_online_cpus()) { - ftrace_modify_all_code(param->command); - /* -* Make sure the patching store is effective *before* we -* increment the counter which releases all waiting CPUs -* by using the release variant of atomic increment. The -* release pairs with the call to local_flush_icache_all() -* on the waiting CPU. -*/ - atomic_inc_return_release(>cpu_count); - } else { - while (atomic_read(>cpu_count) <= num_online_cpus()) - cpu_relax(); - - local_flush_icache_all(); - } - - return 0; -} - -void arch_ftrace_update_code(int command) -{ - struct ftrace_modify_param param = { command, ATOMIC_INIT(0) }; - - stop_machine(__ftrace_modify_code, , cpu_online_mask); -} #endif #ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS -- 2.43.0
[PATCH v2 3/6] riscv: ftrace: prepare ftrace for atomic code patching
We use an AUIPC+JALR pair to jump into a ftrace trampoline. Since instruction fetch can break down to 4 byte at a time, it is impossible to update two instructions without a race. In order to mitigate it, we initialize the patchable entry to AUIPC + NOP4. Then, the run-time code patching can change NOP4 to JALR to eable/disable ftrcae from a function. This limits the reach of each ftrace entry to +-2KB displacing from ftrace_caller. Starting from the trampoline, we add a level of indirection for it to reach ftrace caller target. Now, it loads the target address from a memory location, then perform the jump. This enable the kernel to update the target atomically. The ordering of reading/updating the targert address should be guarded by generic ftrace code, where it sends smp_rmb ipi. Signed-off-by: Andy Chiu --- arch/riscv/include/asm/ftrace.h | 4 +++ arch/riscv/kernel/ftrace.c | 80 ++--- arch/riscv/kernel/mcount-dyn.S | 9 +++-- 3 files changed, 62 insertions(+), 31 deletions(-) diff --git a/arch/riscv/include/asm/ftrace.h b/arch/riscv/include/asm/ftrace.h index 5f81c53dbfd9..7199383f8c02 100644 --- a/arch/riscv/include/asm/ftrace.h +++ b/arch/riscv/include/asm/ftrace.h @@ -81,6 +81,7 @@ struct dyn_arch_ftrace { #define JALR_T0(0x000282e7) #define AUIPC_T0 (0x0297) #define NOP4 (0x0013) +#define JALR_RANGE (JALR_SIGN_MASK - 1) #define to_jalr_t0(offset) \ (((offset & JALR_OFFSET_MASK) << JALR_SHIFT) | JALR_T0) @@ -118,6 +119,9 @@ do { \ * Let auipc+jalr be the basic *mcount unit*, so we make it 8 bytes here. */ #define MCOUNT_INSN_SIZE 8 +#define MCOUNT_AUIPC_SIZE 4 +#define MCOUNT_JALR_SIZE 4 +#define MCOUNT_NOP4_SIZE 4 #ifndef __ASSEMBLY__ struct dyn_ftrace; diff --git a/arch/riscv/kernel/ftrace.c b/arch/riscv/kernel/ftrace.c index 4b95c574fd04..5ebe412280ef 100644 --- a/arch/riscv/kernel/ftrace.c +++ b/arch/riscv/kernel/ftrace.c @@ -64,42 +64,64 @@ static int ftrace_check_current_call(unsigned long hook_pos, return 0; } -static int __ftrace_modify_call(unsigned long hook_pos, unsigned long target, - bool enable, bool ra) +static int __ftrace_modify_call(unsigned long hook_pos, unsigned long target, bool validate) { unsigned int call[2]; - unsigned int nops[2] = {NOP4, NOP4}; + unsigned int replaced[2]; + + make_call_t0(hook_pos, target, call); - if (ra) - make_call_ra(hook_pos, target, call); - else - make_call_t0(hook_pos, target, call); + if (validate) { + /* +* Read the text we want to modify; +* return must be -EFAULT on read error +*/ + if (copy_from_kernel_nofault(replaced, (void *)hook_pos, +MCOUNT_INSN_SIZE)) + return -EFAULT; + + if (replaced[0] != call[0]) { + pr_err("%p: expected (%08x) but got (%08x)\n", + (void *)hook_pos, call[0], replaced[0]); + return -EINVAL; + } + } - /* Replace the auipc-jalr pair at once. Return -EPERM on write error. */ - if (patch_insn_write((void *)hook_pos, enable ? call : nops, MCOUNT_INSN_SIZE)) + /* Replace the jalr at once. Return -EPERM on write error. */ + if (patch_insn_write((void *)(hook_pos + MCOUNT_AUIPC_SIZE), call + 1, MCOUNT_JALR_SIZE)) return -EPERM; return 0; } -int ftrace_make_call(struct dyn_ftrace *rec, unsigned long addr) +static int __ftrace_modify_call_site(ftrace_func_t *hook_pos, ftrace_func_t target, bool enable) { - unsigned int call[2]; + ftrace_func_t call = target; + ftrace_func_t nops = _stub; - make_call_t0(rec->ip, addr, call); - - if (patch_insn_write((void *)rec->ip, call, MCOUNT_INSN_SIZE)) - return -EPERM; + WRITE_ONCE(*hook_pos, enable ? call : nops); return 0; } +int ftrace_make_call(struct dyn_ftrace *rec, unsigned long addr) +{ + unsigned long distance, orig_addr; + + orig_addr = (unsigned long)_caller; + distance = addr > orig_addr ? addr - orig_addr : orig_addr - addr; + if (distance > JALR_RANGE) + return -EINVAL; + + return __ftrace_modify_call(rec->ip, addr, false); +} + int ftrace_make_nop(struct module *mod, struct dyn_ftrace *rec, unsigned long addr) { - unsigned int nops[2] = {NOP4, NOP4}; + unsigned int nops[1] = {NOP4}; - if (patch_insn_write((void *)rec->ip, nops, MCOUNT_INSN_SIZE)) + if (patch_insn_write((void *)(rec->ip + MCOUNT_AUIPC_SIZE), nops,
[PATCH v2 2/6] riscv: ftrace: align patchable functions to 4 Byte boundary
We are changing ftrace code patching in order to remove dependency from stop_machine() and enable kernel preemption. This requires us to align functions entry at a 4-B align address. However, -falign-functions on older versions of GCC alone was not strong enoungh to align all functions. In fact, cold functions are not aligned after turning on optimizations. We consider this is a bug in GCC and turn off guess-branch-probility as a workaround to align all functions. GCC bug id: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88345 The option -fmin-function-alignment is able to align all functions properly on newer versions of gcc. So, we add a cc-option to test if the toolchain supports it. Suggested-by: Evgenii Shatokhin Signed-off-by: Andy Chiu --- Changelog v2: - Use CC_HAS_MIN_FUNCTION_ALIGNMENT and it friends to prevent reinventing wheels (Nathan) --- arch/riscv/Kconfig | 2 ++ 1 file changed, 2 insertions(+) diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig index 704d4683bcfa..55c70efbad0a 100644 --- a/arch/riscv/Kconfig +++ b/arch/riscv/Kconfig @@ -133,6 +133,7 @@ config RISCV select HAVE_DEBUG_KMEMLEAK select HAVE_DMA_CONTIGUOUS if MMU select HAVE_DYNAMIC_FTRACE if !XIP_KERNEL && MMU && (CLANG_SUPPORTS_DYNAMIC_FTRACE || GCC_SUPPORTS_DYNAMIC_FTRACE) + select FUNCTION_ALIGNMENT_4B if HAVE_DYNAMIC_FTRACE && RISCV_ISA_C select HAVE_DYNAMIC_FTRACE_WITH_DIRECT_CALLS select HAVE_DYNAMIC_FTRACE_WITH_ARGS if HAVE_DYNAMIC_FTRACE select HAVE_FTRACE_MCOUNT_RECORD if !XIP_KERNEL @@ -208,6 +209,7 @@ config CLANG_SUPPORTS_DYNAMIC_FTRACE config GCC_SUPPORTS_DYNAMIC_FTRACE def_bool CC_IS_GCC depends on $(cc-option,-fpatchable-function-entry=8) + depends on CC_HAS_MIN_FUNCTION_ALIGNMENT || !RISCV_ISA_C config HAVE_SHADOW_CALL_STACK def_bool $(cc-option,-fsanitize=shadow-call-stack) -- 2.43.0
[PATCH v2 1/6] riscv: ftrace: support fastcc in Clang for WITH_ARGS
Some caller-saved registers which are not defined as function arguments in the ABI can still be passed as arguments when the kernel is compiled with Clang. As a result, we must save and restore those registers to prevent ftrace from clobbering them. - [1]: https://reviews.llvm.org/D68559 Reported-by: Evgenii Shatokhin Closes: https://lore.kernel.org/linux-riscv/7e7c7914-445d-426d-89a0-59a9199c4...@yadro.com/ Acked-by: Nathan Chancellor Signed-off-by: Andy Chiu --- arch/riscv/include/asm/ftrace.h | 7 +++ arch/riscv/kernel/asm-offsets.c | 7 +++ arch/riscv/kernel/mcount-dyn.S | 16 ++-- 3 files changed, 28 insertions(+), 2 deletions(-) diff --git a/arch/riscv/include/asm/ftrace.h b/arch/riscv/include/asm/ftrace.h index 9eb31a7ea0aa..5f81c53dbfd9 100644 --- a/arch/riscv/include/asm/ftrace.h +++ b/arch/riscv/include/asm/ftrace.h @@ -144,6 +144,13 @@ struct ftrace_regs { unsigned long a5; unsigned long a6; unsigned long a7; +#ifdef CONFIG_CC_IS_CLANG + unsigned long t2; + unsigned long t3; + unsigned long t4; + unsigned long t5; + unsigned long t6; +#endif }; }; }; diff --git a/arch/riscv/kernel/asm-offsets.c b/arch/riscv/kernel/asm-offsets.c index b09ca5f944f7..db5a26fcc9ae 100644 --- a/arch/riscv/kernel/asm-offsets.c +++ b/arch/riscv/kernel/asm-offsets.c @@ -497,6 +497,13 @@ void asm_offsets(void) DEFINE(FREGS_SP,offsetof(struct ftrace_regs, sp)); DEFINE(FREGS_S0,offsetof(struct ftrace_regs, s0)); DEFINE(FREGS_T1,offsetof(struct ftrace_regs, t1)); +#ifdef CONFIG_CC_IS_CLANG + DEFINE(FREGS_T2,offsetof(struct ftrace_regs, t2)); + DEFINE(FREGS_T3,offsetof(struct ftrace_regs, t3)); + DEFINE(FREGS_T4,offsetof(struct ftrace_regs, t4)); + DEFINE(FREGS_T5,offsetof(struct ftrace_regs, t5)); + DEFINE(FREGS_T6,offsetof(struct ftrace_regs, t6)); +#endif DEFINE(FREGS_A0,offsetof(struct ftrace_regs, a0)); DEFINE(FREGS_A1,offsetof(struct ftrace_regs, a1)); DEFINE(FREGS_A2,offsetof(struct ftrace_regs, a2)); diff --git a/arch/riscv/kernel/mcount-dyn.S b/arch/riscv/kernel/mcount-dyn.S index 745dd4c4a69c..e988bd26b28b 100644 --- a/arch/riscv/kernel/mcount-dyn.S +++ b/arch/riscv/kernel/mcount-dyn.S @@ -96,7 +96,13 @@ REG_S x8, FREGS_S0(sp) #endif REG_S x6, FREGS_T1(sp) - +#ifdef CONFIG_CC_IS_CLANG + REG_S x7, FREGS_T2(sp) + REG_S x28, FREGS_T3(sp) + REG_S x29, FREGS_T4(sp) + REG_S x30, FREGS_T5(sp) + REG_S x31, FREGS_T6(sp) +#endif // save the arguments REG_S x10, FREGS_A0(sp) REG_S x11, FREGS_A1(sp) @@ -115,7 +121,13 @@ REG_L x8, FREGS_S0(sp) #endif REG_L x6, FREGS_T1(sp) - +#ifdef CONFIG_CC_IS_CLANG + REG_L x7, FREGS_T2(sp) + REG_L x28, FREGS_T3(sp) + REG_L x29, FREGS_T4(sp) + REG_L x30, FREGS_T5(sp) + REG_L x31, FREGS_T6(sp) +#endif // restore the arguments REG_L x10, FREGS_A0(sp) REG_L x11, FREGS_A1(sp) -- 2.43.0
[PATCH v2 0/6] riscv: ftrace: atmoic patching and preempt improvements
This series makes atmoic code patching possible in riscv ftrace. A direct benefit of this is that we can get rid of stop_machine() when patching function entries. This also makes it possible to run ftrace with full kernel preemption. Before this series, the kernel initializes patchable function entries to NOP4 + NOP4. To start tracing, it updates entries to AUIPC + JALR while holding other cores in stop_machine. stop_machine() is required because it is impossible to update 2 instructions, and be seen atomically. And preemption must have to be prevented, as kernel preemption allows process to be scheduled out while executing on one of these instruction pairs. This series addresses the problem by initializing the first NOP4 to AUIPC. So, atmoic patching is possible because the kernel only has to update one instruction. As long as the instruction is naturally aligned, then it is expected to be updated atomically. However, the address range of the ftrace trampoline is limited to +-2K from ftrace_caller after appplying this series. This issue is expected to be solved by Puranjay's CALL_OPS, where it adds 8B naturally align data in front of pacthable functions and can use it to direct execution out to any custom trampolines. The series is composed by three parts. The first part cleans up the existing issues when the kernel is compiled with clang.The second part modifies the ftrace code patching mechanism (2-4) as mentioned above. Then prepare ftrace to be able to run with kernel preemption (5,6) This series is tested after applying the following ftrace/patching in the fixes branch: - commit 57a369b6f2ee ("riscv: patch: Flush the icache right after patching to avoid illegal insns") - commit a2bd3a5b4b63 ("riscv: stacktrace: convert arch_stack_walk() to noinstr") Changes in v2: - Drop patch 1 as it is merged through fixes. - Drop patch 2, which converts kernel_text_address into notrace. As users can prevent tracing it by configuring the tracefs. - Use a more generic way in kconfig to align functions. - Link to v1: https://lore.kernel.org/r/20240613-dev-andyc-dyn-ftrace-v4-v1-0-1a538e12c...@sifive.com --- Andy Chiu (6): riscv: ftrace: support fastcc in Clang for WITH_ARGS riscv: ftrace: align patchable functions to 4 Byte boundary riscv: ftrace: prepare ftrace for atomic code patching riscv: ftrace: do not use stop_machine to update code riscv: vector: Support calling schedule() for preemptible Vector riscv: ftrace: support PREEMPT arch/riscv/Kconfig | 4 +- arch/riscv/include/asm/ftrace.h| 11 +++ arch/riscv/include/asm/processor.h | 5 ++ arch/riscv/include/asm/vector.h| 22 +- arch/riscv/kernel/asm-offsets.c| 7 ++ arch/riscv/kernel/ftrace.c | 133 - arch/riscv/kernel/mcount-dyn.S | 25 +-- 7 files changed, 121 insertions(+), 86 deletions(-) --- base-commit: a2bd3a5b4b63b95aea7dbf61d9395cd6696a2bc0 change-id: 20240613-dev-andyc-dyn-ftrace-v4-941d4a00ea19 Best regards, -- Andy Chiu
Re: [RFC PATCH v2] ptp: Add vDSO-style vmclock support
On Fri, 2024-06-28 at 13:33 +0200, Peter Hilber wrote: > > > > > /* > > * What time is exposed in the time_sec/time_frac_sec fields? > > */ > > uint8_t time_type; > > #define VMCLOCK_TIME_UNKNOWN0 /* Invalid / no time > > exposed */ > > #define VMCLOCK_TIME_UTC1 /* Since 1970-01-01 > > 00:00:00z */ > > #define VMCLOCK_TIME_TAI2 /* Since 1970-01-01 > > 00:00:00z */ > > #define VMCLOCK_TIME_MONOTONIC 3 /* Since undefined epoch */ > > > > /* Bit shift for counter_period_frac_sec and its error rate */ > > uint8_t counter_period_shift; > > > > /* > > * Unlike in NTP, this can indicate a leap second in the past. This > > * is needed to allow guests to derive an imprecise clock with > > * smeared leap seconds for themselves, as some modes of smearing > > * need the adjustments to continue even after the moment at which > > * the leap second should have occurred. > > */ > > int8_t leapsecond_direction; > > uint64_t leapsecond_tai_sec; /* Since 1970-01-01 00:00:00z */ > > > > /* > > * Paired values of counter and UTC at a given point in time. > > */ > > uint64_t counter_value; > > uint64_t time_sec; /* Since 1970-01-01 00:00:00z */ > > Nitpick: The comment is not valid any more for TIME_MONOTONIC. Ah yes, I "moved" that comment up to the UTC/TAI time_type values, but neglected to actually delete it from here. Fixed; thanks. smime.p7s Description: S/MIME cryptographic signature
Re: [RFC PATCH v2] ptp: Add vDSO-style vmclock support
On 27.06.24 18:03, David Woodhouse wrote: > > I've updated the tree at > https://git.infradead.org/users/dwmw2/linux.git/shortlog/refs/heads/vmclock > (but not yet the qemu one). > > I think I've taken into account all your comments apart from the one > about non-64-bit counters wrapping. I reduced the seq_count to 32 bit > to make room for a 32-bit flags field, added the time type > (UTC/TAI/MONOTONIC) and a smearing hint, with some straw man > definitions for smearing algorithms for which I could actually find > definitions. > > The structure now looks like this: > > > struct vmclock_abi { [...] > > /* >* What time is exposed in the time_sec/time_frac_sec fields? >*/ > uint8_t time_type; > #define VMCLOCK_TIME_UNKNOWN 0 /* Invalid / no time exposed */ > #define VMCLOCK_TIME_UTC 1 /* Since 1970-01-01 00:00:00z */ > #define VMCLOCK_TIME_TAI 2 /* Since 1970-01-01 00:00:00z */ > #define VMCLOCK_TIME_MONOTONIC3 /* Since undefined > epoch */ > > /* Bit shift for counter_period_frac_sec and its error rate */ > uint8_t counter_period_shift; > > /* >* Unlike in NTP, this can indicate a leap second in the past. This >* is needed to allow guests to derive an imprecise clock with >* smeared leap seconds for themselves, as some modes of smearing >* need the adjustments to continue even after the moment at which >* the leap second should have occurred. >*/ > int8_t leapsecond_direction; > uint64_t leapsecond_tai_sec; /* Since 1970-01-01 00:00:00z */ > > /* >* Paired values of counter and UTC at a given point in time. >*/ > uint64_t counter_value; > uint64_t time_sec; /* Since 1970-01-01 00:00:00z */ Nitpick: The comment is not valid any more for TIME_MONOTONIC.
Re: [RFC PATCH v2] ptp: Add vDSO-style vmclock support
On 27.06.24 16:52, David Woodhouse wrote: > On Thu, 2024-06-27 at 15:50 +0200, Peter Hilber wrote: >> On 25.06.24 21:01, David Woodhouse wrote: >>> From: David Woodhouse >>> >>> The vmclock "device" provides a shared memory region with precision clock >>> information. By using shared memory, it is safe across Live Migration. >>> >>> Like the KVM PTP clock, this can convert TSC-based cross timestamps into >>> KVM clock values. Unlike the KVM PTP clock, it does so only when such is >>> actually helpful. >>> >>> The memory region of the device is also exposed to userspace so it can be >>> read or memory mapped by application which need reliable notification of >>> clock disruptions. >>> >>> Signed-off-by: David Woodhouse >>> --- >>> >>> v2: >>> • Add gettimex64() support >>> • Convert TSC values to KVM clock when appropriate >>> • Require int128 support >>> • Add counter_period_shift >>> • Add timeout when seq_count is invalid >>> • Add flags field >>> • Better comments in vmclock ABI structure >>> • Explicitly forbid smearing (as clock rates would need to change) >> >> Leap second smearing information could still be conveyed through the >> vmclock_abi. AFAIU, to cover the popular smearing variants, it should be >> enough to indicate whether the driver should apply linear or cosine >> smearing, and the start time and end time. > > Yes. The clock information actually conveyed through the {counter, > time, rate} tuple should never be smeared, and should only ever be UTC. > > But we could provide a hint to the guest operating system about what > type of smearing to perform, *if* it chooses to offer a clock other > than the standard CLOCK_REALTIME to its users. > > I already added a flags field, so this might look something like: > > /* > * Smearing flags. The UTC clock exposed through this structure > * is only ever true UTC, but a guest operating system may > * choose to offer a monotonic smeared clock to its users. This > * merely offers a hint about what kind of smearing to perform, > * for consistency with systems in the nearby environment. > */ > #define VMCLOCK_FLAGS_SMEAR_UTC_SLS (1<<5) /* draft-kuhn-leapsecond-00.txt */ > > > (UTC-SLS is probably a bad example but are there formal definitions for > anything else?) > > I think it could also be more generic, like flags for linear smearing, cosine smearing(?), and smear_start_sec and smear_end_sec fields (relative to the leap second start). That could also represent UTC-SLS, and noon-to-noon, and it would be well-defined. This should reduce the likelihood that the guest doesn't know the smearing variant. [...] >>> diff --git a/include/uapi/linux/vmclock.h b/include/uapi/linux/vmclock.h >>> new file mode 100644 >>> index ..cf0f22205e79 >>> --- /dev/null >>> +++ b/include/uapi/linux/vmclock.h >>> @@ -0,0 +1,138 @@ >>> +/* SPDX-License-Identifier: ((GPL-2.0 WITH Linux-syscall-note) OR >>> BSD-2-Clause) */ >>> + >>> +/* >>> + * This structure provides a vDSO-style clock to VM guests, exposing the >>> + * relationship (or lack thereof) between the CPU clock (TSC, timebase, >>> arch >>> + * counter, etc.) and real time. It is designed to address the problem of >>> + * live migration, which other clock enlightenments do not. >>> + * >>> + * When a guest is live migrated, this affects the clock in two ways. >>> + * >>> + * First, even between identical hosts the actual frequency of the >>> underlying >>> + * counter will change within the tolerances of its specification >>> (typically >>> + * ±50PPM, or 4 seconds a day). The frequency also varies over time on the >>> + * same host, but can be tracked by NTP as it generally varies slowly. With >>> + * live migration there is a step change in the frequency, with no warning. >>> + * >>> + * Second, there may be a step change in the value of the counter itself, >>> as >>> + * its accuracy is limited by the precision of the NTP synchronization on >>> the >>> + * source and destination hosts. >>> + * >>> + * So any calibration (NTP, PTP, etc.) which the guest has done on the >>> source >>> + * host before migration is invalid, and needs to be redone on the new >>> host. >>> + * >>> + * In its most basic mode, this structure provides only an indication to >>> the >>> + * guest that live migration has occurred. This allows the guest to know >>> that >>> + * its clock is invalid and take remedial action. For applications that >>> need >>> + * reliable accurate timestamps (e.g. distributed databases), the structure >>> + * can be mapped all the way to userspace. This allows the application to >>> see >>> + * directly for itself that the clock is disrupted and take appropriate >>> + * action, even when using a vDSO-style method to get the time instead of a >>> + * system call. >>> + * >>> + * In its more advanced mode. this structure can also be used to expose the >>> + * precise relationship of the CPU counter to
Re: [PATCH net-next v3 2/3] vsock/virtio: add SIOCOUTQ support for all virtio based transports
On Wed, Jun 26, 2024 at 02:08:36PM GMT, Luigi Leonardi via B4 Relay wrote: From: Luigi Leonardi Introduce support for stream_bytes_unsent and seqpacket_bytes_unsent ioctl for virtio_transport, vhost_vsock and vsock_loopback. For all transports the unsent bytes counter is incremented in virtio_transport_get_credit. In the virtio_transport (G2H) the counter is decremented each time the host notifies the guest that it consumed the skbuffs. In vhost-vsock (H2G) the counter is decremented after the skbuff is queued in the virtqueue. In vsock_loopback the counter is decremented after the skbuff is dequeued. Signed-off-by: Luigi Leonardi --- drivers/vhost/vsock.c | 4 +++- include/linux/virtio_vsock.h| 7 +++ net/vmw_vsock/virtio_transport.c| 4 +++- net/vmw_vsock/virtio_transport_common.c | 35 + net/vmw_vsock/vsock_loopback.c | 7 +++ 5 files changed, 55 insertions(+), 2 deletions(-) diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c index ec20ecff85c7..dba8b3ea37bf 100644 --- a/drivers/vhost/vsock.c +++ b/drivers/vhost/vsock.c @@ -244,7 +244,7 @@ vhost_transport_do_send_pkt(struct vhost_vsock *vsock, restart_tx = true; } - consume_skb(skb); + virtio_transport_consume_skb_sent(skb, true); } } while(likely(!vhost_exceeds_weight(vq, ++pkts, total_len))); if (added) @@ -451,6 +451,8 @@ static struct virtio_transport vhost_transport = { .notify_buffer_size = virtio_transport_notify_buffer_size, .notify_set_rcvlowat = virtio_transport_notify_set_rcvlowat, + .unsent_bytes = virtio_transport_bytes_unsent, The callback is named `unsent_bytes`, I'd use something similar also in the function name, so `virtio_transport_unsent_bytes`, or the opposite renaming the callback, as you prefer, but I'd use the same for both. + .read_skb = virtio_transport_read_skb, }, diff --git a/include/linux/virtio_vsock.h b/include/linux/virtio_vsock.h index c82089dee0c8..e74c12878213 100644 --- a/include/linux/virtio_vsock.h +++ b/include/linux/virtio_vsock.h @@ -134,6 +134,8 @@ struct virtio_vsock_sock { u32 peer_fwd_cnt; u32 peer_buf_alloc; Can you remove this extra empty line, so it's clear that it is protected by tx_lock? + size_t bytes_unsent; + /* Protected by rx_lock */ u32 fwd_cnt; u32 last_fwd_cnt; @@ -193,6 +195,11 @@ s64 virtio_transport_stream_has_data(struct vsock_sock *vsk); s64 virtio_transport_stream_has_space(struct vsock_sock *vsk); u32 virtio_transport_seqpacket_has_data(struct vsock_sock *vsk); +size_t virtio_transport_bytes_unsent(struct vsock_sock *vsk); + +void virtio_transport_consume_skb_sent(struct sk_buff *skb, + bool consume); + int virtio_transport_do_socket_init(struct vsock_sock *vsk, struct vsock_sock *psk); int diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c index 43d405298857..fc62d2818c2c 100644 --- a/net/vmw_vsock/virtio_transport.c +++ b/net/vmw_vsock/virtio_transport.c @@ -311,7 +311,7 @@ static void virtio_transport_tx_work(struct work_struct *work) virtqueue_disable_cb(vq); while ((skb = virtqueue_get_buf(vq, )) != NULL) { - consume_skb(skb); + virtio_transport_consume_skb_sent(skb, true); added = true; } } while (!virtqueue_enable_cb(vq)); @@ -540,6 +540,8 @@ static struct virtio_transport virtio_transport = { .notify_buffer_size = virtio_transport_notify_buffer_size, .notify_set_rcvlowat = virtio_transport_notify_set_rcvlowat, + .unsent_bytes = virtio_transport_bytes_unsent, + .read_skb = virtio_transport_read_skb, }, diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c index 16ff976a86e3..3a7fa36f306b 100644 --- a/net/vmw_vsock/virtio_transport_common.c +++ b/net/vmw_vsock/virtio_transport_common.c @@ -463,6 +463,26 @@ void virtio_transport_inc_tx_pkt(struct virtio_vsock_sock *vvs, struct sk_buff * } EXPORT_SYMBOL_GPL(virtio_transport_inc_tx_pkt); +void virtio_transport_consume_skb_sent(struct sk_buff *skb, bool consume) +{ + struct sock *s = skb->sk; + + if (s && skb->len) { + struct vsock_sock *vs = vsock_sk(s); + struct virtio_vsock_sock *vvs; + + vvs = vs->trans; + + spin_lock_bh(>tx_lock); + vvs->bytes_unsent -= skb->len; + spin_unlock_bh(>tx_lock); + } + + if (consume) + consume_skb(skb); +}
[PATCH v2] ring-buffer: Align meta-page to sub-buffers for improved TLB usage
Previously, the mapped ring-buffer layout caused misalignment between the meta-page and sub-buffers when the sub-buffer size was not a multiple of PAGE_SIZE. This prevented hardware with larger TLB entries from utilizing them effectively. Add a padding with the zero-page between the meta-page and sub-buffers. Also update the ring-buffer map_test to verify that padding. Signed-off-by: Vincent Donnefort -- This is based on the mm-unstable branch [1] as it depends on David's work [2] for allowing the zero-page in vm_insert_page(). [1] https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git [2] https://lore.kernel.org/all/20240522125713.775114-1-da...@redhat.com v1 -> v2: * Fix unsequenced modification and access to 'p' (s390 build) diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c index 7345a8b625fb..c1116e76fe17 100644 --- a/kernel/trace/ring_buffer.c +++ b/kernel/trace/ring_buffer.c @@ -6148,10 +6148,10 @@ static void rb_setup_ids_meta_page(struct ring_buffer_per_cpu *cpu_buffer, /* install subbuf ID to kern VA translation */ cpu_buffer->subbuf_ids = subbuf_ids; - meta->meta_page_size = PAGE_SIZE; meta->meta_struct_len = sizeof(*meta); meta->nr_subbufs = nr_subbufs; meta->subbuf_size = cpu_buffer->buffer->subbuf_size + BUF_PAGE_HDR_SIZE; + meta->meta_page_size = meta->subbuf_size; rb_update_meta_page(cpu_buffer); } @@ -6238,6 +6238,12 @@ static int __rb_map_vma(struct ring_buffer_per_cpu *cpu_buffer, !(vma->vm_flags & VM_MAYSHARE)) return -EPERM; + subbuf_order = cpu_buffer->buffer->subbuf_order; + subbuf_pages = 1 << subbuf_order; + + if (subbuf_order && pgoff % subbuf_pages) + return -EINVAL; + /* * Make sure the mapping cannot become writable later. Also tell the VM * to not touch these pages (VM_DONTCOPY | VM_DONTEXPAND). @@ -6247,11 +6253,8 @@ static int __rb_map_vma(struct ring_buffer_per_cpu *cpu_buffer, lockdep_assert_held(_buffer->mapping_lock); - subbuf_order = cpu_buffer->buffer->subbuf_order; - subbuf_pages = 1 << subbuf_order; - nr_subbufs = cpu_buffer->nr_pages + 1; /* + reader-subbuf */ - nr_pages = ((nr_subbufs) << subbuf_order) - pgoff + 1; /* + meta-page */ + nr_pages = ((nr_subbufs + 1) << subbuf_order) - pgoff; /* + meta-page */ vma_pages = (vma->vm_end - vma->vm_start) >> PAGE_SHIFT; if (!vma_pages || vma_pages > nr_pages) @@ -6264,20 +6267,24 @@ static int __rb_map_vma(struct ring_buffer_per_cpu *cpu_buffer, return -ENOMEM; if (!pgoff) { + unsigned long meta_page_padding; + pages[p++] = virt_to_page(cpu_buffer->meta_page); /* -* TODO: Align sub-buffers on their size, once -* vm_insert_pages() supports the zero-page. +* Pad with the zero-page to align the meta-page with the +* sub-buffers. */ - } else { - /* Skip the meta-page */ - pgoff--; + meta_page_padding = subbuf_pages - 1; + while (meta_page_padding-- && p < nr_pages) { + unsigned long __maybe_unused zero_addr = + vma->vm_start + (PAGE_SIZE * p); - if (pgoff % subbuf_pages) { - err = -EINVAL; - goto out; + pages[p++] = ZERO_PAGE(zero_addr); } + } else { + /* Skip the meta-page */ + pgoff -= subbuf_pages; s += pgoff / subbuf_pages; } diff --git a/tools/testing/selftests/ring-buffer/map_test.c b/tools/testing/selftests/ring-buffer/map_test.c index a9006fa7097e..4bb0192e43f3 100644 --- a/tools/testing/selftests/ring-buffer/map_test.c +++ b/tools/testing/selftests/ring-buffer/map_test.c @@ -228,6 +228,20 @@ TEST_F(map, data_mmap) data = mmap(NULL, data_len, PROT_READ, MAP_SHARED, desc->cpu_fd, meta_len); ASSERT_EQ(data, MAP_FAILED); + + /* Verify meta-page padding */ + if (desc->meta->meta_page_size > getpagesize()) { + void *addr; + + data_len = desc->meta->meta_page_size; + data = mmap(NULL, data_len, + PROT_READ, MAP_SHARED, desc->cpu_fd, 0); + ASSERT_NE(data, MAP_FAILED); + + addr = (void *)((unsigned long)data + getpagesize()); + ASSERT_EQ(*((int *)addr), 0); + munmap(data, data_len); + } } FIXTURE(snapshot) { base-commit: c65920c76a977c2b73c3a8b03b4c0c00cc1285ed -- 2.45.2.803.g4e1b14247a-goog
[PATCH v3] module: Add log info for verifying module signature
Add log information in kernel-space when loading module failures. Try to load the unsigned module and the module with bad signature when set 1 to /sys/module/module/parameters/sig_enforce. Unsigned module case: (linux) insmod unsigned.ko [ 18.714661] Loading of unsigned module is rejected insmod: can't insert 'unsigned.ko': Key was rejected by service (linux) Bad signature module case: (linux) insmod bad_signature.ko insmod: can't insert 'bad_signature.ko': Key was rejected by service (linux) There have different logging behavior the bad signature case only log in user-space, add log info for fatal errors in module_sig_check(). Signed-off-by: Yusong Gao --- V3: Clarify the message type and the error code meaning. V2: Change print level from notice to debug. --- kernel/module/signing.c | 27 +++ 1 file changed, 27 insertions(+) diff --git a/kernel/module/signing.c b/kernel/module/signing.c index a2ff4242e623..826cdab8e3e4 100644 --- a/kernel/module/signing.c +++ b/kernel/module/signing.c @@ -67,6 +67,31 @@ int mod_verify_sig(const void *mod, struct load_info *info) NULL, NULL); } +static const char *mod_decode_error(int errno) +{ + char *errstr = "Unrecognized error"; + + switch (errno) { + case -ENOMEM: + errstr = "Out of memory"; + break; + case -EINVAL: + errstr = "Invalid argument"; + break; + case -EBADMSG: + errstr = "Invaild module signature format"; + break; + case -EMSGSIZE: + errstr = "Message too long"; + break; + case -EKEYREJECTED: + errstr = "Key was rejected by service"; + break; + } + + return errstr; +} + int module_sig_check(struct load_info *info, int flags) { int err = -ENODATA; @@ -113,6 +138,8 @@ int module_sig_check(struct load_info *info, int flags) * unparseable signatures, and signature check failures -- * even if signatures aren't required. */ + pr_debug("Verifying module signature failed: %s\n", +mod_decode_error(err)); return err; } -- 2.34.1
Re: [PATCH] remoteproc: mediatek: Don't attempt to remap l1tcm memory if missing
Il 27/06/24 23:20, Nícolas F. R. A. Prado ha scritto: The current code doesn't check whether platform_get_resource_byname() succeeded to get the l1tcm memory, which is optional, before attempting to map it. This results in the following error message when it is missing: mtk-scp 1050.scp: error -EINVAL: invalid resource (null) Add a check so that the remapping is only attempted if the memory region exists. This also allows to simplify the logic handling failure to remap, since a failure then is always a failure. Fixes: ca23ecfdbd44 ("remoteproc/mediatek: support L1TCM") Signed-off-by: Nícolas F. R. A. Prado Reviewed-by: AngeloGioacchino Del Regno
Re: [PATCH V3 2/2] soc: qcom: smp2p: Introduce tracepoint support
On 6/27/2024 4:18 PM, Sudeepgoud Patil wrote: This commit introduces tracepoint support for smp2p, enabling logging of communication between local and remote processors. These tracepoints include information about the remote subsystem name, negotiation details, supported features, bit change notifications, and ssr activity. These logs are useful for debugging issues between subsystems. Signed-off-by: Sudeepgoud Patil Reviewed-by: Deepak Kumar Singh --- drivers/soc/qcom/Makefile | 1 + drivers/soc/qcom/smp2p.c | 9 drivers/soc/qcom/trace-smp2p.h | 98 ++ 3 files changed, 108 insertions(+) create mode 100644 drivers/soc/qcom/trace-smp2p.h diff --git a/drivers/soc/qcom/Makefile b/drivers/soc/qcom/Makefile index ca0bece0dfff..30c1bf645501 100644 --- a/drivers/soc/qcom/Makefile +++ b/drivers/soc/qcom/Makefile @@ -23,6 +23,7 @@ qcom_rpmh-y += rpmh.o obj-$(CONFIG_QCOM_SMD_RPM)+= rpm-proc.o smd-rpm.o obj-$(CONFIG_QCOM_SMEM) +=smem.o obj-$(CONFIG_QCOM_SMEM_STATE) += smem_state.o +CFLAGS_smp2p.o := -I$(src) obj-$(CONFIG_QCOM_SMP2P) += smp2p.o obj-$(CONFIG_QCOM_SMSM) += smsm.o obj-$(CONFIG_QCOM_SOCINFO)+= socinfo.o diff --git a/drivers/soc/qcom/smp2p.c b/drivers/soc/qcom/smp2p.c index 696c2a8387d0..4aa61b0f11ad 100644 --- a/drivers/soc/qcom/smp2p.c +++ b/drivers/soc/qcom/smp2p.c @@ -161,6 +161,9 @@ struct qcom_smp2p { struct list_head outbound; }; +#define CREATE_TRACE_POINTS +#include "trace-smp2p.h" + static void qcom_smp2p_kick(struct qcom_smp2p *smp2p) { /* Make sure any updated data is written before the kick */ @@ -192,6 +195,7 @@ static void qcom_smp2p_do_ssr_ack(struct qcom_smp2p *smp2p) struct smp2p_smem_item *out = smp2p->out; u32 val; + trace_smp2p_ssr_ack(smp2p->dev); smp2p->ssr_ack = !smp2p->ssr_ack; val = out->flags & ~BIT(SMP2P_FLAGS_RESTART_ACK_BIT); @@ -214,6 +218,7 @@ static void qcom_smp2p_negotiate(struct qcom_smp2p *smp2p) smp2p->ssr_ack_enabled = true; smp2p->negotiation_done = true; + trace_smp2p_negotiate(smp2p->dev, out->features); } } @@ -252,6 +257,8 @@ static void qcom_smp2p_notify_in(struct qcom_smp2p *smp2p) status = val ^ entry->last_value; entry->last_value = val; + trace_smp2p_notify_in(entry, status, val); + /* No changes of this entry? */ if (!status) continue; @@ -415,6 +422,8 @@ static int smp2p_update_bits(void *data, u32 mask, u32 value) writel(val, entry->value); spin_unlock_irqrestore(>lock, flags); + trace_smp2p_update_bits(entry, orig, val); + if (val != orig) qcom_smp2p_kick(entry->smp2p); diff --git a/drivers/soc/qcom/trace-smp2p.h b/drivers/soc/qcom/trace-smp2p.h new file mode 100644 index ..fa985a0d7615 --- /dev/null +++ b/drivers/soc/qcom/trace-smp2p.h @@ -0,0 +1,98 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright (c) 2024 Qualcomm Innovation Center, Inc. All rights reserved. + */ + +#undef TRACE_SYSTEM +#define TRACE_SYSTEM qcom_smp2p + +#if !defined(__QCOM_SMP2P_TRACE_H__) || defined(TRACE_HEADER_MULTI_READ) +#define __QCOM_SMP2P_TRACE_H__ + +#include +#include + +TRACE_EVENT(smp2p_ssr_ack, + TP_PROTO(const struct device *dev), + TP_ARGS(dev), + TP_STRUCT__entry( + __string(dev_name, dev_name(dev)) + ), + TP_fast_assign( + __assign_str(dev_name, dev_name(dev)); + ), + TP_printk("%s: SSR detected", __get_str(dev_name)) +); + +TRACE_EVENT(smp2p_negotiate, + TP_PROTO(const struct device *dev, unsigned int features), + TP_ARGS(dev, features), + TP_STRUCT__entry( + __string(dev_name, dev_name(dev)) + __field(u32, out_features) + ), + TP_fast_assign( + __assign_str(dev_name, dev_name(dev)); + __entry->out_features = features; + ), + TP_printk("%s: state=open out_features=%s", __get_str(dev_name), + __print_flags(__entry->out_features, "|", + {SMP2P_FEATURE_SSR_ACK, "SMP2P_FEATURE_SSR_ACK"}) + ) +); + +TRACE_EVENT(smp2p_notify_in, + TP_PROTO(struct smp2p_entry *smp2p_entry, unsigned long status, u32 val), + TP_ARGS(smp2p_entry, status, val), + TP_STRUCT__entry( + __string(dev_name, dev_name(smp2p_entry->smp2p->dev)) + __string(client_name, smp2p_entry->name) + __field(unsigned long, status) + __field(u32, val) + ), + TP_fast_assign( + __assign_str(dev_name, dev_name(smp2p_entry->smp2p->dev)); + __assign_str(client_name, smp2p_entry->name); + __entry->status = status; + __entry->val = val; + ), + TP_printk("%s: %s:
Re: [PATCH net-next v3 1/3] vsock: add support for SIOCOUTQ ioctl for all vsock socket types.
nit: in theory in this patch we don't support it for any of the transports, so I wouldn't confuse and take that part out of the title. WDYT with someting like: vsock: add support for SIOCOUTQ ioctl On Wed, Jun 26, 2024 at 02:08:35PM GMT, Luigi Leonardi via B4 Relay wrote: From: Luigi Leonardi Add support for ioctl(s) for SOCK_STREAM SOCK_SEQPACKET and SOCK_DGRAM in AF_VSOCK. The only ioctl available is SIOCOUTQ/TIOCOUTQ, which returns the number of unsent bytes in the socket. This information is transport-specific and is delegated to them using a callback. Suggested-by: Daan De Meyer Signed-off-by: Luigi Leonardi --- include/net/af_vsock.h | 3 +++ net/vmw_vsock/af_vsock.c | 60 +--- 2 files changed, 60 insertions(+), 3 deletions(-) diff --git a/include/net/af_vsock.h b/include/net/af_vsock.h index 535701efc1e5..7b5375ae7827 100644 --- a/include/net/af_vsock.h +++ b/include/net/af_vsock.h @@ -169,6 +169,9 @@ struct vsock_transport { void (*notify_buffer_size)(struct vsock_sock *, u64 *); int (*notify_set_rcvlowat)(struct vsock_sock *vsk, int val); + /* SIOCOUTQ ioctl */ + size_t (*unsent_bytes)(struct vsock_sock *vsk); If you want to return also errors, maybe better returning ssize_t. This should fix one of the error reported by kernel bots. + /* Shutdown. */ int (*shutdown)(struct vsock_sock *, int); diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c index 4b040285aa78..d6140d73d122 100644 --- a/net/vmw_vsock/af_vsock.c +++ b/net/vmw_vsock/af_vsock.c @@ -112,6 +112,7 @@ #include #include #include +#include static int __vsock_bind(struct sock *sk, struct sockaddr_vm *addr); static void vsock_sk_destruct(struct sock *sk); @@ -1292,6 +1293,59 @@ int vsock_dgram_recvmsg(struct socket *sock, struct msghdr *msg, } EXPORT_SYMBOL_GPL(vsock_dgram_recvmsg); +static int vsock_do_ioctl(struct socket *sock, unsigned int cmd, + int __user *arg) +{ + struct sock *sk = sock->sk; + struct vsock_sock *vsk; + int retval; + + vsk = vsock_sk(sk); + + switch (cmd) { + case SIOCOUTQ: { + size_t n_bytes; + + if (!vsk->transport || !vsk->transport->unsent_bytes) { + retval = -EOPNOTSUPP; + break; + } + + if (vsk->transport->unsent_bytes) { This if is not necessary after the check we did earlier, right? Removing it should fix the other issue reported by the bot. + if (sock_type_connectible(sk->sk_type) && sk->sk_state == TCP_LISTEN) { + retval = -EINVAL; + break; + } + + n_bytes = vsk->transport->unsent_bytes(vsk); + if (n_bytes < 0) { + retval = n_bytes; + break; + } + + retval = put_user(n_bytes, arg); + } + break; + } + default: + retval = -ENOIOCTLCMD; + } + + return retval; +} + +static int vsock_ioctl(struct socket *sock, unsigned int cmd, + unsigned long arg) +{ + int ret; + + lock_sock(sock->sk); + ret = vsock_do_ioctl(sock, cmd, (int __user *)arg); + release_sock(sock->sk); + + return ret; +} + static const struct proto_ops vsock_dgram_ops = { .family = PF_VSOCK, .owner = THIS_MODULE, @@ -1302,7 +1356,7 @@ static const struct proto_ops vsock_dgram_ops = { .accept = sock_no_accept, .getname = vsock_getname, .poll = vsock_poll, - .ioctl = sock_no_ioctl, + .ioctl = vsock_ioctl, .listen = sock_no_listen, .shutdown = vsock_shutdown, .sendmsg = vsock_dgram_sendmsg, @@ -2286,7 +2340,7 @@ static const struct proto_ops vsock_stream_ops = { .accept = vsock_accept, .getname = vsock_getname, .poll = vsock_poll, - .ioctl = sock_no_ioctl, + .ioctl = vsock_ioctl, .listen = vsock_listen, .shutdown = vsock_shutdown, .setsockopt = vsock_connectible_setsockopt, @@ -2308,7 +2362,7 @@ static const struct proto_ops vsock_seqpacket_ops = { .accept = vsock_accept, .getname = vsock_getname, .poll = vsock_poll, - .ioctl = sock_no_ioctl, + .ioctl = vsock_ioctl, .listen = vsock_listen, .shutdown = vsock_shutdown, .setsockopt = vsock_connectible_setsockopt, -- 2.45.2
Re: [PATCH] arm64: dts: qcom: sm7225-fairphone-fp4: Name the regulators
On Thu, Jun 27, 2024 at 03:15:54PM GMT, Luca Weiss wrote: > Without explicitly specifying names for the regulators they are named > based on the DeviceTree node name. This results in multiple regulators > with the same name, making debug prints and regulator_summary impossible > to reason about. > > Signed-off-by: Luca Weiss > --- > arch/arm64/boot/dts/qcom/sm7225-fairphone-fp4.dts | 34 > +++ > 1 file changed, 34 insertions(+) > Reviewed-by: Dmitry Baryshkov -- With best wishes Dmitry
Re: [PATCH 06/12] uprobes: add batch uprobe register/unregister APIs
On Thu, 27 Jun 2024 09:47:10 -0700 Andrii Nakryiko wrote: > On Thu, Jun 27, 2024 at 6:04 AM Masami Hiramatsu wrote: > > > > On Mon, 24 Jun 2024 17:21:38 -0700 > > Andrii Nakryiko wrote: > > > > > -static int __uprobe_register(struct inode *inode, loff_t offset, > > > - loff_t ref_ctr_offset, struct uprobe_consumer > > > *uc) > > > +int uprobe_register_batch(struct inode *inode, int cnt, > > > + uprobe_consumer_fn get_uprobe_consumer, void *ctx) > > > > Is this interface just for avoiding memory allocation? Can't we just > > allocate a temporary array of *uprobe_consumer instead? > > Yes, exactly, to avoid the need for allocating another array that > would just contain pointers to uprobe_consumer. Consumers would never > just have an array of `struct uprobe_consumer *`, because > uprobe_consumer struct is embedded in some other struct, so the array > interface isn't the most convenient. OK, I understand it. > > If you feel strongly, I can do an array, but this necessitates > allocating an extra array *and keeping it* for the entire duration of > BPF multi-uprobe link (attachment) existence, so it feels like a > waste. This is because we don't want to do anything that can fail in > the detachment logic (so no temporary array allocation there). No need to change it, that sounds reasonable. > > Anyways, let me know how you feel about keeping this callback. IMHO, maybe the interface function is better to change to `uprobe_consumer *next_uprobe_consumer(void **data)`. If caller side uses a linked list of structure, index access will need to follow the list every time. Thank you, > > > > > Thank you, > > > > -- > > Masami Hiramatsu (Google) -- Masami Hiramatsu (Google)
Re: [PATCH net-next v3 1/3] vsock: add support for SIOCOUTQ ioctl for all vsock socket types.
Hi Luigi, kernel test robot noticed the following build warnings: [auto build test WARNING on 50b70845fc5c22cf7e7d25b57d57b3dca1725aa5] url: https://github.com/intel-lab-lkp/linux/commits/Luigi-Leonardi-via-B4-Relay/vsock-add-support-for-SIOCOUTQ-ioctl-for-all-vsock-socket-types/20240627-023902 base: 50b70845fc5c22cf7e7d25b57d57b3dca1725aa5 patch link: https://lore.kernel.org/r/20240626-ioctl_next-v3-1-63be5bf19a40%40outlook.com patch subject: [PATCH net-next v3 1/3] vsock: add support for SIOCOUTQ ioctl for all vsock socket types. config: i386-randconfig-141-20240628 (https://download.01.org/0day-ci/archive/20240628/202406281355.d1jnvgbc-...@intel.com/config) compiler: gcc-8 (Ubuntu 8.4.0-3ubuntu2) 8.4.0 If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot | Closes: https://lore.kernel.org/oe-kbuild-all/202406281355.d1jnvgbc-...@intel.com/ smatch warnings: net/vmw_vsock/af_vsock.c:1321 vsock_do_ioctl() warn: unsigned 'n_bytes' is never less than zero. vim +/n_bytes +1321 net/vmw_vsock/af_vsock.c 1295 1296 static int vsock_do_ioctl(struct socket *sock, unsigned int cmd, 1297int __user *arg) 1298 { 1299 struct sock *sk = sock->sk; 1300 struct vsock_sock *vsk; 1301 int retval; 1302 1303 vsk = vsock_sk(sk); 1304 1305 switch (cmd) { 1306 case SIOCOUTQ: { 1307 size_t n_bytes; 1308 1309 if (!vsk->transport || !vsk->transport->unsent_bytes) { 1310 retval = -EOPNOTSUPP; 1311 break; 1312 } 1313 1314 if (vsk->transport->unsent_bytes) { 1315 if (sock_type_connectible(sk->sk_type) && sk->sk_state == TCP_LISTEN) { 1316 retval = -EINVAL; 1317 break; 1318 } 1319 1320 n_bytes = vsk->transport->unsent_bytes(vsk); > 1321 if (n_bytes < 0) { 1322 retval = n_bytes; 1323 break; 1324 } 1325 1326 retval = put_user(n_bytes, arg); 1327 } 1328 break; 1329 } 1330 default: 1331 retval = -ENOIOCTLCMD; 1332 } 1333 1334 return retval; 1335 } 1336 -- 0-DAY CI Kernel Test Service https://github.com/intel/lkp-tests/wiki
Re: [PATCH] LoongArch: uprobes: make UPROBE_SWBP_INSN/UPROBE_XOLBP_INSN constant
On Thu, 27 Jun 2024 19:38:06 +0200 Oleg Nesterov wrote: > On 06/27, Andrii Nakryiko wrote: > > > > Acked-by: Andrii Nakryiko > > Thanks! > > > > --- a/arch/loongarch/kernel/uprobes.c > > > +++ b/arch/loongarch/kernel/uprobes.c > > > @@ -7,6 +7,14 @@ > > > > > > #define UPROBE_TRAP_NR UINT_MAX > > > > > > +static __init int check_emit_break(void) > > > +{ > > > + BUG_ON(UPROBE_SWBP_INSN != larch_insn_gen_break(BRK_UPROBE_BP)); > > > + BUG_ON(UPROBE_XOLBP_INSN != > > > larch_insn_gen_break(BRK_UPROBE_XOLBP)); > > > + return 0; > > > +} > > > +arch_initcall(check_emit_break); > > > + > > > > I wouldn't even bother with this, but whatever. > > Agreed, this looks a bit ugly. I did this only because I can not test > this (hopefully trivial) patch and the maintainers didn't reply. > > If LoongArch boots at least once with this change, this run-time check > can be removed. > > And just in case... I didn't dare to make a more "generic" change, but > perhaps KPROBE_BP_INSN and KPROBE_SSTEPBP_INSN should be redefined the > same way for micro-optimization. In this case __emit_break() should be > probably moved into arch/loongarch/include/asm/inst.h. That idea sounds good to me too. If it is good to loongarch maintainers, (e.g. breakpoint instruction is stable), it is better to define in asm/insn.h. Thank you, > > Oleg. > > -- Masami Hiramatsu (Google)
Re: [PATCH] remoteproc: mediatek: Don't attempt to remap l1tcm memory if missing
On Thu, Jun 27, 2024 at 05:20:55PM -0400, Nícolas F. R. A. Prado wrote: > The current code doesn't check whether platform_get_resource_byname() > succeeded to get the l1tcm memory, which is optional, before attempting > to map it. This results in the following error message when it is > missing: > > mtk-scp 1050.scp: error -EINVAL: invalid resource (null) > > Add a check so that the remapping is only attempted if the memory region > exists. This also allows to simplify the logic handling failure to > remap, since a failure then is always a failure. > > Fixes: ca23ecfdbd44 ("remoteproc/mediatek: support L1TCM") > Signed-off-by: Nícolas F. R. A. Prado Reviewed-by: Tzung-Bi Shih
Re: [PATCH v9 8/8] arm64: dts: qcom: Enable Q6v5 WCSS for ipq8074 SoC
Hi Gokul, kernel test robot noticed the following build warnings: [auto build test WARNING on remoteproc/rproc-next] [also build test WARNING on clk/clk-next robh/for-next linus/master v6.10-rc5 next-20240627] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information] url: https://github.com/intel-lab-lkp/linux/commits/Gokul-Sriram-Palanisamy/remoteproc-qcom-Add-PRNG-proxy-clock/20240625-162317 base: git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux.git rproc-next patch link: https://lore.kernel.org/r/20240621114659.2958170-9-quic_gokulsri%40quicinc.com patch subject: [PATCH v9 8/8] arm64: dts: qcom: Enable Q6v5 WCSS for ipq8074 SoC config: arm64-randconfig-051-20240627 (https://download.01.org/0day-ci/archive/20240628/202406281044.3viathjc-...@intel.com/config) compiler: clang version 19.0.0git (https://github.com/llvm/llvm-project ad79a14c9e5ec4a369eed4adf567c22cc029863f) dtschema version: 2024.6.dev2+g3b69bad reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240628/202406281044.3viathjc-...@intel.com/reproduce) If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot | Closes: https://lore.kernel.org/oe-kbuild-all/202406281044.3viathjc-...@intel.com/ dtcheck warnings: (new ones prefixed by >>) arch/arm64/boot/dts/qcom/ipq8074-hk01.dtb: phy@59000: 'vdda-pll-supply' is a required property from schema $id: http://devicetree.org/schemas/phy/qcom,qusb2-phy.yaml# arch/arm64/boot/dts/qcom/ipq8074-hk01.dtb: phy@59000: 'vdda-phy-dpdm-supply' is a required property from schema $id: http://devicetree.org/schemas/phy/qcom,qusb2-phy.yaml# arch/arm64/boot/dts/qcom/ipq8074-hk01.dtb: phy@79000: 'vdd-supply' is a required property from schema $id: http://devicetree.org/schemas/phy/qcom,qusb2-phy.yaml# arch/arm64/boot/dts/qcom/ipq8074-hk01.dtb: phy@79000: 'vdda-pll-supply' is a required property from schema $id: http://devicetree.org/schemas/phy/qcom,qusb2-phy.yaml# arch/arm64/boot/dts/qcom/ipq8074-hk01.dtb: phy@79000: 'vdda-phy-dpdm-supply' is a required property from schema $id: http://devicetree.org/schemas/phy/qcom,qusb2-phy.yaml# >> arch/arm64/boot/dts/qcom/ipq8074-hk01.dtb: /soc@0/remoteproc@cd0: failed >> to match any schema with compatible: ['qcom,ipq8074-wcss-pil'] -- >> arch/arm64/boot/dts/qcom/ipq8074-hk10-c1.dtb: /soc@0/remoteproc@cd0: >> failed to match any schema with compatible: ['qcom,ipq8074-wcss-pil'] -- >> arch/arm64/boot/dts/qcom/ipq8074-hk10-c2.dtb: /soc@0/remoteproc@cd0: >> failed to match any schema with compatible: ['qcom,ipq8074-wcss-pil'] -- 0-DAY CI Kernel Test Service https://github.com/intel/lkp-tests/wiki
Re: [PATCH][next] firewire: core: Fix spelling mistakes in tracepoint messages
On Thu, Jun 27, 2024 at 06:08:47PM +0100, Colin Ian King wrote: > There are two spelling mistakes in the tracepoint message text. Fix them. > > Signed-off-by: Colin Ian King > --- > include/trace/events/firewire.h | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) Applied to for-next branch. I'll use spellcheck program when posting this kind of changes. Thanks Takashi Sakamoto
[PATCH 13/14] tracefs: Convert to new uid/gid option parsing helpers
Convert to new uid/gid option parsing helpers Signed-off-by: Eric Sandeen --- fs/tracefs/inode.c | 16 1 file changed, 4 insertions(+), 12 deletions(-) diff --git a/fs/tracefs/inode.c b/fs/tracefs/inode.c index 7c29f4afc23d..1028ab6d9a74 100644 --- a/fs/tracefs/inode.c +++ b/fs/tracefs/inode.c @@ -296,9 +296,9 @@ enum { }; static const struct fs_parameter_spec tracefs_param_specs[] = { - fsparam_u32 ("gid", Opt_gid), + fsparam_gid ("gid", Opt_gid), fsparam_u32oct ("mode",Opt_mode), - fsparam_u32 ("uid", Opt_uid), + fsparam_uid ("uid", Opt_uid), {} }; @@ -306,8 +306,6 @@ static int tracefs_parse_param(struct fs_context *fc, struct fs_parameter *param { struct tracefs_fs_info *opts = fc->s_fs_info; struct fs_parse_result result; - kuid_t uid; - kgid_t gid; int opt; opt = fs_parse(fc, tracefs_param_specs, param, ); @@ -316,16 +314,10 @@ static int tracefs_parse_param(struct fs_context *fc, struct fs_parameter *param switch (opt) { case Opt_uid: - uid = make_kuid(current_user_ns(), result.uint_32); - if (!uid_valid(uid)) - return invalf(fc, "Unknown uid"); - opts->uid = uid; + opts->uid = result.uid; break; case Opt_gid: - gid = make_kgid(current_user_ns(), result.uint_32); - if (!gid_valid(gid)) - return invalf(fc, "Unknown gid"); - opts->gid = gid; + opts->gid = result.gid; break; case Opt_mode: opts->mode = result.uint_32 & S_IALLUGO; -- 2.45.2
Re: [PATCH v4] tracing/net_sched: NULL pointer dereference in perf_trace_qdisc_reset()
Hi Paolo, On 6/27/24 6:14 오후, Paolo Abeni wrote: > On Tue, 2024-06-25 at 02:33 +0900, ysk...@gmail.com wrote: >> From: Yunseong Kim >> >> In the TRACE_EVENT(qdisc_reset) NULL dereference occurred from >> >> qdisc->dev_queue->dev ->name >> >> This situation simulated from bunch of veths and Bluetooth disconnection >> and reconnection. >> >> During qdisc initialization, qdisc was being set to noop_queue. >> In veth_init_queue, the initial tx_num was reduced back to one, >> causing the qdisc reset to be called with noop, which led to the kernel >> panic. >> >> I've attached the GitHub gist link that C converted syz-execprogram >> source code and 3 log of reproduced vmcore-dmesg. >> >> https://gist.github.com/yskelg/cc64562873ce249cdd0d5a358b77d740 >> >> Yeoreum and I use two fuzzing tool simultaneously. >> >> One process with syz-executor : https://github.com/google/syzkaller >> >> $ ./syz-execprog -executor=./syz-executor -repeat=1 -sandbox=setuid \ >> -enable=none -collide=false log1 >> >> The other process with perf fuzzer: >> https://github.com/deater/perf_event_tests/tree/master/fuzzer >> >> $ perf_event_tests/fuzzer/perf_fuzzer >> >> I think this will happen on the kernel version. >> >> Linux kernel version +v6.7.10, +v6.8, +v6.9 and it could happen in v6.10. >> >> This occurred from 51270d573a8d. I think this patch is absolutely >> necessary. Previously, It was showing not intended string value of name. >> >> I've reproduced 3 time from my fedora 40 Debug Kernel with any other module >> or patched. >> >> version: 6.10.0-0.rc2.20240608gitdc772f8237f9.29.fc41.aarch64+debug >> >> [ 5301.595872] KASAN: null-ptr-deref in range >> [0x0130-0x0137] >> [ 5301.595877] Mem abort info: >> [ 5301.595881] ESR = 0x9606 >> [ 5301.595885] EC = 0x25: DABT (current EL), IL = 32 bits >> [ 5301.595889] SET = 0, FnV = 0 >> [ 5301.595893] EA = 0, S1PTW = 0 >> [ 5301.595896] FSC = 0x06: level 2 translation fault >> [ 5301.595900] Data abort info: >> [ 5301.595903] ISV = 0, ISS = 0x0006, ISS2 = 0x >> [ 5301.595907] CM = 0, WnR = 0, TnD = 0, TagAccess = 0 >> [ 5301.595911] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 >> [ 5301.595915] [dfff8026] address between user and kernel address >> ranges >> [ 5301.595971] Internal error: Oops: 9606 [#1] SMP >> … >> [ 5301.596076] CPU: 2 PID: 102769 Comm: >> syz-executor.3 Kdump: loaded Tainted: >> GW --- --- >> 6.10.0-0.rc2.20240608gitdc772f8237f9.29.fc41.aarch64+debug #1 >> [ 5301.596080] Hardware name: VMware, Inc. VMware20,1/VBSA, >> BIOS VMW201.00V.21805430.BA64.2305221830 05/22/2023 >> [ 5301.596082] pstate: 0145 (nzcv daif +PAN -UAO -TCO +DIT -SSBS >> BTYPE=--) >> [ 5301.596085] pc : strnlen+0x40/0x88 >> [ 5301.596114] lr : trace_event_get_offsets_qdisc_reset+0x6c/0x2b0 >> [ 5301.596124] sp : 8000beef6b40 >> [ 5301.596126] x29: 8000beef6b40 x28: dfff8000 x27: >> 0001 >> [ 5301.596131] x26: 6de1800082c62bd0 x25: 1000110aa9e0 x24: >> 800088554f00 >> [ 5301.596136] x23: 800088554ec0 x22: 0130 x21: >> 0140 >> [ 5301.596140] x20: dfff8000 x19: 8000beef6c60 x18: >> 7000115106d8 >> [ 5301.596143] x17: 800121bad000 x16: 80008002 x15: >> 0006 >> [ 5301.596147] x14: 0002 x13: 0001f3ed8d14 x12: >> 700017ddeda5 >> [ 5301.596151] x11: 100017ddeda4 x10: 700017ddeda4 x9 : >> 800082cc5eec >> [ 5301.596155] x8 : 0004 x7 : f1f1f1f1 x6 : >> f2f2f200 >> [ 5301.596158] x5 : f3f3f3f3 x4 : 700017dded80 x3 : >> f204f1f1 >> [ 5301.596162] x2 : 0026 x1 : x0 : >> 0130 >> [ 5301.596166] Call trace: >> [ 5301.596175] strnlen+0x40/0x88 >> [ 5301.596179] trace_event_get_offsets_qdisc_reset+0x6c/0x2b0 >> [ 5301.596182] perf_trace_qdisc_reset+0xb0/0x538 >> [ 5301.596184] __traceiter_qdisc_reset+0x68/0xc0 >> [ 5301.596188] qdisc_reset+0x43c/0x5e8 >> [ 5301.596190] netif_set_real_num_tx_queues+0x288/0x770 >> [ 5301.596194] veth_init_queues+0xfc/0x130 [veth] >> [ 5301.596198] veth_newlink+0x45c/0x850 [veth] >> [ 5301.596202] rtnl_newlink_create+0x2c8/0x798 >> [ 5301.596205] __rtnl_newlink+0x92c/0xb60 >> [ 5301.596208] rtnl_newlink+0xd8/0x130 >> [ 5301.596211] rtnetlink_rcv_msg+0x2e0/0x890 >> [ 5301.596214] netlink_rcv_skb+0x1c4/0x380 >> [ 5301.596225] rtnetlink_rcv+0x20/0x38 >> [ 5301.596227] netlink_unicast+0x3c8/0x640 >> [ 5301.596231] netlink_sendmsg+0x658/0xa60 >> [ 5301.596234] __sock_sendmsg+0xd0/0x180 >> [ 5301.596243] __sys_sendto+0x1c0/0x280 >> [ 5301.596246] __arm64_sys_sendto+0xc8/0x150 >> [ 5301.596249] invoke_syscall+0xdc/0x268 >> [ 5301.596256] el0_svc_common.constprop.0+0x16c/0x240 >> [ 5301.596259] do_el0_svc+0x48/0x68 >> [ 5301.596261] el0_svc+0x50/0x188 >> [ 5301.596265] el0t_64_sync_handler+0x120/0x130 >> [
[PATCH] remoteproc: mediatek: Don't attempt to remap l1tcm memory if missing
The current code doesn't check whether platform_get_resource_byname() succeeded to get the l1tcm memory, which is optional, before attempting to map it. This results in the following error message when it is missing: mtk-scp 1050.scp: error -EINVAL: invalid resource (null) Add a check so that the remapping is only attempted if the memory region exists. This also allows to simplify the logic handling failure to remap, since a failure then is always a failure. Fixes: ca23ecfdbd44 ("remoteproc/mediatek: support L1TCM") Signed-off-by: Nícolas F. R. A. Prado --- drivers/remoteproc/mtk_scp.c | 12 +--- 1 file changed, 5 insertions(+), 7 deletions(-) diff --git a/drivers/remoteproc/mtk_scp.c b/drivers/remoteproc/mtk_scp.c index b885a9a041e4..b17757900cd7 100644 --- a/drivers/remoteproc/mtk_scp.c +++ b/drivers/remoteproc/mtk_scp.c @@ -1344,14 +1344,12 @@ static int scp_probe(struct platform_device *pdev) /* l1tcm is an optional memory region */ res = platform_get_resource_byname(pdev, IORESOURCE_MEM, "l1tcm"); - scp_cluster->l1tcm_base = devm_ioremap_resource(dev, res); - if (IS_ERR(scp_cluster->l1tcm_base)) { - ret = PTR_ERR(scp_cluster->l1tcm_base); - if (ret != -EINVAL) - return dev_err_probe(dev, ret, "Failed to map l1tcm memory\n"); + if (res) { + scp_cluster->l1tcm_base = devm_ioremap_resource(dev, res); + if (IS_ERR(scp_cluster->l1tcm_base)) + return dev_err_probe(dev, PTR_ERR(scp_cluster->l1tcm_base), +"Failed to map l1tcm memory\n"); - scp_cluster->l1tcm_base = NULL; - } else { scp_cluster->l1tcm_size = resource_size(res); scp_cluster->l1tcm_phys = res->start; } --- base-commit: 0fc4bfab2cd45f9acb86c4f04b5191e114e901ed change-id: 20240627-scp-invalid-resource-l1tcm-9f7cf45c17e6 Best regards, -- Nícolas F. R. A. Prado
[PATCH v2 1/2] dt-bindings: arm: qcom: Document samsung,ms013g
Document samsung,ms013g for Galaxy Grand 2. Signed-off-by: Raymond Hackley --- Documentation/devicetree/bindings/arm/qcom.yaml | 1 + 1 file changed, 1 insertion(+) diff --git a/Documentation/devicetree/bindings/arm/qcom.yaml b/Documentation/devicetree/bindings/arm/qcom.yaml index ec1c10a12470..7990e7e27542 100644 --- a/Documentation/devicetree/bindings/arm/qcom.yaml +++ b/Documentation/devicetree/bindings/arm/qcom.yaml @@ -141,6 +141,7 @@ properties: - microsoft,makepeace - microsoft,moneypenny - motorola,falcon + - samsung,ms013g - samsung,s3ve3g - const: qcom,msm8226 -- 2.39.2
[PATCH v2 2/2] ARM: dts: qcom-msm8226-samsung-ms013g: Add initial device tree
Samsung Galaxy Grand 2 is a phone based on MSM8226. It's similar to the other Samsung devices based on MSM8226 with only a few minor differences. The device trees contain initial support with: - GPIO keys - Regulator haptic - SDHCI (internal and external storage) - UART (on USB connector via the TI TSU6721 MUIC) - Regulators - Touchscreen - Accelerometer Signed-off-by: Raymond Hackley --- arch/arm/boot/dts/qcom/Makefile | 1 + .../dts/qcom/qcom-msm8226-samsung-ms013g.dts | 386 ++ 2 files changed, 387 insertions(+) create mode 100644 arch/arm/boot/dts/qcom/qcom-msm8226-samsung-ms013g.dts diff --git a/arch/arm/boot/dts/qcom/Makefile b/arch/arm/boot/dts/qcom/Makefile index ccd4ce6353df..f06c6d425e91 100644 --- a/arch/arm/boot/dts/qcom/Makefile +++ b/arch/arm/boot/dts/qcom/Makefile @@ -28,6 +28,7 @@ dtb-$(CONFIG_ARCH_QCOM) += \ qcom-msm8226-microsoft-dempsey.dtb \ qcom-msm8226-microsoft-makepeace.dtb \ qcom-msm8226-microsoft-moneypenny.dtb \ + qcom-msm8226-samsung-ms013g.dtb \ qcom-msm8226-samsung-s3ve3g.dtb \ qcom-msm8660-surf.dtb \ qcom-msm8916-samsung-e5.dtb \ diff --git a/arch/arm/boot/dts/qcom/qcom-msm8226-samsung-ms013g.dts b/arch/arm/boot/dts/qcom/qcom-msm8226-samsung-ms013g.dts new file mode 100644 index ..190b52fda634 --- /dev/null +++ b/arch/arm/boot/dts/qcom/qcom-msm8226-samsung-ms013g.dts @@ -0,0 +1,386 @@ +// SPDX-License-Identifier: BSD-3-Clause + +/dts-v1/; + +#include "qcom-msm8226.dtsi" +#include "pm8226.dtsi" + +/delete-node/ _region; + +/ { + model = "Samsung Galaxy Grand 2"; + compatible = "samsung,ms013g", "qcom,msm8226"; + chassis-type = "handset"; + + aliases { + mmc0 = _1; /* SDC1 eMMC slot */ + mmc1 = _2; /* SDC2 SD card slot */ + serial0 = _uart3; + }; + + chosen { + stdout-path = "serial0:115200n8"; + }; + + gpio-hall-sensor { + compatible = "gpio-keys"; + + pinctrl-0 = <_hall_sensor_default>; + pinctrl-names = "default"; + + label = "GPIO Hall Effect Sensor"; + + event-hall-sensor { + label = "Hall Effect Sensor"; + gpios = < 50 GPIO_ACTIVE_LOW>; + linux,input-type = ; + linux,code = ; + linux,can-disable; + }; + }; + + gpio-keys { + compatible = "gpio-keys"; + + pinctrl-0 = <_keys_default>; + pinctrl-names = "default"; + + label = "GPIO Buttons"; + + button-volume-up { + label = "Volume Up"; + gpios = < 106 GPIO_ACTIVE_LOW>; + linux,code = ; + }; + + button-volume-down { + label = "Volume Down"; + gpios = < 107 GPIO_ACTIVE_LOW>; + linux,code = ; + }; + + button-home { + label = "Home Key"; + gpios = < 108 GPIO_ACTIVE_LOW>; + linux,code = ; + }; + }; + + haptic { + compatible = "regulator-haptic"; + haptic-supply = <_motor_vdd>; + min-microvolt = <330>; + max-microvolt = <330>; + }; + + reg_motor_vdd: regulator-motor-vdd { + compatible = "regulator-fixed"; + regulator-name = "motor_vdd"; + regulator-min-microvolt = <330>; + regulator-max-microvolt = <330>; + + gpio = < 111 GPIO_ACTIVE_HIGH>; + enable-active-high; + + pinctrl-0 = <_en_default>; + pinctrl-names = "default"; + }; + + reg_vdd_tsp_a: regulator-vdd-tsp-a { + compatible = "regulator-fixed"; + regulator-name = "tsp_3p3v"; + regulator-min-microvolt = <330>; + regulator-max-microvolt = <330>; + + gpio = < 31 GPIO_ACTIVE_HIGH>; + enable-active-high; + + pinctrl-0 = <_en_default>; + pinctrl-names = "default"; + }; + + reserved-memory { + smem_region: smem@fa0 { + reg = <0x0fa0 0x10>; + no-map; + }; + }; +}; + +_i2c2 { + status = "okay"; + + accelerometer@18 { + compatible = "bosch,bma255"; + reg = <0x18>; + interrupts-extended = < 64 IRQ_TYPE_EDGE_RISING>; + + vdd-supply = <_l19>; + vddio-supply = <_lvs1>; + + pinctrl-0 = <_int_default>; + pinctrl-names = "default"; + + mount-matrix = "0", "1", "0", +
[PATCH v2 0/2] ARM: dts: qcom-msm8226-samsung-ms013g: Add initial device tree
Samsung Galaxy Grand 2 is a phone based on MSM8226. It's similar to the other Samsung devices based on MSM8226 with only a few minor differences. The device trees contain initial support with: - GPIO keys - Regulator haptic - SDHCI (internal and external storage) - UART (on USB connector via the TI TSU6721 MUIC) - Regulators - Touchscreen - Accelerometer --- v2: Adjust l3, l15, l22 and l27 regulator voltages. Sort nodes. Set regulator-allow-set-load for vqmmc supplies.
[PATCH v6] remoteproc: xlnx: add attach detach support
It is possible that remote processor is already running before linux boot or remoteproc platform driver probe. Implement required remoteproc framework ops to provide resource table address and connect or disconnect with remote processor in such case. Signed-off-by: Tanmay Shah --- Changes in v6: - Move rproc state check to add_tcm_carveout - free node reference using of_node_put - fix iounmap use Changes in v5: - Fix comment on assigning DETACHED state to remoteproc instance during driver probe. - Fix patch subject and remove "drivers" Changes in v4: - Move change log out of commit text Changes in v3: - Drop SRAM patch from the series - Change type from "struct resource_table *" to void __iomem * - Change comment format from /** to /* - Remove unmap of resource table va address during detach, allowing attach-detach-reattach use case. - Unmap rsc_data_va after retrieving resource table data structure. - Unmap resource table va during driver remove op Changes in v2: - Fix typecast warnings reported using sparse tool. - Fix following sparse warnings: drivers/remoteproc/xlnx_r5_remoteproc.c | 151 1 file changed, 151 insertions(+) diff --git a/drivers/remoteproc/xlnx_r5_remoteproc.c b/drivers/remoteproc/xlnx_r5_remoteproc.c index 84243d1dff9f..596f3ffb8935 100644 --- a/drivers/remoteproc/xlnx_r5_remoteproc.c +++ b/drivers/remoteproc/xlnx_r5_remoteproc.c @@ -25,6 +25,10 @@ /* RX mailbox client buffer max length */ #define MBOX_CLIENT_BUF_MAX(IPI_BUF_LEN_MAX + \ sizeof(struct zynqmp_ipi_message)) + +#define RSC_TBL_XLNX_MAGIC ((uint32_t)'x' << 24 | (uint32_t)'a' << 16 | \ +(uint32_t)'m' << 8 | (uint32_t)'p') + /* * settings for RPU cluster mode which * reflects possible values of xlnx,cluster-mode dt-property @@ -73,6 +77,26 @@ struct mbox_info { struct mbox_chan *rx_chan; }; +/** + * struct rsc_tbl_data + * + * Platform specific data structure used to sync resource table address. + * It's important to maintain order and size of each field on remote side. + * + * @version: version of data structure + * @magic_num: 32-bit magic number. + * @comp_magic_num: complement of above magic number + * @rsc_tbl_size: resource table size + * @rsc_tbl: resource table address + */ +struct rsc_tbl_data { + const int version; + const u32 magic_num; + const u32 comp_magic_num; + const u32 rsc_tbl_size; + const uintptr_t rsc_tbl; +} __packed; + /* * Hardcoded TCM bank values. This will stay in driver to maintain backward * compatibility with device-tree that does not have TCM information. @@ -95,20 +119,24 @@ static const struct mem_bank_data zynqmp_tcm_banks_lockstep[] = { /** * struct zynqmp_r5_core * + * @rsc_tbl_va: resource table virtual address * @dev: device of RPU instance * @np: device node of RPU instance * @tcm_bank_count: number TCM banks accessible to this RPU * @tcm_banks: array of each TCM bank data * @rproc: rproc handle + * @rsc_tbl_size: resource table size retrieved from remote * @pm_domain_id: RPU CPU power domain id * @ipi: pointer to mailbox information */ struct zynqmp_r5_core { + void __iomem *rsc_tbl_va; struct device *dev; struct device_node *np; int tcm_bank_count; struct mem_bank_data **tcm_banks; struct rproc *rproc; + u32 rsc_tbl_size; u32 pm_domain_id; struct mbox_info *ipi; }; @@ -557,6 +585,14 @@ static int add_tcm_banks(struct rproc *rproc) dev_dbg(dev, "TCM carveout %s addr=%llx, da=0x%x, size=0x%lx", bank_name, bank_addr, da, bank_size); + /* +* In DETACHED state firmware is already running so no need to +* request add TCM registers. However, request TCM PD node to let +* platform management firmware know that TCM is in use. +*/ + if (rproc->state == RPROC_DETACHED) + continue; + rproc_mem = rproc_mem_entry_init(dev, NULL, bank_addr, bank_size, da, tcm_mem_map, tcm_mem_unmap, @@ -662,6 +698,107 @@ static int zynqmp_r5_rproc_unprepare(struct rproc *rproc) return 0; } +static struct resource_table *zynqmp_r5_get_loaded_rsc_table(struct rproc *rproc, +size_t *size) +{ + struct zynqmp_r5_core *r5_core; + + r5_core = rproc->priv; + + *size = r5_core->rsc_tbl_size; + + return (struct resource_table *)r5_core->rsc_tbl_va; +} + +static int zynqmp_r5_get_rsc_table_va(struct zynqmp_r5_core *r5_core) +{ + struct resource_table *rsc_tbl_addr; + struct device *dev = r5_core->dev; + struct rsc_tbl_data *rsc_data_va; + struct resource
Re: [PATCH v4 00/11] riscv: Memory Hot(Un)Plug support
Hello: This series was applied to riscv/linux.git (for-next) by Palmer Dabbelt : On Wed, 5 Jun 2024 13:40:43 +0200 you wrote: > From: Björn Töpel > > > Memory Hot(Un)Plug support (and ZONE_DEVICE) for the RISC-V port > > > (For the restless folks: change log in the bottom!) > > [...] Here is the summary with links: - [v4,01/11] riscv: mm: Properly forward vmemmap_populate() altmap parameter https://git.kernel.org/riscv/c/e3ecf2fdc8f3 - [v4,02/11] riscv: mm: Pre-allocate vmemmap/direct map/kasan PGD entries https://git.kernel.org/riscv/c/66673099f734 - [v4,03/11] riscv: mm: Change attribute from __init to __meminit for page functions https://git.kernel.org/riscv/c/fe122b89da67 - [v4,04/11] riscv: mm: Refactor create_linear_mapping_range() for memory hot add https://git.kernel.org/riscv/c/007480fe84a9 - [v4,05/11] riscv: mm: Add pfn_to_kaddr() implementation https://git.kernel.org/riscv/c/6e6c5e21b8cb - [v4,06/11] riscv: mm: Add memory hotplugging support https://git.kernel.org/riscv/c/c75a74f4ba19 - [v4,07/11] riscv: mm: Take memory hotplug read-lock during kernel page table dump https://git.kernel.org/riscv/c/37992b7f1097 - [v4,08/11] riscv: Enable memory hotplugging for RISC-V https://git.kernel.org/riscv/c/f8c2a240556e - [v4,09/11] virtio-mem: Enable virtio-mem for RISC-V https://git.kernel.org/riscv/c/0546d7043e55 - [v4,10/11] riscv: mm: Add support for ZONE_DEVICE https://git.kernel.org/riscv/c/216e04bf1e4d - [v4,11/11] riscv: Enable DAX VMEMMAP optimization https://git.kernel.org/riscv/c/4705c1571ad3 You are awesome, thank you! -- Deet-doot-dot, I am a bot. https://korg.docs.kernel.org/patchwork/pwbot.html
Re: [PATCH -fixes] riscv: patch: Flush the icache right after patching to avoid illegal insns
Hello: This patch was applied to riscv/linux.git (fixes) by Palmer Dabbelt : On Mon, 24 Jun 2024 10:21:41 +0200 you wrote: > We cannot delay the icache flush after patching some functions as we may > have patched a function that will get called before the icache flush. > > The only way to completely avoid such scenario is by flushing the icache > as soon as we patch a function. This will probably be costly as we don't > batch the icache maintenance anymore. > > [...] Here is the summary with links: - [-fixes] riscv: patch: Flush the icache right after patching to avoid illegal insns https://git.kernel.org/riscv/c/edf2d546bfd6 You are awesome, thank you! -- Deet-doot-dot, I am a bot. https://korg.docs.kernel.org/patchwork/pwbot.html
Re: [PATCH -fixes] riscv: patch: Flush the icache right after patching to avoid illegal insns
On Mon, 24 Jun 2024 01:21:41 PDT (-0700), alexgh...@rivosinc.com wrote: We cannot delay the icache flush after patching some functions as we may have patched a function that will get called before the icache flush. The only way to completely avoid such scenario is by flushing the icache as soon as we patch a function. This will probably be costly as we don't batch the icache maintenance anymore. Ya, it's going to be pretty miserable for performance. We'd talked about using objtool for the static rewriting a few weeks ago in the patchwork meeting, but with the dynamic rewriting suffering from similar issues it seems best to just pick this one up. We can always sort out the performance isuses later, at least this is correct. Fixes: 6ca445d8af0e ("riscv: Fix early ftrace nop patching") Reported-by: Conor Dooley Closes: https://lore.kernel.org/linux-riscv/20240613-lubricant-breath-061192a9489a@wendy/ Signed-off-by: Alexandre Ghiti --- arch/riscv/kernel/ftrace.c | 7 ++- arch/riscv/kernel/patch.c | 26 ++ 2 files changed, 20 insertions(+), 13 deletions(-) diff --git a/arch/riscv/kernel/ftrace.c b/arch/riscv/kernel/ftrace.c index 87cbd86576b2..4b95c574fd04 100644 --- a/arch/riscv/kernel/ftrace.c +++ b/arch/riscv/kernel/ftrace.c @@ -120,9 +120,6 @@ int ftrace_init_nop(struct module *mod, struct dyn_ftrace *rec) out = ftrace_make_nop(mod, rec, MCOUNT_ADDR); mutex_unlock(_mutex); - if (!mod) - local_flush_icache_range(rec->ip, rec->ip + MCOUNT_INSN_SIZE); - return out; } @@ -156,9 +153,9 @@ static int __ftrace_modify_code(void *data) } else { while (atomic_read(>cpu_count) <= num_online_cpus()) cpu_relax(); - } - local_flush_icache_all(); + local_flush_icache_all(); + } return 0; } diff --git a/arch/riscv/kernel/patch.c b/arch/riscv/kernel/patch.c index 4007563fb607..ab03732d06c4 100644 --- a/arch/riscv/kernel/patch.c +++ b/arch/riscv/kernel/patch.c @@ -89,6 +89,14 @@ static int __patch_insn_set(void *addr, u8 c, size_t len) memset(waddr, c, len); + /* +* We could have just patched a function that is about to be +* called so make sure we don't execute partially patched +* instructions by flushing the icache as soon as possible. +*/ + local_flush_icache_range((unsigned long)waddr, +(unsigned long)waddr + len); + patch_unmap(FIX_TEXT_POKE0); if (across_pages) @@ -135,6 +143,14 @@ static int __patch_insn_write(void *addr, const void *insn, size_t len) ret = copy_to_kernel_nofault(waddr, insn, len); + /* +* We could have just patched a function that is about to be +* called so make sure we don't execute partially patched +* instructions by flushing the icache as soon as possible. +*/ + local_flush_icache_range((unsigned long)waddr, +(unsigned long)waddr + len); + patch_unmap(FIX_TEXT_POKE0); if (across_pages) @@ -189,9 +205,6 @@ int patch_text_set_nosync(void *addr, u8 c, size_t len) ret = patch_insn_set(tp, c, len); - if (!ret) - flush_icache_range((uintptr_t)tp, (uintptr_t)tp + len); - return ret; } NOKPROBE_SYMBOL(patch_text_set_nosync); @@ -224,9 +237,6 @@ int patch_text_nosync(void *addr, const void *insns, size_t len) ret = patch_insn_write(tp, insns, len); - if (!ret) - flush_icache_range((uintptr_t) tp, (uintptr_t) tp + len); - return ret; } NOKPROBE_SYMBOL(patch_text_nosync); @@ -253,9 +263,9 @@ static int patch_text_cb(void *data) } else { while (atomic_read(>cpu_count) <= num_online_cpus()) cpu_relax(); - } - local_flush_icache_all(); + local_flush_icache_all(); + } return ret; }
Re: [PATCH] LoongArch: uprobes: make UPROBE_SWBP_INSN/UPROBE_XOLBP_INSN constant
On 06/27, Andrii Nakryiko wrote: > > Acked-by: Andrii Nakryiko Thanks! > > --- a/arch/loongarch/kernel/uprobes.c > > +++ b/arch/loongarch/kernel/uprobes.c > > @@ -7,6 +7,14 @@ > > > > #define UPROBE_TRAP_NR UINT_MAX > > > > +static __init int check_emit_break(void) > > +{ > > + BUG_ON(UPROBE_SWBP_INSN != larch_insn_gen_break(BRK_UPROBE_BP)); > > + BUG_ON(UPROBE_XOLBP_INSN != larch_insn_gen_break(BRK_UPROBE_XOLBP)); > > + return 0; > > +} > > +arch_initcall(check_emit_break); > > + > > I wouldn't even bother with this, but whatever. Agreed, this looks a bit ugly. I did this only because I can not test this (hopefully trivial) patch and the maintainers didn't reply. If LoongArch boots at least once with this change, this run-time check can be removed. And just in case... I didn't dare to make a more "generic" change, but perhaps KPROBE_BP_INSN and KPROBE_SSTEPBP_INSN should be redefined the same way for micro-optimization. In this case __emit_break() should be probably moved into arch/loongarch/include/asm/inst.h. Oleg.
[PATCH][next] firewire: core: Fix spelling mistakes in tracepoint messages
There are two spelling mistakes in the tracepoint message text. Fix them. Signed-off-by: Colin Ian King --- include/trace/events/firewire.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/include/trace/events/firewire.h b/include/trace/events/firewire.h index d9158a134beb..86330ba58336 100644 --- a/include/trace/events/firewire.h +++ b/include/trace/events/firewire.h @@ -853,7 +853,7 @@ DECLARE_EVENT_CLASS(isoc_single_completions_template, memcpy(__get_dynamic_array(header), header, __get_dynamic_array_len(header)); ), TP_printk( - "context=0x%llx card_index=%u timestap=0x%04x cause=%s header=%s", + "context=0x%llx card_index=%u timestamp=0x%04x cause=%s header=%s", __entry->context, __entry->card_index, __entry->timestamp, @@ -891,7 +891,7 @@ TRACE_EVENT(isoc_inbound_multiple_completions, __entry->cause = cause; ), TP_printk( - "context=0x%llx card_index=%u comleted=%u cause=%s", + "context=0x%llx card_index=%u completed=%u cause=%s", __entry->context, __entry->card_index, __entry->completed, -- 2.39.2
Re: [PATCH] LoongArch: uprobes: make UPROBE_SWBP_INSN/UPROBE_XOLBP_INSN constant
On Thu, Jun 27, 2024 at 9:04 AM Oleg Nesterov wrote: > > LoongArch defines UPROBE_SWBP_INSN as a function call and this breaks > arch_uprobe_trampoline() which uses it to initialize a static variable. > > Fixes: ff474a78cef5 ("uprobe: Add uretprobe syscall to speed up return probe") > Reported-by: Nathan Chancellor > Closes: https://lore.kernel.org/all/20240614174822.GA1185149@thelio-3990X/ > Suggested-by: Andrii Nakryiko > Signed-off-by: Oleg Nesterov > --- > arch/loongarch/include/asm/uprobes.h | 6 -- > arch/loongarch/kernel/uprobes.c | 8 > 2 files changed, 12 insertions(+), 2 deletions(-) > LGTM. Acked-by: Andrii Nakryiko > diff --git a/arch/loongarch/include/asm/uprobes.h > b/arch/loongarch/include/asm/uprobes.h > index c8f59983f702..18221eb9a8b0 100644 > --- a/arch/loongarch/include/asm/uprobes.h > +++ b/arch/loongarch/include/asm/uprobes.h > @@ -6,13 +6,15 @@ > > typedef u32 uprobe_opcode_t; > > +#define __emit_break(imm) (uprobe_opcode_t)((imm) | (break_op << 15)) > + > #define MAX_UINSN_BYTES8 > #define UPROBE_XOL_SLOT_BYTES MAX_UINSN_BYTES > > -#define UPROBE_SWBP_INSN larch_insn_gen_break(BRK_UPROBE_BP) > +#define UPROBE_SWBP_INSN __emit_break(BRK_UPROBE_BP) > #define UPROBE_SWBP_INSN_SIZE LOONGARCH_INSN_SIZE > > -#define UPROBE_XOLBP_INSN larch_insn_gen_break(BRK_UPROBE_XOLBP) > +#define UPROBE_XOLBP_INSN __emit_break(BRK_UPROBE_XOLBP) > this looks correct (but based on pure code inspection) > struct arch_uprobe { > unsigned long resume_era; > diff --git a/arch/loongarch/kernel/uprobes.c b/arch/loongarch/kernel/uprobes.c > index 87abc7137b73..90462d94c28f 100644 > --- a/arch/loongarch/kernel/uprobes.c > +++ b/arch/loongarch/kernel/uprobes.c > @@ -7,6 +7,14 @@ > > #define UPROBE_TRAP_NR UINT_MAX > > +static __init int check_emit_break(void) > +{ > + BUG_ON(UPROBE_SWBP_INSN != larch_insn_gen_break(BRK_UPROBE_BP)); > + BUG_ON(UPROBE_XOLBP_INSN != larch_insn_gen_break(BRK_UPROBE_XOLBP)); > + return 0; > +} > +arch_initcall(check_emit_break); > + I wouldn't even bother with this, but whatever. > int arch_uprobe_analyze_insn(struct arch_uprobe *auprobe, > struct mm_struct *mm, unsigned long addr) > { > -- > 2.25.1.362.g51ebf55 > >
Re: [PATCH 06/12] uprobes: add batch uprobe register/unregister APIs
On Thu, Jun 27, 2024 at 6:04 AM Masami Hiramatsu wrote: > > On Mon, 24 Jun 2024 17:21:38 -0700 > Andrii Nakryiko wrote: > > > -static int __uprobe_register(struct inode *inode, loff_t offset, > > - loff_t ref_ctr_offset, struct uprobe_consumer > > *uc) > > +int uprobe_register_batch(struct inode *inode, int cnt, > > + uprobe_consumer_fn get_uprobe_consumer, void *ctx) > > Is this interface just for avoiding memory allocation? Can't we just > allocate a temporary array of *uprobe_consumer instead? Yes, exactly, to avoid the need for allocating another array that would just contain pointers to uprobe_consumer. Consumers would never just have an array of `struct uprobe_consumer *`, because uprobe_consumer struct is embedded in some other struct, so the array interface isn't the most convenient. If you feel strongly, I can do an array, but this necessitates allocating an extra array *and keeping it* for the entire duration of BPF multi-uprobe link (attachment) existence, so it feels like a waste. This is because we don't want to do anything that can fail in the detachment logic (so no temporary array allocation there). Anyways, let me know how you feel about keeping this callback. > > Thank you, > > -- > Masami Hiramatsu (Google)
Re: [PATCH 04/12] uprobes: revamp uprobe refcounting and lifetime management
On Wed, Jun 26, 2024 at 7:30 PM Masami Hiramatsu wrote: > > On Mon, 24 Jun 2024 17:21:36 -0700 > Andrii Nakryiko wrote: > > > Anyways, under exclusive writer lock, we double-check that refcount > > didn't change and is still zero. If it is, we proceed with destruction, > > because at that point we have a guarantee that find_active_uprobe() > > can't successfully look up this uprobe instance, as it's going to be > > removed in destructor under writer lock. If, on the other hand, > > find_active_uprobe() managed to bump refcount from zero to one in > > between put_uprobe()'s atomic_dec_and_test(>ref) and > > write_lock(_treelock), we'll deterministically detect this with > > extra atomic_read(>ref) check, and if it doesn't hold, we > > pretend like atomic_dec_and_test() never returned true. There is no > > resource freeing or any other irreversible action taken up till this > > point, so we just exit early. > > > > One tricky part in the above is actually two CPUs racing and dropping > > refcnt to zero, and then attempting to free resources. This can happen > > as follows: > > - CPU #0 drops refcnt from 1 to 0, and proceeds to grab uprobes_treelock; > > - before CPU #0 grabs a lock, CPU #1 updates refcnt as 0 -> 1 -> 0, at > > which point it decides that it needs to free uprobe as well. > > > > At this point both CPU #0 and CPU #1 will believe they need to destroy > > uprobe, which is obviously wrong. To prevent this situations, we augment > > refcount with epoch counter, which is always incremented by 1 on either > > get or put operation. This allows those two CPUs above to disambiguate > > who should actually free uprobe (it's the CPU #1, because it has > > up-to-date epoch). See comments in the code and note the specific values > > of UPROBE_REFCNT_GET and UPROBE_REFCNT_PUT constants. Keep in mind that > > a single atomi64_t is actually a two sort-of-independent 32-bit counters > > that are incremented/decremented with a single atomic_add_and_return() > > operation. Note also a small and extremely rare (and thus having no > > effect on performance) need to clear the highest bit every 2 billion > > get/put operations to prevent high 32-bit counter from "bleeding over" > > into lower 32-bit counter. > > I have a question here. > Is there any chance to the CPU#1 to put the uprobe before CPU#0 gets > the uprobes_treelock, and free uprobe before CPU#0 validate uprobe->ref > again? e.g. > > CPU#0 CPU#1 > > put_uprobe() { > atomic64_add_return() > __get_uprobe(); > put_uprobe() { > kfree(uprobe) > } > write_lock(_treelock); > atomic64_read(>ref); > } > > I think it is very rare case, but I could not find any code to prevent > this scenario. > Yes, I think you are right, great catch! I concentrated on preventing double kfree() in this situation, and somehow convinced myself that eager kfree() is fine. But I think I'll need to delay freeing, probably with RCU. The problem is that we can't use rcu_read_lock()/rcu_read_unlock() because we take locks, so it has to be a sleepable variant of RCU. I'm thinking of using rcu_read_lock_trace(), the same variant of RCU we use for sleepable BPF programs (including sleepable uprobes). srcu might be too heavy for this. I'll try a few variants over the next few days and see how the performance looks. > Thank you, > > > -- > Masami Hiramatsu (Google) >
Re: [PATCH v3 1/2] rust: add static_key_false
On Thu, Jun 27, 2024 at 10:34:39AM +0200, Alice Ryhl wrote: > On Tue, Jun 25, 2024 at 6:18 PM Boqun Feng wrote: > > > > Hi Alice, > > > > On Fri, Jun 21, 2024 at 10:35:26AM +, Alice Ryhl wrote: > > > Add just enough support for static key so that we can use it from > > > tracepoints. Tracepoints rely on `static_key_false` even though it is > > > deprecated, so we add the same functionality to Rust. > > > > > > It is not possible to use the existing C implementation of > > > arch_static_branch because it passes the argument `key` to inline > > > assembly as an 'i' parameter, so any attempt to add a C helper for this > > > function will fail to compile because the value of `key` must be known > > > at compile-time. > > > > > > Signed-off-by: Alice Ryhl > > > > [Add linux-arch, and related arch maintainers Cced] > > > > Since inline asms are touched here, please consider copying linux-arch > > and arch maintainers next time ;-) > > Will do. > > > For x86_64 and arm64 bits: > > > > Acked-by: Boqun Feng > > > > One thing though, we should split the arch-specific impls into different > > files, for example: rust/kernel/arch/arm64.rs or rust/arch/arm64.rs. > > That'll be easier for arch maintainers to watch the Rust changes related > > to a particular architecture. > > Is that how you would prefer to name these files? You don't want > static_key somewhere in the filename? > I could have been more explicit. My preference is (for example ARM64) * we have a rust/kernel/arch.rs, where we do: #[cfg(CONFIG_ARM64)] mod arm64::*; #[cfg(CONFIG_ARM64)] pub use arm64::*; * we have a rust/kernel/arch/arm64.rs: pub(crate) mod jump_label; * we have a rust/kernel/arch/arm64/jump_label.rs, where we put ARM64 arch_static_branch() there. (or static_key.rs and arch_static_key_false()). Then linux-arch can watch: rust/kernel/arch.rs rust/kernel/arch/ And ARM64 maintainers can watch: rust/kernel/arch/arm64.rs rust/kernel/arch/arm64 This is similar to how are organized today. Does this make sense? Regards, Boqun > > Another thought is that, could you implement an arch_static_branch!() > > (instead of _static_key_false!()) and use it for static_key_false!() > > similar to what we have in C? The benefit is that at least for myself > > it'll be easier to compare the implementation between C and Rust. > > I can try to include that. > > Alice
Re: [PATCH v2 4/4] EDAC/mce_amd: Add support for FRU Text in MCA
On Wed, Jun 26, 2024 at 08:20:13PM +0200, Borislav Petkov wrote: > On Wed, Jun 26, 2024 at 01:00:30PM -0500, Naik, Avadhut wrote: > > > > > > Why are you clearing it if you're overwriting it immediately? > > > > > Since its a local variable, wanted to ensure that the memory is zeroed out > > to prevent > > any issues with the %s specifier, used later on. > > What issues? > > > Would you recommend removing that and using initializer instead for the > > string? > > I'd recommend looking at what the code does and then really thinking whether > that makes any sense. > We need to make sure the string is NULL-terminated. So the memset() could be replaced with this: frutext[16] = '\0'; Or better yet, maybe we can use scnprintf() or similar. Thanks, Yazen
[PATCH] LoongArch: uprobes: make UPROBE_SWBP_INSN/UPROBE_XOLBP_INSN constant
LoongArch defines UPROBE_SWBP_INSN as a function call and this breaks arch_uprobe_trampoline() which uses it to initialize a static variable. Fixes: ff474a78cef5 ("uprobe: Add uretprobe syscall to speed up return probe") Reported-by: Nathan Chancellor Closes: https://lore.kernel.org/all/20240614174822.GA1185149@thelio-3990X/ Suggested-by: Andrii Nakryiko Signed-off-by: Oleg Nesterov --- arch/loongarch/include/asm/uprobes.h | 6 -- arch/loongarch/kernel/uprobes.c | 8 2 files changed, 12 insertions(+), 2 deletions(-) diff --git a/arch/loongarch/include/asm/uprobes.h b/arch/loongarch/include/asm/uprobes.h index c8f59983f702..18221eb9a8b0 100644 --- a/arch/loongarch/include/asm/uprobes.h +++ b/arch/loongarch/include/asm/uprobes.h @@ -6,13 +6,15 @@ typedef u32 uprobe_opcode_t; +#define __emit_break(imm) (uprobe_opcode_t)((imm) | (break_op << 15)) + #define MAX_UINSN_BYTES8 #define UPROBE_XOL_SLOT_BYTES MAX_UINSN_BYTES -#define UPROBE_SWBP_INSN larch_insn_gen_break(BRK_UPROBE_BP) +#define UPROBE_SWBP_INSN __emit_break(BRK_UPROBE_BP) #define UPROBE_SWBP_INSN_SIZE LOONGARCH_INSN_SIZE -#define UPROBE_XOLBP_INSN larch_insn_gen_break(BRK_UPROBE_XOLBP) +#define UPROBE_XOLBP_INSN __emit_break(BRK_UPROBE_XOLBP) struct arch_uprobe { unsigned long resume_era; diff --git a/arch/loongarch/kernel/uprobes.c b/arch/loongarch/kernel/uprobes.c index 87abc7137b73..90462d94c28f 100644 --- a/arch/loongarch/kernel/uprobes.c +++ b/arch/loongarch/kernel/uprobes.c @@ -7,6 +7,14 @@ #define UPROBE_TRAP_NR UINT_MAX +static __init int check_emit_break(void) +{ + BUG_ON(UPROBE_SWBP_INSN != larch_insn_gen_break(BRK_UPROBE_BP)); + BUG_ON(UPROBE_XOLBP_INSN != larch_insn_gen_break(BRK_UPROBE_XOLBP)); + return 0; +} +arch_initcall(check_emit_break); + int arch_uprobe_analyze_insn(struct arch_uprobe *auprobe, struct mm_struct *mm, unsigned long addr) { -- 2.25.1.362.g51ebf55
Re: [RFC PATCH v2] ptp: Add vDSO-style vmclock support
I've updated the tree at https://git.infradead.org/users/dwmw2/linux.git/shortlog/refs/heads/vmclock (but not yet the qemu one). I think I've taken into account all your comments apart from the one about non-64-bit counters wrapping. I reduced the seq_count to 32 bit to make room for a 32-bit flags field, added the time type (UTC/TAI/MONOTONIC) and a smearing hint, with some straw man definitions for smearing algorithms for which I could actually find definitions. The structure now looks like this: struct vmclock_abi { uint32_t magic; #define VMCLOCK_MAGIC 0x4b4c4356 /* "VCLK" */ uint16_t size; /* Size of page containing this structure */ uint16_t version; /* 1 */ /* Sequence lock. Low bit means an update is in progress. */ uint32_t seq_count; uint32_t flags; /* Indicates that the tai_offset_sec field is valid */ #define VMCLOCK_FLAG_TAI_OFFSET_VALID (1 << 0) /* * Optionally used to notify guests of pending maintenance events. * A guest may wish to remove itself from service if an event is * coming up. Two flags indicate the rough imminence of the event. */ #define VMCLOCK_FLAG_DISRUPTION_SOON(1 << 1) /* About a day */ #define VMCLOCK_FLAG_DISRUPTION_IMMINENT(1 << 2) /* About an hour */ /* Indicates that the utc_time_maxerror_picosec field is valid */ #define VMCLOCK_FLAG_UTC_MAXERROR_VALID (1 << 3) /* Indicates counter_period_error_rate_frac_sec is valid */ #define VMCLOCK_FLAG_PERIOD_ERROR_VALID (1 << 4) /* * This field changes to another non-repeating value when the CPU * counter is disrupted, for example on live migration. This lets * the guest know that it should discard any calibration it has * performed of the counter against external sources (NTP/PTP/etc.). */ uint64_t disruption_marker; uint8_t clock_status; #define VMCLOCK_STATUS_UNKNOWN 0 #define VMCLOCK_STATUS_INITIALIZING 1 #define VMCLOCK_STATUS_SYNCHRONIZED 2 #define VMCLOCK_STATUS_FREERUNNING 3 #define VMCLOCK_STATUS_UNRELIABLE 4 uint8_t counter_id; #define VMCLOCK_COUNTER_INVALID 0 #define VMCLOCK_COUNTER_X86_TSC 1 #define VMCLOCK_COUNTER_ARM_VCNT2 #define VMCLOCK_COUNTER_X86_ART 3 /* * By providing the offset from UTC to TAI, the guest can know both * UTC and TAI reliably, whichever is indicated in the time_type * field. Valid if VMCLOCK_FLAG_TAI_OFFSET_VALID is set in flags. */ int16_t tai_offset_sec; /* * The time exposed through this device is never smeaared; if it * claims to be VMCLOCK_TIME_UTC then it MUST be UTC. This field * provides a hint to the guest operating system, such that *if* * the guest OS wants to provide its users with an alternative * clock which does not follow the POSIX CLOCK_REALTIME standard, * it may do so in a fashion consistent with the other systems * in the nearby environment. */ uint8_t leap_second_smearing_hint; /* Provide true UTC to users, unsmeared. */; #define VMCLOCK_SMEARING_NONE 0 /* * https://aws.amazon.com/blogs/aws/look-before-you-leap-the-coming-leap-second-and-aws/ * From noon on the day before to noon on the day after, smear the * clock by a linear 1/86400s per second. */ #define VMCLOCK_SMEARING_LINEAR_86400 1 /* * draft-kuhn-leapsecond-00 * For the 1000s leading up to the leap second, smear the clock by * clock by a linear 1ms per second. */ #define VMCLOCK_SMEARING_UTC_SLS2 /* * What time is exposed in the time_sec/time_frac_sec fields? */ uint8_t time_type; #define VMCLOCK_TIME_UNKNOWN0 /* Invalid / no time exposed */ #define VMCLOCK_TIME_UTC1 /* Since 1970-01-01 00:00:00z */ #define VMCLOCK_TIME_TAI2 /* Since 1970-01-01 00:00:00z */ #define VMCLOCK_TIME_MONOTONIC 3 /* Since undefined epoch */ /* Bit shift for counter_period_frac_sec and its error rate */ uint8_t counter_period_shift; /* * Unlike in NTP, this can indicate a leap second in the past. This * is needed to allow guests to derive an imprecise clock with * smeared leap seconds for themselves, as some modes of smearing * need the adjustments to continue even after the moment at which * the leap second should have occurred. */ int8_t leapsecond_direction; uint64_t leapsecond_tai_sec; /* Since 1970-01-01 00:00:00z */ /* * Paired values of counter and UTC at a given point in time. */ uint64_t
Re: [PATCH] uprobe: Do not use UPROBE_SWBP_INSN as static initializer
On 06/27, Masami Hiramatsu wrote: > > On Thu, 27 Jun 2024 15:44:16 +0200 > Jiri Olsa wrote: > > > Oleg, do you want to send formal patch? > > > > thanks, > > jirka > > Yes, can you send v2 patch? I was waiting for the comments from loongarch maintainers... OK, will do today, but the patch won't be even compile tested. Oleg.
[PATCH v5 8/8] tracing: Convert sys_enter/exit to faultable tracepoints
Convert the definition of the system call enter/exit tracepoints to faultable tracepoints now that all upstream tracers handle it. This allows tracers to fault-in userspace system call arguments such as path strings within their probe callbacks. Link: https://lore.kernel.org/lkml/20231002202531.3160-1-mathieu.desnoy...@efficios.com/ Co-developed-by: Michael Jeanson Signed-off-by: Mathieu Desnoyers Signed-off-by: Michael Jeanson Cc: Steven Rostedt Cc: Masami Hiramatsu Cc: Peter Zijlstra Cc: Alexei Starovoitov Cc: Yonghong Song Cc: Paul E. McKenney Cc: Ingo Molnar Cc: Arnaldo Carvalho de Melo Cc: Mark Rutland Cc: Alexander Shishkin Cc: Jiri Olsa Cc: Namhyung Kim Cc: b...@vger.kernel.org Cc: Joel Fernandes --- Since v4: - Use 'guard(preempt_notrace)'. - Add brackets to multiline 'if' statements. --- include/trace/events/syscalls.h | 4 +-- kernel/trace/trace_syscalls.c | 52 - 2 files changed, 46 insertions(+), 10 deletions(-) diff --git a/include/trace/events/syscalls.h b/include/trace/events/syscalls.h index b6e0cbc2c71f..dc30e3004818 100644 --- a/include/trace/events/syscalls.h +++ b/include/trace/events/syscalls.h @@ -15,7 +15,7 @@ #ifdef CONFIG_HAVE_SYSCALL_TRACEPOINTS -TRACE_EVENT_FN(sys_enter, +TRACE_EVENT_FN_MAY_FAULT(sys_enter, TP_PROTO(struct pt_regs *regs, long id), @@ -41,7 +41,7 @@ TRACE_EVENT_FN(sys_enter, TRACE_EVENT_FLAGS(sys_enter, TRACE_EVENT_FL_CAP_ANY) -TRACE_EVENT_FN(sys_exit, +TRACE_EVENT_FN_MAY_FAULT(sys_exit, TP_PROTO(struct pt_regs *regs, long ret), diff --git a/kernel/trace/trace_syscalls.c b/kernel/trace/trace_syscalls.c index 9c581d6da843..314666d663b6 100644 --- a/kernel/trace/trace_syscalls.c +++ b/kernel/trace/trace_syscalls.c @@ -299,6 +299,12 @@ static void ftrace_syscall_enter(void *data, struct pt_regs *regs, long id) int syscall_nr; int size; + /* +* Probe called with preemption enabled (may_fault), but ring buffer and +* per-cpu data require preemption to be disabled. +*/ + guard(preempt_notrace)(); + syscall_nr = trace_get_syscall_nr(current, regs); if (syscall_nr < 0 || syscall_nr >= NR_syscalls) return; @@ -338,6 +344,12 @@ static void ftrace_syscall_exit(void *data, struct pt_regs *regs, long ret) struct trace_event_buffer fbuffer; int syscall_nr; + /* +* Probe called with preemption enabled (may_fault), but ring buffer and +* per-cpu data require preemption to be disabled. +*/ + guard(preempt_notrace)(); + syscall_nr = trace_get_syscall_nr(current, regs); if (syscall_nr < 0 || syscall_nr >= NR_syscalls) return; @@ -376,8 +388,11 @@ static int reg_event_syscall_enter(struct trace_event_file *file, if (WARN_ON_ONCE(num < 0 || num >= NR_syscalls)) return -ENOSYS; mutex_lock(_trace_lock); - if (!tr->sys_refcount_enter) - ret = register_trace_sys_enter(ftrace_syscall_enter, tr); + if (!tr->sys_refcount_enter) { + ret = register_trace_prio_flags_sys_enter(ftrace_syscall_enter, tr, + TRACEPOINT_DEFAULT_PRIO, + TRACEPOINT_MAY_FAULT); + } if (!ret) { rcu_assign_pointer(tr->enter_syscall_files[num], file); tr->sys_refcount_enter++; @@ -414,8 +429,11 @@ static int reg_event_syscall_exit(struct trace_event_file *file, if (WARN_ON_ONCE(num < 0 || num >= NR_syscalls)) return -ENOSYS; mutex_lock(_trace_lock); - if (!tr->sys_refcount_exit) - ret = register_trace_sys_exit(ftrace_syscall_exit, tr); + if (!tr->sys_refcount_exit) { + ret = register_trace_prio_flags_sys_exit(ftrace_syscall_exit, tr, + TRACEPOINT_DEFAULT_PRIO, +TRACEPOINT_MAY_FAULT); + } if (!ret) { rcu_assign_pointer(tr->exit_syscall_files[num], file); tr->sys_refcount_exit++; @@ -582,6 +600,12 @@ static void perf_syscall_enter(void *ignore, struct pt_regs *regs, long id) int rctx; int size; + /* +* Probe called with preemption enabled (may_fault), but ring buffer and +* per-cpu data require preemption to be disabled. +*/ + guard(preempt_notrace)(); + syscall_nr = trace_get_syscall_nr(current, regs); if (syscall_nr < 0 || syscall_nr >= NR_syscalls) return; @@ -630,8 +654,11 @@ static int perf_sysenter_enable(struct trace_event_call *call) num = ((struct syscall_metadata *)call->data)->syscall_nr; mutex_lock(_trace_lock); - if (!sys_perf_refcount_enter) - ret =
[PATCH v5 7/8] tracing/perf: Add support for faultable tracepoints
In preparation for converting system call enter/exit instrumentation into faultable tracepoints, make sure that perf can handle registering to such tracepoints by explicitly disabling preemption within the perf tracepoint probes to respect the current expectations within perf ring buffer code. This change does not yet allow perf to take page faults per se within its probe, but allows its existing probes to connect to faultable tracepoints. Link: https://lore.kernel.org/lkml/20231002202531.3160-1-mathieu.desnoy...@efficios.com/ Co-developed-by: Michael Jeanson Signed-off-by: Mathieu Desnoyers Signed-off-by: Michael Jeanson Cc: Steven Rostedt Cc: Masami Hiramatsu Cc: Peter Zijlstra Cc: Alexei Starovoitov Cc: Yonghong Song Cc: Paul E. McKenney Cc: Ingo Molnar Cc: Arnaldo Carvalho de Melo Cc: Mark Rutland Cc: Alexander Shishkin Cc: Jiri Olsa Cc: Namhyung Kim Cc: b...@vger.kernel.org Cc: Joel Fernandes --- Changes since v4: - Use DEFINE_INACTIVE_GUARD. --- include/trace/perf.h | 22 -- 1 file changed, 20 insertions(+), 2 deletions(-) diff --git a/include/trace/perf.h b/include/trace/perf.h index 2c11181c82e0..161e1655b953 100644 --- a/include/trace/perf.h +++ b/include/trace/perf.h @@ -12,8 +12,8 @@ #undef __perf_task #define __perf_task(t) (__task = (t)) -#undef DECLARE_EVENT_CLASS -#define DECLARE_EVENT_CLASS(call, proto, args, tstruct, assign, print) \ +#undef _DECLARE_EVENT_CLASS +#define _DECLARE_EVENT_CLASS(call, proto, args, tstruct, assign, print, tp_flags) \ static notrace void\ perf_trace_##call(void *__data, proto) \ { \ @@ -28,6 +28,13 @@ perf_trace_##call(void *__data, proto) \ int __data_size;\ int rctx; \ \ + DEFINE_INACTIVE_GUARD(preempt_notrace, trace_event_guard); \ + \ + if ((tp_flags) & TRACEPOINT_MAY_FAULT) {\ + might_fault(); \ + activate_guard(preempt_notrace, trace_event_guard)(); \ + } \ + \ __data_size = trace_event_get_offsets_##call(&__data_offsets, args); \ \ head = this_cpu_ptr(event_call->perf_events); \ @@ -55,6 +62,17 @@ perf_trace_##call(void *__data, proto) \ head, __task);\ } +#undef DECLARE_EVENT_CLASS +#define DECLARE_EVENT_CLASS(call, proto, args, tstruct, assign, print) \ + _DECLARE_EVENT_CLASS(call, PARAMS(proto), PARAMS(args), \ +PARAMS(tstruct), PARAMS(assign), PARAMS(print), 0) + +#undef DECLARE_EVENT_CLASS_MAY_FAULT +#define DECLARE_EVENT_CLASS_MAY_FAULT(call, proto, args, tstruct, assign, print) \ + _DECLARE_EVENT_CLASS(call, PARAMS(proto), PARAMS(args), \ +PARAMS(tstruct), PARAMS(assign), PARAMS(print), \ +TRACEPOINT_MAY_FAULT) + /* * This part is compiled out, it is only here as a build time check * to make sure that if the tracepoint handling changes, the -- 2.39.2
[PATCH v5 5/8] tracing/ftrace: Add support for faultable tracepoints
In preparation for converting system call enter/exit instrumentation into faultable tracepoints, make sure that ftrace can handle registering to such tracepoints by explicitly disabling preemption within the ftrace tracepoint probes to respect the current expectations within ftrace ring buffer code. This change does not yet allow ftrace to take page faults per se within its probe, but allows its existing probes to connect to faultable tracepoints. Link: https://lore.kernel.org/lkml/20231002202531.3160-1-mathieu.desnoy...@efficios.com/ Co-developed-by: Michael Jeanson Signed-off-by: Michael Jeanson Signed-off-by: Mathieu Desnoyers Cc: Steven Rostedt Cc: Masami Hiramatsu Cc: Peter Zijlstra Cc: Alexei Starovoitov Cc: Yonghong Song Cc: Paul E. McKenney Cc: Ingo Molnar Cc: Arnaldo Carvalho de Melo Cc: Mark Rutland Cc: Alexander Shishkin Cc: Jiri Olsa Cc: Namhyung Kim Cc: b...@vger.kernel.org Cc: Joel Fernandes --- Changes since v4: - Use DEFINE_INACTIVE_GUARD. - Add brackets to multiline 'if' statements. --- include/trace/trace_events.h | 64 ++-- kernel/trace/trace_events.c | 28 2 files changed, 83 insertions(+), 9 deletions(-) diff --git a/include/trace/trace_events.h b/include/trace/trace_events.h index df590eea8ae4..c887f7b6fbe9 100644 --- a/include/trace/trace_events.h +++ b/include/trace/trace_events.h @@ -45,6 +45,16 @@ PARAMS(print)); \ DEFINE_EVENT(name, name, PARAMS(proto), PARAMS(args)); +#undef TRACE_EVENT_MAY_FAULT +#define TRACE_EVENT_MAY_FAULT(name, proto, args, tstruct, assign, print) \ + DECLARE_EVENT_CLASS_MAY_FAULT(name,\ +PARAMS(proto),\ +PARAMS(args), \ +PARAMS(tstruct), \ +PARAMS(assign), \ +PARAMS(print)); \ + DEFINE_EVENT(name, name, PARAMS(proto), PARAMS(args)); + #include "stages/stage1_struct_define.h" #undef DECLARE_EVENT_CLASS @@ -57,6 +67,11 @@ \ static struct trace_event_class event_class_##name; +#undef DECLARE_EVENT_CLASS_MAY_FAULT +#define DECLARE_EVENT_CLASS_MAY_FAULT(name, proto, args, tstruct, assign, print) \ + DECLARE_EVENT_CLASS(name, PARAMS(proto), PARAMS(args), \ + PARAMS(tstruct), PARAMS(assign), PARAMS(print)) + #undef DEFINE_EVENT #define DEFINE_EVENT(template, name, proto, args) \ static struct trace_event_call __used \ @@ -80,7 +95,7 @@ #undef TRACE_EVENT_FN_MAY_FAULT #define TRACE_EVENT_FN_MAY_FAULT(name, proto, args, tstruct, \ assign, print, reg, unreg) \ - TRACE_EVENT(name, PARAMS(proto), PARAMS(args), \ + TRACE_EVENT_MAY_FAULT(name, PARAMS(proto), PARAMS(args),\ PARAMS(tstruct), PARAMS(assign), PARAMS(print)) \ #undef TRACE_EVENT_FN_COND @@ -123,6 +138,11 @@ tstruct;\ }; +#undef DECLARE_EVENT_CLASS_MAY_FAULT +#define DECLARE_EVENT_CLASS_MAY_FAULT(call, proto, args, tstruct, assign, print) \ + DECLARE_EVENT_CLASS(call, PARAMS(proto), PARAMS(args), \ + PARAMS(tstruct), PARAMS(assign), PARAMS(print)) + #undef DEFINE_EVENT #define DEFINE_EVENT(template, name, proto, args) @@ -214,6 +234,11 @@ static struct trace_event_functions trace_event_type_funcs_##call = { \ .trace = trace_raw_output_##call, \ }; +#undef DECLARE_EVENT_CLASS_MAY_FAULT +#define DECLARE_EVENT_CLASS_MAY_FAULT(call, proto, args, tstruct, assign, print) \ + DECLARE_EVENT_CLASS(call, PARAMS(proto), PARAMS(args), \ + PARAMS(tstruct), PARAMS(assign), PARAMS(print)) + #undef DEFINE_EVENT_PRINT #define DEFINE_EVENT_PRINT(template, call, proto, args, print) \ static notrace enum print_line_t \ @@ -250,6 +275,11 @@ static struct trace_event_fields trace_event_fields_##call[] = { \ tstruct \ {} }; +#undef DECLARE_EVENT_CLASS_MAY_FAULT +#define DECLARE_EVENT_CLASS_MAY_FAULT(call, proto, args, tstruct, assign, print) \ + DECLARE_EVENT_CLASS(call, PARAMS(proto), PARAMS(args), \ + PARAMS(tstruct), PARAMS(assign), PARAMS(print)) + #undef DEFINE_EVENT_PRINT #define DEFINE_EVENT_PRINT(template, name, proto, args, print) @@ -271,6 +301,11 @@ static inline notrace int trace_event_get_offsets_##call( \ return __data_size; \
[PATCH v5 6/8] tracing/bpf-trace: Add support for faultable tracepoints
In preparation for converting system call enter/exit instrumentation into faultable tracepoints, make sure that bpf can handle registering to such tracepoints by explicitly disabling preemption within the bpf tracepoint probes to respect the current expectations within bpf tracing code. This change does not yet allow bpf to take page faults per se within its probe, but allows its existing probes to connect to faultable tracepoints. Link: https://lore.kernel.org/lkml/20231002202531.3160-1-mathieu.desnoy...@efficios.com/ Co-developed-by: Michael Jeanson Signed-off-by: Mathieu Desnoyers Signed-off-by: Michael Jeanson Cc: Steven Rostedt Cc: Masami Hiramatsu Cc: Peter Zijlstra Cc: Alexei Starovoitov Cc: Yonghong Song Cc: Paul E. McKenney Cc: Ingo Molnar Cc: Arnaldo Carvalho de Melo Cc: Mark Rutland Cc: Alexander Shishkin Cc: Jiri Olsa Cc: Namhyung Kim Cc: b...@vger.kernel.org Cc: Joel Fernandes --- Changes since v4: - Use DEFINE_INACTIVE_GUARD. - Add brackets to multiline 'if' statements. --- include/trace/bpf_probe.h | 20 kernel/trace/bpf_trace.c | 12 +--- 2 files changed, 25 insertions(+), 7 deletions(-) diff --git a/include/trace/bpf_probe.h b/include/trace/bpf_probe.h index e609cd7da47e..96c1269dd88c 100644 --- a/include/trace/bpf_probe.h +++ b/include/trace/bpf_probe.h @@ -42,17 +42,29 @@ /* tracepoints with more than 12 arguments will hit build error */ #define CAST_TO_U64(...) CONCATENATE(__CAST, COUNT_ARGS(__VA_ARGS__))(__VA_ARGS__) -#define __BPF_DECLARE_TRACE(call, proto, args) \ +#define __BPF_DECLARE_TRACE(call, proto, args, tp_flags) \ static notrace void\ __bpf_trace_##call(void *__data, proto) \ { \ struct bpf_prog *prog = __data; \ + \ + DEFINE_INACTIVE_GUARD(preempt_notrace, bpf_trace_guard);\ + \ + if ((tp_flags) & TRACEPOINT_MAY_FAULT) {\ + might_fault(); \ + activate_guard(preempt_notrace, bpf_trace_guard)(); \ + } \ + \ CONCATENATE(bpf_trace_run, COUNT_ARGS(args))(prog, CAST_TO_U64(args)); \ } #undef DECLARE_EVENT_CLASS #define DECLARE_EVENT_CLASS(call, proto, args, tstruct, assign, print) \ - __BPF_DECLARE_TRACE(call, PARAMS(proto), PARAMS(args)) + __BPF_DECLARE_TRACE(call, PARAMS(proto), PARAMS(args), 0) + +#undef DECLARE_EVENT_CLASS_MAY_FAULT +#define DECLARE_EVENT_CLASS_MAY_FAULT(call, proto, args, tstruct, assign, print) \ + __BPF_DECLARE_TRACE(call, PARAMS(proto), PARAMS(args), TRACEPOINT_MAY_FAULT) /* * This part is compiled out, it is only here as a build time check @@ -106,13 +118,13 @@ static inline void bpf_test_buffer_##call(void) \ #undef DECLARE_TRACE #define DECLARE_TRACE(call, proto, args) \ - __BPF_DECLARE_TRACE(call, PARAMS(proto), PARAMS(args)) \ + __BPF_DECLARE_TRACE(call, PARAMS(proto), PARAMS(args), 0) \ __DEFINE_EVENT(call, call, PARAMS(proto), PARAMS(args), 0) #undef DECLARE_TRACE_WRITABLE #define DECLARE_TRACE_WRITABLE(call, proto, args, size) \ __CHECK_WRITABLE_BUF_SIZE(call, PARAMS(proto), PARAMS(args), size) \ - __BPF_DECLARE_TRACE(call, PARAMS(proto), PARAMS(args)) \ + __BPF_DECLARE_TRACE(call, PARAMS(proto), PARAMS(args), 0) \ __DEFINE_EVENT(call, call, PARAMS(proto), PARAMS(args), size) #include TRACE_INCLUDE(TRACE_INCLUDE_FILE) diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c index 192de33d961f..873b0e885677 100644 --- a/kernel/trace/bpf_trace.c +++ b/kernel/trace/bpf_trace.c @@ -2443,9 +2443,15 @@ static int __bpf_probe_register(struct bpf_raw_event_map *btp, struct bpf_prog * if (prog->aux->max_tp_access > btp->writable_size) return -EINVAL; - return tracepoint_probe_register_prio_flags(tp, (void *)btp->bpf_func, - prog, TRACEPOINT_DEFAULT_PRIO, - TRACEPOINT_MAY_EXIST); + if (tp->flags & TRACEPOINT_MAY_FAULT) { + return tracepoint_probe_register_prio_flags(tp, (void *)btp->bpf_func, + prog, TRACEPOINT_DEFAULT_PRIO, + TRACEPOINT_MAY_EXIST | TRACEPOINT_MAY_FAULT); + } else { + return
[PATCH v5 2/8] cleanup.h guard: Rename DEFINE_ prefix to DECLARE_
The convention used in other kernel headers (e.g. wait.h, percpu-defs.h) is to use "DECLARE_" prefix for macros emitting externs, static inlines and type definitions. The "DEFINE_" prefix is used for variable definitions. In preparation to introduce a "DEFINE_INACTIVE_GUARD()" to actually define a guard variable, rename all the guard "DEFINE_" prefix to "DECLARE_". Signed-off-by: Mathieu Desnoyers Cc: Peter Zijlstra (Intel) Cc: Ingo Molnar --- drivers/cxl/core/cdat.c | 2 +- drivers/cxl/cxl.h | 2 +- drivers/gpio/gpiolib.h | 2 +- drivers/platform/x86/intel/pmc/core_ssram.c | 2 +- fs/fuse/virtio_fs.c | 2 +- fs/pstore/inode.c | 4 +- include/linux/bitmap.h | 2 +- include/linux/cleanup.h | 56 ++--- include/linux/cpu.h | 2 +- include/linux/cpumask.h | 2 +- include/linux/device.h | 6 +-- include/linux/file.h| 4 +- include/linux/firmware.h| 2 +- include/linux/gpio/driver.h | 4 +- include/linux/iio/iio.h | 4 +- include/linux/irqflags.h| 4 +- include/linux/mutex.h | 6 +-- include/linux/of.h | 2 +- include/linux/pci.h | 4 +- include/linux/percpu.h | 2 +- include/linux/preempt.h | 6 +-- include/linux/rcupdate.h| 2 +- include/linux/rwsem.h | 10 ++-- include/linux/sched/task.h | 4 +- include/linux/slab.h| 4 +- include/linux/spinlock.h| 38 +++--- include/linux/srcu.h| 2 +- include/sound/pcm.h | 6 +-- kernel/sched/core.c | 4 +- kernel/sched/sched.h| 16 +++--- lib/locking-selftest.c | 12 ++--- sound/core/control_led.c| 2 +- 32 files changed, 110 insertions(+), 110 deletions(-) diff --git a/drivers/cxl/core/cdat.c b/drivers/cxl/core/cdat.c index bb83867d9fec..689143566642 100644 --- a/drivers/cxl/core/cdat.c +++ b/drivers/cxl/core/cdat.c @@ -385,7 +385,7 @@ static void discard_dsmas(struct xarray *xa) } xa_destroy(xa); } -DEFINE_FREE(dsmas, struct xarray *, if (_T) discard_dsmas(_T)) +DECLARE_FREE(dsmas, struct xarray *, if (_T) discard_dsmas(_T)) void cxl_endpoint_parse_cdat(struct cxl_port *port) { diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h index 036d17db68e0..89cadb029d31 100644 --- a/drivers/cxl/cxl.h +++ b/drivers/cxl/cxl.h @@ -737,7 +737,7 @@ struct cxl_root *devm_cxl_add_root(struct device *host, const struct cxl_root_ops *ops); struct cxl_root *find_cxl_root(struct cxl_port *port); void put_cxl_root(struct cxl_root *cxl_root); -DEFINE_FREE(put_cxl_root, struct cxl_root *, if (_T) put_cxl_root(_T)) +DECLARE_FREE(put_cxl_root, struct cxl_root *, if (_T) put_cxl_root(_T)) int devm_cxl_enumerate_ports(struct cxl_memdev *cxlmd); void cxl_bus_rescan(void); diff --git a/drivers/gpio/gpiolib.h b/drivers/gpio/gpiolib.h index 8e0e211ebf08..17507a64c284 100644 --- a/drivers/gpio/gpiolib.h +++ b/drivers/gpio/gpiolib.h @@ -199,7 +199,7 @@ struct gpio_chip_guard { int idx; }; -DEFINE_CLASS(gpio_chip_guard, +DECLARE_CLASS(gpio_chip_guard, struct gpio_chip_guard, srcu_read_unlock(&_T.gdev->srcu, _T.idx), ({ diff --git a/drivers/platform/x86/intel/pmc/core_ssram.c b/drivers/platform/x86/intel/pmc/core_ssram.c index 1bde86c54eb9..115f16448406 100644 --- a/drivers/platform/x86/intel/pmc/core_ssram.c +++ b/drivers/platform/x86/intel/pmc/core_ssram.c @@ -29,7 +29,7 @@ #define LPM_REG_COUNT 28 #define LPM_MODE_OFFSET1 -DEFINE_FREE(pmc_core_iounmap, void __iomem *, iounmap(_T)); +DECLARE_FREE(pmc_core_iounmap, void __iomem *, iounmap(_T)); static u32 pmc_core_find_guid(struct pmc_info *list, const struct pmc_reg_map *map) { diff --git a/fs/fuse/virtio_fs.c b/fs/fuse/virtio_fs.c index bb3e941b9503..d062bafb294a 100644 --- a/fs/fuse/virtio_fs.c +++ b/fs/fuse/virtio_fs.c @@ -852,7 +852,7 @@ static void virtio_fs_cleanup_dax(void *data) put_dax(dax_dev); } -DEFINE_FREE(cleanup_dax, struct dax_dev *, if (!IS_ERR_OR_NULL(_T)) virtio_fs_cleanup_dax(_T)) +DECLARE_FREE(cleanup_dax, struct dax_dev *, if (!IS_ERR_OR_NULL(_T)) virtio_fs_cleanup_dax(_T)) static int virtio_fs_setup_dax(struct virtio_device *vdev, struct virtio_fs *fs) { diff --git a/fs/pstore/inode.c b/fs/pstore/inode.c index 56815799ce79..f34da47d26d4 100644 --- a/fs/pstore/inode.c +++ b/fs/pstore/inode.c @@ -35,7 +35,7 @@ static LIST_HEAD(records_list); static
[PATCH v5 4/8] tracing: Introduce faultable tracepoints
When invoked from system call enter/exit instrumentation, accessing user-space data is a common use-case for tracers. However, tracepoints currently disable preemption around iteration on the registered tracepoint probes and invocation of the probe callbacks, which prevents tracers from handling page faults. Extend the tracepoint and trace event APIs to allow defining a faultable tracepoint which invokes its callback with preemption enabled. Also extend the tracepoint API to allow tracers to request specific probes to be connected to those faultable tracepoints. When the TRACEPOINT_MAY_FAULT flag is provided on registration, the probe callback will be called with preemption enabled, and is allowed to take page faults. Faultable probes can only be registered on faultable tracepoints and non-faultable probes on non-faultable tracepoints. The tasks trace rcu mechanism is used to synchronize read-side marshalling of the registered probes with respect to faultable probes unregistration and teardown. Link: https://lore.kernel.org/lkml/20231002202531.3160-1-mathieu.desnoy...@efficios.com/ Co-developed-by: Michael Jeanson Signed-off-by: Mathieu Desnoyers Signed-off-by: Michael Jeanson Cc: Steven Rostedt Cc: Masami Hiramatsu Cc: Peter Zijlstra Cc: Alexei Starovoitov Cc: Yonghong Song Cc: Paul E. McKenney Cc: Ingo Molnar Cc: Arnaldo Carvalho de Melo Cc: Mark Rutland Cc: Alexander Shishkin Cc: Jiri Olsa Cc: Namhyung Kim Cc: b...@vger.kernel.org Cc: Joel Fernandes --- Changes since v1: - Cleanup __DO_TRACE() implementation. - Rename "sleepable tracepoints" to "faultable tracepoints", MAYSLEEP to MAYFAULT, and use might_fault() rather than might_sleep(), to properly convey that the tracepoints are meant to be able to take a page fault, which requires to be able to sleep *and* to hold the mmap_sem. Changes since v2: - Rename MAYFAULT to MAY_FAULT. - Rebased on 6.5.5. - Introduce MAY_EXIST tracepoint flag. Changes since v3: - Rebased on 6.6.2. Changes since v4: - Rebased on 6.9.6. - Simplify flag check in tracepoint_probe_register_prio_flags(). - Update MAY_EXIST flag description. --- include/linux/tracepoint-defs.h | 14 ++ include/linux/tracepoint.h | 88 +++-- include/trace/define_trace.h| 7 +++ include/trace/trace_events.h| 6 +++ init/Kconfig| 1 + kernel/trace/bpf_trace.c| 5 +- kernel/trace/trace_fprobe.c | 5 +- kernel/tracepoint.c | 65 ++-- 8 files changed, 136 insertions(+), 55 deletions(-) diff --git a/include/linux/tracepoint-defs.h b/include/linux/tracepoint-defs.h index 4dc4955f0fbf..94e39c86b49f 100644 --- a/include/linux/tracepoint-defs.h +++ b/include/linux/tracepoint-defs.h @@ -29,6 +29,19 @@ struct tracepoint_func { int prio; }; +/** + * enum tracepoint_flags - Tracepoint flags + * @TRACEPOINT_MAY_EXIST: On registration, don't warn if the tracepoint is + *already registered. + * @TRACEPOINT_MAY_FAULT: The tracepoint probe callback will be called with + *preemption enabled, and is allowed to take page + *faults. + */ +enum tracepoint_flags { + TRACEPOINT_MAY_EXIST = (1 << 0), + TRACEPOINT_MAY_FAULT = (1 << 1), +}; + struct tracepoint { const char *name; /* Tracepoint name */ struct static_key key; @@ -39,6 +52,7 @@ struct tracepoint { int (*regfunc)(void); void (*unregfunc)(void); struct tracepoint_func __rcu *funcs; + unsigned int flags; }; #ifdef CONFIG_HAVE_ARCH_PREL32_RELOCATIONS diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h index 689b6d71590e..eaf8c00b30a3 100644 --- a/include/linux/tracepoint.h +++ b/include/linux/tracepoint.h @@ -18,6 +18,7 @@ #include #include #include +#include #include #include @@ -41,17 +42,10 @@ extern int tracepoint_probe_register_prio(struct tracepoint *tp, void *probe, void *data, int prio); extern int -tracepoint_probe_register_prio_may_exist(struct tracepoint *tp, void *probe, void *data, -int prio); +tracepoint_probe_register_prio_flags(struct tracepoint *tp, void *probe, void *data, + int prio, unsigned int flags); extern int tracepoint_probe_unregister(struct tracepoint *tp, void *probe, void *data); -static inline int -tracepoint_probe_register_may_exist(struct tracepoint *tp, void *probe, - void *data) -{ - return tracepoint_probe_register_prio_may_exist(tp, probe, data, - TRACEPOINT_DEFAULT_PRIO); -} extern void for_each_kernel_tracepoint(void (*fct)(struct tracepoint *tp, void *priv), void *priv); @@ -90,6 +84,7 @@ int unregister_tracepoint_module_notifier(struct notifier_block *nb) #ifdef CONFIG_TRACEPOINTS
[PATCH v5 0/8] Faultable Tracepoints
Wire up the system call tracepoints with Tasks Trace RCU to allow the ftrace, perf, and eBPF tracers to handle page faults. This series does the initial wire-up allowing tracers to handle page faults, but leaves out the actual handling of said page faults as future work. I have tested this against a feature branch of lttng-modules which implements handling of page faults for the filename argument of the openat(2) system call. This v5 addresses comments from the previous round of review [1]. Steven Rostedt suggested separating tracepoints into two separate sections. It is unclear how that approach would prove to be an improvement over the currently proposed approach, so those changes were not incorporated. See [2] for my detailed reply. In the previous round, Peter Zijlstra suggested use of SRCU rather than Tasks Trace RCU. See my reply about the distinction between SRCU and Tasks Trace RCU [3] and this explanation from Paul E. McKenney about the purpose of Tasks Trace RCU [4]. The macros DEFINE_INACTIVE_GUARD and activate_guard are added to cleanup.h for use in the __DO_TRACE() macro. Those appear to be more flexible than the guard_if() proposed by Peter Zijlstra in the previous round of review [5]. This series is based on kernel v6.9.6. Thanks, Mathieu Link: https://lore.kernel.org/lkml/20231120205418.334172-1-mathieu.desnoy...@efficios.com/ # [1] Link: https://lore.kernel.org/lkml/e4e9a2bc-1776-4b51-aba4-a147795a5...@efficios.com/ # [2] Link: https://lore.kernel.org/lkml/a0ac5f77-411e-4562-9863-81196238f...@efficios.com/ # [3] Link: https://lore.kernel.org/lkml/ba543d44-9302-4115-ac4f-d4e9f8d98a90@paulmck-laptop/ # [4] Link: https://lore.kernel.org/lkml/20231120221524.gd8...@noisy.programming.kicks-ass.net/ # [5] Cc: Peter Zijlstra Cc: Alexei Starovoitov Cc: Yonghong Song Cc: Paul E. McKenney Cc: Ingo Molnar Cc: Arnaldo Carvalho de Melo Cc: Mark Rutland Cc: Alexander Shishkin Cc: Jiri Olsa Cc: Namhyung Kim Cc: b...@vger.kernel.org Cc: Joel Fernandes Mathieu Desnoyers (8): cleanup.h: Header include guard should match header name cleanup.h guard: Rename DEFINE_ prefix to DECLARE_ cleanup.h: Introduce DEFINE_INACTIVE_GUARD and activate_guard tracing: Introduce faultable tracepoints tracing/ftrace: Add support for faultable tracepoints tracing/bpf-trace: Add support for faultable tracepoints tracing/perf: Add support for faultable tracepoints tracing: Convert sys_enter/exit to faultable tracepoints drivers/cxl/core/cdat.c | 2 +- drivers/cxl/cxl.h | 2 +- drivers/gpio/gpiolib.h | 2 +- drivers/platform/x86/intel/pmc/core_ssram.c | 2 +- fs/fuse/virtio_fs.c | 2 +- fs/pstore/inode.c | 4 +- include/linux/bitmap.h | 2 +- include/linux/cleanup.h | 85 include/linux/cpu.h | 2 +- include/linux/cpumask.h | 2 +- include/linux/device.h | 6 +- include/linux/file.h| 4 +- include/linux/firmware.h| 2 +- include/linux/gpio/driver.h | 4 +- include/linux/iio/iio.h | 4 +- include/linux/irqflags.h| 4 +- include/linux/mutex.h | 6 +- include/linux/of.h | 2 +- include/linux/pci.h | 4 +- include/linux/percpu.h | 2 +- include/linux/preempt.h | 6 +- include/linux/rcupdate.h| 2 +- include/linux/rwsem.h | 10 +-- include/linux/sched/task.h | 4 +- include/linux/slab.h| 4 +- include/linux/spinlock.h| 38 - include/linux/srcu.h| 2 +- include/linux/tracepoint-defs.h | 14 include/linux/tracepoint.h | 88 +++-- include/sound/pcm.h | 6 +- include/trace/bpf_probe.h | 20 - include/trace/define_trace.h| 7 ++ include/trace/events/syscalls.h | 4 +- include/trace/perf.h| 22 +- include/trace/trace_events.h| 68 +++- init/Kconfig| 1 + kernel/sched/core.c | 4 +- kernel/sched/sched.h| 16 ++-- kernel/trace/bpf_trace.c| 11 ++- kernel/trace/trace_events.c | 28 +-- kernel/trace/trace_fprobe.c | 5 +- kernel/trace/trace_syscalls.c | 52 ++-- kernel/tracepoint.c | 65 +-- lib/locking-selftest.c | 12 +-- sound/core/control_led.c| 2 +- 45 files changed, 441
[PATCH v5 3/8] cleanup.h: Introduce DEFINE_INACTIVE_GUARD and activate_guard
To cover scenarios where the scope of the guard differs from the scope of its activation, introduce DEFINE_INACTIVE_GUARD() and activate_guard(). Here is an example use for a conditionally activated guard variable: void func(bool a) { DEFINE_INACTIVE_GUARD(preempt_notrace, myguard); [...] if (a) { might_sleep(); activate_guard(preempt_notrace, myguard)(); } [ protected code ] } Signed-off-by: Mathieu Desnoyers Cc: Peter Zijlstra (Intel) Cc: Ingo Molnar Cc: Linus Torvalds Cc: Kees Cook Cc: Greg KH Cc: Sean Christopherson --- include/linux/cleanup.h | 23 ++- 1 file changed, 22 insertions(+), 1 deletion(-) diff --git a/include/linux/cleanup.h b/include/linux/cleanup.h index 04f03ad5f25d..d6a3d8099d77 100644 --- a/include/linux/cleanup.h +++ b/include/linux/cleanup.h @@ -146,12 +146,20 @@ static inline class_##_name##_t class_##_name##ext##_constructor(_init_args) \ * similar to scoped_guard(), except it does fail when the lock * acquire fails. * + * DEFINE_INACTIVE_GUARD(name, var): + * define an inactive guard variable in a given scope, initialized to NULL. + * + * activate_guard(name, var)(args...): + * activate a guard variable with its constructor, if it is not already + * activated. */ #define DECLARE_GUARD(_name, _type, _lock, _unlock) \ DECLARE_CLASS(_name, _type, if (_T) { _unlock; }, ({ _lock; _T; }), _type _T); \ static inline void * class_##_name##_lock_ptr(class_##_name##_t *_T) \ - { return *_T; } + { return *_T; } \ + static inline class_##_name##_t class_##_name##_null(void) \ + { return NULL; } #define DECLARE_GUARD_COND(_name, _ext, _condlock) \ EXTEND_CLASS(_name, _ext, \ @@ -175,6 +183,14 @@ static inline class_##_name##_t class_##_name##ext##_constructor(_init_args) \ if (!__guard_ptr(_name)()) _fail; \ else +#define DEFINE_INACTIVE_GUARD(_name, _var) \ + class_##_name##_t _var __cleanup(class_##_name##_destructor) = \ + class_##_name##_null() + +#define activate_guard(_name, _var) \ + if (!class_##_name##_lock_ptr(&(_var))) \ + _var = class_##_name##_constructor + /* * Additional helper macros for generating lock guards with types, either for * locks that don't have a native type (eg. RCU, preempt) or those that need a @@ -209,6 +225,11 @@ static inline void class_##_name##_destructor(class_##_name##_t *_T) \ static inline void *class_##_name##_lock_ptr(class_##_name##_t *_T)\ { \ return _T->lock;\ +} \ +static inline class_##_name##_t class_##_name##_null(void) \ +{ \ + class_##_name##_t _t = { .lock = NULL };\ + return _t; \ } -- 2.39.2
[PATCH v5 1/8] cleanup.h: Header include guard should match header name
The include guard should match the header name. Rename __LINUX_GUARDS_H to __LINUX_CLEANUP_H. Signed-off-by: Mathieu Desnoyers Cc: Peter Zijlstra (Intel) Cc: Ingo Molnar --- include/linux/cleanup.h | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/include/linux/cleanup.h b/include/linux/cleanup.h index c2d09bc4f976..4cf8ad5d27a3 100644 --- a/include/linux/cleanup.h +++ b/include/linux/cleanup.h @@ -1,6 +1,6 @@ /* SPDX-License-Identifier: GPL-2.0 */ -#ifndef __LINUX_GUARDS_H -#define __LINUX_GUARDS_H +#ifndef __LINUX_CLEANUP_H +#define __LINUX_CLEANUP_H #include @@ -247,4 +247,4 @@ __DEFINE_LOCK_GUARD_0(_name, _lock) { return class_##_name##_lock_ptr(_T); } -#endif /* __LINUX_GUARDS_H */ +#endif /* __LINUX_CLEANUP_H */ -- 2.39.2
Re: [RFC PATCH v2] ptp: Add vDSO-style vmclock support
On Thu, 2024-06-27 at 15:50 +0200, Peter Hilber wrote: > On 25.06.24 21:01, David Woodhouse wrote: > > From: David Woodhouse > > > > The vmclock "device" provides a shared memory region with precision clock > > information. By using shared memory, it is safe across Live Migration. > > > > Like the KVM PTP clock, this can convert TSC-based cross timestamps into > > KVM clock values. Unlike the KVM PTP clock, it does so only when such is > > actually helpful. > > > > The memory region of the device is also exposed to userspace so it can be > > read or memory mapped by application which need reliable notification of > > clock disruptions. > > > > Signed-off-by: David Woodhouse > > --- > > > > v2: > > • Add gettimex64() support > > • Convert TSC values to KVM clock when appropriate > > • Require int128 support > > • Add counter_period_shift > > • Add timeout when seq_count is invalid > > • Add flags field > > • Better comments in vmclock ABI structure > > • Explicitly forbid smearing (as clock rates would need to change) > > Leap second smearing information could still be conveyed through the > vmclock_abi. AFAIU, to cover the popular smearing variants, it should be > enough to indicate whether the driver should apply linear or cosine > smearing, and the start time and end time. Yes. The clock information actually conveyed through the {counter, time, rate} tuple should never be smeared, and should only ever be UTC. But we could provide a hint to the guest operating system about what type of smearing to perform, *if* it chooses to offer a clock other than the standard CLOCK_REALTIME to its users. I already added a flags field, so this might look something like: /* * Smearing flags. The UTC clock exposed through this structure * is only ever true UTC, but a guest operating system may * choose to offer a monotonic smeared clock to its users. This * merely offers a hint about what kind of smearing to perform, * for consistency with systems in the nearby environment. */ #define VMCLOCK_FLAGS_SMEAR_UTC_SLS (1<<5) /* draft-kuhn-leapsecond-00.txt */ (UTC-SLS is probably a bad example but are there formal definitions for anything else?) > > But we > > drivers/ptp/Kconfig | 13 + > > drivers/ptp/Makefile | 1 + > > drivers/ptp/ptp_vmclock.c | 516 +++ > > include/uapi/linux/vmclock.h | 138 ++ > > 4 files changed, 668 insertions(+) > > create mode 100644 drivers/ptp/ptp_vmclock.c > > create mode 100644 include/uapi/linux/vmclock.h > > > > [...] > > > + > > +/* > > + * Multiply a 64-bit count by a 64-bit tick 'period' in units of seconds > > >> 64 > > + * and add the fractional second part of the reference time. > > + * > > + * The result is a 128-bit value, the top 64 bits of which are seconds, and > > + * the low 64 bits are (seconds >> 64). > > + * > > + * If __int128 isn't available, perform the calculation 32 bits at a time > > to > > + * avoid overflow. > > + */ > > +static inline uint64_t mul_u64_u64_shr_add_u64(uint64_t *res_hi, uint64_t > > delta, > > + uint64_t period, uint8_t > > shift, > > + uint64_t frac_sec) > > +{ > > + unsigned __int128 res = (unsigned __int128)delta * period; > > + > > + res >>= shift; > > + res += frac_sec; > > + *res_hi = res >> 64; > > + return (uint64_t)res; > > +} > > + > > +static int vmclock_get_crosststamp(struct vmclock_state *st, > > + struct ptp_system_timestamp *sts, > > + struct system_counterval_t > > *system_counter, > > + struct timespec64 *tspec) > > +{ > > + ktime_t deadline = ktime_add(ktime_get(), VMCLOCK_MAX_WAIT); > > + struct system_time_snapshot systime_snapshot; > > + uint64_t cycle, delta, seq, frac_sec; > > + > > +#ifdef CONFIG_X86 > > + /* > > + * We'd expect the hypervisor to know this and to report the clock > > + * status as VMCLOCK_STATUS_UNRELIABLE. But be paranoid. > > + */ > > + if (check_tsc_unstable()) > > + return -EINVAL; > > +#endif > > + > > + while (1) { > > + seq = st->clk->seq_count & ~1ULL; > > + virt_rmb(); > > + > > + if (st->clk->clock_status == VMCLOCK_STATUS_UNRELIABLE) > > + return -EINVAL; > > + > > + /* > > + * When invoked for gettimex64(), fill in the pre/post > > system > > + * times. The simple case is when system time is based on > > the > > + * same counter as st->cs_id, in which case all three times > > + * will be derived from the *same* counter value. > > + * > > + * If the system isn't using the same counter, then
Re: [PATCH] uprobe: Do not use UPROBE_SWBP_INSN as static initializer
On Thu, 27 Jun 2024 15:44:16 +0200 Jiri Olsa wrote: > On Fri, Jun 21, 2024 at 03:17:58PM +0200, Jiri Olsa wrote: > > On Fri, Jun 21, 2024 at 02:01:50PM +0200, Oleg Nesterov wrote: > > > On 06/20, Andrii Nakryiko wrote: > > > > > > > > On Thu, Jun 20, 2024 at 12:40 PM Oleg Nesterov wrote: > > > > > > > > > > But I can't understand what does it do, it calls emit_break() and > > > > > git grep -w emit_break finds nothing. > > > > > > > > > > > > > It's DEF_EMIT_REG0I15_FORMAT(break, break_op) in > > > > arch/loongarch/include/asm/inst.h > > > > > > > > A bunch of macro magic, but in the end it produces some constant > > > > value, of course. > > > > > > I see, thanks! > > > > > > Then perhaps something like below? > > > > lgtm, added loong arch list/folks > > ping > > Oleg, do you want to send formal patch? > > thanks, > jirka Yes, can you send v2 patch? Thank you, > > > > > for context: > > https://lore.kernel.org/bpf/20240614174822.GA1185149@thelio-3990X/ > > > > thanks, > > jirka > > > > > > > > Oleg. > > > > > > > > > --- x/arch/loongarch/include/asm/uprobes.h > > > +++ x/arch/loongarch/include/asm/uprobes.h > > > @@ -9,7 +9,7 @@ typedef u32 uprobe_opcode_t; > > > #define MAX_UINSN_BYTES 8 > > > #define UPROBE_XOL_SLOT_BYTESMAX_UINSN_BYTES > > > > > > -#define UPROBE_SWBP_INSN larch_insn_gen_break(BRK_UPROBE_BP) > > > +#define UPROBE_SWBP_INSN (uprobe_opcode_t)(BRK_UPROBE_BP | (break_op << > > > 15)) > > > #define UPROBE_SWBP_INSN_SIZELOONGARCH_INSN_SIZE > > > > > > #define UPROBE_XOLBP_INSNlarch_insn_gen_break(BRK_UPROBE_XOLBP) > > > --- x/arch/loongarch/kernel/uprobes.c > > > +++ x/arch/loongarch/kernel/uprobes.c > > > @@ -7,6 +7,13 @@ > > > > > > #define UPROBE_TRAP_NR UINT_MAX > > > > > > +static __init int __ck_insn(void) > > > +{ > > > + BUG_ON(UPROBE_SWBP_INSN != larch_insn_gen_break(BRK_UPROBE_BP)); > > > + return 0; > > > +} > > > +late_initcall(__ck_insn); > > > + > > > int arch_uprobe_analyze_insn(struct arch_uprobe *auprobe, > > >struct mm_struct *mm, unsigned long addr) > > > { > > > -- Masami Hiramatsu (Google)
Re: [PATCH v1 4/4] trace: platform/x86/intel/ifs: Add SBAF trace support
On 6/27/24 6:56 AM, Steven Rostedt wrote: > On Thu, 27 Jun 2024 02:35:16 + > Kuppuswamy Sathyanarayanan wrote: > >> From: Jithu Joseph >> >> Add tracing support for the SBAF IFS tests, which may be useful for >> debugging systems that fail these tests. Log details like test content >> batch number, SBAF bundle ID, program index and the exact errors or >> warnings encountered by each HT thread during the test. >> >> Reviewed-by: Ashok Raj >> Reviewed-by: Tony Luck >> Signed-off-by: Jithu Joseph >> Signed-off-by: Kuppuswamy Sathyanarayanan >> >> --- >> include/trace/events/intel_ifs.h | 27 >> drivers/platform/x86/intel/ifs/runtest.c | 1 + >> 2 files changed, 28 insertions(+) >> >> diff --git a/include/trace/events/intel_ifs.h >> b/include/trace/events/intel_ifs.h >> index 0d88ebf2c980..9c7413de432b 100644 >> --- a/include/trace/events/intel_ifs.h >> +++ b/include/trace/events/intel_ifs.h >> @@ -35,6 +35,33 @@ TRACE_EVENT(ifs_status, >> __entry->status) >> ); >> >> +TRACE_EVENT(ifs_sbaf, >> + >> +TP_PROTO(int batch, union ifs_sbaf activate, union ifs_sbaf_status >> status), >> + >> +TP_ARGS(batch, activate, status), >> + >> +TP_STRUCT__entry( >> +__field(int,batch ) >> +__field(u64,status ) > Please put the 64 bit status field before the 32 bit batch field, > otherwise this will likely create a 4 byte hole between the two fields. > Space on the ring buffer is expensive real-estate. Agree. I will fix this in next version. > > -- Steve > >> +__field(u16,bundle ) >> +__field(u16,pgm ) >> +), >> + >> +TP_fast_assign( >> +__entry->batch = batch; >> +__entry->bundle = activate.bundle_idx; >> +__entry->pgm= activate.pgm_idx; >> +__entry->status = status.data; >> +), >> + >> +TP_printk("batch: 0x%.2x, bundle_idx: 0x%.4x, pgm_idx: 0x%.4x, status: >> 0x%.16llx", >> +__entry->batch, >> +__entry->bundle, >> +__entry->pgm, >> +__entry->status) >> +); >> + >> #endif /* _TRACE_IFS_H */ >> >> /* This part must be outside protection */ >> diff --git a/drivers/platform/x86/intel/ifs/runtest.c >> b/drivers/platform/x86/intel/ifs/runtest.c >> index bdb31b2f45b4..69ee0eb72025 100644 >> --- a/drivers/platform/x86/intel/ifs/runtest.c >> +++ b/drivers/platform/x86/intel/ifs/runtest.c >> @@ -530,6 +530,7 @@ static int dosbaf(void *data) >> */ >> wrmsrl(MSR_ACTIVATE_SBAF, run_params->activate->data); >> rdmsrl(MSR_SBAF_STATUS, status.data); >> +trace_ifs_sbaf(ifsd->cur_batch, *run_params->activate, status); >> >> /* Pass back the result of the test */ >> if (cpu == first) -- Sathyanarayanan Kuppuswamy Linux Kernel Developer
Re: [PATCH v1 4/4] trace: platform/x86/intel/ifs: Add SBAF trace support
On Thu, 27 Jun 2024 02:35:16 + Kuppuswamy Sathyanarayanan wrote: > From: Jithu Joseph > > Add tracing support for the SBAF IFS tests, which may be useful for > debugging systems that fail these tests. Log details like test content > batch number, SBAF bundle ID, program index and the exact errors or > warnings encountered by each HT thread during the test. > > Reviewed-by: Ashok Raj > Reviewed-by: Tony Luck > Signed-off-by: Jithu Joseph > Signed-off-by: Kuppuswamy Sathyanarayanan > > --- > include/trace/events/intel_ifs.h | 27 > drivers/platform/x86/intel/ifs/runtest.c | 1 + > 2 files changed, 28 insertions(+) > > diff --git a/include/trace/events/intel_ifs.h > b/include/trace/events/intel_ifs.h > index 0d88ebf2c980..9c7413de432b 100644 > --- a/include/trace/events/intel_ifs.h > +++ b/include/trace/events/intel_ifs.h > @@ -35,6 +35,33 @@ TRACE_EVENT(ifs_status, > __entry->status) > ); > > +TRACE_EVENT(ifs_sbaf, > + > + TP_PROTO(int batch, union ifs_sbaf activate, union ifs_sbaf_status > status), > + > + TP_ARGS(batch, activate, status), > + > + TP_STRUCT__entry( > + __field(int,batch ) > + __field(u64,status ) Please put the 64 bit status field before the 32 bit batch field, otherwise this will likely create a 4 byte hole between the two fields. Space on the ring buffer is expensive real-estate. -- Steve > + __field(u16,bundle ) > + __field(u16,pgm ) > + ), > + > + TP_fast_assign( > + __entry->batch = batch; > + __entry->bundle = activate.bundle_idx; > + __entry->pgm= activate.pgm_idx; > + __entry->status = status.data; > + ), > + > + TP_printk("batch: 0x%.2x, bundle_idx: 0x%.4x, pgm_idx: 0x%.4x, status: > 0x%.16llx", > + __entry->batch, > + __entry->bundle, > + __entry->pgm, > + __entry->status) > +); > + > #endif /* _TRACE_IFS_H */ > > /* This part must be outside protection */ > diff --git a/drivers/platform/x86/intel/ifs/runtest.c > b/drivers/platform/x86/intel/ifs/runtest.c > index bdb31b2f45b4..69ee0eb72025 100644 > --- a/drivers/platform/x86/intel/ifs/runtest.c > +++ b/drivers/platform/x86/intel/ifs/runtest.c > @@ -530,6 +530,7 @@ static int dosbaf(void *data) >*/ > wrmsrl(MSR_ACTIVATE_SBAF, run_params->activate->data); > rdmsrl(MSR_SBAF_STATUS, status.data); > + trace_ifs_sbaf(ifsd->cur_batch, *run_params->activate, status); > > /* Pass back the result of the test */ > if (cpu == first)
Re: [RFC PATCH v2] ptp: Add vDSO-style vmclock support
On 25.06.24 21:01, David Woodhouse wrote: > From: David Woodhouse > > The vmclock "device" provides a shared memory region with precision clock > information. By using shared memory, it is safe across Live Migration. > > Like the KVM PTP clock, this can convert TSC-based cross timestamps into > KVM clock values. Unlike the KVM PTP clock, it does so only when such is > actually helpful. > > The memory region of the device is also exposed to userspace so it can be > read or memory mapped by application which need reliable notification of > clock disruptions. > > Signed-off-by: David Woodhouse > --- > > v2: > • Add gettimex64() support > • Convert TSC values to KVM clock when appropriate > • Require int128 support > • Add counter_period_shift > • Add timeout when seq_count is invalid > • Add flags field > • Better comments in vmclock ABI structure > • Explicitly forbid smearing (as clock rates would need to change) Leap second smearing information could still be conveyed through the vmclock_abi. AFAIU, to cover the popular smearing variants, it should be enough to indicate whether the driver should apply linear or cosine smearing, and the start time and end time. > > drivers/ptp/Kconfig | 13 + > drivers/ptp/Makefile | 1 + > drivers/ptp/ptp_vmclock.c| 516 +++ > include/uapi/linux/vmclock.h | 138 ++ > 4 files changed, 668 insertions(+) > create mode 100644 drivers/ptp/ptp_vmclock.c > create mode 100644 include/uapi/linux/vmclock.h > [...] > + > +/* > + * Multiply a 64-bit count by a 64-bit tick 'period' in units of seconds >> > 64 > + * and add the fractional second part of the reference time. > + * > + * The result is a 128-bit value, the top 64 bits of which are seconds, and > + * the low 64 bits are (seconds >> 64). > + * > + * If __int128 isn't available, perform the calculation 32 bits at a time to > + * avoid overflow. > + */ > +static inline uint64_t mul_u64_u64_shr_add_u64(uint64_t *res_hi, uint64_t > delta, > +uint64_t period, uint8_t shift, > +uint64_t frac_sec) > +{ > + unsigned __int128 res = (unsigned __int128)delta * period; > + > + res >>= shift; > + res += frac_sec; > + *res_hi = res >> 64; > + return (uint64_t)res; > +} > + > +static int vmclock_get_crosststamp(struct vmclock_state *st, > +struct ptp_system_timestamp *sts, > +struct system_counterval_t *system_counter, > +struct timespec64 *tspec) > +{ > + ktime_t deadline = ktime_add(ktime_get(), VMCLOCK_MAX_WAIT); > + struct system_time_snapshot systime_snapshot; > + uint64_t cycle, delta, seq, frac_sec; > + > +#ifdef CONFIG_X86 > + /* > + * We'd expect the hypervisor to know this and to report the clock > + * status as VMCLOCK_STATUS_UNRELIABLE. But be paranoid. > + */ > + if (check_tsc_unstable()) > + return -EINVAL; > +#endif > + > + while (1) { > + seq = st->clk->seq_count & ~1ULL; > + virt_rmb(); > + > + if (st->clk->clock_status == VMCLOCK_STATUS_UNRELIABLE) > + return -EINVAL; > + > + /* > + * When invoked for gettimex64(), fill in the pre/post system > + * times. The simple case is when system time is based on the > + * same counter as st->cs_id, in which case all three times > + * will be derived from the *same* counter value. > + * > + * If the system isn't using the same counter, then the value > + * from ktime_get_snapshot() will still be used as pre_ts, and > + * ptp_read_system_postts() is called to populate postts after > + * calling get_cycles(). > + * > + * The conversion to timespec64 happens further down, outside > + * the seq_count loop. > + */ > + if (sts) { > + ktime_get_snapshot(_snapshot); > + if (systime_snapshot.cs_id == st->cs_id) { > + cycle = systime_snapshot.cycles; > + } else { > + cycle = get_cycles(); > + ptp_read_system_postts(sts); > + } > + } else > + cycle = get_cycles(); > + > + delta = cycle - st->clk->counter_value; AFAIU in the general case this needs to be masked for non 64-bit counters. > + > + frac_sec = mul_u64_u64_shr_add_u64(>tv_sec, delta, > + > st->clk->counter_period_frac_sec, > + > st->clk->counter_period_shift, > +st->clk->utc_time_frac_sec);
Re: [RFC PATCH v3 0/7] Add virtio_rtc module and related changes
On 21.06.24 10:45, David Woodhouse wrote: > On Thu, 2024-06-20 at 17:19 +0100, David Woodhouse wrote: >> >>> + + /* Counter frequency, and error margin. Units of (second >> 64) */ + uint64_t counter_period_frac_sec; >>> >>> AFAIU this might limit the precision in case of high counter frequencies. >>> Could the unit be aligned to the expected frequency band of counters? >> >> This field indicates the period of a single tick, in units of 1>>64 of >> a second. That's about 5.4e-20 seconds, or 54 zeptoseconds? >> >> Can you walk me through a calculation where you believe that level of >> precision is insufficient? >> >> I guess the precision matters if the structure isn't updated for a long >> period of time, and the delta between the current counter and the >> snapshot is high? That's a *lot* of 54 zeptosecondses? But you really >> would need a *lot* of them before you care? And if nobody's been >> calibrating your counter for that long, surely you have bigger worries? >> >> Am I missing something there? > > Hm, that was a bit rushed at the end of the day; let's take a better look... > > Let's take a hypothetical example of a 100GHz counter. That's two > orders of magnitude more than today's Arm arch counter. > > The period of such a counter would be 10 picoseconds. > > (Let's ignore the question of how far light actually travels in that > time and how *realistic* that example is, for the moment.) > > It turns out that at that rate, there *are* a lot of 54 zeptosecondses > of precision loss in the day. It could be half a millisecond a day, or > 20µs an hour. > > That particular example of 10 picoseconds is 184467440.7370955 > (seconds>>64) which could be truncated to 184467440 — losing about 4PPB > (a third of a millisecond a day; 14µs an hour). > > So yeah, I suppose a 'shift' field could make sense. It's easy enough > to consume on the guest side as it doesn't really perturb the 128-bit > multiplication very much; especially if we don't let it be negative. > > And implementations *can* just set it to zero. It hurts nobody. > > Or were you thinking of just using a fixed shift like (seconds>>80) > instead? The 'shift' field should be fine.
Re: [PATCH] uprobe: Do not use UPROBE_SWBP_INSN as static initializer
On Fri, Jun 21, 2024 at 03:17:58PM +0200, Jiri Olsa wrote: > On Fri, Jun 21, 2024 at 02:01:50PM +0200, Oleg Nesterov wrote: > > On 06/20, Andrii Nakryiko wrote: > > > > > > On Thu, Jun 20, 2024 at 12:40 PM Oleg Nesterov wrote: > > > > > > > > But I can't understand what does it do, it calls emit_break() and > > > > git grep -w emit_break finds nothing. > > > > > > > > > > It's DEF_EMIT_REG0I15_FORMAT(break, break_op) in > > > arch/loongarch/include/asm/inst.h > > > > > > A bunch of macro magic, but in the end it produces some constant > > > value, of course. > > > > I see, thanks! > > > > Then perhaps something like below? > > lgtm, added loong arch list/folks ping Oleg, do you want to send formal patch? thanks, jirka > > for context: > https://lore.kernel.org/bpf/20240614174822.GA1185149@thelio-3990X/ > > thanks, > jirka > > > > > Oleg. > > > > > > --- x/arch/loongarch/include/asm/uprobes.h > > +++ x/arch/loongarch/include/asm/uprobes.h > > @@ -9,7 +9,7 @@ typedef u32 uprobe_opcode_t; > > #define MAX_UINSN_BYTES8 > > #define UPROBE_XOL_SLOT_BYTES MAX_UINSN_BYTES > > > > -#define UPROBE_SWBP_INSN larch_insn_gen_break(BRK_UPROBE_BP) > > +#define UPROBE_SWBP_INSN (uprobe_opcode_t)(BRK_UPROBE_BP | (break_op << > > 15)) > > #define UPROBE_SWBP_INSN_SIZE LOONGARCH_INSN_SIZE > > > > #define UPROBE_XOLBP_INSN larch_insn_gen_break(BRK_UPROBE_XOLBP) > > --- x/arch/loongarch/kernel/uprobes.c > > +++ x/arch/loongarch/kernel/uprobes.c > > @@ -7,6 +7,13 @@ > > > > #define UPROBE_TRAP_NR UINT_MAX > > > > +static __init int __ck_insn(void) > > +{ > > + BUG_ON(UPROBE_SWBP_INSN != larch_insn_gen_break(BRK_UPROBE_BP)); > > + return 0; > > +} > > +late_initcall(__ck_insn); > > + > > int arch_uprobe_analyze_insn(struct arch_uprobe *auprobe, > > struct mm_struct *mm, unsigned long addr) > > { > >
Re: [PATCH v9 1/8] remoteproc: qcom: Add PRNG proxy clock
On Thu, Jun 27, 2024 at 04:59:38PM GMT, Gokul Sriram P wrote: > > On 6/27/2024 4:38 PM, Dmitry Baryshkov wrote: > > On Thu, Jun 27, 2024 at 03:31:01PM GMT, Gokul Sriram P wrote: > > > On 6/27/2024 12:47 AM, Dmitry Baryshkov wrote: > > > > On Tue, Jun 25, 2024 at 11:03:30AM GMT, Gokul Sriram P wrote: > > > > > On 6/22/2024 2:38 AM, Dmitry Baryshkov wrote: > > > > > > On Fri, Jun 21, 2024 at 05:16:52PM GMT, Gokul Sriram Palanisamy > > > > > > wrote: > > > > > > > PRNG clock is needed by the secure PIL, support for the same > > > > > > > is added in subsequent patches. > > > > > > Which 'same'? > > > > > > What is 'secure PIL'? > > > > > will elaborate in the updated version. > > > > > To answer your question, secure PIL is signed PIL image which only > > > > > TrustZone can authenticate and load. > > > > Fine. So, the current driver can not load WCSS firmware on IPQ8074, is > > > > that correct? Or was there some kind of firmware interface change? The > > > > driver was added in 2018, so I can only hope that at that point it > > > > worked. Could you please explain, what happened? > > > The existing wcss driver can load unsigned PIL images without the > > > involvement of TrustZone. That works even now. > > > With the current change, we are trying to add signed PIL as an option > > > based > > > on "wcss->need_mem_protection" if set. For signed PIL alone, we send a PAS > > > request to TrustZone to authenticate and load. > > I see that you are enabling it unconditionally for IPQ8074. How is it > > going to work? > > Correct Dmitry. In this change, it is forcing secure PIL. With a separate > driver for secure PIL, this will be sorted right? That depends. How will the running system decide, which driver to use? It can not be a compile-time decision. > > Regards, > > Gokul > > > > I also just noticed that Bjorn had suggested to submit a new driver for > > > the > > > PAS based IPQ WCSS instead of overloading this driver. Will also address > > > that and post a new driver in updated revision. > > > > > > Regards, > > > Gokul > > > > > > > Signed-off-by: Nikhil Prakash V > > > > > > > Signed-off-by: Sricharan R > > > > > > > Signed-off-by: Gokul Sriram Palanisamy > > > > > > > --- > > > > > > > drivers/remoteproc/qcom_q6v5_wcss.c | 65 > > > > > > > + > > > > > > > 1 file changed, 47 insertions(+), 18 deletions(-) -- With best wishes Dmitry
[PATCH] arm64: dts: qcom: sm7225-fairphone-fp4: Name the regulators
Without explicitly specifying names for the regulators they are named based on the DeviceTree node name. This results in multiple regulators with the same name, making debug prints and regulator_summary impossible to reason about. Signed-off-by: Luca Weiss --- arch/arm64/boot/dts/qcom/sm7225-fairphone-fp4.dts | 34 +++ 1 file changed, 34 insertions(+) diff --git a/arch/arm64/boot/dts/qcom/sm7225-fairphone-fp4.dts b/arch/arm64/boot/dts/qcom/sm7225-fairphone-fp4.dts index a74f3ac09a5e..4e67bb80a026 100644 --- a/arch/arm64/boot/dts/qcom/sm7225-fairphone-fp4.dts +++ b/arch/arm64/boot/dts/qcom/sm7225-fairphone-fp4.dts @@ -150,124 +150,145 @@ regulators-0 { qcom,pmic-id = "a"; vreg_s1a: smps1 { + regulator-name = "vreg_s1a"; regulator-min-microvolt = <100>; regulator-max-microvolt = <120>; }; vreg_s2a: smps2 { + regulator-name = "vreg_s2a"; regulator-min-microvolt = <1503000>; regulator-max-microvolt = <2048000>; }; vreg_l2a: ldo2 { + regulator-name = "vreg_l2a"; regulator-min-microvolt = <1503000>; regulator-max-microvolt = <198>; regulator-initial-mode = ; }; vreg_l3a: ldo3 { + regulator-name = "vreg_l3a"; regulator-min-microvolt = <270>; regulator-max-microvolt = <330>; regulator-initial-mode = ; }; vreg_l4a: ldo4 { + regulator-name = "vreg_l4a"; regulator-min-microvolt = <352000>; regulator-max-microvolt = <801000>; regulator-initial-mode = ; }; vreg_l5a: ldo5 { + regulator-name = "vreg_l5a"; regulator-min-microvolt = <1503000>; regulator-max-microvolt = <198>; regulator-initial-mode = ; }; vreg_l6a: ldo6 { + regulator-name = "vreg_l6a"; regulator-min-microvolt = <171>; regulator-max-microvolt = <3544000>; regulator-initial-mode = ; }; vreg_l7a: ldo7 { + regulator-name = "vreg_l7a"; regulator-min-microvolt = <162>; regulator-max-microvolt = <198>; regulator-initial-mode = ; }; vreg_l8a: ldo8 { + regulator-name = "vreg_l8a"; regulator-min-microvolt = <280>; regulator-max-microvolt = <280>; regulator-initial-mode = ; }; vreg_l9a: ldo9 { + regulator-name = "vreg_l9a"; regulator-min-microvolt = <165>; regulator-max-microvolt = <3401000>; regulator-initial-mode = ; }; vreg_l11a: ldo11 { + regulator-name = "vreg_l11a"; regulator-min-microvolt = <180>; regulator-max-microvolt = <200>; regulator-initial-mode = ; }; vreg_l12a: ldo12 { + regulator-name = "vreg_l12a"; regulator-min-microvolt = <162>; regulator-max-microvolt = <198>; regulator-initial-mode = ; }; vreg_l13a: ldo13 { + regulator-name = "vreg_l13a"; regulator-min-microvolt = <57>; regulator-max-microvolt = <65>; regulator-initial-mode = ; }; vreg_l14a: ldo14 { + regulator-name = "vreg_l14a"; regulator-min-microvolt = <170>; regulator-max-microvolt = <190>; regulator-initial-mode = ; }; vreg_l15a: ldo15 { + regulator-name = "vreg_l15a"; regulator-min-microvolt = <110>; regulator-max-microvolt = <1305000>; regulator-initial-mode = ; }; vreg_l16a: ldo16 { + regulator-name = "vreg_l16a"; regulator-min-microvolt = <83>;
Re: [PATCH 06/12] uprobes: add batch uprobe register/unregister APIs
On Mon, 24 Jun 2024 17:21:38 -0700 Andrii Nakryiko wrote: > -static int __uprobe_register(struct inode *inode, loff_t offset, > - loff_t ref_ctr_offset, struct uprobe_consumer *uc) > +int uprobe_register_batch(struct inode *inode, int cnt, > + uprobe_consumer_fn get_uprobe_consumer, void *ctx) Is this interface just for avoiding memory allocation? Can't we just allocate a temporary array of *uprobe_consumer instead? Thank you, -- Masami Hiramatsu (Google)
Re: [PATCH v9 8/8] arm64: dts: qcom: Enable Q6v5 WCSS for ipq8074 SoC
Hi Gokul, kernel test robot noticed the following build warnings: [auto build test WARNING on remoteproc/rproc-next] [also build test WARNING on clk/clk-next robh/for-next linus/master v6.10-rc5 next-20240626] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information] url: https://github.com/intel-lab-lkp/linux/commits/Gokul-Sriram-Palanisamy/remoteproc-qcom-Add-PRNG-proxy-clock/20240625-162317 base: git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux.git rproc-next patch link: https://lore.kernel.org/r/20240621114659.2958170-9-quic_gokulsri%40quicinc.com patch subject: [PATCH v9 8/8] arm64: dts: qcom: Enable Q6v5 WCSS for ipq8074 SoC config: arm64-randconfig-051-20240627 (https://download.01.org/0day-ci/archive/20240627/202406272012.krpg0wbc-...@intel.com/config) compiler: clang version 19.0.0git (https://github.com/llvm/llvm-project ad79a14c9e5ec4a369eed4adf567c22cc029863f) dtschema version: 2024.6.dev1+g833054f reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240627/202406272012.krpg0wbc-...@intel.com/reproduce) If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot | Closes: https://lore.kernel.org/oe-kbuild-all/202406272012.krpg0wbc-...@intel.com/ dtcheck warnings: (new ones prefixed by >>) arch/arm64/boot/dts/qcom/ipq8074-hk01.dtb: phy@59000: 'vdda-pll-supply' is a required property from schema $id: http://devicetree.org/schemas/phy/qcom,qusb2-phy.yaml# arch/arm64/boot/dts/qcom/ipq8074-hk01.dtb: phy@59000: 'vdda-phy-dpdm-supply' is a required property from schema $id: http://devicetree.org/schemas/phy/qcom,qusb2-phy.yaml# arch/arm64/boot/dts/qcom/ipq8074-hk01.dtb: phy@79000: 'vdd-supply' is a required property from schema $id: http://devicetree.org/schemas/phy/qcom,qusb2-phy.yaml# arch/arm64/boot/dts/qcom/ipq8074-hk01.dtb: phy@79000: 'vdda-pll-supply' is a required property from schema $id: http://devicetree.org/schemas/phy/qcom,qusb2-phy.yaml# arch/arm64/boot/dts/qcom/ipq8074-hk01.dtb: phy@79000: 'vdda-phy-dpdm-supply' is a required property from schema $id: http://devicetree.org/schemas/phy/qcom,qusb2-phy.yaml# >> arch/arm64/boot/dts/qcom/ipq8074-hk01.dtb: /soc@0/remoteproc@cd0: failed >> to match any schema with compatible: ['qcom,ipq8074-wcss-pil'] -- >> arch/arm64/boot/dts/qcom/ipq8074-hk10-c1.dtb: /soc@0/remoteproc@cd0: >> failed to match any schema with compatible: ['qcom,ipq8074-wcss-pil'] -- >> arch/arm64/boot/dts/qcom/ipq8074-hk10-c2.dtb: /soc@0/remoteproc@cd0: >> failed to match any schema with compatible: ['qcom,ipq8074-wcss-pil'] -- 0-DAY CI Kernel Test Service https://github.com/intel/lkp-tests/wiki
Re: [PATCH v9 1/8] remoteproc: qcom: Add PRNG proxy clock
On 6/27/2024 4:38 PM, Dmitry Baryshkov wrote: On Thu, Jun 27, 2024 at 03:31:01PM GMT, Gokul Sriram P wrote: On 6/27/2024 12:47 AM, Dmitry Baryshkov wrote: On Tue, Jun 25, 2024 at 11:03:30AM GMT, Gokul Sriram P wrote: On 6/22/2024 2:38 AM, Dmitry Baryshkov wrote: On Fri, Jun 21, 2024 at 05:16:52PM GMT, Gokul Sriram Palanisamy wrote: PRNG clock is needed by the secure PIL, support for the same is added in subsequent patches. Which 'same'? What is 'secure PIL'? will elaborate in the updated version. To answer your question, secure PIL is signed PIL image which only TrustZone can authenticate and load. Fine. So, the current driver can not load WCSS firmware on IPQ8074, is that correct? Or was there some kind of firmware interface change? The driver was added in 2018, so I can only hope that at that point it worked. Could you please explain, what happened? The existing wcss driver can load unsigned PIL images without the involvement of TrustZone. That works even now. With the current change, we are trying to add signed PIL as an option based on "wcss->need_mem_protection" if set. For signed PIL alone, we send a PAS request to TrustZone to authenticate and load. I see that you are enabling it unconditionally for IPQ8074. How is it going to work? Correct Dmitry. In this change, it is forcing secure PIL. With a separate driver for secure PIL, this will be sorted right? Regards, Gokul I also just noticed that Bjorn had suggested to submit a new driver for the PAS based IPQ WCSS instead of overloading this driver. Will also address that and post a new driver in updated revision. Regards, Gokul Signed-off-by: Nikhil Prakash V Signed-off-by: Sricharan R Signed-off-by: Gokul Sriram Palanisamy --- drivers/remoteproc/qcom_q6v5_wcss.c | 65 + 1 file changed, 47 insertions(+), 18 deletions(-)
Re: [PATCH v9 3/8] remoteproc: qcom: Add support for split q6 + m3 wlan firmware
On Fri, Jun 21, 2024 at 05:16:54PM GMT, Gokul Sriram Palanisamy wrote: > IPQ8074 supports split firmware for q6 and m3 as well. > So add support for loading the m3 firmware before q6. > Now the drivers works fine for both split and unified > firmwares. Right now linux-firmware ships both q6 and m3 firmware files. The driver loads just the q6 firmware. Is it enough for the hardware to get working WiFi? > > Signed-off-by: Nikhil Prakash V > Signed-off-by: Sricharan R > Signed-off-by: Gokul Sriram Palanisamy Who is the original author of the patch? > --- > drivers/remoteproc/qcom_q6v5_wcss.c | 33 + > 1 file changed, 29 insertions(+), 4 deletions(-) > -- With best wishes Dmitry
Re: [PATCH v9 1/8] remoteproc: qcom: Add PRNG proxy clock
On Thu, Jun 27, 2024 at 03:31:01PM GMT, Gokul Sriram P wrote: > > On 6/27/2024 12:47 AM, Dmitry Baryshkov wrote: > > On Tue, Jun 25, 2024 at 11:03:30AM GMT, Gokul Sriram P wrote: > > > On 6/22/2024 2:38 AM, Dmitry Baryshkov wrote: > > > > On Fri, Jun 21, 2024 at 05:16:52PM GMT, Gokul Sriram Palanisamy wrote: > > > > > PRNG clock is needed by the secure PIL, support for the same > > > > > is added in subsequent patches. > > > > Which 'same'? > > > > What is 'secure PIL'? > > > will elaborate in the updated version. > > > To answer your question, secure PIL is signed PIL image which only > > > TrustZone can authenticate and load. > > Fine. So, the current driver can not load WCSS firmware on IPQ8074, is > > that correct? Or was there some kind of firmware interface change? The > > driver was added in 2018, so I can only hope that at that point it > > worked. Could you please explain, what happened? > The existing wcss driver can load unsigned PIL images without the > involvement of TrustZone. That works even now. > With the current change, we are trying to add signed PIL as an option based > on "wcss->need_mem_protection" if set. For signed PIL alone, we send a PAS > request to TrustZone to authenticate and load. I see that you are enabling it unconditionally for IPQ8074. How is it going to work? > I also just noticed that Bjorn had suggested to submit a new driver for the > PAS based IPQ WCSS instead of overloading this driver. Will also address > that and post a new driver in updated revision. > > Regards, > Gokul > > > > > Signed-off-by: Nikhil Prakash V > > > > > Signed-off-by: Sricharan R > > > > > Signed-off-by: Gokul Sriram Palanisamy > > > > > --- > > > > >drivers/remoteproc/qcom_q6v5_wcss.c | 65 > > > > > + > > > > >1 file changed, 47 insertions(+), 18 deletions(-) > > -- With best wishes Dmitry
[PATCH V3 2/2] soc: qcom: smp2p: Introduce tracepoint support
This commit introduces tracepoint support for smp2p, enabling logging of communication between local and remote processors. These tracepoints include information about the remote subsystem name, negotiation details, supported features, bit change notifications, and ssr activity. These logs are useful for debugging issues between subsystems. Signed-off-by: Sudeepgoud Patil --- drivers/soc/qcom/Makefile | 1 + drivers/soc/qcom/smp2p.c | 9 drivers/soc/qcom/trace-smp2p.h | 98 ++ 3 files changed, 108 insertions(+) create mode 100644 drivers/soc/qcom/trace-smp2p.h diff --git a/drivers/soc/qcom/Makefile b/drivers/soc/qcom/Makefile index ca0bece0dfff..30c1bf645501 100644 --- a/drivers/soc/qcom/Makefile +++ b/drivers/soc/qcom/Makefile @@ -23,6 +23,7 @@ qcom_rpmh-y += rpmh.o obj-$(CONFIG_QCOM_SMD_RPM) += rpm-proc.o smd-rpm.o obj-$(CONFIG_QCOM_SMEM) += smem.o obj-$(CONFIG_QCOM_SMEM_STATE) += smem_state.o +CFLAGS_smp2p.o := -I$(src) obj-$(CONFIG_QCOM_SMP2P) += smp2p.o obj-$(CONFIG_QCOM_SMSM)+= smsm.o obj-$(CONFIG_QCOM_SOCINFO) += socinfo.o diff --git a/drivers/soc/qcom/smp2p.c b/drivers/soc/qcom/smp2p.c index 696c2a8387d0..4aa61b0f11ad 100644 --- a/drivers/soc/qcom/smp2p.c +++ b/drivers/soc/qcom/smp2p.c @@ -161,6 +161,9 @@ struct qcom_smp2p { struct list_head outbound; }; +#define CREATE_TRACE_POINTS +#include "trace-smp2p.h" + static void qcom_smp2p_kick(struct qcom_smp2p *smp2p) { /* Make sure any updated data is written before the kick */ @@ -192,6 +195,7 @@ static void qcom_smp2p_do_ssr_ack(struct qcom_smp2p *smp2p) struct smp2p_smem_item *out = smp2p->out; u32 val; + trace_smp2p_ssr_ack(smp2p->dev); smp2p->ssr_ack = !smp2p->ssr_ack; val = out->flags & ~BIT(SMP2P_FLAGS_RESTART_ACK_BIT); @@ -214,6 +218,7 @@ static void qcom_smp2p_negotiate(struct qcom_smp2p *smp2p) smp2p->ssr_ack_enabled = true; smp2p->negotiation_done = true; + trace_smp2p_negotiate(smp2p->dev, out->features); } } @@ -252,6 +257,8 @@ static void qcom_smp2p_notify_in(struct qcom_smp2p *smp2p) status = val ^ entry->last_value; entry->last_value = val; + trace_smp2p_notify_in(entry, status, val); + /* No changes of this entry? */ if (!status) continue; @@ -415,6 +422,8 @@ static int smp2p_update_bits(void *data, u32 mask, u32 value) writel(val, entry->value); spin_unlock_irqrestore(>lock, flags); + trace_smp2p_update_bits(entry, orig, val); + if (val != orig) qcom_smp2p_kick(entry->smp2p); diff --git a/drivers/soc/qcom/trace-smp2p.h b/drivers/soc/qcom/trace-smp2p.h new file mode 100644 index ..fa985a0d7615 --- /dev/null +++ b/drivers/soc/qcom/trace-smp2p.h @@ -0,0 +1,98 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright (c) 2024 Qualcomm Innovation Center, Inc. All rights reserved. + */ + +#undef TRACE_SYSTEM +#define TRACE_SYSTEM qcom_smp2p + +#if !defined(__QCOM_SMP2P_TRACE_H__) || defined(TRACE_HEADER_MULTI_READ) +#define __QCOM_SMP2P_TRACE_H__ + +#include +#include + +TRACE_EVENT(smp2p_ssr_ack, + TP_PROTO(const struct device *dev), + TP_ARGS(dev), + TP_STRUCT__entry( + __string(dev_name, dev_name(dev)) + ), + TP_fast_assign( + __assign_str(dev_name, dev_name(dev)); + ), + TP_printk("%s: SSR detected", __get_str(dev_name)) +); + +TRACE_EVENT(smp2p_negotiate, + TP_PROTO(const struct device *dev, unsigned int features), + TP_ARGS(dev, features), + TP_STRUCT__entry( + __string(dev_name, dev_name(dev)) + __field(u32, out_features) + ), + TP_fast_assign( + __assign_str(dev_name, dev_name(dev)); + __entry->out_features = features; + ), + TP_printk("%s: state=open out_features=%s", __get_str(dev_name), + __print_flags(__entry->out_features, "|", + {SMP2P_FEATURE_SSR_ACK, "SMP2P_FEATURE_SSR_ACK"}) + ) +); + +TRACE_EVENT(smp2p_notify_in, + TP_PROTO(struct smp2p_entry *smp2p_entry, unsigned long status, u32 val), + TP_ARGS(smp2p_entry, status, val), + TP_STRUCT__entry( + __string(dev_name, dev_name(smp2p_entry->smp2p->dev)) + __string(client_name, smp2p_entry->name) + __field(unsigned long, status) + __field(u32, val) + ), + TP_fast_assign( + __assign_str(dev_name, dev_name(smp2p_entry->smp2p->dev)); + __assign_str(client_name, smp2p_entry->name); + __entry->status = status; + __entry->val = val; + ), + TP_printk("%s: %s: status:0x%0lx val:0x%0x", + __get_str(dev_name), +
[PATCH V3 1/2] soc: qcom: smp2p: Use devname for interrupt descriptions
From: Chris Lew When using /proc/interrupts to collect statistics on smp2p interrupt counts, it is hard to distinguish the different instances of smp2p from each other. For example to debug a processor boot issue, the ready and handover interrupts are checked for sanity to ensure the firmware reached a specific initialization stage. Remove "smp2p" string from the irq request so that the irq will default to the device name. Add an .irq_print_chip() callback to print the irq chip name as the device name. These two changes allow for a unique name to be used in /proc/interrupts as shown below. / # cat /proc/interrupts | grep smp2p 18: ... ipcc 196610 Edge smp2p-adsp 20: ... ipcc 131074 Edge smp2p-modem 170: ... smp2p-modem 1 Edge q6v5 ready 178: ... smp2p-adsp 1 Edge q6v5 ready Signed-off-by: Chris Lew --- drivers/soc/qcom/smp2p.c | 11 ++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/drivers/soc/qcom/smp2p.c b/drivers/soc/qcom/smp2p.c index a21241cbeec7..696c2a8387d0 100644 --- a/drivers/soc/qcom/smp2p.c +++ b/drivers/soc/qcom/smp2p.c @@ -16,6 +16,7 @@ #include #include #include +#include #include #include #include @@ -353,11 +354,19 @@ static int smp2p_set_irq_type(struct irq_data *irqd, unsigned int type) return 0; } +static void smp2p_irq_print_chip(struct irq_data *irqd, struct seq_file *p) +{ + struct smp2p_entry *entry = irq_data_get_irq_chip_data(irqd); + + seq_printf(p, " %8s", dev_name(entry->smp2p->dev)); +} + static struct irq_chip smp2p_irq_chip = { .name = "smp2p", .irq_mask = smp2p_mask_irq, .irq_unmask = smp2p_unmask_irq, .irq_set_type = smp2p_set_irq_type, + .irq_print_chip = smp2p_irq_print_chip, }; static int smp2p_irq_map(struct irq_domain *d, @@ -617,7 +626,7 @@ static int qcom_smp2p_probe(struct platform_device *pdev) ret = devm_request_threaded_irq(>dev, irq, NULL, qcom_smp2p_intr, IRQF_ONESHOT, - "smp2p", (void *)smp2p); + NULL, (void *)smp2p); if (ret) { dev_err(>dev, "failed to request interrupt\n"); goto unwind_interfaces; --
[PATCH V3 0/2] Use of devname for interrupt descriptions and tracepoint support for smp2p
This commit enhances the smp2p driver by adding support for using the device name in interrupt descriptions and introducing tracepoint functionality. These improvements facilitate more effective debugging of smp2p-related issues. The devname patch, along with the callback to print the irq chip name as the device name and the removal of the ‘smp2p’ string from the irq request, results in a unique interrupt description. Tracepoint functionality captures essential details such as subsystem name, negotiation specifics, supported features, bit changes, and subsystem restart activity. These enhancements significantly improve debugging capabilities for inter-subsystem issues. Changes in v3: - Updated patch to use devname for interrupt descriptions with a different approach. - Modified tracepoint patch by removing remote_pid field from all tracepoints. - Using SMP2P_FEATURE_SSR_ACK definition from smp2p.c instead of redefiniton. - Link to v2: https://lore.kernel.org/all/20240611123351.3813190-1-quic_sudee...@quicinc.com Changes in v2: - Added support to include the remote name in the smp2p IRQ devname, allowing for remote PID-name mapping - Mapped the remote PID (Process ID) along with the remote name in tracepoints, as suggested by Chris - Modified to capture all `out->features` instead of just the `ssr_ack`, following Chris's recommendation - Expanded the commit description to provide additional context - Link to v1: https://lore.kernel.org/all/20240429075528.1723133-1-quic_sudee...@quicinc.com Chris Lew (1): soc: qcom: smp2p: Use devname for interrupt descriptions Sudeepgoud Patil (1): soc: qcom: smp2p: Introduce tracepoint support drivers/soc/qcom/Makefile | 1 + drivers/soc/qcom/smp2p.c | 20 ++- drivers/soc/qcom/trace-smp2p.h | 98 ++ 3 files changed, 118 insertions(+), 1 deletion(-) create mode 100644 drivers/soc/qcom/trace-smp2p.h --
Re: [PATCH net-next v3 1/3] vsock: add support for SIOCOUTQ ioctl for all vsock socket types.
Hi Luigi, kernel test robot noticed the following build warnings: [auto build test WARNING on 50b70845fc5c22cf7e7d25b57d57b3dca1725aa5] url: https://github.com/intel-lab-lkp/linux/commits/Luigi-Leonardi-via-B4-Relay/vsock-add-support-for-SIOCOUTQ-ioctl-for-all-vsock-socket-types/20240627-023902 base: 50b70845fc5c22cf7e7d25b57d57b3dca1725aa5 patch link: https://lore.kernel.org/r/20240626-ioctl_next-v3-1-63be5bf19a40%40outlook.com patch subject: [PATCH net-next v3 1/3] vsock: add support for SIOCOUTQ ioctl for all vsock socket types. config: i386-buildonly-randconfig-001-20240627 (https://download.01.org/0day-ci/archive/20240627/202406271827.aq9zylch-...@intel.com/config) compiler: clang version 18.1.5 (https://github.com/llvm/llvm-project 617a15a9eac96088ae5e9134248d8236e34b91b1) reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240627/202406271827.aq9zylch-...@intel.com/reproduce) If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot | Closes: https://lore.kernel.org/oe-kbuild-all/202406271827.aq9zylch-...@intel.com/ All warnings (new ones prefixed by >>): >> net/vmw_vsock/af_vsock.c:1314:7: warning: variable 'retval' is used >> uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized] 1314 | if (vsk->transport->unsent_bytes) { | ^~~~ net/vmw_vsock/af_vsock.c:1334:9: note: uninitialized use occurs here 1334 | return retval; |^~ net/vmw_vsock/af_vsock.c:1314:3: note: remove the 'if' if its condition is always true 1314 | if (vsk->transport->unsent_bytes) { | ^ net/vmw_vsock/af_vsock.c:1301:12: note: initialize the variable 'retval' to silence this warning 1301 | int retval; | ^ |= 0 1 warning generated. vim +1314 net/vmw_vsock/af_vsock.c 1295 1296 static int vsock_do_ioctl(struct socket *sock, unsigned int cmd, 1297int __user *arg) 1298 { 1299 struct sock *sk = sock->sk; 1300 struct vsock_sock *vsk; 1301 int retval; 1302 1303 vsk = vsock_sk(sk); 1304 1305 switch (cmd) { 1306 case SIOCOUTQ: { 1307 size_t n_bytes; 1308 1309 if (!vsk->transport || !vsk->transport->unsent_bytes) { 1310 retval = -EOPNOTSUPP; 1311 break; 1312 } 1313 > 1314 if (vsk->transport->unsent_bytes) { 1315 if (sock_type_connectible(sk->sk_type) && sk->sk_state == TCP_LISTEN) { 1316 retval = -EINVAL; 1317 break; 1318 } 1319 1320 n_bytes = vsk->transport->unsent_bytes(vsk); 1321 if (n_bytes < 0) { 1322 retval = n_bytes; 1323 break; 1324 } 1325 1326 retval = put_user(n_bytes, arg); 1327 } 1328 break; 1329 } 1330 default: 1331 retval = -ENOIOCTLCMD; 1332 } 1333 1334 return retval; 1335 } 1336 -- 0-DAY CI Kernel Test Service https://github.com/intel/lkp-tests/wiki
Re: [PATCH v9 1/8] remoteproc: qcom: Add PRNG proxy clock
On 6/27/2024 12:47 AM, Dmitry Baryshkov wrote: On Tue, Jun 25, 2024 at 11:03:30AM GMT, Gokul Sriram P wrote: On 6/22/2024 2:38 AM, Dmitry Baryshkov wrote: On Fri, Jun 21, 2024 at 05:16:52PM GMT, Gokul Sriram Palanisamy wrote: PRNG clock is needed by the secure PIL, support for the same is added in subsequent patches. Which 'same'? What is 'secure PIL'? will elaborate in the updated version. To answer your question, secure PIL is signed PIL image which only TrustZone can authenticate and load. Fine. So, the current driver can not load WCSS firmware on IPQ8074, is that correct? Or was there some kind of firmware interface change? The driver was added in 2018, so I can only hope that at that point it worked. Could you please explain, what happened? The existing wcss driver can load unsigned PIL images without the involvement of TrustZone. That works even now. With the current change, we are trying to add signed PIL as an option based on "wcss->need_mem_protection" if set. For signed PIL alone, we send a PAS request to TrustZone to authenticate and load. I also just noticed that Bjorn had suggested to submit a new driver for the PAS based IPQ WCSS instead of overloading this driver. Will also address that and post a new driver in updated revision. Regards, Gokul Signed-off-by: Nikhil Prakash V Signed-off-by: Sricharan R Signed-off-by: Gokul Sriram Palanisamy --- drivers/remoteproc/qcom_q6v5_wcss.c | 65 + 1 file changed, 47 insertions(+), 18 deletions(-)
Re: [PATCH v4] tracing/net_sched: NULL pointer dereference in perf_trace_qdisc_reset()
On Tue, 2024-06-25 at 02:33 +0900, ysk...@gmail.com wrote: > From: Yunseong Kim > > In the TRACE_EVENT(qdisc_reset) NULL dereference occurred from > > qdisc->dev_queue->dev ->name > > This situation simulated from bunch of veths and Bluetooth disconnection > and reconnection. > > During qdisc initialization, qdisc was being set to noop_queue. > In veth_init_queue, the initial tx_num was reduced back to one, > causing the qdisc reset to be called with noop, which led to the kernel > panic. > > I've attached the GitHub gist link that C converted syz-execprogram > source code and 3 log of reproduced vmcore-dmesg. > > https://gist.github.com/yskelg/cc64562873ce249cdd0d5a358b77d740 > > Yeoreum and I use two fuzzing tool simultaneously. > > One process with syz-executor : https://github.com/google/syzkaller > > $ ./syz-execprog -executor=./syz-executor -repeat=1 -sandbox=setuid \ > -enable=none -collide=false log1 > > The other process with perf fuzzer: > https://github.com/deater/perf_event_tests/tree/master/fuzzer > > $ perf_event_tests/fuzzer/perf_fuzzer > > I think this will happen on the kernel version. > > Linux kernel version +v6.7.10, +v6.8, +v6.9 and it could happen in v6.10. > > This occurred from 51270d573a8d. I think this patch is absolutely > necessary. Previously, It was showing not intended string value of name. > > I've reproduced 3 time from my fedora 40 Debug Kernel with any other module > or patched. > > version: 6.10.0-0.rc2.20240608gitdc772f8237f9.29.fc41.aarch64+debug > > [ 5287.164555] veth0_vlan: left promiscuous mode > [ 5287.164929] veth1_macvtap: left promiscuous mode > [ 5287.164950] veth0_macvtap: left promiscuous mode > [ 5287.164983] veth1_vlan: left promiscuous mode > [ 5287.165008] veth0_vlan: left promiscuous mode > [ 5287.165450] veth1_macvtap: left promiscuous mode > [ 5287.165472] veth0_macvtap: left promiscuous mode > [ 5287.165502] veth1_vlan: left promiscuous mode > … > [ 5297.598240] bridge0: port 2(bridge_slave_1) entered blocking state > [ 5297.598262] bridge0: port 2(bridge_slave_1) entered forwarding state > [ 5297.598296] bridge0: port 1(bridge_slave_0) entered blocking state > [ 5297.598313] bridge0: port 1(bridge_slave_0) entered forwarding state > [ 5297.616090] 8021q: adding VLAN 0 to HW filter on device bond0 > [ 5297.620405] bridge0: port 1(bridge_slave_0) entered disabled state > [ 5297.620730] bridge0: port 2(bridge_slave_1) entered disabled state > [ 5297.627247] 8021q: adding VLAN 0 to HW filter on device team0 > [ 5297.629636] bridge0: port 1(bridge_slave_0) entered blocking state > … > [ 5298.002798] bridge_slave_0: left promiscuous mode > [ 5298.002869] bridge0: port 1(bridge_slave_0) entered disabled state > [ 5298.309444] bond0 (unregistering): (slave bond_slave_0): Releasing backup > interface > [ 5298.315206] bond0 (unregistering): (slave bond_slave_1): Releasing backup > interface > [ 5298.320207] bond0 (unregistering): Released all slaves > [ 5298.354296] hsr_slave_0: left promiscuous mode > [ 5298.360750] hsr_slave_1: left promiscuous mode > [ 5298.374889] veth1_macvtap: left promiscuous mode > [ 5298.374931] veth0_macvtap: left promiscuous mode > [ 5298.374988] veth1_vlan: left promiscuous mode > [ 5298.375024] veth0_vlan: left promiscuous mode > [ 5299.109741] team0 (unregistering): Port device team_slave_1 removed > [ 5299.185870] team0 (unregistering): Port device team_slave_0 removed > … > [ 5300.155443] Bluetooth: hci3: unexpected cc 0x0c03 length: 249 > 1 > [ 5300.155724] Bluetooth: hci3: unexpected cc 0x1003 length: 249 > 9 > [ 5300.155988] Bluetooth: hci3: unexpected cc 0x1001 length: 249 > 9 > …. > [ 5301.075531] team0: Port device team_slave_1 added > [ 5301.085515] bridge0: port 1(bridge_slave_0) entered blocking state > [ 5301.085531] bridge0: port 1(bridge_slave_0) entered disabled state > [ 5301.085588] bridge_slave_0: entered allmulticast mode > [ 5301.085800] bridge_slave_0: entered promiscuous mode > [ 5301.095617] bridge0: port 1(bridge_slave_0) entered blocking state > [ 5301.095633] bridge0: port 1(bridge_slave_0) entered disabled state > … > [ 5301.149734] bond0: (slave bond_slave_0): Enslaving as an active interface > with an up link > [ 5301.173234] bond0: (slave bond_slave_0): Enslaving as an active interface > with an up link > [ 5301.180517] bond0: (slave bond_slave_1): Enslaving as an active interface > with an up link > [ 5301.193481] hsr_slave_0: entered promiscuous mode > [ 5301.204425] hsr_slave_1: entered promiscuous mode > [ 5301.210172] debugfs: Directory 'hsr0' with parent 'hsr' already present! > [ 5301.210185] Cannot create hsr debugfs directory > [ 5301.224061] bond0: (slave bond_slave_1): Enslaving as an active interface > with an up link > [ 5301.246901] bond0: (slave bond_slave_0): Enslaving as an active interface > with an up link > [ 5301.255934] team0: Port device team_slave_0 added > [ 5301.256480] team0: Port device team_slave_1 added > [ 5301.256948] team0: Port
Re: [PATCH v4] tracing/net_sched: NULL pointer dereference in perf_trace_qdisc_reset()
Hello: This patch was applied to netdev/net.git (main) by Paolo Abeni : On Tue, 25 Jun 2024 02:33:23 +0900 you wrote: > From: Yunseong Kim > > In the TRACE_EVENT(qdisc_reset) NULL dereference occurred from > > qdisc->dev_queue->dev ->name > > This situation simulated from bunch of veths and Bluetooth disconnection > and reconnection. > > [...] Here is the summary with links: - [v4] tracing/net_sched: NULL pointer dereference in perf_trace_qdisc_reset() https://git.kernel.org/netdev/net/c/bab4923132fe You are awesome, thank you! -- Deet-doot-dot, I am a bot. https://korg.docs.kernel.org/patchwork/pwbot.html
Re: [PATCH v3 1/2] rust: add static_key_false
On Tue, Jun 25, 2024 at 6:18 PM Boqun Feng wrote: > > Hi Alice, > > On Fri, Jun 21, 2024 at 10:35:26AM +, Alice Ryhl wrote: > > Add just enough support for static key so that we can use it from > > tracepoints. Tracepoints rely on `static_key_false` even though it is > > deprecated, so we add the same functionality to Rust. > > > > It is not possible to use the existing C implementation of > > arch_static_branch because it passes the argument `key` to inline > > assembly as an 'i' parameter, so any attempt to add a C helper for this > > function will fail to compile because the value of `key` must be known > > at compile-time. > > > > Signed-off-by: Alice Ryhl > > [Add linux-arch, and related arch maintainers Cced] > > Since inline asms are touched here, please consider copying linux-arch > and arch maintainers next time ;-) Will do. > For x86_64 and arm64 bits: > > Acked-by: Boqun Feng > > One thing though, we should split the arch-specific impls into different > files, for example: rust/kernel/arch/arm64.rs or rust/arch/arm64.rs. > That'll be easier for arch maintainers to watch the Rust changes related > to a particular architecture. Is that how you would prefer to name these files? You don't want static_key somewhere in the filename? > Another thought is that, could you implement an arch_static_branch!() > (instead of _static_key_false!()) and use it for static_key_false!() > similar to what we have in C? The benefit is that at least for myself > it'll be easier to compare the implementation between C and Rust. I can try to include that. Alice